When it comes to internal mechanics, it works similar to Map but with one difference. You cannot assign any key you want, it will be generated with special hash function, which will calculate key based on element value.
It is likely to happen, that result of the hash function will be the same as for different value. Such phenomenon is called a collision. Another difference when comparing to Map is fact, that nor values or keys must be unique. It was achieved by mapping key to many elements holding values. Key’s mapping target is called a bucket, and values in it are called entries.
Hash Function
To create value for key we must use hash function on value we want to add. Special feature of such function is to provide fixed-length hash for variable-length data. The bigger length of the hash, the lower chance for collision to happen. For the sake of example let’s make it 4 numbers long.
Calculating Hash
The idea for our hash function is dead-simple. It takes a value and calculates its length, a number of characters to be precise. Since hash must have a fixed length we will give it four digits. If number of characters is less than 10, it will be padded with 3 zeroes, for chars number between 10 and 99 with 2 zeroes, and so on.
Key | Hash | Collides with |
---|---|---|
hash(“coffee break”) | 1200 | - |
hash(“tea pot”) | 7000 | - |
hash(“vegan milk”) | 1100 | - |
hash(“bitter chocolate”) | 1600 | - |
hash(“small cup”) | 9000 | - |
Handling Collisions
So far, so good. No collision took place, no need to worry. But what happens if hash is not unique? Let’s add another string “orange cream”. Oh, dear. Hash which was returned is collided with one generated for “coffee break”.
Key | Hash | Collides with |
---|---|---|
hash(“coffee break”) | 1200 | - |
… | … | … |
hash(“orange cream”) | 1200 | hash(“coffee break”) |
What can we do in this situation? There are generally two ways of handling this.
Separate Chaining
In this first resolution, buckets get form of Linked List which is element corresponding to its hash-based key. If key is not unique, value will be appended to underlying list.
Open Addressing
Another solution for collision resolution is to keep this value under different hash, which will be resolved by so-called probing, or search for alternative location. To find this within structure it must be started with colliding key and then properly searched.
Usages
The best thing about Hash Table is that it knows position of particular value only by hashing it before. Hashed value is our key, so access to this element takes no time. Even if we search for collision-making value, thanks to resolution like Separate Chaining or Open Addressing we can still get the best from it.