Sunday 21 February 2016

How Hashmap works in java ?

Hash map is one of the most important API to master if one is preparing for a java interview.
The interviewer expects the person to know each and every aspects of it.
Some questions can be:
1) Explain the get and put method of hash map or hash map works in java ?
2) What are the properties of the key you use for the hash map ?
3) Can hash map be used in a multithreaded environment ?
4) What is a concurrent hash map
5) Changes made to java 8 to hash map?
6) Complexity of get and put in hash map?
7) when is the complexity of get and put operation o(1) in java ?

In this section we shall only discuss the working of the hash map and discuss the property of the key to be used in hash map.

How hash map works in java ?

The built up starts with the interviewer asking have you worked on collections. To which we all proudly say yes. Then a follow up questions on Array List, Linked List, then the main question on how hashmap works.

Now to answer simply Hash Map works on the principle of hashing.

We need to realize hash map is nothing but a datastucture which can maintain key and values.

Hash map internally has a Entry class whose structure is something like given below:

class Entry<K,V>{
K Key;
V Value;
Entry next;
}

Here the next represents the next element in the hash bucket(more on this later).

This Entry object maintains the key and value. For example if we make a map like one below:

HashMap<String. String> map = new HashMap<String,String>();
map.put("first","example");

Now internally these key and value gets set in the Key and Value attribute of the Entry class.

So remember for now that hash map maintains an Entry Object for every key, value pair we enter.

Now since we add more than 1 key value pair hash map needs a Data Strucutre to store the Entry object for each pair.

Hashmap fulfills this by keeping an array.

So hash map class internally also has an Entry[], where it stores the Entry Objects.

Now the question is how does the Entry Object fill in the array ? That is how does it decide which location of the array does it needs to fill the Entry object ?

The answer to this actually gives the hash map its name.

For every key entered the hash map calculates its hashcode and using this hashcode it decides the position where it needs to store the Entry object.

Therefore to explain more clearly, whenever we call the put method of hash map, the hash map internally passes the key to its hash mehtod which takes in key.hashcode() as an argument. HashMap recalculates the hash value to prevent anyone from entering a very high or a very low key.

Now the value obtained from hash method is then used to calculate the array location using the & operator along with the initial capacity.

So,

KEY - > Hash  -> & with initial capacity -> array location

Then the entry is populated in this array location.

Before going any further let us disccuss a boundary case.

what if someone were to enter a null key in the hash map.

Wont it break and throw a NPE at key.hashCode() ?

Well actually HashMap has special provisions for null keys, whenever it encounters null value HashMap stores the entry for this in the 0th location of the array.
Note that HashMap can have only one entry key.
Now a problem comes when two objects have same hash key.
We need to understand that now since both keys have the same hash code the array location obtained for them would be same.
Now we shall come back to the structure of Entry class we were discussing earlier. Entry class is a single Linked List(uptill Java 7 more on this in the next blogs).

class Entry<K,V>{
K Key;
V Value;
Entry next;
}

Therefore whenever a key with the same hash is added it is added in the same array location in the linked List.

How does get method works ?
An object is retrieved from hash map in way very similar to the way its stored.
Now since we know the key value pair is stored in the Entry array the location for which is determined by the hash code of the key, so the first thing the hash map needs to do while retrieving is determining the hash code of the key object.
So whenever we call the get method internally hash map determines the hash code of the key object and determines the array location. In a simple case scenario(when there is only one object in the array, it retrieves it and the operation is o(1) (more on this in the later blogs).

The problem comes when there are more than one object in the array.

Hash Map then has to traverse the Linked List and call equals method on each object to determine the correct one. The complexity is now o(n) (more on this later)

What are the properties of the key you use for the hash map ?
To make an object as the key of hash map we need to refer to its working.
Since hash map stores the object based on its hash code and retrieves it using the equals method, The first thing that we should do is overide the equals and hash method based on the equals and hash code contract which states:

1) if two objects are equal then there hash code is always same.
2) if two objects hash code are same then they may or may not be equal.

Secondly since equal and hashcode are made using the state of an object they must ideally be immutable. As if we add mutable objects as key then there is a chance that they might never be retrieved.

No comments:

Post a Comment