How HashMap works internally - collections

HashMap objHashMap = new HashMap();
objHashMap.put("key1", "Value1");
objHashMap.put("key1", "Value2");
System.out.println(objHashMap.get("key1"));
The code above prints "Value2". How and why?

check this
/**
* Associates the specified value with the specified key in this map. If the
* map previously contained a mapping for the key, the old value is
* replaced.
*
* @param key
* key with which the specified value is to be associated
* @param value
* value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or <tt>null</tt>
* if there was no mapping for <tt>key</tt>. (A <tt>null</tt> return
* can also indicate that the map previously associated
* <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K, V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
Let's note down the steps one by one:
1) First, the key object is checked for null. If the key is null, the value is stored at table[0], because the hash code for null is always treated as 0.
2) Next, a hash value is calculated from the key's hash code, obtained by calling its hashCode() method. This hash value is used to calculate the array index for storing the Entry object. The JDK designers assumed that there might be poorly written hashCode() functions whose results differ only in the higher bits; such values would all land in the same bucket once the index is taken from the lower bits. To guard against this, they introduced a supplemental hash() function that mixes the higher bits of the object's hash code into the lower ones.
3) Now indexFor(hash, table.length) is called to calculate the exact index position for storing the Entry object. This is the step that brings the hash value into the range of the array.
4) Here comes the main part. As we know, two unequal objects can have the same hash code value, so how are two different objects stored in the same array location (called a bucket)?
The answer is a linked list. If you remember, the Entry class has an attribute "next". This attribute always points to the next object in the chain, which is exactly how a singly linked list behaves (this is not java.util.LinkedList).
So, in case of a collision, Entry objects are stored as a linked list. When an Entry object needs to be stored at a particular index, HashMap checks whether there is already an entry at that index. If there is none, the Entry object is stored in this location.
If there is already an entry at the calculated index, the chain is walked through the next references until next evaluates to null, checking each entry's key along the way. If no matching key is found, the new Entry object is added to the chain for that bucket.
What if we add another value with the same key as entered before? Logically, it should replace the old value. How is that done? After determining the index position, HashMap iterates over the linked list at that index and, for each Entry, compares the stored hash and then calls equals() on the key. Entries in the same bucket share an index and may even share a hash code, but equals() tests for true equality. If key.equals(k) is true, both keys are treated as the same key, and only the value inside the existing Entry object is replaced.
In this way, HashMap ensures the uniqueness of keys.
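The steps above can be sketched as a tiny chained map. This is only an illustration under stated assumptions, not the JDK source: the class name ChainedMapSketch, the helper spread(), the fixed capacity of 16 and the choice to link new entries at the head of the chain are all made up for the example, and resizing is left out.

```java
import java.util.Objects;

/** Simplified chained hash map for illustration only; not the JDK implementation. */
class ChainedMapSketch<K, V> {
    /** One node in a bucket's singly linked chain. */
    private static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Assumption: fixed power-of-two capacity, no resizing.
    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];

    /** Hypothetical stand-in for hash(): null hashes to 0, high bits are mixed into low bits. */
    private static int spread(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h ^ (h >>> 16);
    }

    /** Maps a hash to a bucket index; works because the table length is a power of two. */
    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    V put(K key, V value) {
        int hash = spread(key);
        int i = indexFor(hash);
        // Walk the chain: same hash and equal key means "replace the value".
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // No matching key: link a new entry into the bucket (at the head, in this sketch).
        table[i] = new Entry<>(hash, key, value, table[i]);
        return null;
    }

    V get(Object key) {
        int hash = spread(key);
        for (Entry<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedMapSketch<String, String> map = new ChainedMapSketch<>();
        System.out.println(map.put("key1", "Value1")); // null: no previous mapping
        System.out.println(map.put("key1", "Value2")); // Value1: the old value is returned
        System.out.println(map.get("key1"));           // Value2
        // "Aa" and "BB" have the same hashCode (2112), so they share a bucket.
        map.put("Aa", "first");
        map.put("BB", "second");
        System.out.println(map.get("Aa") + " " + map.get("BB")); // first second
    }
}
```

The second put("key1", ...) finds the existing entry through the hash comparison plus equals() and only swaps the value, which is why the original snippet prints "Value2".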

Because a HashMap stores only one value per unique key, you can't put two entries with the same key into it; when you do, you overwrite the value for that key. So if you want to store two different values, you need to store them under two different keys.
HashMap objHashMap = new HashMap();
objHashMap.put("key1", "Value1");
objHashMap.put("key2", "Value2"); //CHANGED THIS KEY to "key2"
System.out.println(objHashMap.get("key1"));

Related

Java 8 Map merge VS compute, essential difference?

It seems both the merge and compute Map methods were created to reduce if ("~key exists here~") checks when putting.
My problem is: add a [key, value] pair to a map when I know nothing about its state: whether the key is absent, present with a value, present with a null value, or null itself.
words.forEach(word ->
map.compute(word, (w, prev) -> prev != null ? prev + 1 : 1)
);
words.forEach(word ->
map.merge(word, 1, (prev, one) -> prev + one)
);
Is the only difference that the 1 is moved from the BiFunction to a parameter?
Which is better to use? Does either merge or compute suggest that the key/value already exists?
And what is the essential difference between their use cases?
The documentation of Map#compute(K, BiFunction) says:
Attempts to compute a mapping for the specified key and its current mapped value (or null if there is no current mapping). For example, to either create or append a String msg to a value mapping:
map.compute(key, (k, v) -> (v == null) ? msg : v.concat(msg))
(Method merge() is often simpler to use for such purposes.)
If the remapping function returns null, the mapping is removed (or remains absent if initially absent). If the remapping function itself throws an (unchecked) exception, the exception is rethrown, and the current mapping is left unchanged.
The remapping function should not modify this map during computation.
And the documentation of Map#merge(K, V, BiFunction) says:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null. This method may be of use when combining multiple mapped values for a key. For example, to either create or append a String msg to a value mapping:
map.merge(key, msg, String::concat)
If the remapping function returns null, the mapping is removed. If the remapping function itself throws an (unchecked) exception, the exception is rethrown, and the current mapping is left unchanged.
The remapping function should not modify this map during computation.
The important differences are:
For compute(K, BiFunction<? super K, ? super V, ? extends V>):
The BiFunction is always invoked.
The BiFunction accepts the given key and the current value, if any, as arguments and returns a new value.
Meant for taking the key and current value (if any), performing an arbitrary computation, and returning the result. The computation may be a reduction operation (i.e. merge) but it doesn't have to be.
For merge(K, V, BiFunction<? super V, ? super V, ? extends V>):
The BiFunction is invoked only if the given key is already associated with a non-null value.
The BiFunction accepts the current value and the given value as arguments and returns a new value. Unlike with compute, the BiFunction is not given the key.
Meant for taking two values and reducing them into a single value.
If the mapping function, as in your case, only depends on the current mapped value, then you can use either. But I would prefer:
compute if you can guarantee that a value for the given key exists. In this case the extra value parameter taken by the merge method is not needed.
merge if it is possible that no value for the given key exists. In this case merge has the advantage that null does NOT have to be handled by the mapping function.
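To make the "always invoked" versus "invoked only for an existing non-null value" difference visible, here is a small sketch that counts how often each remapping function runs. The class name MergeVsComputeDemo and the calls counter are made up for the illustration; only Map.compute and Map.merge come from the JDK.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class MergeVsComputeDemo {
    /** Word count via compute: the function runs for every word and must handle null. */
    static Map<String, Integer> countWithCompute(List<String> words, AtomicInteger calls) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.compute(word, (w, prev) -> {
                calls.incrementAndGet();
                return prev == null ? 1 : prev + 1;
            });
        }
        return map;
    }

    /** Word count via merge: the function runs only when the key already has a value. */
    static Map<String, Integer> countWithMerge(List<String> words, AtomicInteger calls) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.merge(word, 1, (prev, one) -> {
                calls.incrementAndGet();
                return prev + one;
            });
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        AtomicInteger computeCalls = new AtomicInteger();
        AtomicInteger mergeCalls = new AtomicInteger();

        Map<String, Integer> viaCompute = countWithCompute(words, computeCalls);
        Map<String, Integer> viaMerge = countWithMerge(words, mergeCalls);

        System.out.println(viaCompute.equals(viaMerge)); // true: same counts either way
        System.out.println(computeCalls.get());          // 5: once per word
        System.out.println(mergeCalls.get());            // 2: only for the repeated "a"
    }
}
```

Both versions produce the same counts; merge simply skips the function on the first occurrence of each word and stores the given value directly.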

the relation between add key/value and mapassign method in go1.10.3

I am reading the source code of map in go1.10.3. It seems there are corresponding functions for operations such as:
makemap(t *maptype, hint int, h *hmap) *hmap ==> m = make(map[xx]yy)
mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) ==> m["key"]
but I can't find the corresponding function for the operation that adds a key/value pair, as below:
m["xx"] = "yy"
There is a function called mapassign that has some similarity to this operation.
mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer
This adds a new key to the map, but as we can see, its input arguments include no value. Another question: when the map already has this key, it may update the key.
if !alg.equal(key, k) {
    continue
}
// already have a mapping for key. Update it.
if t.needkeyupdate { // why??
    typedmemmove(t.key, k, key)
}
Since the two keys are equal, why should the key be updated?
Summary:
1. What is the relation between the add key/value operation and mapassign?
2. Why might mapassign need to update the key when the inserted key and the existing key are equal?
In the operation m[k] = v, the caller copies the value v to the address returned by mapassign.
The comments in the function needkeyupdate explain why some types need key updates: for floating point and complex keys, -0 and 0 compare equal but are different values; for strings, the new key might have a smaller backing store.

Groovy Map.get(key, default) mutates the map

I have the following Groovy script:
mymap = ['key': 'value']
println mymap
v = mymap.get('notexistkey', 'default')
println v
println mymap
When I run it I get following console output:
[key:value]
default
[key:value, notexistkey:default]
I'm surprised that after calling mymap.get('notexistkey', 'default') where second parameter is a default value returned when given key does not exist, the key notexistkey is added to the map I've called the method on. Why? Is that expected behaviour? How can I prevent this mutation?
Use Java's Map.getOrDefault(key, value) instead:
mymap = ['key': 'value']
println mymap
v = mymap.getOrDefault('notexistingkey', 'default')
println v
println mymap
Output:
[key:value]
default
[key:value]
The Groovy SDK adds Map.get(key, default) via DefaultGroovyMethods.get(map, key, default), and if you look at what its Javadoc says, you will see that this behaviour is expected:
Looks up an item in a Map for the given key and returns the value - unless there is no entry for the given key in which case add the default value to the map and return that.
And this is what implementation of this method looks like:
/**
* Looks up an item in a Map for the given key and returns the value - unless
* there is no entry for the given key in which case add the default value
* to the map and return that.
* <pre class="groovyTestCase">def map=[:]
* map.get("a", []) << 5
* assert map == [a:[5]]</pre>
*
* @param map a Map
* @param key the key to lookup the value of
* @param defaultValue the value to return and add to the map for this key if
* there is no entry for the given key
* @return the value of the given key or the default value, added to the map if the
* key did not exist
* @since 1.0
*/
public static <K, V> V get(Map<K, V> map, K key, V defaultValue) {
    if (!map.containsKey(key)) {
        map.put(key, defaultValue);
    }
    return map.get(key);
}
This is a pretty old concept (since Groovy 1.0). However, I would recommend not using it: the .get(key, default) operation is neither atomic nor synchronized. Problems start when you use it on a ConcurrentMap, which is designed for concurrent access; this method breaks its contract, because there is no synchronization between the containsKey, put and final get calls.
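For comparison, here is a minimal Java sketch of the two behaviours using only standard Map methods: getOrDefault for a read-only lookup, and computeIfAbsent when you do want the default inserted. The class and method names (DefaultLookupDemo, readOnly, insertIfAbsent) are made up for the example. One assumption to be aware of: computeIfAbsent treats a key mapped to null as absent, whereas the containsKey check above does not, so the two are not exact equivalents.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DefaultLookupDemo {
    /** Read-only lookup: returns the default but never modifies the map. */
    static <K, V> V readOnly(Map<K, V> map, K key, V defaultValue) {
        return map.getOrDefault(key, defaultValue);
    }

    /**
     * Insert-if-absent lookup done in a single call. On ConcurrentHashMap the
     * whole computeIfAbsent invocation is performed atomically.
     */
    static <K, V> V insertIfAbsent(Map<K, V> map, K key, V defaultValue) {
        return map.computeIfAbsent(key, k -> defaultValue);
    }

    public static void main(String[] args) {
        Map<String, String> map = new ConcurrentHashMap<>();
        map.put("key", "value");

        System.out.println(readOnly(map, "notexistkey", "default"));       // default
        System.out.println(map.containsKey("notexistkey"));                // false: map untouched

        System.out.println(insertIfAbsent(map, "notexistkey", "default")); // default
        System.out.println(map.containsKey("notexistkey"));                // true: default was stored
    }
}
```

Both methods are also callable from Groovy, since they are ordinary java.util.Map methods.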

how to compare two objects using viewstate

How do I compare two objects using ViewState?
What is the meaning of the line below?
if (!((byte[])ViewState["ROW"]).SequenceEqual(obj.RowID))
{
    return null;
}
Can anyone please help with this?
ViewState["ROW"] : This part retrieves the data stored in the ViewState under the key ROW
(byte[])ViewState["ROW"] : This part casts the data stored in ViewState to a byte array
SequenceEqual : An extension method from System.Linq that checks whether two sequences are the same
((byte[])ViewState["ROW"]).SequenceEqual(obj.RowID) : Compares the sequences in ViewState["ROW"] and obj.RowID
if (!((byte[])ViewState["ROW"]).SequenceEqual(obj.RowID)) : The enclosing method returns null if the sequences in ViewState["ROW"] and obj.RowID are not the same.
What is the meaning of the line below?
Basically, SequenceEqual is a LINQ Enumerable extension method designed to determine whether a source sequence (e.g. byte[]) is equal to another sequence.
Assuming you are comparing two byte arrays (sequences) in your code, if they contain equal elements in the same order, you will get true; otherwise, the result is false.
For example, the following sequences are equal, so SequenceEqual returns true:
byte[] chars1 = {56,32,12,32,65, 87};
byte[] chars2 = {56,32,12,32,65, 87};
bool res = chars1.SequenceEqual(chars2); // Will return true

how to design/create key for key/value storage?

I want to store serialized objects (or whatever) in a key/value cache.
Now I do something like this:
public string getValue(int param1, string param2, etc )
{
    string key = param1+"_"+param2+"_"+etc;
    string tmp = getFromCache(key); // look up using the key that was just built
    if (tmp == null)
    {
        tmp = getFromAnotherPlace();
        addToCache(key, tmp);
    }
    return tmp;
}
I think it can be awkward. How can I design the key?
If I understood the question, I think the simplest way to make a key is to use a one-way hash function such as MD5, SHA-1, etc.
There are at least two reasons for doing this:
The resulting key is unique for practical purposes (collisions are possible in principle, and both MD5 and SHA-1 have been broken with respect to collision resistance).
The resulting key has a fixed length!
You pass your object as the argument of the function and you get your key.
I don't know C# very well, but I am quite sure you can find a built-in one-way hash function.
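As a rough illustration of the hashing approach, here is a sketch in Java (not C#). Everything named here is an assumption made for the example: the class CacheKeySketch, the method cacheKey, the prefix parameter, and the choice of SHA-256 in place of MD5 or SHA-1. The string parameter is length-prefixed before hashing so that different parameter combinations cannot produce the same input text.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CacheKeySketch {
    /** Builds a fixed-length cache key: prefix + "_" + 64 hex characters of SHA-256. */
    static String cacheKey(String prefix, int param1, String param2) {
        // Length-prefix the string part so the boundaries between parameters are unambiguous.
        String raw = param1 + ":" + param2.length() + ":" + param2;
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(raw.getBytes(StandardCharsets.UTF_8));
            StringBuilder key = new StringBuilder(prefix).append('_');
            for (byte b : digest) {
                key.append(String.format("%02x", b & 0xff)); // two hex digits per byte
            }
            return key.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a standard algorithm name that Java platforms are required to support.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        String key = cacheKey("person", 33, "some long search filter");
        System.out.println(key.length()); // 71: "person" + "_" + 64 hex characters
        System.out.println(key.equals(cacheKey("person", 33, "some long search filter"))); // true
    }
}
```

The trade-off is that hashed keys are no longer human-readable, which makes inspecting the cache by hand harder than with a plain prefix + id key.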
First of all, your key seems to be composed of a lot of characters. Keep in mind that the key name also occupies memory (about 1 byte per character), so try to keep it as short as possible. I've seen situations where the key name was larger than the value, which can happen when you store an empty array or an empty value.
The key structure. I guess from your example that the object you want to store is identified by the params (one being the item id maybe, or maybe filters for a search [...]). Start with a prefix. The prefix should be the name of the object class (or a simplified name depicting the object in general).
Most of the time, keys will have a prefix + identifier. In your example you have multiple identifiers. If one of them is a unique id, go with only prefix + id and it should be enough.
If the object is large and you don't always use all of it, then change your strategy to multiple-key storage. Use one main key for storing the most common values, or for storing the list of the object's components, whose values are stored under separate keys. Make use of pipelining to fetch the whole object over one connection with a single "multiple" query:
mainKey = prefix + objectId;
object = getFromCache(mainKey);
startCachePipeline();
foreach (object[properties] as property) {
object->property = getFromCache(prefix + objectId + property);
}
endCachePipeline();
The structure for an example "Person" object would then be something like :
person_33 = array(
properties => array(age, height, weight)
);
person_33_age = 28;
person_33_height = 6;
person_33_weight = 150;
Memcached uses memory most efficiently when the objects stored in it are of similar sizes. The bigger the size difference between objects, the more memory is wasted (this is not about one large outlier object or isolated cases, although memory is wasted then as well).
Hope it helps!

Resources