Why Concurrent hash map does not allow null key or null values

Why Concurrent hash map does not allow null key or null values - concurrenthashmap

I have read many articles but unable to understand the reason why Concurrent hash map does not allow null key or null values.
Some articles gives this explanation:
if (m.containsKey(k)) {
return m.get(k);
} else {
throw new KeyNotPresentException();
}
Since m is a concurrent map, key k may be deleted between the containsKey and get calls, causing this snippet to return a null that was never in the table, rather than the desired KeyNotPresentException.
But, same would be the case with not-null key.
Can anyone please explain the reason for the same.

The main reason that nulls aren't allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can't be accommodated. ... get(key) returns null , you can't detect whether the key explicitly maps to null vs the key isn't mapped.

Related

Checking for existence of a field in firebase real-time database

For firebase functions, we may sometimes need to check if a field in the real-time database exists, e.g., we may be looping through all the user records in the real-time database, and only some of the users may have a field, say, "car". As I understand it from the selected answer in the SO post, we can either call exists() or check for null.
The first one would be to check car for existence:
if(dataSnapshot.child("car").exists()) {
//Do something
}
The second one would be to check the car field for nullness:
if(dataSnapshot.child("car").val() != null) {
//Do the other thing
}
The question is, how do these two ways compare, firstly, in terms of results? exists() === false is the same as null and exists() === true is the same as !null, or are there any special cases to consider? Secondly, how about performance? Is a null check faster, does it use fewer resources (memory, etc.), compared to the function call that is exists()? Or is it purely a matter of preference/taste?

The answer can be given by taking a look at the API source code.
Let's start with the .exists() method. At this link (row 361), you can see that the exists method perform a call to the underlying isEmpty() method.
In the same file, at row 462, we can see the implementation of the val() method: it call the val() method of the inner node. Well, let's then get a look at it: at this link, at row 197, you can see that the very first thing that the val() method will do is calling the isEmpty() method and, if it is true, it will return a null value.
This tells us that the two methods (using exists() or evaluating the return value of val() and checking if it is null or not) gives the same result.
Let's get to the performance side: both ways performs the same. If you're going to call these methods intensively, exists() will perform less inner calls, so it should be the preferred one.

NoSuchMethodError: The getter 'hash' was called on null [duplicate]

I have some code and when I run it produces an error, saying:
NoSuchMethod: the method 'XYZ' was called on null
What does that mean and how do I fix it?

Why do I get this error?
Example
As a real world comparison, what just happened is this conversation:
Hey, how much gas is left in the tank of the car?
What are you talking about, we don't have a car.
That is exactly what is happening in your program. You wanted to call a function like _car.getGasLevel(); but there is no car, the variable _car is null.
Obviously, in your program it might not be a car. It could be a list or a string or anything else really.
Technical explanation
You are trying to use a variable that is null. Either you have explicitly set it to null, or you just never set it at all, the default value is null.
Like any variable, it can be passed into other functions. The place where you get the error might not be the source. You will have to follow the leads from the actual null value to where it originally came from, to find what the problem is and what the solution might be.
null can have different meanings: variables not set to another value will be null, but sometimes null values are used by programmers intentionally to signal that there is no value. Databases have nullable fields, JSON has missing values. Missing information may indeed be the information itself. The variable bool userWantsPizzaForDinner; for example might be used for true when the user said yes, false when the user declined and it might still be null when the user has not yet picked something. That's not a mistake, it's intentionally used and needs to be handled accordingly.
How do I fix it?
Find it
Use the stack trace that came with the error message to find out exactly which line the error was on. Then set a breakpoint on that line. When the program hits the breakpoint, inspect all the values of the variables. One of them is null, find out which one.
Fix it
Once you know which variable it is, find out how it ended up being null. Where did it come from? Was the value never set in the first place? Was the value another variable? How did that variable got it's value. It's like a line of breadcrumbs you can follow until you arrive at a point where you find that some variable was never set, or maybe you arrive at a point where you find that a variable was intentionally set to null. If it was unintentional, just fix it. Set it to the value you want it to have. If it was intentional, then you need to handle it further down in the program. Maybe you need another if to do something special for this case. If in doubt, you can ask the person that intentionally set it to null what they wanted to achieve.

simply the variable/function you are trying to access from the class does not exist
someClass.xyz();
above will give the error
NoSuchMethod: the method 'xyz' was called on null
because the class someClass does not exist
The following will work fine
// SomeClass created
// SomeClass has a function xyz
class SomeClass {
SomeClass();
void xyz() {
print('xyz');
}
}
void main() {
// create an instance of the class
final someClass = SomeClass();
// access the xyz function
someClass.xyz();
}

Firebase RTD, atomic "move" ... delete and add from two "tables"?

In Firebase Realtime Database, it's a pretty common transactional thing that you have
"table" A - think of it as "pending"
"table" B - think of it as "results"
Some state happens, and you need to "move" an item from A to B.
So, I certainly mean this would likely be a cloud function doing this.
Obviously, this operation has to be atomic and you have to be guarded against racetrack effects and so on.
So, for item 123456, you have to do three things
read A/123456/
delete A/123456/
write the value to B/123456
all atomically, with a lock.
In short what is the Firebase way to achieve this?
There's already the awesome ref.transaction system, but I don't think it's relevant here.
Perhaps using triggers in a perverted manner?
IDK
Just for anyone googling here, it's worth noting that the mind-boggling new Firestore (it's hard to imagine anything being more mind-boggling than traditional Firebase, but there you have it...), the new Firestore system has built-in .......
This question is about good old traditional Firebase Realtime.

Gustavo's answer allows the update to happen with a single API call, which either complete succeeds or fails. And since it doesn't have to use a transaction, it has much less contention issues. It just loads the value from the key it wants to move, and then writes a single update.
The problem is that somebody might have modified the data in the meantime. So you need to use security rules to catch that situation and reject it. So the recipe becomes:
read the value of the source node
write the value to its new location while deleting the old location in a single update() call
the security rules validate the operation, either accepting or rejecting it
if rejected, the client retries from #1
Doing so essentially reimplements Firebase Database transactions with client-side code and (some admittedly tricky) security rules.
To be able to do this, the update becomes a bit more tricky. Say that we have this structure:
"key1": "value1",
"key2": "value2"
And we want to move value1 from key1 to key3, then Gustavo's approach would send this JSON:
ref.update({
"key1": null,
"key3": "value1"
})
When can easily validate this operation with these rules:
".validate": "
!data.child("key3").exists() &&
!newData.child("key1").exists() &&
newData.child("key3").val() === data.child("key1").val()
"
In words:
There is currently no value in key3.
There is no value in key1 after the update
The new value of key3 is the current value of key1
This works great, but unfortunately means that we're hardcoding key1 and key3 in our rules. To prevent hardcoding them, we can add the keys to our update statement:
ref.update({
_fromKey: "key1",
_toKey: "key3",
key1: null,
key3: "value1"
})
The different is that we added two keys with known names, to indicate the source and destination of the move. Now with this structure we have all the information we need, and we can validate the move with:
".validate": "
!data.child(newData.child('_toKey').val()).exists() &&
!newData.child(newData.child('_fromKey').val()).exists() &&
newData.child(newData.child('_toKey').val()).val() === data.child(newData.child('_fromKey').val()).val()
"
It's a bit longer to read, but each line still means the same as before.
And in the client code we'd do:
function move(from, to) {
ref.child(from).once("value").then(function(snapshot) {
var value = snapshot.val();
updates = {
_fromKey: from,
_toKey: to
};
updates[from] = null;
updates[to] = value;
ref.update(updates).catch(function() {
// the update failed, wait half a second and try again
setTimeout(function() {
move(from, to);
}, 500);
});
}
move ("key1", "key3");
If you feel like playing around with the code for these rules, have a look at: https://jsbin.com/munosih/edit?js,console

There are no "tables" in Realtime Database, so I'll use the term "location" instead to refer to a path that contains some child nodes.
Realtime Database provides no way to atomically transaction on two different locations. When you perform a transaction, you have to choose a single location, and you may only make changes under that single location.
You might think that you could just transact at the root of the database. This is possible, but those transactions may fail in the face of concurrent non-transaction write operations anywhere within the database. It's a requirement that there must be no non-transactional writes anywhere at the location where transactions take place. In other words, if you want to transact at a location, all clients must be transacting there, and no clients may write there without a transaction.
This rule is certainly going to be problematic if you transact at the root of your database, where clients are probably writing data all over the place without transactions. So, if you want perform an atomic "move", you'll either have to make all your clients use transactions all the time at the common root location for the move, or accept that you can't do this truly atomically.

Firebase works with Dictionaries, a.k.a, key-value pair. And to change data in more than one table on the same transaction you can get the base reference, with a dictionary containing "all the instructions", for instance in Swift:
let reference = Database.database().reference() // base reference
let tableADict = ["TableA/SomeID" : NSNull()] // value that will be deleted on table A
let tableBDict = ["TableB/SomeID" : true] // value that will be appended on table B, instead of true you can put another dictionary, containing your values
You should then merge (how to do it here: How do you add a Dictionary of items into another Dictionary) both dictionaries into one, lets call it finalDict,
then you can update those values, and both tables will be updated, deleting from A and "moving to" B
reference.updateChildValues(finalDict) // update everything on the same time with only one transaction, w/o having to wait for one callback to update another table

Why are the keys and values of a borrowed HashMap accessed by reference, not value?

I have a function that takes a borrowed HashMap and I need to access values by keys. Why are the keys and values taken by reference, and not by value?
My simplified code:
fn print_found_so(ids: &Vec<i32>, file_ids: &HashMap<u16, String>) {
for pos in ids {
let whatever: u16 = *pos as u16;
let last_string: &String = file_ids.get(&whatever).unwrap();
println!("found: {:?}", last_string);
}
}
Why do I have to specify the key as a reference, i.e., file_ids.get(&whatever).unwrap() instead of file_ids.get(whatever).unwrap()?
As I understand it, the last_string has to be of type &String, meaning a borrowed string, because the owning collection is borrowed. Is that right?
Similar to the above point, am I correct in assuming pos is of type &u16 because it takes borrowed values from ids?

Think about the semantics of passing parameters as references or as values:
As reference: no ownership transfer. The called function merely borrows the parameter.
As value: the called function takes ownership of the parameter and may not be used by the caller anymore.
Since the function HashMap::get does not need ownership of the key to find an element, the less restrictive passing method was chosen: by reference.
Also, it does not return the value of the element, only a reference. If it returned the value, the value inside the HashMap would no longer be owned by the HashMap and thus be inaccessible in the future.

TL;DR: Rust is not Java.
Rust may have high-level constructs, and data-structures, but it is at heart a low-level language, as illustrated by one of its guiding principle: You don't pay for what you don't use.
As a result, the language and its libraries will as much as possible attempt to eliminate any cost that is superfluous, such as allocating memory needlessly.
Case 1: Taking the key by value.
If the key is a String, this means allocating (and deallocating) memory for each and every look-up, when you could use a local buffer that is only allocated once and for all.
Case 2: Returning by value.
Returning by value means that either:
you remove the entry from the container to give it to the user
you copy the entry in the container to give it to the user
The latter is obviously inefficient (copy means allocation), the former means that if the user wants the value back in another insertion has to take place again, which means look-up etc... and is also inefficient.
In short, returning by value is inefficient in this case.
Rust, therefore, takes the most logical choice as far as efficiency is concerned and passes and returns by value whenever practical.

While it seems unhelpful when the key is a u16, think about how it would work with a more complex key such as a String.
In that case taking the key by value would often mean having to allocate and initialise a new String for each lookup, which would be expensive.

How to avoid race conditions on cursor.observe?

Race condiditions
In my Meteor application, I made an observe within a publish, that insert some new data in certain conditions. The point is that sometimes we have duplicated subscriptions, and race condition leads us to duplicate inserted data.
If it is not possible to have "singleton observers":
How can we avoid race conditions and duplicated inserted data on database?
Example:
Meteor.publish("fortuneUpdate", function () {
var selector = {user: this.userId, seen:false};
DailyFortunes.find(selector).observe({
removed: function(doc, beforeIndex){
if(DailyFortunes.find(selector).count()<1)
createDailyFortune(this.userId);
}
});
}
This question has been moved from How cursor.observe works and how to avoid multiple instances running?

According to Tom, it is not possible, for now, to ensure that calls to subscribe that have the same arguments are shared.
So, if you are having the same problem I had, of redundant data created inside observers, I suggest you, as workaround, to:
Create robust indexes that prevent repeted data creating. Compound Keys is probable what you need here.
Treat duplicate key error exceptions inside your observer ignoring race conditions.
example:
Collection.find(selector).observe({
removed: function(document){
try {
// Workaround to avoid race conditions > https://stackoverflow.com/q/13095647/599991
createNewDocument();
} catch (e) {
// XXX string parsing sucks, maybe
// https://jira.mongodb.org/browse/SERVER-3069 will get fixed one day
if (e.name !== 'MongoError') throw e;
var match = e.err.match(/^E11000 duplicate key error index: ([^ ]+)/);
if (!match) throw e;
//if match, just do nothing.
}
self.flush();
}
});

This is an odd pattern. Can you share some example code?
Generally I'd either expect to see mutations in a method, or setting up an observe inside Meteor.startup() on the server. (The latter is tricky if you're running multiple server processes, but so are many other things in a multi process regime. We'll have a better pattern down the line.)
Because it can be arbitrary JS, a publish function has to run once per subscribing client. It may log new subscriptions, set up per-client server state, or vary its behavior based on this.userId or even a random source. For example, consider a subscription that returns 10 randomly selected documents from a DB collection to each subscribed client!
So the place to optimize the case of many clients subscribing to the same data set is at the DB query layer: if a thousand clients are subscribed to the same DB query, we'll just run that underlying query once.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex