Hazelcast map: using equals for Map keys - dictionary

I'm using Hazelcast 2.6.
I have a map where the keys are Objects.
As I read in the Hazelcast documentation (http://hazelcast.org/docs/latest/javadoc/com/hazelcast/core/IMap.html):
"This class is not a general-purpose ConcurrentMap implementation! While this class implements the Map interface, it intentionally violates Map's general contract, which mandates the use of the equals method when comparing objects. Instead of the equals method this implementation compares the serialized byte version of the objects."
Is there a way to force Hazelcast to use equals instead of the serialized byte version of the objects?

I found the answer:
In Hazelcast you can't rely on the equals/hashCode defined on the key object.
You have to use key objects that contain just the attributes that make them unique.
From here: http://hazelcast.org/mastering-hazelcast/chapter-5/#hashcode-and-equals
In most cases you will probably use a basic type like a Long, Integer or String as the key, but in some cases you will need to create custom keys. To do this correctly in Hazelcast, you need to understand how this mechanism works, because it works differently from traditional map implementations. When you store a key/value pair in a Hazelcast map, instead of storing the objects themselves, the objects are serialized to byte arrays and those byte arrays are stored.
To use hash/equals in Hazelcast you need to know the following rules:
For keys: the hash/equals is determined based on the content of the byte array, so equal keys need to result in equal byte arrays (see the serialization chapter for a warning about Serializable).
For values: the hash/equals is determined based on the in-memory format: for BINARY the binary format is used; for OBJECT and CACHED the equals of the object is used.
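To make the "equal keys must produce equal byte arrays" rule concrete, here is a minimal sketch of a custom key class; the class and field names are illustrative, not from the question. Because the key holds a single primitive field, default Java serialization deterministically yields equal byte arrays for equal keys, which would not be guaranteed for a key containing, say, a HashMap.

import java.io.Serializable;

// Holds only the attribute that makes the key unique.
public final class CustomerKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long customerId;

    public CustomerKey(long customerId) {
        this.customerId = customerId;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CustomerKey && ((CustomerKey) o).customerId == customerId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(customerId); // consistent with equals
    }
}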

If you want your equals implementation to be used for values, you can try setting the in-memory format for the map to "Object". The data will then be stored in deserialized form. Per the rules above, keys are still compared by their serialized bytes.
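For reference, a sketch of the programmatic configuration. Note this uses the Hazelcast 3.x API (the InMemoryFormat enum); in 2.6 the setting and its names (BINARY/OBJECT/CACHED) differ, so treat this as illustrative rather than drop-in:

import com.hazelcast.config.Config;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ObjectFormatExample {
    public static void main(String[] args) {
        Config config = new Config();
        // Store values deserialized, so value comparisons use the object's equals().
        config.getMapConfig("myMap").setInMemoryFormat(InMemoryFormat.OBJECT);

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        hz.getMap("myMap").put("key", "value");
    }
}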

Related

Where is the mutability of Objects defined in ECMAScript?

In this question about the passing of arguments in JavaScript functions, we learn that everything is passed by value in JavaScript.
In Mozilla's documentation, it is mentioned that the primitive types are immutable while objects are mutable. Although I came from the procedural and structured programming school, I was able to pick up the concepts quickly.
In the ECMAScript standard, it is defined that "An Object is 'logically' a collection of properties". The standard also defines how objects may be compared, but leaves out what happens when an object goes through the GetValue() pseudo-function that converts references into values.
So, I gave an answer in the question basically saying that this area had been left undefined.
My Question
By "left undefined" I mean that it isn't philosophically clear what the value of an Object is. The standard has gone through a few revisions, and its size is ever increasing.
In short, an object is a collection, but what is the value of the collection? Is it the makeup of its content? Or is it individuality? Or have I been missing out on some important texts?
In the ECMAScript spec, every object is defined to have certain 'internal methods', some of which (e.g., [[DefineOwnProperty]] and [[Put]]) can change the state of the object. Ultimately, the mutability of objects is defined via the use of such internal methods.
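A small JavaScript illustration (ordinary code, not spec text): Object.defineProperty is specified in terms of [[DefineOwnProperty]], and assignment in terms of [[Put]], and both mutate the object's state in place.

const obj = {};
Object.defineProperty(obj, 'answer', { value: 42, writable: false }); // [[DefineOwnProperty]]
console.log(obj.answer); // 42
obj.answer = 7;          // [[Put]]: silently ignored here, since the property is non-writable
console.log(obj.answer); // still 42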
GetValue() doesn't leave out what happens to objects -- step #1 is:
If Type(V) is not Reference, return V.
So if you pass it an object, you get back the same object.
(Which refutes one of your premises, but I'm not sure it resolves your question.)
See section 4.3.26 "property" of the 5.1 edition. The note says:
Depending upon the form of the property the value may be represented either directly as a data value (a primitive value, an object, or a function object) or indirectly by a pair of accessor functions.
We can take this as meaning a data value is one of the following:
Primitive Value: such as C language double, _Bool, ((void*)0), etc.
An object: which can be interpreted as a special C language structure containing the underlying information about the object.
Function object: which is just a special case of 2, possibly the result of JIT compilation.
The reason this note on the definition of "property" is important is that everything, even function block scopes, is an object (or at least described in terms of one), and every object accessible from a JavaScript program is accessed as if it were the property of some other object. Therefore, if we can determine that "the value of an object" is its individuality rather than the makeup of its content, that determination applies uniformly across the language.
In section 4.2 "Language Overview", it says:
A primitive value is a member of one of the following built-in types: Undefined, Null, Boolean, Number, and String; an object is a member of the remaining built-in type Object; and a function is a callable object.
Although this is an informal section, it can be seen that an object differs from a primitive value in a significant way.
As an interpretation, let's consider that the value of an object is the object itself, as we can infer from the GetValue() pseudo-function. The overview says "an object is a member of the ... type Object"; therefore, the value is the membership in the Object type.
To use a physics analogy to explain the relationship between membership and individuality, consider two electrons. They are identical in content, and both are members of the Universe, yet they are two different individuals.
Therefore, we infer that - the value of a JavaScript Object, is its individuality.
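A quick sketch of this conclusion in ordinary JavaScript: two objects with identical contents are still two distinct individuals, and mutating an object's content does not change its identity.

const a = { x: 1 };
const b = { x: 1 };
console.log(a === b); // false: comparison is by identity, not content
console.log(a === a); // true: GetValue() hands back the very same object
a.x = 42;             // mutate the content...
console.log(a === a); // ...the individuality is unchanged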
Finally, as to the question asked in the title: the mutability of individual objects is defined in terms of a series of specification pseudo-functions (the internal methods), while the immutability of the other types is defined through the definition of value membership in types and the specification pseudo-functions that operate on primitive values.

Why `additionalProperties` is the way to represent Dictionary/Map in Swagger/OpenAPI 2.0

Although I have seen the examples in the OpenAPI spec:
type: object
additionalProperties:
  $ref: '#/definitions/ComplexModel'
it isn't obvious to me why the use of additionalProperties is the correct schema for a Map/Dictionary.
It also doesn't help that the only concrete thing that the spec has to say about additionalProperties is:
The following properties are taken from the JSON Schema definition but their definitions were adjusted to the Swagger Specification. Their definition is the same as the one from JSON Schema, only where the original definition references the JSON Schema definition, the Schema Object definition is used instead.
items
allOf
properties
additionalProperties
Chen, I think your answer is correct.
Some further background that might be helpful:
In JavaScript, which was the original context for JSON, an object is like a hash map of strings to values, where some values are data, others are functions. You can think of each name-value pair as a property. But JavaScript doesn't have classes, so the property names are not predefined, and each object can have its own independent set of properties.
JSON Schema uses the properties keyword to validate name-value pairs that are known in advance; and uses additionalProperties (or patternProperties, not supported in OpenAPI 2.0) to validate properties that are not known.
For clarity:
The property names, or "keys" in the map, must be strings. They cannot be numbers, or any other value.
As you said, the property names should be unique. Unfortunately the JSON spec doesn't strictly require uniqueness, but uniqueness is recommended, and expected by most JSON implementations. More background here.
properties and additionalProperties can be used alone or in combination. When additionalProperties is used alone, without properties, the object essentially functions as a map<string, T> where T is the type described in the additionalProperties sub-schema. Maybe that helps to answer your original question.
When evaluating an object against a single schema, if a property name matches one of those specified in properties, its value only needs to be valid against the sub-schema provided for that property. The additionalProperties sub-schema, if provided, will only be used to validate properties that are not included in the properties map.
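A short YAML sketch of the combined case (the property names are invented for illustration): id is validated against the sub-schema in properties, and every other property falls through to additionalProperties, so the object behaves like a fixed record plus a map<string, string>.

type: object
properties:
  id:
    type: integer
additionalProperties:
  type: string

An instance such as {"id": 1, "env": "prod", "tier": "gold"} is valid, while {"id": "oops"} is not.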
There are some limitations of additionalProperties as implemented in Swagger's core Java libraries. I've documented these limitations here.
First thing, I found a better explanation for additionalProperties:
For an object, if this is given, in addition to the properties defined in properties all other property names are allowed. Their values must each match the schema object given here. If this is not given, no other properties than those defined in properties are allowed.
So here is how I finally understood this:
Using properties, we can define a known set of properties, similar to Python's namedtuple. However, if we wish to have something more like Python's dict, or any other hash/map where we can't specify in advance how many keys there are or what they are, we should use additionalProperties.
additionalProperties will match any property name (which will act as the dict's key), and the $ref or type will be the schema of the dict's value. Since there should not be more than one property with the same name on any given object, we get the enforcement of unique keys.
Note that unlike Python's dict, which accepts any immutable value as a key, the keys here are in essence property names, so they must be strings (thanks to Ted Epstein for that clarification). This limitation can be traced to pair := string : value in the JSON specification.

Key Value Pair Collection, where the values can be of multiple types?

Tl;dr I'd like to set up a collection of key value pairs, where the values can be of multiple types, but are also serializable with ISerializable. Is this possible, and if so how can I go about achieving it?
I'm trying to replace code in an existing system where Hashtables are stored in Session variables, allowing developers to store multiple types of object within them against specified keys. I'm attempting to convert the system so it can use SqlServer SessionState (storing the Session data in a database rather than in memory), which requires that everything added to the Session object implement ISerializable.
Hopefully this could be achieved somehow using my own generic data class that wraps the objects of multiple types in the collection? I just can't quite see how (I've used plenty of generic collections, but never set up my own generic classes, so I'm struggling to see how I'd do this).
Many thanks in advance for any advice on possible approaches.
Basically, you want Dictionary<TKey, TValue> where TValue has to implement ISerializable. Since Dictionary already provides all the implementation you need and is merely too permissive about the TValue type, it is enough to create a class that inherits from Dictionary and adds a generic constraint:
[Serializable]
class DictionaryOfSerializables<TKey, TValue> : Dictionary<TKey, TValue>
    where TValue : ISerializable
{ }
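Two caveats, plus a usage sketch. Derived classes do not inherit [Serializable] (hence the attribute above), and BinaryFormatter, which the out-of-process session providers rely on, restores the base Dictionary through its serialization constructor, so it is worth forwarding it. MyValue below is a hypothetical ISerializable implementation:

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

[Serializable]
class DictionaryOfSerializables<TKey, TValue> : Dictionary<TKey, TValue>
    where TValue : ISerializable
{
    public DictionaryOfSerializables() { }

    // Forwarded so deserialization can rebuild the base Dictionary's state.
    protected DictionaryOfSerializables(SerializationInfo info, StreamingContext context)
        : base(info, context) { }
}

// Usage, e.g. in an ASP.NET page (MyValue is hypothetical):
// var store = new DictionaryOfSerializables<string, MyValue>();
// store["profile"] = new MyValue();
// Session["store"] = store;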

JSON.net ContractResolver vs. JsonConverter

I've been working with JSON.net for a while. I have written both custom converters and custom contract resolvers (generally from modifying examples on S.O. and the Newtonsoft website), and they work fine.
The challenge is that, other than examples, I see little explanation as to when I should use one or the other (or both) for processing. Through my own experience, I've basically determined that contract resolvers are simpler, so if I can do what I need with them, I go that way; otherwise, I use custom JsonConverters. But I also know the two are sometimes used together, which makes the concepts even more opaque.
Questions:
Is there a source that distinguishes when to use one vs. the other? I find the Newtonsoft documentation unclear as to how the two are differentiated or when to use one or the other.
What is the order of processing between the two?
Great question. I haven't seen a clear piece of documentation that says when you should prefer to write a custom ContractResolver or a custom JsonConverter to solve a particular type of problem. They really do different things, but there is some overlap between what kinds of problems can be solved by each. I've written a fair number of each while answering questions on StackOverflow, so the picture has become a little more clear to me over time. Below is my take on it.
ContractResolver
A contract resolver is always used by Json.Net, and governs serialization / deserialization behavior at a broad level. If there is not a custom resolver provided in the settings, then the DefaultContractResolver is used. The resolver is responsible for determining:
what contract each type has (i.e. is it a primitive, array/list, dictionary, dynamic, JObject, plain old object, etc.);
what properties are on the type (if any) and what are their names, types and accessibility;
what attributes have been applied (e.g. [JsonProperty], [JsonIgnore], [JsonConverter], etc.), and
how those attributes should affect the (de)serialization of each property (or class).
Generally speaking, if you want to customize some aspect of serialization or deserialization across a wide range of classes, you will probably need to use a ContractResolver to do it. Here are some examples of things you can customize using a ContractResolver (a short resolver sketch follows the list):
Change the contract used for a type
Serialize all Dictionaries as an Array of Key/Value Pairs
Serialize ListItems as a regular object instead of string
Change the casing of property names when serializing
Use camel case for all property names
Camel case all property names except dictionaries
Programmatically apply attributes to properties without having to modify the classes (particularly useful if you don't control the source of said classes)
Globally use a JsonConverter on a class without the attribute
Remap properties to different names defined at runtime
Allow deserializing to public properties with non-public setters
Programmatically unapply (ignore) attributes that are applied to certain classes
Optionally turn off the JsonIgnore attribute at runtime
Make properties which are marked as required (for SOAP) not required for JSON
Conditionally serialize properties
Ignore read-only properties across all classes
Skip serializing properties that throw exceptions
Introduce custom attributes and apply some custom behavior based on those attributes
Encrypt specially marked string properties in any class
Selectively escape HTML in strings during deserialization
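To make one of these concrete, here is a minimal sketch of a resolver that ignores read-only properties across all classes. The class name is mine, but CreateProperties is the standard DefaultContractResolver extension point:

using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

class WritablePropertiesOnlyResolver : DefaultContractResolver
{
    protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
    {
        // Build the default property list, then keep only the writable ones.
        IList<JsonProperty> properties = base.CreateProperties(type, memberSerialization);
        return properties.Where(p => p.Writable).ToList();
    }
}

// Usage: pass the resolver in through the serializer settings.
// var settings = new JsonSerializerSettings { ContractResolver = new WritablePropertiesOnlyResolver() };
// string json = JsonConvert.SerializeObject(someObject, settings);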
JsonConverter
In contrast to a ContractResolver, the focus of a JsonConverter is narrower: it is really intended to handle serialization or deserialization for a single type or a small subset of related types. Also, it works at a lower level than a resolver does. When a converter is given responsibility for a type, it has complete control over how the JSON is read or written for that type: it directly uses the JsonReader and JsonWriter classes to do its job. In other words, it can change the shape of the JSON for that type. At the same time, a converter is decoupled from the "big picture" and does not have access to contextual information such as the parent of the object being (de)serialized or the property attributes that were used with it. Here are some examples of problems you can solve with a JsonConverter (a minimal converter sketch follows the list):
Handle object instantiation issues on deserialization
Deserialize to an interface, using information in the JSON to decide which concrete class to instantiate
Deserialize JSON that is sometimes a single object and sometimes an array of objects
Deserialize JSON that can either be an array or a nested array
Skip unwanted items when deserializing from an array of mixed types
Deserialize to an object that lacks a default constructor
Change how values are formatted or interpreted
Serialize decimal values as localized strings
Convert decimal.MinValue to an empty string and back (for use with a legacy system)
Serialize dates with multiple different formats
Ignore UTC offsets when deserializing dates
Make Json.Net call ToString() when serializing a type
Translate between differing JSON and object structures
Deserialize a nested array of mixed values into a list of items
Deserialize an array of objects with varying names
Serialize/deserialize a custom dictionary with complex keys
Serialize a custom IEnumerable collection as a dictionary
Flatten a nested JSON structure into a simpler object structure
Expand a simple object structure into a more complicated JSON structure
Serialize a list of objects as a list of IDs only
Deserialize a JSON list of objects containing GUIDs to a list of GUIDs
Work around issues (de)serializing specific .NET types
Serializing System.Net.IPAddress throws an exception
Problems deserializing Microsoft.Xna.Framework.Rectangle
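To make the shape of a converter concrete, here is a minimal sketch in the spirit of the formatting examples above. Money is a hypothetical type invented for illustration; CanConvert, WriteJson and ReadJson are the standard JsonConverter overrides.

using System;
using System.Globalization;
using Newtonsoft.Json;

class Money
{
    public decimal Amount { get; set; }
    public string Currency { get; set; }
}

class MoneyConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) => objectType == typeof(Money);

    // Full control over the JSON shape: write "12.34 USD" instead of an object.
    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var money = (Money)value;
        writer.WriteValue(string.Format(CultureInfo.InvariantCulture,
            "{0} {1}", money.Amount, money.Currency));
    }

    // Read the string back and rebuild the object.
    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var parts = ((string)reader.Value).Split(' ');
        return new Money
        {
            Amount = decimal.Parse(parts[0], CultureInfo.InvariantCulture),
            Currency = parts[1]
        };
    }
}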

Hashtable keys() vs. keySet(): which is better?

Out of curiosity, which is the better method to use: Hashtable.keys() or Hashtable.keySet()? Either one would have been sufficient, so why are there two methods with different return types? Is there any performance drawback or benefit of one over the other?
keySet is there because
it returns a Set view of the keys contained in this Hashtable. The Set is backed by the Hashtable, so changes to the Hashtable are reflected in the Set, and vice-versa. The Set supports element removal (which removes the corresponding entry from the Hashtable), but not element addition.
And keys just returns an Enumeration of the keys in this hashtable; it supports no removal, and changes made to the Hashtable after obtaining the Enumeration are not guaranteed to be reflected in it.
Besides the functional difference mentioned by Rahul, Hashtable itself is an old artifact of earlier Java versions that was retrofitted to implement the Map interface.
So keySet is a later construct required by the Map interface.
Additionally, if this is new code that you are writing, you should read up on the API details for this data structure at http://docs.oracle.com/javase/7/docs/api/java/util/Hashtable.html and consider following its guideline to use HashMap or the other newer collections instead.
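A small sketch showing the two views side by side (the map contents are illustrative):

import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Set;

public class KeysVsKeySet {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        // Legacy JDK 1.0 API: an Enumeration with no remove operation.
        Enumeration<String> keys = table.keys();
        while (keys.hasMoreElements()) {
            System.out.println(keys.nextElement());
        }

        // Map interface API: a live Set view backed by the table.
        Set<String> keySet = table.keySet();
        keySet.remove("a");        // also removes the entry from the table
        System.out.println(table); // prints {b=2}
    }
}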
