Maximum number of elements in map - dictionary

What is the maximum number of elements that can be stored in a map in Go? If I need to access data from the map frequently, is it a good idea to keep adding items to the map and retrieving them in a long-running program?

There is no theoretical limit to the number of elements in a map other than the maximum value of the map-length type, which is int. The maximum value of int depends on the target architecture you compile for: it is 1<<31 - 1 = 2147483647 on 32-bit targets and 1<<63 - 1 = 9223372036854775807 on 64-bit targets.
Note that, as an implementation restriction, you may not be able to add exactly that many elements, but the order of magnitude will be the same.
Since the builtin map type uses a hashmap implementation, access time complexity is usually O(1), so it is perfectly fine to add many elements to a map; you can still access elements very fast. Note, however, that adding many elements causes rehashing and rebuilding of the internals, which requires some additional work; this happens occasionally as new keys are added to the map.
If you can "guess" or estimate the final size of your map, you can create it with a large initial capacity to avoid this rehashing. For example, you can create a map with space for a million elements like this:
m := make(map[string]int, 1e6)
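A minimal sketch of that pattern (the key and value types and the one-million size hint are just placeholders): preallocate once, then add, look up, and delete entries for as long as the program runs; the last line also prints the architecture-dependent maximum int mentioned above.
package main

import "fmt"

func main() {
    // Preallocate room for ~1e6 entries so early growth does not trigger rehashing.
    m := make(map[string]int, 1e6)

    // Typical long-running usage: keep adding, reading and deleting entries.
    m["requests"] = 1
    m["requests"]++ // update in place
    if v, ok := m["requests"]; ok {
        fmt.Println("requests =", v) // prints: requests = 2
    }
    delete(m, "requests") // drop keys you no longer need so the map does not grow forever

    fmt.Println("entries:", len(m), "| max int on this platform:", int(^uint(0)>>1))
}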

"A maximum number"? Practically no.
"A good idea"? Measure, there cannot be a general answer.

Related

Best NoSQL server structure for a large array which can be reordered

I have a large array that can be reordered and is stored in a NoSQL database (Firebase Realtime Database). I use Ionic 4's ion-reorder-group to sort it.
I am currently saving the data as follows:
{
  Data: {
    0: {
      N1: '',
      N2: '',
      N3: false
    },
    1: {
      N1: '',
      N2: '',
      N3: false
    },
    ...
  }
}
But by adding an object at the beginning, I have to update the full array on the server. The problem is that if I have 1,000 objects and add 10 new ones at the beginning, I end up writing 10,000 objects to the server.
So what is the best way to structure a large reorderable array? My idea was to work with push IDs, but I have not found a solution where the problem goes away. In the code below you can see my idea, but the problem is the same: everything still has to be updated, and the many push IDs need more memory...
{
  Data: {
    [pushId]: {
      P: 0, // position in the array
      N1: '',
      N2: '',
      N3: false
    },
    ...
  }
}
The problem of assigning ordered labels to the items of a list as they are inserted/deleted/moved is well known and is referred to as "online list labeling". It is a strict version of the "order maintenance problem", which has a Wikipedia page here: https://en.wikipedia.org/wiki/Order-maintenance_problem
The best you can do is amortized O(log N) relabelings per insertion, so if you have a list of 1000 objects and you insert 10 things, you will have to update around 100 items -- not too bad.
It's not too hard to achieve that bound. Perhaps the simplest algorithm is the first one described in the paper "Two Simplified Algorithms for Maintaining Order in a List" by Bender, Cole, Demaine, et al., which you can get for free here: https://erikdemaine.org/papers/DietzSleator_ESA2002/paper.pdf
For a list of up to N items, it requires roughly N^2 available keys. You would use 64-bit integers for up to 2^32 items, for example.
Here's a simple incarnation:
We will assign 64-bit labels for up to 2^32 items.
For each k-bit prefix of a label (consisting of the high-order bits), there are therefore 2^(64-k) available labels with that prefix.
When we insert an item between labels a and b, we can choose any label between them. If they are adjacent, we have to relabel some items first.
To relabel, find the longest prefix with a low enough density (used labels / available labels), and evenly space all the labels with that prefix.
A prefix with m available labels has low enough density if fewer than sqrt(m) labels are used. A prefix of length k, then, has low enough density if there are fewer than 2^(32 - k/2) used labels with that prefix.
Refer to the linked paper for the proof that this provides the claimed amortized complexity.
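To make the steps above concrete, here is a small self-contained sketch in Go (the labeler type, its method names, and the exact density check are my own; only the even-spacing rule and the sqrt threshold follow the description above). It keeps one 64-bit label per item in a sorted slice and relabels the longest sufficiently sparse prefix range whenever two neighbouring labels have no gap between them:
package main

import (
    "fmt"
    "math"
    "sort"
)

// labeler keeps one 64-bit label per list item, in list order.
type labeler struct {
    labels []uint64 // always sorted; labels[j] belongs to the j-th item of the list
}

// insertAfter creates a label for a new item placed right after position i
// (i == -1 means "insert at the front") and returns the new label.
func (l *labeler) insertAfter(i int) uint64 {
    lo, hi := uint64(0), uint64(math.MaxUint64) // sentinels for the two ends of the key space
    if i >= 0 {
        lo = l.labels[i]
    }
    if i+1 < len(l.labels) {
        hi = l.labels[i+1]
    }
    if hi-lo < 2 {
        // The neighbours are adjacent: relabel an enclosing prefix range
        // first, then retry (positions in the slice do not change).
        l.relabelAround(i)
        return l.insertAfter(i)
    }
    newLabel := lo + (hi-lo)/2 // pick the middle of the gap
    l.labels = append(l.labels, 0)
    copy(l.labels[i+2:], l.labels[i+1:])
    l.labels[i+1] = newLabel
    return newLabel
}

// relabelAround finds the longest prefix around position i whose label range is
// sparse enough (fewer than sqrt(range size) labels in use, counting the pending
// insertion) and spaces the labels in that range evenly.
func (l *labeler) relabelAround(i int) {
    c := l.labels[0]
    if i >= 0 {
        c = l.labels[i]
    }
    for k := 63; k >= 0; k-- { // longest prefix (smallest range) first
        mask := uint64(1)<<uint(64-k) - 1 // k == 0: shift by 64 yields 0, so mask wraps to all ones (full range)
        lo, hi := c&^mask, c|mask
        first := sort.Search(len(l.labels), func(j int) bool { return l.labels[j] >= lo })
        last := sort.Search(len(l.labels), func(j int) bool { return l.labels[j] > hi })
        used := last - first
        size := float64(hi-lo) + 1
        if float64(used+1) < math.Sqrt(size) { // density low enough?
            step := uint64(size / float64(used+1))
            for j := 0; j < used; j++ {
                l.labels[first+j] = lo + uint64(j+1)*step
            }
            return
        }
    }
}

func main() {
    var l labeler
    for n := 0; n < 70; n++ {
        l.insertAfter(-1) // keep inserting at the front to force occasional relabelings
    }
    sorted := sort.SliceIsSorted(l.labels, func(a, b int) bool { return l.labels[a] < l.labels[b] })
    fmt.Println("items:", len(l.labels), "still sorted:", sorted)
}
Counting the pending insertion in the density check (used+1) guarantees that after a relabeling the two neighbours end up at least two labels apart, so the retry always succeeds.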

What is the most efficient and portable way to define an order on pointers?

I have a slice that contains pointers to values. In a performance-critical part of my program, I'm adding values to or removing values from this slice. At the moment, inserting a value is just an append (O(1)), and removal consists of scanning the slice for the corresponding pointer value, from 0 to n-1, until the pointer is found (O(n)). To improve performance, I'd like to keep the values in the slice sorted, so that searching can be done with binary search (O(log n)).
But how can I compare pointer values? Pointer arithmetic is forbidden in Go, so AFAIK to compare pointer values p1 and p2 I have to use the unsafe package and do something like
uintptr(unsafe.Pointer(p1)) < uintptr(unsafe.Pointer(p2))
Now, I'm not comfortable using unsafe, at least because of its name. So, is that method correct? Is it portable? Are there potential pitfalls? Is there a better way to define an order on pointer values? I know I could use maps, but maps are slow as hell.
As said by others, don't do this. Performance can't be so critical that you need to resort to pointer arithmetic in Go.
Pointers are comparable; see Spec: Comparison operators:
Pointer values are comparable. Two pointer values are equal if they point to the same variable or if both have value nil. Pointers to distinct zero-size variables may or may not be equal.
Just use a map with the pointers as keys. Simple as that. Yes, indexing a map is slower than indexing a slice, but if you kept your slice sorted and used binary search on it, the gap would shrink: the (hash) map gives you O(1) lookups while binary search is only O(log n). With a big data set, the map may even be faster than searching the slice.
If you anticipate a large number of pointers in the map, pre-allocate a big one with make(), passing an estimated upper size; until your map outgrows this size, no reallocation will occur.
m := make(map[*mytype]struct{}, 1<<20) // Allocate map for 1 million entries
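For completeness, here is a tiny sketch of the map-as-set approach (mytype and its field are placeholders): insertion, removal and membership tests are each a single map operation, so no ordering of pointers is needed at all.
package main

import "fmt"

type mytype struct{ x int } // placeholder for the real element type

func main() {
    // A set of pointers: the empty struct value costs no memory per entry.
    set := make(map[*mytype]struct{}, 1<<20)

    p := &mytype{x: 42}
    set[p] = struct{}{}            // insert: O(1), replaces append on the slice
    _, ok := set[p]                // lookup: O(1), replaces the linear/binary search
    fmt.Println("present:", ok)    // present: true
    delete(set, p)                 // remove: O(1)
    fmt.Println("size:", len(set)) // size: 0
}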

Iterating results back into an OpenCL kernel

I have written an OpenCL kernel that takes 25 million points and checks them relative to two lines, A and B. It then outputs two lists: set A, containing all the points found to be beyond line A, and vice versa.
I'd like to run the kernel repeatedly, updating the input points with each of the line result sets in turn (and also updating the checking line). I'm guessing that reading the two result sets out of the kernel, forming them into arrays, and then passing them back in one at a time as inputs is quite a slow solution.
As an alternative, I've tested keeping a global index in the kernel that logs which points relate to which line. This is updated in each line-checking cycle. During each iteration, the index for each point in the overall set is switched to 0 (no line), A, B, and so forth (i.e. the related line id). In subsequent iterations, only points whose index matches the 'live' set being checked in that cycle (i.e. tagged with A for set A) are tested further.
The problem is that, in each iteration, the kernels still have to check the full index (i.e. all 25 million points) to discover whether or not they are in the 'live' set. As a result, the speed of each cycle does not improve significantly as the size of the result set decreases over time. Again, this seems a slow solution; while it avoids passing too much information between GPU and CPU, it means that a large number of the work items aren't doing very much work at all.
Is there an alternative solution to what I am trying to do here?
You could use atomics to sort the outputs into two arrays, i.e. if a point is in A, get its position by incrementing the A counter and write it into A, and do the same for B.
Using global atomics for everything might be horribly slow (fast on AMD, slow on NVIDIA, no idea about other devices). Instead you can use a local atomic_inc on a zeroed local integer to do exactly the same thing (but only for the local set of work-items), and then at the end do a single atomic_add to each global counter based on your local counters.
To put this more clearly in code (my explanation is not great):
// Assumes is_a and data have been computed earlier in the kernel, and that
// a_counter/b_counter are __global int* arguments initialized to zero.
__local int local_a, local_b;   // per-work-group counts for the A and B sets
__local int a_base, b_base;     // this group's base offsets into the global buffers

int lid = get_local_id(0);
if(lid == 0)
{
    local_a = 0;
    local_b = 0;
}
barrier(CLK_LOCAL_MEM_FENCE);

int id;                         // position within this work-group's A or B block
if(is_a)
    id = atomic_inc(&local_a);
else
    id = atomic_inc(&local_b);
barrier(CLK_LOCAL_MEM_FENCE);

if(lid == 0)
{
    // Only one global atomic per counter per work-group, instead of one per work-item.
    a_base = atomic_add(a_counter, local_a);
    b_base = atomic_add(b_counter, local_b);
}
barrier(CLK_LOCAL_MEM_FENCE);

if(is_a)
    a_buffer[a_base + id] = data;
else
    b_buffer[b_base + id] = data;
This involves faffing around with atomics, which are inherently slow, but depending on how quickly your data set shrinks it might be much faster. Additionally, if the B data is not considered live, you can omit getting the B ids and all the atomics involving B, as well as the write-back.

Size of Qt containers: is QMap much larger than QList?

I am developing software that maps information in 3D space. I use a container to hold this information. The container I use is
QList< QList< QMap<float, QList<quint16> > > > lidardata;
which is basically a 2D grid representing a rectangular area where each cell is 1 meter x 1 meter; in each cell the QMap contains a key representing height and a list of four related values at that height. This way I can store five values (height + other values). I insert values in a loop like this (rown and coln are the row and column indexes respectively):
QList<quint16> NEWLIST;
NEWLIST << (width*100.0) << (1000.0+sens*100.0) << (quint16)(intensity*1000.0) ;
lidardata[ rown ][ coln ].insert( heightValue, NEWLIST);
Before this approach, instead of QMap<float, QList<quint16> > I used a plain QList<quint16> and just appended the 5 values.
Now the question is: my program runs out of memory quite fast. With the first solution (QList instead of QMap) it took about 800 MB of memory to complete; now it runs out (at about 1.4 GB) at 75% of the total data-storing process.
Can someone confirm that storing the same amount of information in QMap<float, QList<quint16> > instead of QList<quint16> requires a lot more memory?
Does anyone have any hints on how to limit the memory usage? I will go back to the old solution if nothing comes up.
As mentioned in a comment:
your code may suffer from Primitive Obsession.
Try to solve your problem using the ValueObject fix described in this tutorial: create a class with all the needed attributes, and work on instances of this class instead of maintaining nested QLists and QMaps.
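To illustrate the suggestion (sketched in Go only to keep the examples on this page in one language; in the Qt code it would be a small C++ struct or class, and the field names here are made up): one flat value object per measurement replaces the nested QMap/QList layers, so each record costs a fixed, small amount of memory instead of per-node and per-container overhead.
package main

import "fmt"

// point is a flat value object holding everything recorded for one measurement.
type point struct {
    height    float32
    width     uint16
    sens      uint16
    intensity uint16
}

func main() {
    // One slice of value objects per grid cell: contiguous storage with no
    // per-entry tree nodes or nested container headers.
    cell := make([]point, 0, 16)
    cell = append(cell, point{height: 1.5, width: 120, sens: 1100, intensity: 300})
    fmt.Println(len(cell), cell[0].height)
}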

What is the cost of deleting a value from a hashtable?

I was asked about the cost of deleting a value from a hash table when linear probing was used during insertion.
What I could figure out from reading various material on the internet is that it has something to do with the load factor. I am not sure, but I read that the relation between the load factor and the number of probes required is: number of probes = 1 / (1 - LF).
So I believe the cost has to be dependent on the probe sequence. But then another thought ruins everything.
What if the element was inserted after p probes and now I am trying to delete it? But before this I had already deleted a few elements that had the same hash code and had been inserted with fewer than p probes.
In this case I reach a stage where I see an empty slot in the hash table, but I am not sure whether the element I am trying to delete has already been deleted or is at some other location as a result of probing.
I also found that once I delete an element I must mark the slot with some special indicator to show that it is available, but this doesn't solve my problem of being uncertain about the element I am trying to delete.
Could anyone please suggest how to find the cost in such cases?
Is the approach going to vary if it is non-linear probing?
The standard approach is "look up the element, mark it as deleted". Marking obviously has O(1) cost, so the total operation cost is the same as a plain lookup: O(1) expected. It can be as high as O(n) in degenerate cases (e.g. all elements have the same hash), but O(1) expected is all we can say theoretically.
About the load factor: the higher the load factor (the ratio of occupied buckets to the total number of buckets), the larger the expected constant factor (but this doesn't change the theoretical big-O cost). Note that in this case the load factor includes both the elements present in the table and the buckets that were previously marked as deleted.
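For intuition, plugging numbers into the asker's formula (the standard expected probe count for an unsuccessful search under uniform hashing): at a load factor of 0.5 the expected number of probes is 1 / (1 - 0.5) = 2, at 0.75 it is 4, and at 0.9 it is 10, so the constant factor grows sharply as the table fills up even though the expected cost stays O(1).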
Other probing schemes (e.g. quadratic) don't change the theoretical cost, but may alter the expected constant factor or its variance. If you look at the probe sequences, with linear probing the sequences of different buckets overlap: if the sequence for some bucket is long, it will also be long for adjacent buckets. E.g. if buckets 4 to 10 are occupied, the sequence for bucket #4 is 7 buckets long (4, 5, 6, ..., 10), for #5 it's 6, and so on. For quadratic probing this is not the case.
However, linear probing has the benefit of better memory-cache behavior, since you check memory cells close to each other. In practice, though, for quadratic probing fallback sequences are rarely long enough for this to matter.
Finally, in the linear probing case it is possible to work without the deleted mark, but then you'd have to complicate the deletion procedure considerably (still O(1) expected, but with a much higher constant factor). Whether it is worth it has to be decided with actual profiling; it simplifies insertion somewhat and lookup a bit. For a C++ implementation this would have the downside that erase() would invalidate iterators.
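As a concrete sketch of the "mark as deleted" approach (a toy table in Go; the type names, fixed capacity, and FNV-1a hash are illustrative choices, not any particular library's implementation): removal is a lookup followed by flipping the slot's state to a tombstone, and lookups probe through tombstones so that chains created by earlier insertions are not broken, which is exactly the uncertainty the question describes.
package main

import "fmt"

const (
    empty    = iota // slot has never held a key
    occupied        // slot holds a live key/value pair
    deleted         // tombstone: key removed, but probes must pass through it
)

type slot struct {
    state int
    key   string
    val   int
}

// table is a toy open-addressing hash table with linear probing.
// It never grows, so keep it well below capacity (the load factor matters!).
type table struct{ slots []slot }

func newTable(capacity int) *table { return &table{slots: make([]slot, capacity)} }

func (t *table) hash(key string) int {
    h := uint32(2166136261) // FNV-1a
    for i := 0; i < len(key); i++ {
        h ^= uint32(key[i])
        h *= 16777619
    }
    return int(h % uint32(len(t.slots)))
}

// probe returns the index of the slot holding key, or -1 if the key is absent.
func (t *table) probe(key string) int {
    i := t.hash(key)
    for n := 0; n < len(t.slots); n++ { // at most one full sweep
        s := &t.slots[i]
        if s.state == empty {
            return -1 // a truly empty slot ends the probe chain
        }
        if s.state == occupied && s.key == key {
            return i
        }
        i = (i + 1) % len(t.slots) // linear probing: try the next slot
    }
    return -1
}

func (t *table) insert(key string, val int) {
    i, free := t.hash(key), -1
    for n := 0; n < len(t.slots); n++ {
        s := &t.slots[i]
        if s.state == occupied && s.key == key {
            s.val = val // key already present: update in place
            return
        }
        if s.state == deleted && free < 0 {
            free = i // remember the first tombstone so it can be reused
        }
        if s.state == empty {
            if free < 0 {
                free = i
            }
            break
        }
        i = (i + 1) % len(t.slots)
    }
    if free < 0 {
        panic("table full") // toy table: no growth or rehashing implemented
    }
    t.slots[free] = slot{state: occupied, key: key, val: val}
}

// remove marks the slot as deleted: an O(1) write on top of the lookup cost.
func (t *table) remove(key string) {
    if i := t.probe(key); i >= 0 {
        t.slots[i].state = deleted
    }
}

func main() {
    t := newTable(8)
    t.insert("a", 1)
    t.insert("b", 2)
    t.remove("a")
    fmt.Println(t.probe("a") >= 0, t.probe("b") >= 0) // false true
}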
