This might be a quite stupid question but, knowing the poor efficiency of searching for an element inside a list (singly-linked or doubly-linked), why not using a vector or dynamic array to store the elements of the list in order, therefore making it easier to access elements ?
Linked lists used to be more important because they are stored non-contiguously which is better for memory management. Linked lists and vectors / arrays both have a search time complexity of O(N). It's only faster to access array elements if you know the index in advance. Linked lists are for niche cases where you are frequently inserting elements into the beginning of the array. Linked lists let you do this in O(1) time as opposed to arrays O(n) because the other elements need to be shifted.
Related
I have a list of some data type. However, I also want to index the elements of that list with a trie, so I can do more efficient lookups. But I don't want to store the same elements twice, so I want to store the elements in the list, and in the trie I store pointers to elements, in the leaf nodes. Is this possible? I could store the index of the element in the list, however getting an element of a linked list by index is slow, so that won't do.
Apologies if this is a misunderstanding of the OCaml memory model.
Just store the element. Under the hood, this doesn't copy the value, it just copies a pointer to the value, except for values that are stored in a single memory word (just like a pointer).
In other words, things like let b = a do not make a copy a. They make b an alias of a.
Values are automatically shared in Ocaml. The only case where you wouldn't want sharing is a mutable object (reference, or structure or object with mutable fields). If you want two mutable objects with the same current value but such that an assignment will only affect one of the objects, then you need to make a copy.
I have been working with Java, but in general, what's the advantage of saving elements in a hash table when we have an array?
In my understanding, ff we have arr[100], then accessing i-th element is O(1) since it's just addition of (arr type) * (i) to the base pointer of arr itself. Then how is hashtable useful and when would it be useful as opposed to array? Is
Thank you
In Java you should be using HashMaps. The Object class in Java has a int hashcode method where it creates a unique number for the object in mind.
For example, this is the hashcode of a String in java.
In hashmaps you can assign a value to a key. For example, you could be doing: <Username(String), Customer(Custom Object)>. With arrays, to find a specific Customer (If you don't know the index) you would have to go through the entire array (O(n)) in the worst case to find that.
Without hashmaps, and using some more search optimized data structures like Binary Search Trees, it would take log(n) time (O(log n)) time to find the customer.
With a hashmap, you can get the customer's object immediately. Without having to go through the entire collection of the customers.
So basically, hashmaps "Map" a "hash" integer value to a key, and then use that key to find the value.
Also just as a bonus, remember since we're putting larger information inside a small integer, we will be facing the so called "hash collision" where two keys have the same hash value but they're not the same actual things. In this case we're obviously not going to find the information instantly, however again, instead of having to search for all the records to find our specific one, we just need to search a smaller "bucket" of values which is substantially smaller than our actual collection.
Everyone was telling me that a List is heavy on performance, so I was wondering is it the same with a dictionary? Because a dictionary doesn't have a fixed size. Is there also a dictionary with a fixed size, just like a normal array?
Thanks in advance!
A list can be heavy on performance, but it depends on your use case.
If your use case is the indexing of a very large data set, in which you plan to search for elements during runtime, then a Dictionary will behave with O(1) Time Complexity for retrievals (which is great!).
If you plan to insert/remove a little bit of data here and there at runtime then that's okay. But, if you plan to do constant insertions at runtime then you will be taking a hit on performance due to the hashing and collision handling functions.
If your use case requires a lot of insertions, removals, iteration through the consecutive data, then a list would be and fast. But if you are planning to search constantly at runtime, then a list could take a hit performance-wise.
Regarding the Dictionary and size:
If you know the size/general range of your data set then you could technically account for that and initialize accordingly. Or you could write your own Dictionary and Hash Table implementation.
In all:
Each data structure has it's advantages and disadvantages. So think about what you plan to do with the data at runtime, then pick accordingly.
Also, keeping a data structure time and space complexity table is always handy :P
This is depends on your needs.
If you just add and then iterate items in a List in sequental way - this is a good choice.
If you have a key for every item and need fast random access by key - use Dictionary.
In both cases you can specify the initial size of the collection to reduce memory allocation.
If you have a varying number of items in the collection, you'll want to use a list vs recreating an array with the new number of items in the collection.
With a dictionary, it's a little easier to get to specific items in the collection, given you have a key and just need to look it up, so performance is a little better when getting an item from the collection.
List and dictionary are part of the System.Collections namespace, which are mutable types. There is a System.Collections.Immutable namespace, but it's not yet supported in Unity.
We want to use Riak's Links to create a doubly linked list.
The algorithm for it is quite simple, I believe:
Let 'N0' be the new element to insert
Get the head of the list, including its 'next' link (N1)
Set the 'previous' of N1 to be the N0.
Set the 'next' of N0 to be N1
Set the 'next' of the head of the list to be N0.
The problem that we have is that there is an obvious race condition here, because if 2 concurrent clients get the head of the list, one of the items will likely be 'lost'. Any way to avoid that?
Riak is an eventually consistent system when talking about CAP theorem.
Provided you set the bucket property allow_multi=true, if two concurrent clients get the head of the list then write, you will have sibling records. On your next read you'll receive multiple values (siblings) and will then have to resolve the conflict and write the result. Given that we don't have any sort of atomicity this will possibly lead to additional conflicts under heavy write concurrency as you attempt to update the linked objects. Not impossible to resolve, but definitely tricky.
You're probably better off simply serializing the entire list into a single object. This makes your conflict resolution much simpler.
In functional languages like Racket or SML, we usually perform list operations in recursive call (pattern matching, list append, list concatenation...). However, I'm not sure the general implementation of these operations in functional languages. Will operations like create, update or delete elements in a list return a whole new copy of a list? I once read in a book an example about functional programming disadvantage; that is, every time a database is updated, a whole new copy of a database is returned.
I questioned this example, since data in FP is inherently immutable, thus the creating lists from existing lists should not create a whole new copy. Instead, a new list is simply just a different collection of reference to existing objects in other lists, based on filtering criteria.
For example, list A = [a,b,c], and list B=[1,2,3], and I created a new list that contains the first two elements from the existing lists, that is C=[a,b,1,2]. This new list simply contains references to a,b, from A and 1,2 from B. It should not be a new copy, because data is immutable.
So, to update an element in a list, it should only take a linear amount of time find an element in a list, create a new value and create a new list with same elements as in the old list except the updated one. To create a new list, the running environment merely updates the next pointer of the previous element. If a list is holding non-atomic elements (i.e. list, tree...), and only one atomic element in one of the non-atomic element is updated, this process is recursively applied for the non-atomic element until the atomic element is updated as described above. Is this how it should be implemented?
If someone creates a whole deep copy of a list every time a list is created from existing lists/added/updated/deleted/ with elements, they are doing it wrong, aren't they?
Another thing is, when the program environment is updated (i.e. add a new key/value entry for a new variable, so we can refer to it later), it doesn't violate the immutable property of functional programming, is it?
You are absolutely correct! FP languages with immutable data will NEVER do a deep copy (unless they are really poorly implemented). As the data is immutable there are never any problems in reusing it. It works in exactly the same way with all other structures. So for example if you are working with a tree structure then at most only the actual tree will be copied and never the data contained in it.
So while the copying sounds very expensive it is much less than you would first think if you coming from an imperative/OO background (where you really do have to copy as you have mutable data). And there are many benefits in having immutable data.