Since I'm new to this language, I'm having a hard time understanding these differences.
What is the difference between these two?
vector<int> *ad;
and
vector<int*> ad;
Also, how are these two lines equivalent?
vector<int> * ad = new vector<int>[5];
and
vector<int> ad[5];
vector<int> *ad;
declares ad as a pointer; its type is such that it can be assigned to point to a vector of integers. Those integers are held by value in the vector: the vector "owns" them and controls their lifetime. Because ad has not been assigned, it doesn't actually point to such a vector (yet).
vector<int*> ad;
declares ad as a vector of pointers to integers. The vector owns the pointers; they have not been assigned to point to any specific integers. This time the vector does actually exist, but it's empty.
vector<int> * ad = new vector<int>[5];
declares ad as a pointer to a vector of integers and assigns it to point to the first element of a new array of 5 vectors of integers. That array of vectors is allocated on the heap; it will continue to exist until it is deleted with delete[].
vector<int> ad[5];
declares ad as an array of 5 vectors of integers. ad will exist until it goes out of scope. The vectors will be empty.
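For reference, here is a minimal C++ sketch (variable names are my own, purely illustrative) putting the four declarations side by side:

#include <vector>
using std::vector;

int main() {
    vector<int>* p = nullptr;                    // pointer to a vector of ints; points nowhere yet
    vector<int*> q;                              // an existing but empty vector that will hold pointers to ints
    vector<int>* heap_arr = new vector<int>[5];  // array of 5 empty vectors, allocated on the heap
    vector<int>  stack_arr[5];                   // array of 5 empty vectors, destroyed when it goes out of scope

    heap_arr[0].push_back(42);                   // both arrays are used the same way
    stack_arr[0].push_back(42);

    delete[] heap_arr;                           // the heap array must be freed explicitly
}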
I want to append a value to an Integer array during a loop. For example, add 3 to [1,2] and get an array like [1,2,3]. I don't know how to write this, and I cannot find the answer on the Internet.
You can use Vectors to do something similar, using the & operator. You can access the individual elements just like an array, though you use () instead of []. Or you can just use a for loop and get each element directly.
See the below example:
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Containers.Vectors;
procedure jdoodle is
   -- Create the package for a vector of integers
   package Integer_Vectors is new Ada.Containers.Vectors(Positive, Integer);
   -- Use a subtype to make just the Vector type visible
   subtype Vector is Integer_Vectors.Vector;
   -- Make all the primitive operations of Vector visible
   use all type Vector;
   -- Create a variable
   V : Vector;
begin
   -- Add an element to the vector in each iteration
   for i in 1 .. 10 loop
      V := V & i;
   end loop;
   -- Show the results
   for Element of V loop
      Put_Line(Element'Image);
   end loop;
   -- Print the 2nd element of the vector
   Put_Line(Integer'Image(V(2)));
end jdoodle;
Ada arrays are first class types, even when the array type is anonymous. Thus a 2-dimensional array is a different type than a 3-dimensional array. Furthermore, arrays are not defined by directly specifying the number of elements in a dimension as they are in languages derived from C.
For instance, if you define a 2-dimensional array such as
type array_2d is array (Integer range 0..1, Integer range 1..2) of Integer;
You have defined an array with a first range of 0..1 and a second range of 1..2. This would be a square matrix containing 4 elements.
You cannot simply add another dimension to an object of the type array_2d described above. Such an array must be of a different type.
Furthermore, one cannot change the definition of an array object after it is created.
OK, so while this is a simple question, it gets into interesting design details very quickly. The first thing is that an array "always knows its bounds" -- this language design element impacts the usage of the language profoundly: instead of having to pass the length of the array as a separate parameter that can go out of sync, as in C, you simply pass the array and let "it knows its bounds" take care of the bookkeeping.
Second, an array is always fixed-length, meaning it cannot be resized once created. Caveat: creating an array can still be done in a "dynamic" setting, e.g. with bounds taken from user input.
Third, if the subtype is unconstrained then it can have "variable" length -- this means that you can declare something like Type Integer_Vector is Array(Positive Range <>) of Integer and write subprograms whose parameters (and return values) can be of any size. This, in turn, means that your handling of such subtypes lends itself toward the more general.
Fourth, all of these combine so that much of the "need" for dynamically sized arrays disappears -- yes, there are cases where one is needed, or where it is more convenient to have a single resizable object; that is what Ada.Containers.Vectors addresses -- but in the absence of a truly dynamically sized object you can often achieve your goals with plain array processing.
Consider the following example:
Type Integer_Vector is Array(Positive range <>) of Integer;
Function Append( Input : Integer_Vector; Element : Integer ) return Integer_Vector is
   ( Input & Element );
X : Constant Integer_Vector := (1, 2);
Y : Integer_Vector renames Append(X, 3);
These design choices combine to allow some interesting possibilities.
I know that part of a vector (the actual data) is stored on the heap, while some data (a struct containing the length, capacity, and a pointer to the actual data on the heap) is stored on the stack.
What about a vector of vectors (i.e. the elements of the vector are other vectors, e.g. a vector of strings)? Which parts of this outer container vector are stored on the heap and which on the stack? What about the individual inner elements?
It is not true that a Vec (the struct containing pointer, length and capacity) is always stored on the stack. You can move any type (excluding self-referential ones, which can't be moved) from the stack to the heap, by putting it in a Box, Vec or other heap-using smart pointer. Just consider a straightforward type like i64: it might be stored on the stack (or in a register if the compiler so chooses), but if you write vec![7i64], you have an i64 stored on the heap and the only thing left on the stack is the Vec itself (a pointer plus length and capacity).
With this analogy, it's not hard to see that the same applies for String: it can be on the stack, but you can put it on the heap by creating a Vec<String>. Therefore, if you have a Vec<String> with length 100, there are 101 independent heap allocations: one owned by the Vec and one owned by each of the Strings.
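The question is about Rust, but a rough C++ analogy (my own illustration, not part of the answer above) may help picture the layout: the outer container object lives wherever you declare it, and every non-empty inner container owns its own heap buffer.

#include <vector>

int main() {
    // The outer vector object (pointer, length, capacity) lives on the stack here;
    // its element buffer is on the heap, and each non-empty inner vector owns its
    // own heap buffer as well -- roughly 1 + 100 allocations in total.
    std::vector<std::vector<int>> vv(100, std::vector<int>{7});
    return 0;
}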
See also
If I make a struct and put it in a vector, does it reside on the heap or the stack?
What is the memory layout of a vector of arrays?
The 6.3.6 Vectors section in the Scheme R5RS standard states the following about vectors:
Vectors are heterogenous structures whose elements are indexed by integers. A vector typically occupies less space than a list of the same length, and the average time required to access a randomly chosen element is typically less for the vector than for the list.
This description of vectors is a bit diffuse.
I'd like to know what this actually means in terms of the vector-ref and list-ref operations and their complexity. Both procedures return the k-th element of a vector or a list. Is the vector operation O(1) and the list operation O(n)? How are vectors different from lists? Where can I find more information about this?
Right now I'm using association lists as a data structure for storing key/value pairs for easy lookup. If the keys are integers it would perhaps be better to use vectors to store the values.
The very specific details of vector-ref and list-ref are implementation-dependent, meaning each Scheme interpreter can implement the specification as it sees fit, so an answer to your question cannot be generalized to all interpreters conforming to R5RS; it depends on the actual interpreter you're using.
But yes, in any decent implementation it is a safe bet that the vector-ref operation is O(1), and that the list-ref operation is probably O(n). Why? Because a vector, under the hood, should be implemented using a data structure native to the implementation language that allows O(1) access to an element given its index (say, a primitive array), making the implementation of vector-ref straightforward. Whereas lists in Lisp are built by linking cons cells, and finding an element at a given index entails traversing all the elements before it in the list - hence O(n) complexity.
As a side note - yes, using vectors would be a faster alternative to using association lists of key/value pairs, as long as the keys are integers and the number of elements to be indexed is known beforehand (a Scheme vector cannot grow after its creation). For the general case (keys other than integers, variable size), check whether your interpreter supports hash tables, or use an external library that provides them (say, SRFI 69).
A list is constructed from cons cells. From the R5RS list section:
The objects in the car fields of successive pairs of a list are the elements of the list. For example, a two-element list is a pair whose car is the first element and whose cdr is a pair whose car is the second element and whose cdr is the empty list. The length of a list is the number of elements, which is the same as the number of pairs.
For example, the list (a b c) is equivalent to the following series of pairs: (a . (b . (c . ())))
And could be represented in memory by the following "nodes":
[p] --> [p] --> [p] --> null
 |       |       |
 |==> a  |==> b  |==> c
Each node [] contains a pointer p to its value (its car), and another pointer to the next element (its cdr).
This allows the list to grow to an unlimited length, but requires a ref operation to start at the front of the list and traverse k elements in order to find the requested one. As you stated, this is O(n).
By contrast, a vector is basically an array of values which could be internally represented as an array of pointers. For example, the vector #(a b c) might be represented as:
[p p p]
 | | |
 | | |==> c
 | |
 | |==> b
 |
 |==> a
Where the array [] contains a series of three pointers, and each pointer is assigned to a value in the vector. So internally you could reference the third element of the vector v using the notation v[3]. Since you do not need to traverse the previous elements, vector-ref is an O(1) operation.
The main disadvantage is that vectors are of fixed size, so if you need to add more elements than the vector can hold, you have to allocate a new vector and copy the old elements to this new vector. This can potentially be an expensive operation if your application does this on a regular basis.
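To make the cost difference concrete (in C++ rather than Scheme, purely as an illustration of contiguous versus linked storage):

#include <iostream>
#include <iterator>
#include <list>
#include <vector>

int main() {
    std::vector<int> v = {10, 20, 30, 40};   // contiguous, like a Scheme vector
    std::list<int>   l = {10, 20, 30, 40};   // linked nodes, like a Scheme list

    // Indexed access jumps straight to the k-th slot: O(1).
    std::cout << v[2] << "\n";

    // The list has to be walked one link at a time to reach index k: O(k).
    auto it = l.begin();
    std::advance(it, 2);
    std::cout << *it << "\n";
}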
There are many resources online - this article on Scheme Data Structures goes into more detail and provides some examples, although it is much more focused on lists.
All that said, if your keys are (or can become) integers and you either have a fixed number of elements or can manage with a reasonable amount of vector reallocations - for example, you load the vector at startup and then perform mostly reads - a vector may be an attractive alternative to an association list.
Does anyone know how to do this and what the pseudo code would look like?
As we all know, a hash table stores key/value pairs, and when a key is looked up, it returns the value associated with that key. What I want to do is understand the underlying structure used to create that mapping. For example, if we lived in a world where there were no previously defined functions except for arrays, how could we replicate the Hashmaps that we have today?
Actually, some of today's Hashmap implementations are indeed made out of arrays, as you propose. Let me sketch how this works:
Hash Function
A hash function transforms your keys into an index for the first array (array K). A hash function such as MD5 or a simpler one, usually including a modulo operator, can be used for this.
Buckets
A simple array-based Hashmap implementation could use buckets to cope with collisions. Each element ('bucket') in array K itself contains an array (array P) of pairs. When adding or querying an element, the hash function points you to the correct bucket in K, which contains your desired array P. You then iterate over the elements in P until you find a matching key, or you append a new element at the end of P.
Mapping keys to buckets using the Hash
You should make sure that the number of buckets (i.e. the size of K) is a power of 2, say 2^b. To find the correct bucket index for some key, compute Hash(key) but keep only b of its bits (for example the lowest b bits, by masking). Interpreted as an integer, this is your bucket index.
Rescaling
Computing the hash of a key and finding the right bucket is very quick. But once a bucket becomes fuller, you will have to iterate more and more items before you get to the right one. So it is important to have enough buckets to properly distribute the objects, or your Hashmap will become slow.
Because you generally don't know in advance how many objects you will want to store in the Hashmap, it is desirable to grow or shrink the map dynamically. You can keep a count of the number of objects stored, and once it goes over a certain threshold you recreate the entire structure, but this time with a larger or smaller size for array K. In this way some of the buckets in K that were very full will have their elements divided among several buckets, so that performance improves.
Alternatives
You may also use a two-dimensional array instead of an array-of-arrays, or you may exchange array P for a linked list. Furthermore, instead of keeping a total count of stored objects, you may simply choose to recreate (i.e. rescale) the hashmap once one of the buckets contains more than some configured number of items.
A variation of what you are asking is described as 'array hash table' in the Hash table Wikipedia entry.
Code
For code samples, take a look here.
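In case that link becomes unavailable, here is a minimal, self-contained C++ sketch of the bucket scheme described above (the class and all names are my own illustration, not the linked code):

#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Array K of buckets; each bucket is an array P of key/value pairs.
class SimpleMap {
    std::vector<std::vector<std::pair<std::string, int>>> buckets;

    std::size_t index_of(const std::string& key) const {
        // The bucket count is a power of two, so masking keeps the low bits of the hash.
        return std::hash<std::string>{}(key) & (buckets.size() - 1);
    }

public:
    explicit SimpleMap(std::size_t bucket_count = 16) : buckets(bucket_count) {}

    void put(const std::string& key, int value) {
        auto& bucket = buckets[index_of(key)];
        for (auto& kv : bucket)
            if (kv.first == key) { kv.second = value; return; }  // overwrite an existing key
        bucket.push_back({key, value});                          // otherwise append a new pair
    }

    std::optional<int> get(const std::string& key) const {
        for (const auto& kv : buckets[index_of(key)])
            if (kv.first == key) return kv.second;               // linear scan within one bucket
        return std::nullopt;
    }
};

Rescaling would amount to allocating a larger buckets array and re-inserting every stored pair.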
Hope this helps.
Could you be more precise? Does one array contain the keys, the other one the values?
If so, here is an example in Java (though there is little here that is specific to this language):
for (int i = 0; i < keysArray.length; i++) {
    map.put(keysArray[i], valuesArray[i]);
}
Of course, you will have to instantiate your map object first (if you are using Java, I suggest using a HashMap<Object, Object> instead of the obsolete Hashtable), and also check your arrays to avoid null objects and make sure they have the same length.
Sample Explanation:
The source below basically does two things:
1. Map Representation
Some number X of lists (an array of lists).
Making X a power of 2 (2^N) is bad; (2^N)-1, (2^N)+1, or a prime number is good.
Example:
List myhashmap[hash_table_size];
// an array of (short) lists
// if the lists are long, then there are more collisions
NOTE: this is an array of lists, not two arrays (I can't see a good way to build a generic hashmap with just two arrays).
If you know Algorithms > Graph theory > Adjacency list, this looks exactly the same.
2. Hash function
The hash function converts the string (input) into a number (the hash value), which is an index into the array:
initialize the hash value to the first char (converted to an int);
for each further char, left-shift by 4 bits, then add the char (converted to an int).
Example:
int hash = input[0];
for (int i = 1; i < input.length(); i++) {
    hash = (hash << 4) + input[i];
}
hash = hash % list.size();
// list.size() here is the 1st dimension of the (list of lists),
// i.e. the 1st dimension size of our map representation from point #1,
// which is hash_table_size
See this function at the first link:
int HTable::hash (char const * str) const
Source:
http://www.relisoft.com/book/lang/pointer/8hash.html
How does a hash table work?
Update
This is the best source: http://algs4.cs.princeton.edu/34hash/
You mean like this?
The following uses Ruby's irb as an illustration:
cities = ["LA", "SF", "NY"]
=> ["LA", "SF", "NY"]
items = ["Big Mac", "Hot Fudge Sundae"]
=> ["Big Mac", "Hot Fudge Sundae"]
price = {}
=> {}
price[[cities[0], items[1]]] = 1.29
=> 1.29
price
=> {["LA", "Hot Fudge Sundae"]=>1.29}
price[[cities[0], items[0]]] = 2.49
=> 2.49
price[[cities[1], items[0]]] = 2.99
=> 2.99
price
=> {["LA", "Hot Fudge Sundae"]=>1.29, ["LA", "Big Mac"]=>2.49, ["SF", "Big Mac"]=>2.99}
price[["LA", "Big Mac"]]
=> 2.49
I'm trying to learn Ada for a course at the University, and I'm having a lot of problems wrapping my head around some of the ideas in it.
My current stumbling block: let's say I have a function which takes a Matrix (just a 2-dimensional array of Integers) and returns a new, smaller matrix (it strips out the first row and first column).
I declare the matrix and function like this:
type MATRIX is array(INTEGER range <>, INTEGER range <>) of INTEGER;
function RemoveFirstRowCol (InMatrix: in MATRIX) return MATRIX is
Then I decide on the size of the Matrix to return:
Result_matrix: MATRIX (InMatrix'First(1) .. InMatrix'Length(1) - 1, InMatrix'First(2) .. InMatrix'Length(2) - 1);
Then I do the calculations and return the Result_matrix.
So here's my problem: when running this, I discovered that if I try to return the result of this function into anything that's not a Matrix declared with the exact proper size, I get an exception at runtime.
My question is, am I doing this right? It seems to me like I shouldn't have to know ahead of time what the function will return in terms of size. Even with a declared Matrix bigger than the one I get back, I still get an error. Then again, the whole idea of Ada is strong typing, so maybe this makes sense (I should know exactly the return type).
Anyways, am I doing this correctly, and is there really no way to use this function without knowing in advance the size of the returned matrix?
Thanks,
Edan
You don't need to know the size of the returned matrix in advance, nor do you need to use an access (pointer) type. Just invoke your function in the declarative part of a unit or block and the bounds will be set automatically:
procedure Call_The_Matrix_Reduction_Function (Rows, Cols : Integer) is
   Source_Matrix : Matrix(1 .. Rows, 1 .. Cols);
begin
   -- Populate the source matrix
   -- ...
   declare
      Result : Matrix := RemoveFirstRowCol (Source_Matrix);
      -- The result matrix is automatically sized; it can also be declared
      -- constant if appropriate.
   begin
      -- Process the result matrix
      -- ...
   end;
end Call_The_Matrix_Reduction_Function;
Caveat: Since the result matrix is being allocated on the stack, you could have a problem if the numbers of rows and columns are large.
Because your MATRIX type is declared with unconstrained index ranges, the type is unconstrained. This does not prevent it from being returned by a function; in this respect it acts as if it were a pointer. Of course the compiler does not know the exact index ranges at compile time, so the result matrix will be allocated dynamically rather than on the ordinary stack.
Your solution should basically work. The only problem is in how you create the result matrix: it will only be correct if the original matrix's indices start at 0.
m:MATRIX(11..15,11..20);
In this case m'first(1) is 11, m'length(1) is 5! So you get:
Result_matrix:MATRIX(11..4,11..9);
which leads to a CONSTRAINT_ERROR...
Use the 'Last attribute instead (e.g. InMatrix'First(1) .. InMatrix'Last(1) - 1), even if you usually use 0-based indexing.
But remember, you do not need a pointer to the MATRIX: because MATRIX is unconstrained, it can be returned directly from a function.
The caller knows the dimensions of the matrix it passes to your function, so the caller can declare the variable that receives the function's return value in terms of those dimensions. Does that really not work?
Your function cannot know the size of the result matrix at compile time, so you need to return a pointer to the new matrix:
type Matrix is array (Positive range <>, Positive range <>) of Integer;
type Matrix_Ptr is access Matrix;
-- chop off the first row and column
function Chopmatrix (Inputmatrix : in Matrix) return Matrix_Ptr is
    Returnmatrixptr : Matrix_Ptr;
begin
    -- create a new matrix that is one row and one column smaller
    Returnmatrixptr := new Matrix(2 .. Inputmatrix'Last, 2 .. Inputmatrix'Last(2));
    for Row in Inputmatrix'First + 1 .. Inputmatrix'Last loop
        for Col in Inputmatrix'First(2) + 1 .. Inputmatrix'Last(2) loop
            Returnmatrixptr.all(Row, Col) := Inputmatrix(Row, Col);
        end loop;
    end loop;
    return Returnmatrixptr;
end Chopmatrix;