I am creating a Hash Table that uses nodes to chain. What is a collision?

I can't seem to get an answer that I understand. What is a collision when you have a hash table that uses linked nodes?
Is the collision +1 for every index that you must pass to get to the index needed for that node you are adding?
I know that collisions are unavoidable, I have learned that much through my research but I haven't been able to figure out what constitutes a collision when dealing with a hash table that has linked nodes.
My program, after finding the proper place in the array (an array of pointers to nodes), sticks the new node at the front. Each element points at a node that points at another node, so I essentially have multiple linked lists. So, does the collision count only include the first node of the element where the new node belongs, because I stick it at the front, or does it include every single node in the linked list for that element?
For example, if the name "Smith" goes to element [5], which already has 5 other nodes linked together, and I add it to the front, how would I decide what the collision count is?
Thanks for any help!

A collision is when 2 distinct entries produce the same output through the hash function.
Say your (poorly designed) hash function H consists of adding all the digits of a number:
5312 -> 5 + 3 + 1 + 2 = 11
1220 -> 1 + 2 + 2 + 0 = 5
So H(5312) = 11 and H(1220) = 5
H has a lot of collisions (this is why you should not use it):
H(4412) = 4 + 4 + 1 + 2 = 11
H(9200) = 9 + 2 + 0 + 0 = 11
etc...
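To tie this back to a chained table, here is a minimal Python sketch. It assumes one convention that the question leaves open: an insert counts as one collision whenever the target bucket already holds at least one node, no matter how long the chain is. The names `digit_sum_hash` and `ChainedTable` are made up for illustration.

```python
def digit_sum_hash(n):
    """The (poor) hash H from above: the sum of the decimal digits."""
    return sum(int(ch) for ch in str(n))


class ChainedTable:
    """Hash table with chaining; new keys go to the front of the chain."""

    def __init__(self, size):
        self.buckets = [[] for _ in range(size)]
        self.collisions = 0

    def insert(self, key):
        index = digit_sum_hash(key) % len(self.buckets)
        bucket = self.buckets[index]
        # Assumed convention: a non-empty bucket means one collision,
        # regardless of how many nodes are already chained there.
        if bucket:
            self.collisions += 1
        bucket.insert(0, key)


table = ChainedTable(37)
for key in (5312, 1220, 4412, 9200):
    table.insert(key)
print(table.collisions)   # 2: both 4412 and 9200 hit the bucket that holds 5312
print(table.buckets[11])  # [9200, 4412, 5312]
```

If you would rather count every node already in the chain, add `len(bucket)` instead of 1; that is a different metric, so say which one you report.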

Implementation of Speck cipher

I am trying to implement the Speck cipher as specified here: Speck Cipher. On page 18 of the document you can find the Speck pseudo-code I want to implement.
I seem to have a problem understanding the pseudo-code. As you can see there, x and y are plaintext words of length n. l[m-2],...l[0], k[0] are key words (as words, they have length n, right?). When you do the key expansion, you iterate for i from 0 to T-2, where T is the number of rounds (for example 34). However, I get an IndexOutOfBoundsException, because the array with the l's has only m-2 positions and not T-2.
Can someone clarify what the key expansion does and how?
Ah, I get where the confusion lies:
l[m-2],...l[0], k[0]
these are the input key words, in other words, they represent the key. These are not declarations of the size of the arrays, as you might expect if you're a developer.
Then the subkeys in array k should be derived, using array l for intermediate values.
According to the formulas, taking the largest i, i.e. i_max = T - 2 you get a highest index for array l of i_max + m - 1 = T - 2 + m - 1 = T + m - 3 and therefore a size of the array of one more: T + m - 2. The size of a zero-based array is always the index of the last element - plus one, after all.
Similarly, for subkey array k you get a highest index of i_max + 1, which is T - 2 + 1 or T - 1. Again, the size of the array is one more, so there are T elements in k. This makes a lot of sense if you require T round keys :)
Note that it seems possible to simply redo the subkey derivation for each round if you require a minimum of RAM. The entire l array doesn't seem necessary either. For software implementations that doesn't matter a single iota of course.
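The array sizes derived above can be sketched in Python as follows. This is a sketch rather than a vetted implementation: the function name `speck_round_keys` is made up, and the update rule and the rotation amounts 8 and 3 are as I recall them from the specification for word sizes above 16 bits, so check them against the pseudo-code on page 18 before relying on it.

```python
def speck_round_keys(key_words, rounds, word_bits=64, alpha=8, beta=3):
    """Expand key_words = (l[m-2], ..., l[0], k[0]) into `rounds` subkeys.

    alpha and beta are the assumed rotation amounts; verify them
    against the specification for your word size.
    """
    mask = (1 << word_bits) - 1

    def ror(x, r):
        return ((x >> r) | (x << (word_bits - r))) & mask

    def rol(x, r):
        return ((x << r) | (x >> (word_bits - r))) & mask

    m = len(key_words)
    k = [0] * rounds            # highest index T - 1, so T elements
    l = [0] * (rounds + m - 2)  # highest index T + m - 3, so T + m - 2 elements

    k[0] = key_words[-1]
    for j in range(m - 1):
        l[j] = key_words[m - 2 - j]  # the input lists l[m-2] first, l[0] last

    for i in range(rounds - 1):      # i = 0 .. T-2
        l[i + m - 1] = ((k[i] + ror(l[i], alpha)) & mask) ^ i
        k[i + 1] = rol(k[i], beta) ^ l[i + m - 1]
    return k


round_keys = speck_round_keys([0, 0], rounds=32)
print(len(round_keys))  # 32
```

Sizing `l` as `rounds + m - 2` is what removes the out-of-bounds access: the `m - 1` input words only fill the first slots, and the loop writes the rest.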

Is there a closed form available for the following table?

Below is a table which has a recursive relation as current cell value is the sum of the upper and left cell.
I want to find the odd positions for any given row denoted by v(x) as represented in the first column.
Currently, I am maintaining two one-dimensional arrays which I update with new sum values, literally checking whether each position's value is odd or even.
Is there a closed form that would let me directly say which positions are odd (say, for the 4th row, it should tell me that p1 and p4 are the odd places)?
Since it is following a particular pattern I feel very certain that a closed form should exist which would mathematically tell me the positions rather than calculating each value and checking it.
The numbers that you're looking at are the numbers in Pascal's triangle, just rotated ninety degrees. You more typically see it written out like this:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
...
You're cutting Pascal's triangle along diagonal stripes going down the left (or right, depending on your perspective), and the question you're asking is how to find the positions of the odd numbers in each stripe.
There's a mathematical result called Lucas's theorem which is useful for determining whether a given entry in Pascal's triangle is even or odd. The entry in row m, column n of Pascal's triangle is given by (m choose n), and Lucas's theorem says that (m choose n) mod 2 (1 if the number is odd, 0 otherwise) can be found by comparing the bits of m and n. If n has a bit that's set in a position where m doesn't have that bit set, then (m choose n) is even. Otherwise, (m choose n) is odd.
As an example, let's try (5 choose 3). The bits in 5 are 101. The bits in 3 are 011. Since the 2's bit of 3 is set and the 2's bit of 5 is not set, the quantity (5 choose 3) should be even. And since (5 choose 3) = 10, we see that this is indeed the case!
In pseudocode using bitwise operators, you essentially want the following:
if ((~m & n) != 0) {
// Row m, entry n is even
} else {
// Row m, entry n is odd.
}
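The same bit test as a small Python sketch, checked against directly computed binomial coefficients. The function names are made up, and indices here are Pascal's triangle rows and columns (both starting at 0); mapping them onto the rows and p-positions of the rotated table is left to the reader, since that depends on how the table is laid out.

```python
from math import comb


def is_odd_entry(m, n):
    """True if (m choose n) is odd: n has no bit set that m lacks."""
    return (n & ~m) == 0


def odd_positions(m):
    """Columns n in row m of Pascal's triangle whose entry is odd."""
    return [n for n in range(m + 1) if is_odd_entry(m, n)]


def matches_pascal(limit):
    """Cross-check the bit test against math.comb for all rows below limit."""
    return all(
        is_odd_entry(m, n) == (comb(m, n) % 2 == 1)
        for m in range(limit)
        for n in range(m + 1)
    )


print(odd_positions(5))    # [0, 1, 4, 5]  (row 5 is 1 5 10 10 5 1)
print(matches_pascal(64))  # True
```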

Point handling for closed loop searching

I have a set of line segments. Each contains only 2 nodes. I want to find the closed cycles produced by joining line segments. Specifically, I am looking for the smallest loop if more than one exists. If you can, please give me a good solution for this.
For example, I have added the line list below together with the point indices to give an idea of my case. (The first value is the line number; the other two values are the point indices.)
0 - 9 11
1 - 9 18
2 - 9 16
3 - 11 26
4 - 11 45
5 - 16 25
6 - 16 49
7 - 18 26
8 - 18 25
9 - 18 21
10 - 25 49
11 - 26 45
So, assume I start from line 1, that is, I start looking for connected loops from points 9 and 18. Could you please explain (step by step) how I can get the "closed loops" from that line?
Well, I don't see any C++ code, but I'll try to suggest a C++ solution (although I'm not going to write it for you).
If your graph is undirected (if it's directed, s/adjacent/in-edges' vertices/), and you want to find all the shortest cycles passing through some vertex N, then I think you could follow this procedure:
G <= a graph
N <= some vertex in G
P <= a path (set of vertexes/edges connecting them)
P_heap <= a priority queue, ascending by distance(P) where P is a path
for each vertex in adjacent(N):
G' = G - edge(vertex, N)
P = dijkstraShortestPath(vertex, N, G')
push(P, P_heap)
You could also just throw out all but the shortest loop, but that's less succinct. As long as you don't allow negative edge weights (which, since you'll be using line segment length for weights, you don't), I think this should work. Also, fortunately Boost.Graph provides all of the necessary functionality to do this in C++ (you don't even have to implement Dijkstra's algorithm)! You can find documentation about it here:
http://www.boost.org/doc/libs/1_47_0/libs/graph/doc/table_of_contents.html
EDIT: you will have to create the graph from that data you listed first before you can do this, so you'll just define your graph's property_map accordingly and make sure the distance between a vertex you're about to insert and all vertexes currently in the graph is greater than zero, because otherwise the vertex is already in the graph and you don't want to insert it again.
Happy graphing!
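For a concrete picture of the procedure, here is a minimal sketch in plain Python rather than Boost.Graph. It assumes every segment has weight 1, because the question gives point indices but no coordinates; with real data you would put the segment length in the third field. The function names are made up for illustration.

```python
import heapq
from collections import defaultdict


def shortest_path(adj, src, dst, banned):
    """Dijkstra from src to dst that never uses the undirected edge `banned`."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [u]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            if {u, v} == banned:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


def shortest_cycle_through(edges, n):
    """Return (length, vertices) of the shortest cycle through n, or None."""
    adj = defaultdict(list)
    for a, b, w in edges:
        adj[a].append((b, w))
        adj[b].append((a, w))
    best = None
    for v, w in list(adj[n]):
        # Remove edge (v, n), then close the loop with the shortest v -> n path.
        found = shortest_path(adj, v, n, frozenset((v, n)))
        if found is None:
            continue
        d, path = found
        if best is None or d + w < best[0]:
            best = (d + w, [n] + path)
    return best


pairs = [(9, 11), (9, 18), (9, 16), (11, 26), (11, 45), (16, 25), (16, 49),
         (18, 26), (18, 25), (18, 21), (25, 49), (26, 45)]
edges = [(a, b, 1.0) for a, b in pairs]  # unit weights are an assumption
print(shortest_cycle_through(edges, 9)[0])   # 4.0
print(shortest_cycle_through(edges, 11)[0])  # 3.0
```

With this data, the shortest loop through point 9 uses four segments, while point 11 sits on a triangle (11, 26, 45). A point such as 21, which has a single segment, yields `None`.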

Exam question about hash tables (interpretation of wording)

I was confused about the wording of a particular exam question about hash tables. The way I understand it there could be two different answers depending on the interpretation. So I was wondering if someone could help determine which understanding is correct. The question is below:
We have a hash table of size 7 to store integer keys, with hash function h(x) = x mod 7. If we use linear probing and insert elements in the order 1, 15, 14, 3, 9, 5, 27, how many times will an element try to move to an occupied spot?
I'll break down my two different understandings of this question. First of all the initial indexes of each element would be:
1: 1
15: 1
14: 0
3: 3
9: 2
5: 5
27: 6
First interpretation:
1: is inserted into index 1
15: tries to go to index 1, but due to a collision moves left to index 0. Collision count = 1
14: tries to go to index 0, but due to collision moves left to index 6. Collision count = 2
3: is inserted into index 3
9: is inserted into index 2
5: is inserted into index 5
27: tries to go to index 6, but due to collisions moves to index 5 and then to 4 which is empty. Collision count = 4
Answer: 4?
Second interpretation:
Only count the time when 27 tries to move to the occupied index 5 because of a collision with the element in index 6.
Answer: 1?
Which answer would be correct?
Thanks.
The wording is silly.
The teacher arguably wants #1 but I would argue that #2 is pedantically correct because an element will only ever try to move to an occupied spot once, as pointed out. In the other cases it does not move to an occupied spot but rather from an occupied spot to a free spot.
Tests in school are sort of silly -- the teacher (or TA) already knows what he/she wants. There is a line to draw between "being pedantically correct" and "giving the teacher what they want". (Just never, ever give in to the provably wrong!)
One thing that has never (at least that I recall ;-) failed me in a test or homework is providing an answer with a solid -- and correct -- justification for the answer; this may include also explaining the "other" answer.
Teacher/environment, repertoire, hubris and grade (to name a few) need to be balanced.
Happy schooling.
Interpretation 1 is correct. Collision with 6 means that slot 6 is occupied, so why don't you count it?
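Interpretation 1 is easy to check by simulation. The sketch below counts every probe that lands on an occupied slot; the probe direction is a parameter because the question steps left (index - 1), whereas many textbooks step right (index + 1). The function name is made up for illustration.

```python
def count_probe_collisions(keys, size, step):
    """Insert keys with linear probing and count probes that hit occupied slots."""
    if len(keys) > size:
        raise ValueError("more keys than slots")
    table = [None] * size
    collisions = 0
    for key in keys:
        index = key % size
        while table[index] is not None:
            collisions += 1                # this probe hit an occupied slot
            index = (index + step) % size  # move on to the next slot
        table[index] = key
    return collisions, table


keys = [1, 15, 14, 3, 9, 5, 27]
print(count_probe_collisions(keys, 7, step=-1))  # (4, [15, 1, 9, 3, 27, 5, 14])
print(count_probe_collisions(keys, 7, step=+1))  # (3, [14, 1, 15, 3, 9, 5, 27])
```

Stepping left reproduces the count of 4 from the first interpretation; stepping right gives 3 for the same keys, so the probe direction your course uses matters for the expected answer.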

How do I calculate the number of permutations in base 3 combinatorics?

I've never been much for math and I'm hoping that someone can help me out with the following.
I have 5 boxes:
1 2 3 4 5
[ ] [ ] [ ] [ ] [ ]
The boxes can either be white, gray, or black (or think of it as 0, 1, 2)
How many possible states can the box set be in?
What is the pseudocode (or in any language) to generate all the possible outcomes??
ie...
00000
00001
00011
00111
etc, etc...
I really appreciate any help anyone can give me with this.
The answer for the number of combinations is 3x3x3x3x3 (3^5), since each box can have 3 possible colors.
As for generating the outcomes, see if you can figure it out using this matrix, with 0, 1, or 2 representing the color of each box. On a smaller scale (let's assume 3 boxes) it would look like this:
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2
This is a classic problem of generating permutations with repetition. You have 3 possibilities for each position, and 5 positions, so the total number of generated strings is 3^5 = 243.
Recursion gives a general solution for any number of boxes (hard-coded nested loops only work for one fixed number of boxes).
Here's a quick example:
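using System;

class Program
{
    public static void Main(string[] args)
    {
        Generate("", 5);
    }

    // Appends each of the three digits in turn until the string has `limit` characters.
    // Must be static so that the static Main can call it.
    private static void Generate(string s, int limit)
    {
        if (s.Length == limit)
            Console.WriteLine(s);
        else
        {
            Generate(s + "0", limit);
            Generate(s + "1", limit);
            Generate(s + "2", limit);
        }
    }
}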
To answer your first question, what would the answer be if the boxes could contain only one of two values? So, what's the answer if the boxes contain one of three values?
To answer your second question, what pseudocode generates all possible outcomes of one box? Now, what pseudocode generates all possible outcomes of two boxes?
I'd recommend solving the problem on paper first. Try to solve it with a smaller number of boxes (maybe three), and list all possibilities. Then, think of how your reasoning went, or how you'd explain what you did to a small child.
This seems like a homework problem. I'll just give you some help as to the solution then.
What you are saying is that each box has three states, which are all independent. One box would have 3 solutions, and two boxes would have 3 * 3 solutions - for each state of the first box the second box would have three states as well. Extend that to 5 boxes.
To generate each solution, you can just cycle through it. It is easy to make nested for loops for each box, and multiplying by powers of 10 can let you show the number at once.
You can generalize the code for multiple boxes in a similar way.
Thank you all for your answers, at least those of you who actually gave me one.
While I can appreciate that the question sounded like it was pulled straight out of Computer Science 101, it wasn't. The irony of the matter is that it was for real life on a real deadline and I didn't have time to hearken back to when I was being taught this stuff and said to myself, "when am I ever going to need this crap"
If I wanted to be patronized and treated like a school boy I would go back to my elementary school and ask my 5th grade teacher if I can go to the bathroom
Thanks again
the number of states is 3^5.
pseudocode is
for value from 0 to 3^5-1
print base3(value)
where base3 is a function that repeatedly takes modulo 3 to get a digit, then removes that digit (by dividing by 3)
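A minimal Python sketch of that pseudocode, with a fixed width so that leading zeros are kept. The names `base3` and `all_states` are illustrative, not from any library.

```python
def base3(value, width):
    """Write value in base 3, padded with leading zeros to `width` digits."""
    digits = []
    for _ in range(width):
        digits.append(str(value % 3))  # take the lowest base-3 digit
        value //= 3                    # then remove it
    return "".join(reversed(digits))


def all_states(boxes):
    """Every state of `boxes` three-colour boxes, in counting order."""
    return [base3(value, boxes) for value in range(3 ** boxes)]


states = all_states(5)
print(len(states))            # 243
print(states[0], states[-1])  # 00000 22222
```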
Hint: imagine that each box is a position in a number and each colour is a different digit. In the real world, how many combinations (including zero) do you get with 2 positions and 10 possible digits? What about 3 positions? What's the relationship between adding an extra position and the number of combinations, given the number of digits you have available?
Unique number of combinations: 3^5=243
Code:
for i = 0 to 3^5-1
{
    n = i
    s = ""
    for j = 1 to 5
    {
        d = n mod 3
        s = toascii(d) . s
        n = n / 3
    }
    println s
}
Here's how I first learned to do this: first think about how many choices you are making. You are making five choices, one for each box. So write down five blank lines with multiplication signs:
__ x __ x __ x __ x __ = ?
In each blank, write the number of objects you have to choose from for that box. Since you have 3 numbers to choose from for each box, you write a 3 in every blank:
3 x 3 x 3 x 3 x 3 = 243
This gives you the total number of permutations for those choices.
The number of possibilities is 3 to the power 5
If you loop from 0 to that number minus 1 and express it in base 3 you will have all the possibilities (remember to prepend 0s where necessary)
In Ruby:
last_value = 3**5 - 1
for i in (0..last_value)
  base_3_number = i.to_s(3)
  puts base_3_number.rjust(5, "0") # pad with leading 0s where necessary
end
Can I ask what about this you don't understand, or what's tripping you up? I see that everyone here has simply answered the question, but if you've copied their answers, you've learned nothing, and thus completely missed the point of the homework. Assuming your next lesson builds upon this one, you're just going to fall further behind.
If you either worked for me or were in my class I'd simply ask the following...
"How do you think the problem should be solved?" The answer to which might reveal where you're getting hung up. A wise professor of mine at CMU once said "I can't help you understand this until you know what you don't understand" I never did figure out what I didn't understand and I dropped his class, but the lesson stuck with me.
I know it's probably too late, but for these homework questions I really think we should be helping the person learn as opposed to simply providing an answer and doing their homework for them.
Your problem needs nothing more than the rule of product in combinatorics.
You can choose the state of the first box in 3 ways, and the state of the second box in 3 ways, and ... and the state of the 5th box in 3 ways. The number of ways in which you can set the state of all the boxes is the product of all the five (equal) numbers of ways, i.e. 3x3x3x3x3 = 3^5 = 243.
Similar question: how many numbers can you form with 5 digits in the decimal system, counting the leading zeros? That is, how many numbers are there from 00000 to 99999? You can choose the first digit in 10 ways (0...9), and so on and so on, and the answer is 10x10x10x10x10 = 100000, as you already know.
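The rule of product is easy to sanity-check in code. This Python sketch multiplies the number of choices per box and compares the result with a brute-force enumeration using the standard library's `itertools.product`; the helper names are made up for illustration.

```python
from itertools import product


def count_states(choices_per_box):
    """Rule of product: multiply the number of choices for each box."""
    total = 1
    for choices in choices_per_box:
        total *= choices
    return total


def enumerate_states(symbols, boxes):
    """Brute force: every way to fill `boxes` positions from `symbols`."""
    return ["".join(state) for state in product(symbols, repeat=boxes)]


print(count_states([3] * 5))            # 243
print(len(enumerate_states("012", 5)))  # 243
print(count_states([10] * 5))           # 100000
```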
Don't even try to write code to answer this! The reason is that you need some very large numbers (factorials) to calculate it. These create numbers much larger than any base type in the CLR. You can use this opensource library to do the calculation.
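#include <iostream>
using namespace std;

// p = digits chosen so far, n = number of boxes, d = chosen digits packed in decimal
void solve(int p = 0, int n = 5, int d = 0)
{
    if (n == p)
    {
        // Print the n digits of d, least significant digit first
        int rev = d;
        int i = 0;
        while (i < n) {
            cout << rev % 10;
            rev /= 10;
            i++;
        }
        cout << endl;
        return;
    }
    for (int i = 0; i < 3; i++)
    {
        solve(p + 1, n, d * 10 + i);
    }
}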
