hash functions and hash tables


I have been assigned a question dealing with hash tables and hash functions, and I am stuck on how the table size relates to the function given.
I was given the hash function h(n) = n/10, which is simple enough.
Then I am told that the size is 10, and I have to insert the numbers given to me into a list.
My problem is that I thought the array holding these numbers would be indexed 0-9. My numbers are 352, 4079, 25, 251, 498, 591, 242, which confuses me more: 4079/10, for example, is far greater than 9, so I cannot use it as an index into the array.
I think I am probably missing something huge here, but I don't know what.
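One common reading of such an exercise, offered here as an assumption rather than as the assigned answer, is that h(n) = n/10 uses integer division and that the result is then reduced modulo the table size so that it lands in slots 0-9. The Python sketch below applies that reading to the numbers in the question and chains keys that collide; the names TABLE_SIZE, slot_for and build_table are illustrative.

```python
TABLE_SIZE = 10  # table size stated in the question

def slot_for(n, size=TABLE_SIZE):
    # Assumption: h(n) = n // 10 (integer division), then wrap into 0..size-1.
    return (n // 10) % size

def build_table(keys, size=TABLE_SIZE):
    # Separate chaining: each slot holds a list of the keys hashed to it.
    table = [[] for _ in range(size)]
    for key in keys:
        table[slot_for(key, size)].append(key)
    return table

keys = [352, 4079, 25, 251, 498, 591, 242]
for index, bucket in enumerate(build_table(keys)):
    print(index, bucket)
```

Under this reading 4079 lands in slot 7 (4079 // 10 = 407, and 407 % 10 = 7), while 352 and 251 both land in slot 5 and have to share it.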

Related

How to find the value of the common ratio in geometric series without the method of substitution

It has been some time since I last solved mathematical equations.
So I can't seem to find a simplified equation for the common ratio, given the final sum and the value of n in the formula:
finalSum = r(1 - r^n)/(1 - r);
For example: 3, 9, 27, 81, 243
363 = r(1 - r^5)/(1 - r);
The above is a very simple example, but I'll be dealing with decimals. Is there any way of getting a simplified equation for the value of r? Or is the method of substitution the only way?
PS: This is for a program I'm writing
Please let me know.
Thanks

Substitution is the easiest method for finding the common ratio.
This resource may be helpful when dealing with common ratios of geometric sequences:
Geometric Sequences
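Substitution can be automated. A minimal sketch in Python, assuming r > 0 and the question's formula finalSum = r(1 - r^n)/(1 - r), which equals r + r^2 + ... + r^n and grows with r, so bisection can narrow r down. The function names are illustrative.

```python
def series_sum(r, n):
    # r + r^2 + ... + r^n, summed directly so that r == 1 needs no special case.
    total, term = 0.0, 1.0
    for _ in range(n):
        term *= r
        total += term
    return total

def solve_ratio(final_sum, n, tol=1e-12):
    # Assumes final_sum > 0 and r > 0; the sum is increasing in r on that range.
    lo, hi = 0.0, 1.0
    while series_sum(hi, n) < final_sum:  # grow the bracket until it contains r
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if series_sum(mid, n) < final_sum:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(solve_ratio(363, 5))  # close to 3 for the example 3, 9, 27, 81, 243
```

The same loop works for decimal sums, since nothing in it depends on r being an integer.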

Simple function to generate random number sequence without knowing previous number but know current index (no variable assignment)?

Is there any (simple) random generation function that can work without variable assignment? Most functions I read look like this current = next(current). However currently I have a restriction (from SQLite) that I cannot use any variable at all.
Is there a way to generate a number sequence (for example, from 1 to max) with only n (current number index in the sequence) and seed?
Currently I am using this:
cast(((1103515245 * Seed * ROWID + 12345) % 2147483648) / 2147483648.0 * Max as int) + 1
with Max being 47 and ROWID being n. However, for some seeds the repeat rate is too high (3 unique values out of 47).
In my requirements, repetition is ok as long as it's not too much (<50%). Is there any better function that meets my need?
The question has sqlite tag but any language/pseudo-code is ok.
P.S. I have tried linear congruential generators with some a/c/m triplets and Seed * ROWID as the seed, but that does not work well; it's even worse.
EDIT: I currently use this one, but I do not know where it's from. The rate looks better than mine:
((((Seed * ROWID) % 79) * 53) % "Max") + 1
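To compare candidate formulas, it may help to count how many distinct values each one produces for a given seed. The sketch below ports the two expressions from the question to Python; the function names are illustrative, and Python integers never overflow, so results for very large seeds may differ from what SQLite computes.

```python
def question_formula(seed, rowid, max_value):
    # Mirrors the first SQL expression from the question.
    scaled = (1103515245 * seed * rowid + 12345) % 2147483648
    return int(scaled / 2147483648.0 * max_value) + 1

def edited_formula(seed, rowid, max_value):
    # Mirrors the expression from the EDIT.
    return (((seed * rowid) % 79) * 53) % max_value + 1

def distinct_count(formula, seed, max_value):
    # Number of different outputs over ROWID = 1..max_value.
    return len({formula(seed, rowid, max_value) for rowid in range(1, max_value + 1)})

for seed in (1, 2, 3, 1000):
    print(seed,
          distinct_count(question_formula, seed, 47),
          distinct_count(edited_formula, seed, 47))
```

Both formulas have degenerate seeds: any seed that is a multiple of 79 makes the second expression return 1 for every row.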

I am not sure if you still have the same problem, but I might have a solution for you.
What you could do is use pseudo-random M-sequence generators based on shift registers. You just have to pick a primitive polynomial of high enough order, and you don't really need to store any variables.
For more info you can check the wiki page
What you need to code is just the shift equation of the primitive polynomial; I have checked in an online editor, and it should be very easy to do. I think the easiest way for you would be to work in binary and use PRBS sequences; depending on how many elements you have, you can choose your sequence length. For example, this is the implementation for a length of 2^15 = 32768 (PRBS15), with the primitive polynomial I took from the wiki page (there you can find the primitive polynomials all the way up to PRBS31, which would be 2^31 = 2.1475e+09).
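Basically what you need to do is this (SQLite has no XOR operator, so the two tap bits are each masked to a single bit and compared with <>):
SELECT (((ROWID << 1) | (((ROWID >> 14) & 1) <> ((ROWID >> 13) & 1))) & 0x7fff)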
The beauty of this approach is that if you pick a PRBS whose period is longer than your largest ROWID value, you will get a unique random index. Very simple. :)
If you need help searching for primitive polynomials, you can see my github repo, which deals exactly with finding primitive polynomials and unique m-sequences. It is currently written in Matlab, but I plan to write it in Python in the next few days.
Cheers!
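The shift step can be checked outside SQLite. The Python sketch below assumes the PRBS15 polynomial x^15 + x^14 + 1 mentioned above; it applies a single shift to every non-zero 15-bit value and counts the distinct results. The function name prbs15_step is illustrative.

```python
def prbs15_step(state):
    # One shift of the x^15 + x^14 + 1 register: XOR the two tap bits,
    # shift left, and keep the low 15 bits.
    new_bit = ((state >> 14) ^ (state >> 13)) & 1
    return ((state << 1) | new_bit) & 0x7FFF

outputs = [prbs15_step(rowid) for rowid in range(1, 2**15)]
print(len(outputs), len(set(outputs)))
```

Each step is reversible, so no two inputs share an output. Note that for ROWID values below 2^13 both tap bits are zero and a single step merely doubles the value, so applying the step several times may be needed before the result looks scrambled.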

What about using a good hash function and mapping the result into the [1...max] range?
Something along these lines (in pseudocode); sha1 was added to SQLite 3.17.
sha1(ROWID) % Max + 1
Or use any external C code for hash (murmur, chacha, ...) as shown here
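The same idea can be sketched in Python with the standard hashlib module. Mixing the seed into the hashed string is an assumption added here so that different seeds give different sequences; the function name hashed_index is illustrative.

```python
import hashlib

def hashed_index(seed, rowid, max_value):
    # Hash the seed and row index together, then fold the digest into 1..max_value.
    digest = hashlib.sha1(f"{seed}:{rowid}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % max_value + 1

print([hashed_index(42, rowid, 47) for rowid in range(1, 11)])
```

A hash does not guarantee unique outputs, so some repetition should be expected; the question states that a moderate amount is acceptable.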

A linear congruential generator with appropriately-chosen parameters (a, c, and modulus m) will be a full-period generator, such that it cycles pseudorandomly through every integer in its period before repeating. Although you may have tried this idea before, have you considered that m is equivalent to max in your case? For a list of parameter choices for such generators, see L'Ecuyer, P., "Tables of Linear Congruential Generators of Different Sizes and Good Lattice Structure", Mathematics of Computation 68(225), January 1999.
Note that there are some practical issues to implementing this in SQLite, especially if your SQLite version supports only 32-bit integers and 64-bit floating-point numbers (with 52 bits of precision). Namely, there may be a risk of overflow if an intermediate multiplication exceeds 32 bits for integers, and of precision loss if an intermediate multiplication results in a greater-than-52-bit number.
Also, consider why you are creating the random number sequence:
Is the sequence intended to be unpredictable? In that case, a linear congruential generator alone is not enough, and you should generate unique identifiers by other means, such as by combining unique numbers with cryptographically random numbers.
Will the numbers generated this way be exposed in any way to end users? If not, there is no need to obfuscate them by "shuffling" them.
Also, depending on the SQLite API you're using (for your programming language), there may be a way to write a custom function to convert the seed and ROWID to a random unique number. The details, however, depend heavily on the specific SQLite API. Another answer shows an example for Perl.
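As a small illustration of a full-period generator, the Python sketch below uses toy parameters chosen for readability (m = 16, a = 5, c = 3); they are not taken from L'Ecuyer's tables. They satisfy the usual full-period conditions: c is coprime to m, a - 1 is divisible by every prime factor of m, and a - 1 is divisible by 4 because m is.

```python
def lcg_cycle(a, c, m, seed):
    # Iterate x -> (a * x + c) mod m for m steps, starting from seed.
    values, x = [], seed % m
    for _ in range(m):
        x = (a * x + c) % m
        values.append(x)
    return values

# Illustrative parameters only: m = 16, a = 5, c = 3.
cycle = lcg_cycle(5, 3, 16, seed=7)
print(cycle, len(set(cycle)) == 16)
```

Every value in 0..15 appears exactly once before the sequence repeats. This form still carries state from one step to the next, which is the restriction the question is trying to avoid.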

Generate unique random numbers over a large number range

What I am asking is whether there is a way to generate unique random numbers without helper structures.
I mean, are there existing mathematical functions (or algorithms) that natively generate each random number only once over a range? (I would rather not write some kind of hash function specific to this problem.)
This is because I need to generate a lot of unique integers chosen between 0 and 10,000,000,000 (about 60% of the range), so a random repetition is not so improbable, and storing previously generated numbers in a structure for subsequent lookup (even if well optimized, like bit arrays) could be too expensive, in both space and time.
P.S.
(Note that when I write random I really mean pseudo-random.)

If you want to ensure uniqueness then do not use a hash function, but instead use an encryption function to encrypt the numbers 0, 1, 2, 3 ... Since encryption is reversible then every number (up to the block size) is uniquely encrypted and will produce a unique result.
You can either write a simple Feistel cypher with a convenient block size or else use the Hasty Pudding cypher, which allows a large range of block sizes. Whenever an input number generates too large an output, then just go to the next input number.
Changing the key of the cypher will generate a different series of output numbers. The same series of numbers can be regenerated whenever needed by remembering the key and starting again with 0, 1, 2 ... There is no need to store the entire sequence. As you say, the sequence is pseudo-random and so can be regenerated easily if you know the key.
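A minimal sketch of the Feistel approach in Python. The SHA-256 round function, the four rounds and all names are assumptions made for illustration; this is a toy permutation, not a vetted cipher. The block size is the smallest even number of bits that covers the range, and outputs beyond the range are skipped as described above.

```python
import hashlib

def round_value(half, key, round_no, half_bits):
    # Round function: any deterministic function works; SHA-256 is an arbitrary choice.
    digest = hashlib.sha256(f"{key}:{round_no}:{half}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)

def feistel_encrypt(x, key, half_bits, rounds=4):
    # Balanced Feistel network on 2 * half_bits bits; a bijection for any round function.
    mask = (1 << half_bits) - 1
    left, right = x >> half_bits, x & mask
    for round_no in range(rounds):
        left, right = right, left ^ round_value(right, key, round_no, half_bits)
    return (left << half_bits) | right

def unique_numbers(key, limit):
    # Encrypt 0, 1, 2, ... and skip outputs that fall outside 0..limit-1.
    half_bits = max(1, ((limit - 1).bit_length() + 1) // 2)
    for counter in range(1 << (2 * half_bits)):
        value = feistel_encrypt(counter, key, half_bits)
        if value < limit:
            yield value

print(list(unique_numbers("demo-key", 20)))
```

Because the generator is lazy, a caller can stop after as many numbers as needed; only the key and the current counter have to be remembered.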

Instead of pseudo-random numbers, you could try so-called quasi-random numbers, which are more accurately called low-discrepancy sequences. [1]
[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence

Vigenere Cipher - decryption (by hand)

This is a Vigenere cipher-text
EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV
WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH
SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN
MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ
ZGJ
The index of coincidence gave a key length of six (6): I know this is right (I used an online Java applet to decrypt the whole thing using the key 'QUARTZ').
However, in this question we are only told the first and last two letters of the Key - 'Q' and 'TZ.'
So far I have split the ciphertext into slices using this awesome applet. So the first slice is 0, k, 2k, 3k, 4k; the second is 1, k + 1, 2k + 1, 3k + 1; et cetera.
KeyPos=0: EQEQQSCXQJJHDEYIUTSVMTVUMTYJ
KeyPos=1: OFZNMICYSYCWCHFCMUULILNGYUX
KeyPos=2: RDFIBMSDBRNOCSLNERTONOIADYN
KeyPos=3: LICFVGVVVTXDRNSXTPZDKGDZERZ
KeyPos=4: LHHGXKWTBXHFMXXMHEHGAHXGXMZ
KeyPos=5: TOBGMNSLMOTTHVTGMNMNDEQRRLG
My idea was to calculate the highest-frequency letter in each block, hoping that the most frequent letter would give me some clue as to how to find 'U,' 'A' and 'R.' However, the most frequent letters in these blocks are:
KeyPos=0: Q,4 T,3 E,3 J,3
KeyPos=1: C,4 U,3 Y,3
KeyPos=2: N,4 O,3 R,3 D,3 B,2
KeyPos=3: V,4 D,3 Z,3
KeyPos=4: H,6 X,6 M,3 G,3
KeyPos=5: M,4 T,4 N,3 G,3
Which yields QCNVHM, or QUNVHM (being generous), neither of which is that close to QUARTZ. There are online applets that can crack this with no problem, so the text can't be too short to yield decent frequency counts from the blocks.
I guess I must be approaching this the wrong way. I just hoped one of you might be able to offer some clue as to where I am going wrong.
p.s. This is for a digital crypto class.

Interesting question...
I don't have a programmatic solution for cracking the original ciphertext, but I was able to solve it with a little mind power and some helpful JavaScript.
I started by using this page and the information you supplied. Provide the ciphertext, a key length of 6 and hit initialize. What's nice about the approach here is that unknowns in either the plaintext or key are left as hyphens.
Update the key, adding only what you know Q---TZ and click 'update plaintext'. At this point we know:
o---sua---opo---oca---nha---enc---rom---dth---ama---int---ept---our---mun---tio---ewi---eus---the---ond---loc---onf---now---hed---off---ere---nsw---esd---tmi---ght
Here's where I applied a bit of brain power. You start recognizing bits of the plaintext: the, now and off make an appearance. At the end, there's ght, which made me think the prior letter is likely a vowel, as in light or thought. I replaced the corresponding hyphen with u and clicked update keyword to find what key letter would have produced that combination. The matching letter turns out to be F. I then updated the plaintext to see the results. They didn't look promising. So I tried i instead, which resulted in:
o--usua--ropo--loca--onha--eenc--prom--edth--eama--eint--cept--gour--mmun--atio--wewi--beus--gthe--cond--yloc--ionf--mnow--thed--poff--mere--insw--nesd--atmi--ight
Now we're getting somewhere. At the start I see something that might be usual, and further in I see int--cept and near the end w--nesd-- at mi--ight. Voila. Filling in the letters for wednesday and updating the keyword yielded QUARTZ.
... So, how to port this approach to code? Not sure about the best way to do that just yet. The idea of using the known characters in the key, partially decrypting the ciphertext and brute forcing the rest is appealing. But without a dictionary handy, I'm not sure what the best brute-forcing method would be...
To be continued (maybe)...
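The partial decryption step described above can be sketched in Python. The function name decrypt_partial and the use of '-' for unknown key letters are illustrative choices that mirror the web page's output; the ciphertext is the one from the question.

```python
CIPHERTEXT = (
    "EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV "
    "WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH "
    "SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN "
    "MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ "
    "ZGJ"
)

def decrypt_partial(ciphertext, key):
    # Key letters given as '-' are unknown; their plaintext positions stay '-'.
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    plain = []
    for i, ch in enumerate(letters):
        k = key[i % len(key)].upper()
        if k == "-":
            plain.append("-")
        else:
            plain.append(chr((ord(ch) - ord(k)) % 26 + ord("a")))
    return "".join(plain)

print(decrypt_partial(CIPHERTEXT, "Q---TZ"))
print(decrypt_partial(CIPHERTEXT, "QUARTZ"))
```

Trying a candidate letter for one unknown key position is then a matter of replacing a single '-' in the key and re-reading the output, which is the loop a brute-force version would automate.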

An algorithm wouldn't just consider the most frequent letters but the frequency pattern of the whole alphabet. Technically you compute the index of coincidence for each possible shift and consider the maximal ones.
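One way to score the whole frequency pattern, sketched in Python under stated assumptions: for each column, try all 26 shifts and keep the one whose decrypted letters correlate best with typical English letter frequencies. The frequency table holds approximate textbook values, and the names best_shift and guess_key are illustrative.

```python
ENGLISH_FREQ = [  # approximate relative frequencies of a..z in English text, in percent
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]

def best_shift(column):
    # Score a shift by summing the English frequency of every decrypted letter,
    # which is the correlation between the column's counts and ENGLISH_FREQ.
    def score(shift):
        return sum(ENGLISH_FREQ[(ord(ch) - ord("A") - shift) % 26] for ch in column)
    return max(range(26), key=score)

def guess_key(ciphertext, key_length):
    # Split the letters into key_length columns and solve each column separately.
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    columns = ["".join(letters[i::key_length]) for i in range(key_length)]
    return "".join(chr(best_shift(col) + ord("A")) for col in columns)

# First column (KeyPos=0) from the question.
print(chr(best_shift("EQEQQSCXQJJHDEYIUTSVMTVUMTYJ") + ord("A")))
```

With only about 27 letters per column the top-scoring shift is not guaranteed to be the right one, so it may be worth keeping the two or three best candidates per position and checking them against the known key letters.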

Big O Log problem solving

I have a question that comes from an algorithms book I'm reading, and I am stumped on how to solve it (it's been a long time since I've done log or exponent math). The problem is as follows:
Suppose we are comparing implementations of insertion sort and merge sort on the same machine. For inputs of size n, insertion sort runs in 8n^2 steps, while merge sort runs in 64n log n steps. For which values of n does insertion sort beat merge sort?
Log is base 2. I've started out trying to solve for equality, but get stuck around n = 8 log n.
I would like the answer to discuss how to solve this mathematically (brute force with excel not admissible sorry ;) ). Any links to the description of log math would be very helpful in my understanding your answer as well.
Thank you in advance!

http://www.wolframalpha.com/input/?i=solve%288+log%282%2Cn%29%3Dn%2Cn%29
(edited since old link stopped working)

Your best bet is to use Newton's method.
http://en.wikipedia.org/wiki/Newton%27s_method
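A minimal Python sketch of Newton's method for the crossover point, where n = 8 log2 n, written as a zero of f(n) = n - 8 log2 n. The starting guess of 40 and the fixed number of steps are assumptions made for illustration.

```python
import math

def f(n):
    return n - 8 * math.log2(n)

def f_prime(n):
    # The derivative of 8 * log2(n) is 8 / (n * ln 2).
    return 1 - 8 / (n * math.log(2))

def newton(n, steps=20):
    # Repeat the update n <- n - f(n) / f'(n).
    for _ in range(steps):
        n = n - f(n) / f_prime(n)
    return n

root = newton(40.0)
print(root)
```

The result lies between 43 and 44, so insertion sort's 8n^2 stays below merge sort's 64n log n up to n = 43. The derivative is zero near n = 11.5, so starting guesses close to that value should be avoided.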

One technique to solving this would be to simply grab a graphing calculator and graph both functions (see the Wolfram link in another answer). Find the intersection that interests you (in case there are multiple intersections, as there are in your example).
In any case, there isn't a simple expression that solves n = 8 log₂ n (as far as I know). It may be simpler to rephrase the question as: "Find a zero of f(n) = n - 8 log₂ n". First, find a region containing the intersection you're interested in, and keep shrinking that region. For instance, suppose you know your target n is greater than 42 but less than 44. f(42) is less than 0, and f(44) is greater than 0. Try f(43). It's less than 0, so try 43.5. It's still less than 0, so try 43.75. It's greater than 0, so try 43.625. It's greater than 0, so keep going down, and so on. This technique is called binary search (for root finding it is also known as the bisection method).
Sorry, that's just a variation of "brute force with excel" :-)
Edit:
For the fun of it, I made a spreadsheet that solves this problem with binary search: binary‑search.xls. The binary search logic is in the second data column, and I just auto-extended that.
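The same search can be sketched in a few lines of Python instead of a spreadsheet. The bracket 42..44 comes from the walk-through above; the tolerance and the function names are illustrative.

```python
import math

def f(n):
    return n - 8 * math.log2(n)

def bisect(lo, hi, tol=1e-9):
    # Requires f(lo) < 0 < f(hi), as with lo = 42 and hi = 44 in the text.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

root = bisect(42.0, 44.0)
print(root, math.floor(root))
```

The floor of the root is 43, the largest n for which insertion sort still takes fewer steps. Since f(1) is positive and f(2) is negative, there is a second crossing between 1 and 2, so insertion sort wins for the integers n = 2 through 43.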
