In a hash table, why not update h(x) (= h_0(x)) itself instead of using h_1(x), h_2(x), ...?

In a hash table, let's say that
h_0(x) = x mod m
and that, to resolve collisions, we use linear probing:
h_i(x) = (x + i) mod m
But whatever the hash function is, and whether we do linear or quadratic probing, searching for a key still has to walk through all the collisions, right?
So the idea is this:
What about updating the hash function h(x) itself? If element a, after passing all the collisions, was finally placed at
(h_0(a) + L_a) mod m
then what about updating the hash function to
h(x) = h_0(x) + L_a * f(x - a),
where f(x - a) is 1 only if x = a (and 0 otherwise)?
Then, when later searching for a, we wouldn't have to go through all the collisions but could find the element right away.
What is the problem with this idea?
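For concreteness, here is a minimal Python sketch (the table layout, a list of (key, value) pairs or None, is hypothetical) of the linear-probing search described above; this is the probe loop that has to pass every colliding slot:
def probe_search(table, m, key):
    # table: list of length m holding (key, value) pairs or None
    for i in range(m):                   # walk the probe sequence h_i(key)
        slot = (key + i) % m
        if table[slot] is None:          # empty slot: key is absent
            return None
        if table[slot][0] == key:        # found after i collisions
            return table[slot][1]
    return None                          # table full and key absent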

Power with integer exponents in Isabelle

Here is my definition of power for integer exponents following this mailing-list post:
definition
"ipow x n = (if n < 0 then (1 / x) ^ n else x ^ n)"
notation ipow (infixr "^⇩i" 80)
Is there a better way to define it?
Is there an existing theory in Isabelle that already includes it so that I can reuse its results?
Context
I am dealing with complex exponentials, for instance consider this theorem:
After I proved it, I realized I need to work with integers n, not just naturals, and this involves using powers to take the n out of the exponential.
I don't think something like this exists in the library. However, you have a typo in your definition. I believe you want something like
definition
"ipow x n = (if n < 0 then (1 / x) ^ nat (-n) else x ^ nat n)"
Apart from that, it is fine. You could write inverse x ^ nat (-n), but it should make little difference in practice. I would suggest the name int_power since the corresponding operation with natural exponents is called power.
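Not Isabelle, but as a quick sanity check of what the corrected definition computes, here is a hypothetical Python mirror of it; the role of nat (-n) is to hand a non-negative exponent to the underlying natural-number power:
def ipow(x, n):
    # negative exponents invert the base and use |n| as the natural exponent
    return (1 / x) ** (-n) if n < 0 else x ** n

assert ipow(2, 3) == 8
assert ipow(2, -2) == 0.25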
Personally, I would avoid introducing a new constant like this, because in order to actually use it productively, you also need an extensive collection of theorems around it. This means quite a bit of (tedious) work. Do you really need to talk about integers here? I find that one can often get around it in practice (in particular, note that the exponentials in question are periodic anyway).
It may be useful to introduce such a power operator nevertheless; all I'm saying is you should be aware of the trade-off.
Side note: An often overlooked function in Isabelle that is useful when talking about exponentials like this is cis (as in 'cosine + i · sine'). cis x is equivalent to 'exp(ix)' where x is real.
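For illustration only (this is plain Python, not the Isabelle constant), the identity behind cis looks like this:
import cmath, math

def cis(x):
    # cis x = cos x + i*sin x, which equals exp(i*x) for real x
    return complex(math.cos(x), math.sin(x))

assert abs(cis(0.7) - cmath.exp(1j * 0.7)) < 1e-12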

Math-operations (or hash functions) without order dependency

The problem I am thinking about is hash functions, although I'm mainly interested in the mathematical terms/background to describe my requested property.
Consider the case where I have a hash-function taking a secret (S) and a number (X) which creates another number (Y):
Hash : S, X → Y
I then define two different hash-functions with their own secrets (a and b):
H1(X) := Hash(a, X)
H2(X) := Hash(b, X)
The property I want is that:
H1(H2(X)) = H2(H1(X))
(I think this is what it means for the functions to commute?)
Taking a step back from programming and thinking about the math, we can look at different operations. If the function consists of one operation only, then I'm quite sure this property will always be satisfied if the operation is both associative and commutative. However, there are operations which are order insensitive but non-commutative, e.g. division.
Some examples that seem to work:
Simple addition:
Hash(S, X) := S + X
Bitwise xor:
Hash(S, X) := S xor X
Modular exponentiation:
Hash(S, X) := X^S mod p
if S ∈ N and X ∈ Z
How do I know if my choice of hash function will make it commute?
Commutativity under composition is an unusual property. It's not typical unless the functions are built from a commutative operation of some underlying algebraic structure, such as "multiply by x". This is the form of your three examples.
The practical answer is "if you don't have a proof that it's commutative, assume it's not commutative". There's no general algorithm that will provide that proof for you.
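As a small illustration (with made-up parameters), the modular-exponentiation example commutes because (X^b)^a = X^(a*b) = (X^a)^b (mod p):
p = 101          # hypothetical small prime modulus
a, b = 7, 13     # hypothetical secrets

def H1(x): return pow(x, a, p)   # Hash(a, X) = X^a mod p
def H2(x): return pow(x, b, p)   # Hash(b, X) = X^b mod p

# Both orders compute X^(a*b) mod p, so the compositions agree.
assert all(H1(H2(x)) == H2(H1(x)) for x in range(p))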

RSA decryption methodology

I'm not learning cryptography yet, and this exercise, in the form it was delivered as homework, was more of an exercise in reading composite functions and the like. Either way, I took a look at part of the source code and didn't understand this.
For RSA encryption, the source code manipulated the string as follows:
1) The message is hashed into an integer list (int1, int2, int3, ...).
2) Encrypt int1.
3) Subtract the result from int2 (int2 - e(int1)).
4) Take the result modulo n (the modulus).
5) RSA transform with a key.
However, the RSA decryption method is done by:
1) RSA_transform
2) Result is added
3) Modulo with n
The part that puzzles me about the RSA decryption is the need for the modulo after the addition and the rsa_transform. If it's needed, shouldn't it be applied in the reverse order of how the chain of operations was carried out in RSA encryption?
Also, an "invert_modulo" was provided in the source code. I originally believed this to be a key in decrypting the message, but it wasn't so. What could "invert_modulo" be used for?
I cannot understand the first part of your question, as the steps to hash the string are not clear, and I also don't get the 3rd part of your encryption steps. As for the second question: invert_modulo is the modular multiplicative inverse.
When working with modular arithmetic we always want our answer to be in the integer range 0 to M-1 (where M is the number we take the modulo with). Simple operations like addition, multiplication and subtraction are easy to perform: (a + b) MOD M, for example, is well defined within the constraints of modular arithmetic.
The problem arises when we try to divide: (a / b) MOD M.
As you can see, a/b may not always give an integer, so (a/b) does not lie in the integer range 0 to M-1. To overcome this, we find an inverse of b that we multiply a with instead, i.e. (a * b_inverse) MOD M.
b_inverse is defined by: (b * b_inverse) MOD M = 1.
That is, b_inverse is a number in the range 0 to M-1 which, when multiplied with b modulo M, yields 1.
Note: the modular inverse of some numbers does not exist. We can check this by taking the GCD of M and the number concerned (in our example, b); if the GCD is not equal to 1, the modular inverse does not exist.
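A small Python sketch of the above (assuming Python 3.8+, where the built-in pow accepts -1 as an exponent for modular inverses):
from math import gcd

def mod_inverse(b, M):
    # The inverse exists only when gcd(b, M) == 1.
    if gcd(b, M) != 1:
        raise ValueError("no modular inverse: gcd(b, M) != 1")
    return pow(b, -1, M)   # (b * result) % M == 1

def mod_div(a, b, M):
    # "Divide" a by b modulo M by multiplying with b's inverse.
    return (a * mod_inverse(b, M)) % M

assert (3 * mod_inverse(3, 7)) % 7 == 1   # inverse of 3 mod 7 is 5
assert mod_div(4, 3, 7) == (4 * 5) % 7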

Big O Algebra simplify

To simplify a big O expression:
1. We omit all constants
2. We ignore lower powers of n
For example:
O(n + 5) = O(n)
O(n² + 6n + 7) = O(n²)
O(6n^(1/3) + n^(1/2) + 7) = O(n^(1/2))
Am I right in these examples?
1. We omit all constants
Well, strictly speaking, you don't omit all constants, only the outermost multiplicative constant. That means O(cf(n)) = O(f(n)). Additive constants are fine too, since
f(n) < f(n)+c < 2f(n) starting with some n, therefore O(f(n)+c) = O(f(n)).
But you don't omit constants inside composite functions. That might be done sometimes (O(log(cn)) or even O(log(n^c)), for instance), but not in general. Consider for example 2^(2n): it might be tempting to drop the 2 and put this in O(2^n), which is wrong, since 2^(2n) = 4^n grows strictly faster than 2^n.
2. We ignore lower powers of n
True, but remember, you don't always work with polynomial functions. You can generally ignore any added asymptotically lower functions. Say you have f(n) and g(n), when g(n) = O(f(n)), then O(f(n) + g(n)) = O(f(n)).
You cannot do this with multiplication.
You're almost right. The second rule should be that you ignore all but the term that grows fastest as n goes towards infinity. That's important if you have terms that are not powers of n, like logs or other mathematical functions.
It's also worth being aware that big O notation sometimes covers up other important details. An algorithm that is O(n log n) will have better performance than one that is O(n^2), but only if the input is large enough for those terms to dominate the running time. It may be that for the sizes of input you actually have to deal with in a specific application, the O(n^2) algorithm actually performs better!
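To make that last point concrete, here is a toy comparison with made-up constant factors (real constants depend entirely on the implementations); the quadratic cost model wins for small n and only loses past some crossover:
import math

# Hypothetical cost models; the factors 100 and 1 are invented for illustration.
def cost_nlogn(n): return 100 * n * math.log2(n)
def cost_n2(n): return n * n

for n in (10, 100, 1000, 10000):
    winner = "n log n" if cost_nlogn(n) < cost_n2(n) else "n^2"
    print(f"n={n}: the {winner} model is cheaper")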

Is finding the equivalence of two functions undecidable?

Is it impossible to know whether two functions are equivalent? For example, a compiler writer wants to determine whether two functions the developer has written perform the same operation; what methods can he use to figure that out? Or what can we do to find out whether two TMs are identical? Is there a way to normalize the machines?
Edit: If the general case is undecidable, how much information do you need to have before you can correctly say that two functions are equivalent?
Given an arbitrary function, f, we define a function f' which returns 1 on input n if f halts on input n. Now, for some number x we define a function g which, on input n, returns 1 if n = x, and otherwise calls f'(n).
If functional equivalence were decidable, then deciding whether g is identical to f' decides whether f halts on input x. That would solve the Halting problem. Related to this discussion is Rice's theorem.
Conclusion: functional equivalence is undecidable.
There is some discussion going on below about the validity of this proof. So let me elaborate on what the proof does, and give some example code in Python.
The proof creates a function f' which on input n starts to compute f(n). When this computation finishes, f' returns 1. Thus, f'(n) = 1 iff f halts on input n, and f' doesn't halt on n iff f doesn't. Python:
def create_f_prime(f):
    def f_prime(n):
        f(n)
        return 1
    return f_prime
Then we create a function g which takes n as input, and compares it to some value x. If n = x, then g(n) = g(x) = 1, else g(n) = f'(n). Python:
def create_g(f_prime, x):
    def g(n):
        return 1 if n == x else f_prime(n)
    return g
Now the trick is that for all n != x we have g(n) = f'(n). Furthermore, we know that g(x) = 1. So, if g = f', then f'(x) = 1 and hence f(x) halts. Likewise, if g != f' then necessarily f'(x) != 1, which means that f(x) does not halt. So, deciding whether g = f' is equivalent to deciding whether f halts on input x. Using a slightly different notation for the above two functions, we can summarise all this as follows:
def halts(f, x):
    def f_prime(n): f(n); return 1
    def g(n): return 1 if n == x else f_prime(n)
    return equiv(f_prime, g)  # If only equiv would actually exist...
I'll also toss in an illustration of the proof in Haskell (GHC performs some loop detection, and I'm not really sure whether the use of seq is foolproof in this case, but anyway):
-- Tells whether two functions f and g are equivalent.
equiv :: (Integer -> Integer) -> (Integer -> Integer) -> Bool
equiv f g = undefined -- If only this could be implemented :)

-- Tells whether f halts on input x
halts :: (Integer -> Integer) -> Integer -> Bool
halts f x = equiv f' g
  where
    f' n = f n `seq` 1
    g n = if n == x then 1 else f' n
Yes, it is undecidable. This is a form of the halting problem.
Note that I mean that it's undecidable for the general case. Just as you can determine halting for sufficiently simple programs, you can determine equivalency for sufficiently simple functions, and it's not inconceivable that this could be of some use for an application. But you cannot make a general method for determining equivalency of any two possible functions.
The general case is undecidable by Rice's Theorem, as others have already said (Rice's Theorem essentially says that any nontrivial semantic property of programs in a Turing-complete formalism is undecidable).
There are special cases where equivalence is decidable; the best-known example is probably equivalence of finite state automata. If I remember correctly, equivalence of pushdown automata is already undecidable, by a reduction from Post's Correspondence Problem.
To prove that two given functions are equivalent you would require as input a proof of the equivalence in some formalism, which you can then check for correctness. The essential parts of this proof are the loop invariants, as these cannot be derived automatically.
In the general case it's undecidable whether two Turing machines always have the same output for identical input. Since you can't even decide whether a TM will halt on the input, I don't see how it should be possible to decide whether both halt AND output the same result...
It depends on what you mean by "function."
If the functions you are talking about are guaranteed to terminate -- for example, because they are written in a language in which all functions terminate -- and operate over finite domains, it's "easy" (although it might still take a very, very long time): two functions are equivalent if and only if they have the same value at every point in their shared domain.
This is called "extensional" equivalence to distinguish it from syntactic or "intensional" equivalence. Two functions are extensionally equivalent if they are intensionally equivalent, but the converse does not hold.
(All the other people above noting that it is undecidable in the general case are quite correct, of course; this is a fairly uncommon -- and usually uninteresting in practice -- special case.)
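A sketch of that "easy" finite-domain check in Python, assuming the functions are guaranteed to terminate and the shared domain is given explicitly:
def extensionally_equal(f, g, domain):
    # Compare pointwise over a finite domain; only meaningful if f and g
    # terminate on every element of the domain.
    return all(f(x) == g(x) for x in domain)

# Two intensionally different but extensionally equal functions on 0..255:
assert extensionally_equal(lambda x: 2 * x, lambda x: x + x, range(256))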
Note that the halting problem is decidable for linear bounded automata. Real computers are always bounded, and programs for them will always loop back to a previous configuration after sufficiently many steps. If you are using an unbounded (imaginary) computer to keep track of the configurations, you can detect that looping and take it into account.
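A sketch of that idea, assuming a hypothetical step function over a finite set of hashable configurations; a repeated configuration proves the machine will loop forever:
def halts_on_bounded_machine(step, start):
    # step(config) returns the next configuration, or None once halted.
    seen = set()
    config = start
    while config is not None:
        if config in seen:
            return False        # revisited a configuration: infinite loop
        seen.add(config)
        config = step(config)
    return True                 # reached a halting configuration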
You could check in your compiler to see if they are "exactly" identical, sure, but determining if they return identical values would be difficult and time consuming. You would have to basically call that method and perform its routine over an infinite number of possible calls and compare the value with that from the other routine.
Even if you could do the above, you would have to account for what global values change within the function, what objects are destroyed / changed in the function that do not affect the outcome.
You can really only compare the compiled code. So compile the compiled code to refactor?
Imagine the run time on trying to compile the code with "that" compiler. You could spend a LOT of time on here answering questions saying: "busy compiling..." :)
I think if you allow side effects, you can show that the problem can be morphed into the Post correspondence problem so you can't, in general, show if two functions are even capable of having the same side effects.
Is it impossible to know if two functions are equivalent?
No. It is possible to know that two functions are equivalent. If you have f(x), you know f(x) is equivalent to f(x).
If the question is "is it possible to determine whether f(x) and g(x) are equivalent, for arbitrary functions f and g", then the answer is no.
However, if the question is "can a compiler determine that f(x) and g(x) are equivalent when they are equivalent?", then the answer is yes, if they are equivalent in both output and side effects and the order of side effects. In other words, if one is a transformation of the other that preserves behavior, then a compiler of sufficient complexity should be able to detect it. It also means that the compiler can transform a function f into a more optimal and equivalent function g, given a particular definition of equivalent. It gets even more fun if f includes undefined behavior, because then g can also include undefined (but different) behavior!

Resources