Game Theory with prediction - encryption

To impress two (german) professors i try to improve the game theory.
AI in Computergames.
Game Theory:  Intelligence is a well educated proven Answer to an Question.
This means a thoughtfull decision is choosing an act who leads to an optimal result.
Question -> Resolution -> Answer -> Test (Check)
For Example one robot is fighting another robot.
This robot has 3 choices:
-move forward
-hold position
-move backward
The resulting Programm is pretty simple
randomseed = initvalue;
while (one_is_alive)
{
choice = randomselect(options,probability);
do_choice(roboter);
}  
We are using pseudorandomness.
The test for success is simply did he elimate the opponent.
The robots have automatically shooting weapons :
struct weapon
{
range
damage
}
struct life
{
hitpoints
}
Now for some Evolution.
We let 2 robots fight each other and remember the randomseeds.
What is the sign of a succesfull Roboter ?
struct {
ownrandomseed;
list_of_opponentrandomseed; // the array of the beaten opponents.
}
Now the question is how do we choose the right strategy against an opponent ?
 
We assume we have for every possible seed-strategy the optimal anti-strategy.  
Now the only thing we have to do is to observe the numbers from the opponent
and calculate his seed value.Then we could choose the right strategy.
For cracking the random generator we can use the manual method :
http://alumni.cs.ucr.edu/~jsun/random-number.pdf
or the brute Force :
https://jazzy.id.au/2010/09/20/cracking_random_number_generators_part_1.html

It depends on the algorithm used to generate the (pseudo) random numbers. If the pseudo random number generator algorithm is known, you can guess the seed by observing a number of states (robot moves). This is similar to brute force guessing a password, used for encryption, as, some encryption algorithms are known as stream ciphers, and are basically (sometimes exactly), a one time pad that is used to obfuscate the data. Now, lets say that you know the pseudorandom number generator used is a simple lagged fibonacci generator. Then, you know that they are generating each number by calculating x(n) = x(n - 2) + x(n - 3) % 3. Therefore, by observing 3 different robot moves, you will then be able to predict all of the future moves. The seed, is the first 3 numbers supplied that give the sequence you observe. Now, most random number generators are not this simple, some have up to 1024 bit length seeds, and would be impossible for a modern computer to cycle through all of those possibilities in a brute force manner. So basically, what you would need to do, is to find out what PRNG algorithm is used, find out all possible initial seed values, and devise an algorithm to determine the seed the opponent robot is using based upon their actions. Depending on the algorithm, there are ways of guessing the seed faster than testing each and every one. If there is a faster way of guessing such a seed, this means that the PRNG in question is not suitable from cryptographic applications, as it means passwords are easier guessed. AES256 itself has a break, but it still takes theoretically 2 ^ 111 guesses (instead of the brute force 2 ^256 guesses), which means it has been broken, technically, but 2 ^ 111 is still way too many operations for modern computers to process in a meaningful time frame.
if the PRNG was lagged fibonacci (which is never used anymore, I am just giving a simple example) and you observed that the robot did option 0, then, 1, then 2... you would then know that the next thing the robot will do is... 1, since 0 + 1 % 3 = 1. You could also backtrack, and figure out what the initial values were for this PRNG, which represents the seed.

Related

How to quantitatively measure how simplified a mathematical expression is

I am looking for a simple method to assign a number to a mathematical expression, say between 0 and 1, that conveys how simplified that expression is (being 1 as fully simplified). For example:
eval('x+1') should return 1.
eval('1+x+1+x+x-5') should returns some value less than 1, because it is far from being simple (i.e., it can be further simplified).
The parameter of eval() could be either a string or an abstract syntax tree (AST).
A simple idea that occurred to me was to count the number of operators (?)
EDIT: Let simplified be equivalent to how close a system is to the solution of a problem. E.g., given an algebra problem (i.e. limit, derivative, integral, etc), it should assign a number to tell how close it is to the solution.
The closest metaphor I can come up with it how a maths professor would look at an incomplete problem and mentally assess it in order to tell how close the student is to the solution. Like in a math exam, were the student didn't finished a problem worth 20 points, but the professor assigns 8 out of 20. Why would he come up with 8/20, and can we program such thing?
I'm going to break a stack-overflow rule and post this as an answer instead of a comment, because not only I'm pretty sure the answer is you can't (at least, not the way you imagine), but also because I believe it can be educational up to a certain degree.
Let's assume that a criteria of simplicity can be established (akin to a normal form). It seems to me that you are very close to trying to solve an analogous to entscheidungsproblem or the halting problem. I doubt that in a complex rule system required for typical algebra, you can find a method that gives a correct and definitive answer to the number of steps of a series of term reductions (ipso facto an arbitrary-length computation) without actually performing it. Such answer would imply knowing in advance if such computation could terminate, and so contradict the fact that automatic theorem proving is, for any sufficiently powerful logic capable of representing arithmetic, an undecidable problem.
In the given example, the teacher is actually either performing that computation mentally (going step by step, applying his own sequence of rules), or gives an estimation based on his experience. But, there's no generic algorithm that guarantees his sequence of steps are the simplest possible, nor that his resulting expression is the simplest one (except for trivial expressions), and hence any quantification of "distance" to a solution is meaningless.
Wouldn't all this be true, your problem would be simple: you know the number of steps, you know how many steps you've taken so far, you divide the latter by the former ;-)
Now, returning to the criteria of simplicity, I also advice you to take a look on Hilbert's 24th problem, that specifically looked for a "Criteria of simplicity, or proof of the greatest simplicity of certain proofs.", and the slightly related proof compression. If you are philosophically inclined to further understand these subjects, I would suggest reading the classic Gödel, Escher, Bach.
Further notes: To understand why, consider a well-known mathematical artefact called the Mandelbrot fractal set. Each pixel color is calculated by determining if the solution to the equation z(n+1) = z(n)^2 + c for any specific c is bounded, that is, "a complex number c is part of the Mandelbrot set if, when starting with z(0) = 0 and applying the iteration repeatedly, the absolute value of z(n) remains bounded however large n gets." Despite the equation being extremely simple (you know, square a number and sum a constant), there's absolutely no way to know if it will remain bounded or not without actually performing an infinite number of iterations or until a cycle is found (disregarding complex heuristics). In this sense, every fractal out there is a rough approximation that typically usages an escape time algorithm as an heuristic to provide an educated guess whether the solution will be bounded or not.

Generate very very large random numbers

How would you generate a very very large random number? I am thinking on the order of 2^10^9 (one billion bits). Any programming language -- I assume the solution would translate to other languages.
I would like a uniform distribution on [1,N].
My initial thoughts:
--You could randomly generate each digit and concatenate. Problem: even very good pseudorandom generators are likely to develop patterns with millions of digits, right?
You could perhaps help create large random numbers by raising random numbers to random exponents. Problem: you must make the math work so that the resulting number is still random, and you should be able to compute it in a reasonable amount of time (say, an hour).
If it helps, you could try to generate a possibly non-uniform distribution on a possibly smaller range (using the real numbers, for instance) and transform. Problem: this might be equally difficult.
Any ideas?
Generate log2(N) random bits to get a number M,
where M may be up to twice as large as N.
Repeat until M is in the range [1;N].
Now to generate the random bits you could either use a source of true randomness, which is expensive.
Or you might use some cryptographically secure random number generator, for example AES with a random key encrypting a counter for subsequent blocks of bits. The cryptographically secure implies that there can be no noticeable patterns.
It depends on what you need the data for. For most purposes, a PRNG is fast and simple. But they are not perfect. For instance I remember hearing that Monte Carlos simulations of chaotic systems are really good at revealing the underlying pattern in a PRNG.
If that is the sort of thing that you are doing, though, there is a simple trick I learned in grad school for generating lots of random data. Take a large (preferably rapidly changing) file. (Some big data structures from the running kernel are good.) Compress it to increase the entropy. Throw away the headers. Then for good measure, encrypt the result. If you're planning to use this for cryptographic purposes (and you didn't have a perfect entropy data set to work with), then reverse it and encrypt again.
The underlying theory is simple. Information theory tells us that there is no difference between a signal with no redundancy and pure random data. So if we pick a big file (ie lots of signal), remove redundancy with compression, and strip the headers, we have a pretty good random signal. Encryption does a really good job at removing artifacts. However encryption algorithms tend to work forward in blocks. So if someone could, despite everything, guess what was happening at the start of the file, that data is more easily guessable. But then reversing the file and encrypting again means that they would need to know the whole file, and our encryption, to find any pattern in the data.
The reason to pick a rapidly changing piece of data is that if you run out of data and want to generate more, you can go back to the same source again. Even small changes will, after that process, turn into an essentially uncorrelated random data set.
NTL: A Library for doing Number Theory
This was recommended by my Coding Theory and Cryptography teacher... so I guess it does the work right, and it's pretty easy to use.
RandomBnd, RandomBits, RandomLen -- routines for generating pseudo-random numbers
ZZ RandomLen_ZZ(long l);
// ZZ = psuedo-random number with precisely l bits,
// or 0 of l <= 0.
If you have a random number generator that generates random numbers of X bits. And concatenated bits of [X1, X2, ... Xn ] create the number you want of N bits, as long as each X is random, I don't see why your large number wouldn't be random as well for all intents and purposes. And if standard C rand() method is not secure enough, I'm sure there's plenty of other libraries (like the ones mentioned in this thread) whose pseudo-random numbers are "more random".
even very good pseudorandom generators are likely to develop patterns with millions of digits, right?
From the wikipedia on pseudo-random number generation:
A PRNG can be started from an arbitrary starting state using a seed state. It will always produce the same sequence thereafter when initialized with that state. The maximum length of the sequence before it begins to repeat is determined by the size of the state, measured in bits. However, since the length of the maximum period potentially doubles with each bit of 'state' added, it is easy to build PRNGs with periods long enough for many practical applications.
You could perhaps help create large random numbers by raising random numbers to random exponents
I assume you're suggesting something like populating the values of a scientific notation with random values?
E.g.: 1.58901231 x 10^5819203489
The problem with this is that your distribution is going to be logarithmic (or is that exponential? :) - same difference, it isn't even). You will never get a value that has the millionth digit set, yet contains a digit in the one's column.
you could try to generate a possibly non-uniform distribution on a possibly smaller range (using the real numbers, for instance) and transform
Not sure I understand this. Sounds like the same thing as the exponential solution, with the same problems. If you're talking about multiplying by a constant, then you'll get a lumpy distribution instead of a logarithmic (exponential?) one.
Suggested Solution
If you just need really big pseudo-random values, with a good distribution, use a PRNG algorithm with a larger state. The Periodicity of a PRNG is often the square of the number of bits, so it doesn't take that many bits to fill even a really large number.
From there, you can use your first solution:
You could randomly generate each digit and concatenate
Although I'd suggest that you use the full range of values returned by your PRNG (possibly 2^31 or 2^32), and populate a byte array with those values, splitting it up as necessary. Otherwise you might be throwing away a lot of bits of randomness. Also, scaling your values to a range (or using modulo) can easily screw up your distribution, so there's another reason to try to keep the max number of bits your PRNG can return. Be careful to pack your byte array full of the bits returned, though, or you'll again introduce lumpiness to your distribution.
The problem with those solution, though, is how to fill that (larger than normal) seed state with random-enough values. You might be able to use standard-size seeds (populated via time or GUID-style population), and populate your big-PRNG state with values from the smaller-PRNG. This might work if it isn't mission critical how well distributed your numbers are.
If you need truly cryptographically secure random values, the only real way to do it is use a natural form of randomness, such as that at http://www.random.org/. The disadvantages of natural randomness are availability, and the fact that many natural-random devices take a while to generate new entropy, so generating large amounts of data might be really slow.
You can also use a hybrid and be safe - natural-random seeds only (to avoid the slowness of generation), and PRNG for the rest of it. Re-seed periodically.

Math question regarding Python's uuid4

I'm not great with statistical mathematics, etc. I've been wondering, if I use the following:
import uuid
unique_str = str(uuid.uuid4())
double_str = ''.join([str(uuid.uuid4()), str(uuid.uuid4())])
Is double_str string squared as unique as unique_str or just some amount more unique? Also, is there any negative implication in doing something like this (like some birthday problem situation, etc)? This may sound ignorant, but I simply would not know as my math spans algebra 2 at best.
The uuid4 function returns a UUID created from 16 random bytes and it is extremely unlikely to produce a collision, to the point at which you probably shouldn't even worry about it.
If for some reason uuid4 does produce a duplicate it is far more likely to be a programming error such as a failure to correctly initialize the random number generator than genuine bad luck. In which case the approach you are using it will not make it any better - an incorrectly initialized random number generator can still produce duplicates even with your approach.
If you use the default implementation random.seed(None) you can see in the source that only 16 bytes of randomness are used to initialize the random number generator, so this is an a issue you would have to solve first. Also, if the OS doesn't provide a source of randomness the system time will be used which is not very random at all.
But ignoring these practical issues, you are basically along the right lines. To use a mathematical approach we first have to define what you mean by "uniqueness". I think a reasonable definition is the number of ids you need to generate before the probability of generating a duplicate exceeds some probability p. An approcimate formula for this is:
where d is 2**(16*8) for a single randomly generated uuid and 2**(16*2*8) with your suggested approach. The square root in the formula is indeed due to the Birthday Paradox. But if you work it out you can see that if you square the range of values d while keeping p constant then you also square n.
Since uuid4 is based off a pseudo-random number generator, calling it twice is not going to square the amount of "uniqueness" (and may not even add any uniqueness at all).
See also When should I use uuid.uuid1() vs. uuid.uuid4() in python?
It depends on the random number generator, but it's almost squared uniqueness.

Quantum Computing and Encryption Breaking

I read a while back that Quantum Computers can break most types of hashing and encryption in use today in a very short amount of time(I believe it was mere minutes). How is it possible? I've tried reading articles about it but I get lost at the a quantum bit can be 1, 0, or something else. Can someone explain how this relates to cracking such algorithms in plain English without all the fancy maths?
Preamble: Quantum computers are strange beasts that we really haven't yet tamed to the point of usefulness. The theory that underpins them is abstract and mathematical, so any discussion of how they can be more efficient than classical computers will inevitably be long and involved. You'll need at least an undergraduate understanding of linear algebra and quantum mechanics to understand the details, but I'll try to convey my limited understanding!
The basic premise of quantum computation is quantum superposition. The idea is that a quantum system (such as a quantum bit, or qubit, the quantum analogue of a normal bit) can, as you say, exist not only in the 0 and 1 states (called the computational basis states of the system), but also in any combination of the two (so that each has an amplitude associated with it). When the system is observed by someone, the qubit's state collapses into one of its basis states (you may have heard of the Schrödinger's cat thought experiment, which is related to this).
Because of this, a register of n qubits has 2^n basis states of its own (these are the states that you could observe the register being in; imagine a classical n-bit integer). Since the register can exist in a superposition of all these states at once, it is possible to apply a computation to all 2^n register states rather than just one of them. This is called quantum parallelism.
Because of this property of quantum computers, it may seem like they're a silver bullet that can solve any problem exponentially faster than a classical computer. But it's not that simple: the problem is that once you observe the result of your computation, it collapses (as I mentioned above) into the result of just one of the computations – and you lose all of the others.
The field of quantum computation/algorithms is all about trying to work around this problem by manipulating quantum phenomena to extract information in fewer operations than would be possible on a classical computer. It turns out that it's very difficult to contrive a "quantum algorithm" that is faster than any possible classical counterpart.
The example you ask about is that of quantum cryptanalysis. It's thought that quantum computers might be able to "break" certain encryption algorithms: specifically, the RSA algorithm, which relies on the difficulty of finding the prime factors of very large integers. The algorithm which allows for this is called Shor's algorithm, which can factor integers with polynomial time complexity. By contrast the best classical algorithm for the problem has (almost) exponential time complexity, and the problem is hence considered "intractable".
If you want a deeper understanding of this, get a few books on linear algebra and quantum mechanics and get comfortable. If you want some clarification, I'll see what I can do!
Aside: to better understand the idea of quantum superposition, think in terms of probabilities. Imagine you flip a coin and catch it on your hand, covered so that you can't see it. As a very tenuous analogy, the coin can be thought of as being in a superposition of the heads and tails "states": each one has a probability of 0.5 (and, naturally, since there are two states, these probabilities add up to 1). When you take your hand away and observe the coin directly, it collapses into either the heads state or the tails state, and so the probability of this state becomes 1, while the other becomes 0. One way to think about it, I suppose, is a set of scales that is balanced until observation, at which point it tips to one side as our knowledge of the system increases and one state becomes the "real" state.
Of course, we don't think of the coin as a quantum system: for all practical purposes, the coin has a definite state, even if we can't see it. For genuine quantum systems, however (such as an individual particle trapped in a box), we can't think about it in this way. Under the conventional interpretation of quantum mechanics, the particle fundamentally has no definite position, but exists in all possible positions at once. Only upon observation is its position constrained in space (though only to a limited degree; cf. uncertainty principle), and even this is purely random and determined only by probability.
By the way, quantum systems are not restricted to having just two observable states (those that do are called two-level systems). Some have a large but finite number, some have a countably infinite number (such as a "particle in a box" or a harmonic oscillator), and some even have an uncountably infinite number (such as a free particle's position, which isn't constrained to individual points in space).
It's highly theoretical at this point. Quantum Bits might offer the capability to break encryption, but clearly it's not at that point yet.
At the Quantum Level, the laws that govern behavior are different than in the macro level.
To answer your question, you first need to understand how encryption works.
At a basic level, encryption is the result of multiplying two extremely large prime numbers together. This super large result is divisible by 1, itself, and these two prime numbers.
One way to break encryption is to brute force guess the two prime numbers, by doing prime number factorization.
This attack is slow, and is thwarted by picking larger and larger prime numbers. YOu hear of key sizes of 40bits,56bits,128bits and now 256,512bits and beyond. Those sizes correspond to the size of the number.
The brute force algorithm (in simplified terms) might look like
for(int i = 3; i < int64.max; i++)
{
if( key / i is integral)
{
//we have a prime factor
}
}
So you want to brute force try prime numbers; well that is going to take awhile with a single computer. So you might try grouping a bunch of computers together to divide and conquer. That works, but is still slow for very large keysizes.
How a quantum bit address this is that they are both 0 and 1 at the same time. So say you have 3 quantum bits (no small feat mind you).
With 3 qbits, your program can have the values of 0-7 simulatanously
(000,001,010,011 etc)
, which includes prime numbers 3,5,7 at the same time.
so using the simple algorithm above, instead of increasing i by 1 each time, you can just divide once, and check
0,1,2,3,4,5,6,7
all at the same time.
Of course quantum bits aren't to that point yet; there is still lots of work to be done in the field; but this should give you an idea that if we could program using quanta, how we might go about cracking encryption.
The Wikipedia article does a very good job of explaining this.
In short, if you have N bits, your quantum computer can be in 2^N states at the same time. Similar conceptually to having 2^N CPU's processing with traditional bits (though not exactly the same).
A quantum computer can implement Shor's algorithm which can quickly perform prime factorization. Encryption systems are build on the assumption that large primes can not be factored in a reasonable amount of time on a classical computer.
Almost all our public-key encryptions (ex. RSA) are based solely on math, relying on the difficulty of factorization or discrete-logarithms. Both of these will be efficiently broken using quantum computers (though even after a bachelors in CS and Math, and having taken several classes on quantum mechanics, I still don't understand the algorithm).
However, hashing algorithms (Ex. SHA2) and symmetric-key encryptions (ex. AES), which are based mostly on diffusion and confusion, are still secure.
In the most basic terms, a normal no quantum computer works by operating on bits (sates of on or off) uesing boolean logic. You do this very fast for lots and lots of bits and you can solve any problem in a class of problems that are computable.
However they are "speed limits" namely something called computational complexity.This in lay mans terms means that for a given algorithm you know that the time it takes to run an algorithm (and the memory space required to run the algorithm) has a minimum bound. For example a algorithm that is O(n^2) means that for a data size of n it will require n^2 time to run.
However this kind of goes out the window when we have qbits (quantum bits) when you are doing operations on qbits that can have "in between" values. algorithms that would have very high computational complexity (like factoring huge numbers, the key to cracking many encryption algorithms) can be done in much much lower computational complexity. This is the reason that quantum computing will be able to crack encrypted streams orders of magnitude quicker then normal computers.
First of all, quantum computing is still barely out of the theoretical stage. Lots of research is going on and a few experimental quantum cells and circuits, but a "quantum computer" does not yet exist.
Second, read the wikipedia article: http://en.wikipedia.org/wiki/Quantum_computer
In particular, "In general a quantum computer with n qubits can be in an arbitrary superposition of up to 2^n different states simultaneously (this compares to a normal computer that can only be in one of these 2^n states at any one time). "
What makes cryptography secure is the use of encryption keys that are very long numbers that would take a very, very long time to factor into their constituent primes, and the keys are sufficiently long enough that brute-force attempts to try every possible key value would also take too long to complete.
Since quantum computing can (theoretically) represent a lot of states in a small number of qubit cells, and operate on all of those states simultaneously, it seems there is the potential to use quantum computing to perform brute-force try-all-possible-key-values in a very short amount of time.
If such a thing is possible, it could be the end of cryptography as we know it.
quantum computers etc all lies. I dont believe these science fiction magazines.
in fact rsa system is based on two prime numbers and their multipilation.
p1,p2 is huge primes p1xp2=N modulus.
rsa system is
like that
choose a prime number..maybe small its E public key
(p1-1)*(p2-1)=R
find a D number that makes E*D=1 mod(R)
we are sharing (E,N) data as public key publicly
we are securely saving (D,N) as private.
To solve this Rsa system cracker need to find prime factors of N.
*mass of the Universe is closer to 10^53 kg*
electron mass is 9.10938291 × 10^-31 kilograms
if we divide universe to electrons we can create 10^84 electrons.
electrons has slower speeds than light. its move frequency can be 10^26
if anybody produces electron size parallel rsa prime factor finders from all universe mass.
all universe can handle (10^84)*(10^26)= 10^110 numbers/per second.
rsa has limitles bits of alternative prime numbers. maybe 4096 bits
4096 bit rsa has 10^600 possible prime numbers to brute force.
so your universe mass quantum solver need to make tests during 10^500 years.
rsa vs universe mass quantum computer
1 - 0
maybe quantum computer can break 64/128 bits passwords. because 128 bit password has 10^39 possible brute force nodes.
This circuit is a good start to understand how qubit parallelism works. The 2-qubits-input is on the left side. Top qubit is x and bottom qubit ist y. The y qubit is 0 at the input, just like a normal bit. The x qubit on the other hand is in superposition at the input. y (+) f(x) stands here for addition modulo 2, just meaning 1+1=0, 0+1=1+0=1. But the interesting part is, since the x-qubit is in superposition, f(x) is f(0) and f(1) at the same time and we can perform the evaluation of the f function for all states simultaneously without using any (time consuming) loops. Having enough quibits we can branch this into endlessly complicating curcuits.
Even more bizarr imo. is the Grover's algorithm. As input we get here an unsorted array of integers with arraylength = n. What is the expected runtime of an algorithm, that finds the min value of this array? Well classically we have at least to check every 1..n element of the array resulting in an expected runtime of n. Not so for quantum computers, on a quantum computer we can solve this in expected runtime of maximum root(n), this means we don't even have to check every element to find the guaranteed solution...

Psuedo-Random-Number-Generator from a computable normal number

Isn't it easily possible to construct a PRNG in such a fashion? Why is it not done?
That is, as far as I know we could simply have a PRNG that takes a seed n. When you ask for a random bit, it takes the nth digit of the binary expansion of the computable normal number, and increments n.
My first thought was that perhaps we hadn't found a computable normal number, but we have. The remaining thought is that there is a good reason not to-- either there's some property of PRNGs that I'm not familiar with that such a method would not have, or it would be impractical somehow, or is otherwise outstripped by other methods.
That would make predicting the output really simple.
Say, for example, you generate the integer 0x54a30b7f. If you have 4GiB of pi (or random noise or an actual normal number), chances are there's only going to be one (or maybe a handful) occurrence of that particular integer and I can predict with reasonably high probability all future numbers. This is a serious problem in the case of cryptographically strong PRNGs. If instead of simple sequential scan you use some function, I just have to follow the function which if it is difficult enough to follow it turns into a PRNG in it's own right.
If you are not concerned about the cryptographic strength of your generator, then there are much more compact ways of generating random numbers. Mersenne Twister, for example, has a much larger period without requiring a 4GiB lookup table.

Resources