Why does adding two big integers give a negative result? - math

I have tried this on my own computer.
e.g. 2000000000 + 2000000000 = -294967296

Take a 32-bit signed integer (a long on many platforms) as an example. The first of its 32 bits is the sign bit: the number is negative if that bit is 1 and non-negative if it is 0. When the sum of two large positive values overflows into the sign bit, the result is interpreted as a negative number.
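For illustration, here is a minimal C++ sketch (my own, not from the original post) that makes the wraparound visible by doing the same addition in 32-bit and 64-bit integers:

#include <cstdint>
#include <iostream>

int main() {
    std::int32_t a = 2000000000;
    std::int32_t b = 2000000000;

    // The true sum, 4000000000, does not fit in a signed 32-bit integer
    // (maximum 2147483647). Doing the addition in 64 bits avoids overflow;
    // casting the result back down shows the wrapped-around bit pattern
    // with the sign bit set.
    std::int64_t wide = static_cast<std::int64_t>(a) + b;
    std::int32_t narrow = static_cast<std::int32_t>(wide);

    std::cout << "64-bit sum: " << wide << "\n";    // 4000000000
    std::cout << "32-bit sum: " << narrow << "\n";  // -294967296 on two's complement machines
}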

Related

Bloom filter probability for 16 bit integers

I have a list of 16-bit unsigned integers.
The only thing I want to know is whether a number is in the list.
I could use a bit vector for this: I just set bit[i] if i is in the list, and to check whether a number y is in the list I look up bit[y]. This always gives the correct answer.
But I would like to save memory and use fewer bits while allowing false positives, i.e. a lookup may report that a number is in the list even though it is not, but there must be no false negatives.
From my understanding bloom filter would achieve that.
What I do not understand is why, when I use a Bloom filter calculator and set, for example:
400 entries (n)
400 bits (m)
1 hash function (k)
the probability of false positives is not zero, for example at https://hur.st/bloomfilter/.
If, for example, I have on average 400 different 16-bit numbers and want to use as little space as possible, would a Bloom filter be a good idea?
How should one think about the probability?
Clearly a bit vector gives a perfect result with 2^16 = 65,536 bits. How much less space can I use with a Bloom filter, and at what cost in increased false positives?
Is there something better that could solve my problem?
A Bloom filter is likely the right structure for what you need. The space requirements are what is at issue:
you must have enough elements to make it reasonable (worth storing the hash function generators, at least), and
enough bits to reduce the false positives to an acceptable level.
In other words, don’t focus on using as little space as possible; rather focus on using enough space. You will still be using less space than anything short of a custom, specialized filter.
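As a rough guide (this sketch and its numbers are my own illustration, not part of the answer), the usual approximation for the false positive rate of a Bloom filter with n entries, m bits and k hash functions is (1 - e^(-k*n/m))^k, and the best k is about (m/n)*ln 2. A small C++ program to evaluate it:

#include <cmath>
#include <cstdio>

// Approximate Bloom filter false positive rate for n entries,
// m bits and k hash functions: (1 - e^(-k*n/m))^k.
double false_positive_rate(double n, double m, double k) {
    return std::pow(1.0 - std::exp(-k * n / m), k);
}

int main() {
    // The configuration from the question: 400 entries, 400 bits, 1 hash function.
    std::printf("n=400, m=400,  k=1: p ~ %.3f\n", false_positive_rate(400, 400, 1));

    // Spending ~10 bits per entry with k near (m/n)*ln 2 drives the rate below 1%.
    std::printf("n=400, m=4000, k=7: p ~ %.4f\n", false_positive_rate(400, 4000, 7));
}

With the question's parameters (one bit per entry, one hash function), roughly 63% of lookups for absent numbers will report a false positive, which is what the calculator shows.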

rounding toward zero, bitwise

I just wonder how I can round toward zero bitwise. Previously, I performed the long division using a loop. However, since the number is always divided by a power of 2, I decided to use bit shifting, so I can get results like this:
12/4=3
13/4=3
14/4=3
15/4=3
16/4=4
Can I do this instead of performing the long division as usual?
12>>2
13>>2
If I use this kind of bit shifting, is the behavior different for different compilers? How about rounding up? I am using the Visual C++ 2010 compiler and GCC. Thanks.
Arithmetic right shifts are equivalent to round-to-negative-infinity division by powers of two, meaning that the result is never bigger than the unrounded value (so, e.g., (-3) >> 1 is equal to -2). Note that right-shifting a negative value is implementation-defined in C and C++, although both GCC and Visual C++ implement it as an arithmetic shift.
For non-negative integers, this is equivalent to rounding toward zero.
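A short C++ illustration of the difference (my own sketch, not from the original answer): integer division by 4 rounds toward zero, an arithmetic right shift by 2 rounds toward negative infinity, and adding divisor-1 before shifting rounds up:

#include <iostream>

int main() {
    for (int x : {12, 13, 14, 15, 16, -3, -13}) {
        int toward_zero = x / 4;        // C/C++ integer division truncates toward zero: -13/4 == -3
        int toward_minus_inf = x >> 2;  // arithmetic shift: -13 >> 2 == -4 (floor of -3.25);
                                        // right-shifting a negative value is implementation-defined
        int toward_plus_inf = (x + 3) >> 2;  // ceil(x/4): add divisor-1, then shift
        std::cout << x << ": /4=" << toward_zero
                  << "  >>2=" << toward_minus_inf
                  << "  ceil=" << toward_plus_inf << "\n";
    }
}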

R rounding explanation

Can anyone please explain why this gives different outputs?
round(1.49999999999999)
1
round(1.4999999999999999)
2
I have read the round documentation but it does not mention anything about this there.
I know that R represents numbers in binary form, but why does adding two extra 9's change the result?
Thanks.
1.4999999999999999 can't be represented internally, so it gets rounded to 1.5.
Now, when you apply round(), the result is 2.
Put those two numbers into variables and then print them (with enough digits) - you'll see they are different.
Computers don't store these kinds of numbers with this exact value (they don't use decimal numbers internally).
I have never used R, so I don't know if this is the issue, but in other languages such as C/C++ a number like 1.4999999999999999 is represented by a float or a double.
Since these have finite precision, you cannot represent something like 1.4999999999999999 exactly. It might be the case that 1.4999999999999999 actually gets stored as 1.50000000000000 instead due to limitations of floating-point precision.
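Since the answer above speculates about C/C++ doubles, here is a small C++ check (my own sketch) showing that the longer literal is already stored as exactly 1.5, while the shorter one is not:

#include <cmath>
#include <iomanip>
#include <iostream>

int main() {
    double a = 1.49999999999999;    // nearest double is still strictly below 1.5
    double b = 1.4999999999999999;  // nearest double is exactly 1.5

    std::cout << std::setprecision(17)
              << "a = " << a << ", round(a) = " << std::round(a) << "\n"
              << "b = " << b << ", round(b) = " << std::round(b) << "\n"
              << std::boolalpha << "b == 1.5 is " << (b == 1.5) << "\n";
}

R's numeric type is the same IEEE 754 double, so round(1.4999999999999999) actually sees the value 1.5 and rounds it to 2.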

Maximizing Stored Information (Entropy?)

So I'm not sure if this question belongs here or maybe on Math Overflow. In any case, my question is about information theory.
Let's say I have a 16-bit word. There are 65,536 unique configurations of 1's and 0's in that word. What each one of those configurations represents is unimportant, since depending on your notation (2's complement vs. signed magnitude, etc.) the same configuration can mean different things.
What I'm wondering is are there any techniques to store more information than that in a 16 bit word?
My original ideas were like odd/even parity or something but then I realized that's already determined by the configuration... i.e. there is no extra information encoded in that. I'm beginning to wonder if no such thing exists.
EDIT For example, let's say some magical computer (thinking quantum or something here) could understand 0, 1, a. Then obviously we have 3^16 configurations and can now store more than the numbers [0 - 65,535]. Are there any other properties of a 16-bit word that you can mess with in order to encode extra information in your bit stream?
EDIT2 I am really struggling to put this into words. Right now, when I look at a 16-bit word in the computer, the property which conveys information to me is the relative ordering of individual 1's and 0's. Is there another property or way of looking at a 16-bit word which would allow more than 2^16 unique "configurations"? (Note it would no longer be a configuration, but 2^16 xxxx's, where xxxx is a noun describing an instance of that property.) The only thing I can really think of is something like looking at the number of 1-to-0 transitions rather than whether each bit is actually a 1 or 0. Now transitions do not yield more than 2^16 combinations because they are ultimately solely dependent on the configuration of 1's and 0's. I'm looking for properties that would derive from the configuration of 1's and 0's AND something else, thus resulting in MORE than 2^16. Does anyone even know what this would be called if it did exist?
EDIT3 Ok I got it. My question boils down to this: How do we prove that the configuration of 1's and 0's in a word completely defines it? I.E. How do we prove that you need no other information besides the bitmap to show equality between two 16 bit words?
FINAL EDIT
I have an example... If instead of looking at the presence of 1's and 0's we look at the transitions between bits, we can store 2^16 alphabet characters. If the bit to the left is the same, treat it as a 1; if it transitions, treat it as a 0. Using the 16-bit word as a circularly-linked-list type structure where each link represents 0/1, we basically form a 16-bit word out of the transitions between bits. That is an exact example of what I was looking for, but it results in 2^16, nothing better. I am convinced that you cannot do better and am marking the correct answer =(
The amount of information in a particular configuration of 16 0/1s is determined by the probability of this configuration (this is called self-information). This can be bigger than 16 bits if the configuration is less likely than 1/(2^16), but that means that some other configurations are more likely than 1/(2^16) and so will contain less information than 16 bits.
To take into account all the possible configurations, you have to use the expected value of self-information (called entropy) of individual configurations. This value will reach its maximum when the probabilities of all configurations are equal (that is 1/(2^16)) and then it will be exactly 16 bits.
So the answer is no, you cannot store more than 16 bits of information in 16 0/1s.
See
http://en.wikipedia.org/wiki/Information_theory
http://en.wikipedia.org/wiki/Self-information
EDIT It is important to realize that a bit does not stand for a 0 or a 1; it is a unit of information, namely -log_2 P(w), where P(w) is the probability of a particular configuration w.
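To make the arithmetic concrete (my own sketch, using the definitions above): with a uniform distribution over the 2^16 configurations each one carries -log_2(2^-16) = 16 bits of self-information, and the entropy is exactly 16 bits; any non-uniform distribution gives less.

#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Shannon entropy in bits: H = -sum_w P(w) * log2(P(w)).
double entropy_bits(const std::vector<double>& p) {
    double h = 0.0;
    for (double x : p)
        if (x > 0.0) h -= x * std::log2(x);
    return h;
}

int main() {
    const std::size_t n = std::size_t(1) << 16;   // 65,536 possible 16-bit words

    std::vector<double> uniform(n, 1.0 / n);      // every configuration equally likely
    std::printf("uniform:     H = %.4f bits\n", entropy_bits(uniform));   // 16.0000

    std::vector<double> skewed(n, 0.5 / (n - 1)); // one configuration takes half the mass
    skewed[0] = 0.5;
    std::printf("non-uniform: H = %.4f bits\n", entropy_bits(skewed));    // about 9 bits
}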
You cannot store more than 2 states in one digit of a semiconductor device. You answered it yourself: the only way more information could be fitted into 16 digits is if each digit had more than two possible values.

error correction code upper bound

If I want to send a d-bit packet and add another r bits for an error correction code (d > r),
how many errors can I detect and correct at most?
You have 2^d different kinds of packets of length d bits you want to send. Adding your r bits to them makes them into codewords of length d+r, so now you have 2^d possible codewords you could send. The receiver could get 2^(d+r) different received words (codewords with possible errors). The question then becomes: how do you map those 2^(d+r) received words to the 2^d codewords?
This comes down to the minimum distance of the code. That is, for each pair of codewords, find the number of bits where they differ, then take the smallest of those values.
Let's say you had a minimum distance of 3. You received a word and you notice that it isn't one of the codewords. That is, there's an error. So, for lack of a better decoding algorithm, you flip the first bit and see if it's a codeword. If it isn't, you flip it back and flip the next one. Eventually, you get a codeword. Since all codewords differ in at least 3 positions, you know this codeword is the "closest" to the received word, since you would have to flip 2 bits in the received word to get to another codeword. If you didn't get a codeword from flipping just one bit at a time, you can't figure out where the errors are, since there are multiple codewords you could get to by flipping two bits, but you know there are at least two errors.
This leads to the general principle that for a minimum distance md, you can detect md-1 errors and correct floor((md-1)/2) errors. Calculating the minimum distance depends on the details of how you generate the codewords, otherwise known as the code. There are various bounds you can use to figure out an upper limit on md based on d and (d+r).
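As an illustration of the minimum distance idea (a sketch of mine, not from the original answer), here is the brute-force computation over all pairs of codewords, applied to two tiny example codes:

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hamming weight: number of set bits in x.
int weight(std::uint32_t x) {
    int w = 0;
    for (; x != 0; x >>= 1) w += static_cast<int>(x & 1u);
    return w;
}

// Brute-force minimum distance: the smallest Hamming distance
// (weight of the XOR) over all pairs of distinct codewords.
int minimum_distance(const std::vector<std::uint32_t>& code) {
    int best = -1;
    for (std::size_t i = 0; i < code.size(); ++i)
        for (std::size_t j = i + 1; j < code.size(); ++j) {
            int d = weight(code[i] ^ code[j]);
            if (best < 0 || d < best) best = d;
        }
    return best;
}

int main() {
    // 2 data bits plus one even-parity bit: md = 2, so it detects 1 error, corrects 0.
    std::vector<std::uint32_t> parity = {0b000, 0b011, 0b101, 0b110};
    std::printf("parity code:     md = %d\n", minimum_distance(parity));

    // Triple repetition code: md = 3, so it detects 2 errors and corrects 1.
    std::vector<std::uint32_t> repetition = {0b000, 0b111};
    std::printf("repetition code: md = %d\n", minimum_distance(repetition));
}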
Paul mentioned the Hamming Code, which is a good example. It achieves the Hamming bound. For the (7,4) Hamming code, you have 4 bit messages and 7 bit codewords, and you achieve a minimum distance of 3. Obviously*, you are never going to get a minimum distance greater than the number of bits you are adding so this is the very best you can do. Don't get too used to this though. The Hamming code is one of the few examples of a non-trivial perfect code, and most of those have a minimum distance that is less than the number of bits you add.
*It's not really obvious, but I'm pretty sure it's true for non-trivial error correcting codes. Adding one parity bit gets you a minimum distance of two, allowing you to detect an error. The code consisting of {000,111} gets you a minimum distance of 3 by adding just 2 bits, but it's trivial.
You should probably read the wikipedia page on this:
http://en.wikipedia.org/wiki/Error_detection_and_correction
It sounds like you specifically want a Hamming Code:
http://en.wikipedia.org/wiki/Hamming_code#General_algorithm
Using that scheme, you can look up some example values from the linked table.
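For completeness, here is a minimal hand-rolled sketch (my own, not taken from the linked page) of the (7,4) Hamming code: 4 data bits, 3 parity bits, minimum distance 3, so it corrects any single-bit error:

#include <array>
#include <cstdio>

// Hamming(7,4): 4 data bits become a 7-bit codeword (positions 1..7,
// stored at indices 0..6). Parity bits sit at positions 1, 2 and 4.
std::array<int, 7> encode(const std::array<int, 4>& d) {
    std::array<int, 7> c{};
    c[2] = d[0]; c[4] = d[1]; c[5] = d[2]; c[6] = d[3];  // data at positions 3, 5, 6, 7
    c[0] = d[0] ^ d[1] ^ d[3];                           // p1 checks positions 1, 3, 5, 7
    c[1] = d[0] ^ d[2] ^ d[3];                           // p2 checks positions 2, 3, 6, 7
    c[3] = d[1] ^ d[2] ^ d[3];                           // p3 checks positions 4, 5, 6, 7
    return c;
}

// Recompute the three parity checks on a received word. The result, read as a
// binary number, is the 1-based position of a single flipped bit, or 0 if all
// checks pass.
int syndrome(const std::array<int, 7>& c) {
    int s1 = c[0] ^ c[2] ^ c[4] ^ c[6];
    int s2 = c[1] ^ c[2] ^ c[5] ^ c[6];
    int s3 = c[3] ^ c[4] ^ c[5] ^ c[6];
    return s1 + 2 * s2 + 4 * s3;
}

int main() {
    std::array<int, 4> data = {1, 0, 1, 1};
    std::array<int, 7> sent = encode(data);

    std::array<int, 7> received = sent;
    received[4] ^= 1;                       // flip position 5 in transit

    int pos = syndrome(received);
    std::printf("syndrome points at position %d\n", pos);   // prints 5
    if (pos != 0) received[pos - 1] ^= 1;   // correct the single-bit error

    std::printf("corrected word matches the sent one: %s\n",
                received == sent ? "yes" : "no");
}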
