It is quick and easy to determine the shared/differing bits between two binary numbers with AND or XOR. Say we have A: 10011 and B: 11001; we can get the difference:
10011 XOR 11001 = 01010 (1s mark differing bits, 0s matching ones.)
Are there any quick and easy logic or arithmetic operations that could produce a similar but asymmetric output (1s showing bits that are, for example, present in A but missing in B, or vice versa)?
Example: 10011 ??? 11001 = 00010 (1s mean present in the left-hand operand but missing in the right.)
Could it be done with some quick arithmetic/logic, or would I have to loop through the comparisons one by one?
I got to this question while contemplating storing some presence/absence data in bytes as bit flags (for memory efficiency) -- and I was already gleeful that, were I to do so, I could do quick and easy data-diffing operations. But for many applications the direction of the difference also matters.
The more canonical way to express this is A AND (NOT B), where NOT flips all bits.
tkausl's comment on the question answers it successfully.
(A XOR B) AND A would really do the trick.
The XOR generates the difference between A and B, and the AND then masks with A to keep only the bits present in A. The result shows exactly the bits that are set in A but not in B.
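In code (a quick Python sketch, purely to show the two forms agree on the example above):

    A = 0b10011
    B = 0b11001

    left_only  = A & ~B        # bits set in A but missing from B
    left_only2 = (A ^ B) & A   # the same thing via XOR-then-mask

    print(format(left_only,  '05b'))   # 00010
    print(format(left_only2, '05b'))   # 00010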
Let's assume we have a sequence a_i of length n and we want to sort it using shell sort. To do so, we would choose subsequences out of the a_i's using gaps k_i.
I'm now wondering how to choose those k_i's. You usually see that for n=16 we would choose k_1=8, k_2=4, k_3=2, k_4=1. So we would pairwise compare the numbers for each k_i and at the end use insertionSort to finish our sorting.
The idea of first sorting sub sequences of length k_i is to "pre-sort" the sequence for the insertionSort. Right?
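In code, my understanding of the procedure is something like this (a minimal Python sketch; the k_i's are simply the gaps passed in, ending with 1):

    def shellsort(a, gaps):
        # Each gap does one pass of gapped insertion sort; the last gap must be 1.
        for gap in gaps:
            for i in range(gap, len(a)):
                item, j = a[i], i
                while j >= gap and a[j - gap] > item:
                    a[j] = a[j - gap]   # shift larger elements one gap to the right
                    j -= gap
                a[j] = item
        return a

    print(shellsort([5, 3, 8, 1, 9, 2, 7, 4, 6, 0], [5, 3, 1]))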
Questions:
Now, depending on how we choose our k_i, we get a better performance. Is there a rule I can use here to choose the k_i's?
Could I also choose e.g. n=15, k_1=5, k_2=3, k_3=2?
If we have n=10 and k_1=5, would we now go with {k_2=2, k_3=1} or {k_2=3, k_3=2, k_4=1} or {k_2=3, k_3=1}?
The fascinating thing about shellsort is that for a sequence of n (unique) entries, a specific set of gaps is required to sort it most efficiently; essentially f(n) => {gaps}.
For example, to most efficiently - on average - sort a sequence containing
2-5 entries - use insertion sort
6 entries - use shellsort with gaps {4,1}
7 or 8 entries - use a {5,1} shellsort
9 entries - use a {6,1} shellsort
10 entries - use a {9,6,1} shellsort
11 entries - use a {10,6,1} shellsort
12 entries - use a {5,1} shellsort
As you can see, 6-9 entries require two gaps, 10 and 11 require three, and 12 two. This is typical of shellsort's gaps: from one n to the next (i.e. n+1) you can be fairly sure that the number and makeup of the gaps will differ.
A nasty side-effect of shellsort is that when you test gaps using a set of random combinations of n entries (to save processing/evaluation time), you may end up with either the best gaps for n entries or the best gaps for your particular set of combinations - most likely the latter.
I speculate that it is probably possible to create algorithms where you can plug in an arbitrary n and get the best gap sequence computed for you. Many high-profile computer scientists have explored the relationship between n and gaps without a lot to show for it. In the end they produce gaps (more or less by trial and error) that they claim perform better than those of others who have explored shellsort.
Concerning your foreword, given n=16 ...
a {8,4,2,1} shellsort may or may not be an efficient way to sort 16 entries.
Or should it be three gaps and, if so, what might they be?
Or even two?
Then, to (try to) answer your questions ...
Q1: a rule can probably be formulated
Q2: you could ... but you should test it (for a given n there are n! possible sequences to test)
Q3: you can compare it with the correct answer (above). Or you can test it against all 10! possible sequences when n=10 (comes out to 3628800 of them) - doable
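Should you want to run that exhaustive comparison, a brute-force harness is straightforward. A sketch (Python; counting comparisons is my choice of cost metric, nothing canonical):

    import math
    from itertools import permutations

    def comparisons(a, gaps):
        # Count the comparisons a shellsort with the given gaps makes on list a.
        a = list(a)
        count = 0
        for gap in gaps:
            for i in range(gap, len(a)):
                item, j = a[i], i
                while j >= gap:
                    count += 1                  # one comparison of a[j-gap] vs item
                    if a[j - gap] <= item:
                        break
                    a[j] = a[j - gap]
                    j -= gap
                a[j] = item
        return count

    def average_cost(n, gaps):
        # Exhaustive average over all n! inputs; doable for small n.
        return sum(comparisons(p, gaps) for p in permutations(range(n))) / math.factorial(n)

    print(average_cost(6, (4, 1)), average_cost(6, (3, 1)))   # compare two gap sets for n=6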
Hey guys I have the following question:
Suppose we are working with strands of DNA, each strand consisting of
a sequence of 10 nucleotides. Each nucleotide can be any one of four
different types: A, G, T or C. How many bits does it take to encode a
DNA strand?
Here is my approach, and I want to know whether it is correct.
We have 10 spots. Each spot can hold one of 4 different symbols, so we must be able to represent 4^10 combinations with our binary digits.
4^10 = 1048576.
We will then find the log base 2 of that. What do you guys think of my approach?
Each nucleotide (aka base-pair) takes two bits (one of four states -> 2 bits of information). 10 base-pairs thus take 20 bits. Reasoning that way is easier than doing the log2(4^10), but gives the same answer.
It would take fewer bits of information if there were combinations that couldn't appear, e.g. some codons (sequences of three base-pairs) that never occur. But ten independent 2-bit pieces of information sum to 20 bits.
If some sequences appear more frequently than others, and a variable-length representation is viable, then Huffman coding or other compression schemes could save bits most of the time. This might be good in a file format, but it's unlikely to be good in memory while you're working with the sequences.
Densely packing your data into an array of 2-bit fields makes it slower to access a single base-pair, but comparing a whole chunk for equality with another chunk is still efficient (memcmp).
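As a sketch of that packing (Python here; the particular 2-bit code per base is an arbitrary choice for illustration):

    CODE = {'A': 0b00, 'C': 0b01, 'G': 0b10, 'T': 0b11}
    BASE = {v: k for k, v in CODE.items()}

    def pack(strand):
        # The first nucleotide ends up in the highest bits.
        bits = 0
        for nt in strand:
            bits = (bits << 2) | CODE[nt]
        return bits

    def unpack(bits, length=10):
        return ''.join(BASE[(bits >> 2 * i) & 0b11] for i in reversed(range(length)))

    s = pack('GATTACAGAT')
    print(s.bit_length() <= 20, unpack(s))   # True GATTACAGAT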
20 bits is unfortunately just slightly too large for a 16-bit integer (which computers are good at). Storing an array of 32-bit zero-extended values wastes a lot of space. On hardware with good unaligned support, storing 24-bit zero-extended values is OK (do a 32-bit load and mask off the high 8 bits. Storing is even less convenient, though: probably a 16-bit store plus an 8-bit store, or else load the old value, merge the high 8 bits, and do a 32-bit store. But that's not atomic.).
This is a similar problem for storing codons (groups of three base-pairs that code for an amino acid): 6 bits of information doesn't fill a byte. Only wasting 2 of every 8 bits isn't that bad, though.
Amino-acid sequences (where you don't care about mutations between different codons that still code for the same AA) have about 20 symbols per position, which means a symbol doesn't quite fit into a 4-bit nibble.
I used to work for the phylogenetics research group at Dalhousie, so I've sometimes thought about having a look at DNA-sequence software to see if I could improve on how they internally store sequence data. I never got around to it, though. The real CPU intensive work happens in finding a maximum-likelihood evolutionary tree after you've already calculated a matrix of the evolutionary distance between every pair of input sequences. So actual sequence comparison isn't the bottleneck.
do the maths:
4^10 = (2^2)^10 = 2^20
Answer: 20 bits
I have a question about working with very big numbers. I'm trying to implement the RSA algorithm. Let's pretend I have a 512-bit number d and a 1024-bit number n. decrypted_word = crypted_word^d mod n, isn't it? But those d and n are very large numbers! None of the standard variable types can handle my 512-bit numbers. Everywhere it is written that RSA needs at least a 512-bit prime number, but how can I actually perform any mathematical operations on such a number?
And one more thing: I can't use extra libraries. I generate my prime numbers in Java using BigInteger, but on my target system I have only basic variable types, and STRING256 is the biggest.
Suppose your maximal integer size is 64 bit. Strings are not that useful for doing math in most languages, so disregard string types. Now choose an integer of half that size, i.e. 32 bit. An array of these can be interpreted as digits of a number in base 2^32. With these, you can do long addition and multiplication, just like you are used to with base 10 and pen and paper. In each elementary step, you combine two 32-bit quantities to produce both a 32-bit result and possibly some carry. If you do the elementary operation in 64-bit arithmetic, you'll have both of these as part of a single 64-bit variable, which you'll then have to split into the 32-bit result digit (via bit mask or simple truncating cast) and the remaining carry (via bit shift).
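A minimal sketch of that long addition (Python; the lists of 32-bit digits stand in for arrays of uint32_t, and the 64-bit intermediate is implicit in Python's integers):

    MASK32 = (1 << 32) - 1

    def add(a, b):
        # a, b: equal-length little-endian lists of base-2^32 digits.
        out, carry = [], 0
        for x, y in zip(a, b):
            s = x + y + carry          # fits in 64 bits: at most 2^33 - 1
            out.append(s & MASK32)     # low 32 bits = result digit
            carry = s >> 32            # remaining high bit = carry
        if carry:
            out.append(carry)
        return out

    print(add([MASK32, MASK32], [1, 0]))   # [0, 0, 1], i.e. 2^64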
Division is harder. But if the divisor is known, then you may get away with doing a division by a constant using multiplication instead. Consider an example: division by 7. The inverse of 7 is 1/7 = 0.142857... So you can multiply by that to obtain the same result. Obviously we don't want to do any floating-point math here. But you can also simply multiply by 14286 and then omit the last five digits of the result. This will be exactly the right result if your dividend is small enough. How small? Well, you compute x/7 as x*14286/100000, so the error will be x*(14286/100000 - 1/7) = x/350000, so you are on the safe side as long as x < 350000. As long as the modulus in your RSA setup is known, i.e. as long as the key pair remains the same, you can use this approach to do integer division, and can also use it to compute the remainder. Remember to use base 2^32 instead of base 10, though, and check how many digits you need for the inverse constant.
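A quick numeric check of that bound (Python). Note that plain truncation is guaranteed exact only when the division comes out even; with a nonzero remainder the approximate quotient can land one too high, so a real implementation should budget for a small correction step:

    # Exact multiples of 7 below the bound: truncation always matches.
    for x in range(0, 350000, 7):
        assert x * 14286 // 100000 == x // 7

    # With a remainder, the accumulated error can push the floor up by one.
    x = 69999
    print(x * 14286 // 100000, x // 7)   # 10000 9999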
There is an alternative you might want to consider, to do modulo reduction more easily, perhaps even if n is variable. Instead of expressing your remainders as numbers 0 through n-1, you could also use 2^1024-n through 2^1024-1. So if your initial number is smaller than 2^1024-n, you add n to convert to this new encoding. The benefit of this is that you can do the reduction step without performing any division at all. 2^1024 is equivalent to 2^1024-n in this setup, so an elementary modulo reduction would start by splitting some number into its lower 1024 bits and its higher rest. The higher rest is right-shifted by 1024 bits (which is just a change in your array indexing), then multiplied by 2^1024-n and finally added to the lower part. You'll have to do this until you can be sure that the result has no more than 1024 bits. How often that is depends on n, so for fixed n you can precompute it (for large n I'd expect two reduction steps after an addition but three steps after a multiplication; please double-check that), whereas for variable n you'll have to check at runtime. At the very end, you can go back to the usual representation: if the result is not smaller than n, subtract n. All of this should work as described if n > 2^512. If not, i.e. if the top bit of your modulus is zero, then you might have to make further adjustments. I haven't thought this through, since I have only used this approach for fixed moduli close to a power of two so far.
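The reduction loop itself is tiny. A sketch of the plain variant (Python; it skips the shifted-range encoding and just finishes with a conditional subtraction, assuming n has its top bit set):

    def reduce_mod(y, n, bits=1024):
        # Uses 2^bits == 2^bits - n (mod n) to shrink y without dividing.
        K = 1 << bits
        while y >> bits:                   # y is still longer than `bits` bits
            hi, lo = y >> bits, y & (K - 1)
            y = hi * (K - n) + lo          # congruent to y mod n, but smaller
        return y - n if y >= n else y      # one subtraction suffices if n > 2^(bits-1)

    # Small-scale check with bits=8, n=251:
    print(reduce_mod(1234567, 251, bits=8), 1234567 % 251)   # 149 149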
Now for the exponentiation. I very much suggest you use the binary approach for that. When computing x^d, you start with x, x^2 = x*x, x^4 = x^2*x^2, x^8 = ..., i.e. you compute all the power-of-two exponents. You also maintain an intermediate result, which you initialize to one. In every step, if the corresponding bit is set in the exponent d, then you multiply the corresponding power into that intermediate result. So let's say you have d=11. Then you'd compute 1*x^1*x^2*x^8 because d = 11 = 1+2+8 = 1011 in binary. That way, you'll need only about 1024 multiplications at most if your exponent has 512 bits: half of them for the power-of-two exponentiations, the other half to combine the right powers of two. Every single multiplication in all of this should be immediately followed by a modulo reduction, to keep memory requirements low.
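That loop is short in code as well. A sketch (Python; with your own big-number routines, the multiplications and reductions below become calls to those routines):

    def modpow(x, d, n):
        # Binary (square-and-multiply) exponentiation: computes x^d mod n.
        result = 1
        power = x % n
        while d:
            if d & 1:                         # this bit of d is set
                result = result * power % n   # reduce right after multiplying
            power = power * power % n         # next power-of-two exponent
            d >>= 1
        return result

    print(modpow(7, 11, 13), pow(7, 11, 13))   # 2 2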
Note that the speed of the above exponentiation process will, in this simple form, depend on how many bits in d are actually set. So this might open up a side channel attack which might give an attacker access to information about d. But if you are worried about side channel attacks, then you really should have an expert develop your implementation, because I guess there might be more of those that I didn't think about.
You can write macros (which you may execute under Microsoft software) for functions like +, -, x, /, modulo, and x power y, which work generally for any integer of fewer than ten or a hundred thousand digits (the practical -- not theoretical -- limit being your machine's internal memory). Please note the logic is exactly the same as the one you learned at elementary school.
E.g.: p= 1819181918953471 divider of (2^8091) - 1, q = ((2^8091) - 1)/p, mod(2^8043 ; q ) = 23322504995859448929764248735216052746508873363163717902048355336760940697615990871589728765508813434665732804031928045448582775940475126837880519641309018668592622533434745187004918392715442874493425444385093718605461240482371261514886704075186619878194235490396202667733422641436251739877125473437191453772352527250063213916768204844936898278633350886662141141963562157184401647467451404036455043333801666890925659608198009284637923691723589801130623143981948238440635691182121543342187092677259674911744400973454032209502359935457437167937310250876002326101738107930637025183950650821770087660200075266862075383130669519130999029920527656234911392421991471757068187747362854148720728923205534341236146499449910896530359729077300366804846439225483086901484209333236595803263313219725469715699546041162923522784170350104589716544529751439438021914727772620391262534105599688603950923321008883179433474898034318285889129115556541479670761040388075352934137326883287245821888999474421001155721566547813970496809555996313854631137490774297564881901877687628176106771918206945434350873509679638109887831932279470631097604018939855788990542627072626049281784152807097659485238838560958316888238137237548590528450890328780080286844038796325101488977988549639523988002825055286469740227842388538751870971691617543141658142313059934326924867846151749777575279310394296562191530602817014549464614253886843832645946866466362950484629554258855714401785472987727841040805816224413657036499959117701249028435191327757276644272944743479296268749828927565559951441945143269656866355210310482235520220580213533425016298993903615753714343456014577479225435915031225863551911605117029393085632947373872635330181718820669836830147312948966028682960518225213960218867207825417830016281036121959384707391718333892849665248512802926601676251199711698978725399048954325887410317060400620412797240129787158839164969382498537742579233544463501470239575760940937130926062252501116458281610468726777710383038372260777522143500312913040987942762244940009811450966646527814576364565964518092955053720983465333258335601691477534154940549197873199633313223848155047098569827560014018412679602636286195283270106917742919383395056306107175539370483171915774381614222806960872813575048014729965930007408532959309197608469115633821869206793759322044599554551057140046156235152048507130125695763956991351137040435703946195318000567664233417843805257728.
The last step took about 0.1 sec.
wpjo (willibrord oomen on academia.edu)
I am attempting to generate QR codes on an extremely limited embedded platform. Everything in the specification seems fairly straightforward except for generating the error correction codewords. I have looked at a bunch of existing implementations, and they all try to implement a bunch of polynomial math that goes straight over my head, particularly with regard to the Galois fields. The most straightforward approach I can see, both in mathematical complexity and in memory requirements, is a circuit concept that is laid out in the spec itself.
With their description, I am fairly confident I could implement this with the exception of the parts labeled GF(256) addition and GF(256) Multiplication.
They offer this help:
The polynomial arithmetic for QR Code shall be calculated using bit-wise modulo 2 arithmetic and byte-wise modulo 100011101 arithmetic. This is a Galois field of 2^8 with 100011101 representing the field's prime modulus polynomial x^8 + x^4 + x^3 + x^2 + 1.
which is all pretty much Greek to me.
So my question is this: What is the easiest way to perform addition and multiplication in this kind of Galois field arithmetic? Assume both input numbers are 8 bits wide, and my output needs to be 8 bits wide also. Several implementations precalculate, or hardcode in two lookup tables to help with this, but I am not sure how those are calculated, or how I would use them in this situation. I would rather not take the 512 byte memory hit for the two tables, but it really depends on what the alternative is. I really just need help understanding how to do a single multiplication and addition operation in this circuit.
In practice only one table is needed. That would be for the GF(256) multiply. Note that all the arithmetic is carry-less, meaning there is no carry-propagation.
Addition and subtraction without carry is equivalent to an xor.
So in GF(256), a + b and a - b are both equivalent to a xor b.
GF(256) multiplication is also carry-less: it is a carry-less multiply combined with the carry-less addition/subtraction above. It can be done efficiently with hardware support, e.g. via Intel's CLMUL instruction set.
However, the hard part is reducing modulo 100011101. In normal integer division, you do that with a series of compare/subtract steps. In GF(256), you do it in a nearly identical manner with a series of compare/xor steps.
In fact, it's bad enough that it's often still faster to just precompute all 256 x 256 products and put them into a 65536-entry look-up table.
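If you'd rather spend a few cycles than 64 KB, the compare/xor reduction folds into an ordinary shift-and-xor multiply. A sketch (Python, though it maps directly onto C for an embedded target):

    def gf256_mul(a, b):
        # Carry-less multiply of two bytes, reduced modulo 100011101 (0x11D).
        p = 0
        for _ in range(8):
            if b & 1:
                p ^= a                 # carry-less "addition" is xor
            b >>= 1
            a <<= 1
            if a & 0x100:              # a picked up an x^8 term:
                a ^= 0x11D             # xor out the modulus polynomial
        return p

    print(hex(gf256_mul(0x02, 0x80)))  # 0x1d: x * x^7 = x^8 == x^4+x^3+x^2+1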
Page 3 of the following PDF has a pretty good reference on GF(256) arithmetic:
http://www.eecs.harvard.edu/~michaelm/CS222/eccnotes.pdf
(I'm following up on the pointer to zxing in the first answer, since I'm the author.)
The answer about addition is exactly right; that's why working in this field is convenient on a computer.
See http://code.google.com/p/zxing/source/browse/trunk/core/src/com/google/zxing/common/reedsolomon/GenericGF.java
Yes, multiplication works this way in GF(256): a * b is really the same as exp(log(a) + log(b)). And because GF(256) has only 256 elements, there are only 255 unique nonzero powers of "x", and the same goes for log. So these are easy to put in lookup tables. The tables "wrap around" with period 255, and that is why you see the "% size". The "/ size" is slightly harder to explain in a sentence -- it's because really 1-255 "wrap around", not 0-255. So it's not quite just a simple modulus that's needed.
The final piece perhaps is how you reduce modulo an irreducible polynomial. The irreducible polynomial is x^8 plus some lower-power terms -- call it I(x) = x^8 + R(x). The polynomial is congruent to 0 in the field, by definition: I(x) == 0. So x^8 == -R(x). And, conveniently, addition and subtraction are the same, so x^8 == -R(x) == R(x).
The only time we need to reduce higher-power polynomials is when constructing the exponents table. You just keep multiplying by x (which is a shift left) until it gets too big -- gets an x^8 term. But x^8 is the same as R(x). So you take out the x^8 and add in R(x). R(x) merely has powers up to x^7 so it's all in a byte still, all in GF(256). And you know how to add in this field.
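Putting that together, here is a sketch of how the tables can be built and then used (Python; 0x11D is the QR modulus quoted above, and taking x = 2 as generator matches the zxing code):

    exp = [0] * 256
    log = [0] * 256
    x = 1
    for i in range(255):               # the 255 nonzero elements are powers of x
        exp[i] = x
        log[x] = i
        x <<= 1                        # multiply by x
        if x & 0x100:                  # an x^8 term appeared:
            x ^= 0x11D                 # swap the x^8 term for R(x)

    def multiply(a, b):
        if a == 0 or b == 0:
            return 0
        return exp[(log[a] + log[b]) % 255]   # log sums wrap with period 255

    print(hex(multiply(0x02, 0x80)))   # 0x1d, matching x^8 == R(x)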
Helps?
I just read this interesting question about a random number generator that never generates the same value three consecutive times. This clearly makes the random number generator different from a standard uniform random number generator, but I'm not sure how to quantitatively describe how this generator differs from a generator that didn't have this property.
Suppose that you handed me two random number generators, R and S, where R is a true random number generator and S is a true random number generator that has been modified to never produce the same value three consecutive times. If you didn't tell me which one was R or S, the only way I can think of to detect this would be to run the generators until one of them produced the same value three consecutive times.
My question is - is there a better algorithm for telling the two generators apart? Does the restriction of not producing the same number three times somehow affect the observable behavior of the generator in a way other than preventing three of the same value from coming up in a row?
As a consequence of Rice's Theorem, there is no way to tell which is which.
Proof: Let L be the output of the normal RNG. Let L' be L, but with every sequence that contains a run of three or more equal values removed. Some TMs recognize L', but some do not. Therefore, by Rice's theorem, determining whether a TM accepts L' is not decidable.
As others have noted, you may be able to make an assertion like "it has run for N steps without repeating three times", but you can never make the leap to "it will never repeat a value three times". More precisely, there exists at least one machine for which you can't determine whether or not it meets this criterion.
Caveat: if you had a truly random generator (e.g. nuclear decay), it is possible that Rice's theorem would not apply. My intuition is that the theorem still holds for these machines, but I've never heard it discussed.
EDIT: a secondary proof. Suppose P(X) determines with high probability whether or not X accepts L'. We can construct an (infinite number of) programs F like:
F(x): if x(F), then don't accept L'
else, accept L'
P cannot determine the behavior of F(P). Moreover, say P correctly predicts the behavior of G. We can construct:
F'(x): if x(F'), then don't accept L'
else, run G(x)
So for every good case, there must exist at least one bad case.
If S is defined by rejecting from R, then a sequence produced by S will be a subsequence of the sequence produced by R. For example, taking a simple random variable X with equal probability of being 1 or 0, you would have:
R = 0 1 1 0 0 0 1 0 1
S = 0 1 1 0 0 1 0 1
The only real way to differentiate these two generators is to look for streaks. If you are generating binary numbers, then streaks are incredibly common (so much so that one can almost always differentiate between a random 100-digit sequence and one that a student writes down trying to look random). If the numbers are taken from [0,1] uniformly, then streaks are far less common.
It's an easy exercise in probability to calculate the chance of three consecutive numbers being equal once you know the distribution, or even better, the expected number of numbers needed until the probability of three consecutive equal numbers is greater than p for your favourite choice of p.
Since you defined that they differ only with respect to that specific property, there is no better algorithm to distinguish the two.
If you look at triples of random values, generator S will of course produce every other triple slightly more often than R, to compensate for the missing triples (X,X,X). But to get a significant result you'd need much more data than it would cost you to simply wait for some value to appear three consecutive times.
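A quick simulation makes both effects visible (Python; gen_S below is the rejection construction for a fair bit source):

    import random

    def gen_S(n):
        # Like R, but redraw any value that would make three equal values in a row.
        out = []
        while len(out) < n:
            b = random.randint(0, 1)
            if len(out) >= 2 and out[-1] == out[-2] == b:
                continue               # would be the third repeat: redraw
            out.append(b)
        return out

    def triple_counts(seq):
        counts = {}
        for t in zip(seq, seq[1:], seq[2:]):
            counts[t] = counts.get(t, 0) + 1
        return counts

    random.seed(0)
    print(triple_counts(gen_S(100000)))
    # (0,0,0) and (1,1,1) never appear; their mass spreads over the other six triples.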
Probably use ENT (http://fourmilab.ch/random/).