Encoding DNA strand in Binary - math

Hey guys I have the following question:
Suppose we are working with strands of DNA, each strand consisting of
a sequence of 10 nucleotides. Each nucleotide can be any one of four
different types: A, G, T or C. How many bits does it take to encode a
DNA strand?
Here is my approach to it and I want to know if that is correct.
We have 10 spots. Each spot can have 4 different symbols. This means we require 4^10 combinations using our binary digits.
4^10 = 1048576.
We will then find the log base 2 of that. What do you guys think of my approach?

Each nucleotide (aka base-pair) takes two bits (one of four states -> 2 bits of information). 10 base-pairs thus take 20 bits. Reasoning that way is easier than doing the log2(4^10), but gives the same answer.
It would be fewer bits of information if there were any combinations that couldn't appear. e.g. some codons (sequence of three base-pairs) that never appear. But ten independent 2-bit pieces of information sum to 20 bits.
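To make the 2-bits-per-nucleotide counting concrete, here is a minimal sketch that packs a 10-nucleotide strand into a single integer and unpacks it again. The particular bit assignment (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration; any fixed assignment works the same way.

```python
# Hypothetical 2-bit code for each nucleotide; any fixed assignment would do.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
LETTER = {bits: base for base, bits in CODE.items()}


def encode_strand(strand):
    """Pack a strand into an integer, 2 bits per nucleotide."""
    value = 0
    for base in strand:
        value = (value << 2) | CODE[base]
    return value


def decode_strand(value, length):
    """Unpack `length` nucleotides from an integer made by encode_strand."""
    bases = []
    for _ in range(length):
        bases.append(LETTER[value & 0b11])  # lowest 2 bits = last nucleotide
        value >>= 2
    return "".join(reversed(bases))


strand = "GATTACAGTC"
packed = encode_strand(strand)
assert packed < 2 ** 20  # ten nucleotides always fit in 20 bits
assert decode_strand(packed, len(strand)) == strand
```

The largest possible value, ten T's under this assignment, is exactly 2^20 - 1, which is why 20 bits are both enough and necessary when every combination can occur.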
If some sequences appear more frequently than others, and a variable-length representation is viable, then Huffman coding or other compression schemes could save bits most of the time. This might be good in a file-format, but unlikely to be good in-memory when you're working with them.
Densely packing your data into an array of 2-bit fields makes it slower to access a single base-pair, but comparing the whole chunk for equality with another chunk is still efficient (memcmp).
20 bits is unfortunately just slightly too large for a 16-bit integer (which computers are good at). Storing in an array of 32-bit zero-extended values wastes a lot of space. On hardware with good unaligned support, storing 24-bit zero-extended values is OK: do a 32-bit load and mask off the high 8 bits. Storing is even less convenient, though: probably a 16-bit store and an 8-bit store, or else load the old value, merge the high 8 bits, then do a 32-bit store. But that's not atomic.
This is a similar problem for storing codons (groups of three base-pairs that code for an amino acid): 6 bits of information doesn't fill a byte. Only wasting 2 of every 8 bits isn't that bad, though.
Amino-acid sequences (where you don't care about mutations between different codons that still code for the same AA) have about 20 symbols per position, which means a symbol doesn't quite fit into a 4-bit nibble.
I used to work for the phylogenetics research group at Dalhousie, so I've sometimes thought about having a look at DNA-sequence software to see if I could improve on how they internally store sequence data. I never got around to it, though. The real CPU intensive work happens in finding a maximum-likelihood evolutionary tree after you've already calculated a matrix of the evolutionary distance between every pair of input sequences. So actual sequence comparison isn't the bottleneck.

Do the maths:
4^10 = (2^2)^10 = 2^20
Answer: 20 bits

Related

How to perform mathematical operations on large numbers

I have a question about working with very big numbers. I'm trying to run the RSA algorithm, and let's pretend I have a 512-bit number d and a 1024-bit number n. decrypted_word = crypted_word^d mod n, isn't it? But those d and n are very large numbers! None of the standard variable types can handle my 512-bit numbers. Everywhere it is written that RSA needs a 512-bit prime number at least, but how can I actually perform any mathematical operations on such a number?
And one more thing. I can't use extra libraries. I generate my prime numbers with Java, using BigInteger, but on my system I have only basic variable types, and STRING256 is the biggest.
Suppose your maximal integer size is 64 bit. Strings are not that useful for doing math in most languages, so disregard string types. Now choose an integer of half that size, i.e. 32 bit. An array of these can be interpreted as digits of a number in base 2^32. With these, you can do long addition and multiplication, just like you are used to with base 10 and pen and paper. In each elementary step, you combine two 32-bit quantities, to produce both a 32-bit result and possibly some carry. If you do the elementary operation in 64-bit arithmetic, you'll have both of these as part of a single 64-bit variable, which you'll then have to split into the 32-bit result digit (via bit mask or simple truncating cast) and the remaining carry (via bit shift).
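The limb-by-limb scheme described above can be sketched as follows. This is only an illustration under some assumptions: Python integers are unbounded, so the 64-bit intermediate is simulated by never letting `total` exceed 2^64 - 1, and the helper names (`to_limbs`, `add_limbs`, `mul_limbs`) are made up for this example.

```python
BASE_BITS = 32
MASK = (1 << BASE_BITS) - 1  # keeps the low 32 bits of an intermediate value


def to_limbs(n):
    """Split a non-negative int into 32-bit digits, least significant first."""
    limbs = []
    while n:
        limbs.append(n & MASK)
        n >>= BASE_BITS
    return limbs or [0]


def from_limbs(limbs):
    """Recombine 32-bit digits into a Python int (used only for checking)."""
    return sum(d << (BASE_BITS * i) for i, d in enumerate(limbs))


def add_limbs(a, b):
    """Schoolbook addition in base 2**32."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        total = x + y + carry
        result.append(total & MASK)  # 32-bit result digit via bit mask
        carry = total >> BASE_BITS   # carry via bit shift
    if carry:
        result.append(carry)
    return result


def mul_limbs(a, b):
    """Schoolbook multiplication in base 2**32."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # At most (2**32-1) + (2**32-1)**2 + (2**32-1) = 2**64 - 1,
            # so this would fit in one unsigned 64-bit variable.
            total = result[i + j] + x * y + carry
            result[i + j] = total & MASK
            carry = total >> BASE_BITS
        result[i + len(b)] = carry
    return result


a, b = 2 ** 521 - 1, 2 ** 127 + 12345
assert from_limbs(mul_limbs(to_limbs(a), to_limbs(b))) == a * b
```

In a language with fixed-width types the same loops apply unchanged, with `total` declared as an unsigned 64-bit integer and the limbs as unsigned 32-bit integers.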
Division is harder. But if the divisor is known, then you may get away with doing a division by constant using multiplication instead. Consider an example: division by 7. The inverse of 7 is 1/7=0.142857…. So you can multiply by that to obtain the same result. Obviously we don't want to do any floating point math here. But you can also simply multiply by 14286 then omit the last five digits of the result. This will be exactly the right result if your dividend is small enough. How small? Well, you compute x/7 as x*14286/100000, so the error will be x*(14286/100000 - 1/7)=x/350000. Since the quotient is rounded down and x/7 can sit as little as 1/7 below the next integer, that error has to stay below 1/7, so you are on the safe side as long as x<50000. As long as the modulus in your RSA setup is known, i.e. as long as the key pair remains the same, you can use this approach to do integer division, and can also use that to compute the remainder. Remember to use base 2^32 instead of base 10, though, and check how many digits you need for the inverse constant.
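Since the safe range of such a shortcut is easy to get wrong, a brute-force scan is a cheap way to check it. The sketch below uses the constants from the division-by-7 example; the function names are made up for illustration.

```python
def div7_by_multiplication(x):
    """Approximate x // 7 as (x * 14286) // 100000."""
    return (x * 14286) // 100000


def first_mismatch(limit):
    """Return the smallest x below `limit` where the shortcut differs from x // 7."""
    for x in range(limit):
        if div7_by_multiplication(x) != x // 7:
            return x
    return None  # exact over the whole scanned range


print(first_mismatch(350000))
```

With these constants the scan reports 50000 as the first dividend where the shortcut is off by one, because rounding down leaves only 1/7 of headroom when x leaves remainder 6. The same kind of scan, or the equivalent bound, is worth redoing for whatever base and inverse constant you end up using.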
There is an alternative you might want to consider, to do modulo reduction more easily, perhaps even if n is variable. Instead of expressing your remainders as numbers 0 through n-1, you could also use 2^1024-n through 2^1024-1. So if your initial number is smaller than 2^1024-n, you add n to convert to this new encoding. The benefit of this is that you can do the reduction step without performing any division at all. 2^1024 is equivalent to 2^1024-n in this setup, so an elementary modulo reduction would start by splitting some number into its lower 1024 bits and its higher rest. The higher rest will be right-shifted by 1024 bits (which is just a change in your array indexing), then multiplied by 2^1024-n and finally added to the lower part. You'll have to do this until you can be sure that the result has no more than 1024 bits. How often that is depends on n, so for fixed n you can precompute that (and for large n I'd expect it to be two reduction steps after addition but three steps after multiplication, but please double-check that) whereas for variable n you'll have to check at runtime. At the very end, you can go back to the usual representation: if the result is not smaller than n, subtract n. All of this should work as described if n>2^512. If not, i.e. if the top bit of your modulus is zero, then you might have to make further adjustments. Haven't thought this through, since I only used this approach for fixed moduli close to a power of two so far.
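A simplified sketch of the folding step may help here. It is not the full scheme from the paragraph above: it keeps ordinary remainders 0..n-1 at the end, uses Python integers in place of limb arrays, and picks a hypothetical modulus close to 2^1024 purely for illustration. What it does show is that the reduction needs only shifts, masks, multiplications and additions.

```python
K = 1024                    # width of the modulus in bits
N = (1 << K) - 105          # hypothetical modulus close to 2**K (illustration only)
C = (1 << K) - N            # 2**K is congruent to C modulo N
LOW_MASK = (1 << K) - 1


def reduce_mod_n(x):
    """Reduce a non-negative x modulo N without any division."""
    while x >> K:                          # bits above the low K bits remain
        # x = high * 2**K + low, and 2**K == C (mod N),
        # so x is congruent to low + high * C, which is strictly smaller.
        x = (x & LOW_MASK) + (x >> K) * C
    if x >= N:                             # one subtraction suffices: x < 2**K < 2*N
        x -= N
    return x


product = (N - 1) * (N - 2)
assert reduce_mod_n(product) == product % N
```

The loop runs until nothing is left above bit 1024; for a fixed modulus the number of iterations could be bounded in advance, as the paragraph suggests, instead of being tested at runtime.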
Now for that exponentiation. I very much suggest you do the binary approach for that. When computing x^d, you start with x, x^2=x*x, x^4=x^2*x^2, x^8=…, i.e. you compute all power-of-two exponents. You also maintain some intermediate result, which you initialize to one. In every step, if the corresponding bit is set in the exponent d, then you multiply the corresponding power into that intermediate result. So let's say you have d=11. Then you'd compute 1*x^1*x^2*x^8 because d=11=1+2+8, which is 1011 in binary. That way, you'll need only about 1024 multiplications max if your exponent has 512 bits. Half of them for the powers-of-two exponentiation, the other to combine the right powers of two. Every single multiplication in all of this should be immediately followed by a modulo reduction, to keep memory requirements low.
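The binary approach can be sketched in a few lines. Python's built-in integers and `%` stand in here for the limb arithmetic and modulo reduction you would write yourself; the function name `mod_pow` is an assumption for this example.

```python
def mod_pow(x, d, n):
    """Compute x**d mod n by square-and-multiply, scanning d from its lowest bit."""
    result = 1 % n      # intermediate result, initialised to one
    power = x % n       # holds x^(2^i) mod n for the current bit i
    while d:
        if d & 1:                           # bit i of the exponent is set
            result = (result * power) % n   # multiply that power in, then reduce
        power = (power * power) % n         # next power of two, reduced at once
        d >>= 1
    return result


# d = 11 = 1 + 2 + 8, so the result combines x^1, x^2 and x^8.
assert mod_pow(7, 11, 1000) == (7 ** 11) % 1000
```

As the following paragraph notes, the running time of this simple form depends on which bits of d are set, so it should be read as a way to understand the algorithm, not as a hardened implementation.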
Note that the speed of the above exponentiation process will, in this simple form, depend on how many bits in d are actually set. So this might open up a side channel attack which might give an attacker access to information about d. But if you are worried about side channel attacks, then you really should have an expert develop your implementation, because I guess there might be more of those that I didn't think about.
You can write macros that run under Microsoft for operations like +, -, x, /, modulo and x to the power y, which work for any integer of up to ten thousand or a hundred thousand digits (the practical, not theoretical, limit being the memory of your machine). Note that the logic is exactly the same as the one you learned at elementary school.
E.g.: p= 1819181918953471 divider of (2^8091) - 1, q = ((2^8091) - 1)/p, mod(2^8043 ; q ) = 233225049958594489297642487352160527465088733631637179020483553367609406976159908715897287655088134346657328040319280454485827759404751268378805196413090186685926225334347451870049183927154428744934254443850937186054612404823712615148867040751866198781942354903962026677334226414362517398771254734371914537723525272500632139167682048449368982786333508866621411419635621571844016474674514040364550433338016668909256596081980092846379236917235898011306231439819482384406356911821215433421870926772596749117444009734540322095023599354574371679373102508760023261017381079306370251839506508217700876602000752668620753831306695191309990299205276562349113924219914717570681877473628541487207289232055343412361464994499108965303597290773003668048464392254830869014842093332365958032633132197254697156995460411629235227841703501045897165445297514394380219147277726203912625341055996886039509233210088831794334748980343182858891291155565414796707610403880753529341373268832872458218889994744210011557215665478139704968095559963138546311374907742975648819018776876281761067719182069454343508735096796381098878319322794706310976040189398557889905426270726260492817841528070976594852388385609583168882381372375485905284508903287800802868440387963251014889779885496395239880028250552864697402278423885387518709716916175431416581423130599343269248678461517497775752793103942965621915306028170145494646142538868438326459468664663629504846295542588557144017854729877278410408058162244136570364999591177012490284351913277572766442729447434792962687498289275655599514419451432696568663552103104822355202205802135334250162989939036157537143434560145774792254359150312258635519116051170293930856329473738726353301817188206698368301473129489660286829605182252139602188672078254178300162810361219593847073917183338928496652485128029266016762511997116989787253990489543258874103170604006204127972401297871588391649693824985377425792
33544463501470239575760940937130926062252501116458281610468726777710383038372260777522143500312913040987942762244940009811450966646527814576364565964518092955053720983465333258335601691477534154940549197873199633313223848155047098569827560014018412679602636286195283270106917742919383395056306107175539370483171915774381614222806960872813575048014729965930007408532959309197608469115633821869206793759322044599554551057140046156235152048507130125695763956991351137040435703946195318000567664233417843805257728.
The last step took about 0.1 sec.
wpjo (willibrord oomen on academia.edu)

what options are there for representing numbers with more than 2^81 digits?

I came across an interesting math problem that would require me to do some arithmetic with numbers that have more than 2^81 digits. I know that it's impossible to represent a number this large with a system where there is one memory unit for each digit, but wondered if there were any ways around this.
My initial thought was to use an extremely large base instead of base 10 (decimal). After some thought I believe (but can't verify) that the optimal base would be the square root of the number of digits (so for a number with 2^81 digits you'd use base 2^40ish), which is an improvement, but that doesn't scale well and still isn't really practical.
So what options do I have? I know of many arbitrary precision libraries, but are there any that scale to support this sort of arithmetic?
Thanks o7
EDIT: After thinking some more I realize I may be completely wrong about the "optimal base would be the square root of the number of digits", but a) that's why I'm asking and b) I'm too tired to remember my initial reasoning for that assumption.
EDIT 2: 1,000,000 in base ten = F4240 in base 16 = 3641100 in base 8. In base 16 you need 20 bits to store the number; in base 8 you need 21, so it would seem that by increasing the base you decrease the total number of bits needed. (Again, this could be wrong.)
This is really a compression problem pretending to be an arithmetic problem. What you can do with such a large number depends entirely on its Kolmogorov complexity. If you're required to do computations on such a large number, it's obviously not going to arrive as 2^81 decimal digits; the Kolmogorov complexity would be too high in that case and you couldn't even finish reading the input before the sun goes out. The best way to deal with such a number is via delayed evaluation and symbolic rational types that a language like Scheme provides. This way a program may be able to answer some questions about the result of computations on the number without actually having to write out all those digits to memory.
I think you should just use scientific notation. You will lose precision, but you cannot store numbers that large without losing precision, because storing 2^81 digits will require more than 10^24 bits (about a thousand billion terabytes), which is much more than you can have nowadays.
that have more than 2^81 digits
A non-fractional number with 2^81 bits will take 3*10^11 terabytes of data. Per number.
That's assuming you want every single digit and the data isn't compressible.
You could attempt to compress the data by storing it in some kind of sparse array that allocates memory only for non-zero elements, but that doesn't guarantee that the data will fit anywhere.
Such precision is useless and impossible to handle on modern hardware. Simply walking through a 2^81-bit number would take an insane amount of time (roughly 9.6 trillion years, assuming 1 byte takes 1 millisecond), never mind multiplication/division. I also can't think of any problem that would require precision like that.
Your only option is to reduce precision to the first N significant digits and use floating point numbers. Since the data won't fit into a double, you'll have to use a bignum library with floating point support that provides extremely large floating point numbers. Since you can represent 2^81 (the exponent) in bits, you can store the beginning of the number using a very big floating point value.
1,000,000 in base ten
Regardless of your base, a positive number will take at least floor(log2(number))+1 bits to store. With a base other than 2 it can take more than floor(log2(number))+1 bits, but never fewer. The numeric base won't reduce the number of required bits.
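A quick check of that claim on the asker's own example, as a small sketch. It assumes each digit is stored in a fixed-width bit field (4 bits for a hex or decimal digit, 3 for an octal digit); the function names are made up for illustration.

```python
def bits_needed(n):
    """floor(log2(n)) + 1 for n > 0, computed exactly with int.bit_length()."""
    return n.bit_length()


def bits_when_stored_as_digits(n, base):
    """Bits used if every base-`base` digit of n gets its own fixed-width field."""
    digits = 0
    remaining = n
    while remaining:
        remaining //= base
        digits += 1
    bits_per_digit = (base - 1).bit_length()  # smallest field that holds one digit
    return digits * bits_per_digit


n = 1_000_000
print(bits_needed(n))                      # plain binary
print(bits_when_stored_as_digits(n, 16))   # 5 hex digits of 4 bits each
print(bits_when_stored_as_digits(n, 8))    # 7 octal digits of 3 bits each
print(bits_when_stored_as_digits(n, 10))   # 7 decimal digits of 4 bits each
```

Base 16 ties with plain binary at 20 bits here, base 8 needs 21 and base 10 needs 28; none of them gets below floor(log2(n))+1.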

Generate very very large random numbers

How would you generate a very very large random number? I am thinking on the order of 2^10^9 (one billion bits). Any programming language -- I assume the solution would translate to other languages.
I would like a uniform distribution on [1,N].
My initial thoughts:
You could randomly generate each digit and concatenate. Problem: even very good pseudorandom generators are likely to develop patterns with millions of digits, right?
You could perhaps help create large random numbers by raising random numbers to random exponents. Problem: you must make the math work so that the resulting number is still random, and you should be able to compute it in a reasonable amount of time (say, an hour).
If it helps, you could try to generate a possibly non-uniform distribution on a possibly smaller range (using the real numbers, for instance) and transform. Problem: this might be equally difficult.
Any ideas?
Generate log2(N) random bits to get a number M,
where M may be up to twice as large as N.
Repeat until M is in the range [1;N].
Now, to generate the random bits you could use a source of true randomness, which is expensive.
Or you might use a cryptographically secure random number generator, for example AES with a random key encrypting a counter for subsequent blocks of bits. Cryptographically secure implies that there can be no noticeable patterns.
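The generate-and-retry idea above can be sketched like this. It assumes Python's `secrets` module as the cryptographically secure bit source (the answer suggests AES in counter mode, so this is a swapped-in stand-in), and the function name is made up for the example.

```python
import secrets


def random_in_range(n):
    """Return a uniformly distributed integer in [1, n] by rejection sampling."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = n.bit_length()                # enough bits that 2**k > n
    while True:
        m = secrets.randbits(k) + 1   # uniform on [1, 2**k]
        if m <= n:                    # accepted at least half of the time
            return m                  # otherwise throw m away and retry


print(random_in_range(2 ** 64))
```

Because 2^k is less than twice n, each attempt succeeds with probability above one half, so the expected number of retries stays below two no matter how large n is. Rejecting and retrying, instead of taking the value modulo n, is what keeps the distribution uniform.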
It depends on what you need the data for. For most purposes, a PRNG is fast and simple. But they are not perfect. For instance, I remember hearing that Monte Carlo simulations of chaotic systems are really good at revealing the underlying pattern in a PRNG.
If that is the sort of thing that you are doing, though, there is a simple trick I learned in grad school for generating lots of random data. Take a large (preferably rapidly changing) file. (Some big data structures from the running kernel are good.) Compress it to increase the entropy. Throw away the headers. Then for good measure, encrypt the result. If you're planning to use this for cryptographic purposes (and you didn't have a perfect entropy data set to work with), then reverse it and encrypt again.
The underlying theory is simple. Information theory tells us that there is no difference between a signal with no redundancy and pure random data. So if we pick a big file (ie lots of signal), remove redundancy with compression, and strip the headers, we have a pretty good random signal. Encryption does a really good job at removing artifacts. However encryption algorithms tend to work forward in blocks. So if someone could, despite everything, guess what was happening at the start of the file, that data is more easily guessable. But then reversing the file and encrypting again means that they would need to know the whole file, and our encryption, to find any pattern in the data.
The reason to pick a rapidly changing piece of data is that if you run out of data and want to generate more, you can go back to the same source again. Even small changes will, after that process, turn into an essentially uncorrelated random data set.
NTL: A Library for doing Number Theory
This was recommended by my Coding Theory and Cryptography teacher... so I guess it does the work right, and it's pretty easy to use.
RandomBnd, RandomBits, RandomLen -- routines for generating pseudo-random numbers
ZZ RandomLen_ZZ(long l);
// ZZ = pseudo-random number with precisely l bits,
// or 0 if l <= 0.
Suppose you have a random number generator that produces random numbers of X bits, and you concatenate the bits of [X1, X2, ... Xn] to create the N-bit number you want. As long as each X is random, I don't see why your large number wouldn't be random as well for all intents and purposes. And if the standard C rand() function is not secure enough, I'm sure there are plenty of other libraries (like the ones mentioned in this thread) whose pseudo-random numbers are "more random".
even very good pseudorandom generators are likely to develop patterns with millions of digits, right?
From the wikipedia on pseudo-random number generation:
A PRNG can be started from an arbitrary starting state using a seed state. It will always produce the same sequence thereafter when initialized with that state. The maximum length of the sequence before it begins to repeat is determined by the size of the state, measured in bits. However, since the length of the maximum period potentially doubles with each bit of 'state' added, it is easy to build PRNGs with periods long enough for many practical applications.
You could perhaps help create large random numbers by raising random numbers to random exponents
I assume you're suggesting something like populating the values of a scientific notation with random values?
E.g.: 1.58901231 x 10^5819203489
The problem with this is that your distribution is going to be logarithmic (or is that exponential? :) - same difference, it isn't even). You will never get a value that has the millionth digit set, yet contains a digit in the one's column.
you could try to generate a possibly non-uniform distribution on a possibly smaller range (using the real numbers, for instance) and transform
Not sure I understand this. Sounds like the same thing as the exponential solution, with the same problems. If you're talking about multiplying by a constant, then you'll get a lumpy distribution instead of a logarithmic (exponential?) one.
Suggested Solution
If you just need really big pseudo-random values with a good distribution, use a PRNG algorithm with a larger state. The maximum period of a PRNG can be as large as 2 raised to the number of state bits, so it doesn't take that many bits to fill even a really large number.
From there, you can use your first solution:
You could randomly generate each digit and concatenate
Although I'd suggest that you use the full range of values returned by your PRNG (possibly 2^31 or 2^32), and populate a byte array with those values, splitting it up as necessary. Otherwise you might be throwing away a lot of bits of randomness. Also, scaling your values to a range (or using modulo) can easily screw up your distribution, so there's another reason to try to keep the max number of bits your PRNG can return. Be careful to pack your byte array full of the bits returned, though, or you'll again introduce lumpiness to your distribution.
The problem with that solution, though, is how to fill that (larger than normal) seed state with random-enough values. You might be able to use standard-size seeds (populated via time or GUID-style population), and populate your big-PRNG state with values from the smaller PRNG. This might work if it isn't mission critical how well distributed your numbers are.
If you need truly cryptographically secure random values, the only real way to do it is to use a natural form of randomness, such as that at http://www.random.org/. The disadvantages of natural randomness are availability, and the fact that many natural-random devices take a while to generate new entropy, so generating large amounts of data might be really slow.
You can also use a hybrid and be safe - natural-random seeds only (to avoid the slowness of generation), and PRNG for the rest of it. Re-seed periodically.
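A rough sketch of that hybrid, under stated assumptions: `os.urandom` plays the role of the natural-random seed source, and Python's `random.Random` (a Mersenne Twister with a large state) plays the role of the PRNG. The Mersenne Twister is not cryptographically secure, so this only illustrates the seeding pattern and is not a recommendation for security-sensitive use; the function name is made up.

```python
import os
import random


def big_random_bits(nbits, seed_bytes=32):
    """Draw `nbits` pseudo-random bits from a PRNG seeded by the OS entropy pool."""
    # Only the seed comes from the slow, high-quality source.
    seed = int.from_bytes(os.urandom(seed_bytes), "big")
    prng = random.Random(seed)
    # getrandbits uses the generator's full output width,
    # so no bits are thrown away by scaling or modulo.
    return prng.getrandbits(nbits)


value = big_random_bits(1_000_000)
print(value.bit_length())
```

Re-seeding periodically, as suggested above, would amount to creating a fresh `random.Random` from new `os.urandom` bytes every so many draws.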

Good Idea/Bad Idea: Using Qt's QSet on very large dataset?

Is it a bad idea to use QSet to keep track of a very large set of fairly large strings? Each string is 54 characters (108 bytes). The set may contain thousands of entries (I'm not sure on the exact number yet). The QSet will only be used for insertion and membership query.
If it is a bad idea, I'm definitely open to suggestions. My 54 character strings are composed of only 6 different characters (e.g. "AAAAAAAAABBBBBBBBBCCCCCCCCCDDDDDDDDDEEEEEEEEEFFFFFFFFF"). This seems like a good candidate for compression, perhaps? Any other suggestions are welcome.
Realize that by using a built-in set, you're going to have some path-level compression based on the nature of your data. Of course, this depends on the container's implementation.
Look at some information on radix trees, digital search trees, red-black trees, etc. You'll see that you don't need to store each and every string, but rather the patterns. For instance, let's simplify your problem: we have only 3 characters that can appear a maximum of 2 times each, and each string is 6 characters long. Three possible strings are:
AABBCC, AABCBC, and AACBCB
With these examples, we could get away with using a maximum of 6 + 3 + 4 = 13 nodes instead of a full 18 nodes. Not substantial, but I don't know what you're doing either. As with any type of compression, the more your prefix patterns are reused, the more compression you have.
Edit:
The numbers 13 and 18 come from the path-level compression. For instance, in straight C (for argument/discussion), if I am implementing my string storage class as a wrapper around an array, I would probably just have an array of character pointers, with each pointer referencing a spot in memory that contains a pattern. In the example I gave above, this would take 18 characters (6 * 3 = 18). Adding on the size of the array (let's say that sizeof(char*) is 4), our array would take 3 * 4 = 12 bytes of storage, for 12 + 18 = 30 bytes total to store our patterns.
If I am instead storing the patterns in a sort of digital search tree, I make a small tradeoff. The nodes in my tree are going to be larger than 1 byte apiece (1 byte for the character in the node, 4 bytes for the "next" pointer in each node, 5 bytes apiece). The first pattern we store is AABBCC. This is 6 nodes in the tree. Next is AABCBC. We reuse the path AAB from the first tree and need only an additional 3 nodes for CBC. The last pattern is AACBCB. We reuse AA, and need 4 new nodes for CBCB. This is a total of 13 nodes * 5 bytes = 65 bytes of storage. However, if you have a lot of long, repeating patterns in the prefix of your data, then you'll see some prefix path-level compression.
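The node count in that walkthrough can be reproduced with a tiny prefix tree. This is only a sketch: it uses nested dictionaries as nodes (one dictionary per character) and a made-up function name, and it counts nodes without modelling the 5-byte node layout.

```python
def count_trie_nodes(strings):
    """Insert the strings into a prefix tree and count the nodes created."""
    root = {}
    nodes = 0
    for s in strings:
        node = root
        for ch in s:
            if ch not in node:   # prefix not seen before: a new node is needed
                node[ch] = {}
                nodes += 1
            node = node[ch]      # otherwise reuse the existing path
    return nodes


patterns = ["AABBCC", "AABCBC", "AACBCB"]
print(count_trie_nodes(patterns))        # nodes in the shared-prefix tree
print(sum(len(p) for p in patterns))     # characters if stored separately
```

For the three example patterns this gives 13 nodes against 18 characters, matching the 6 + 3 + 4 count; running it on a sample of real keys is a quick way to see how much prefix sharing the actual data has.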
If this isn't the case for you, I would look into Huffman or LZW compression. This will require you to build a dictionary of patterns that have integer numbers tied to them. When you compress, you build the dictionary and create integer id's for each pattern in your text. You then replace the patterns in your text with the integer id's. When uncompressing, you do the opposite. I don't have the time to describe these algorithms in more detail, so you'll need to look them up.
It's a tradeoff in simplicity/time. If your data will allow it, take the shorter method and just use the built-in container. If not, you will need something more tailored to your data.
I don't think you'd have any additional problems using QSet over another sort of container, such as std::set, a map, or a vector. If you are wondering about running out of memory, that probably depends on how many thousands of the strings you need to store, and if there was a way to encode them more concisely. (For example, if the characters always occur in the same order but vary in relative lengths, store the length for each character rather than all of the characters.) However, even 50,000 of these strings is only around 5 MB, and 500,000 of them is only 50 MB to store, discounting storage overhead, which is a moderate amount of memory on modern machines.
QSet does sound like a good idea. It's basically just a hash-table and it can optimize its bucket size dynamically. Perfect.
Another suggestion for compressing the key:
Treat it as a base-6 number string (think A=0, B=1, ... F=5) and convert it into binary (int).
QByteArray ba("112"); // instead of "BBC"
int num = ba.toInt(0, 6 /*base*/); // num == 44
6^3 < 2^8, so we can represent every 3 chars in your string with a 1 byte int (or char) and make a bytearray of it. That would cut down the size of the key from 54 bytes to 18 bytes.
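The same packing can be sketched outside Qt to see the size reduction end to end. This assumes the alphabet is exactly A to F and that the key length is a multiple of three; `pack_key` and `unpack_key` are made-up names for this example.

```python
ALPHABET = "ABCDEF"  # assumed: exactly these six characters occur


def pack_key(s):
    """Pack every 3 characters into one byte by reading them as a base-6 number."""
    if len(s) % 3:
        raise ValueError("length must be a multiple of 3")
    out = bytearray()
    for i in range(0, len(s), 3):
        value = 0
        for ch in s[i:i + 3]:
            value = value * 6 + ALPHABET.index(ch)
        out.append(value)  # at most 5*36 + 5*6 + 5 = 215, so it fits in a byte
    return bytes(out)


def unpack_key(packed):
    """Inverse of pack_key: expand each byte back into 3 characters."""
    chars = []
    for value in packed:
        chars.append(ALPHABET[value // 36])
        chars.append(ALPHABET[(value // 6) % 6])
        chars.append(ALPHABET[value % 6])
    return "".join(chars)


key = "".join(ch * 9 for ch in ALPHABET)  # 54 characters, 9 of each
packed = pack_key(key)
assert len(packed) == 18
assert unpack_key(packed) == key
```

Since the mapping is one-to-one, the packed bytes can be used directly as the set key, and the original string is only rebuilt when it needs to be displayed.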
From your earlier comment: "In my strings, there will always be 54 characters, and there will always be 9 of each character. The order is the only thing that changes."
Don't store raw strings then. You could just compress them down to the 6 characters actually used, and then make a QSet of those. A trivial compression would be {a,b,c,d,e,f}, and if the character set is known beforehand (and only those 6 characters), you could even pack every six characters into a 16-bit integer, since 6^6 = 46,656 < 2^16.

Hardware Cache Formulas (Parameter)

The image below was scanned (poorly) from Computer Systems: A Programmer's Perspective. (I apologize to the publisher). This appears on page 489.
Figure 6.26: Summary of cache parameters http://theopensourceu.com/wp-content/uploads/2009/07/Figure-6.26.jpg
I'm having a terribly difficult time understanding some of these calculations. At the moment, what is troubling me is the calculation for M, which is supposed to be the number of unique addresses: "Maximum number of unique memory addresses." What is 2^m supposed to mean? I think m is calculated as log2(M). This seems circular....
For the sake of this post, assume the following in the event you want to draw up an example: 512 sets, 8 blocks per set, 32 words per block, 8 bits per word
Update: All of the answers posted thus far have been helpful but I still think I'm missing something. cwrea's answer provides the biggest bridge for my understanding. I feel like the answer is on the tip of my mental tongue. I know it is there but I can't identify it.
Why does M = 2^m but then m = log2(M)?
Perhaps the detail I'm missing is that for a 32-bit machine, we'd assume M = 2^32. Does this single fact allow me to solve for m? m = log2(2^32)? But then this gets me back to 32... I have to be missing something...
m & M are related to each other, not defined in terms of each other. They call M a derived quantity however since usually the processor/controller is the limiting factor in terms of the word length it uses.
On a real system they are predefined. If you have an 8-bit processor, it generally can handle 8-bit memory addresses (m = 8). Since you can represent 256 values with 8 bits, you can have a total of 256 memory addresses (M = 2^8 = 256). As you can see, we start with the little m due to the processor constraints, but you could always decide you want a memory space of size M, and use that to select a processor that can handle it based on word-size = log2(M).
Now if we take your assumptions for your example,
512 sets, 8 blocks per set, 32 words per block, 8 bits per word
I have to assume this is an 8-bit processor given the 8-bit words. At that point your described cache is larger than your address space (256 words) & therefore pretty meaningless.
You might want to check out Computer Architecture Animations & Java applets. I don't recall if any of the cache ones go into the cache structure (usually they focus on behavior), but it is a resource I saved in the past to tutor students in architecture.
Feel free to further refine your question if it still doesn't make sense.
The two equations for M are just a relationship. They are two ways of saying the same thing. They do not indicate causality, though. I think the assumption made by the author is that the number of unique address bits is defined by the CPU designer at the start via requirements. Then the M can vary per implementation.
m is the width in bits of a memory address in your system, e.g. 32 for x86, 64 for x86-64. Block size on x86, for example, is 4K, so b=12. Block size more or less refers to the smallest chunk of data you can read from durable storage -- you read it into memory, work on that copy, then write it back at some later time. I believe tag bits are the upper t bits that are used to look up data cached locally very close to the CPU (not even in RAM). I'm not sure about the set lines part, although I can make plausible guesses that wouldn't be especially reliable.
Circular ... yes, but I think it's just stating that the two variables m and M must obey the equation. M would likely be a given or assumed quantity.
Example 1: If you wanted to use the formulas for a main memory size of M = 4GB (4,294,967,296 bytes), then m would be 32, since M = 2^32, i.e. m = log2(M). That is, it would take 32 bits to address the entire main memory.
Example 2: If your main memory size assumed were smaller, e.g. M = 16MB (16,777,216 bytes), then m would be 24, which is log2(16,777,216).
It seems you're confused by the math rather than the architectural stuff.
2^m ("2 to the m'th power") is 2 * 2... with m 2's. 2^1 = 2, 2^2 = 2 * 2 = 4, 2^3 = 2 * 2 * 2 = 8, and so on. Notably, if you have an m bit binary number, you can only represent 2^m different numbers. (is this obvious? If not, it might help to replace the 2's with 10's and think about decimal digits)
log2(x) ("logarithm base 2 of x") is the inverse function of 2^x. That is, log2(2^x) = x for all x. (This is a definition!)
You need log2(M) bits to represent M different numbers.
Note that if you start with M=2^m and take log2 of both sides, you get log2(M)=m. The table is just being very explicit.
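To see the relationships with actual numbers, here is a small sketch that plugs the question's example (512 sets, 8 blocks per set, 32 one-byte words per block) into the formulas. Several things are assumptions here: the symbol names S, E, B follow the usual CS:APP convention, the relations t = m - (s + b) and C = B * E * S are as I understand the book's summary table, and m = 32 is picked only because the question mentions a 32-bit machine.

```python
def exact_log2(value):
    """Return log2(value) for a power of two, e.g. exact_log2(512) == 9."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a power of two")
    return value.bit_length() - 1


def cache_parameters(S, E, B, m):
    """Derive the dependent quantities from sets S, lines per set E,
    block size B in bytes, and address width m in bits."""
    s = exact_log2(S)              # set index bits
    b = exact_log2(B)              # block offset bits
    return {
        "M": 2 ** m,               # number of unique addresses
        "s": s,
        "b": b,
        "t": m - (s + b),          # tag bits: whatever is left of the address
        "C": B * E * S,            # cache capacity in bytes
    }


params = cache_parameters(S=512, E=8, B=32, m=32)  # m = 32 is an assumption
print(params)
print(exact_log2(16 * 2 ** 20))    # address bits for a 16 MB memory
```

Under those assumptions the example cache holds 131,072 bytes and splits a 32-bit address into 18 tag bits, 9 set-index bits and 5 block-offset bits. Going from m to M (2^m) and from M back to m (log2) are the same relationship read in opposite directions, which is all the table is saying.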
