How do I Base64 encode a 2 byte sequence? - networking

I'm given a 2 byte sequence and asked to Base64 encode it:
00000001 00010001
From what I understand you can only encode sequences of 6 bits when working with Base64.
So because 16 bits is not divisible by 6 I'm a little stuck.
The solution I can see is to convert the given 2 byte sequence into a 3 byte sequence so it becomes divisible by 6. But how do I do this without changing the value of the initial sequence?

Basically, you pad it out with zeroes to the next multiple of 6 bits, and pad out the last four-character sequence with =s. Since the last two zero bytes don't make up a full input byte, the decoder knows to ignore them. (The = padding isn't totally necessary, but it's customary to make the end result always a multiple of 4 characters long.)
For instance, the sequence you've got is:
00000001 00010001
Breaking that up into groups of 6, we get:
000000 010001 0001
Pad with zeroes:
000000 010001 000100
Convert to ASCII:
ARE
And pad that out:
ARE=

Related

When dealing with hexadecimal numbers how do I use bit offset and length to get a value?

I have the following number
0000C1FF61A40000
The offset or start is 36 or 0x23
The length of the number is 12 or 0xc
Can someone help me understand how to get the resulting value? I thought the offset meant what pair of hex numbers to start with and then length would be how many to grab. There definitely aren't 36 pairs, only 8. Not sure how I'd do a length of 12 with only 8.
Each hex digit represents four binary bits. Therefore your offset of 36 bits (which BTW is 0x24, not 0x23) is equivalent to 9 hex digits. So discard the rightmost 9 digits from your original number, leaving you with 0000C1F.
Then the length of the number you want is 12 bits, which is 3 hex digits. So discard all but the rightmost 3 digits, leaving you with C1F as the answer.
If the numbers of bits had not been nice multiples of 4 then you would have had to convert the original hex number into binary, then discard offset number of bits from the right, retain only the rightmost length bits from the result, and finally convert those length bits back into hex.

How to convert 0x80 in a 7-bit variable-length integer?

Im reading a book about network protocol structures.
There is an illustration in a chapter about variable-length quantity, which I dont fully understand.(see attachment)
The subject is to convert different numbers to variable-length 7-bit integers.
The first line shows that 0x3F is stored in a single octet as 0x3F.
The second line shows that 0x80 is stored in two octets one as 0x80 and second as 0x01.
However I dont understand why its not 0x81 in the first octet and 0x00 in the second.
Because according to wikipedia, converting numbers into variable-length 7bit integers goes as follows:
Represent the value in binary notation (e.g. 137 as 10001001)
Break it up in groups of 7 bits starting from the lowest significant bit (e.g. 137 as 0000001 0001001). This is equivalent to representing the number in base 128.
Take the lowest 7 bits and that gives you the least significant byte (0000 1001). This byte comes last.
For all the other groups of 7 bits (in the example, this is 000 0001), set the MSB to 1 (which gives 1000 0001 in our example). Thus 137 becomes 1000 0001 0000 1001 where the bits in boldface are something we added. These added bits denote if there is another byte to follow or not. Thus, by definition, the very last byte of a variable length integer will have 0 as its MSB.
So lets do these steps for 0x80:
binary notation: 1000 0000
in groups of 7 bits starting from LSB: 0000001 0000000
and 4. set MSB as described: 1000 0001 0000 0000
Converting that binary number into two hex octets, gives me 0x81 and 0x00.
Which leads me to the question: Is there a printing fail in the book or did I missunderstood something?
Which book is that?
There may be many possible encoding schemes. One of them could go like this:
1. Represent the value in binary notation (e.g. 0x80 as 10000000)
2. Break it up in groups of 7 bits starting from the lowest significant bit: 0000001 0000000
3. Start with the lowest 7 bits: if this is *not* the last group of 7 bits, then set MSB: 10000000; if it's the last, then leave it
alone: 00000001
4. Output starting LSB first: 10000000 00000001, i.e. 0x80 0x01
So what does the book say? What encoding scheme are they using?

When does a hexidecimal number pivot to letters rather than numbers

Assume this number 173250103518582539668252657343418508842, if I wanted to convert it to a hexadecimal number such that a 10 = F, 11 = E, etc. where are the breaks/how does that work?
I've done a bit of research online and I can't seem to find the answer. It's a really low-level question, I know.
6 characters in there's a 10, would that be flipped to an F or would that get missed because whatever triggers the flip in the int -> string hexadecimal conversion happens another way?
Hexadecimal is an encoding used to express binary data in base-16 where the ascending sequence is 0-9a-f (upper or lower case a-f), once character per 4-bits (4-bits has 16 possible values). Thus 2 hex characters per byte.
binary bits (msb on left) and hexadecimal:
0000 0
0001 1
0010 2
0011 3
...
1001 9
1010 a
...
1111 f
To say "10 = F, 11 = E" is not hexadecimal.
To encode the decimal number 173250103518582539668252657343418508842 convert it is a Big Integer and then hexadecimal encode the underlying bytes to hexadecimal.
or
To encode the ASCI string "173250103518582539668252657343418508842" to hexadecimal convert each letter to the underlying ASCII binary code and then encode that into hexadecimal: "313733323530313033353138353832353339363638323532363537333433343138353038383432".
See Hexadecimal and ASCII.
Aside: My first day as a programmer I had to know hex, binary and ASCII encoding, funny how things change.

What's the significance of the bit group size in base64 or base32 encoding (RFC 4648)?

Why would they chose to use a 24-bit or 40-bit (that's really odd) bit group/word size for base 64 and base 32 respectively.
Specifically, can someone explain why the the least common multiple is significant?
lcm(log2(64), 8) = 24
lcm(log2(32), 8) = 40
Base 64 encoding basically involves taking a stream of 8-bit bytes and transforming it to a stream of 6-bit characters that can be represented by printable ASCII characters.
Taking a single byte at a time means you have one 6 bit character with 2 bits left over.
Taking two bytes (16 bits) means you have two 6-bit characters with 4 bits left over.
Taking 3 bytes (24 bits) means you have three bytes that can be split exactly into 4 characters with no bits left over.
So the lcm of bytes size and character size is naturally the size you need to split your input into.
6 bit characters are chosen because this is the largest size that you can use printable ascii characters for all values. If you went up to 7 bits you would need non-printing characters.
The argument for base 32 is similar, but now you are using 5-bit characters, so the lcm of 8 and 5 is the word size. This character size allows for case insensitive printable characters, 6 bit characters require differentiating between upper and lower cases.

Using Hexadecimal to represent a 7 bit code

My computer uses ASCII, American Standard Code for Information Exchange.
It is my understanding that this uses a 7 bit code to represent all the letters, symbols, and numbers needed for the english language.
It is my understanding that these 7 bits can be represented with hexadecimal codes.
I thought that hexadecimal needed 8 bits. 4 bits per number.
Can some one explain to me how the hexadecimal system can be used to represent the codes in the 7 bit ASCII system.
Thanks in advance.
Hexadecimal numbers don't need eight bits, each hex digit can represent four bits but there's no upper limit, since you can just use more digits:
0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffffff...
For representing seven-bit values, you can just use the lower half of the eight-bit hex numbers, 0x00 through 0x7f.
That gives you the binary numbers 0000000 through 1111111.
With 7 bits you can represent every number from 0 to 2^7 = 127 (decimal) = 7f (hexadecimal).
Hexadecimal doesn't need 7 bits, it is just another way to write numbers.
You can combine any combination of up to 4 bits into one hexadecimal character.
1 bit: 0 to 1
2 bits: 0 to 3
3 bits: 0 to 7
4 bits: 0 to F
For 7 bits you need 2 hex digit, one coding 3 bits and the other 4 bits so you get a code of 00 to 7F.
Or you use 8 bits, but the most significant bit is always 0.

Resources