In the Julia language, how can one set, clear and reverse a single bit? I hope you won't consider this question out of scope or too broad; if so, please leave a comment instead of downvoting.
Given this paragraph from the Julia-Lang docs:
Currently, only sizes that are multiples of 8 bits are supported. Therefore, boolean values, although they really need just a single bit, cannot be declared to be any smaller than eight bits.
First it's possible to take a look at the binary representation of a variable like this:
julia> bits(Int(10))
"00000000000000000000000000001010"
and secondly, one can create a byte value directly using its binary form:
julia> val=0b100
0x04
julia> typeof(val)
UInt8
and lastly, the best way to change the value of a bit is to perform the right bitwise operation on its byte value:
julia> val | 0b10 # set 2nd bit
0x06
julia> bits(ans)
"00000110"
julia> val & 0b11111011 # clear 3rd bit
0x00
I assume you want to set, clear and check the states of specific bits in a byte.
Where:
N represents the integer in question;
n is the number of the bit in question (i.e. the n-th bit); and
the LSB is the 1st bit.
set the n-th bit: N |= 2^(n-1)
clear the n-th bit: N &= ~(2^(n-1))
check the state of a bit by copying and shifting: (N >> (n-1)) & 1 != 0
check the state of a bit using a bit-mask: mask = 2^(n-1); N & mask == mask
reverse/toggle/invert the n-th bit, using XOR: N ⊻= 2^(n-1). The xor function may also be used. (See the sketch after this list.)
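For concreteness, here is a minimal C sketch of the same formulas (the helper names are made up for illustration; C writes XOR as ^ where Julia writes ⊻):

#include <stdint.h>
#include <stdio.h>

/* n is 1-based, so the LSB is bit 1, matching the formulas above. */
static uint8_t set_bit(uint8_t N, int n)    { return N |  (uint8_t)(1u << (n - 1)); }
static uint8_t clear_bit(uint8_t N, int n)  { return N & ~(uint8_t)(1u << (n - 1)); }
static uint8_t toggle_bit(uint8_t N, int n) { return N ^  (uint8_t)(1u << (n - 1)); }
static int     test_bit(uint8_t N, int n)   { return (N >> (n - 1)) & 1; }

int main(void) {
    uint8_t val = 0x04;                     /* 00000100 */
    printf("0x%02X\n", set_bit(val, 2));    /* 0x06 */
    printf("0x%02X\n", clear_bit(val, 3));  /* 0x00 */
    printf("0x%02X\n", toggle_bit(val, 3)); /* 0x00 */
    printf("%d\n", test_bit(val, 3));       /* 1 */
    return 0;
}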
It is my understanding that numbers are negated using two's complement, which to my understanding is: ~num + 1.
So my question is: does this mean that, for a variable 'foo' = 1, a negated 'foo' will be exactly the same as a variable 'bar' = 255?
If we were to check whether -'foo' == 'bar' or whether -'foo' == 255, would we get that they are equal?
I know that some languages, such as Java, keep a sign bit -- so the comparisons would yield false. What of languages that do not? And I'm assuming that assembler/native machine does not have a sign bit.
In addition to all of this, I read about a zero flag or a carry flag that is set when a 'negative' number is added to another number (of any sign). The flag is set because of the way two's complement works: 0x01 + 0xFF = 0x00 (with the leading 1 truncated). What exactly is this flag used for?
And my last question: for other math operations (such as multiplication), would I have to re-negate the number (so it is now positive), perform the operation, and negate the result? E.g., ~((~neg + 1) * pos) + 1.
Edit
Finished the question, so feel free to fire away.
Yes, in two's complement, the negation of a number x is represented as ~x+1, where ~x is the bitwise complement of the binary numeral for x in some fixed number of bits. E.g., for eight bits, the binary numeral for 1 is 00000001, so the bitwise complement is 11111110, and adding one produces 11111111, which is the representation of -1.
There is no way to distinguish -1 in eight-bit two’s complement from 255 in eight-bit binary (with no sign). They both have the same representation in bits: 11111111. If you are using both of these numbers, you must either separately remember which one is eight-bit two’s complement and which one is plain eight-bit binary or you must use more than eight bits. In other words, at the raw bit level, 11111111 is just eight bits; it has no value until we decide how to interpret it.
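As a minimal C illustration (the cast back to a signed type is formally implementation-defined, but behaves as shown on ordinary two's-complement machines):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t foo = 1;
    uint8_t neg = (uint8_t)(~foo + 1); /* two's-complement negation: ~x + 1 */
    printf("%d\n", neg);               /* 255: the same eight bits as -1 */
    printf("%d\n", (int8_t)neg);       /* -1 when reinterpreted as signed */
    return 0;
}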
Java and typical other languages do not maintain a sign bit separate from the value of a number; the sign is part of the encoding of the number. Also, typical languages do not allow you to compare different types. If you have a two's complement x and an unsigned y, then either one must be converted to the type of the other before comparison or they must both be converted to a third type. If one is converted to the other's type, the conversion will overflow or wrap, and you cannot expect to get the correct mathematical result. To compare these two numbers, we might instead convert each of them to a wider integer, such as 32 bits, then compare. Converting the eight-bit two's complement 11111111 to a 32-bit integer produces -1, converting the eight-bit plain binary 11111111 to a 32-bit integer produces 255, and the comparison then reports they are unequal.
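A small C sketch of that widening comparison (assuming the fixed-width int8_t/uint8_t types):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int8_t  s = -1;  /* bits: 11111111 */
    uint8_t u = 255; /* bits: 11111111 */
    /* Both operands are widened to int before comparing, which preserves
       their values (-1 and 255), so the comparison reports unequal. */
    printf("%s\n", s == u ? "equal" : "unequal"); /* unequal */
    return 0;
}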
The zero flag and the carry flag you read about are flags that are set when a comparison instruction is executed in a computer processor. Most high-level languages do not give you direct access to these flags. Many processors have an instruction with a form like this:
cmp a, b
That instruction subtracts b from a and discards the difference but remembers several flags that describe the subtraction: Was the result zero (zero flag)? Did a borrow occur (borrow flag)? Was the result negative (sign flag)? Did an overflow occur (overflow flag)?
The compare instruction requires that the two things being compared be the same type (two’s complement or unsigned), but it does not care which type. The results can be tested later by checking particular combinations of the flags depending on the type. That is, the information recorded in the flags can distinguish whether one two’s complement number was greater than another or whether one unsigned number was greater than another, depending on what tests are made. There are conditional branch instructions that test the desired flag properties.
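Those flag computations can be sketched in C (a hypothetical helper, not any particular processor; real CPUs differ in details such as whether the carry flag records the borrow directly, as on x86, or inverted, as on the 6502):

#include <stdint.h>

struct Flags { int zero, carry, sign, overflow; };

/* Derive the flags that "cmp a, b" (i.e. a - b) would set for 8-bit operands. */
static struct Flags cmp8(uint8_t a, uint8_t b) {
    uint8_t r = (uint8_t)(a - b);
    struct Flags f;
    f.zero     = (r == 0);                        /* result was zero      */
    f.carry    = (a < b);                         /* a borrow occurred    */
    f.sign     = (r & 0x80) != 0;                 /* bit 7 of the result  */
    f.overflow = ((a ^ b) & (a ^ r) & 0x80) != 0; /* signed overflow      */
    return f;
}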
There is generally no need to “un-negate” a number to perform arithmetic operations. Processors include arithmetic instructions that work on two’s complement numbers. Usually the add and subtract instructions are type-agnostic, the same way the compare instruction is, but the multiply and divide instructions are not (except for some forms of multiply that return partial results). The add and subtract instructions can be type-agnostic because the wrapping that occurs in the arithmetic works for both two’s complement and unsigned. However, that wrapping does not work for multiplication and division.
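A short C illustration of why add can be type-agnostic while a widening multiply cannot (values chosen for the example):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t a = 0xFF; /* 255 unsigned, -1 in two's complement */

    /* Wrapped 8-bit addition: the result bits are the same either way. */
    printf("0x%02X\n", (uint8_t)(a + 2)); /* 0x01, whether a "means" 255 or -1 */

    /* Widening multiplication: the two interpretations diverge. */
    int sp = (int8_t)a * 2; /* -1 * 2 = -2   (16-bit pattern 0xFFFE) */
    int up = a * 2;         /* 255 * 2 = 510 (16-bit pattern 0x01FE) */
    printf("%d vs %d\n", sp, up);
    return 0;
}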
I'm writing an emulator for the 6502, and basically, there are some instructions where there's an offset saved in one of the registers (mostly X and Y). I'm wondering, since branch instructions use signed 8-bit integers, do the registers keep their values as 8-bit signed? Meaning this:
switch(opcode) {
//Bunch of opcodes
case 0xD5: // CMP zero page,X
//Read the memory area with final address being address + x offset
int tempResult = a - readMemory(address + x);
//Comparing some things, setting/disabling flags
//Incrementing program counter and cycles/ticks
break;
//More opcodes
}
Let's say in this situation that x = 0xEE. In regular binary, this would mean that x = 238. In the 6502 however, the branch instruction uses signed offset for jumping to memory addresses, so I'm wondering, is the 238 interpreted as -18 in this case, or is it just regular unsigned 8 bit value?
It varies.
They're not explicitly signed or unsigned for arithmetic, logical, shift, or load and store operations.
The conditional branches (and the unconditional one on the later 6502 descendants) all take the argument as signed; otherwise loops would be extremely awkward. (See the C sketch after this list.)
zero, x addressing is achieved by performing an 8-bit addition of x to the zero page address, ignoring carry, and reading from the zero page. So e.g.
LDX #-126 ; which is +130 if unsigned
LDA 23, x
Would read from address 23+130 = 153. But had the base been 223, the read would have been from (223 + 130) MOD 256 = 97.
absolute, x/y is unsigned and carry works correctly (but costs an extra cycle)
(zero, x) is much like the direct version in that the offset is signed but the result is always within the zero page. Then the real address is read from there.
(zero), y is unsigned with carry working and costing.
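To tie this back to the emulator question: a minimal C sketch of how one might handle the signed branch offset and the wrapping zero,x addition (the function names are illustrative, not from any particular emulator):

#include <stdint.h>

/* The operand byte is stored unsigned (0xEE == 238), but a relative branch
   treats it as two's complement (0xEE == -18); the int8_t cast sign-extends. */
static uint16_t branch_target(uint16_t pc, uint8_t offset) {
    return (uint16_t)(pc + (int8_t)offset);
}

/* zero,x: 8-bit add with the carry discarded, i.e. (base + x) mod 256,
   so the effective address always stays within the zero page. */
static uint8_t zero_page_x(uint8_t base, uint8_t x) {
    return (uint8_t)(base + x);
}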
The "sign" is simply the value of the most significant (aka bit 7) in an 8-bit byte.
The 6502 has support for signed values in these ways:
The N bit in .P - but it really just tells you whether the last instruction turned bit 7 of a memory location or register on or off. It was common to use BPL/BMI to do things based on bit 7 in a memory location, for flag or "boolean"-like use.
The V bit of .P which is flipped "when the result of adding two positive numbers overflows and ends up negative, and when the result of adding two negative numbers overflows and ends up positive"
And of course the sign bit is obeyed for relative branch instructions only, e.g. BEQ with an operand with bit 7 set will move to a lower memory location, not a higher one.
Beyond that, whether that bit means anything is completely up to you and your program. What really makes numbers signed or unsigned is how you display the numbers.
The linked article above goes into what one's complement and two's complement are and how they make the mathematics work without the 6502 having to care too much about the sign.
I understand that there are a few questions already addressing this. However, my question differs somewhat. Suppose I have to get the LSB of a value (hexadecimal) stored in a register; for example:
If register $t0 contains the value 0xA4, I need to obtain and store the value 4
If register $t0 contains the value 0xBF, I need to obtain and store the value F
I understand that the bitwise AND operation works for decimal values. Could someone please provide some assistance as to how I go about getting the LSB?
You can simply AND the number from which you want to extract the LSB with a mask, like this:
0xA4 AND 0x0F
is the same as (in binary)
10100100b AND 00001111b
this basically means that only the last four digits will be extracted from the binary number, and that is the LSB you want.
All the bitwise operations work on general-purpose registers together with masks, which are merely plain numbers (regardless of the base they are written in).
Even though x86 isn't MIPS, you should have something like this:
and EAX, 0xF
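In C terms the same mask looks like the sketch below; on MIPS itself this would be an AND-immediate instruction, e.g. andi $t1, $t0, 0x000F.

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t t0 = 0xA4;
    uint8_t low_digit = t0 & 0x0F; /* keep only the low four bits */
    printf("0x%X\n", low_digit);   /* prints 0x4 */
    return 0;
}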
Greetings everybody. I have seen examples of such operations so many times that I begin to think I am getting something wrong with binary arithmetic. Is there any sense in performing the following:
byte value = someAnotherByteValue & 0xFF;
I don't really understand this, because it does not change anything anyway. Thanks for help.
P.S.
I was trying to search for information both elsewhere and here, but unsuccessfully.
EDIT:
Well, of course I assume that someAnotherByteValue is 8 bits long; the problem is that I don't get why so many people (I mean professionals) use such things in their code. For example, in SharpZlib there is:
buffer_ |= (uint)((window_[windowStart_++] & 0xff |
(window_[windowStart_++] & 0xff) << 8) << bitsInBuffer_);
where window_ is a byte buffer.
The most likely reason is to make the code more self-documenting. In your particular example, it is not the size of someAnotherByteValue that matters, but rather the fact that value is a byte. This makes the & redundant in every language I am aware of. But, to give an example of where it would be needed, if this were Java and someAnotherByteValue was a byte, then the line int value = someAnotherByteValue; could give a completely different result than int value = someAnotherByteValue & 0xff. This is because Java's long, int, short, and byte types are signed, and the rules for conversion and sign extension have to be accounted for.
If you always use the idiom value = someAnotherByteValue & 0xFF then, no matter what the types of the variable are, you know that value is receiving the low 8 bits of someAnotherByteValue.
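The same pitfall can be shown in C, where a signed char sign-extends when widened (a minimal sketch; the initial conversion of 0x90 to signed char is implementation-defined, but behaves as shown on typical two's-complement platforms):

#include <stdio.h>

int main(void) {
    signed char b = (signed char)0x90; /* bit 7 set */
    int widened = b;                   /* sign-extends: -112 */
    int masked  = b & 0xFF;            /* low 8 bits only: 144 */
    printf("%d %d\n", widened, masked);
    return 0;
}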
uint s1 = (uint)(initial & 0xffff);
There is a point to this because uint is 32 bits, while 0xffff is 16 bits. The line selects the 16 least significant bits from initial.
Nope, there is no use in doing this. If the value carried more than 8 significant bits, then the above statement would have some meaning. Otherwise, it's the same as the input.
If sizeof(someAnotherByteValue) is more than 8 bits and you want to extract the least significant 8 bits from someAnotherByteValue, then it makes sense. Otherwise, there is no use.
No, there is no point so long as you are dealing with a byte. If value was a long then the lower 8 bits would be the lower 8 bits of someAnotherByteValue and the rest would be zero.
In a language like C++ where operators can be overloaded, it's possible but unlikely that the & operator has been overloaded. That would be pretty unusual and bad practice though.
EDIT: Well, of course I assume that someAnotherByteValue is 8 bits long; the problem is that I don't get why so many people (I mean professionals) use such things in their code. For example, in Jon Skeet's MiscUtil there is:
uint s1 = (uint)(initial & 0xffff);
where initial is int.
In this particular case, the author might be trying to convert an int to a uint. The & with 0xffff would ensure that only the lowest 2 bytes are converted, even on a system whose int type is not 2 bytes wide.
To be picky, there is no guarantee regarding a machine's byte size. There is no reason to assume, in an extremely portable program, that the architecture's byte is 8 bits wide. To the best of my memory, according to the C standard (for example), a char is one byte, short is wider than or the same as char, int is wider than or the same as short, long is wider than or the same as int, and so on. Hence, theoretically there can be a compiler where a long is actually one byte wide, and that byte will be, say, 32 bits wide. Now, to ensure your program behaves the same on that machine, you need to use that (seemingly redundant) coding style.
"Byte" # Wikipedia gives examples for such peculiar architectures.
Why is only XOR used in cryptographic algorithms, while other logic gates like OR, AND, and NOR are not?
It isn't exactly true to say that the logical operation XOR is the only one used throughout all cryptography; however, it is the only one that provides two-way encryption when used exclusively.
Here is that explained:
Imagine you have a string of binary digits, 10101,
and you XOR the string 10111 with it: you get 00010.
Now your original string is encoded and the second string becomes your key.
If you XOR your key with your encoded string, you get your original string back.
XOR allows you to easily encrypt and decrypt a string; the other logic operations don't.
If you have a longer string, you can repeat your key until it's long enough:
for example, if your string was 1010010011, then you'd simply write your key twice and it would become 1011110111, and XOR it with the new string.
Here's a Wikipedia link on the XOR cipher.
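A minimal C sketch of that repeating-key XOR scheme (illustrative only; a repeating key is insecure in practice, unlike a true one-time pad):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* XOR buf with key, repeating the key as needed; running this twice
   with the same key restores the original bytes. */
static void xor_with_key(uint8_t *buf, size_t len,
                         const uint8_t *key, size_t keylen) {
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % keylen];
}

int main(void) {
    uint8_t msg[] = "attack at dawn";
    const uint8_t key[] = { 0xB9, 0x4A, 0x17 };
    size_t len = strlen((char *)msg);

    xor_with_key(msg, len, key, sizeof key); /* encrypt */
    xor_with_key(msg, len, key, sizeof key); /* decrypt */
    printf("%s\n", msg);                     /* attack at dawn */
    return 0;
}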
I can see 2 reasons:
1) (Main reason) XOR does not leak information about the original plaintext.
2) (Nice-to-have reason) XOR is an involutory function, i.e., if you apply XOR twice, you get the original plaintext back (i.e., XOR(k, XOR(k, x)) = x, where x is your plaintext and k is your key). The inner XOR is the encryption and the outer XOR is the decryption, i.e., the exact same XOR function can be used for both encryption and decryption.
To exemplify the first point, consider the truth-tables of AND, OR and XOR:
And
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1 (Leak!)
Or
0 OR 0 = 0 (Leak!)
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
XOR
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
Everything in the first column is our input (i.e., the plaintext). The second column is our key, and the last column is the result of the input "mixed" (encrypted) with the key using the specific operation (i.e., the ciphertext).
Now, imagine an attacker got access to some encrypted byte, say: 10010111, and he wants to get the original plaintext byte.
Let's say the AND operator was used in order to generate this encrypted byte from the original plaintext byte. If AND was used, then we know for certain that every time we see the bit '1' in the encrypted byte, the input (i.e., the first column, the plaintext) MUST also be '1', as per the truth table of AND. If the encrypted bit is a '0' instead, we do not know if the input (i.e., the plaintext) is a '0' or a '1'. Therefore, we can conclude that the original plaintext is: 1 _ _ 1 _ 1 1 1. So 5 bits of the original plaintext were leaked (i.e., could be accessed without the key).
Applying the same idea to OR, we see that every time we find a '0' in the encrypted byte, we know that the input (i.e., the plaintext) must also be a '0'. If we find a '1', then we do not know if the input is a '0' or a '1'. Therefore, we can conclude that the input plaintext is: _ 0 0 _ 0 _ _ _. This time we were able to leak 3 bits of the original plaintext byte without knowing anything about the key.
Finally, with XOR, we cannot get any bit of the original plaintext byte. Every time we see a '1' in the encrypted byte, that '1' could have been generated from a '0' or from a '1'. Same thing with a '0' (it could come from either a '0' or a '1'). Therefore, not a single bit is leaked from the original plaintext byte.
The main reason is that if a random variable R1 with unknown distribution is XORed with an independent random variable R2 with uniform distribution, the result is a random variable with uniform distribution. So you can easily randomize a biased input, which is not possible with the other binary operators.
The output of XOR always depends on both inputs. This is not the case for the other operations you mention.
I think it's because XOR is reversible. If you want to create a hash, then you'll want to avoid XOR.
XOR is the only gate that's used directly because, no matter what one input is, the other input always has an effect on the output.
However, it is not the only gate used in cryptographic algorithms. That might be true of old-school cryptography, the type involving tons of bit shuffles and XORs and rotating buffers, but for prime-number-based crypto you need all kinds of mathematics that is not implemented through XOR.
XOR acts like a toggle switch where you can flip specific bits on and off. If you want to "scramble" a number (a pattern of bits), you XOR it with a number. If you take that scrambled number and XOR it again with the same number, you get your original number back.
210 XOR 145 gives you 67 <-- Your "scrambled" result
67 XOR 145 gives you 210 <-- ...and back to your original number
When you "scramble" a number (or text or any pattern of bits) with XOR, you have the basis of much of cryptography.
XOR uses fewer transistors (it can be built from 4 NAND gates) than more complicated operations (e.g. ADD, MUL), which makes it good to implement in hardware when gate count is important. Furthermore, an XOR is its own inverse, which makes it good for applying key material (the same code can be used for encryption and decryption). The beautifully simple AddRoundKey operation of AES is an example of this.
For symmetric crypto, the only real choices for operations that mix bits into the cipher without changing the length are add with carry, add without carry (XOR), and compare (XNOR). Any other operation either loses bits, expands the data, or is not available on CPUs.
The XOR property (a xor b) xor b = a comes in handy for stream ciphers: to encrypt n-bit-wide data, a pseudo-random sequence of n bits is generated using the crypto key and crypto algorithm.
Sender:
Data: 0100 1010 (0x4A)
pseudo random sequence: 1011 1001 (0xB9)
------------------
ciphered data 1111 0011 (0xF3)
------------------
Receiver:
ciphered data 1111 0011 (0xF3)
pseudo random sequence: 1011 1001 (0xB9) (receiver has key and computes same sequence)
------------------
0100 1010 (0x4A) Data after decryption
------------------
Let's consider the three common bitwise logical operators. Let's say we can choose some number (let's call it the mask) and combine it with an unknown value:
AND is about forcing some bits to zero (those that are set to zero in the mask)
OR is about forcing some bits to one (those that are set to one in the mask)
XOR is more subtle: you can't know for sure the value of any bit of the result, whatever mask you choose. But if you apply your mask two times, you get back your initial value.
In other words, the purpose of AND and OR is to remove some information, and that's definitely not what you want in cryptographic algorithms (symmetric or asymmetric ciphers, or digital signatures). If you lose information you won't be able to get it back (decrypt), or a signature would tolerate some minute changes in the message, thus defeating its purpose.
All that said, that is true of cryptographic algorithms, not of their implementations. Most implementations of cryptographic algorithms also use many ANDs, usually to extract individual bytes from 32- or 64-bit internal registers.
You typically get code like this (a nearly random extract from aes_core.c):
rk[ 6] = rk[ 0] ^
(Te2[(temp >> 16) & 0xff] & 0xff000000) ^
(Te3[(temp >> 8) & 0xff] & 0x00ff0000) ^
(Te0[(temp ) & 0xff] & 0x0000ff00) ^
(Te1[(temp >> 24) ] & 0x000000ff) ^
rcon[i];
rk[ 7] = rk[ 1] ^ rk[ 6];
rk[ 8] = rk[ 2] ^ rk[ 7];
rk[ 9] = rk[ 3] ^ rk[ 8];
8 XORs and 7 ANDs if I count right
XOR is a logical operation, a basic mathematical building block in cryptography. There are other operations as well: AND, OR, NOT, the modulo function, etc. XOR is the most important and the most used.
If the two input bits are the same, the output is 0.
If they are different, the output is 1.
Example:
Message : Hello
Binary Version of Hello : 01001000 01100101 01101100 01101100 01101111
Key-stream : 11000110 10100011 01011010 11001101 00100101
Cipher text using XOR : 10001110 11000110 00110110 10100001 01001010
Applications : The one-time pad (Vernam cipher) uses the exclusive-or function: the receiver has the same key-stream and receives the ciphertext over a covert transport channel. The receiver then XORs the ciphertext with the key-stream in order to reveal the plaintext, Hello. In a one-time pad, the key-stream should be at least as long as the message.
Fact : The one-time pad is the only truly unbreakable encryption.
Exclusive OR is used in the Feistel structure of the DES block cipher.
Note : with a uniformly random key bit, the XOR operation has a 50% chance of outputting 0 or 1.
I think it's simply because, given some random set of binary numbers, a large number of 'OR' operations would tend towards all '1's; likewise, a large number of 'AND' operations would tend towards all zeroes. Whereas a large number of 'XOR's produces a random-ish mix of ones and zeroes.
This is not to say that AND and OR are not useful, just that XOR is more useful.
The prevalence of OR/AND and XOR in cryptography is for two reasons:
One, these are lightning-fast instructions.
Two, they are difficult to model using conventional mathematical formulas.