Is there a way to do a right bit-shift on a BigInt in Rust?

I get this error when attempting to do >> or >>= on a BigInt:
no implementation for `BigInt >> BigInt`
I am using the num_bigint::BigInt type from the num-bigint crate.
Edit: More context:
I am rewriting this program https://www.geeksforgeeks.org/how-to-generate-large-prime-numbers-for-rsa-algorithm/ from Python/C++ into Rust. I will focus on the Python implementation, as it is written to handle 1024-bit prime numbers, which are extremely big.
In the code we run the Miller-Rabin primality test, which includes shifting EC (the prime candidate minus 1) to the right by 1 whenever EC % 2 == 0. As I mentioned, in the Python implementation EC can be an incredibly large integer.
It would be convenient to be able to use the same operator in Rust; if that is not possible, can someone suggest an alternative?

According to the documentation for the num-bigint crate, BigInt does implement the Shr trait for the right-shift operator, just not when the shift amount is itself a BigInt. If you convert the shift amount to a standard integer type (e.g. i64), it will work.
It is unlikely you would ever want to shift by an amount greater than i64::MAX, but if you do need this, the correct result for a non-negative value is zero (a number with that many bits would require 2^60 bytes of memory, which no computer has), so you can write a simple implementation which checks for that case.
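A minimal sketch of that fallback, assuming the num-bigint and num-traits crates and a non-negative shift amount (the helper name shr_big is made up for illustration):

use num_bigint::BigInt;
use num_traits::{ToPrimitive, Zero};

// Right-shift a BigInt by a BigInt amount by first narrowing the
// amount to a primitive integer. Assumes `amount` is non-negative.
fn shr_big(value: &BigInt, amount: &BigInt) -> BigInt {
    match amount.to_u64() {
        Some(n) => value >> n,
        // The amount doesn't fit in u64, so every bit of any
        // non-negative value that fits in memory is shifted out.
        None => BigInt::zero(),
    }
}

In the Miller-Rabin loop from the question the shift amount is always the constant 1, though, so a plain ec >>= 1u32; sidesteps the problem entirely.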

Related

Assert that a pointer is aligned to some value

Is there a guaranteed way to assert that a given raw pointer is aligned to some alignment value?
I looked at the pointer's align_offset function, but the docs state that it is permissible for it to give false negatives (i.e. to always return usize::MAX), and that correctness cannot depend on it.
I don't want to fiddle with the alignment at all, I just want to write an assertion that will panic if the pointer is unaligned. My motivation is that when using certain low-level CPU intrinsics passing a pointer not aligned to some boundary causes a CPU error, and I'd much rather get a Rust panic message pointing where the bug causing it is located than a SEGFAULT.
An example assertion (not correct according to the align_offset docs):
#[repr(align(64))]
struct A64(u8);

#[repr(align(32))]
struct A32(u8);

#[repr(align(8))]
struct A8(u8);

fn main() {
    let a64 = [A64(0)];
    let a32 = [A32(0)];
    let a8 = [A8(0), A8(0)];

    println!("Assert for 64 should pass...");
    assert_alignment(&a64);
    println!("Assert for 32 should pass...");
    assert_alignment(&a32);
    println!("Assert for 8, one of the following should fail:");
    println!("- full array");
    assert_alignment(&a8);
    println!("- offset by 8");
    assert_alignment(&a8[1..]);
}

fn assert_alignment<T>(a: &[T]) {
    let ptr = a.as_ptr();
    assert_eq!(ptr.align_offset(32), 0);
}
Just to satisfy my own neuroses, I went and checked the source of ptr::align_offset.
There's a lot of careful work around edge cases (e.g. when const-evaluated it always returns usize::MAX, similarly for a pointer to a zero-sized type, and it panics if the alignment is not a power of 2). But the crux of the implementation, for your purposes, is that it takes (ptr as usize) % alignment == 0 to check whether the pointer is aligned.
Edit:
This PR is adding a ptr::is_aligned_to function, which is much more readable, and also safer and better reviewed, than simply (ptr as usize) % alignment == 0 (though the core of it is still that logic).
There's then some more complexity to calculate the exact offset (which may not be possible), but that's not relevant for this question.
Therefore:
assert_eq!(ptr.align_offset(alignment), 0);
should be plenty for your assertion.
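If you want the panic to say what went wrong, a thin wrapper over that assertion is enough. A minimal sketch (the helper name assert_aligned is made up; align must be a power of two, as align_offset requires):

// Panic with a readable message if `ptr` is not aligned to `align` bytes.
fn assert_aligned<T>(ptr: *const T, align: usize) {
    assert_eq!(
        ptr.align_offset(align),
        0,
        "pointer {:p} is not aligned to {} bytes",
        ptr,
        align
    );
}

fn main() {
    let x: u64 = 0;
    assert_aligned(&x as *const u64, 8); // u64 is 8-byte aligned on typical 64-bit targets
}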
Incidentally, this proves that the current Rust standard library cannot target anything that does not represent pointers as simple numerical addresses, otherwise this function would not work. In the unlikely event that the Rust standard library is ported to the Intel 8086 or to some weird DSP that doesn't represent pointers in the expected way, this function would have to change. But really, do you care about that hypothetical that much?

How does Erlang store large numbers?

I have been tinkering with a factorial module as follows:
-module(factorial).
-export([factorial/1]).

factorial(0) ->
    1;
factorial(Val) ->
    Val * factorial(Val - 1).
If I run:
1> c(factorial).
{ok,factorial}
2> factorial:factorial(100).
I get:
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
How does Erlang so effortlessly hold such large numbers? When erlang.org talks about number types, it simply states that they hold either integers or floats. It must be some kind of dynamic integer that adjusts its byte size as necessary?
I find this very cool; I just don't know how it's done.
It is a common feature of many functional programming languages, called arbitrary-precision arithmetic.
Note that in Erlang, arbitrary precision is available only for integers, not floats.
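For comparison, here is what the same computation looks like in a language without built-in bignums, leaning on a library instead. A minimal sketch using the num-bigint crate from the first question:

use num_bigint::BigUint;
use num_traits::One;

// n! with arbitrary-precision integers, which Erlang gets natively.
fn factorial(n: u32) -> BigUint {
    (1..=n).fold(BigUint::one(), |acc, i| acc * i)
}

fn main() {
    // Prints the same 158-digit number the Erlang shell shows for 100!.
    println!("{}", factorial(100));
}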

Floating point math on a processor that does not support it?

How is floating-point math performed on a processor with no floating-point unit, e.g. low-end 8-bit microcontrollers?
Have a look at this article: http://www.edwardrosten.com/code/fp_template.html
(from this article)
First you have to think about how to represent a floating point number in memory:
struct this_is_a_floating_point_number
{
    static const unsigned int mant = ???;
    static const int expo = ???;
    static const bool posi = ???;
};
Then you'd have to consider how to do basic calculations with this representation. Some might be easy to implement and rather fast at runtime (multiplying or dividing by 2 comes to mind).
Division might be harder; for instance, Newton's method could be used to calculate the answer.
Finally, smart approximations and generated values in tables might speed up the calculations at run time.
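To make that concrete, here is a hedged sketch in Rust (struct and field names invented for illustration; no sign, zero, rounding, or overflow handling): multiplication becomes one integer multiply, an exponent add, and a renormalizing shift.

// Toy unsigned soft-float: value = mant * 2^expo, with mant normalized
// so its top bit is set.
#[derive(Clone, Copy, Debug)]
struct SoftFloat {
    mant: u32, // normalized: bit 31 is set
    expo: i32,
}

fn soft_mul(a: SoftFloat, b: SoftFloat) -> SoftFloat {
    // One 32x32 -> 64-bit multiply on the integer ALU.
    let wide = u64::from(a.mant) * u64::from(b.mant);
    // The product of two normalized mantissas has its top bit at
    // position 63 or 62; shift back to 32 bits and fix the exponent.
    let shift = if wide >> 63 != 0 { 32 } else { 31 };
    SoftFloat {
        mant: (wide >> shift) as u32,
        expo: a.expo + b.expo + shift,
    }
}

fn main() {
    // 3.0 and 2.5 in this toy encoding (mantissa scaled by 2^-30).
    let a = SoftFloat { mant: 0xC000_0000, expo: -30 };
    let b = SoftFloat { mant: 0xA000_0000, expo: -30 };
    let p = soft_mul(a, b);
    // Expect 7.5: mant = 0xF000_0000, expo = -29.
    println!("{:#010x} * 2^{} = {}", p.mant, p.expo, f64::from(p.mant) * 2f64.powi(p.expo));
}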
Many years ago, C++ templates helped me get floating-point calculations working on an Intel 386SX.
In the end I learned a lot of math and C++, but decided at the same time to buy a co-processor.
The polynomial algorithms and the smart lookup tables especially (who needs a cosine or tangent function when you have a sine function?) helped a lot in thinking about using integers for floating-point arithmetic. Taylor series were a revelation too.
In systems without any floating-point hardware, the CPU emulates it using a series of simpler fixed-point arithmetic operations that run on the integer arithmetic logic unit.
Take a look at the Wikipedia page Floating-point_unit#Floating-point_library; you might find more info there.
It is not actually the CPU that emulates the instructions. The floating-point operations for low-end CPUs are made out of integer arithmetic instructions, and the compiler is what generates those instructions. Basically, the compiler (toolchain) comes with a floating-point library containing the floating-point functions.
The short answer is "slowly". Specialized hardware can do tasks like extracting groups of bits that are not necessarily byte-aligned very fast. Software can do everything that can be done by specialized hardware, but tends to take much longer to do it.
Read "The complete Spectrum ROM disassembly" at http://www.worldofspectrum.org/documentation.html to see examples of floating point computations on an 8 bit Z80 processor.
For things like sine functions, you precompute a few values then interpolate using Chebyshev polynomials.
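A hedged sketch of that precompute-and-interpolate shape, using plain linear interpolation between table entries instead of Chebyshev polynomials (on a real FPU-less target the table would be generated offline and stored as fixed-point constants):

use std::f64::consts::FRAC_PI_2;

const N: usize = 64; // table resolution over [0, pi/2]

// Precompute sin at N+1 evenly spaced points.
fn sine_table() -> [f64; N + 1] {
    let mut t = [0.0; N + 1];
    for (i, slot) in t.iter_mut().enumerate() {
        *slot = (FRAC_PI_2 * i as f64 / N as f64).sin();
    }
    t
}

// Approximate sin(x) for x in [0, pi/2] by linear interpolation.
fn fast_sin(table: &[f64; N + 1], x: f64) -> f64 {
    let pos = x / FRAC_PI_2 * N as f64;
    let i = (pos as usize).min(N - 1);
    let frac = pos - i as f64;
    table[i] + (table[i + 1] - table[i]) * frac
}

fn main() {
    let t = sine_table();
    println!("sin(0.5) ~= {} (std: {})", fast_sin(&t, 0.5), 0.5f64.sin());
}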

Addition and multiplication in a Galois Field

I am attempting to generate QR codes on an extremely limited embedded platform. Everything in the specification seems fairly straightforward except for generating the error correction codewords. I have looked at a bunch of existing implementations, and they all try to implement a bunch of polynomial math that goes straight over my head, particularly with regard to the Galois fields. The most straightforward way I can see, both in mathematical complexity and in memory requirements, is a circuit concept that is laid out in the spec itself.
With their description, I am fairly confident I could implement this, with the exception of the parts labeled GF(256) addition and GF(256) multiplication.
They offer this help:
The polynomial arithmetic for QR Code shall be calculated using bit-wise modulo 2 arithmetic and byte-wise modulo 100011101 arithmetic. This is a Galois field of 2^8 with 100011101 representing the field's prime modulus polynomial x^8 + x^4 + x^3 + x^2 + 1.
which is all pretty much Greek to me.
So my question is this: what is the easiest way to perform addition and multiplication in this kind of Galois field arithmetic? Assume both input numbers are 8 bits wide, and my output needs to be 8 bits wide also. Several implementations precalculate or hardcode two lookup tables to help with this, but I am not sure how those are calculated, or how I would use them in this situation. I would rather not take the 512-byte memory hit for the two tables, but it really depends on what the alternative is. I really just need help understanding how to do a single multiplication and addition operation in this circuit.
In practice only one table is needed. That would be for the GF(256) multiply. Note that all arithmetic is carry-less, meaning that there is no carry propagation.
Addition and subtraction without carry is equivalent to an xor.
So in GF(256), a + b and a - b are both equivalent to a xor b.
GF(256) multiplication is also carry-less; it can be built from shifts and carry-less additions/subtractions, and it can be done efficiently with hardware support via, say, Intel's CLMUL instruction set.
However, the hard part is reducing modulo 100011101. In normal integer division, you do it using a series of compare/subtract steps. In GF(256), you do it in a nearly identical manner using a series of compare/xor steps.
In fact, it's awkward enough that it's often still faster to just precompute all 256 x 256 products and put them into a 65536-entry look-up table.
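Without hardware support or big tables, the compare/xor reduction can be fused into the multiply loop. A minimal sketch in Rust, assuming the QR polynomial 100011101 (0x11D); the function names are made up:

// GF(256) addition (and subtraction): carry-less, so just XOR.
fn gf_add(a: u8, b: u8) -> u8 {
    a ^ b
}

// GF(256) multiplication by shift-and-xor, reducing modulo 0x11D
// each time a shift overflows out of bit 7.
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut product = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            product = gf_add(product, a); // add this partial product
        }
        let overflow = a & 0x80 != 0;
        a <<= 1; // multiply a by x
        if overflow {
            a ^= 0x1D; // x^8 == x^4 + x^3 + x^2 + 1: the low bits of 0x11D
        }
        b >>= 1;
    }
    product
}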
Page 3 of the following PDF has a pretty good reference on GF(256) arithmetic:
http://www.eecs.harvard.edu/~michaelm/CS222/eccnotes.pdf
(I'm following up on the pointer to zxing in the first answer, since I'm the author.)
The answer about addition is exactly right; that's why working in this field is convenient on a computer.
See http://code.google.com/p/zxing/source/browse/trunk/core/src/com/google/zxing/common/reedsolomon/GenericGF.java
Yes, multiplication works as you describe, and this is for GF(256): a * b is really the same as exp(log(a) + log(b)). And because GF(256) has only 256 elements, there are only 255 unique powers of x, and the same for log. So these are easy to put in a lookup table. The tables "wrap around", so that is why you see the "% size". The "/ size" is slightly harder to explain in a sentence -- it's because really 1-255 wrap around, not 0-255, so it's not quite just a simple modulus that's needed.
The final piece, perhaps, is how you reduce modulo an irreducible polynomial. The irreducible polynomial is x^8 plus some lower-power terms, right -- call it I(x) = x^8 + R(x). The polynomial is congruent to 0 in the field, by definition: I(x) == 0. So x^8 == -R(x). And, conveniently, addition and subtraction are the same here, so x^8 == -R(x) == R(x).
The only time we need to reduce higher-power polynomials is when constructing the exponent table. You just keep multiplying by x (which is a shift left) until the result gets too big -- gains an x^8 term. But x^8 is the same as R(x), so you take out the x^8 and add in R(x). R(x) only has powers up to x^7, so it all still fits in a byte, all in GF(256). And you know how to add in this field.
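Putting those pieces together, a minimal sketch in Rust of building the exp/log tables and multiplying through them (assuming the QR polynomial 0x11D and generator 2; the function names are made up):

// Build exp/log tables for GF(256) with modulus 0x11D and generator 2.
fn build_tables() -> ([u8; 256], [u8; 256]) {
    let mut exp = [0u8; 256];
    let mut log = [0u8; 256];
    let mut x: u16 = 1;
    for i in 0..255u16 {
        exp[i as usize] = x as u8;
        log[x as usize] = i as u8;
        x <<= 1; // multiply by the generator x (i.e. 2)
        if x & 0x100 != 0 {
            x ^= 0x11D; // take out x^8, add in R(x)
        }
    }
    (exp, log)
}

// Table-based multiply: a * b = exp[(log[a] + log[b]) % 255].
fn gf_mul_table(exp: &[u8; 256], log: &[u8; 256], a: u8, b: u8) -> u8 {
    if a == 0 || b == 0 {
        return 0; // zero has no logarithm; handle it separately
    }
    exp[(log[a as usize] as usize + log[b as usize] as usize) % 255]
}

fn main() {
    let (exp, log) = build_tables();
    println!("0x53 * 0xCA = {:#04x}", gf_mul_table(&exp, &log, 0x53, 0xCA));
}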
Helps?

How do programming languages handle huge number arithmetic

For a computer working with a 64-bit processor, the largest number it can handle in one piece would be 2^64 = 18,446,744,073,709,551,616. How do programming languages, say Java, or C and C++, handle arithmetic on numbers higher than this value? No register can hold such a number as a single piece. How is this issue tackled?
There are lots of specialized techniques for doing calculations on numbers larger than the register size. Some of them are outlined in this wikipedia article on arbitrary precision arithmetic
Low-level languages, like C and C++, leave large-number calculations to the library of your choice. One notable one is the GNU Multi-Precision library. High-level languages like Python, and others, integrate this into the core of the language, so normal numbers and very large numbers are identical to the programmer.
You assume the wrong thing. The biggest number it can handle in a single register is a 64-bit number. However, with some smart programming techniques, you could just combine a few dozen of those 64-bit numbers in a row to build a huge 6400-bit number and use that to do more calculations. It's just not as fast as having the number fit in one register.
Even the old 8- and 16-bit processors used this trick, where they would just let the number overflow into other registers. It makes the math more complex but it doesn't put an end to the possibilities.
However, such high-precision math is extremely unusual. Even if you wanted to calculate the whole national debt of the USA and store the outcome in Zimbabwean dollars, a 64-bit integer would still be big enough, I think. It's definitely big enough to contain the balance of my savings account, though.
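To make the carry idea concrete, here is a minimal sketch in Rust (the name add_bignum and the little-endian slice-of-u64 layout are illustrative choices):

// Add two little-endian arrays of 64-bit "digits", propagating the carry.
fn add_bignum(a: &[u64], b: &[u64]) -> Vec<u64> {
    let n = a.len().max(b.len());
    let mut out = Vec::with_capacity(n + 1);
    let mut carry = false;
    for i in 0..n {
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        let (sum, c1) = x.overflowing_add(y);
        let (sum, c2) = sum.overflowing_add(carry as u64);
        out.push(sum);
        carry = c1 || c2;
    }
    if carry {
        out.push(1); // the overflow spills into one more digit
    }
    out
}

fn main() {
    // u64::MAX + 1 = 2^64, which needs a second digit.
    assert_eq!(add_bignum(&[u64::MAX], &[1]), vec![0, 1]);
}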
Programming languages that handle truly massive numbers use custom number types that go beyond the normal operations optimized for 32-, 64-, or 128-bit CPUs. These numbers are especially useful in computer security and mathematical research.
The GNU Multiple Precision Library is probably the most complete example of these approaches.
You can handle larger numbers by using arrays. Try this out: type the following code into your web browser's JavaScript console.
The point at which JavaScript fails
console.log(9999999999999998 + 1)
// expected 9999999999999999
// actual 10000000000000000 oops!
JavaScript cannot reliably handle plain integers above Number.MAX_SAFE_INTEGER (9007199254740991), as the example shows. But writing your own number primitive to make this calculation work is simple enough. Here is an example using a custom number adder class in JavaScript.
Passing the test using a custom number class
// Require a custom number primitive class
const {Num} = require('./bases')
// Create a massive number that JavaScript will not add to (correctly)
const num = new Num(9999999999999998, 10)
// Add to the massive number
num.add(1)
// The result is correct (where plain JavaScript Math would fail)
console.log(num.val) // 9999999999999999
How it Works
You can look in the code at class Num { ... } to see details of what is happening; but here is a basic outline of the logic in use:
Classes:
The Num class contains an array of single Digit classes.
The Digit class contains the value of a single digit, and the logic to handle the Carry flag
Steps:
The chosen number is turned into a string
Each digit is turned into a Digit class and stored in the Num class as an array of digits
When the Num is incremented, the increment is applied to the first Digit in the array (the right-most digit)
If the Digit value plus the Carry flag equals the Base, then the next Digit to the left is incremented, and the current digit is reset to 0
... Repeat all the way to the left-most digit of the array
Logistically it is very similar to what is happening at the machine level, but here it is unbounded. You can read more about how digits are carried; this can be applied to numbers of any base.
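A compact Rust rendering of those steps (decimal digits stored least significant first; the names are made up):

// Increment a number stored as base-`base` digits, propagating the carry
// from the right-most digit leftwards, growing the array as needed.
fn increment(digits: &mut Vec<u8>, base: u8) {
    let mut i = 0;
    let mut carry = 1u8;
    while carry > 0 {
        if i == digits.len() {
            digits.push(0); // unbounded: grow a new most-significant digit
        }
        let d = digits[i] + carry;
        if d == base {
            digits[i] = 0; // reset this digit and carry into the next
            carry = 1;
        } else {
            digits[i] = d;
            carry = 0;
        }
        i += 1;
    }
}

fn main() {
    // 9999999999999998 as decimal digits, least significant first.
    let mut n: Vec<u8> = "9999999999999998".bytes().rev().map(|b| b - b'0').collect();
    increment(&mut n, 10);
    let s: String = n.iter().rev().map(|d| (d + b'0') as char).collect();
    println!("{s}"); // 9999999999999999
}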
Ada actually supports this natively, but only for its typeless constants ("named numbers"). For actual variables, you need to go find an arbitrary-length package. See Arbitrary length integer in Ada
More-or-less the same way that you do. In school, you memorized single-digit addition, multiplication, subtraction, and division. Then, you learned how to do multiple-digit problems as a sequence of single-digit problems.
If you wanted to, you could multiply two twenty-digit numbers together using nothing more than knowledge of a simple algorithm, and the single-digit times tables.
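A minimal sketch of exactly that in Rust, treating each u64 as one "digit" (the name mul_bignum and the little-endian layout are illustrative):

// Schoolbook multiplication over little-endian 64-bit "digits".
fn mul_bignum(a: &[u64], b: &[u64]) -> Vec<u64> {
    let mut out = vec![0u64; a.len() + b.len()];
    for (i, &x) in a.iter().enumerate() {
        let mut carry: u128 = 0;
        for (j, &y) in b.iter().enumerate() {
            // single-digit multiply plus the accumulated column and carry
            let t = out[i + j] as u128 + (x as u128) * (y as u128) + carry;
            out[i + j] = t as u64;
            carry = t >> 64;
        }
        out[i + b.len()] = carry as u64;
    }
    out
}

fn main() {
    // (2^64 - 1)^2 = 2^128 - 2^65 + 1: low digit 1, high digit 2^64 - 2.
    assert_eq!(mul_bignum(&[u64::MAX], &[u64::MAX]), vec![1, u64::MAX - 1]);
}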
In general, the language itself doesn't handle high-precision, high-accuracy large number arithmetic. It's far more likely that a library is written that uses alternate numerical methods to perform the desired operations.
For example (I'm just making this up right now), such a library might emulate the actual techniques that you might use to perform that large number arithmetic by hand. Such libraries are generally much slower than using the built-in arithmetic, but occasionally the additional precision and accuracy is called for.
As a thought experiment, imagine the numbers stored as strings, with functions to add, multiply, and so on, operating on these arbitrarily long numbers.
In reality these numbers are probably stored in a more space-efficient manner.
Think of one machine-size number as a digit and apply the algorithm for multi-digit multiplication from primary school. Then you don't need to keep the whole numbers in registers, just the digits as they are worked on.
Most languages store them as arrays of integers. If you add/subtract two of these big numbers, the library adds/subtracts all the integer elements in the array separately and handles the carries/borrows.
It's like manual addition/subtraction in school, because this is how it works internally.
Some languages use real text strings instead of integer arrays, which is less efficient but simpler to transform into a text representation.
