I am trying to build a mortgage calculator app, but my numbers are off a bit and I am wondering if anyone has any insight into what I am missing. The initial payment amount seems to be accurate, but as you increase the term (years) and interest rate the value is slightly off. This is for Canada, if that makes any difference. The payment amount also doesn't divide evenly into the amount borrowed. Here is the relevant code.
double r = interestAmountValue/1200;                               // monthly rate as a decimal (annual % / 12 / 100)
double n = yearAmountValue * 12;                                   // number of monthly payments
double rPower = pow(1+r, n);                                       // (1 + r)^n
double paymentAmt = loanAmountValue * r * rPower / (rPower - 1);   // standard amortization formula
double totalPaymentd = paymentAmt * n;                             // total of all payments
double totalInterestd = totalPaymentd - loanAmountValue;           // total interest paid
The type double is typically implemented as a floating-point number. Although calculations using floating point can be fast, particularly when using dedicated hardware, the tradeoff is accuracy. This is a well documented problem: see volume II of The Art of Computer Programming: Seminumerical Algorithms for a thorough discussion.
If provided by your language, an arbitrary-precision number type will probably give better accuracy at the expense of some speed (with the loss of speed probably unnoticeable in most applications on modern hardware for typical problems). In Java this is the BigDecimal type.
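For example, here is a minimal sketch of the question's payment formula written with BigDecimal (the loan figures are made-up example values; BigDecimal.pow with an integer exponent is all this formula needs):

import java.math.BigDecimal;
import java.math.MathContext;

public class MortgagePayment {
    public static void main(String[] args) {
        MathContext mc = new MathContext(30);                    // 30 significant digits
        BigDecimal loan = new BigDecimal("250000");              // loanAmountValue
        BigDecimal annualRatePct = new BigDecimal("5.25");       // interestAmountValue
        int years = 25;                                          // yearAmountValue

        BigDecimal r = annualRatePct.divide(new BigDecimal("1200"), mc); // monthly rate
        int n = years * 12;                                              // number of payments
        BigDecimal rPower = BigDecimal.ONE.add(r).pow(n, mc);            // (1 + r)^n

        BigDecimal payment = loan.multiply(r, mc).multiply(rPower, mc)
                                 .divide(rPower.subtract(BigDecimal.ONE), mc);
        System.out.println(payment);                             // monthly payment
    }
}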
I am working on a project, and I need to divide a very large 64-bit long value. I absolutely do not care about the whole-number result, and only care about the decimal part. The problem is that when dividing a large long by a small 64-bit double floating-point value, I lose accuracy in the floating-point value because it also has to store the whole numbers.
Essentially what I am trying to do is this:
double x = long_value / double_value % 1;
but without losing precision as long_value gets larger. Is there a way of writing this expression so that the whole numbers are discarded and floating-point accuracy is not lost? Thanks.
If your language provides an exact fmod implementation you can do something like this:
double rem = fmod(long_value, double_value);
return rem / double_value;
If long_value does not convert exactly to a double value, you could split it into two halves, fmod each half individually, add the results together, and divide that sum (or sum - double_value) by double_value.
If long_value or double_value is negative you may also need to consider different cases depending on how your fmod behaves and what result you expect.
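A rough Java sketch of that splitting idea (the method name is made up; Java's % on doubles behaves like fmod, and this assumes long_value is non-negative and double_value is positive):

// Split the 64-bit value into halves with at most 32 significant bits each,
// so that each half converts to a double exactly.
static double fractionOfQuotient(long longValue, double doubleValue) {
    double hi = (double) ((longValue >>> 32) << 32);   // top 32 bits, exactly representable
    double lo = (double) (longValue & 0xFFFFFFFFL);    // bottom 32 bits, exactly representable
    double sum = hi % doubleValue + lo % doubleValue;  // fmod each half, then add
    if (sum >= doubleValue) {
        sum -= doubleValue;                            // the "sum - double_value" case
    }
    return sum / doubleValue;
}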
First reduce long_value modulo double_value; the reduced value is congruent to the original long_value:
long_value = long_value - double_value * static_cast<long>(long_value / double_value);
Then you can take the fractional part of the quotient (note that % is not defined for doubles in C++, so use std::fmod):
double fractionalPart = std::fmod(long_value / double_value, 1.0);
Does the language you're using have a big integer/big rational library? To avoid loss of information, you'll have to "spread out" the information across more memory while you're transforming it, so you don't lose the part you're interested in preserving. This is essentially what a big integer library would do for you. You could employ this algorithm (I don't know what language you're using, so this is just pseudocode):
// e.g. 1.5 => (3, 2)
let (numerator, denominator) = double_value.ToBigRational().NumAndDenom();
// information-preserving version of long_value / double_value
let quotient = new BigRational(num: long_value * denominator, denom: numerator);
// information-preserving version of % 1
let remainder = quotient.FractionPart();
// some information could be lost here, but we saved it for the last step
return remainder.ToDouble();
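Java has no built-in big-rational type, but the same information-preserving idea can be sketched with BigDecimal, since new BigDecimal(double) is exact and nothing gets rounded until the final conversion back to a double (the method name is made up):

import java.math.BigDecimal;
import java.math.MathContext;

static double fractionOfQuotient(long longValue, double doubleValue) {
    BigDecimal dividend = new BigDecimal(longValue);     // exact
    BigDecimal divisor  = new BigDecimal(doubleValue);   // exact binary value of the double
    BigDecimal remainder = dividend.remainder(divisor);  // exact, nothing rounded yet
    // the only rounding happens in this final division back to a double
    return remainder.divide(divisor, new MathContext(30)).doubleValue();
}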
I'm trying to calculate rewards for liquidity providers and I found this equation that Uniswap apparently uses:
Basic Formula (L = liquidity): (L_you / L_others) * (24h_swap_volume * pool_fee_rate)
And I'm trying to implement this in my smart contract, but I can't seem to, because the liquidity held by others will always be larger than the liquidity you hold, which makes the ratio a fractional value. So my question is: how do I use this equation in a Solidity smart contract without falling into floating-point hell?
After doing a little more digging I found that you can use ABDKMath to achieve this; you can find more information using the links below:
Library: https://github.com/abdk-consulting/abdk-libraries-solidity
Article: https://www.bitcoininsider.org/article/68630/10x-better-fixed-point-math-solidity
As a code example here is a quick snippet of how I implemented this:
// divu(tokensProvided, others) returns the ratio as a 64.64-bit fixed-point number;
// mulu(ratio, taxableValue) multiplies it by an unsigned integer and returns a uint256
uint256 fee = ABDKMath64x64.mulu(
    ABDKMath64x64.divu(
        tokensProvided,
        others
    ),
    taxableValue
);
This is probably not the best solution but it worked for me. Also, take a look at https://github.com/paulrberg/prb-math for a higher precision fixed-point math library.
Regarding gas efficiency here is a quote from the PRB math lib:
The typeless PRBMath library is faster than ABDKMath for abs, exp, exp2, gm, inv, ln, log2. Conversely, it is slower than ABDKMath for avg, div, mul, powu and sqrt. There are two technical reasons why PRBMath lags behind ABDKMath's mul and div functions
So for this use case, I think it is a better idea to use ABDKMath instead of PRBMath, unless you plan on transferring more than 2 to the 128th power of wei.
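To see what the 64x64 calls are doing conceptually, here is a rough Java sketch of the same fee computation using BigInteger with a 2^64 scale factor (the names are made up, and this mirrors the divu-then-mulu order of the snippet above rather than the library's actual code):

import java.math.BigInteger;

static BigInteger fee(BigInteger tokensProvided, BigInteger others, BigInteger taxableValue) {
    // like divu: scale the ratio up by 2^64 so the fraction survives integer division
    BigInteger ratio64x64 = tokensProvided.shiftLeft(64).divide(others);
    // like mulu: multiply by the scaled ratio, then shift the scale factor back out
    return taxableValue.multiply(ratio64x64).shiftRight(64);
}

When overflow is not a concern, you can also get essentially the same result with plain integer math by multiplying before dividing, i.e. tokensProvided * taxableValue / others.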
I am looking to take the log base n (10 would be fine) of a 256-bit unsigned integer as a floating-point value in Rust, with no loss of precision. It would seem to me that I need to implement an 8×f64, 512-bit float type and use a Taylor series to approximate ln and then the log. I know there are assembly methods to obtain the log of an f64. I am wondering if anyone on Stack Overflow can think of a divide-and-conquer or other method which would be more efficient. I would be amenable to inline assembly operating on the 8×f64 512-bit array.
This might be a useful starting point / outline of an algorithm. IDK if it will get you exact results, like error <= 0.5ulp (i.e. the last bit of the mantissa of your 512-bit float correctly rounded), or even error <= 1 ulp. Perhaps worth looking into what extended-precision calculators like bc / dc / calc do.
I think log converges quickly, so if you're going to do Newton iterations to refine, this bit-scan method might be a fast way to get a good starting point. Even if you only really need about 256 mantissa bits correct, I don't know how big a polynomial it would take to get that, and each multiply / add / fma would be on 512-bit (8x) or 320-bit (5x double precision).
Start by converting integer to binary float
For normal-sized floating-point numbers, the usual method takes advantage of the logarithmic nature of binary floating point. Without 256-bit HW float, you'll want to find the ilog2(int) yourself, i.e. position of the highest set bit (Efficiently find least significant set bit in a large array?).
Then treat your 256-bit integer as the mantissa of a number in the [1..2) or [0.5 .. 1) range, and yes, use a polynomial approximation for log2() that's accurate over that limited range. (Before the actual soft-float stuff, you might want to left-shift the number so it's normalized, i.e. the highest set bit is at the top, i.e. x <<= clz(x).)
Then a polynomial approximation over the mantissa
And then add the integer exponent + log_approx(mantissa) => log2(x).
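As a rough Java illustration of that decomposition, with BigInteger standing in for the 256-bit integer and plain double precision for the mantissa part (so this only gets about 53 correct bits; the extended-precision polynomial is the part it does not attempt, and it assumes x > 0):

import java.math.BigInteger;

static double log2(BigInteger x) {
    int exp = x.bitLength() - 1;                 // ilog2: position of the highest set bit
    // keep the top 53 bits as a mantissa normalized into [1, 2)
    double mant;
    if (exp > 52) {
        mant = x.shiftRight(exp - 52).doubleValue() / (double) (1L << 52);
    } else {
        mant = x.doubleValue() / (double) (1L << exp);
    }
    return exp + Math.log(mant) / Math.log(2);   // integer exponent + log2(mantissa)
}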
Efficient implementation of log2(__m256d) in AVX2 has more detail on implementing log2(double) (with SIMD doing 4 at a time, very different from doing one extended precision calculation).
It includes some links to implementations, e.g. Agner Fog's VCL using the ratio of two polynomials instead of one larger polynomial, and various tricks to maintain as much precision as possible: https://github.com/vectorclass/version2/blob/9874e4bfc7a0919fda16596144d393da5f8bf6c0/vectormath_exp.h#L942. Such as further range reduction: if x > SQRT2*0.5, then increment the exponent and double the mantissa. (If 512-bit FP division is really expensive, you might just use more terms in one polynomial.) VCL is currently Apache licensed, so feel free to copy as much as you want from it into anything.
IDK if there are more tricks that might become more valuable for big extended precision, or for soft-float, which that implementation doesn't use. VCL's math functions spend more effort to maintain high precision than some faster approximations, but they're not exact.
Do you really need 512-bit float? Maybe only 320-bit (5x double)?
If you don't need more exponent-range than a double, you might be able to extend the double-double-arithmetic technique to wider floats, taking advantage of hardware FP to get 52 or 53 mantissa bits per 64-bit chunk. (From comments, apparently you're already planning to do that.)
You might not need a 512-bit float to have sufficient precision. 256/52 = 4.92, so 5 double chunks already have more precision (mantissa bits) than your input and could exactly represent any 256-bit integer. (IEEE double does have a large enough exponent range: -1022 .. +1023.) And it has enough to spare that log2(int) should map each 256-bit input to a unique monotonic output, even with some rounding error.
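For reference, the building block of double-double arithmetic is an error-free addition (Knuth's TwoSum): the rounding error of a + b is itself exactly representable, so a value can be carried around as an unevaluated sum of doubles. A minimal sketch:

// Knuth's TwoSum: returns s = round(a + b) and the exact rounding error e,
// so that a + b == s + e holds exactly.
static double[] twoSum(double a, double b) {
    double s = a + b;
    double bVirtual = s - a;
    double aVirtual = s - bVirtual;
    double bRoundoff = b - bVirtual;
    double aRoundoff = a - aVirtual;
    return new double[] { s, aRoundoff + bRoundoff };
}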
Does anyone have experience with replacing floating-point operations on ATmega (2560) based systems? There are a couple of very common situations which happen every day.
For example:
Are comparisons faster than divisions/multiplications?
Is a float-to-int type cast followed by an integer multiplication/division faster than a pure floating-point operation without the cast?
I hope I don't have to make a benchmark just for me.
Example one:
int iPartialRes = (int)fArg1 * (int)fArg2;
iPartialRes *= iFoo;
faster than?:
float fPartialRes = fArg1 * fArg2;
fPartialRes *= iFoo;
And example two:
iSign = fVal < 0 ? -1 : 1;
faster than?:
iSign = fVal / fabs(fVal);
The questions can be answered just by thinking about them for a moment.
AVRs do not have an FPU, so everything floating-point related is done in software; a floating-point multiplication involves much more work than a simple integer multiplication.
Since AVRs also do not have an integer division unit, a simple branch is much faster than a software division; dividing floating-point values is the worst case of all. :)
But please note that the two variants in your first example produce very different results.
This is an old question, but I will submit this elaborated answer for the curious.
Just typecasting a float will truncate it, i.e. 3.7 becomes 3; there is no rounding.
The fastest math on a 2560 will be (+, -, *), with division being the slowest due to the lack of a hardware divider. Typecasting to an unsigned long int after multiplying all operands by a pseudo decimal point that suits the fractional range(1) your floats are expected to see, and tracking the sign as a bool, will give the best range/accuracy compromise.
If your loop needs to be as fast as possible, avoid even integer division; instead multiply by a pseudo fraction, and then typecast back into a float with myFloat (defined elsewhere) = float(myPseudoFloat) / myPseudoDecimalConstant;
Not sure if you came across the ShowInfo page in the Arduino playground. It's basically a sketch that runs a benchmark on your (insert Arduino model here) and shows the actual compute times for various operations and types. The Mega 2560 will be very close to an ATmega328 as far as FLOPs go, up to about 12.5K/s (80 µs per float divide). Typecasting would likely handicap the CPU more, as it introduces more overhead and might even give erroneous results due to rounding errors and lack of precision.
(1) e.g. 543.509291 * 1,000,000 = 543,509,291 moves the decimal 6 places, to the maximum precision of a float on an 8-bit AVR. If you first multiply all values by the same constant, such as 1000 or 100000, the decimal point is preserved, and then you cast back to a float by dividing by your decimal constant when you are ready to print or store the value.
float f = 3.1428;
int x;
x = f * 10000;
x now contains 31428
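A rough sketch (in Java for illustration) of the full round trip with a made-up scale factor of 10000; note that after a fixed-point multiply, the scale factor has to be divided back out once:

long scale = 10000;                      // pseudo decimal point: 4 fractional digits
long a = Math.round(3.1428 * scale);     // 31428
long b = Math.round(2.5 * scale);        // 25000
long product = a * b / scale;            // 78570, i.e. 7.8570 (3.1428 * 2.5 = 7.857)
float result = (float) product / scale;  // back to a float only when printing or storing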
How is floating-point math performed on a processor with no floating-point unit, e.g. low-end 8-bit microcontrollers?
Have a look at this article: http://www.edwardrosten.com/code/fp_template.html
(from this article)
First you have to think about how to represent a floating point number in memory:
struct this_is_a_floating_point_number
{
    static const unsigned int mant = ???;  // mantissa
    static const int expo = ???;           // exponent
    static const bool posi = ???;          // sign (true = positive)
};
Then you'd have to consider how to do basic calculations with this representation. Some might be easy to implement and rather fast at runtime (multiply or divide by 2 come to mind).
Division might be harder; for instance, Newton's method could be used to calculate the answer.
Finally, smart approximations and generated values in tables might speed up the calculations at run time.
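For instance, here is a minimal sketch (assuming a simplified, already-normalized mantissa/exponent/sign representation like the struct above) showing why multiplying or dividing by 2 is cheap, since only the exponent changes:

// Simplified soft-float value: (posi ? +1 : -1) * mant * 2^expo
// (mant assumed already normalized; rounding, overflow and zero handling omitted)
final class SoftFloat {
    long mant;      // mantissa
    int expo;       // exponent
    boolean posi;   // sign (true = positive)

    SoftFloat timesTwo() { return withExpo(expo + 1); }  // multiply by 2: bump the exponent
    SoftFloat halved()   { return withExpo(expo - 1); }  // divide by 2: lower the exponent

    private SoftFloat withExpo(int newExpo) {
        SoftFloat r = new SoftFloat();
        r.mant = mant;
        r.expo = newExpo;
        r.posi = posi;
        return r;
    }
}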
Many years ago, C++ templates helped me get floating-point calculations working on an Intel 386SX.
In the end I learned a lot of math and C++, but decided at the same time to buy a co-processor.
Especially the polynomial algorithms and the smart lookup tables (who needs a cosine or tangent function when you have a sine function?) helped a lot in thinking about using integers for floating-point arithmetic. Taylor series were a revelation too.
In systems without any floating-point hardware, the CPU emulates it using a series of simpler fixed-point arithmetic operations that run on the integer arithmetic logic unit.
Take a look at the Wikipedia page Floating-point_unit#Floating-point_library, where you might find more info.
It is not actually the CPU that emulates the instructions. The floating-point operations for low-end CPUs are built out of integer arithmetic instructions, and the compiler is the one that generates those instructions. Basically, the compiler (toolchain) comes with a floating-point library containing the floating-point functions.
The short answer is "slowly". Specialized hardware can do tasks like extracting groups of bits that are not necessarily byte-aligned very fast. Software can do everything that can be done by specialized hardware, but tends to take much longer to do it.
Read "The complete Spectrum ROM disassembly" at http://www.worldofspectrum.org/documentation.html to see examples of floating point computations on an 8 bit Z80 processor.
For things like sine functions, you precompute a few values then interpolate using Chebyshev polynomials.
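As a simpler stand-in for the Chebyshev interpolation mentioned above, here is a sketch of the precompute-then-interpolate idea using plain linear interpolation over a small table (the table size is arbitrary; Chebyshev polynomials would give better accuracy for the same amount of stored data):

import static java.lang.Math.*;

class SineTable {
    static final int TABLE_SIZE = 64;
    static final double[] SIN_TABLE = new double[TABLE_SIZE + 1];
    static {
        // precompute sin over one quarter period [0, pi/2]
        for (int i = 0; i <= TABLE_SIZE; i++) {
            SIN_TABLE[i] = sin(i * (PI / 2) / TABLE_SIZE);
        }
    }

    // approximate sin(x) for x in [0, pi/2] by interpolating between the two nearest entries
    static double fastSin(double x) {
        double pos = x / (PI / 2) * TABLE_SIZE;
        int i = min((int) pos, TABLE_SIZE - 1);  // lower table index, clamped at the top end
        double frac = pos - i;                   // position between the two entries
        return SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i]);
    }
}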