I'm trying to do a very basic task in C, where I want to define a number of ints in a header file. I've done it like so:
#define MINUTE (60)
#define HOUR (60 * MINUTE)
#define DAY (24 * HOUR)
The problem is that while MINUTE and HOUR return the correct answer, DAY returns something weird.
Serial.println(MINUTE); // 60
Serial.println(HOUR); // 3600
Serial.println(DAY); // 20864
Can someone explain why this happens?
Assuming you have something like
int days = DAY;
or
unsigned days = DAY;
You seem to have 16-bit integers. The maximum representable positive value for a signed (two's-complement) 16-bit integer is 32767; for an unsigned one it is 65535.
So, as 24 * 3600 == 86400, you invoke undefined behaviour for the signed int and wrap-around for the unsigned one (the signed int will most likely wrap too, but that is not guaranteed).
This results in 86400 modulo 65536 (which is 2 to the power of 16), which happens to be 20864.
Solution: use the stdint.h types uint32_t or int32_t to get fixed-size integers.
Edit: Using function arguments follows basically the same principle as the initialisers above.
Update: As you claimed, when directly passing the integer constant 86400 to the function, it will have type long, because the compiler automatically chooses the smallest type that can hold the value. It is very likely that the println methods are overloaded for long arguments, so they will print the correct value.
However, for the expression it is the types of the operands that matter, and the values 24, 60 and 60 all have type int, so the result is also int. The compiler will not use a larger type just because the result might overflow. Use 24L and you will get a long result for the macros, too.
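A minimal sketch of the corrected macros with long constants (only the suffixes change, names kept from the question):
#define MINUTE (60L)
#define HOUR (60L * MINUTE) // 3600, computed in long
#define DAY (24L * HOUR)    // 86400 now fits even where int is 16 bits
Alternatively, assign the results to uint32_t / int32_t variables from stdint.h as suggested above.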
It looks like you actually managed to dig up an ancient 16-bit compiler (where did you find it?). Otherwise I'd like to see the code that produces these numbers.
20864 = 86400 % 65536
Try storing the value in an int instead of a short.
A program I wrote; I ran it on the practice environment of GfG:
class Solution {
public:
    int nCr(int n, int r) {
        // code here
        const unsigned int M = 1000000007;
        long long dp[n+1] = {0}, i, ans;
        if (n < r)
            return 0;
        dp[0] = 1;
        dp[1] = 1;
        for (i = 2; i <= n; i++) {
            dp[i] = i * dp[i-1];
        }
        ans = (dp[n]) / (dp[n-r] * dp[r]);
        ans = ans % M;
        return ans;
    }
};
I don't really understand what is going on. The division seems to be well defined.
You are right to suspect the division as the origin of the SIGFPE. As you know, division is well defined as long as the divisor is not zero. At first glance one wouldn't expect that dp[n-r]*dp[r] could become zero, but the elements of dp have only a limited range of values they can hold. With a 64-bit long long, the maximum representable value typically is 2^63 - 1 = 9223372036854775807. This means that dp[i] has already overflown for i > 20, though on common processors this overflow is silently ignored. Now, as the computation of the factorial proceeds by multiplying with ever higher values of i, more and more zero bits are "shifted in" from the right until eventually all 64 bits are zero; on common processors this happens at i = 66, so the exception occurs when n-r or r is equal to or greater than 66.
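A minimal sketch that shows the effect in isolation (unsigned arithmetic is used here so the wrap-around is well defined; the signed long long in the question behaves the same way on common hardware):
#include <stdio.h>
int main(void)
{
    unsigned long long f = 1;
    for (int i = 1; i <= 66; i++) {
        f *= (unsigned long long)i;   /* running factorial, silently reduced modulo 2^64 */
        if (i == 20 || i == 21 || i == 66)
            printf("%2d! mod 2^64 = %llu\n", i, f);
    }
    return 0;
}
20! still fits, 21! has already wrapped, and 66! contains 64 factors of two, so all 64 bits are zero, which is exactly the zero divisor that makes dp[n-r]*dp[r] zero and triggers the SIGFPE.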
I am doing a basic operation in Arduino and for some reason (this is why I need you) it gives me a totally inappropriate result. Below is the code:
long init_H_top; //I am declaring it a long to make sure I got enough bytes
init_H_top=251*255/360; //gives me -4 and it should be 178
Any idea why it does that?
I am very confused... Thanks!
Your variable may be a long, but your constants (251, 255, and 360) are not.
They have type int, so the calculation is done in int arithmetic and gives an int result, which is then put into the long variable after any overflow has already done the damage.
Since Arduino has a 16-bit int type, 251 * 255 (64005) will exceed the maximum integer of 32767 and result in behaviour like you're seeing. The value 64005 is -1531 in 16-bit two's complement and, when you divide that by 360, you get about -4.25 which truncates to -4.
You should be using long constants to avoid this:
init_H_top = 251L * 255L / 360L;
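An equivalent fix, if you prefer a cast over suffixed constants (a hypothetical variant with the same effect):
init_H_top = (long)251 * 255 / 360; // the whole expression is now evaluated in long
Note that integer division still truncates: 251 * 255 is 64005, and 64005 / 360 is 177.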
I have a 16-bit sample between -32768 and 32767.
To save space I want to convert it to an 8-bit sample, so I divide the sample by 256 and add 128.
-32768 / 256 = -128, then -128 + 128 = 0
32767 / 256 = 127.99, then 127.99 + 128 = 255.99
Now, the 0 will fit perfectly in a byte, but the 255.99 has to be rounded down to 255, causing me to lose precision, because when converting back I'll get 32512 instead of 32767.
How can I do this without losing the original min/max values? I know I'm making a very obvious thought error, but I can't figure out where the mistake lies.
And yes, of course I'm fully aware I lost precision by dividing and will not be able to deduce the original values from the 8-bit samples, but I just wonder why I don't get the original maximum.
The answers for down-sampling have already been provided.
This answer relates to up-sampling using the full range. Here is a C99 snippet demonstrating how you can spread the error across the full range of your values:
#include <stdio.h>
int main(void)
{
    for( int i = 0; i < 256; i++ ) {
        unsigned short scaledVal = ((unsigned short)i << 8) + (unsigned short)i;
        printf( "%8d%8hu\n", i, scaledVal );
    }
    return 0;
}
It's quite simple. You shift the value left by 8 and then add the original value back. That means every increase by 1 in the [0,255] range corresponds to an increase by 257 in the [0,65535] range.
I would like to point out that this might give worse results than you began with. For example, if you downsampled 65280 (0xff00) you would get 255, but then upsampling that would give 65535 (0xffff), which is a total error of 255. You will have similarly large errors across most of the higher end of your data range.
You might do better to abandon the notion of going back to the full [0,65535] range, and instead aim for the middle of each 256-wide bucket: shift left and add 127. This means the error is uniform instead of skewed. Because you don't actually know what the original value was, the best you can do is estimate it with a value right in the centre.
To summarize, I think this is more mathematically correct:
unsigned short scaledVal = ((unsigned short)i << 8) + 127;
You don't get the original maximum because you can't represent the number 256 as an 8-bit unsigned integer.
If you're trying to compress your 16-bit integer value into an 8-bit integer range, you keep the most significant 8 bits and throw away the least significant 8 bits. Normally this is accomplished by shifting the bits: the >> operator shifts from most to least significant bits, so shifting by 8 (>> 8) does the job. You can also just mask off the low byte and divide away the zeros, doing your rounding before the division, with something like 8BitInt = (16BitInt & 65280)/256; [65280 a.k.a. 0xFF00]
Every bit you shift off of a value halves it, like division by 2, and rounds down.
All of the above is complicated somewhat by the fact that you're dealing with a signed integer.
Finally I'm not 100% certain I got everything right here because really, I haven't tried doing this.
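Concretely, handling the sign first and then keeping the top 8 bits could look like this (a minimal sketch; the helper names are made up and it assumes int is at least 32 bits):
#include <stdint.h>
#include <stdio.h>
static uint8_t down8(int16_t s)   /* [-32768,32767] -> [0,255] */
{
    return (uint8_t)(((uint16_t)(s + 32768)) >> 8);
}
static int16_t up16(uint8_t b)    /* [0,255] -> [-32768,32512] */
{
    return (int16_t)(((int)b << 8) - 32768);
}
int main(void)
{
    printf("%6d -> %3d -> %6d\n", -32768, down8(-32768), up16(down8(-32768)));
    printf("%6d -> %3d -> %6d\n",  32767, down8( 32767), up16(down8( 32767)));
    return 0;
}
This prints -32768 -> 0 -> -32768 and 32767 -> 255 -> 32512: exactly the asymmetry the question describes, because 255 is the largest value an 8-bit sample can carry.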
I have the following code:
NSUInteger one = 1;
CGPoint p = CGPointMake(-one, -one);
NSLog(@"%@", NSStringFromCGPoint(p));
Its output:
{4.29497e+09, 4.29497e+09}
On the other hand:
NSUInteger one = 1;
NSLog(@"%i", -one); // prints -1
I know there’s probably some kind of overflow going on, but why do the two cases differ, and why doesn’t it work the way I want? Should I always remind myself of the particular numeric type of my variables and expressions even when doing trivial arithmetic?
P.S. Of course I could use unsigned int instead of NSUInteger, makes no difference.
When you apply the unary - to an unsigned value, the unsigned value is negated and then forced back into unsigned garb by having Utype_MAX + 1 repeatedly added to that value. When you pass that to CGPointMake(), that (very large) unsigned value is then assigned to a CGFloat.
You don't see this in your NSLog() statement because you are logging it as a signed integer. Convert that back to a signed integer and you indeed get -1. Try using NSLog(@"%u", -one) and you'll find you're right back at 4294967295.
unsigned int versus NSUInteger DOES make a difference: unsigned int is half the size of NSUInteger under an LP64 architecture (x86_64, ppc64) or when you compile with NS_BUILD_32_LIKE_64 defined. NSUInteger happens to always be pointer-sized (but use uintptr_t if you really need an integer that's the size of a pointer!); unsigned is not when you're using the LP64 model.
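The same wrap-around can be reproduced in plain C, independent of Cocoa (a minimal sketch assuming a platform where unsigned int is 32 bits, analogous to NSUInteger in a 32-bit build):
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    uint32_t one = 1;
    double d = -one;                 /* -one has already wrapped to 4294967295 before the conversion */
    printf("%u\n", (unsigned)-one);  /* 4294967295 */
    printf("%d\n", (int)-one);       /* typically -1: the same bit pattern read as signed */
    printf("%e\n", d);               /* 4.294967e+09, the value that ends up in the CGPoint */
    return 0;
}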
OK, without actually knowing, but reading around on the net about all of these data types, I'd say the issue is with the conversion from an NSUInteger (which resolves to either an unsigned int (32-bit) or an unsigned long (64-bit)) to a CGFloat (which resolves to either a float (32-bit) or a double (64-bit)).
In your second example that same conversion is not happening. The other thing that may be affecting it is that, from my reading, NSUInteger is not designed to contain negative numbers, only positive ones. So that is likely to be where things start to go wrong.
I was trying to display a number: 2893604342.00. But when I display it, it is shown as: -2893604342.
Following is the code snippet ...
avg += int(totalData[i][col.dataField]);
I have even replaced it with Number, but it's still showing the same negative number.
Please let me know whether there is any problem with int or Number!
The maximum values are accessible through each numeric type's static properties:
Number.MAX_VALUE
uint.MAX_VALUE
int.MAX_VALUE
(Just trace 'em.)
Integers in Flash are 32 bits, so an unsigned int's max value is (2^32)-1, 0xffffffff or 4294967295. A signed int's max positive value is (2^(32-1))-1 or 2147483647 (one of the bits is used for the sign). The Number type is 64 bits.
In order to guarantee space for your result, type the variable as Number and cast the result to Number (or don't cast at all).
var avg : Number = 0;
...
avg += totalData[i][col.dataField] as Number;
The largest exact integral value is 2^53. Remember that ActionScript is ECMA at heart; look up the ToInt32 operation for more info on that.
Try casting it to a uint instead of an int.