Is the "e" character used to denote an invalid number in graphics data? - numerical

I've got a graphical program that's exporting a data file with numbers such as: -1.33227e-015 and -4.02456e-016.
I've long been perplexed by the "e-" notation. Is it used to denote an invalid number? What sort of valid value can I extract from the above numbers? What are they trying to say?

e means "× 10^". It stands for exponent.
e.g. -1.33227e-015 means -1.33227 × 10^-15 and -4.02456e-016 means -4.02456 × 10^-16.
See http://en.wikipedia.org/wiki/Scientific_notation#E_notation for detail.

No. It signifies exponential/scientific notation. -4.02456e-016 means -4.02456 divided by 10 to the power 16.

e or E stands for exponent, just like ×10^ in written mathematics. The number following tells you how far the decimal point moves (+ for right, - for left), so your first number:
-1.33227e-015
Becomes:
-.00000000000000133227
While:
-4.02456e-016
Becomes:
-.000000000000000402456

That is scientific notation being used to represent extremely "large" or "small" numbers.
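A quick way to convince yourself of the above is to parse the two exported strings and print them in fixed notation. This is only a minimal sketch in R (an assumption on my part, since the question doesn't say which language will read the file); the variable names are made up for illustration.

```r
# The two values from the exported file, kept as text
raw <- c("-1.33227e-015", "-4.02456e-016")

# as.numeric() understands E notation directly
x <- as.numeric(raw)

# Print in fixed notation with 21 decimal places
fixed <- sprintf("%.21f", x)
print(fixed)

# Both are perfectly valid numbers, just very close to zero
print(is.finite(x))
print(abs(x) < 1e-14)
```

So nothing needs to be "extracted": the values are usable as they are.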

Related

How to import really large numbers into R? [duplicate]

I am importing a csv that has a single column which contains very long integers (for example: 2121020101132507598)
a <- read.csv('temp.csv', as.is = TRUE)
When I import these integers as strings they come through correctly, but when imported as integers the last few digits are changed. I have no idea what is going on...
1 "4031320121153001444" 4031320121153001472
2 "4113020071082679601" 4113020071082679808
3 "4073020091116779570" 4073020091116779520
4 "2081720101128577687" 2081720101128577792
5 "4041720081087539887" 4041720081087539712
6 "4011120071074301496" 4011120071074301440
7 "4021520051054304372" 4021520051054304256
8 "4082520061068996911" 4082520061068997120
9 "4082620101129165548" 4082620101129165312
As others have noted, you can't represent integers that large. But R isn't reading those values into integers, it's reading them into double precision numerics.
Double precision can only represent numbers to ~16 places accurately, which is why you see your numbers rounded after 16 places. See the gmp, Rmpfr, and int64 packages for potential solutions. Though I don't see a function to read from a file in any of them, maybe you could cook something up by looking at their sources.
UPDATE:
Here's how you can get your file into an int64 object:
# This assumes your numbers are the only column in the file
# Read them in however you like, just ensure they're read in as character
library(int64)
a <- scan("temp.csv", what="")
ia <- as.int64(a)
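The rounding described in this answer can be reproduced in base R without any packages. The sketch below takes one value from the question, converts it to a double, and prints what was actually stored; the variable names are just for illustration.

```r
# One of the values from the question, kept as text so no digits are lost
s <- "4031320121153001444"

# Converting to numeric stores it as a double (53 bits of precision)
d <- as.numeric(s)

# Print the double with no decimal places to see what was actually stored
stored <- sprintf("%.0f", d)
print(stored)

# The stored value no longer matches the original text
print(identical(stored, s))

# Doubles hold every integer exactly only up to 2^53
print(d > 2^53)
```

The printed value should end in different digits from the original string, much like the right-hand column in the question.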
R's maximum integer value is about 2E9. As @Joshua mentions in another answer, one of the potential solutions is the int64 package.
Import the values as character instead. Then convert to type int64.
require(int64)
a <- read.csv('temp.csv', colClasses = 'character', header=FALSE)[[1]]
a <- as.int64(a)
print(a)
[1] 4031320121153001444 4113020071082679601 4073020091116779570
[4] 2081720101128577687 4041720081087539887 4011120071074301496
[7] 4021520051054304372 4082520061068996911 4082620101129165548
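If you just want to see the effect of colClasses without installing anything, here is a small base R sketch. It assumes a hypothetical stand-in for temp.csv passed in through the text argument, holding two of the values from the question.

```r
# Hypothetical stand-in for temp.csv: two of the values from the question
csv_text <- "4031320121153001444\n4113020071082679601"

# Default import: R guesses the column type for itself
as_guess <- read.csv(text = csv_text, header = FALSE)[[1]]

# Forcing character keeps every digit
as_chr <- read.csv(text = csv_text, header = FALSE,
                   colClasses = "character")[[1]]

print(class(as_guess))
print(class(as_chr))
print(as_chr)
```

The character column can then be handed to whichever big-integer package you settle on.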
You simply cannot represent integers that big. See
.Machine
which on my box has
$integer.max
[1] 2147483647
The maximum value of a 32-bit signed integer is 2,147,483,647. Your numbers are much larger.
Try importing them as floating point values instead.
There are a few caveats to be aware of when dealing with floating point arithmetic in R or any other language:
http://blog.revolutionanalytics.com/2009/11/floatingpoint-errors-explained.html
http://blog.revolutionanalytics.com/2009/03/when-is-a-zero-not-a-zero.html
http://floating-point-gui.de/basic/

What's the difference between integer class and numeric class in R

I want to preface this by saying I'm an absolute programming beginner, so please excuse how basic this question is.
I'm trying to get a better understanding of "atomic" classes in R and maybe this goes for classes in programming in general. I understand the difference between a character, logical, and complex data classes, but I'm struggling to find the fundamental difference between a numeric class and an integer class.
Let's say I have a simple vector of integers, x <- c(4, 5, 6, 6); it would make sense for this to be of class integer. But when I do class(x) I get [1] "numeric". If I then convert this vector to the integer class with x <- as.integer(x), it returns exactly the same numbers, except that the class is different.
My question is why this is the case: why is the default class for a set of integers numeric, and what are the advantages or disadvantages of storing a set of integers as numeric instead of integer?
There are multiple classes that are grouped together as "numeric" classes, the 2 most common of which are double (for double precision floating point numbers) and integer. R will automatically convert between the numeric classes when needed, so for the most part it does not matter to the casual user whether the number 3 is currently stored as an integer or as a double. Most math is done using double precision, so that is often the default storage.
Sometimes you may want to specifically store a vector as integers if you know that they will never be converted to doubles (used as ID values or indexing) since integers require less storage space. But if they are going to be used in any math that will convert them to double, then it will probably be quickest to just store them as doubles to begin with.
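To make the automatic conversion concrete, here is a short sketch showing which operations keep integer storage and which switch to double. The variable names are only for illustration.

```r
# Integer arithmetic stays integer when both operands are integers
t_add <- typeof(2L + 3L)
t_mul <- typeof(2L * 3L)

# Mixing in a double, or dividing with /, gives a double
t_mix <- typeof(2L + 3)
t_div <- typeof(6L / 3L)

print(c(t_add, t_mul, t_mix, t_div))

# Both storage types count as "numeric"
print(is.numeric(2L))
print(is.numeric(2))
```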
Patrick Burns on Quora says:
First off, it is perfectly feasible to use R successfully for years
and not need to know the answer to this question. R handles the
differences between the (usual) numerics and integers for you in the
background.
> is.numeric(1)
[1] TRUE
> is.integer(1)
[1] FALSE
> is.numeric(1L)
[1] TRUE
> is.integer(1L)
[1] TRUE
(Putting capital 'L' after an integer forces it to be stored as an
integer.)
As you can see "integer" is a subset of "numeric".
> .Machine$integer.max
[1] 2147483647
> .Machine$double.xmax
[1] 1.797693e+308
Integers only go to a little more than 2 billion, while the other
numerics can be much bigger. They can be bigger because they are
stored as double precision floating point numbers. This means that
the number is stored in two pieces: the exponent (like 308 above,
except in base 2 rather than base 10), and the "significand" (like
1.797693 above).
Note that 'is.integer' is not a test of whether you have a whole
number, but a test of how the data are stored.
One thing to watch out for is that the colon operator, :, will return integers if the start and end points are whole numbers. For example, 1:5 creates an integer vector of numbers from 1 to 5. You don't need to append the letter L.
> class(1:5)
[1] "integer"
Reference: https://www.quora.com/What-is-the-difference-between-numeric-and-integer-in-R
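Since is.integer only tests how the data are stored, you need something else to ask whether a value is a whole number. A minimal sketch, assuming a tolerance-based check is good enough for your data (is_whole is a made-up helper name, not a base R function):

```r
# Hypothetical helper: TRUE where a value is (numerically) a whole number,
# regardless of whether it is stored as integer or double
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

print(is.integer(5))    # FALSE: 5 is stored as a double
print(is_whole(5))      # TRUE: but its value is a whole number
print(is.integer(5L))   # TRUE
print(is_whole(5.5))    # FALSE
```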
To quote the help page (try ?integer), bolded portion mine:
Integer vectors exist so that data can be passed to C or Fortran code which expects them, and so that (small) integer data can be represented exactly and compactly.
Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
Like the help page says, R's integers are signed 32-bit numbers so can hold between -2147483648 and +2147483647 and take up 4 bytes.
R's numeric is identical to a 64-bit double conforming to the IEEE 754 standard. R has no single precision data type. (source: help pages of numeric and double). A double can store all integers between -2^53 and 2^53 exactly without losing precision.
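Both limits are easy to check at the console. The sketch below is only an illustration of the two ranges described above; the variable names are mine.

```r
# Above 2^53, neighbouring whole numbers can no longer be told apart
same_above <- (2^53 == 2^53 + 1)   # TRUE: 2^53 + 1 is not representable
same_below <- (2^53 - 1 == 2^53)   # FALSE: still exact at the limit

# 32-bit integers overflow to NA (with a warning) rather than rounding
big <- suppressWarnings(.Machine$integer.max + 1L)

print(same_above)
print(same_below)
print(is.na(big))                  # TRUE
```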
We can see the data type sizes, including the overhead of a vector (source):
> object.size(1:1000)
4040 bytes
> object.size(as.numeric(1:1000))
8040 bytes
As I understand it, R does not make you declare a variable's data type, so by default any number written without the L suffix is stored as numeric.
If you wrote:
> x <- c(4L, 5L, 6L, 6L)
> class(x)
[1] "integer"
Example of Integer:
> x<- 2L
> print(x)
Example of Numeric (kind of like double/float from other programming languages)
> x<-3.4
> print(x)
Numeric is an umbrella term for several types of classes (e.g. double and integer). Integers are numbers which do not have decimal points and thus are stored with minimal space in memory. Use the integer class only when doing computations with such numbers, otherwise revert to numeric.

Using MPFR And Adding - How many Digits are Correct?

I have a pretty easy question (I think). As much as I've tried, I can not find an answer to this question.
I am creating a function for which I want the user to enter two numbers. The first is the number of terms of a certain infinite series to add together. The second is the number of digits the user would like the truncated sum to be accurate to.
Say the terms of the sequence are a_i. How much precision n would be required in mpfr to guarantee that the result of adding these a_i, from i=0 up to the user's entered value, is accurate to the number of digits the user needs?
By the way, I'm adding the a_i in a naive way.
Any help will be much appreciated.
Thanks,
Rick
You can convert between decimal digits of precision, d, and binary digits of precision, b, with logarithms
b = d × log(10) / log(2)
A little rearranging shows why
b × log(2) = d × log(10)
log(2^b) = log(10^d)
2^b = 10^d
Each term of the series (and each addition) will introduce a rounding error at the least significant digit so, assuming each of the t terms involves n (two argument) arithmetic operations, you will want to add an extra
log(t * (n+2))/log(2)
bits.
You'll need to round the number of bits of precision up to be sure that you have enough room for your decimal digits of precision
b = ceil((d*log(10.0) + log(t*(n+2)))/log(2.0));
Finally, you should be aware that the terms may introduce cancellation errors, in which case this simple calculation will dramatically underestimate the required number of bits, even assuming I've got it right in the first place ;-)
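For what it's worth, the formula above is easy to wrap in a small function. This is only a sketch in R of that same estimate (bits_needed is a made-up name), with d the decimal digits wanted, t the number of terms and n the number of operations per term; it carries the same caveat about cancellation.

```r
# d: decimal digits wanted, t: number of terms, n: operations per term
bits_needed <- function(d, t, n) {
  ceiling((d * log(10) + log(t * (n + 2))) / log(2))
}

# e.g. 20 digits, 1000 terms, 3 operations per term
b <- bits_needed(20, 1000, 3)
print(b)  # 79
```

The result is what you would then pass as the working precision when initialising your MPFR variables.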

Resources