There is an option in R to get control over digit display. For example:
options(digits=10)
is supposed to display calculation results with 10 digits until the end of the R session. The R help file defines the digits parameter as follows:
digits: controls the number of digits
to print when printing numeric values.
It is a suggestion only. Valid values
are 1...22 with default 7
So, it says this is a suggestion only. What if I want to always display exactly 10 digits, no more and no fewer?
My second question is: what if I want to display more than 22 digits, e.g. 100 digits for more precise calculations? Is that possible with base R, or do I need an additional package or function?
Edit: Thanks to jmoy's suggestion, I tried sprintf("%.100f",pi) and it gave
[1] "3.1415926535897931159979634685441851615905761718750000000000000000000000000000000000000000000000000000"
which has 48 decimals. Is this the maximum limit R can handle?
The reason it is only a suggestion is that you could quite easily write a print function that ignored the options value. The built-in printing and formatting functions do use the options value as a default.
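A minimal base R sketch of the difference: the digits setting is treated as a number of significant digits and fewer are shown when fewer suffice, whereas formatC() with format = "f" returns strings with a fixed number of decimals. The variable names here are only illustrative.

```r
x <- c(pi, 0.25)

# format() treats `digits` as significant digits and drops what is not needed
format(pi, digits = 10)     # "3.141592654"
format(0.25, digits = 10)   # "0.25"

# formatC() with format = "f" always prints exactly `digits` decimals
fixed <- formatC(x, digits = 10, format = "f")
fixed                       # "3.1415926536" "0.2500000000"
```

Note that the result of formatC() is a character vector, so it is meant for output rather than further arithmetic.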
As to the second question: since R uses finite-precision arithmetic, your answers aren't accurate beyond 15 or 16 significant digits, so in general more aren't required. The gmp and rcdd packages deal with multiple-precision arithmetic (via an interface to the gmp library), but this is mostly related to big integers rather than more decimal places for your doubles.
Mathematica or Maple will allow you to give as many decimal places as your heart desires.
EDIT:
It might be useful to think about the difference between decimal places and significant figures. If you are doing statistical tests that rely on differences beyond the 15th significant figure, then your analysis is almost certainly junk.
On the other hand, if you are just dealing with very small numbers, that is less of a problem, since R can handle numbers as small as .Machine$double.xmin (usually about 2e-308).
Compare these two analyses.
x1 <- rnorm(50, 1, 1e-15)
y1 <- rnorm(50, 1 + 1e-15, 1e-15)
t.test(x1, y1) #Should throw an error
x2 <- rnorm(50, 0, 1e-15)
y2 <- rnorm(50, 1e-15, 1e-15)
t.test(x2, y2) #ok
In the first case, differences between the numbers only occur after many significant figures, so the data are "nearly constant". In the second case, although the differences between the numbers are the same size, they are large compared to the magnitude of the numbers themselves.
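The same effect can be seen without random numbers. This is a small deterministic sketch (the names delta, near_one and near_zero are made up for illustration) showing that an absolute difference of 1e-16 is lost next to 1 but preserved next to 0.

```r
delta <- 1e-16

# Near 1 the spacing between adjacent doubles is about 2.2e-16,
# so adding 1e-16 is rounded away.
near_one <- (1 + delta) == 1     # TRUE

# Near 0 the same absolute difference is easily representable.
near_zero <- (0 + delta) == 0    # FALSE

.Machine$double.eps              # 2^-52, about 2.22e-16
```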
As mentioned by e3bo, you can use multiple-precision floating point numbers using the Rmpfr package.
library(Rmpfr)
mpfr("3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825")
These are slower and more memory intensive to use than regular (double precision) numeric vectors, but can be useful if you have a poorly conditioned problem or unstable algorithm.
If you are producing the entire output yourself, you can use sprintf(), e.g.
> sprintf("%.10f",0.25)
[1] "0.2500000000"
This specifies that you want to format a floating-point number with ten decimal places (in %.10f the f is for float and the .10 specifies ten decimal places).
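For comparison, here is a short sketch of a few other common sprintf() conversions. The example values are arbitrary, and the results are collected in a vector called out only so they can be inspected together.

```r
out <- c(
  sprintf("%.10f", 0.25),       # fixed notation, 10 decimals
  sprintf("%.3e", 12345.6789),  # scientific notation, 3 decimals
  sprintf("%.10g", 0.25),       # at most 10 significant digits
  sprintf("%8.2f", pi)          # 2 decimals, padded to a width of 8
)
out
# "0.2500000000" "1.235e+04" "0.25" "    3.14"
```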
I don't know of any way of forcing R's higher level functions to print an exact number of digits.
Displaying 100 digits does not make sense if you are printing R's usual numbers, since the best accuracy you can get using 64-bit doubles is around 16 decimal digits (look at .Machine$double.eps on your system). The remaining digits will just be junk.
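As a rough illustration, asking for more digits than a double carries just exposes the exact decimal expansion of the stored binary value. That is also why the sprintf("%.100f", pi) output in the question stops after 48 decimals and continues with zeros: the stored value has a finite expansion, and the digits beyond roughly the 16th are not digits of pi. The sketch below uses 0.1 as an assumed example value.

```r
# 0.1 cannot be stored exactly; the extra digits show the stored value
junk <- sprintf("%.20f", 0.1)
junk                          # "0.10000000000000000555"

# 17 significant digits are enough to distinguish any two doubles
format(0.1, digits = 17)      # "0.10000000000000001"

.Machine$double.eps           # relative spacing of doubles, 2^-52
```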
Here is one more solution, which lets you control how many decimal digits to print based on your needs (if you don't want to print redundant zeros).
For example, suppose you have a vector called elements and would like to get its sum:
elements <- c(-1e-05, -2e-04, -3e-03, -4e-02, -5e-01, -6e+00, -7e+01, -8e+02)
sum(elements)
## -876.5432
Apparently the last digit, 1, has been cut off; the ideal result would be -876.54321. But if you set a fixed number of decimals, e.g. sprintf("%.10f", sum(elements)), redundant zeros appear: -876.5432100000
Following the tutorial here: printing decimal numbers, if you can identify how many decimal digits the number has (here, -876.54321 needs 5 decimal digits), you can pass that as a parameter to the formatting function as below:
decimal_length <- 5
formatC(sum(elements), format = "f", digits = decimal_length)
## -876.54321
You can change decimal_length for each query, so it can satisfy different decimal printing requirements.
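If the number of decimals is not known in advance, one possible approach is to format with a generous fixed number of decimals and then strip the trailing zeros. The helper format_trimmed() below is a hypothetical name, not a base R function, and the default of 10 decimals is an arbitrary assumption.

```r
# Hypothetical helper: fixed notation with trailing zeros removed
format_trimmed <- function(x, max_decimals = 10) {
  s <- sprintf(paste0("%.", max_decimals, "f"), x)
  # Remove zeros at the end of the fractional part only
  s <- sub("(\\.\\d*?)0+$", "\\1", s, perl = TRUE)
  # Remove a decimal point left dangling, e.g. "2." becomes "2"
  sub("\\.$", "", s)
}

elements <- c(-1e-05, -2e-04, -3e-03, -4e-02, -5e-01, -6e+00, -7e+01, -8e+02)
format_trimmed(sum(elements))   # "-876.54321"
format_trimmed(2)               # "2"
```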
If you work primarily with tibbles, there is a function that enforces digits: num().
Here is an example:
library(tidyverse)
data <- tribble(
~ weight, ~ weight_selfreport,
81.5,81.66969147005445,
72.6,72.59528130671505,
92.9,93.01270417422867,
79.4,79.4010889292196,
94.6,96.64246823956442,
80.2,79.4010889292196,
116.2,113.43012704174228,
95.4,95.73502722323049,
99.5,99.8185117967332
)
data <-
data %>%
mutate(across(where(is.numeric), ~ num(., digits = 3)))
data
#> # A tibble: 9 × 2
#> weight weight_selfreport
#> <num:.3!> <num:.3!>
#> 1 81.500 81.670
#> 2 72.600 72.595
#> 3 92.900 93.013
#> 4 79.400 79.401
#> 5 94.600 96.642
#> 6 80.200 79.401
#> 7 116.200 113.430
#> 8 95.400 95.735
#> 9 99.500 99.819
Thus you can even choose different rounding options depending on your needs. I find it very helpful and a rather quick solution for printing data frames.
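For a plain data frame without the tidyverse, a rough base R equivalent is to convert the numeric columns to fixed-decimal strings before printing. This is only a sketch for display purposes: the converted columns are character, not numeric, and the names df, is_num and shown are illustrative.

```r
df <- data.frame(
  weight            = c(81.5, 72.6),
  weight_selfreport = c(81.66969147005445, 72.59528130671505)
)

# Format every numeric column with exactly 3 decimals (result is character)
is_num <- vapply(df, is.numeric, logical(1))
shown <- df
shown[is_num] <- lapply(df[is_num], formatC, format = "f", digits = 3)
shown
#   weight weight_selfreport
# 1 81.500            81.670
# 2 72.600            72.595
```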
I just noticed that R limits numeric values to 7 digits after the decimal point. I need to calculate and output numeric values with up to 16 digits. Is it possible to exceed the supposed 7-digit limit in R?
As you can see in the example below, it won't output any digits beyond the 7th.
> 0.6431159420289856
[1] 0.6431159
Desired output of course is
> 0.6431159420289856
[1] 0.6431159420289856
My particular use case requires those values to be output.
You can change the number of significant digits displayed with options(digits = 16) to get your requested output. That said, R does its math on all the digits available, regardless of the display setting.
options(digits = 16)
0.6431159420289856
[1] 0.6431159420289856
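If changing the setting for the whole session is undesirable, a possible variation is to restore the old value afterwards, or to pass digits to a single print() call. This is a sketch using only base R; the variable names are illustrative.

```r
x <- 0.6431159420289856

format(x, digits = 7)         # "0.6431159", what the default setting shows

# Change the option temporarily and restore the previous value afterwards
old <- options(digits = 16)
print(x)
options(old)

# Or request more digits for one call only
print(x, digits = 16)
```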
Maybe a daft question, but why does R remove the significant 0 at the end of a number? For example, 1.250 becomes 1.25, which does not express the same accuracy. I have been trying to calculate the number of significant digits of a number by using as.character() in combination with gsub() and regular expressions (following various posts), but I get the wrong result for numbers such as 1.250, since as.character() removes the trailing 0. Therefore the answer for 1.250 comes out as 2 digits rather than 3, which is the correct value.
To be more specific why this is an issue for me:
I have long tables in Word consisting of bond lengths in the format, e.g., 1.2450(20).
The number in parentheses is the uncertainty in the measurement, which means that the real value is somewhere between 1.2450+0.0020 and 1.2450-0.0020. I have imported all these data from Word into a large data frame like so:
df<-data.frame(Activity = c(69790, 201420, 17090),
WN1=c(1.7598, 1.759, 1.760),
WN1sd=c(17, 15, 3))
My aim is to plot the WN1 values against activity, with error bars. This means that I would need to manually convert WN1sd to WN1sd=c(0.0017, 0.015, 0.003), which is not the R way to go, hence the need to obtain the number of significant digits of WN1. This works fine for the first two WN1 values but not for the third, since R mistakenly thinks that the last 0 is not significant.
You have to prepare the standard deviations at the time you import your data from your Word document.
At some point you should have strings like this:
"1.2345(89)" "4.230(34)" "3.100(7)"
Here is a function you can apply to those strings to get the sd right:
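split.mean.sd <- function(mean.sd) {
  # Value before the parenthesis, e.g. "1.2345" from "1.2345(89)"
  mean <- gsub("(.*)\\(.*", "\\1", mean.sd)
  # Uncertainty digits inside the parentheses, e.g. "89"
  sd <- gsub(".*\\((.*)\\)", "\\1", mean.sd)
  # Count decimals while the value is still a string, so trailing zeros count
  digits.after.dot <- nchar(gsub(".*\\.(.*).*", "\\1", mean))
  # Scale the uncertainty to the last printed decimal place
  sd <- as.numeric(sd) * 10^(-digits.after.dot)
  mean <- as.numeric(mean)
  c(mean, sd)
}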
For example:
v <- c("1.2345(89)","4.230(34)","3.100(7)")
sapply(v, split.mean.sd)
gives you
1.2345(89) 4.230(34) 3.100(7)
[1,] 1.2345 4.230 3.100
[2,] 0.0089 0.034 0.007
Most programming languages, R included, do not track the number of significant digits of floating-point values. In many cases significant digits are not needed, and tracking them would significantly slow down computations and require more RAM.
You may be interested in libraries for computations with uncertainties, such as the errors package.
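A small sketch of what this means in practice: once a value has been parsed as a number, the trailing zero is gone, so any digit counting has to happen while the value is still text. The helper count_decimals() is a hypothetical name introduced only for this illustration.

```r
# Both literals are parsed to the same double
same <- identical(1.250, 1.25)   # TRUE
as.character(1.250)              # "1.25"

# Hypothetical helper: count decimals on the original string
count_decimals <- function(s) nchar(sub("^[^.]*\\.?", "", s))
count_decimals(c("1.250", "1.7598", "17"))   # 3 4 0
```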
I am having the following issue. I have numeric values with about 10-20 decimals. I am writing those values to a .csv file (via write.csv2()) and to a database (via sqlSave() from the RODBC package). All decimals are relevant to me.
Unfortunately, R rounds the numbers after the 8th decimal.
As an example:
5655698.697843645699322
Becomes:
5655698,69784365
I tried to increase the number of digits (digits in options()), but this only affects the number of digits I see in the console.
I tried format(, digits = x) and this works. However, since I have a large number of columns, this is quite costly and does not look like a clean solution.
Is there another way to increase the number of digits when writing to the .csv file and to the database?
R uses 64-bit IEEE double precision as its base numeric format. This has a precision limit of about 15-16 significant figures (not decimal places). So R is just writing out the numbers to the limit of their accuracy.
If you want more decimals, you can use a package for arbitrary-precision arithmetic:
http://cran.r-project.org/web/packages/Rmpfr/index.html
http://cran.r-project.org/web/packages/gmp/index.html
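If the goal is only to write every digit a double actually holds, one possible sketch is to convert the double columns to text with 17 significant digits before calling write.csv(), which by default writes numbers with 15 significant digits. The data frame, the column names and the choice of "%.17g" are assumptions for illustration; digits beyond what the double stores are not recovered this way.

```r
df <- data.frame(id = 1:2, value = c(5655698.697843645699322, 0.1 + 0.2))

# Convert every double column to text with 17 significant digits
is_dbl <- vapply(df, is.double, logical(1))
out <- df
out[is_dbl] <- lapply(df[is_dbl], function(col) sprintf("%.17g", col))

path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE, quote = FALSE)

written <- readLines(path)
written[3]    # "2,0.30000000000000004"
```

Because the values are already strings, write.csv2() would not replace the decimal point with a comma in them, so the decimal mark would need to be handled during the conversion step.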
Can you try this (note that it writes a plain text file, not a CSV file)?
x<-5655698.697843645699322
y<-sprintf("%.15f",x)
[1] "5655698.697843645699322"
write(y,"x.txt")
Updated:
y<-format(round(x, 15), nsmall = 15)
write(y,"x.txt")
y<-format(round(x, 25), nsmall = 25)
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, :
invalid 'nsmall' argument
The error means that nsmall is out of range: format() only accepts values of nsmall from 0 to 20.
y<-format(round(x, 25), nsmall = 15)
> y
[1] "5655698.697843645699322"