Writing big integers in normal (not scientific) format [duplicate] - r

I have a dataframe with a column of p-values, and I want to make a selection on these p-values.
> pvalues_anova
[1] 9.693919e-01 9.781728e-01 9.918415e-01 9.716883e-01 1.667183e-02
[6] 9.952762e-02 5.386854e-01 9.997699e-01 8.714044e-01 7.211856e-01
[11] 9.536330e-01 9.239667e-01 9.645590e-01 9.478572e-01 6.243775e-01
[16] 5.608563e-01 1.371190e-04 9.601970e-01 9.988648e-01 9.698365e-01
[21] 2.795891e-06 1.290176e-01 7.125751e-01 5.193604e-01 4.835312e-04
Selection way:
anovatest<- results[ - which(results$pvalues_anova < 0.8) ,]
The selection works fine when I run it in R. But when I run it in another application (Galaxy), the values whose exponent is not e-01, e.g. 4.835312e-04, are not thrown out.
Is there another way to notate p-values, like 0.0004835312 instead of 4.835312e-04?
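For reference, a minimal sketch of the selection itself. The data frame below is a hypothetical stand-in for `results`, and the idea that the column might reach R as text is only an assumption about the Galaxy setup, not something the question confirms. The point is that the comparison works on numeric values, whatever notation is used to display them.

```r
# Hypothetical data frame standing in for `results` from the question.
results <- data.frame(
  id = 1:5,
  pvalues_anova = c("9.693919e-01", "1.667183e-02", "8.714044e-01",
                    "2.795891e-06", "4.835312e-04"),
  stringsAsFactors = FALSE
)

# Assumption: the column arrived as text; convert it back to numbers first.
results$pvalues_anova <- as.numeric(results$pvalues_anova)

# Logical indexing keeps the rows with p >= 0.8. Unlike -which(...), it does
# not return zero rows when no value falls below the threshold.
keep <- results$pvalues_anova >= 0.8
anovatest <- results[keep, ]
anovatest$id
```

Filtering with a logical vector avoids the edge case where `which()` returns an empty vector and the negative index then selects nothing.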

You can effectively remove scientific notation in printing with this code:
options(scipen=999)
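A small sketch of what this option changes, assuming you want to restore the previous setting afterwards. The variable names are only illustrative.

```r
# Start from R's default penalty so the comparison is predictable,
# and remember the previous options so they can be restored.
old <- options(scipen = 0)
default_text <- format(1e5)     # "1e+05"

options(scipen = 999)           # strongly prefer fixed notation
fixed_text <- format(1e5)       # "100000"
small_text <- format(1e-05)     # "0.00001"

options(old)                    # restore whatever was set before
```

The option only affects how numbers are turned into text; the stored values are unchanged.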

format(99999999,scientific = FALSE)
gives
99999999

Summarising all existing answers
(and adding a few points of my own)
Note: in the explanation below, value is the number to be represented in some (integer/float) format.
Solution 1 :
options(scipen=999)
Solution 2 :
format(value, scientific=FALSE);
Solution 3 (only for values within R's integer range, roughly ±2.1e9; any fractional part is dropped):
as.integer(value);
Solution 4 :
You can use integers, which don't get printed in scientific notation. You can mark a number as an integer by putting an "L" after it
paste(100000L)
will print 100000
Solution 5 :
Control formatting tightly using 'sprintf()'
sprintf("%6d", 100000)
will print 100000
Solution 6 :
prettyNum(value, scientific = FALSE, digits = 16)
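To compare a few of these on one concrete input, here is a short sketch. The value is a hypothetical example chosen to be larger than R's 32-bit integer range, which is where Solution 3 stops working.

```r
# Hypothetical value, larger than .Machine$integer.max (2147483647).
value <- 123456789012

fixed_format  <- format(value, scientific = FALSE)    # "123456789012"
fixed_sprintf <- sprintf("%.0f", value)               # "123456789012"

# as.integer() cannot hold this value: it returns NA (with a warning).
as_int <- suppressWarnings(as.integer(value))
is.na(as_int)                                         # TRUE
```

`format()` and `sprintf()` return character strings, so they are suited to output rather than to further arithmetic.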

I also find the prettyNum(..., scientific = FALSE) function useful for printing when I don't want trailing zeros. Note that these functions are meant for printing: their output is a character string, not a number.
p_value <- c(2.45496e-5, 3e-17, 5.002e-5, 0.3, 123456789.123456789)
format(p_value, scientific = FALSE)
#> [1] " 0.00002454960000000" " 0.00000000000000003"
#> [3] " 0.00005002000000000" " 0.29999999999999999"
#> [5] "123456789.12345679104328156"
format(p_value, scientific = FALSE, drop0trailing = TRUE)
#> [1] " 0.0000245496" " 0.00000000000000003"
#> [3] " 0.00005002" " 0.29999999999999999"
#> [5] "123456789.12345679104328156"
# Please note that the last number's last two digits are rounded:
prettyNum(p_value, scientific = FALSE, digits = 16)
#> [1] "0.0000245496" "0.00000000000000003" "0.00005002"
#> [4] "0.3" "123456789.1234568"
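Because the formatted values are strings, one way to use them is to keep the numeric vector for any comparison and add the formatted text only as a label. A minimal sketch with illustrative names (`labels`, `selected`) and an arbitrary 0.05 threshold:

```r
p_value <- c(2.45496e-5, 5.002e-5, 0.3)

# Text for display only; trim = TRUE avoids padding to a common width.
labels <- format(p_value, scientific = FALSE, drop0trailing = TRUE, trim = TRUE)
is.character(labels)          # TRUE

# Do the selection on the numeric values, not on the strings.
selected <- p_value < 0.05

out <- data.frame(p_value = p_value, label = labels, selected = selected,
                  stringsAsFactors = FALSE)
out
```

If the text ever has to become numeric again, `as.numeric(labels)` converts it back, up to the number of digits that were printed.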

Related

R Column calculated into exponential/ e numbers rather than decimal [duplicate]

Show number instead of e in axis [duplicate]

Overriding scientific notation in ggplot's legend [duplicate]

knitr vs. interactive R behaviour

I am reposting my problem here, after I noticed that this is the approach recommended by knitr's author for getting more help.
I am a bit puzzled by a .Rmd file that I can process line by line in an interactive R session, and also with R CMD BATCH, but that fails when using knit("test.Rmd"). I am not sure where the problem lies, and I tried to narrow it down as much as I could. Here is the example (in test.Rmd):
```{r Rinit, include = FALSE, cache = FALSE}
opts_knit$set(stop_on_error = 2L)
library(adehabitatLT)
```
The functions to be used later:
```{r functions}
ld <- function(ltraj) {
if (!inherits(ltraj, "ltraj"))
stop("ltraj should be of class ltraj")
inf <- infolocs(ltraj)
df <- data.frame(
x = unlist(lapply(ltraj, function(x) x$x)),
y = unlist(lapply(ltraj, function(x) x$y)),
date = unlist(lapply(ltraj, function(x) x$date)),
dx = unlist(lapply(ltraj, function(x) x$dx)),
dy = unlist(lapply(ltraj, function(x) x$dy)),
dist = unlist(lapply(ltraj, function(x) x$dist)),
dt = unlist(lapply(ltraj, function(x) x$dt)),
R2n = unlist(lapply(ltraj, function(x) x$R2n)),
abs.angle = unlist(lapply(ltraj, function(x) x$abs.angle)),
rel.angle = unlist(lapply(ltraj, function(x) x$rel.angle)),
id = rep(id(ltraj), sapply(ltraj, nrow)),
burst = rep(burst(ltraj), sapply(ltraj, nrow)))
class(df$date) <- c("POSIXct", "POSIXt")
attr(df$date, "tzone") <- attr(ltraj[[1]]$date, "tzone")
if (!is.null(inf)) {
nc <- ncol(inf[[1]])
infdf <- as.data.frame(matrix(nrow = nrow(df), ncol = nc))
names(infdf) <- names(inf[[1]])
for (i in 1:nc) infdf[[i]] <- unlist(lapply(inf, function(x) x[[i]]))
df <- cbind(df, infdf)
}
return(df)
}
ltraj2sldf <- function(ltr, proj4string = CRS(as.character(NA))) {
if (!inherits(ltr, "ltraj"))
stop("ltr should be of class ltraj")
df <- ld(ltr)
df <- subset(df, !is.na(dist))
coords <- data.frame(df[, c("x", "y", "dx", "dy")], id = as.numeric(row.names(df)))
res <- apply(coords, 1, function(dfi) Lines(Line(matrix(c(dfi["x"],
dfi["y"], dfi["x"] + dfi["dx"], dfi["y"] + dfi["dy"]),
ncol = 2, byrow = TRUE)), ID = format(dfi["id"], scientific = FALSE)))
res <- SpatialLinesDataFrame(SpatialLines(res, proj4string = proj4string),
data = df)
return(res)
}
```
I load the object and apply the `ltraj2sldf` function:
```{r fail}
load("tr.RData")
juvStp <- ltraj2sldf(trajjuv, proj4string = CRS("+init=epsg:32617"))
dim(juvStp)
```
Using knit("test.Rmd") fails with:
label: fail
Quitting from lines 66-75 (test.Rmd)
Error in SpatialLinesDataFrame(SpatialLines(res, proj4string =
proj4string), (from <text>#32) :
row.names of data and Lines IDs do not match
Running the same call directly in the R console after the error occurred works as expected.
The problem is related to the way format produces the ID (in the apply call of ltraj2sldf), just before ID 100,000: in an interactive session, R gives "99994", "99995", "99996", "99997", "99998", "99999", "100000"; under knitr, R gives "99994", " 99995", " 99996", " 99997", " 99998", " 99999", "100000", with additional leading spaces.
Is there any reason for this behaviour? Why should knitr behave differently from a direct call in R? I have to admit I'm having a hard time with this one, since I cannot debug it (it works in an interactive session)!
Any hint will be much appreciated. I can provide the .RData if it helps (the file is 4.5 MB), but I'm mostly interested in why such a difference happens. I tried without success to come up with a self-contained reproducible example, sorry about that. Thanks in advance for any contribution!
After a comment from baptiste, here are some more details about how the IDs are generated. Basically, the ID is generated for each row of the data frame by an apply call, which in turn uses format like this: format(dfi["id"], scientific = FALSE). Here, the column id is simply a sequence from 1 to the number of rows (1:nrow(df)). scientific = FALSE just ensures that I don't get results such as 1e+05 for 100000.
Exploring the ID generation shows that the problem only occurs for the IDs presented in the first message, i.e. 99995 to 99999, for which a leading space is added. This should not happen with this format call, since I did not ask for a specific number of digits in the output. For instance:
> format(99994:99999, scientific = FALSE)
[1] "99994" "99995" "99996" "99997" "99998" "99999"
However, if the IDs are generated in chunks, it might occur:
> format(99994:100000, scientific = FALSE)
[1] " 99994" " 99995" " 99996" " 99997" " 99998" " 99999" "100000"
Note that the same values processed one at a time give the expected result:
> for (i in 99994:100000) print(format(i, scientific = FALSE))
[1] "99994"
[1] "99995"
[1] "99996"
[1] "99997"
[1] "99998"
[1] "99999"
[1] "100000"
In the end, it is exactly as if the IDs were not prepared one at a time (as I would expect from a row-wise apply call) but, in this case, 6 at a time, and only when close to 1e+05... And of course, only when using knitr, not interactive or batch R.
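The padding seen when several values are formatted together can be reproduced on a short integer vector. This is only a sketch of format()'s common-width behaviour and of the trim argument; it does not reproduce the knitr-specific case.

```r
ids <- 99998:100000

# Formatting a vector pads every element to a common width.
padded <- format(ids, scientific = FALSE)
padded     # " 99998" " 99999" "100000"

# trim = TRUE suppresses the padding.
trimmed <- format(ids, scientific = FALSE, trim = TRUE)
trimmed    # "99998" "99999" "100000"
```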
Here is my session information:
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.2 adehabitatLT_0.3.12 CircStats_0.2-4
[4] boot_1.3-9 MASS_7.3-27 adehabitatMA_0.3.6
[7] ade4_1.5-2 sp_1.0-11 basr_0.5.3
loaded via a namespace (and not attached):
[1] digest_0.6.3 evaluate_0.4.4 formatR_0.8 fortunes_1.5-0
[5] grid_3.0.1 lattice_0.20-15 stringr_0.6.2 tools_3.0.1
Both Jeff and baptiste were indeed right! This is an options problem, related to the digits option. I managed to come up with a working minimal example (e.g. in test.Rmd):
Simple reproducible example: df1 is a data frame of 110,000 rows,
with 2 random normal variables + an `id` variable which is a sequence
from 1 to the number of rows.
```{r example}
df1 <- data.frame(x = rnorm(110000), y = rnorm(110000), id = 1:110000)
```
From this, we create a `id2` variable using `format` and `scientific =
FALSE` to have results with all numbers instead of scientific
notations (e.g. 100,000 instead of 1e+05):
```{r example-continued}
df1$id2 <- apply(df1, 1, function(dfi) format(dfi["id"], scientific = FALSE))
df1$id2[99990:100010]
```
It works as expected using R interactively, resulting in:
[1] "99990" "99991" "99992" "99993" "99994" "99995" "99996"
[8] "99997" "99998" "99999" "100000" "100001" "100002" "100003"
[15] "100004" "100005" "100006" "100007" "100008" "100009" "100010"
However, the results are quite different using knit:
> library(knitr)
> knit("test.Rmd")
[...]
## [1] "99990" "99991" "99992" "99993" "99994" " 99995" " 99996"
## [8] " 99997" " 99998" " 99999" "100000" "100001" "100002" "100003"
## [15] "100004" "100005" "100006" "100007" "100008" "100009" "100010"
Note the additional leading spaces after 99994. The difference actually comes from the digits option, as rightly suggested by Jeff: R uses 7 by default, while knitr uses 4. This difference affects the output of format, although I don't really understand what's going on here. R-style:
> options(digits = 7)
> format(99999, scientific = FALSE)
[1] "99999"
knitr-style:
> options(digits = 4)
> format(99999, scientific = FALSE)
[1] " 99999"
But it should affect all numbers, not just after 99994 (well, to be honest, I don't even understand why it's adding leading spaces at all):
> options(digits = 4)
> format(c(1:10, 99990:100000), scientific = FALSE)
[1] " 1" " 2" " 3" " 4" " 5" " 6" " 7"
[8] " 8" " 9" " 10" " 99990" " 99991" " 99992" " 99993"
[15] " 99994" " 99995" " 99996" " 99997" " 99998" " 99999" "100000"
From this, I have no idea which is at fault: knitr, apply or format? At least, I came up with a workaround, using the argument trim = TRUE in format. It doesn't address the cause of the problem, but it did remove the leading space in the results...
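The trim = TRUE workaround can be sketched on a much smaller data frame. The three-row df1 below is a hypothetical reduction of the example above, and options(digits = 4) is set by hand to mimic the setting reported for knitr.

```r
# Hypothetical three-row version of df1, around the 100,000 boundary.
df1 <- data.frame(x = c(0.5, 1.5, 2.5), id = c(99998, 99999, 100000))

old <- options(digits = 4)   # mimic the digits setting reported for knitr

# trim = TRUE removes any leading padding that format() might add.
df1$id2 <- apply(df1, 1, function(dfi)
  format(dfi["id"], scientific = FALSE, trim = TRUE))

options(old)                 # restore the previous setting
unname(df1$id2)              # "99998" "99999" "100000"
```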
I added a comment to your knitr GitHub issue with this information.
format() adds the extra whitespace when the digits option is not sufficient to display a value but scientific=FALSE is also specified. knitr sets digits to 4 inside code blocks, which causes the behavior you describe:
options(digits=4)
format(99999, scientific=FALSE)
Produces:
[1] " 99999"
While:
options(digits=5)
format(99999, scientific=FALSE)
Produces:
[1] "99999"
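If the IDs are whole numbers, another option is to build them with functions that take an explicit format, so the result does not depend on the digits option. This is an alternative sketch, not the fix discussed above; the variable names are illustrative.

```r
old <- options(digits = 4)   # same setting as in the example above

ids <- c(99999, 100000)

# "%d" expects integers, so convert explicitly.
via_sprintf <- sprintf("%d", as.integer(ids))

# formatC() with format = "d" rounds to whole numbers and does not pad.
via_formatC <- formatC(ids, format = "d")

options(old)                 # restore the previous setting
```

Both calls return "99999" and "100000" without leading spaces.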
Thanks to Aleksey Vorona and Duncan Murdoch, this bug is now fixed in R-devel!
See: https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15411

How can I disable scientific notation?
