Comma separator for numbers in R?

Is there a function in R to display large numbers separated with commas?
i.e., from 1000000 to 1,000,000.

You can try either format or prettyNum, but both functions return a character vector, so I'd only use them for printing.
> prettyNum(12345.678,big.mark=",",scientific=FALSE)
[1] "12,345.68"
> format(12345.678,big.mark=",",scientific=FALSE)
[1] "12,345.68"
EDIT: As Michael Chirico says in the comment:
Be aware that these have the side effect of padding the printed strings with blank space, for example:
> prettyNum(c(123,1234),big.mark=",")
[1] "  123" "1,234"
Add trim=TRUE to format or preserve.width="none" to prettyNum to prevent this:
> prettyNum(c(123,1234),big.mark=",", preserve.width="none")
[1] "123"   "1,234"
> format(c(123,1234),big.mark=",", trim=TRUE)
[1] "123"   "1,234"
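A minimal sketch of the "only for printing" advice, assuming a hypothetical numeric vector named sales: the numbers stay numeric for arithmetic, and the comma-separated strings are built separately for display.

```r
# Hypothetical values, for illustration only
sales <- c(1234, 1000000, 987)

# Character labels for display; trim = TRUE avoids the padding shown above
labels <- format(sales, big.mark = ",", scientific = FALSE, trim = TRUE)
labels              # "1,234" "1,000,000" "987"
is.character(labels)  # TRUE: these are strings, not numbers

# Arithmetic still works on the original numeric vector
total <- sum(sales)
format(total, big.mark = ",", scientific = FALSE)  # "1,002,221"
```

Calling sum(labels) would fail, which is why the formatted strings are best created at the last step before output.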

See ?format:
> format(1e6, big.mark=",", scientific=FALSE)
[1] "1,000,000"

The other answers posted obviously work - but I have always used
library(scales)
label_comma()(1000000)

I think Joe's comment to MatthewR offers the best answer and should be highlighted:
As of Sept 2018, the scales package (part of the Tidyverse) does exactly this:
> library(scales)
> x <- 10e5
> comma(x)
[1] "1,000,000"
The scales package appears to play very nicely with ggplot2, allowing for fine control of how numerics are displayed in plots and charts.
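As a rough sketch of the ggplot2 use mentioned above, assuming both ggplot2 and a scales version that provides label_comma() are installed. The data frame and its column names are made up for illustration.

```r
library(ggplot2)
library(scales)

# Hypothetical data, for illustration only
df <- data.frame(year = 2018:2020,
                 revenue = c(1200000, 2500000, 4100000))

# label_comma() returns a labelling function; passing it to `labels`
# makes the y-axis ticks read 1,000,000 rather than 1e+06
p <- ggplot(df, aes(year, revenue)) +
  geom_col() +
  scale_y_continuous(labels = label_comma())
```

Printing p draws the plot. The underlying data stay numeric; only the axis labels are formatted.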

Related

R is changing my variable value by itself

I have a data frame that has an id field with values such as these two:
587739706883375310
587739706883375408
The problem is that, when I ask R to show these two numbers, the output that I get is the following:
587739706883375360
587739706883375360
which are not the real values of my ID field, how do I solve that?
For your information: I have executed options(scipen = 999) so that R does not convert my numbers to scientific notation.
This problem also happens in the R console: if I enter these example numbers, I get the same output as shown above.
EDIT: someone asked
dput(yourdata$id)
I did that and the result was:
c(587739706883375360, 587739706883375360, 587739706883375488, 587739706883506560, 587739706883637632, 587739706883637632, 587739706883703040)
To compare, the original data in the csv file is:
587739706883375310,587739706883375408,587739706883375450,587739706883506509,587739706883637600,587739706883637629,587739706883703070
I also did the following test with one of these numbers:
> 587739706883375408
[1] 587739706883375360
> as.double(587739706883375408)
[1] 587739706883375360
> class(as.double(587739706883375408))
[1] "numeric"
> is.double(as.double(587739706883375408))
[1] TRUE
You can use the bit64 package to represent such large numbers:
library(bit64)
as.integer64("587739706883375408")
# integer64
# [1] 587739706883375408
as.integer64("587739706883375408") + 1
# integer64
# [1] 587739706883375409
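Some background on why the values change, plus a base-R sketch for the CSV case. R stores these numbers as doubles, which have a 53-bit significand, so not every integer above 2^53 can be represented exactly. One way to avoid the rounding (an assumption about the workflow, not something from the original post) is to read the ID column as text. The inline CSV below stands in for the real file.

```r
# Above 2^53 a double can no longer distinguish neighbouring integers
limit <- 2^53
limit == limit + 1   # TRUE: 2^53 + 1 is rounded back to 2^53

# Stand-in for the real CSV file (hypothetical content)
csv_text <- "id\n587739706883375310\n587739706883375408"

# Reading the column as character keeps every digit intact
ids <- read.csv(text = csv_text, colClasses = "character")
ids$id
```

IDs are rarely used for arithmetic, so keeping them as character is often enough. If integer behaviour is needed, the strings can be passed to as.integer64() as in the answer above.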

Cyrillic transliteration in R

Are there packages for Cyrillic text transliteration to Latin in R? I need to convert data frames to Latin to use factors. It is somewhat messy to use Cyrillic factors in R.
I have found the package at last.
> library(stringi)
> stri_trans_general("женщина", "cyrillic-latin")
[1] "ženŝina"
> stri_trans_general("женщина", "russian-latin/bgn")
[1] "zhenshchina"
After that, the only issue remaining is the "ё" letter.
> stri_trans_general("Ёж", "russian-latin/bgn")
[1] "Yëzh"
> stri_trans_general("подъезд", "russian-latin/bgn")
[1] "podʺyezd"
> stri_trans_general("мальчик", "russian-latin/bgn")
[1] "malʹchik"
I had to strip the non-ASCII "ë", "ʹ" and "ʺ" characters from the transliterated output:
> iconv(stri_trans_general("ёж", "russian-latin/bgn"),from="UTF-8",to="ASCII",sub="")
[1] "yzh"
Alternatively, one can replace the 'Ё' and 'ё' letters with 'E' and 'e', either before transliteration
> gsub('ё','e',gsub('Ё','E','Ёжики на ёлке'))
[1] "Eжики на eлке"
or after it.
This can also be done with the stringi package as shown above, but with a different transform identifier for Serbian Latin:
`stri_trans_general("жшчћђ", "Serbian-Latin/BGN")`
All characters should be transformed correctly to Serbian Latin.
If one afterwards uses base R to filter the data in Cyrillic, one gets all NAs, but if dplyr is used then everything is fine.
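For the data-frame case in the question, a minimal sketch (assuming stringi is installed; the data frame df and its column sex are hypothetical) is to transliterate only the factor levels, which leaves the underlying codes untouched.

```r
library(stringi)

# Hypothetical data frame with a Cyrillic factor column
df <- data.frame(sex = factor(c("женщина", "мальчик", "женщина")))

# Transliterate the levels in place; each row keeps its original category
levels(df$sex) <- stri_trans_general(levels(df$sex), "russian-latin/bgn")

levels(df$sex)
table(df$sex)
```

Working on levels() rather than on the whole column means each distinct value is converted once, and the column stays a factor.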

Convert mathematical notation to string

The solution might be very simple, but I can't seem to figure it out easily. I have the following number:
a = 1000000
#> a
#[1] 1e+06
I would like to convert "a" to a string, but when I try using toString, it gives the following:
#> toString(a)
#[1] "1e+06"
I would like to get: 1,000,000 instead, with the comma separator. Is that easily feasible?
Thanks!
format(1e6, big.mark=",", scientific=FALSE) or prettyNum(1000000, big.mark=",", scientific=FALSE) should give you the desired result.

knitr - strange behaviour for digits

I ran into some trouble concerning the number of digits printed in knitr.
The number does not correspond to the settings [options('digits')].
I know there was an issue with this about a year ago, but it has been resolved (https://github.com/yihui/knitr/issues/120).
```{r}
packageVersion("knitr")
options("digits")
a <- 100.101
a
as.character(a)
options(digits=4)
a
options(digits=10)
a
```
This is what I get (the same on two different machines): http://rpubs.com/markheckmann/6715 .
Something is going wrong here and I do not have a clue. Any ideas?
I don't think options(digits=10) is doing what you expect. Perhaps you meant
sprintf( "%.10f",101.101)
# [1] "101.1010000000"
This isn't a knitr issue; it's just how R displays digits. Try your code on its own, without knitting.
a <- 100.101
a
#[1] 100.101
as.character(a)
#[1] "100.101"
options(digits=4)
a
#[1] 100.1
options(digits=10)
a
#[1] 100.101
print doesn't pad numbers with zeroes to make up the width; for that you need format.
format(a, nsmall = 10)
#[1] "100.1010000000"
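To put the difference side by side: digits is the maximum number of significant digits, not a fixed number of decimal places, so trailing zeroes are never added. The sketch below passes digits directly to format() instead of changing the global option, which is just a convenience for the comparison.

```r
a <- 100.101

# digits caps the number of significant digits; it never pads
few_digits  <- format(a, digits = 4)    # "100.1"
many_digits <- format(a, digits = 10)   # "100.101"

# nsmall and sprintf both force a fixed number of decimal places
padded   <- format(a, nsmall = 10)      # "100.1010000000"
padded_c <- sprintf("%.10f", a)         # "100.1010000000"
```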

Change default number formatting in R

Is there a way to change the default number formatting in R so that numbers will print a certain way without repeatedly having to use the format() function? For example, I would like to have
> x <- 100000
> x
[1] 100,000
instead of
> x <- 100000
> x
[1] 100000
Well, if you want to save keystrokes, binding the relevant R function to a predefined keystroke is fast and simple in any of the popular text editors.
Aside from that, I suppose you can always just write a small formatting function to wrap your expression in; so for instance:
> fnx = function(x){print(formatC(x, format="d", big.mark=","), quote=FALSE)}
> 567 * 43245
[1] 24519915
> fnx(567 * 43245)
[1] 24,519,915
R has several utility functions that will do this. I prefer "formatC" because it's a little more flexible than 'format' and 'prettyNum'.
In my function above, I wrapped the formatC call in a call to 'print' in order to remove the quotes (") from the output, which I don't like (I prefer to look at 100,000 rather than "100,000").
I don't know how to change the default (in fact, I would advise against it because including the comma turns the number into a character string).
You can do this:
> prettyNum(100000, big.mark=",", scientific=FALSE)
[1] "100,000"
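One possible middle ground, offered as a sketch rather than a way to change R's global default: tag selected vectors with an S3 class whose print method adds the commas. The class name comma_num and the helper as_comma are invented for this example. The values stay numeric; only the printed form changes.

```r
# Hypothetical helper: mark a numeric vector so it prints with commas
as_comma <- function(x) {
  class(x) <- c("comma_num", class(x))
  x
}

# S3 print method: format a plain copy for display, return x unchanged
print.comma_num <- function(x, ...) {
  print(format(unclass(x), big.mark = ",", scientific = FALSE, trim = TRUE),
        quote = FALSE)
  invisible(x)
}

x <- as_comma(100000)
x        # [1] 100,000
y <- x * 2   # still numeric underneath, so arithmetic works
```

Only objects passed through as_comma() print this way; ordinary numbers print as before.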

Resources