How to format a number to two decimal places in R? [duplicate] - r

I have a number, for example 1.128347132904321674821 that I would like to show as only two decimal places when output to screen (or written to a file). How does one do that?
x <- 1.128347132904321674821
EDIT:
The use of:
options(digits=2)
Has been suggested as a possible answer. Is there a way to specify this within a script for one-time use? When I add it to my script it doesn't seem to do anything different and I'm not interested in a lot of re-typing to format each number (I'm automating a very large report).
--
Answer: round(x, digits=2)

Background: Some answers suggested on this page (e.g., signif, options(digits=...)) do not guarantee that a certain number of decimals are displayed for an arbitrary number. I presume this is a design feature in R whereby good scientific practice involves showing a certain number of digits based on principles of "significant figures". However, in many domains (e.g., APA style, business reports) formatting requirements dictate that a certain number of decimal places are displayed. This is often done for consistency and standardisation purposes rather than being concerned with significant figures.
Solution:
The following code shows exactly two decimal places for the number x.
format(round(x, 2), nsmall = 2)
For example:
format(round(1.20, 2), nsmall = 2)
# [1] "1.20"
format(round(1, 2), nsmall = 2)
# [1] "1.00"
format(round(1.1234, 2), nsmall = 2)
# [1] "1.12"
A more general function is as follows where x is the number and k is the number of decimals to show. trimws removes any leading white space which can be useful if you have a vector of numbers.
specify_decimal <- function(x, k) trimws(format(round(x, k), nsmall=k))
E.g.,
specify_decimal(1234, 5)
# [1] "1234.00000"
specify_decimal(0.1234, 5)
# [1] "0.12340"
Discussion of alternatives:
The formatC answers and sprintf answers work fairly well. But they will show negative zeros in some cases which may be unwanted. I.e.,
formatC(c(-0.001), digits = 2, format = "f")
# [1] "-0.00"
sprintf(-0.001, fmt = '%#.2f')
# [1] "-0.00"
One possible workaround to this is as follows:
formatC(as.numeric(as.character(round(-.001, 2))), digits = 2, format = "f")
# [1] "0.00"
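A minimal sketch of another way to avoid the negative zero, assuming it is acceptable to post-process the formatted string. The helper name format_fixed is hypothetical and not part of any answer above; it only strips the sign when every digit of the result is zero.

```r
# Hypothetical helper: fixed-decimal formatting without "-0.00"
format_fixed <- function(x, k) {
  out <- sprintf(paste0("%.", k, "f"), x)
  # "-0.00" becomes "0.00"; genuinely negative results such as "-0.01" are untouched
  sub("^-(0(\\.0+)?)$", "\\1", out)
}

format_fixed(-0.001, 2)
# [1] "0.00"
format_fixed(-0.006, 2)
# [1] "-0.01"
format_fixed(1.5, 2)
# [1] "1.50"
```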

You can format a number, say x, to as many decimal places as you wish. Here x is a number with many decimal places. Suppose we wish to show 8 decimal places of this number:
x = 1111111234.6547389758965789345
y = formatC(x, digits = 8, format = "f")
# [1] "1111111234.65473890"
Here format="f" gives fixed-point notation, i.e. the usual xxx.xxx form, and digits specifies the number of digits after the decimal point. By contrast, to display an integer you would use format="d" (much like sprintf).
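To make the difference between the two format codes concrete, here is a short sketch; the input values are made up for illustration.

```r
# "f": digits counts the decimals shown
fixed <- formatC(3.14159, digits = 2, format = "f")
fixed
# [1] "3.14"

# "d": integer display, here zero-padded to a width of 6
padded <- formatC(42, width = 6, format = "d", flag = "0")
padded
# [1] "000042"

# "d" with a thousands separator
grouped <- formatC(1234567, format = "d", big.mark = ",")
grouped
# [1] "1,234,567"
```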

You can try my package formattable.
> # devtools::install_github("renkun-ken/formattable")
> library(formattable)
> x <- formattable(1.128347132904321674821, digits = 2, format = "f")
> x
[1] 1.13
The good thing is, x is still a numeric vector and you can do more calculations with the same formatting.
> x + 1
[1] 2.13
Even better, the digits are not lost, you can reformat with more digits any time :)
> formattable(x, digits = 6, format = "f")
[1] 1.128347

For 2 decimal places, assuming that you want to keep trailing zeros:
sprintf(5.5, fmt = '%#.2f')
which gives
[1] "5.50"
As @mpag mentions below, R can sometimes give unexpected values with this and with the round method, e.g. sprintf(5.5550, fmt='%#.2f') gives 5.55, not 5.56. This is most likely because 5.555 cannot be represented exactly in binary floating point, so the stored value is slightly below 5.555.
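A quick way to inspect what is going on is to print the value with far more decimals than it was typed with. This is only a sketch; the exact trailing digits can depend on the platform's C library, so no output is shown for the long form.

```r
x <- 5.555

# Show the digits that are actually stored for x
stored <- sprintf("%.20f", x)
stored

# Formatting to two decimals rounds the stored value, not the literal as typed
two_dp <- sprintf("%.2f", x)
two_dp
# [1] "5.55"
```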

Something like this:
options(digits=2)
Definition of the digits option:
digits: controls the number of significant digits to print when printing numeric values.
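For one-time use inside a script, the option can be set and then restored, since options() returns the previous settings. The sketch below also shows why this does not give a fixed number of decimals: digits counts significant digits.

```r
x <- 1.128347132904321674821

old <- options(digits = 2)  # remember the previous setting
print(x)
# [1] 1.1
print(123.456)
# [1] 123
options(old)                # restore the previous setting

# For exactly two decimals, format the value explicitly instead
two_dp <- format(round(x, 2), nsmall = 2)
two_dp
# [1] "1.13"
```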

If you prefer significant digits to fixed digits, then the signif command might be useful:
> signif(1.12345, digits = 3)
[1] 1.12
> signif(12.12345, digits = 3)
[1] 12.1
> signif(12345.12345, digits = 3)
[1] 12300
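The contrast with round() is easiest to see on a small number; a minimal sketch with a made-up value:

```r
small <- 0.00012345

sig <- signif(small, digits = 3)  # keeps 3 significant digits
sig
# [1] 0.000123

dec <- round(small, digits = 3)   # keeps 3 decimal places
dec
# [1] 0
```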

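Check the functions prettyNum and format.
To keep trailing zeros, use the # flag in sprintf, e.g. sprintf(x, fmt='%#.4g'). Note that with %g the precision counts significant digits, so printing 123.124 as 123.1240 needs fmt='%#.7g'.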
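A short sketch of how the # flag and the precision interact for %g, using made-up inputs:

```r
plain <- sprintf("%.4g", 1.5)      # trailing zeros dropped
plain
# [1] "1.5"

kept <- sprintf("%#.4g", 1.5)      # "#" keeps them: 4 significant digits
kept
# [1] "1.500"

four <- sprintf("%#.4g", 123.124)  # still 4 significant digits in total
four
# [1] "123.1"

seven <- sprintf("%#.7g", 123.124) # 7 significant digits
seven
# [1] "123.1240"
```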

The function formatC() can be used to format a number to two decimal places. Two decimal places are given by this function even when the resulting values include trailing zeros.

I'm using this variant to force printing of K decimal places:
# format numeric value to K decimal places
formatDecimal <- function(x, k) format(round(x, k), trim = TRUE, nsmall = k)

Note that numeric objects in R are stored with double precision, which gives you (roughly) 16 decimal digits of precision - the rest will be noise. I grant that the number shown above is probably just for an example, but it is 22 digits long.
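This can be checked directly. The sketch below uses the number from the question; the long printout is left without expected output because the digits beyond the precision limit are not meaningful.

```r
x <- 1.128347132904321674821

sig_bits <- .Machine$double.digits  # bits in the significand of a double
sig_bits
# [1] 53

# Digits beyond roughly the 16th significant one are representation noise
sprintf("%.22f", x)

# A change far below the precision limit does not alter the stored value
unchanged <- (x + 1e-17) == x
unchanged
# [1] TRUE
```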

Looks to me like it would be something like
library(tutoR)
format(1.128347132904321674821, 2)
Per a little online help.

If you just want to round a number or a vector of numbers, simply use
round(data, 2)
This returns data rounded to 2 decimal places.

I wrote this function, which could be improved but seems to work well in corner cases. Note that it truncates rather than rounds. For example, for 0.9995 the top-voted answer gives 1.00, which is incorrect for my purposes. I fall back on that answer's solution when the number has no decimal part.
round_correct <- function(x, digits, chars = TRUE) {
  if (grepl(x = x, pattern = "\\.")) {
    y <- as.character(x)
    # position of the decimal point in the character representation
    pos <- grep(unlist(strsplit(x = y, split = "")), pattern = "\\.", value = FALSE)
    if (chars) {
      return(substr(x = y, start = 1, stop = pos + digits))
    }
    return(
      as.numeric(substr(x = y, start = 1, stop = pos + digits))
    )
  } else {
    # no decimal part: pad with zeros up to the requested number of decimals
    return(
      format(round(x, digits), nsmall = digits)
    )
  }
}
Example:
round_correct(10.59648, digits = 2)
[1] "10.59"
round_correct(0.9995, digits = 2)
[1] "0.99"
round_correct(10, digits = 2)
[1] "10.00"
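If cutting off the extra decimals is the goal, an arithmetic version is also possible. This is only a sketch with a hypothetical name, truncate_decimals, and it inherits floating-point caveats: a product such as x * 10^k can land just below the expected integer for some inputs.

```r
# Hypothetical alternative: truncate (not round) to k decimals arithmetically
truncate_decimals <- function(x, k) {
  format(trunc(x * 10^k) / 10^k, nsmall = k)
}

truncate_decimals(10.59648, 2)
# [1] "10.59"
truncate_decimals(0.9995, 2)
# [1] "0.99"
truncate_decimals(10, 2)
# [1] "10.00"
```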

Here's my approach for values from units to millions.
The digits parameter lets me adjust the minimum number of significant digits (integer + decimals). You could apply decimal rounding inside first.
library(dplyr)  # provides if_else()
number <- function(number) {
  result <- if_else(
    abs(number) < 1000000,
    # below one million: "." as thousands separator, "," as decimal mark
    format(
      number, digits = 3,
      big.mark = ".",
      decimal.mark = ","
    ),
    # one million and above: express in millions and append "MM"
    paste0(
      format(
        number/1000000,
        digits = 3,
        drop0trailing = TRUE,
        big.mark = ".",
        decimal.mark = ","
      ),
      "MM"
    )
  )
  # result <- paste0("$", result)
  return(result)
}

library(dplyr)
# round the numbers
df <- df %>%
  mutate(across(where(is.numeric), .fns = function(x) {format(round(x, 2), nsmall = 2)}))
Here I am changing all numeric values to have only 2 decimal places. If you need a different number of decimal places:
# round the numbers to k decimal places
df <- df %>%
  mutate(across(where(is.numeric), .fns = function(x) {format(round(x, k), nsmall = k)}))
Replace k with the desired number of decimal places. Note that format() returns character strings, so the affected columns are no longer numeric.
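The same idea can be sketched in base R without dplyr. The data frame below is made up for illustration. Note that format() pads the results to a common width within each column.

```r
# Made-up example data
df <- data.frame(id = c("a", "b"), value = c(1.234, 10), stringsAsFactors = FALSE)

# Find the numeric columns and format each to 2 decimal places
num_cols <- vapply(df, is.numeric, logical(1))
df[num_cols] <- lapply(df[num_cols], function(x) format(round(x, 2), nsmall = 2))

df$value
# [1] " 1.23" "10.00"
```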

Related

How to generate a sequence with a recurring motif that is interspersed with random characters

I'm trying to generate a sequence of certain letters containing a repeating motif that is interspersed with random letters.
For example: ABXXXXXXXABXXXXXXXABXXXXXXX, where X = A, B, C or D, selected at random.
I also need to specify the overall length of the sequence, change the letters that repeat, and how often they do so (e.g., to make BC repeat every 5 characters).
Sadly, I have only been able to get as far as generating the random sequence of defined length, containing select characters:
set.seed(42)
x <- sample(letters[c(1, 2, 3, 4)], size=200, replace = TRUE)
Here is a custom function that repeats a fixed pattern followed by chars_repeat random characters. Note that overall_len counts only the random characters, so the result is longer than overall_len:
f1 <- function(x, overall_len, chars_repeat) {
  l1 <- rep(list(x), (overall_len / chars_repeat))
  res <- paste(sapply(l1, function(i)
    paste0(i, paste0(sample(letters[1:4], size = chars_repeat, replace = TRUE), collapse = ''),
           collapse = '')),
    collapse = '')
  return(res)
}
f1('WQ', 32, 8)
#[1] "WQcccdddacWQbacccabcWQccaaaaaaWQabbcddcb"
f1('BC', 20, 4)
#[1] "BCbdbcBCacbdBCdacbBCdbbaBCaccd"
f1('BC', 20, 10)
#[1] "BCdbbabacccaBCbabdbbbaac"
f1('AAA', 40, 5)
#[1] "AAAabcacAAAdbcbcAAAbdbdcAAAadcdcAAAcadbdAAAddaacAAAadcabAAAdbabb"
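If the total length has to be exact, as in the ABXXXXXXX example from the question, one option is to generate the whole random sequence first and then overwrite the start of each period with the motif. This is a sketch under that reading of the question; make_motif_seq and its parameter names are hypothetical.

```r
# Hypothetical helper: the motif starts every `period` characters and the
# result is exactly `total_len` characters long.
make_motif_seq <- function(motif, total_len, period, alphabet = c("A", "B", "C", "D")) {
  motif_chars <- strsplit(motif, "")[[1]]
  stopifnot(length(motif_chars) <= period)
  # Start from a fully random sequence of the requested length
  chars <- sample(alphabet, size = total_len, replace = TRUE)
  # Overwrite the beginning of each period with the motif
  for (start in seq(1, total_len, by = period)) {
    idx <- start + seq_along(motif_chars) - 1
    keep <- idx <= total_len  # drop motif characters that would run past the end
    chars[idx[keep]] <- motif_chars[keep]
  }
  paste(chars, collapse = "")
}

set.seed(42)
s <- make_motif_seq("AB", total_len = 27, period = 9)
nchar(s)
# [1] 27
substring(s, c(1, 10, 19), c(2, 11, 20))
# [1] "AB" "AB" "AB"
```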
Building a function that uses stringi and a for loop:
library(stringi)
generateRandomSequence <- function(fixedPart, randomLength, repetitions) {
  output <- ""
  for (i in 1:repetitions) {
    newPart <- paste(fixedPart, stri_rand_strings(1, randomLength), sep = "")
    output <- paste(output, newPart, sep = "")
  }
  return(output)
}
We can call the function:
generateRandomSequence("AB",5,2)
Giving the result: "ABuwHpdABWj8eh"
The first parameter, "AB", is the repeating sequence. The second parameter is the number of random characters inserted after each occurrence of the repeating sequence. The third parameter controls the number of repetitions. By default stri_rand_strings draws from [A-Za-z0-9]; pass pattern = "[A-D]" to restrict the random letters to A-D.

R: Change Vector Output to Several Ranges

I am using Jenks Natural Breaks via the BAMMtools package to segment my data in RStudio Version 1.0.153. The output is a vector that shows where the natural breaks occur in my data set, as such:
[1] 14999 41689 58415 79454 110184 200746
I would like to take the output above and create the ranges inferred by the breaks. Ex: 14999-41689, 41690-58415, 58416-79454, 79455-110184, 110185-200746
Are there any functions that I can use in R Studio to accomplish this? Thank you in advance!
Input data
x <- c(14999, 41689, 58415, 79454, 110184, 200746)
If you want the ranges as characters you can do
y <- x; y[1] <- y[1] - 1 # First range given in question doesn't follow the pattern. Adjusting for that
paste(head(y, -1) + 1, tail(y, -1), sep = '-')
#[1] "14999-41689" "41690-58415" "58416-79454" "79455-110184" "110185-200746"
If you want a list of the actual sets of numbers in each range you can do
seqs <- Map(seq, head(y, -1) + 1, tail(y, -1))
You can definitely create your own function that produces the exact output you're looking for, but you can also use the cut function, which gives you something like this:
# example vector
x = c(14999, 41689, 58415, 79454, 110184, 200746)
# use the vector and its values as breaks
ranges = cut(x, x, dig.lab = 6)
# see the levels
levels(ranges)
#[1] "(14999,41689]" "(41689,58415]" "(58415,79454]" "(79454,110184]" "(110184,200746]"
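The two answers can be combined: build the range labels from the breaks and hand them to cut(), so each data point is tagged with the range it falls in. This is a sketch; the values vector is made up, and include.lowest = TRUE is assumed so that the smallest break itself lands in the first range.

```r
breaks <- c(14999, 41689, 58415, 79454, 110184, 200746)

# Lower bounds start one above the previous break, except for the first range
lower <- c(breaks[1], head(breaks[-1], -1) + 1)
upper <- breaks[-1]
range_labels <- paste(lower, upper, sep = "-")

values <- c(14999, 41690, 200746)  # made-up data points
bins <- cut(values, breaks = breaks, labels = range_labels, include.lowest = TRUE)
as.character(bins)
# [1] "14999-41689"   "41690-58415"   "110185-200746"
```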

Numeric Matching / Extracting with Hard Coded Values in R

Having trouble understanding numeric matching / indexing in R.
If I have a situation where I create a dataframe such as:
options(digits = 3)
x <- seq(from = 0, to = 5, by = 0.10)
TestDF <- data.frame(x = x, y = dlnorm(x))
and I wanted to compare a hardcoded value to my y column -
> TestDF[TestDF$y == 0.0230,]$x
numeric(0)
That being said, it works if I compare to the value that's straight out of the dataframe (which for an x value of 4.9, should be a y value of 0.0230).
> TestDF[TestDF$y == TestDF[50,]$y,]$x
[1] 4.9
Does this have to do with exact matching? If I limit the digits to 3 decimal places, then 0.0230000 won't be the same as the original value in y I'm comparing to? If this is the case, is there a way around it if I do need to extract values based on rounded, hard-coded values?
You can use the round() function to round y to the desired number of decimal places before comparing. See below.
set.seed(1L)
x <- seq(from = 0, to = 5, by = 0.10)
TestDF <- data.frame(x = x, y = dlnorm(x))
constant <- 0.023
TestDF[ with(TestDF, round(y, 3) == constant), ]
#      x          y
# 50 4.9 0.02302884
You can compare the rounded y with the stated value:
> any(TestDF$y == 0.0230)
[1] FALSE
> any(round(TestDF$y, 3) == 0.0230)
[1] TRUE
I'm not certain you grok the meaning of the digits option. From ?options it says about digits
digits: controls the number of significant digits to print when printing numeric values.
(emphasis mine.) So this only affects how the values are printed, not how they are stored.
You generated a set of reals, none of which are exactly 0.0230. This has nothing to do with exact matching. The value you indicated should be 0.0230 is actually stored as
> with(TestDF, print(y[50], digits = 22))
[1] 0.02302883835550340041465
regardless of the digits setting in options because that setting only affects the printed value. And the issue is not exact matching because even with the small fudge allowed by the recommended way to do comparisons, all.equal(), y[50] and 0.0230 are still not equal
> with(TestDF, all.equal(0.0230, y[50]))
[1] "Mean relative difference: 0.001253842"
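If the hard-coded value is meant as "0.023 to three decimals", an explicit tolerance expresses that intent without rounding y. A minimal sketch; the tolerance of 0.0005 (half of the last stated decimal place) is an assumption and should be chosen to suit the data.

```r
x <- seq(from = 0, to = 5, by = 0.10)
TestDF <- data.frame(x = x, y = dlnorm(x))

target <- 0.0230
tol <- 0.0005  # assumed tolerance: half of the last stated decimal place

# Rows whose y lies within the tolerance of the hard-coded value
hits <- which(abs(TestDF$y - target) < tol)
hits
# [1] 50
TestDF$x[hits]
# [1] 4.9
```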

Fixing rounded R values in xtable in knitr

Has anyone come up with a solution to adjust rounded R values shown in a knitr document, either as a stand-alone \Sexpr{} or through xtable? Typing ?round returns: "Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘go to the even digit’."
My problem is the following scenario when showing calculated numbers from a dataframe using xtable. If the values were each shown in a separate column in a table, the reader would assume there is a calculation error:
2.5 + 3.1 = 5.6
would show up as
2 + 3 = 6
when R rounds the numbers (I have set the significant digits to 0 since the audience doesn't need more detail). This situation could potentially happen no matter how many decimal places are shown (and I would like to avoid showing any!).
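The scenario can be reproduced directly in the console:

```r
a <- 2.5
b <- 3.1

round(a)      # a 5 is rounded to the even digit
# [1] 2
round(b)
# [1] 3
round(a + b)  # the sum 5.6 rounds up
# [1] 6
```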
I use the following for inline expressions, however I rarely insert a number into the paragraph and it usually isn't shown as a calculation. This will show 1 decimal place for numbers less than 10 and greater than -10 and should round up on even numbers ending with 0.5.
number_hook <- function(x) {
  if (is.numeric(x)) {
    if (x < 10 & x > 0 | x < 0 & x > -10) {
      y = prettyNum(x,
                    small.mark = ".",
                    digits = 2)
      return(y)
    } else if (sign(x) == 1) {
      y = x + 0.5
      y = trunc(y)
      y = prettyNum(y, big.mark = ",", small.mark = ".", digits = 0)
      return(y)
    } else if (sign(x) == -1) {
      y = x - 0.5
      y = trunc(y)
      y = prettyNum(y, big.mark = ",", small.mark = ".", digits = 0)
      return(y)
    }
  } else {
    x
  }
}
Any help, work-arounds, or suggestions are appreciated! Thank you!
I have also visited this similar question.
My original recommendation did not work correctly. First you should modify your original function.
number_hook <- function(x) {
  ifelse(abs(x) < 10 & abs(x) > 0, prettyNum(x, small.mark = ",", digits = 2), trunc(x))
}
This should reduce the number of if statements. Then you can use the following to apply the function to every numeric column in your data frame:
xtable::xtable(dplyr::mutate_if(iris, is.numeric, number_hook))
Try it on:
foo <- data.frame(a = rnorm(10), b = rnorm(10, 10), c = rnorm(10, -10))
xtable::xtable(dplyr::mutate_if(foo, is.numeric, number_hook))
And you should get the values that you need.
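If the goal is for a trailing 5 to always round away from zero, as the x + 0.5 and trunc() steps in the question attempt, a vectorised helper is one possible sketch. The name round_half_up is hypothetical, and inputs that are not exactly representable in binary can still fall on the unexpected side of a 5.

```r
# Hypothetical helper: round halves away from zero instead of to the even digit
round_half_up <- function(x, digits = 0) {
  m <- 10^digits
  sign(x) * floor(abs(x) * m + 0.5) / m
}

round(c(0.5, 1.5, 2.5))          # base R: halves go to the even digit
# [1] 0 2 2
round_half_up(c(0.5, 1.5, 2.5))
# [1] 1 2 3
round_half_up(-2.5)
# [1] -3
```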
