Why does format() change numbers into characters? - r

Why does format change numbers into characters? Is there a way to force format() to keep output as numeric? This becomes an issue for me when dealing with lists of dataframes.
> number <- 33333
> class(number)
[1] "numeric"
> test1 <- format(number, nsmall = 2)
> class(test1)
[1] "character"
> test2 <- as.numeric(format(number, nsmall = 2))
> class(test2)
[1] "numeric"

The formattable package can return an object that is still numeric but carries display formatting (applied through formatC):
test1 <- formattable::comma(number, digits = 2, big.mark = "")
Checking:
> class(test1)
[1] "formattable" "numeric"
> test1
[1] 33333.00
> test1 + 10
[1] 33343.00
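
If only the displayed form matters, another option is to leave the stored values numeric and call format() on a copy at print time. A minimal sketch, assuming a hypothetical list of data frames `dfs` with a numeric column `x` (both names are invented for illustration):

```r
# Hypothetical data: a list of data frames, each with a numeric column `x`
dfs <- list(
  a = data.frame(x = c(33333, 1.5)),
  b = data.frame(x = c(2, 10.25))
)

# Build formatted copies for display only; the originals stay numeric
dfs_display <- lapply(dfs, function(df) {
  df$x <- format(df$x, nsmall = 2)
  df
})

class(dfs$a$x)          # "numeric"
class(dfs_display$a$x)  # "character"
```

Arithmetic keeps working on `dfs`, while `dfs_display` is what gets printed or exported.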

Related

Rounding Error when converting from character to numeric

I have a data.table of numbers stored as character strings that I am trying to convert to numeric. The numbers are very long, and I want to retain every digit without any rounding by R. For example, the first 5 elements of the data.table:
> TimeO[1]
[1] "20110630224701281482"
> TimeO[2]
[1] "20110630224701281523"
> TimeO[3]
[1] "20110630224701281533"
> TimeO[4]
[1] "20110630224701281548"
> TimeO[5]
[1] "20110630224701281762"
I wrote a function to convert from a character into numeric:
convert_time_fast <- function(tim) {
  b <- tim - tim %/% 10^12 * 10^12
  # hhmmssffffff
  ms <- b %% 10^6; b <- (b - ms) / 10^6
  ss <- b %% 10^2; b <- (b - ss) / 10^2
  mm <- b %% 10^2; hh <- (b - mm) / 10^2
  # if hours>=22, subtract 24 (previous day)
  hh <- hh - (hh >= 22) * 24
  return(hh + mm / 60 + ss / 3600 + ms / (3600 * 10^6))
}
However, rounding occurs in R, so distinct data points now have the same time. See the first 5 elements after converting:
TimeOC <- -convert_time_fast(as.numeric(TimeO))
> TimeOC[1]
[1] 1.216311
> TimeOC[2]
[1] 1.216311
> TimeOC[3]
[1] 1.216311
> TimeOC[4]
[1] 1.216311
> TimeOC[5]
[1] 1.216311
Any help figuring this out would be greatly appreciated!
You should test whether they are really equal, for example with all.equal().
By default R prints only 7 significant digits, but the stored values carry more.
See also this example:
> as.numeric("1.21631114")
[1] 1.216311
> as.numeric("1.21631118")
[1] 1.216311
> all.equal(as.numeric("1.21631114"), as.numeric("1.21631118"))
[1] "Mean relative difference: 3.288632e-08" # which indicates they're not the same
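
It may also be worth checking where the digits go in the question above: a double holds only about 15 to 16 significant decimal digits, so a 20-digit string can lose its trailing digits in as.numeric(), before convert_time_fast() ever runs. A hedged sketch of one workaround, assuming every string has the fixed layout YYYYMMDDhhmmssffffff (the helper name `convert_time_chr` is hypothetical), is to slice the fields out of the string first:

```r
# Hypothetical helper: parse "YYYYMMDDhhmmssffffff" strings field by field,
# so no value ever exceeds the precision of a double
convert_time_chr <- function(tim) {
  hh <- as.numeric(substr(tim, 9, 10))
  mm <- as.numeric(substr(tim, 11, 12))
  ss <- as.numeric(substr(tim, 13, 14))
  us <- as.numeric(substr(tim, 15, 20))   # fractional part, 6 digits
  # same rule as convert_time_fast(): hours >= 22 belong to the previous day
  hh <- hh - (hh >= 22) * 24
  hh + mm / 60 + ss / 3600 + us / (3600 * 10^6)
}

TimeO <- c("20110630224701281482", "20110630224701281523")
res <- convert_time_chr(TimeO)
print(res, digits = 15)
res[1] == res[2]   # FALSE: the two timestamps stay distinct
```

Printing with `digits = 15` is only there to make the difference visible; the stored values differ either way.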

R change vector to pythonic tuple

Hi, I want to turn a vector of strings into a single string in Python tuple format.
Input:
a <- c('stack', 'overflow', 'kicks', 'ass')
Expected Output:
"('stack', 'overflow', 'kicks', 'ass')"
What would be an easy solution to implement?
This is what I have done and I expect there should be an easier solution:
> b <- a[1]
> for(word in a[-1]){ b <- paste(b, word, sep="','") }
> b
[1] "stack','overflow','kicks','ass"
> b <- paste("('", b, "')", sep="")
> b
[1] "('stack','overflow','kicks','ass')"
> paste0("(", paste(sQuote(a), collapse = ","), ")")
[1] "(‘stack’,‘overflow’,‘kicks’,‘ass’)"
> options(useFancyQuotes = FALSE)
> paste0("(", paste(sQuote(a), collapse = ","), ")")
[1] "('stack','overflow','kicks','ass')"
> substring(capture.output(dput(a)), 2)
[1] "(\"stack\", \"overflow\", \"kicks\", \"ass\")"
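
To reproduce the expected output exactly, including the space after each comma, the quoting and joining can be done with two nested paste0() calls. A minimal sketch; the helper name `to_py_tuple` is made up for illustration:

```r
a <- c('stack', 'overflow', 'kicks', 'ass')

# Hypothetical helper: wrap each element in single quotes, join with ", ",
# then surround the result with parentheses
to_py_tuple <- function(x) {
  paste0("(", paste0("'", x, "'", collapse = ", "), ")")
}

to_py_tuple(a)
# [1] "('stack', 'overflow', 'kicks', 'ass')"
```

This sketch does not escape quotes embedded in the strings, and a one-element vector would need a trailing comma to be a valid Python tuple.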

formatting numerical values

I would like to format numerical values, but during formatting they lose their "numeric" quality. Is there a better option?
> values
[1] 5 10 20 30
> class(values[1])
[1] "numeric"
> class(values)
[1] "numeric"
> out<-sprintf("%6.2f",values)
> out
[1] "  5.00" " 10.00" " 20.00" " 30.00"
> class(out)
[1] "character"
> class(out[1])
[1] "character"
out is no longer numeric.
You can use the options of print to change the number of digits printed:
R> print(3.141592, digits=3)
[1] 3.14
You can also set options(digits) to make it more or less permanent in your session:
R> options(digits=3)
R> print(3.141592)
[1] 3.14
But this will not necessarily apply to plots, etc.
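
Another option, not mentioned above, is round(), which changes the stored value but keeps it numeric, leaving sprintf() for the final display step. A small sketch with made-up values:

```r
values <- c(5, 10.126, 20, 30)   # illustrative values, not from the question

rounded <- round(values, 2)           # still numeric, usable in arithmetic
shown   <- sprintf("%6.2f", rounded)  # character, for display only

class(rounded)  # "numeric"
class(shown)    # "character"
sum(rounded)    # arithmetic still works on the rounded values
```

Unlike print(digits = ), round() discards the extra digits, so it fits only when the lost precision is not needed later.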

R: f(x) != sapply(x,f) -- bug or feature?

> f = function(x) as.Date(as.character(x), format='%Y%m%d')
> f(20110606)
[1] "2011-06-06"
> sapply(20110606, f)
[1] 15131
Why are the two returned values not the same? I need to apply this function to a long vector of dates, but I'm not getting dates with sapply()!
The functions you use to create f are already vectorized. There's no need to use sapply, unless you work for the Department of Redundancy Department.
> f <- function(x) as.Date(as.character(x), format='%Y%m%d')
> d <- 20110606 + 0:10
> f(d)
[1] "2011-06-06" "2011-06-07" "2011-06-08" "2011-06-09"
[5] "2011-06-10" "2011-06-11" "2011-06-12" "2011-06-13"
[9] "2011-06-14" "2011-06-15" "2011-06-16"
> lapply(20110606, f)
[[1]]
[1] "2011-06-06"
> unlist(lapply(20110606, f))
[1] 15131
sapply unlists the result of lapply and, in doing so, drops the Date class:
> unclass(lapply(20110606, f)[[1]])
[1] 15131
> class(lapply(20110606, f)[[1]])
[1] "Date"
As @Joshua Ulrich noted, there is no need to use apply-type functions; however, for interest,
d <- 20110606 + 0:10
do.call("c",lapply(d, f))
would be one possible way to "unlist" the dates
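
If sapply() has already been used, the numbers it returns are day counts since 1970-01-01, so the class can be re-attached afterwards. A short sketch comparing that route with do.call(); the variable names are illustrative:

```r
f <- function(x) as.Date(as.character(x), format = '%Y%m%d')
d <- 20110606 + 0:2

# sapply() returns the underlying day counts, so re-attach the Date class
days <- sapply(d, f)
restored <- as.Date(days, origin = "1970-01-01")

# do.call(c, ...) combines the Date objects without dropping the class
combined <- do.call(c, lapply(d, f))

class(restored)            # "Date"
all(restored == combined)  # TRUE
```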

as.matrix not preserving the data mode of an empty data.frame

I found something odd today and wanted to ask whether there is a logical reason for what I am seeing, or whether you think this is a bug that should be reported to the R-devel team:
df <- data.frame(a = 1L:10L)
class(df$a)
# [1] "integer"
m <- as.matrix(df)
class(m[, "a"])
# [1] "integer"
No surprise so far: as.matrix preserves the data mode, here "integer". However, with an empty (no rows) data.frame:
df <- data.frame(a = integer(0))
class(df$a)
# [1] "integer"
m <- as.matrix(df)
class(m[, "a"])
# [1] "logical"
Any idea why the mode changes from "integer" to "logical" here? I am using R version 2.13.1.
Thank you.
This is because of this one line in as.matrix.data.frame:
if (any(dm == 0L)) return(array(NA, dim = dm, dimnames = dn))
Basically, if any dimensions are zero, you get an array "full" of NA. I say "full" because there aren't really any observations because one of the dimensions is zero.
The class is logical because that's the class of NA. There are special NA values for other classes, but they're not really necessary here. For example:
> class(NA)
[1] "logical"
> class(NA_integer_)
[1] "integer"
> class(NA_real_)
[1] "numeric"
> class(NA_complex_)
[1] "complex"
> class(NA_character_)
[1] "character"
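
If downstream code depends on the type, one possible workaround is to re-impose the storage mode after conversion. A minimal sketch, assuming a data frame whose columns all share one type (here a single integer column):

```r
df <- data.frame(a = integer(0))
m <- as.matrix(df)

# Re-impose the storage mode of the source column; dim and dimnames are kept
storage.mode(m) <- typeof(df$a)

typeof(m)  # "integer"
dim(m)     # 0 1
```

For data frames with mixed column types, as.matrix() picks a common type itself, so this sketch would not apply as written.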
