Why does apply() convert Date objects to numeric objects? [duplicate] - r

Why does apply() convert my Date objects to numeric before calling the user function?
apply(matrix(seq(as.Date("2010-01-01"), as.Date("2010-01-05"), 1)), 1, function(x) { return(class(x)) })
[1] "numeric" "numeric" "numeric" "numeric" "numeric"
And why doesn't as.Date() have the origin parameter set to "1970-01-01" by default?
> as.Date(apply(matrix(seq(as.Date("2010-01-01"), as.Date("2010-01-05"), 1)), 1, function(x) { return(x) }))
Error in as.Date.numeric(apply(matrix(seq(as.Date("2010-01-01"), as.Date("2010-01-05"), :
'origin' must be supplied
> as.Date(apply(matrix(seq(as.Date("2010-01-01"), as.Date("2010-01-05"), 1)), 1, function(x) { return(x) }), origin="1970-01-01")
[1] "2010-01-01" "2010-01-02" "2010-01-03" "2010-01-04" "2010-01-05"

There is a function seq.Date in the base package that lets you build a sequence of Date objects (it is the method that seq() dispatches to in your example). However, a matrix stores only the underlying atomic values and drops the "Date" class attribute, so apply() sees plain numbers. You will either have to call as.Date() again whenever you need to use the dates, or store them in a data frame, because a data frame column can hold "Date" class values.
As for the default origin for as.Date(), I don't think it makes sense to have 1970 set as the default. What if people are analyzing data from before that date, for whatever reason?
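The point about the matrix can be checked directly. The following is a minimal sketch with illustrative variable names (they are not from the question): matrix() keeps only the underlying day counts, a data frame column keeps the Date class, and passing origin explicitly restores the dates. As far as I know, R 4.3.0 and later no longer require origin for numeric input, so the error shown in the question should appear only on older versions.

```r
# Illustrative names; not from the original question
d <- seq(as.Date("2010-01-01"), as.Date("2010-01-05"), by = 1)

m <- matrix(d)            # matrix() drops the "Date" class attribute
is.numeric(m)             # TRUE: only the day counts since 1970-01-01 remain

df <- data.frame(d = d)   # a data frame column keeps its class
class(df$d)               # "Date"

# Convert the matrix column back; an explicit origin works on any R version
back <- as.Date(m[, 1], origin = "1970-01-01")
```

If the dates have to pass through apply(), converting the result back once at the end, as above, is usually simpler than converting inside the applied function.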

Related

R: What are dates in a dates vector: dates or numeric values? (difference between x[i] and i)

Could anyone explain please why in the first loop each element of my dates vector is a date while in the second each element of my dates vector is numeric?
Thank you!
x <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-02", "2018-05-06"))
class(x)
# Loop 1 - each element is a Date:
for (i in seq_along(x)) print(class(x[i]))
# Loop 2 - each element is numeric:
for (i in x) print(class(i))
The elements are Dates; the first loop is correct.
Unfortunately, R does not treat the style of the second loop consistently. I believe the issue is that the for (i in x) syntax bypasses the Date methods for accessors like [, which it can do because S3 classes in R are very thin and do not stop you from going around their intended interfaces. This can be confusing because something like for (i in 1:4) print(i) works directly, since numeric is a base vector type. Date is an S3 class stored on top of a numeric vector, so the loop variable receives the bare number without the class. To see the numeric objects that are printing in the second loop, you can run this:
x <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-02", "2018-05-06"))
for (i in x) print(i)
#> [1] 17532
#> [1] 17533
#> [1] 17533
#> [1] 17657
which is giving you the same thing as the unclassed version of the Date vector. These numbers are the days since the beginning of Unix time, which you can also see below if you convert them back to Date with that origin.
unclass(x)
#> [1] 17532 17533 17533 17657
as.Date(unclass(x), "1970-01-01")
#> [1] "2018-01-01" "2018-01-02" "2018-01-02" "2018-05-06"
So I would stick to using the proper accessors for any S3 vector types as you do in the first loop.
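As a minimal sketch of that advice (the collector variables are only for illustration), the following compares three ways of looping. Indexing with [ and iterating over as.list(x) both keep the Date class, while iterating over x directly does not.

```r
x <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-02", "2018-05-06"))

# Loop by position: x[i] goes through the Date method for `[`
by_index <- character(0)
for (i in seq_along(x)) by_index <- c(by_index, class(x[i]))

# Loop over a list of one-element Dates: each element keeps its class
by_list <- character(0)
for (d in as.list(x)) by_list <- c(by_list, class(d))

# Loop over the vector itself: the class is dropped
by_value <- character(0)
for (d in x) by_value <- c(by_value, class(d))
```

Here by_index and by_list contain only "Date", and by_value contains only "numeric".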
When you run:
for (i in seq_along(x)) print(class(x[i]))
Here i iterates over the positions of x, and x[i] subsets x through the Date method for [, so each time you get the class of a one-element Date vector.
However, when you run:
for (i in x) print(class(i))
Here i takes the underlying value of each element, so you get the class of that bare value. From ?Date:
Dates are represented as the number of days since 1970-01-01
Which is the reason why you get numeric as your class.
Moreover, if you use print() in each loop, you get dates in the first and numbers in the second:
for (i in seq_along(x)) print(x[i])
[1] "2018-01-01"
[1] "2018-01-02"
[1] "2018-01-02"
[1] "2018-05-06"
and
for (i in x) print(i)
[1] 17532
[1] 17533
[1] 17533
[1] 17657
Lastly, if you want to test R's logic, you can do something like this:
x[1] - as.Date("1970-01-01")
This takes the first element of x ("2018-01-01") and subtracts "1970-01-01", which is the origin date. The output is:
Time difference of 17532 days
If you look at ?'for', you'll see that for(var in seq) is only defined when seq is "An expression evaluating to a vector", and for a Date vector is.vector(x) is FALSE. So the documentation says (maybe not so clearly) that the behavior here is undefined, which is why the result is unexpected.
As joran mentions, as.vector(x) returns a numeric vector, same as unclass(x) mentioned by Calum You.
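These claims are easy to verify. A small sketch, using the same x as in the question (the names plain, same and days are illustrative):

```r
x <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-02", "2018-05-06"))

is.vector(x)                 # FALSE: the class attribute disqualifies it
plain <- as.vector(x)        # bare numeric day counts
same <- all(plain == unclass(x))

# Days between the first element and the origin, as a plain number
days <- as.numeric(x[1] - as.Date("1970-01-01"))
```

The value of days is the same 17532 that the second loop prints for the first element.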

Date format changes

I am preparing a table header like cols<- c("Metrics",as.Date(Sys.Date()-8,origin="1899-12-30"),as.Date(Sys.Date()-1,origin="1899-12-30")), but I am not getting the expected output. Can anyone please help?
Output : "Metrics" "17927" "17934"
cols<- c("Metrics",as.Date(Sys.Date()-8,origin="1899-12-30"),as.Date(Sys.Date()-1,origin="1899-12-30"))
Expected Output:
"Metrics" "2019-01-31" "2019-02-07"
1) Character output. If you are looking for a character vector as the result, then convert the Date class components to character. Also note that the as.Date shown in the question is not needed, since Sys.Date() and offsets from it are already of Date class. Further note that if Sys.Date() were called twice right at midnight, the two calls might occur on different days. To avoid this possibility we create a today variable so that Sys.Date() only has to be called once.
today <- Sys.Date()
cols <- c("Metrics", as.character(today-8), as.character(today-1))
cols
## [1] "Metrics" "2019-01-31" "2019-02-07"
1a) This could be made even shorter like this.
cols <- c("Metrics", as.character(Sys.Date() - c(8, 1)))
cols
## [1] "Metrics" "2019-01-31" "2019-02-07"
2) List output. Alternatively, if what you want is a list with one character component and two Date components, then:
today <- Sys.Date()
L <- list("Metrics", today - 8, today - 1)
L
giving:
[[1]]
[1] "Metrics"
[[2]]
[1] "2019-01-31"
[[3]]
[1] "2019-02-07"
If we already had L and wanted a character vector then we could further convert it like this:
sapply(L, as.character)
## [1] "Metrics" "2019-01-31" "2019-02-07"
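To see why the question's code produced numbers, here is a minimal sketch with a fixed date in place of Sys.Date() so that the result is reproducible (the date 2019-02-08 is an assumption chosen to match the output in the question). Because the first argument of c() is a character string, the Dates are combined as their underlying day counts. format() is used here as an alternative to as.character().

```r
# Fixed date instead of Sys.Date() so the result is reproducible (assumption)
today <- as.Date("2019-02-08")

# c() starting with a string drops the Date class and keeps the day counts
bad <- c("Metrics", today - 8, today - 1)

# Converting to text first keeps the readable dates
good <- c("Metrics", format(today - c(8, 1), "%Y-%m-%d"))
```

Here bad reproduces the "Metrics" "17927" "17934" output from the question, while good gives the expected "Metrics" "2019-01-31" "2019-02-07".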

Loop over objects in global environment [duplicate]

I'm pretty new to R and am trying to write a loop or some concise code for a simple task: checking the class of all the current objects in my Global Environment in R.
class(mydata)
#[1] "data.frame"
class(mylist)
#[1] "list"
class(mymatrix)
#[1] "matrix"
...
The following code worked, but what if I have many objects and I don't want to type all the names?
dflist <- list(mydata, mylist, mymatrix)
lapply(dflist,class)
I tried the following methods, but none of them worked.
#1
for (i in ls()){
class(i)
}
#2
for (i in ls()){
lapply(i,class)
}
Any solutions? Thanks.
You could use mget which returns "a named list of objects". The function's first argument should be a character vector of object names, which is what ls() returns.
lapply(mget(ls()), class)
#$mydata
#[1] "data.frame"
#
#$mylist
#[1] "list"
#
#$mymatrix
#[1] "matrix"
Try eapply:
eapply(.GlobalEnv, class)
Using sapply:
sapply(ls()[sapply(ls(), function(x) any(class(get(x)) %in% c("data.frame", "matrix", "list")))], function(x) class(get(x)))
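For completeness, the loops in the question fail for two reasons: ls() returns object names as strings, so class(i) is always "character", and values computed inside a for loop are not printed automatically. A minimal sketch of a working loop follows. It uses a throwaway environment env with made-up objects (both are assumptions for illustration) instead of the global environment, so the result does not depend on the current session.

```r
# Throwaway environment with example objects (illustrative, not from the question)
env <- new.env()
assign("mydata", data.frame(a = 1:3), envir = env)
assign("mylist", list(1, "a"), envir = env)

# Working loop: look each object up by name with get(), then store its class
classes <- list()
for (nm in ls(env)) {
  classes[[nm]] <- class(get(nm, envir = env))
}

# The same result via mget(), as suggested above
via_mget <- lapply(mget(ls(env), envir = env), class)
```

For the global environment, drop the envir arguments and call ls() with no arguments.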

Change type of object based on typeof another object

How do I change the type of one object to match the typeof another object,
without explicitly specifying the type?
a <- letters
b <- as.factor(a)
typeof(a)
#> [1] "character"
So I would like to convert b to typeof(a), but without explicitly
using as.character, because in another instance a might be e.g.
integer. This obviously does not work:
typeof(b) <- typeof(a)
The closest I could come is the following, though I'm not sure whether there is a better solution.
a <- '1'
b <- 2
a <- unlist(lapply(a,paste0('as.',class(b))))
a
a <- '245'
a <- unlist(lapply(a,paste0('as.',class(b))))
a
Output:
> a
[1] 245
This is similar to @amrrs's answer, except that I think you may want to use this functionality programmatically. When you convert an object that cannot be coerced to the target data type, you may get errors or NAs, which is unwanted behaviour.
The below function accounts for this based on R's coercion rules. (Assuming you want to be using class() and not typeof())
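convertClass <- function(object1, object2) {
  # Coercion hierarchy, ordered from least to most general
  logic <- c("logical", "integer", "numeric", "complex", "character", "list")
  # Convert only "upwards" in the hierarchy; isTRUE() also rejects unknown classes
  if (isTRUE(match(class(object1), logic) < match(class(object2), logic))) {
    # Look up the as.<class> function by name and apply it to object1
    match.fun(paste0("as.", class(object2)))(object1)
  } else {
    paste0("convertClass() cannot convert type ", class(object1), " to ", class(object2))
  }
}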
> convertClass(1,'1')
[1] "1"
> convertClass('1', 1)
[1] "convertClass() cannot convert type character to numeric"
Using this loses the functionality of converting "1" to 1 for example, which can be coerced in R, but does provide a strict safeguard if you don't know the type of a variable that you are feeding the function.
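If the strict safeguard is not needed, the original question (convert b to typeof(a)) can be handled by looking up the as.* function from typeof() rather than class(). This is a minimal sketch under that assumption. The helper name convert_like is hypothetical, and it only covers base types that have a matching as.* function, such as logical, integer, double, complex, character and list.

```r
# Hypothetical helper: coerce x to the base type of template
convert_like <- function(x, template) {
  # typeof() gives e.g. "character", so this finds as.character
  converter <- match.fun(paste0("as.", typeof(template)))
  converter(x)
}

a <- letters
b <- as.factor(a)
b_chr <- convert_like(b, a)    # factor converted to character, like a

n <- convert_like("245", 1L)   # string converted to integer
```

Unlike convertClass(), this does not guard against lossy conversions: a string that is not a number becomes NA with a warning.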

convert string data.frame to Date

I am wondering why this error occurs. I would like to do the conversion with brackets because I am making sequential conversions in a loop, and because I want to understand what is happening.
head(clean.deposit.rates)
Date
1 1/31/1983
2 2/28/1983
3 3/31/1983
4 4/30/1983
5 5/31/1983
6 6/30/1983
class(clean.deposit.rates)
[1] "data.frame"
class(as.Date(clean.deposit.rates[[1]], "%m/%d/%Y"))
[1] "Date"
class(as.Date(clean.deposit.rates$Date, "%m/%d/%Y"))
[1] "Date"
as.Date(clean.deposit.rates["Date"], "%m/%d/%Y")
Error in as.Date.default(clean.deposit.rates["Date"], "%m/%d/%Y") :
do not know how to convert 'clean.deposit.rates["Date"]' to class “Date”
You need to use double brackets, [[. With single brackets, the result is still a data frame. With double brackets, it becomes an atomic vector, which can be passed to the correct as.Date method.
as.Date(df["Date"], "%m/%d/%Y")
# Error in as.Date.default(df["Date"], "%m/%d/%Y") :
# do not know how to convert 'df["Date"]' to class “Date”
Since df["Date"] has class data.frame and there is no as.Date.data.frame method, the call is dispatched to as.Date.default. The error is triggered because x satisfies none of the if conditions there, so execution falls through as.Date.default to the line
stop(gettextf("do not know how to convert '%s' to class %s",
deparse(substitute(x)), dQuote("Date")), domain = NA)
Using df[["Date"]], the column becomes a vector and is passed to either as.Date.character or as.Date.factor depending on the class of the vector, and the desired result is returned.
as.Date(df[["Date"]], "%m/%d/%Y")
# [1] "1983-01-31" "1983-02-28" "1983-03-31" "1983-04-30" "1983-05-31"
# [6] "1983-06-30"
If you want to do this for multiple columns in a single data frame, then use the lapply function. Something like:
colNames <- c('StartDate','EndDate')
mydf[colNames] <- lapply( mydf[colNames], as.Date, "%m/%d/%Y" )
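Putting both points together, here is a small self-contained sketch. The data frame and its End column are made up for illustration. It shows the class returned by each bracket style and then converts several columns at once.

```r
# Made-up example data (the End column is an assumption for illustration)
df <- data.frame(Date = c("1/31/1983", "2/28/1983"),
                 End  = c("3/31/1983", "4/30/1983"),
                 stringsAsFactors = FALSE)

class(df["Date"])     # "data.frame": single brackets keep the data frame
class(df[["Date"]])   # "character": double brackets extract the column

# Converting one column
converted <- as.Date(df[["Date"]], format = "%m/%d/%Y")

# Converting several columns in place
cols <- c("Date", "End")
df[cols] <- lapply(df[cols], as.Date, format = "%m/%d/%Y")
```

Naming the format argument explicitly makes it clear that the string is a format and not an origin.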
