Set time value into data frame cell - r

I'm trying to set a time value into a data frame:
ps = data.frame(t(rep(NA, 2)))
ps[1,1] = strptime('10:30:00', '%H:%M:%S')
but I get the error:
provided 9 variables to replace 1 variables
Since a time value is a list (?) in R, it thinks I'm trying to set 9 columns, when I really just want to set the one column to that class.
What can I do to make this set properly?

This is due to the result of strptime() being an object of class "POSIXlt":
> ps = data.frame(t(rep(NA, 2)))
> ps[1,1] = strptime('10:30:00', '%H:%M:%S')
Warning message:
In `[<-.data.frame`(`*tmp*`, 1, 1, value = list(sec = 0, min = 30L, :
provided 9 variables to replace 1 variables
> strptime('10:30:00', '%H:%M:%S')
[1] "2012-03-21 10:30:00"
> class(strptime('10:30:00', '%H:%M:%S'))
[1] "POSIXlt" "POSIXt"
A "POSIXlt" object is a list representation (hence the lt rather than the ct in the class name) of the time:
> foo <- strptime('10:30:00', '%H:%M:%S')
> str(foo)
POSIXlt[1:1], format: "2012-03-21 10:30:00"
> unclass(foo)
$sec
[1] 0
$min
[1] 30
$hour
[1] 10
$mday
[1] 21
$mon
[1] 2
$year
[1] 112
$wday
[1] 3
$yday
[1] 80
$isdst
[1] 0
A "POSIXlt" object is a list of length 9:
> length(unclass(foo))
[1] 9
hence the warning message, as the object is being stripped back to its constituent parts/representation. You can stick the "POSIXct" representation in instead without generating the warning:
> ps[1,1] = as.POSIXct(strptime('10:30:00', '%H:%M:%S'))
> ps[1,1]
[1] 1332325800
but we are still losing the class information. Still, you can get back to the "POSIXct" representation later using the as.POSIXct() function, but you will need to specify the origin argument. See ?POSIXct for more.
> class(ps[1,1])
[1] "numeric"
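As a rough sketch of that round trip (1332325800 is the value printed above; tz = "UTC" is an assumption made here so the result does not depend on the local time zone):

```r
# The cell lost its class and now holds seconds since 1970-01-01 00:00:00 UTC
secs <- 1332325800

# Rebuild a POSIXct value; 'origin' tells R where the count starts
restored <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

class(restored)
format(restored, "%Y-%m-%d %H:%M:%S")
```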
A solution is to coerce ps$X1 to be class "POSIXct" before inserting the time:
> ps = data.frame(t(rep(NA, 2)))
> ps <- transform(ps, X1 = as.POSIXct(X1))
> ps[1,1] <- as.POSIXct(strptime('10:30:00', '%H:%M:%S'))
> ps
X1 X2
1 2012-03-21 10:30:00 NA
> str(ps)
'data.frame': 1 obs. of 2 variables:
$ X1: POSIXct, format: "2012-03-21 10:30:00"
$ X2: logi NA
No warning (as before with as.POSIXct()), but now the class information is also retained, where before it was lost. Do read ?`[.data.frame`, especially the Coercion section, which has some details; but my take is that understanding the coercion in replacements like this is tricky.
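Another way to get the same effect, sketched here as an assumption rather than as part of the answer above, is to create the column as "POSIXct" from the start, so nothing has to be coerced afterwards. The tz = "UTC" argument is only there to make the example reproducible:

```r
# Start with a column that is already POSIXct; as.POSIXct(NA) is a missing date-time
ps <- data.frame(X1 = as.POSIXct(NA), X2 = NA)

# Parse the time directly to POSIXct (the date part defaults to today)
ps[1, "X1"] <- as.POSIXct("10:30:00", format = "%H:%M:%S", tz = "UTC")

str(ps)
```

The column keeps its class because the replacement value matches the type the column already has.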

Related

How to display real dates in a loop in r

When I iterate over dates in a loop, R prints out the numeric coding of the dates.
For example:
dates <- as.Date(c("1939-06-10", "1932-02-22", "1980-03-13", "1987-03-17",
"1988-04-14", "1979-08-28", "1992-07-16", "1989-12-11"), tryFormats = c("%Y-%m-%d"))
for(d in dates){
print(d)
}
The output is as follows:
[1] -11163
[1] -13828
[1] 3724
[1] 6284
[1] 6678
[1] 3526
[1] 8232
[1] 7284
How do I get R to print out the actual dates?
So the output reads:
[1] "1939-06-10"
[1] "1932-02-22"
[1] "1980-03-13"
[1] "1987-03-17"
[1] "1988-04-14"
[1] "1979-08-28"
[1] "1992-07-16"
[1] "1989-12-11"
Thank you!
When you iterate directly over a Date vector in a for loop, R strips its attributes, including the class.
You can use as.vector to strip attributes and see for yourself (or dput to see under the hood on the full object):
as.vector(dates)
# [1] -11163 -13828 3724 6284 6678 3526 8232 7284
dput(dates)
# structure(c(-11163, -13828, 3724, 6284, 6678, 3526, 8232, 7284), class = "Date")
In R, Date objects are just numeric vectors with class Date (class is an attribute).
Hence you're seeing numbers (FWIW, these numbers count days since 1970-01-01).
To restore the Date attribute, you can use the .Date function:
for (d in dates) print(.Date(d))
# [1] "1939-06-10"
# [1] "1932-02-22"
# [1] "1980-03-13"
# [1] "1987-03-17"
# [1] "1988-04-14"
# [1] "1979-08-28"
# [1] "1992-07-16"
# [1] "1989-12-11"
This is equivalent to as.Date(d, origin = '1970-01-01'), the numeric method for as.Date.
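A small check of that equivalence, using one of the day counts from the question (a sketch; the variable names are made up for illustration):

```r
d <- 3724  # days since 1970-01-01, as seen inside the for loop

a <- .Date(d)                            # just re-attach the "Date" class
b <- as.Date(d, origin = "1970-01-01")   # numeric method with an explicit origin

format(a)
format(b)
```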
Funnily enough, *apply functions don't strip attributes:
invisible(lapply(dates, print))
# [1] "1939-06-10"
# [1] "1932-02-22"
# [1] "1980-03-13"
# [1] "1987-03-17"
# [1] "1988-04-14"
# [1] "1979-08-28"
# [1] "1992-07-16"
# [1] "1989-12-11"
There are multiple ways you can handle this:
Loop over the index of dates:
for(d in seq_along(dates)){
print(dates[d])
}
#[1] "1939-06-10"
#[1] "1932-02-22"
#[1] "1980-03-13"
#[1] "1987-03-17"
#[1] "1988-04-14"
#[1] "1979-08-28"
#[1] "1992-07-16"
#[1] "1989-12-11"
Or convert date to list and then print directly.
for(d in as.list(dates)) {
print(d)
}

Saving dates in a matrix ("origin must be supplied") with r

I am writing my bachelor thesis and I have not much experience with r so far.
My problem is that my dates, which I made with this command:
t<-strptime(x, "%d.%m.%Y %H.%M")
don't work anymore when I save them in a matrix with the other information on those specific dates.
I am a bit confused because it works just fine when I don't put them in a matrix like this t[1:10]
But that happens as soon as I try to save them in a matrix
matrix1<-matrix(c(t,v2,v3,v4),nrow=length(v2))
Fehler in as.POSIXct.numeric(X[[i]], ...) : 'origin' muss angegeben werden
It's German but it means origin must be supplied.
Any ideas what I have to do to fix it? I am a bit frustrated :)
Roland is right. You can't have POSIXlt objects in a matrix. What you can do is save the dates as numeric timestamps in the matrix and convert them back to dates when you access them.
Converting to a numeric timestamp:
> date <- as.numeric(as.POSIXct("2014-02-16 2:13:46", tz = "UTC"))
> date
[1] 1392516826
Then save those timestamps in the matrix as you do now. To convert one back to a date-time, call as.POSIXct() on the number with origin = "1970-01-01".
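A minimal sketch of that round trip through a matrix. The second timestamp and the value column are made up for illustration, and tz = "UTC" is an assumption so the result does not depend on the local time zone:

```r
times <- as.POSIXct(c("2014-02-16 02:13:46", "2014-02-17 08:00:00"), tz = "UTC")

# A matrix holds one atomic type, so store the times as plain seconds
m <- cbind(time = as.numeric(times), value = c(1.5, 2.5))

# Convert the stored seconds back to date-times when reading them out
back <- as.POSIXct(m[, "time"], origin = "1970-01-01", tz = "UTC")
format(back)
```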
t (terrible name by the way, easily confused with the t function) is a POSIXlt object, which internally is a list. First you should check, what c(t,v2,v3,v4) returns (I don't know how v2 etc are defined).
Then we can look into the documentation in help("matrix"):
data
an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.
The important bit is "all attributes discarded". This is what you get if you discard the attributes (which include the class attribute) of a POSIXlt object:
x <- strptime(c("2016-05-09 12:00:00", "2016-05-09 13:00:00"), format = "%Y-%m-%d %H:%M:%S")
attributes(x) <- NULL
print(x)
# [[1]]
# [1] 0 0
#
# [[2]]
# [1] 0 0
#
# [[3]]
# [1] 12 13
#
# [[4]]
# [1] 9 9
#
# [[5]]
# [1] 4 4
#
# [[6]]
# [1] 116 116
#
# [[7]]
# [1] 1 1
#
# [[8]]
# [1] 129 129
#
# [[9]]
# [1] 1 1
#
# [[10]]
# [1] "CEST" "CEST"
#
# [[11]]
# [1] NA NA
A matrix can't contain POSIXlt objects (or any objects, i.e., anything with an explicit class).
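If the goal is only to keep the dates next to the other columns, a data frame is one alternative, since each column can keep its own class. A sketch under assumptions: the input strings and v2 values below are made up, and tz = "UTC" is chosen only for reproducibility:

```r
x  <- c("09.05.2016 12.00", "09.05.2016 13.00")  # made-up input in the question's format
v2 <- c(1.5, 2.5)                                # hypothetical measurements

# Convert the POSIXlt result of strptime() to POSIXct (a numeric vector with a class)
times <- as.POSIXct(strptime(x, "%d.%m.%Y %H.%M", tz = "UTC"))

dat <- data.frame(time = times, v2 = v2)
str(dat)
```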

Can't replace NA's with Date

I cannot replace the NA's for some reason even if I use the is.na code. I want to replace the NA with the current date. Any ideas?
Here is what my dataframe looks like:
df
Name Parent Date
1 A no parent OLD
2 B no parent NA
3 C no parent OLD
4 D no parent OLD
5 E no parent OLD
When I try this code it doesn't work:
today <- Sys.Date()
df[["Date"]][is.na(df[["Date"]])] <- today
str(df)
'data.frame': 2505 obs. of 3 variables:
$ Name : chr " A" " B" "C" "D" ...
$ Parent: chr "no parent" "no parent" "no parent" "no parent" ...
$ Date : chr "OLD" NA "OLD" "OLD" ...
A Date in R is just a double with a Date class attribute. Once the attribute is stripped off, it is just a double. See:
attributes(today)
# $class
# [1] "Date"
unclass(today)
# [1] 16897
storage.mode(today) ## data.table::as.IDate uses an integer storage mode
# [1] "double"
And a single column can't hold several classes in R. From the documentation of [<-.data.frame:
When [ is used with a logical matrix, each value is coerced to the
type of the column into which it is to be placed.
Having looked through the [<-.data.frame documentation, I'm not sure exactly how the conversion to character happens; probably via
as.character(`attributes<-`(today, NULL))
# [1] "16897"
Or
as.character(unclass(today))
# [1] "16897"
While you are looking for
as.character(today)
## [1] "2016-04-06"
So to sum it up, this should do
df[is.na(df$Date), "Date"] <- as.character(today)
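A small self-contained sketch of both behaviours. The three-row data frame is made up to mirror the question, and today is fixed to 2016-04-06 (an assumption, instead of Sys.Date()) so the result is reproducible:

```r
df <- data.frame(Name = c("A", "B", "C"),
                 Date = c("OLD", NA, "OLD"),
                 stringsAsFactors = FALSE)
today <- as.Date("2016-04-06")  # fixed date instead of Sys.Date()

# Assigning the Date directly into a character vector stores its underlying number
broken <- df$Date
broken[is.na(broken)] <- today
broken

# Converting to character first keeps the readable date
df[is.na(df$Date), "Date"] <- as.character(today)
df$Date
```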

Two (supposedly) identical date objects in R are not equal?

I have a simple question. I have two Date objects in R that are supposed to be identical (they have the same value and class), but R is saying they are not equal. I am running on linux though I get the same result on a windows machine. Why is this happening?
code:
start=as.Date("2014-12-31")
finish=as.Date("2014-11-28")
dates = seq(start,finish,length=6)
christmasEve = as.Date("2014-12-24")
print(dates[2])
print(christmasEve)
print(class(dates[2]))
print(class(christmasEve))
(christmasEve==dates[2])
output:
[1] "2014-12-24"
[1] "2014-12-24"
[1] "Date"
[1] "Date"
[1] FALSE
Any help would be greatly appreciated!
-Paul
The problem is that seq() splits the 33-day span into five equal steps of 6.6 days, so the intermediate dates are not whole numbers of days. Check out:
as.numeric(dates)
# [1] 16435.0 16428.4 16421.8 16415.2 16408.6 16402.0
start - finish
# Time difference of 33 days
Since you are creating the dates as a sequence, the underlying values are not whole numbers of days.
> as.numeric(dates)
[1] 16435.0 16428.4 16421.8 16415.2 16408.6 16402.0
> as.numeric(christmasEve)
[1] 16428
> as.character(christmasEve) == as.character(dates[2])
[1] TRUE
It is not possible to test your code as there is no sampleRate. I assumed that sampleRate is 6. You could compare your dates with the code below:
all(as.character(christmasEve) == as.character(dates[2]))
The whole thing should work like this:
> sampleRate <- 6
>
> start=as.Date("2014-12-31")
> finish=as.Date("2014-11-28")
> dates = seq(start,finish,length=sampleRate)
> christmasEve = as.Date("2014-12-24")
> print(dates[2])
[1] "2014-12-24"
> print(christmasEve)
[1] "2014-12-24"
> print(class(dates[2]))
[1] "Date"
> print(class(christmasEve))
[1] "Date"
> (christmasEve==dates[2])
[1] FALSE
>
> all(christmasEve == dates[2])
[1] FALSE
> all(as.character(christmasEve) == as.character(dates[2])
+ )
[1] TRUE
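An alternative to comparing strings, sketched here under the assumption that dropping the fractional part of each day is the intended rule (it matches what print() shows for these dates), is to turn the sequence into whole days before comparing:

```r
start  <- as.Date("2014-12-31")
finish <- as.Date("2014-11-28")
dates  <- seq(start, finish, length.out = 6)
christmasEve <- as.Date("2014-12-24")

# Drop the fractional part of each day count, then rebuild Date objects
whole_days <- as.Date(floor(as.numeric(dates)), origin = "1970-01-01")

whole_days[2] == christmasEve
```

If a fixed step is acceptable, building the sequence with a whole-day by argument avoids the fractions in the first place.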

Character to date with as.Date

I have a vector (length=1704) of character like this:
[1] "1871_01" "1871_02" "1871_03" "1871_04" "1871_05" "1871_06" "1871_07" "1871_08" "1871_09" "1871_10" "1871_11" "1871_12"
[13] "1872_01" "1872_02" "1872_03" ...
.
.
.
[1681] "2011_01" "2011_02" "2011_03" "2011_04" "2011_05" "2011_06" "2011_07" "2011_08" "2011_09" "2011_10" "2011_11" "2011_12"
[1693] "2012_01" "2012_02" "2012_03" "2012_04" "2012_05" "2012_06" "2012_07" "2012_08" "2012_09" "2012_10" "2012_11" "2012_12"
I want to convert this vector into a vector of dates.
For that I use:
as.Date(vector, format="%Y_%m")
But it returns "NA"
I tried for one value:
b <- "1871_01"
as.Date(b, format="%Y_%m")
[1] NA
strptime(b, "%Y_%m")
[1] NA
I don't understand why it doesn't work...
Does anyone have a clue?
If you do regular work in year+month format, the zoo package can come in handy since it treats yearmon as a first class citizen (and is compatible with Date objects/functions):
library(zoo)
my.ym <- as.yearmon("1871_01", format="%Y_%m")
print(my.ym)
## [1] "Jan 1871"
str(my.ym)
## Class 'yearmon' num 1871
my.date <- as.Date(my.ym)
print(my.date)
## [1] "1871-01-01"
str(my.date)
## Date[1:1], format: "1871-01-01"
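For a base R alternative without zoo, one option (a sketch, not part of the answer above) is to append a day of month. as.Date() needs a complete year-month-day to build a date, which is why the "%Y_%m" format alone returns NA:

```r
b <- c("1871_01", "1871_02", "2012_12")

# Append a day of month ("_01") so each string describes a complete date
d <- as.Date(paste0(b, "_01"), format = "%Y_%m_%d")
d
```

Every resulting date falls on the first of its month, which is the same convention as.Date() uses for a yearmon object.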
