I have a character vector (length 1704) like this:
[1] "1871_01" "1871_02" "1871_03" "1871_04" "1871_05" "1871_06" "1871_07" "1871_08" "1871_09" "1871_10" "1871_11" "1871_12"
[13] "1872_01" "1872_02" "1872_03" ...
.
.
.
[1681] "2011_01" "2011_02" "2011_03" "2011_04" "2011_05" "2011_06" "2011_07" "2011_08" "2011_09" "2011_10" "2011_11" "2011_12"
[1693] "2012_01" "2012_02" "2012_03" "2012_04" "2012_05" "2012_06" "2012_07" "2012_08" "2012_09" "2012_10" "2012_11" "2012_12"
I want to convert this vector into a vector of dates.
For that I use:
as.Date(vector, format="%Y_%m")
But it returns "NA"
I tried for one value:
b <- "1871_01"
as.Date(b, format="%Y_%m")
[1] NA
strptime(b, "%Y_%m")
[1] NA
I don't understand why it doesn't work...
Does anyone have a clue?
If you do regular work in year+month format, the zoo package can come in handy since it treats yearmon as a first class citizen (and is compatible with Date objects/functions):
library(zoo)
my.ym <- as.yearmon("1871_01", format="%Y_%m")
print(my.ym)
## [1] "Jan 1871"
str(my.ym)
## Class 'yearmon' num 1871
my.date <- as.Date(my.ym)
print(my.date)
## [1] "1871-01-01"
str(my.date)
## Date[1:1], format: "1871-01-01"
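The NA in the question most likely comes from the missing day: ?strptime notes that once a month is specified, the day of the month has to be specified as well. A minimal base-R sketch, assuming it is acceptable to pin every value to the first of the month (the names ym and dates are illustrative, not from the question):

```r
# Year_month strings in the same shape as the question's vector
ym <- c("1871_01", "1871_02", "2012_12")

# Append a day so each string describes a complete date,
# then parse with a format that includes %d
dates <- as.Date(paste0(ym, "_01"), format = "%Y_%m_%d")

print(dates)
str(dates)
```

This stays in base R; the zoo route above is probably more convenient if the data is genuinely monthly and should keep printing as year-month.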
My data has a FixDateTime column (head below) that is stored as character:
head(df$FixDateTime)
[1] "2017-03-15 15:00:04" "2017-03-16 14:00:48" "2017-03-17 13:00:22"
[4] "2017-03-18 12:00:47" "2017-03-19 11:01:00" "2017-03-20 10:00:47"
class(df$FixDateTime)
[1] "character"
Using the code below I try to convert it to POSIXct, but the resulting column is full of NAs. I know that there are no NAs in my dataset.
df$DateTime<-as.POSIXct(df$FixDateTime, format="%Y-%m%-dT%H:%M:%S", tz="MST")
head(df$DateTime)
[1] NA NA NA NA NA NA
I have also run the code in the same way, omitting the "T" (with a space instead), and it gives the same result.
I have played with the time zone, and this does not seem to be the issue. I just need a column in POSIXct format containing date and time.
You can use a tidyverse approach with lubridate:
lubridate::ymd_hms("2017-03-17 13:00:22",tz = "MST")
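A base-R alternative is sketched below. It assumes the likely cause of the NAs is the format string itself: "%m%-d" is not a valid month/day specification, and the data has a space rather than a literal "T" between date and time. The vector x stands in for df$FixDateTime.

```r
# Sample values copied from the question's head() output
x <- c("2017-03-15 15:00:04", "2017-03-16 14:00:48")

# The format must mirror the strings exactly:
# "%Y-%m-%d", a space, then "%H:%M:%S"
dt <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "MST")

print(dt)
class(dt)
```

Since these strings are already in the standard layout, as.POSIXct(x, tz = "MST") without a format argument should work as well.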
When I iterate over dates in a loop, R prints out the numeric coding of the dates.
For example:
dates <- as.Date(c("1939-06-10", "1932-02-22", "1980-03-13", "1987-03-17",
"1988-04-14", "1979-08-28", "1992-07-16", "1989-12-11"), tryFormats = c("%Y-%m-%d"))
for(d in dates){
print(d)
}
The output is as follows:
[1] -11163
[1] -13828
[1] 3724
[1] 6284
[1] 6678
[1] 3526
[1] 8232
[1] 7284
How do I get R to print out the actual dates?
So the output reads:
[1] "1939-06-10"
[1] "1932-02-22"
[1] "1980-03-13"
[1] "1987-03-17"
[1] "1988-04-14"
[1] "1979-08-28"
[1] "1992-07-16"
[1] "1989-12-11"
Thank you!
When you iterate directly over a Date vector in a for loop, R strips its attributes, including the class.
You can use as.vector to strip attributes and see for yourself (or dput to see under the hood on the full object):
as.vector(dates)
# [1] -11163 -13828 3724 6284 6678 3526 8232 7284
dput(dates)
# structure(c(-11163, -13828, 3724, 6284, 6678, 3526, 8232, 7284), class = "Date")
In R, Date objects are just numeric vectors with class Date (class is an attribute).
Hence you're seeing numbers (FWIW, these numbers count days since 1970-01-01).
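A small sketch of that representation, using an arbitrary date close to the epoch so the number is easy to check by hand (the variable d is illustrative):

```r
d <- as.Date("1970-01-10")

# Dropping the class exposes the stored value: days since 1970-01-01
unclass(d)   # 9

# The class attribute is what makes print() show a date
class(d)

# Inside a for loop the element arrives without that class
for (x in d) print(class(x))
```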
To restore the Date attribute, you can use the .Date function:
for (d in dates) print(.Date(d))
# [1] "1939-06-10"
# [1] "1932-02-22"
# [1] "1980-03-13"
# [1] "1987-03-17"
# [1] "1988-04-14"
# [1] "1979-08-28"
# [1] "1992-07-16"
# [1] "1989-12-11"
This is equivalent to as.Date(d, origin = '1970-01-01'), the numeric method for as.Date.
Funnily enough, lapply doesn't strip attributes (it converts its input with as.list, which has a Date method):
invisible(lapply(dates, print))
# [1] "1939-06-10"
# [1] "1932-02-22"
# [1] "1980-03-13"
# [1] "1987-03-17"
# [1] "1988-04-14"
# [1] "1979-08-28"
# [1] "1992-07-16"
# [1] "1989-12-11"
There are multiple ways you can handle this.
Loop over the indices of dates:
for(d in seq_along(dates)){
print(dates[d])
}
#[1] "1939-06-10"
#[1] "1932-02-22"
#[1] "1980-03-13"
#[1] "1987-03-17"
#[1] "1988-04-14"
#[1] "1979-08-28"
#[1] "1992-07-16"
#[1] "1989-12-11"
Or convert dates to a list and then print each element directly:
for(d in as.list(dates)) {
print(d)
}
I am writing my bachelor thesis and I don't have much experience with R so far.
My problem is that the dates I created with this command:
t<-strptime(x, "%d.%m.%Y %H.%M")
don't work anymore when I save them in a matrix together with the other information on those specific dates.
I am a bit confused because it works just fine when I don't put them in a matrix, e.g. t[1:10].
But this happens as soon as I try to save them in a matrix:
matrix1<-matrix(c(t,v2,v3,v4),nrow=length(v2))
Fehler in as.POSIXct.numeric(X[[i]], ...) : 'origin' muss angegeben werden
It's German but it means origin must be supplied.
Any ideas what I have to do to fix it? I am a bit frustrated :)
Roland is right. You can't have POSIXlt objects in a matrix. What you can do is save those dates as numeric timestamps in the matrix and convert them back to dates when you access them.
Converting to numeric timestamp:
> date <- as.numeric(as.POSIXct("2014-02-16 02:13:46", tz = "UTC"))
> date
[1] 1392516826
Then save those timestamps in the matrix as you do now. To convert a value back to a date-time, pass it to as.POSIXct with origin = "1970-01-01" (and the time zone you want) instead of wrapping the call in as.numeric.
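The round trip can be sketched as follows. The second matrix value (42) is a made-up placeholder for the asker's other columns, and UTC is assumed throughout so the result does not depend on the local time zone.

```r
# Date-time -> seconds since 1970-01-01 00:00:00 UTC
ts <- as.numeric(as.POSIXct("2014-02-16 02:13:46", tz = "UTC"))

# A numeric matrix can hold the timestamp next to other numbers
# (42 is a hypothetical measurement)
m <- matrix(c(ts, 42), nrow = 1)

# Seconds -> date-time again; origin is required for numeric input
# in older R versions, which is what the error message asked for
back <- as.POSIXct(m[1, 1], origin = "1970-01-01", tz = "UTC")
print(back)
```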
t (a terrible name, by the way, easily confused with the t function) is a POSIXlt object, which internally is a list. First you should check what c(t,v2,v3,v4) returns (I don't know how v2 etc. are defined).
Then we can look into the documentation in help("matrix"):
data
an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.
The important bit is "all attributes discarded". This is what you get if you discard the attributes (which include the class attribute) of a POSIXlt object:
x <- strptime(c("2016-05-09 12:00:00", "2016-05-09 13:00:00"), format = "%Y-%m-%d %H:%M:%S")
attributes(x) <- NULL
print(x)
# [[1]]
# [1] 0 0
#
# [[2]]
# [1] 0 0
#
# [[3]]
# [1] 12 13
#
# [[4]]
# [1] 9 9
#
# [[5]]
# [1] 4 4
#
# [[6]]
# [1] 116 116
#
# [[7]]
# [1] 1 1
#
# [[8]]
# [1] 129 129
#
# [[9]]
# [1] 1 1
#
# [[10]]
# [1] "CEST" "CEST"
#
# [[11]]
# [1] NA NA
A matrix can't contain POSIXlt objects (or any objects, i.e., anything with an explicit class).
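If the goal is simply to keep the dates next to the other values, a data frame may be a better fit than a matrix, since each column keeps its own class. This is a suggestion beyond what the answers above state; the sample strings and v2 are hypothetical stand-ins for the asker's data, and UTC is an assumed time zone.

```r
# Strings in the question's "%d.%m.%Y %H.%M" layout (made-up values)
x <- c("09.05.2016 12.00", "09.05.2016 13.00")

# Parse, then convert POSIXlt (a list) to POSIXct (one number per element)
when <- as.POSIXct(strptime(x, "%d.%m.%Y %H.%M", tz = "UTC"))

v2 <- c(1.5, 2.5)  # hypothetical measurements

# Columns of a data frame keep their classes, unlike matrix cells
df <- data.frame(when = when, v2 = v2)
str(df)
```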
I have no experience with using dates in R. I have read all of the docs but I still can't figure out why I am getting this error. I am trying to take a vector of strings and convert it into a vector of dates, using some specified format. I have tried both for loops that convert each date individually and vector functions like sapply, but neither is working. Here is the code using for loops:
dates = rawData[,ind] # get vector of date strings
print("single date example")
print(as.Date(dates[1]))
dDates = rep(1,length(dates)) # initialize vector of dates
class(dDates)="Date"
for (i in 1:length(dates)){
dDates[i]=as.Date(dates[i])
}
print(dDates[1:10])
EDIT: info on "dates" variables
[1] "dates"
V16 V17 V18 V19 V36
[1,] "2014-01-16" "2014-01-30" "2014-01-16" "2014-01-17" "1999-03-16 12:00"
[2,] "2014-01-04" "2014-01-18" "2014-01-04" "2014-01-08" "1998-09-04 12:00"
[3,] "2014-03-05" "2014-03-19" "2014-03-05" "2014-03-07" "1996-09-30 05:00"
[4,] "2014-01-21" "2014-02-04" "2014-01-22" "2014-01-24" "1995-08-21 12:00"
[5,] "2014-01-07" "2014-01-21" "2014-01-07" "2014-01-09" "1994-04-07 12:00"
[1] "class(dates)"
[1] "matrix"
[1] "class(dates[1,1])"
[1] "character"
[1] "dim(dates)"
[1] 56557 8
The result I am getting is as follows:
[1] "single date example"
[1] "2014-01-16"
Error in charToDate(x) :
character string is not in a standard unambiguous format
So basically, when I try to parse a single element of the date strings into a date, it works fine. But when I try to parse the dates in a loop, it breaks. How could this be so?
The reason why I am using a loop instead of sapply is because that was returning an even stranger result. When I try to run:
dDates = sapply(dDates, function(x) as.Date(x, format = "%Y-%m-%d"))
I am getting the following output:
2014-01-16 2014-01-04 2014-03-05 2014-01-21 2014-01-07 2014-01-02 2014-01-08
NA NA NA NA NA NA NA
2014-02-22 2014-01-09 2014-02-22
NA NA NA
Which is very strange. As you can see, since my format was correct, it was able to parse out the dates. But for some reason, it is also giving a time value of NA (or at least that is what I think the NA means). Maybe this is happening because some of my date strings have times, while others don't. But the thing is I left the time out of the format because I don't care about time.
Does anyone know why this is happening or how to fix it? I can't find anywhere online where you can "set" the time value of a date object easily -- I just can't seem to get rid of that NA. And somehow even a for loop doesn't work! Either way, the output is strange and I am not getting the expected results, even though my format is correct. Very frustrating that a simple thing like parsing a vector of dates is so much more difficult than in MATLAB or Java.
Any help please?
EDIT: when I try simply
dDates = as.Date(dates,format="%m/%d/%Y")
I get the output
"dDates[1:10]"
[1] NA NA NA NA NA NA NA NA NA NA
still those mysterious NA's. I am also getting an error
Error in as.Date.default(value) :
do not know how to convert 'value' to class “Date”
Using a subset of your data,
v <- c("2014-01-16", "2014-01-30", "2014-01-16", "2014-01-17", "1999-03-16 12:00")
these statements are equivalent, since your format is the default one:
as.Date(v)
[1] "2014-01-16" "2014-01-30" "2014-01-16" "2014-01-17" "1999-03-16"
as.Date(v, format = "%Y-%m-%d")
[1] "2014-01-16" "2014-01-30" "2014-01-16" "2014-01-17" "1999-03-16"
If you would like to format the output of your date, use format:
format(as.Date(v), format = "%m/%d/%Y")
[1] "01/16/2014" "01/30/2014" "01/16/2014" "01/17/2014" "03/16/1999"
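A short sketch of why neither the loop nor sapply is needed: as.Date is vectorized, and sapply simplifies its result to a plain numeric vector, which drops the Date class. The two-element vector below is a cut-down version of the sample data; the names d and s are illustrative.

```r
v <- c("2014-01-16", "1999-03-16 12:00")

# One vectorized call; the trailing " 12:00" is ignored by the format
d <- as.Date(v, format = "%Y-%m-%d")
class(d)

# sapply() collapses the results into a named numeric vector
s <- sapply(v, function(x) as.Date(x, format = "%Y-%m-%d"))
class(s)

# The numbers are days since 1970-01-01 and can be converted back
as.Date(s, origin = "1970-01-01")
```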
I have a start date and an end date but when I am making a list to contain all dates in between, the format is changed:
> startDate <- as.Date("2012-01-01")
> startDate
[1] "2012-01-01"
> endDate <- as.Date("2012-02-01")
> endDate
[1] "2012-02-01"
> startDate:endDate
[1] 15340 15341 15342 15343 15344 15345 15346 15347 15348 15349 15350 15351 15352 15353 15354 15355
[17] 15356 15357 15358 15359 15360 15361 15362 15363 15364 15365 15366 15367 15368 15369 15370 15371
So you can see that all dates are converted to a numeric format.
But the problem is, I have an API function that can only read dates in the format "YYYY-MM-DD".
Can anyone suggest how I can generate such a list like:
[1] "2012-01-01" "2012-01-02" "2012-01-03" "2012-01-04" ....
Use the seq function:
seq(startDate,endDate,by="day") #you could use also by=1
# see ?seq.Date for other options for "by"
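If the API needs character strings rather than Date objects (an assumption, since the question does not show the API), the sequence can be formatted afterwards. The name dayStrings is illustrative.

```r
startDate <- as.Date("2012-01-01")
endDate   <- as.Date("2012-02-01")

# seq() dispatches to seq.Date and keeps the Date class
days <- seq(startDate, endDate, by = "day")

# format() turns the dates into "YYYY-MM-DD" character strings
dayStrings <- format(days, "%Y-%m-%d")

head(dayStrings)
length(dayStrings)
```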
From the help page of the : operator (use ?":" or ?Colon):
For other arguments from:to is equivalent to seq(from, to), and
generates a sequence from from to to in steps of 1 or -1. Value to
will be included if it differs from from by an integer up to a numeric
fuzz of about 1e-7. Non-numeric arguments are coerced internally
(hence without dispatching methods) to numeric—complex values will
have their imaginary parts discarded with a warning.
So
identical(startDate:endDate,as.numeric(startDate):as.numeric(endDate))
[1] TRUE
And btw, you are generating a vector, not a list. You can make a list out of your values by using as.list function though, if that is what you really want.