Converting monthly numerics to readable dates in R

How can I set R to count months instead of days when converting integers to dates?
After reading several threads on how to convert dates in R, it seems that nobody has asked how to convert numeric dates when the numbers come from a monthly time series. E.g. 552 represents January 2006.
I have tried several things, such as using as.Date(dates,origin="1899-12-01"), but I recognize that R counts days instead of months. Thus, the code applied to year-month number 552 above yields "1901-06-06" instead of the correct 2006-01-01.
Sidenote: I also want the format to be YEARmonth, but does R allow displaying dates without days?

I think your starting date should be '1960-01-01'.
Either way, you can solve this problem with the lubridate package: start from a date and add months.
library(lubridate)
as.Date('1960-01-01') %m+% months(552)
This gives you
[1] "2006-01-01"
You can display only the year and month of a date, but in that case R coerces the date into a character string.
format(as.Date('2006-01-01'), "%Y-%m")
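If you would rather avoid the extra dependency, the same month arithmetic can be sketched in base R. This assumes, as suggested above, that 0 corresponds to January 1960; the variable names (month_index, origin_year) are made up for illustration.

```r
# Month counter where 0 is assumed to mean January 1960
month_index <- c(0, 11, 552)
origin_year <- 1960

# Integer division gives whole years, the remainder gives the month offset
year  <- origin_year + month_index %/% 12
month <- month_index %% 12 + 1

# Build first-of-month dates, then display year and month only
dates <- as.Date(sprintf("%d-%02d-01", year, month))
year_month <- format(dates, "%Y-%m")
print(year_month)
```

The dates vector keeps the Date class for further calculations, while year_month is a character vector meant only for display.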

Related

Comparing dates in a dataframe and appending info based on comparison result in R

so I am lost with the following problem:
I have a dataframe in which one column (STARTED) contains the starting time of a survey, and several others contain information on the survey schedule of that survey participant (D5 to D10: only the planned survey dates, D17 to D50: planned send-out times of the measurements per day). I'd like to create two columns that indicate which survey day (1-6) and which measurement per day (1-6) this survey corresponds to.
First problem is the format (!)...
STARTED has the format %Y-%m-%d %H:%M:%S, D5 to D10 %d.%m.%Y and D17 to D50 %d.%m.%Y %H:%M.
I tried dmy_hms() from lubridate, parse_date_time(), and simply as.POSIXct(), but I always fail to get STARTED and the D17 to D50 section into a comparable format. Any solutions on this one?
After just separating STARTED into date & time columns, I was able to compare using ifelse() with D5 to D10 and to create the column of day running from 1 to 6.
This might be already more elegant with something like which(), but I was not able to create a vectorized version of this, as which(<<D5:D10>> == STARTED) would need to compare that per row. Does anyone have a solution for this?
And lastly, how on earth can I set up the second column indicating the measurement time? The first and last surveys of the day are easy, as they are also uniquely labelled, but for the other four I would need to compare per day whether the starting time is before the planned survey time of the following survey. I could imagine just checking whether STARTED falls between two adjacent planned survey times; as a POSIXct object that might work, if I can parse the different formats.
Help is greatly appreciated, thanks!
A screenshot from the beginning of the data:
Screenshot from R data using View()
For these first few rows, the intended variable day would need to be c(1,2,1,1,1,2,2) and measurement c(3,2,4,2,1,2,3).
Your other columns are not formatted with %d.%m.%Y, but with either %d.%m.%y (date only) or %d.%m.%y %H:%M. Note the change from %Y to %y.
Try:
as.Date("20.05.22", format = "%d.%m.%y")
# [1] "2022-05-20"
as.POSIXct("20.05.22 06:00", format = "%d.%m.%y %H:%M")
# [1] "2022-05-20 06:00:00 EDT"
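Once both formats parse, the "which measurement" part could be handled with findInterval(). This is only a sketch: the real data is shown only as a screenshot, so the values below are hypothetical, and the time zone is fixed to UTC just to make the example reproducible.

```r
# Hypothetical values standing in for one row of the data frame
started <- as.POSIXct("2022-05-20 10:15:00",
                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
planned <- as.POSIXct(c("20.05.22 06:00", "20.05.22 09:00",
                        "20.05.22 12:00", "20.05.22 15:00"),
                      format = "%d.%m.%y %H:%M", tz = "UTC")

# findInterval() returns the index of the last planned time that is
# not later than STARTED (planned times must be sorted ascending)
measurement <- findInterval(as.numeric(started), as.numeric(planned))
print(measurement)
```

Here STARTED falls between the second and third planned send-out times, so it is assigned to measurement 2. For a whole data frame the same call would have to be applied row by row, for example with mapply() or a loop.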

as.Date keeps on returning NAs [duplicate]

So I've been through some Stack Exchange answers, and I can't resolve this.
I have a column in a dataframe that has dates as characters as follows
2011-12
2012-04
2011-10
etc
I would like to convert these to date formats which I have tried to do as follows:
Tots$DatesMerge<-as.Date(Tots$DatesMerge,"%Y-%m")
but I get NA's back all the time.
I tried to do as here but no joy. I'm really not sure what I'm doing wrong.
I'd say as.Date won't be able to work on values where there's no day of the month. You could try with zoo, as long as you don't mind it coming out as a yearmon class:
library( zoo )
as.yearmon( Tots$DatesMerge )
Alternatively, you can specify a day of the month to use as a dummy:
as.Date( paste0( Tots$DatesMerge, "-15" ) )
Edit: there is already an answer and it is a duplicate, but I suppose the explanation can be useful for further readers, so I'll leave it.
Explanation
This comes from the documentation in R, "Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates".
In R, dates thus depend on year, month and day, or at least on an integer that represents the span (in days) from or to 1970-01-01. As such, the Date class in base R cannot convert data formatted as year and month only into dates, since there is no day.
Solution
As a consequence, if you stay with base R, you have the option of providing a day that is used to convert your data.
Tots$DatesMerge <- as.Date(paste0(Tots$DatesMerge,"-01"),"%Y-%m-%d")
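A self-contained version of this fix, using the sample values from the question in a stand-in vector (dates_chr is a made-up name replacing Tots$DatesMerge):

```r
# Sample values from the question
dates_chr <- c("2011-12", "2012-04", "2011-10")

# Append a dummy day so that the string is a complete date
dates <- as.Date(paste0(dates_chr, "-01"), format = "%Y-%m-%d")

# The result is a real Date vector, so it sorts chronologically
sorted_dates <- sort(dates)

# Drop the dummy day again when displaying
year_month <- format(sorted_dates, "%Y-%m")
print(year_month)
```

The dummy day only serves to make the string parseable; which day you pick (01, 15, ...) matters only if you later compute differences in days.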

Using R for a Date format of 07-JUL-16 06.05.54.000000 AM

I have 2 Date variables in a .csv file with formats of "07-JUL-16 06.05.54.000000 AM". I want to use these in a regression model. Should I be reading these into a data frame as factors or characters? How can I take a difference of the 2 dates in each case?
Read them in as characters (e.g. stringsAsFactors=FALSE or tidyverse functions), then use as.POSIXct, e.g.
as.POSIXct("07-JUL-16 06.05.54.000000 AM",format="%d-%b-%y %I.%M.%OS %p")
## [1] "2016-07-07 06:05:54 EDT"
(I'm assuming that you are intending a day-month-year format rather than a month-day-year format -- but actually I don't have any evidence to support that thought!)
Once you've done this, subtracting the values should just work (giving you an object of class difftime) -- but be careful with units when converting to numeric!
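To illustrate the point about units, here is a small sketch. The second timestamp is invented for the example, the time zone is fixed to UTC for reproducibility, and parsing %b and %p assumes an English (or C) locale.

```r
fmt <- "%d-%b-%y %I.%M.%OS %p"

# First value is from the question; the second is made up
t1 <- as.POSIXct("07-JUL-16 06.05.54.000000 AM", format = fmt, tz = "UTC")
t2 <- as.POSIXct("08-JUL-16 07.35.54.000000 PM", format = fmt, tz = "UTC")

# Plain subtraction lets R choose the units (here it would pick days);
# asking for the units explicitly makes the numeric value unambiguous
elapsed_hours <- as.numeric(difftime(t2, t1, units = "hours"))
print(elapsed_hours)
```

Passing units explicitly means a regression model always sees the difference on the same scale, regardless of how far apart the two dates are.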
For what it's worth, lubridate::ymd_hms thinks it can guess the format, but guesses wrong (?? assuming I guessed right above: with a two-digit year, and without any year values greater than 31, there's really nothing to distinguish years and days ...)

How to convert ordinal date day-month-year format using R

I have log files where the date is mentioned in the ordinal date format.
wikipedia page for ordinal date
i.e. 14273 means the 273rd day of 2014, so 14273 is 30-Sep-2014.
Is there a function in R to convert an ordinal date (14273) to (30-Sep-2014)?
I tried the date package but didn't come across a function that would do this.
Try as.Date with the indicated format:
as.Date(sprintf("%05d", 14273), format = "%y%j")
## [1] "2014-09-30"
Notes
For more information see ?strptime [link]
The 273 part is sometimes referred to as the day of the year (as opposed to the day of the month) or the day number or the julian day relative to the beginning of the year.
If the input were a character string of the form yyjjj (rather than numeric) then as.Date(x, format = "%y%j") will do.
Update: this has been updated to also handle years with one digit, as per the comments.
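The same call works on a whole vector. A short sketch with made-up log values, including one whose leading zero was lost because it was stored as a number:

```r
# Ordinal dates stored as numbers; 9001 has lost its leading zero
ordinal <- c(14273, 9001, 7031, 1033)

# Zero-pad to five characters (yyjjj), then parse year and day of year
padded <- sprintf("%05d", ordinal)
dates  <- as.Date(padded, format = "%y%j")

# Display in the day-month-year style used in the question
# (%b output is locale dependent)
formatted <- format(dates, "%d-%b-%Y")
print(dates)
```

Note that %y maps two-digit years 00 to 68 to 20xx and 69 to 99 to 19xx, which matters for older logs.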
Data example
x<-as.character(c("14273", "09001", "07031", "01033"))
Data conversion
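# Two-digit year and three-digit day of year
x1<-substr(x, start=1, stop=2)
x2<-substr(x, start=3, stop=5)
# Parse the day of year together with its year, so leap years are handled correctly
x3<-format(strptime(paste(x1, x2), format="%y %j"), format="%m-%d")
date<-as.Date(paste(x3, x1, sep="-"), format="%m-%d-%y")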
You can use the lubridate package as follows:
library(lubridate)
# Create a template date object
date <- as.POSIXlt("2009-02-10")
# Update the year and the day of the year
update(date, year=2014, yday=273)
# [1] "2014-09-30 JST"

Creating a single timestamp from separate DAY OF YEAR, Year and Time columns in R

I have a time series dataset for several meteorological variables. The time data is logged in three separate columns:
Year (e.g. 2012)
Day of year (e.g. 261 representing 17-September in a Leap Year)
Hrs:Mins (e.g. 1610)
Is there a way I can merge the three columns to create a single timestamp in R? I'm not very familiar with how R deals with the Day of Year variable.
Thanks for any help with this!
It looks like the timeDate package can handle gregorian time frames. I haven't used it personally but it looks straightforward. There is a shift argument in some methods that allow you to set the offset from your data.
http://cran.r-project.org/web/packages/timeDate/timeDate.pdf
Because you mentioned it, I thought I'd show the actual code to merge together separate columns. When you have the values you need in separate columns you can use paste to bring them together and lubridate::mdy to parse them.
library(lubridate)
col.month <- "Jan"
col.year <- "2012"
col.day <- "23"
date <- mdy(paste(col.month, col.day, col.year, sep = "-"))
Lubridate is a great package, here's the official page: https://github.com/hadley/lubridate
And here is a nice set of examples: http://www.r-statistics.com/2012/03/do-more-with-dates-and-times-in-r-with-lubridate-1-1-0/
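You should get quite far using ISOdatetime. This function takes vectors of year, month, day, hour, minute, and second as input and outputs a POSIXct object which represents time. Note that it expects a month and a day of the month rather than a day of the year, so the day of year has to be converted first. You also have to split the third column into separate hour and minute columns before you can use the function.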
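As an alternative that skips the splitting, the three columns can be pasted together and parsed in one go with the %j (day of year) format code. This is a sketch with made-up column vectors, and the time zone is assumed to be UTC since the question does not state one.

```r
# Hypothetical columns: year, day of year, and time stored as hhmm
year <- c(2012, 2012)
doy  <- c(261, 1)
hhmm <- c(1610, 5)      # 5 means 00:05

# Zero-pad day of year and time so the fields have fixed widths
stamp_chr <- sprintf("%d %03d %04d", year, doy, hhmm)

# %j parses the day of year, %H%M the four-digit time
stamp <- as.POSIXct(stamp_chr, format = "%Y %j %H%M", tz = "UTC")
print(stamp)
```

Day 261 of the leap year 2012 comes out as 17 September, matching the example in the question.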

Resources