R Date conversion from factor

I have a factor that holds a date as a day count from the origin 1899-12-30, e.g.
dat1 <- factor(42648)
If I want to change it to a date format, I'd normally do
as.Date(dat1, origin="1899-12-30")
but this fails, because as.Date() treats the factor as a character string and expects it to be in a recognisable date format. Doing
as.Date(as.numeric(as.character(dat1)), origin="1899-12-30")
works but it seems a little over the top. Is there a shorter way?
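One commonly cited alternative, sketched here on the assumption that every factor label is a valid number, is to convert the levels once and then index them by the factor. The variable name serial is only illustrative. Note that calling as.numeric() directly on the factor returns the internal level codes, not the labels, which is why some extra step is needed.

```r
dat1 <- factor(42648)

# as.numeric() on a factor gives the integer level code, not the label
as.numeric(dat1)
# [1] 1

# Convert the (few) levels to numbers, then index them by the factor
serial <- as.numeric(levels(dat1))[dat1]
as.Date(serial, origin = "1899-12-30")
# [1] "2016-10-05"
```

This is not much shorter to type than as.numeric(as.character(dat1)), but it converts each distinct level only once, which can matter for long factors with few levels.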

Related

How do I stop implicit date conversion when using ifelse with date time data? [duplicate]

This question already has answers here:
How to prevent ifelse() from turning Date objects into numeric objects
I have a data frame that contains one column that is a series of dates, collected via a Google form. The date and time were collected separately. The date was entered by selecting a day from a calendar, and the time was entered manually. It should have been on a 24-hour clock, but the field appears to have checked only that the hour and minute were in the correct range.
I've read the file in from a .csv. I converted the date-time character field (as read in from the .csv) to a date-time format in a new variable by using as.POSIXct(foo$When, tz="NZ", format="%Y-%m-%d %H:%M"). The dates and times were correctly constructed.
Except: I have some incorrect date/time entries in the original data. These have all been set to NA in the new field, as you would expect. For those that do include a time, I have been trying to fix them while still retaining a POSIXct format.
I have been unsuccessful.
Here is an example of the data I have, and what I have tried to do:
TestDataForHelp <- data.frame(OldDateTime =
c("2013-12-04 21:10", "2013-12-15 09:07", "2014-01-01 06:27",
"2014-11-02 21:15", "2014-11-07 23:00", "2015-01-04 21:42",
"201508-11-02 20:15", "201508-11-02 20:15", "2017-11-02"))
TestDataForHelp$ActualDateTime <-
as.POSIXct(TestDataForHelp$OldDateTime, tz="NZ", format="%Y-%m-%d %H:%M")
TestDataForHelp$FixedDateTime <-
ifelse(TestDataForHelp$OldDateTime=="201508-11-02 20:15",
as.POSIXct("2015-11-02 20:15", tz="NZ", format="%Y-%m-%d %H:%M"),
TestDataForHelp$ActualDateTime)
The new variable, FixedDateTime, does not have a POSIXct type. It has been implicitly converted to a numeric type. How can I retain the POSIXct format from ActualDateTime and not have the implicit type conversion?
I would like to not have FixedDateTime but, rather, put the corrected data into ActualDateTime. The ifelse() seems to be the part of the code causing the format to shift from POSIXct to numeric. If I do:
TestDataForHelp$CopiedDateTime <- TestDataForHelp$ActualDateTime
The new variable, that is simply a copy of the original, retains the POSIXct type.
The previous question linked in the comments relates to date values only, not date time values. The data manipulation becomes more complicated with dealing with date time values, given that mine also do not include seconds. The other difference is that the original variable contains a mix of date, date-time, and incorrect date-time values, whereas that previous question had values that were all the same. It was unclear whether the non-uniform content of the variable was causing the problem.
Edit: I fixed the problem by fixing the strings before I converted them to dates. This removed the need to try to loop through the dates.
I can replicate the numeric answer, but not explain it. It is, however, calculating the results correctly for you; I'm just not sure why it comes back as a numeric. The conversion from numeric back to a date-time is easy enough if you know the origin, which should be 1970-01-01. So I believe the following does the trick:
(Note, the first block is just what you already have)
TestDataForHelp$FixedDateTime <- ifelse(TestDataForHelp$OldDateTime=="201508-11-02 20:15",
as.POSIXct("2015-11-02 20:15", tz="NZ", format="%Y-%m-%d %H:%M"),
TestDataForHelp$ActualDateTime)
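# The numeric values are seconds since 1970-01-01 UTC, so the origin is given
# as a plain string; tz only controls how the result is displayed.
TestDataForHelp$FixedDateTime <- as.POSIXct(TestDataForHelp$FixedDateTime,
origin = "1970-01-01", tz = "NZ")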
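As for why the class is lost: the documentation for ifelse() says the result takes its attributes from the test argument, which here is a plain logical vector, so the POSIXct class and time zone are dropped. A minimal sketch of one way to avoid ifelse() altogether is to assign into the existing POSIXct column by logical index. The data frame name TestData and the variable bad below are illustrative, and the sketch assumes the "NZ" time zone name is available on your system.

```r
# Cut-down version of the example data
TestData <- data.frame(
  OldDateTime = c("2013-12-04 21:10", "201508-11-02 20:15", "2017-11-02"),
  stringsAsFactors = FALSE
)
TestData$ActualDateTime <-
  as.POSIXct(TestData$OldDateTime, tz = "NZ", format = "%Y-%m-%d %H:%M")

# Replace only the known-bad rows, in place; the column keeps its class
bad <- TestData$OldDateTime == "201508-11-02 20:15"
TestData$ActualDateTime[bad] <-
  as.POSIXct("2015-11-02 20:15", tz = "NZ", format = "%Y-%m-%d %H:%M")

class(TestData$ActualDateTime)
```

This also puts the corrected values straight into ActualDateTime, so no separate FixedDateTime column is needed.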

R - Numeric to Date gives wrong value

I have a data frame DP with a numeric column, variable, that holds a numeric representation of a date.
Example: 43282 corresponds to 7/1/2018 (try it in Excel).
But in R, when I call as.Date() to convert it to a date, I get the wrong date:
DP$Time <- as.Date(DP$variable)
variable Time
1 43282 2088-07-02
What am I doing wrong here?
If it is based on Excel, then change the origin from the default 1970-01-01 to 1899-12-30:
as.Date(43282, origin = '1899-12-30')
#[1] "2018-07-01"
Excel's date origin is "1899-12-30" and R's is "1970-01-01". Because the origins differ, a serial number imported from Excel maps to a different date in R unless you supply Excel's origin.
Specify the right origin and it will print the right values:
DP$Time <- as.Date(DP$variable, origin = "1899-12-30")
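As a sanity check on the two origins, the gap between them can be computed directly. This is only a sketch; the names excel_origin, r_origin and offset_days are illustrative.

```r
excel_origin <- as.Date("1899-12-30")
r_origin     <- as.Date("1970-01-01")

# Number of days between the two origins
offset_days <- as.numeric(r_origin - excel_origin)
offset_days
# [1] 25569

# Both routes give the same date for the Excel serial 43282
as.Date(43282, origin = excel_origin)
# [1] "2018-07-01"
as.Date(43282 - offset_days, origin = r_origin)
# [1] "2018-07-01"
```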

Using R for a Date format of 07-JUL-16 06.05.54.000000 AM

I have 2 Date variables in a .csv file with formats of "07-JUL-16 06.05.54.000000 AM". I want to use these in a regression model. Should I be reading these into a data frame as factors or characters? How can I take a difference of the 2 dates in each case?
Read them in as characters (e.g. stringsAsFactors=FALSE or tidyverse functions), then use as.POSIXct, e.g.
as.POSIXct("07-JUL-16 06.05.54.000000 AM",format="%d-%b-%y %I.%M.%OS %p")
## [1] "2016-07-07 06:05:54 EDT"
(I'm assuming that you are intending a day-month-year format rather than a month-day-year format -- but actually I don't have any evidence to support that thought!)
Once you've done this, subtracting the values should just work (giving you an object of class difftime) -- but be careful with units when converting to numeric!
For what it's worth, lubridate::ymd_hms thinks it can guess the format, but guesses wrong (assuming I guessed right above). With a two-digit year, and without any year values greater than 31, there's really nothing to distinguish years from days.
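To make the point about units concrete, here is a small sketch. The second timestamp is made up for illustration, tz = "UTC" is an assumption (use whatever zone your data were recorded in), and %b and %p are locale-dependent, so this assumes an English locale.

```r
fmt <- "%d-%b-%y %I.%M.%OS %p"

t1 <- as.POSIXct("07-JUL-16 06.05.54.000000 AM", format = fmt, tz = "UTC")
t2 <- as.POSIXct("08-JUL-16 07.35.54.000000 PM", format = fmt, tz = "UTC")  # made-up value

# Ask for the units explicitly rather than relying on the automatic choice
d <- difftime(t2, t1, units = "hours")

# Convert to a plain number in known units before using it in a model
as.numeric(d, units = "hours")
# [1] 37.5
as.numeric(d, units = "mins")
# [1] 2250
```

Plain subtraction (t2 - t1) picks its units automatically, so the same column could in principle end up in different units across data sets; fixing the units up front avoids that.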

How to determine the correct argument for origin in as.Date, R

I have a data set in R that contains a column of dates in the format yyyy/mm/dd. I am trying to use as.Date to convert these dates to date objects in R. However, I cannot seem to find the correct argument for origin to pass to as.Date. The following code is an example of what I have been trying. I am using a CSV file from Excel, so I used origin="1899/12/30" based on other sites I have looked at.
> as.Date(2001/04/26, origin="1899/12/30")
[1] "1900-01-18"
However, this is not working since the input date 2001/04/26 is returned as "1900-01-18". I need to convert the dates into date objects so I can then convert the dates into julian dates.
You can use as.Date with either a numeric value or a character value. When you type just 2001/04/26 into R, that's doing division and getting 19.24 (a numeric value). Numeric values require an origin, and the number you supply is the offset in days from that origin. So you're getting 19 days away from your origin, i.e. "1900-01-18". A date like Apr 26 2001 would be
as.Date(37007, origin="1899-12-30")
# [1] "2001-04-26"
If your dates from Excel "look like" dates, chances are they are character values (or factors). To convert a character value to a Date with as.Date() you want to specify a format. Here,
as.Date("2001/04/26", format="%Y/%m/%d")
# [1] "2001-04-26"
See ?strptime for details on the special % variables. Now, if you've read your data into a data.frame with read.table or something similar, there's a chance your variable is a factor. If that's the case, you'll want to convert it to character first:
as.Date(as.character(mydf$datecol), format="%Y/%m/%d")

convert string to time in r

I have an array of time strings, for example 115521.45 which corresponds to 11:55:21.45 in terms of an actual clock.
I have another array of time strings in the standard format (HH:MM:SS.0) and I need to compare the two.
I can't find any way to convert the original time format into something useable.
I've tried using strptime but all it does is add a date (the wrong date) and get rid of time decimal places. I don't care about the date and I need the decimal places:
For example,
t <- strptime(105748.35, '%H%M%OS')
gives ... 10:57:48, and using %OSn (n = 1, 2, etc.) gives NA.
Alternatively, is there a way to convert a time such as 10:57:48 to 105748?
Set the options to allow digits in seconds, and then add the date you wish before converting (so that the start date is meaningful).
options(digits.secs=3)
# %m is the month; %M (minutes) in the date part would leave the month unset
strptime(paste0('2013-01-01 ',105748.35), '%Y-%m-%d %H%M%OS')
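If the date really is irrelevant, another option is to skip strptime and reduce both representations to seconds since midnight, then compare those. This is only a sketch under the assumption that the first array is numeric in HHMMSS.ss form and the second is character in HH:MM:SS.s form; hhmmss_to_secs and clock_to_secs are hypothetical helper names, not existing functions.

```r
# Hypothetical helper: numeric HHMMSS.ss -> seconds since midnight
hhmmss_to_secs <- function(x) {
  hh <- x %/% 10000            # leading hour digits
  mm <- (x %% 10000) %/% 100   # middle two digits
  ss <- x %% 100               # seconds, including the fraction
  hh * 3600 + mm * 60 + ss
}

# Hypothetical helper: "HH:MM:SS.s" strings -> seconds since midnight
clock_to_secs <- function(s) {
  parts <- strsplit(s, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

a <- hhmmss_to_secs(c(115521.45, 105748.35))
b <- clock_to_secs(c("11:55:21.45", "10:57:48.0"))

# Compare with a small tolerance, since these are floating-point values
abs(a - b) < 1e-6
# [1]  TRUE FALSE
```

Working on numbers also sidesteps the lost leading zero for times before 10:00 (e.g. 95521.45), which would otherwise need padding before being parsed as a string.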
