Convert factors to datetime, <fctr> 10/25/2018 (M, D, Y) - r

I have a data frame called RequisitionHistory2 with a variable called RequisitionDateTime. It is a factor whose levels look like 4/30/2019 14:16. I would like to split this into RequisitionDate and RequisitionTime in a datetime format.
I tried this code, but this still does not solve my issue with needing to split these into their own columns. The code also did not work as I got the error below.
mutate(When = as.POSIXct(RequisitionHistory2, format="%m/%d/%. %H:%M %p"))
Error in as.POSIXct.default(RequisitionHistory2, format = "%m/%d/%. %H:%M %p") : do not know how to convert 'RequisitionHistory2' to class “POSIXct”
I would like to have the variable RequisitionDateTime split into RequisitionDate and another variable RequisitionTime in the dataframe RequisitionHistory2. Any help is greatly appreciated!

Do not convert factors to datetime directly. You will need to convert it to a character first and then use a datetime function.
as.Date(as.character("10/25/2018"), format = "%m/%d/%Y")
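would work for your date example.
If your datetime is in 4/30/2019 14:16 format, lubridate's mdy_hm() will parse it:
library(dplyr)
library(lubridate)
RequisitionHistory2 <- mutate(RequisitionHistory2, When = mdy_hm(as.character(RequisitionDateTime)))
Note that as.POSIXct() without a format argument only recognizes a few standard layouts such as 2019-04-30 14:16; for anything else you must pass a matching format string. Your call also failed because you passed the whole data frame rather than the RequisitionDateTime column. I wrote a blog post about this that may be helpful to check out: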
https://jackylam.io/tutorial/uber-data/
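The split into separate date and time columns that the question asks for can be sketched in base R as follows. This is only a minimal sketch: the two-row data frame is a made-up stand-in for the asker's data, tz = "UTC" is an assumption, and the time is kept as "HH:MM" text because base R has no time-of-day class.

```r
# Hypothetical stand-in for the asker's data frame (values are invented).
RequisitionHistory2 <- data.frame(
  RequisitionDateTime = factor(c("4/30/2019 14:16", "10/25/2018 9:05"))
)

# Factor -> character -> POSIXct. tz = "UTC" is an assumption; use your own zone.
when <- as.POSIXct(
  as.character(RequisitionHistory2$RequisitionDateTime),
  format = "%m/%d/%Y %H:%M",
  tz = "UTC"
)

# Date part as a Date object, time part as "HH:MM" text.
RequisitionHistory2$RequisitionDate <- as.Date(when, tz = "UTC")
RequisitionHistory2$RequisitionTime <- format(when, "%H:%M")

str(RequisitionHistory2)
```

If a real time class is needed, a package such as hms may be a better fit than a character column.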

The anytime package on CRAN converts directly from many input types, including factor and ordered, to date and datetime objects. It also heuristically tries a number of viable formats, so you do not need a format string. See the README on GitHub for an introduction; there is also a vignette.
Your example works:
R> library(anytime)
R> anytime(as.factor("4/30/2019 14:16"))
[1] "2019-04-30 14:16:00 CDT"
R> anytime(as.factor("4/3/2019 14:16:17"), useR=TRUE)
[1] "2019-04-03 14:16:17 CDT"
R>
However, the underlying (Boost C++) parser does not like single-digit days or months, so you may need to fall back to R's parser via useR=TRUE, as I did in the second example.

Related

as.Date keeps returning NA - Month-Year format

I imported csv data saved from excel and I am using a Mac if that matters.
To simplify things, I have been working with one particular entry from my data "Jan-12". This is of type character and in the form Month-Year.
This is what I have tried:
as.Date("Jan-12", format = "%b-%y")
I keep getting NA. I have browsed through other answers but haven't been able to figure out what's happening.
as.Date() help page says:
If the date string does not specify the date completely, the returned
answer may be system-specific.
If you really need to use as.Date() you can append some fixed day (say 1st) to date and convert
mydate <- "Jan-12"
as.Date(paste0("01-", mydate), format= "%d-%b-%y")
"2012-01-01"
Or you can use lubridate::fast_strptime():
library(lubridate)
fast_strptime("Jan-12", "%b-%y")
"2012-01-01 UTC"
Here is another option using zoo, then you can use as.Date.
library(zoo)
as.Date(as.yearmon("Jan-12", "%b-%y"))
# [1] "2012-01-01"

Unable to convert character / factor class into date

I know this question is asked a lot, but I only come to you because I tried everything (including the tips from similar questions that I managed to understand).
I have a rather big CSV file (> 16 000 rows), with, among others, a "Date" column, containing dates in the following format "01/01/1999".
However, when loading the file, the column is not recognised as a date, but as a Factor with read.csv2, or a character with fread (data.table package). I loaded the lubridate library.
In both cases, I tried to convert the column to a date format, using all methods I knew (column = Date, data = test):
as.Date(test$Date, format = "%d/%m/%Y", tz = "")
Or
strptime(test$Date, format = "%d/%m/%y", tz = "")
Or
as_date(test$Date)
And also the function dmy from lubridate, and
as.POSIXct(test$Date, "%d/%m/%y", tz = "").
I also tried changing the format: ymd instead of dmy, "-" instead of "/".
I even tried to change the character class to numeric (when loaded with fread), and the factor class to numeric (when loaded with read.csv2).
Despite all of this, the columns stay in their factor / character classes.
Does someone know what I missed?
Just use the anydate() function from the anytime package:
R> library(anytime)
R> var <- as.factor(c("01/01/1999", "01/02/1999"))
R> anydate(var)
[1] "1999-01-01" "1999-01-02"
R>
R> class(anydate(var))
[1] "Date"
R>
R> class(var)
[1] "factor"
R>
R>
It will read just about any input and convert it without requiring a format; this works as long as the representation is somewhat standard (i.e. we do not work with two-digit years etc).
(Otherwise you can of course also use the base R functions after first converting from factor back to character via as.character(). But anytime() and anydate() do that, and much more, for you too.)
If you are using read.csv2, try
read.csv2(..., stringsAsFactors = FALSE)
and then continue with as.Date.
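The base R route mentioned above can be sketched like this. The data frame is a made-up stand-in for the asker's file, and the assignment back into the column is shown on the assumption that a missing assignment is one possible reason the column kept its original class.

```r
# Hypothetical stand-in for the CSV column, stored as a factor.
test <- data.frame(Date = factor(c("01/01/1999", "15/02/1999", "01/01/1999")))

# as.numeric() on a factor returns the level codes, not the dates.
codes <- as.numeric(test$Date)
codes

# Go through character, use %Y for four-digit years,
# and assign the result back so the column class actually changes.
test$Date <- as.Date(as.character(test$Date), format = "%d/%m/%Y")

class(test$Date)
test$Date
```

Note the uppercase %Y: the lowercase %y used in some of the attempts above expects a two-digit year.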

Want to convert date from a specific format

I have a list of data that has date information in the format:
11-Feb-08, 13-Feb-08, 2-Mar-08 etc. How can I change all the entries in this column to be in dd/mm/yy format. I have tried as.Date and as.POSIXct but it converts it to NAs. sos pls help.
You are getting NAs for the date values because of the formatting issue. Provide appropriate date format in format argument of as.POSIXct or as.Date function.
As per the date example(11-Feb-08), the appropriate format would be :
format = '%d-%b-%y'.
Do look at the documentation using ?strptime for format-related questions. Each conversion specification is documented there.
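You can try the code below using lubridate:
library(lubridate)
# avoid naming the data frame c, which would mask base::c()
dates_df <- data.frame("Date" = c("11-Feb-08", "13-Feb-08", "2-Mar-08"), stringsAsFactors = FALSE)
# passing tz makes dmy() return POSIXct instead of Date
dates_df$Date <- dmy(dates_df$Date, tz = "Asia/Kolkata")
str(dates_df$Date)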
You will get below result:
POSIXct[1:3], format: "2008-02-11" "2008-02-13" "2008-03-02"
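Since the question asks for dd/mm/yy output, the parsing step can be followed by format() to render the dates as text in that layout. The sketch below uses base R only; switching LC_TIME to "C" for the duration of the parse is an assumption made here so that the English month abbreviations are recognized regardless of the system locale.

```r
x <- c("11-Feb-08", "13-Feb-08", "2-Mar-08")

# %b depends on the LC_TIME locale; "C" gives English month abbreviations.
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

d <- as.Date(x, format = "%d-%b-%y")   # parse into Date objects
out <- format(d, "%d/%m/%y")           # render as dd/mm/yy text

invisible(Sys.setlocale("LC_TIME", old_locale))  # restore the previous setting
out
# [1] "11/02/08" "13/02/08" "02/03/08"
```

Keep in mind that out is a character vector; keep d around if you still need to sort or do date arithmetic.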

Converting integer format date to double format of date

I have date format in following format in a data frame:
Jan-85
Apr-99
1-Nov
Feb-96
When I see the typeof(df$col) I get the answer as "integer".
Actually when I see the format in excel it is in m/d/yyyy format. I was trying to convert this to date format in R. All my efforts yielded NA.
I tried parse_date_time function. I tried as.date along with as.character. I tried as.POSIXct but everything is giving me NA.
My trials were as follows and everything was a failure:
as.Date.numeric(df$col,"m%d%Y")
transform(df$col, as.Date(as.character(df$col), "%m%d%Y"))
as.Date(df$col,"m%d%Y")
as.POSIXct.numeric(as.character(loan_new$issue_d), format="%Y%m%d")
as.POSIXct.date(as.character(df$col), format="%Y%m%d")
mdy(df$col)
parse_date_time(df$col,c("mdy"))
How can I convert this to date format? I have used lubridate package for parse_date_time and mdy package.
dput output is below
Label <- factor(c("Apr-08",
"Apr-09", "Apr-10", "Apr-11", "Aug-07", "Aug-08", "Aug-09", "Aug-10",
"Aug-11", "Dec-07", "Dec-08", "Dec-09", "Dec-10", "Dec-11", "Feb-08",
"Feb-09", "Feb-10", "Feb-11", "Jan-08", "Jan-09", "Jan-10", "Jan-11",
"Jul-07", "Jul-08", "Jul-09", "Jul-10", "Jul-11", "Jun-07", "Jun-08",
"Jun-09", "Jun-10", "Jun-11", "Mar-08", "Mar-09", "Mar-10", "Mar-11",
"May-08", "May-09", "May-10", "May-11", "Nov-07", "Nov-08", "Nov-09",
"Nov-10", "Nov-11", "Oct-07", "Oct-08", "Oct-09", "Oct-10", "Oct-11",
"Sep-07", "Sep-08", "Sep-09", "Sep-10", "Sep-11"))
NA is typically what you get when you misspecify the format, which is what happened here. That said, if your data really looks like the first example you gave, it cannot simply be converted to a date: you have two different formats, one being month-year and the other day-month.
If your updated date (i.e. Dec-11) is the correct format, then you use the format argument of as.Date like this:
date <- "Dec-11"
as.Date(date, format = "%b-%d")
# [1] "2017-12-11"
Or on your example data:
as.Date(Label, format = "%b-%d")
# [1] "2017-04-08" "2017-04-09" "2017-04-10" "2017-04-11" "2017-08-07" "2017-08-08"
# [7] "2017-08-09" "2017-08-10" "2017-08-11" "2017-12-07" "2017-12-08" "2017-12-09"
If you want to convert something like Jan-85, you have to decide which day of the month that date should have. Say we just take the first of each month, then you can do:
x <- "Jan-85"
xd <- paste0("1-",x)
as.Date(xd, "%d-%b-%y")
# [1] "1985-01-01"
More information on the format codes can be found on ?strptime
Note that R automatically fills in the current year (2017 when the output above was produced). It has to, otherwise it cannot specify a complete date. If you do not have a day of the month (e.g. Jan-85), direct conversion to a date is impossible because the underlying POSIX algorithms don't have all the necessary information.
Also keep in mind that this only works when your locale is set to English. Otherwise there is a good chance your OS won't recognize the month abbreviations correctly. To set it, do e.g.:
Sys.setlocale(category = "LC_TIME", locale = "English_United Kingdom")
You can later set it back to the original one if you must, or restart your R session to reset the locale settings.
note: Please check carefully which locale notations are valid for your OS. The above example works on Windows, but is not guaranteed on either Linux or Mac.
Why you see integer
These string values show up as integer because R (by default in versions before 4.0.0) automatically converts character vectors to factors when reading in a data frame. typeof() returns "integer" because that is the internal representation of a factor.
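The factor behaviour described here can be checked directly. The three values below are taken from the question; everything else is just an illustrative sketch.

```r
Label <- factor(c("Jan-85", "Apr-99", "Feb-96"))

type_of_label <- typeof(Label)   # storage type of the factor
class_of_label <- class(Label)   # what R treats it as
codes <- as.integer(Label)       # position of each value among the sorted levels
labels <- as.character(Label)    # the original strings, which is what date parsers need

type_of_label
class_of_label
codes
labels
```

This is why passing the factor through as.character() first is the safe route before any date conversion.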

R dates "origin" must be supplied

My code:
axis.Date(1,sites$date, origin="1970-01-01")
Error:
Error in as.Date.numeric(x) : 'origin' must be supplied
Why is it asking me for the origin when I supplied it in the above code?
I suspect you meant:
axis.Date(1, as.Date(sites$date, origin = "1970-01-01"))
as the 'x' argument to axis.Date() has to be of type Date.
As an aside, this would have been appropriate as a follow-up or edit of your previous question.
My R uses 1970-01-01:
>as.Date(15103, origin="1970-01-01")
[1] "2011-05-09"
and this matches the calculation from
>as.numeric(as.Date(15103, origin="1970-01-01"))
So generally this has been solved, but you might get this error message because the date you use is not in the correct format.
I know this is an old post, but whenever I run this I get NA all the way down my date column. My dates are in this format 20150521 – NealC Jun 5 '15 at 16:06
If you have dates in this format, first check how the column is stored:
str(sites$date)
If it is not character, convert it:
as.character(sites$date)
With a character input, as.Date no longer needs an origin, because origin applies to numeric values only. So you can use (assuming you have the format NealC describes):
as.Date(as.character(sites$date), format="%Y%m%d")
I hope this might help some of you.
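Put together, the conversion for dates stored as numbers like 20150521 might look like the sketch below. The sites data frame and its two values are invented for illustration.

```r
# Hypothetical data: dates stored as integers in YYYYMMDD form.
sites <- data.frame(date = c(20150521L, 20150604L))

str(sites$date)   # shows an integer column, not a Date

# Treat the digits as text and parse them with an explicit format;
# no origin is involved because the input is character.
sites$date <- as.Date(as.character(sites$date), format = "%Y%m%d")

sites$date
# [1] "2015-05-21" "2015-06-04"
```

Passing the raw number together with an origin would instead be read as a count of days since that origin, which is not what these values mean.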
Another option is the lubridate package:
library(lubridate)
x <- 15103
as_date(x, origin = lubridate::origin)
"2011-05-09"
y <- 1442866615
as_datetime(y, origin = lubridate::origin)
"2015-09-21 20:16:55 UTC"
From the docs:
Origin is the date-time for 1970-01-01 UTC in POSIXct format. This date-time is the origin for the numbering system used by POSIXct, POSIXlt, chron, and Date classes.
If the numeric value carries both date and time information, use as.POSIXct. This can come up with the data.table package: fwrite can write date-times as Unix time (for example with dateTimeAs = "epoch"), in which case the file contains plain numbers. They can be converted back as follows.
Example: Let's say you have a unix time stamp with date and time info: 1442866615
> as.POSIXct(1442866615,origin="1970-01-01")
[1] "2015-09-21 16:16:55 EDT"
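The same timestamp prints as a UTC time in one answer and an EDT time in another because the printed result depends on the time zone in effect. A minimal sketch that makes the zone explicit, assuming the "America/New_York" zone name is available on your system:

```r
ts <- 1442866615

# Interpret the number as seconds since 1970-01-01 UTC.
utc <- as.POSIXct(ts, origin = "1970-01-01", tz = "UTC")

in_utc <- format(utc, "%Y-%m-%d %H:%M:%S")
in_ny  <- format(utc, "%Y-%m-%d %H:%M:%S", tz = "America/New_York")

in_utc   # "2015-09-21 20:16:55"
in_ny    # "2015-09-21 16:16:55"
```

Both strings describe the same instant; only the display zone differs.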
By the way, the zoo package, if it is loaded, overrides the base as.Date() with its own, which by default provides origin="1970-01-01".
(I mention this in case you find that sometimes you need to add the origin, and sometimes you don't.)

Resources