as.Date() not giving desired result. (giving NA) - r

I read a .xlsx file containing Date columns into R and converted it into dataframe.
Some date columns are being read correctly but most of the others are getting converted to "43116" format.
Any attempt to convert it into Date using as.Date(, origin= <>, format=<>) is returning NA.
I have tried every solution I could find, such as setting 'stringsAsFactors = FALSE', converting via POSIXct, and checking the date formats in the Excel file, but nothing worked.
Please help.

It is difficult to reproduce the problem without any data, but if you want to convert the number 43129 or the character string "43129" to an R date, you can do the following:
a <- 43129
b <- '43129'
format(as.Date(a, origin = "1899-12-30"), '%Y-%m-%d')
# [1] "2018-01-29"
format(as.Date(as.integer(b), origin = "1899-12-30"), '%Y-%m-%d')
# [1] "2018-01-29"
I used the output format yyyy-mm-dd here, but any other date format works if you pass the corresponding format string to format().
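A likely reason the attempts in the question returned NA is that the column was read as text: when as.Date() receives a character vector together with a format, it parses the text as a calendar string and never treats it as a day count. A minimal sketch, assuming a hypothetical data frame df with a character column start that holds Excel serial numbers:

```r
# Hypothetical data: Excel serial numbers that arrived as text
df <- data.frame(start = c("43116", "43129"), stringsAsFactors = FALSE)

# Parsing the text with a format fails, because "43116" is not a calendar string
failed <- as.Date(df$start, format = "%Y-%m-%d")
print(is.na(failed))

# Converting to numeric first lets as.Date() treat the values as day counts
df$start <- as.Date(as.numeric(df$start), origin = "1899-12-30")
print(df$start)
```

The same two steps apply to any column of the data frame that shows five-digit numbers instead of dates.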
Hope it helps!

Related

how to load dates in a readable format from excel into r?

I have an excel spreadsheet that has different date formats. R reads in the dates correctly when they're formatted as m/dd/yy, but when they're m/d/yy, it reads them as a code like this: 36251.
To fix it, I found this, which seems useful, but it is only for one date and I have multiple.
How to convert Excel date format to proper date in R
This was what was suggested:
as.Date(42705, origin = "1899-12-30")
So I tried this:
stockindices0$Date <- as.Date(stockindices0$Date , origin = "12-30-99")
but got the error:
Error in charToDate(x) :
character string is not in a standard unambiguous format
I also tried it with the year as 1899 with no success.
You can try this
as.Date(stockindices0$Date, format = "%d/%m/%y")
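If the column mixes Excel serial numbers with date strings, each kind probably needs its own conversion. A rough sketch, assuming the column was read as text and that the string dates are month/day/two-digit-year as described in the question; excel_mixed_to_date is a hypothetical helper written for this illustration, not a library function:

```r
# Hypothetical helper: convert a text column holding either Excel serial
# numbers ("36251") or date strings ("4/01/99") to Date
excel_mixed_to_date <- function(x, text_format = "%m/%d/%y") {
  x <- as.character(x)
  # Entries made only of digits (optionally with a decimal part) are serials
  is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)
  out <- rep(as.Date(NA), length(x))
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  out[!is_serial] <- as.Date(x[!is_serial], format = text_format)
  out
}

mixed <- c("36251", "4/01/99", "42705")  # made-up sample values
converted <- excel_mixed_to_date(mixed)
print(converted)
```

Note that the origin must be an ISO-style string such as "1899-12-30"; a value like "12-30-99" is what triggers the "not in a standard unambiguous format" error.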

R: date format issue (from character to unknown format)

I have a .csv file with several dates presented like this: "26-11-21". These dates are in character format and when I try to convert them to date format using the following code, they are converted to Unknown format and appear as "2021-11-26":
file$Date1 <- as.Date(as.character(file$Date1),format="%d-%m-%y")
I have also tried the following two snippets, but neither works:
file$Date1 <- as.Date(paste0(as.character(file$Date1)), format = "%d-%m-%y")
file$Date1 <- format(as.Date(file$Date1, "%d-%m-%y"), "%d-%m-%y")
Thank you for your help.

How to read in excel file when Date and Time in the same column in R

I am trying to read an excel file into R. Among other fields, the excel file has two "date" fields, each containing both the date and time stamp in the SAME field.
Example:
StartDate 9/14/2019 10:18:59 AM
EndDate 9/18/2019 2:27:14 AM
When I used read_excel to read in the Excel file, these two columns came out very strangely: as numbers of days with decimals, such as 43712.429849537039, which I assumed were days since 1970-01-01 (the origin date that came up when I typed lubridate::origin).
data %<>%
mutate(StartDate = as.Date(StartDate, origin = "1970-01-01 UTC"))
So I tried converting this back using as.Date, but it converts it to the totally wrong date... (converts all the dates to the year 2089). Example, 2089-09-05.
Any help with this would be really appreciated! There must be a simpler way to directly read in a date-time column?!
You can use the lubridate package, it is excellent:
library(tidyverse)
df <- data.frame(StartDate =c("9/14/2019 10:18:59 AM","9/14/2019 3:18:59 PM"),
EndDate= c("9/18/2019 2:27:14 AM","9/18/2019 1:27:14 PM"))
df <- df %>% mutate(StartDate = lubridate::mdy_hms(StartDate), EndDate = lubridate::mdy_hms(EndDate))
It turns out that Excel uses a different origin date from R. Excel counts days from the start of 1900, whereas R counts days from 1970-01-01.
When I used read_excel to read the file into a data frame, R kept Excel's day counts, which is why I got a weird date when I converted with the 1970 origin. As soon as I used as.Date with Excel's origin (origin = "1899-12-30", the value that matches Excel's 1900-based day count), my dates parsed correctly!
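Because the serial number also carries the time of day as a fraction, it can be turned into a date-time as well as a date. A minimal sketch using the value quoted in the question, assuming the workbook uses Excel's 1900 date system and treating the result as UTC (Excel stores no time zone, so that choice is an assumption):

```r
serial <- 43712.429849537039  # value quoted in the question

# Date only: the fractional part is ignored when the Date is printed
as_date <- as.Date(serial, origin = "1899-12-30")

# Date and time: convert days to seconds; rounding guards against
# floating-point error shaving a second off the result
as_datetime <- as.POSIXct(round(serial * 86400), origin = "1899-12-30", tz = "UTC")

print(as_date)
print(format(as_datetime, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
```

With the 1970 origin the same number lands in 2089, which matches the wrong dates described in the question.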

Converting integer format date to double format of date

I have dates in the following format in a data frame:
Jan-85
Apr-99
1-Nov
Feb-96
When I see the typeof(df$col) I get the answer as "integer".
Actually when I see the format in excel it is in m/d/yyyy format. I was trying to convert this to date format in R. All my efforts yielded NA.
I tried the parse_date_time function, as.Date along with as.character, and as.POSIXct, but everything gives me NA.
My trials were as follows and everything was a failure:
as.Date.numeric(df$col,"m%d%Y")
transform(df$col, as.Date(as.character(df$col), "%m%d%Y"))
as.Date(df$col,"m%d%Y")
as.POSIXct.numeric(as.character(loan_new$issue_d), format="%Y%m%d")
as.POSIXct.date(as.character(df$col), format="%Y%m%d")
mdy(df$col)
parse_date_time(df$col,c("mdy"))
How can I convert this to date format? I used the lubridate package for parse_date_time and mdy.
dput output is below
Label <- factor(c("Apr-08",
"Apr-09", "Apr-10", "Apr-11", "Aug-07", "Aug-08", "Aug-09", "Aug-10",
"Aug-11", "Dec-07", "Dec-08", "Dec-09", "Dec-10", "Dec-11", "Feb-08",
"Feb-09", "Feb-10", "Feb-11", "Jan-08", "Jan-09", "Jan-10", "Jan-11",
"Jul-07", "Jul-08", "Jul-09", "Jul-10", "Jul-11", "Jun-07", "Jun-08",
"Jun-09", "Jun-10", "Jun-11", "Mar-08", "Mar-09", "Mar-10", "Mar-11",
"May-08", "May-09", "May-10", "May-11", "Nov-07", "Nov-08", "Nov-09",
"Nov-10", "Nov-11", "Oct-07", "Oct-08", "Oct-09", "Oct-10", "Oct-11",
"Sep-07", "Sep-08", "Sep-09", "Sep-10", "Sep-11"))
NA is typically what you get when you misspecify the format, which is what happened here (for example, "m%d%Y" is missing the % before m and does not match the separators in your data). That said, if your data really looks like the first example you gave, it cannot simply be converted to a date: you have two different formats, one month-year and the other day-month.
If your updated date (i.e. Dec-11) is the correct format, then you use the format argument of as.Date like this:
date <- "Dec-11"
as.Date(date, format = "%b-%d")
# [1] "2017-12-11"
Or on your example data:
as.Date(Label, format = "%b-%d")
# [1] "2017-04-08" "2017-04-09" "2017-04-10" "2017-04-11" "2017-08-07" "2017-08-08"
# [7] "2017-08-09" "2017-08-10" "2017-08-11" "2017-12-07" "2017-12-08" "2017-12-09"
If you want to convert something like Jan-85, you have to decide which day of the month that date should have. Say we just take the first of each month, then you can do:
x <- "Jan-85"
xd <- paste0("1-",x)
as.Date(xd, "%d-%b-%y")
# [1] "1985-01-01"
More information on the format codes can be found on ?strptime
Note that when the string contains no year, R automatically fills in the current year; it has to, otherwise it cannot construct a date. When the day of the month is missing (e.g. Jan-85), direct conversion is impossible because the underlying POSIX routines lack the necessary information, which is why a day is pasted on in the example above.
Also keep in mind that this only works when your locale is set to English; otherwise there is a good chance the month abbreviations will not be recognized. To set it, do e.g.:
Sys.setlocale(category = "LC_TIME", locale = "English_United Kingdom")
You can later set it back to the original one if you must, or restart your R session to reset the locale settings.
Note: please check carefully which locale names are valid for your OS. The above example works on Windows, but is not guaranteed to work on Linux or macOS.
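One way to sidestep the locale question is to translate the month abbreviations yourself before calling as.Date(). A minimal sketch, assuming the labels mean month-year and taking the first day of each month as in the Jan-85 example; month.abb is a base R constant that always holds the English abbreviations:

```r
labels <- c("Jan-85", "Apr-99", "Feb-96")  # assumed to mean month-year

parts <- strsplit(labels, "-", fixed = TRUE)
month_abbr <- vapply(parts, `[`, character(1), 1)
year_2digit <- vapply(parts, `[`, character(1), 2)

# Look up the month number without relying on the session locale
month_num <- match(month_abbr, month.abb)

# Build an all-numeric string, using day 1 of each month, and parse that
dates <- as.Date(sprintf("%s-%02d-01", year_2digit, month_num), format = "%y-%m-%d")
print(dates)
```

Because the string handed to as.Date() contains only digits, the result should not depend on LC_TIME.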
Why you see integer
These string values have integer type because R (before version 4.0.0, where stringsAsFactors defaulted to TRUE) automatically converts character vectors to factors when reading in a data frame. typeof() returns "integer" because that is the internal representation of a factor.
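The difference between a factor's integer codes and its labels can be seen directly. A small sketch with made-up values; the point is that as.character() recovers the original strings, while as.integer() only returns positions in the sorted level set:

```r
f <- factor(c("Jan-85", "Apr-99", "Feb-96"))  # made-up sample values

storage_type <- typeof(f)   # how the factor is stored internally
codes <- as.integer(f)      # positions in levels(f), not dates
strings <- as.character(f)  # the original labels

print(storage_type)
print(levels(f))
print(codes)
print(strings)
```

This is why date parsers should be given as.character(df$col) rather than the factor's underlying integers.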

read.csv2 date formatting in R

I wish to import my csv file into a data frame but the date in my csv file is in a non-standard format.
The date in the first column is in the following format:
08.09.2016
One of the arguments in my read.csv2 functions is to specify the classes and when I specify this column as a date I receive the following error upon execution:
Error in charToDate(x) :
character string is not in a standard unambiguous format
I'm guessing it doesn't like converting the date from factor class to date class.
I've read a little about POSIXlt but I don't understand the details of the function.
Any ideas how to convert the class from factor to date??
When you convert character to Date, you need to specify the format if it is not a standard one. The error you got is the result of calling as.Date("08.09.2016"). If you call as.Date("08.09.2016", format = "%m.%d.%Y") instead, it works.
I am not sure whether a format can be passed to read.csv2 for date parsing (maybe not). I would simply read this column in as a factor and then call as.Date(as.character(), format = "%m.%d.%Y") on it myself.
Generally we use the format "dd/mm/yy". How can I reorganise the date to that format?
Use format(, format = "%d/%m/%y").
A complete example:
format(as.Date("08.09.2016", format = "%m.%d.%Y"), format = "%d/%m/%y")
# [1] "09/08/16"
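Put together, the read-then-convert approach might look like the sketch below. The inline CSV text and the column names date and value are made up for illustration, and the format string follows the month-first reading used in the answer; if the file is actually day-first, "%d.%m.%Y" would be the format to use instead.

```r
# Made-up semicolon-separated input standing in for the real file
csv_text <- "date;value\n08.09.2016;10\n11.12.2016;12\n"

# Read the date column as plain text so read.csv2 does not try to parse it
df <- read.csv2(text = csv_text, colClasses = c("character", "integer"))

# Convert afterwards with an explicit format (month first, as in the answer)
date_format <- "%m.%d.%Y"
df$date <- as.Date(df$date, format = date_format)

# format() returns character strings for display; df$date itself stays a Date
display <- format(df$date, format = "%d/%m/%y")
print(df$date)
print(display)
```

Keeping the column as Date and formatting only for display preserves sorting and date arithmetic.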
