I'm running into an issue when importing Excel data into R and converting it to dates. I have a "time_period" column that comes from Excel in Excel's serial date format, i.e. 5-digit numbers (e.g. 41640).
> head(all$time_period)
[1] "41640" "41671" "41699" "41730" "41760" "41791"
These numbers are originally in chr format so I change them to numeric type with:
all[,3] <- lapply(all[,3], function(x) as.numeric(as.character(x)))
Once that is complete, I run the below to format the date:
all$time_period <-format(as.Date(all$time_period, "1899-12-30"), "%Y-%m-%d")
However, once this action is completed the time_period column is all the same date (presumably the first date in the column).
> head(all$time_period)
[1] "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01"
Any suggestions? Thanks in advance.
Set the origin argument in as.Date().
These numbers are day counts from an origin, and the origin depends on the date system of the workbook, which usually follows the platform the Excel file was created on.
Windows: as.Date(my_date, origin = "1899-12-30")
Mac: as.Date(my_date, origin = "1904-01-01")
For example:
x <- c("41640","41671","41699","41730","41760","41791")
x <- as.numeric(x)
format(as.Date(x, origin = "1899-12-30"), "%Y-%m-%d")
Returns:
[1] "2014-01-01" "2014-02-01" "2014-03-01" "2014-04-01" "2014-05-01" "2014-06-01"
I believe this one line solves your problem. You don't need format(), as the default output format of a Date is already "%Y-%m-%d".
time_period <- c("41640", "41671", "41699", "41730", "41760", "41791")
as.Date(as.numeric(time_period), origin = "1899-12-30")
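The two origins mentioned above can be wrapped in a small helper. This is only a sketch: the function name excel_to_date and its date_system argument are made up for illustration, and it assumes the serials are whole-day counts in either the 1900 or the 1904 date system.

```r
# Hypothetical helper (not part of base R): convert Excel serial day
# numbers to Date, given the workbook's date system.
excel_to_date <- function(serial, date_system = c("1900", "1904")) {
  date_system <- match.arg(date_system)
  # Origins as quoted in the answer above
  origin <- if (date_system == "1900") "1899-12-30" else "1904-01-01"
  # as.character() first, so factor input yields its labels, not its codes
  as.Date(as.numeric(as.character(serial)), origin = origin)
}

time_period <- c("41640", "41671", "41699")
excel_to_date(time_period)
# [1] "2014-01-01" "2014-02-01" "2014-03-01"
```

If the resulting dates look about four years off, the workbook probably uses the other date system.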
My data comes from Excel. The dates are in dd/mm/yyyy format:
certificados$Fecha <- c("22/02/2019", "43679", "22/02/2019", "22/01/2019", "28/10/2019",
"18/09/2019")
However, R is reading some dates as mm/dd/yyyy. My code is supposed to convert all of them to a specific format.
certificados$Fecha <- as.Date(certificados$Fecha, format = "%d/%m/%Y")
But I'm getting NAs due to date format issues.
If you cannot fix this at the source, this code finds both formats:
vec <- c("22/02/2019", "43679", "22/02/2019", "22/01/2019", "28/10/2019", "18/09/2019")
out <- as.Date(vec, format = "%d/%m/%Y")
out
# [1] "2019-02-22" NA "2019-02-22" "2019-01-22" "2019-10-28" "2019-09-18"
isna <- is.na(out)
out[isna] <- as.Date(as.integer(vec[isna]), origin = "1899-12-30") # Excel's 1900 date system
out
# [1] "2019-02-22" "2019-08-02" "2019-02-22" "2019-01-22" "2019-10-28" "2019-09-18"
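The two-pass idea above can also be written as a reusable function. This is a sketch under assumptions: the name parse_mixed_dates is hypothetical, and the numeric entries are assumed to be Excel serials in the 1900 date system, for which the R origin is "1899-12-30" (the origin used in the first thread). Under that assumption 43679 maps to 2019-08-02.

```r
# Hypothetical helper: parse a vector that mixes dd/mm/yyyy strings
# with Excel serial numbers (1900 date system assumed).
parse_mixed_dates <- function(x, fmt = "%d/%m/%Y", origin = "1899-12-30") {
  x <- as.character(x)
  out <- as.Date(x, format = fmt)
  # Only all-digit strings are treated as serials; anything else stays NA
  is_serial <- is.na(out) & grepl("^[0-9]+$", x)
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = origin)
  out
}

parse_mixed_dates(c("22/02/2019", "43679", "28/10/2019", "not a date"))
# [1] "2019-02-22" "2019-08-02" "2019-10-28" NA
```

Restricting the second pass to all-digit strings means genuinely malformed entries remain NA instead of raising coercion warnings.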
I have dates in this format:
Apr-12,
Dec-12,
30-Jul-14,
Mar-16,
29-Feb-16,
May-17,
20-Nov-14,
R is treating this as a factor variable. I want it to be treated as a date, and wherever the day is missing, it should be set to the 1st.
Thank you in advance!
I think we need to parse them separately because the format is not consistent. We first parse the ones that have day, month and year components. The ones that return NA are then parsed after prepending "01-" to them.
new_x <- as.Date(x, "%d-%b-%y")
new_x[is.na(new_x)] <- as.Date(paste0("01-", x[is.na(new_x)]), "%d-%b-%y")
new_x
#[1] "2012-04-01" "2012-12-01" "2014-07-30" "2016-03-01" "2016-02-29" "2017-05-01"
#[7] "2014-11-20"
Read more about formats at ?strptime.
data
x <-factor(c("Apr-12", "Dec-12", "30-Jul-14", "Mar-16", "29-Feb-16",
"May-17","20-Nov-14"))
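Conditionally prepend "01-" when the first three characters are in the built-in constant month.abb. Here dtvec must be a character vector (convert a factor with as.character() first, because ifelse() does not preserve factor labels):
dtvec <- as.character(x) # x as defined in the answer above
as.Date(ifelse(substr(dtvec, 1, 3) %in% month.abb, paste0("01-", dtvec), dtvec), "%d-%b-%y")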
[1] "2012-04-01" "2012-12-01" "2014-07-30" "2016-03-01" "2016-02-29" "2017-05-01" "2014-11-20"
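The same padding step can be done with a single regular-expression substitution. This is a sketch, assuming every day-less value starts with a three-letter month abbreviation followed by a hyphen, and that the session's time locale uses English month names (otherwise %b will not match):

```r
x <- c("Apr-12", "Dec-12", "30-Jul-14", "Mar-16", "29-Feb-16")

# Prepend "01-" only where the string starts with three letters and a hyphen,
# i.e. where the day part is missing
padded <- sub("^([A-Za-z]{3}-)", "01-\\1", x)
padded
# [1] "01-Apr-12" "01-Dec-12" "30-Jul-14" "01-Mar-16" "29-Feb-16"

as.Date(padded, format = "%d-%b-%y")
```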
I've imported one date value into R:
dtime <- read.csv("dtime.csv", header=TRUE)
Its output (7th Nov, 2013) is printed as:
> dtime
Date
1 07-11-2013 23:06
and also its class is 'factor'.
> class(dtime$Date)
[1] "factor"
Now, I want to extract the time details (hours, minutes, seconds) from the data. So, I was trying to convert the dataframe's date value to Date type. But none of the following commands worked:
dtime <- as.Date(as.character(dtime),format="%d%m%Y")
unclass(as.POSIXct(dtime))
as.POSIXct(dtime$Date, format = "%d-%m-%Y %H:%M:%S")
How do I achieve this in R?
Your attempts didn't work because the format specified was wrong.
With base R there are two possible ways of solving this. The first is as.POSIXlt:
Res <- as.POSIXlt(dtime$Date, format = "%d-%m-%Y %H:%M")
Res$hour
Res$min
Also, for more options, see
attr(Res, "names")
## [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst" "zone" "gmtoff"
Or a bit less conveniently with as.POSIXct
Res2 <- as.POSIXct(dtime$Date, format = "%d-%m-%Y %H:%M")
format(Res2, "%H") # returns a character vector
format(Res2, "%M") # returns a character vector
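Putting the as.POSIXct route together, a minimal self-contained sketch might look like this. The data frame is rebuilt inline instead of being read from dtime.csv, and tz = "UTC" is an assumption; substitute the time zone the data was actually recorded in.

```r
# Stand-in for the imported data (the original comes from dtime.csv)
dtime <- data.frame(Date = "07-11-2013 23:06", stringsAsFactors = FALSE)

# The format string must mirror the input: day-month-year hour:minute
parsed <- as.POSIXct(dtime$Date, format = "%d-%m-%Y %H:%M", tz = "UTC")

# format() returns character, so convert to integer for numeric work
dtime$hour   <- as.integer(format(parsed, "%H"))
dtime$minute <- as.integer(format(parsed, "%M"))
dtime$second <- as.integer(format(parsed, "%S"))  # 0, the input has no seconds
dtime
```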
I would like to contribute a solution using lubridate:
dates <- c("07-11-2013 23:06", "08-10-2012 11:11")
dta <- data.frame(dates)
library(lubridate)
dta$properDate <- dmy_hm(dta$dates)
If needed, lubridate will enable you to conveniently specify time zones or extract additional information.
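For instance, the time components can be pulled out with lubridate's accessor functions. This sketch assumes the lubridate package is installed, and tz = "UTC" is an illustrative choice, not something stated in the question.

```r
library(lubridate)

dates <- c("07-11-2013 23:06", "08-10-2012 11:11")
# dmy_hm() parses day-month-year hour:minute; tz sets the time zone
parsed <- dmy_hm(dates, tz = "UTC")

hour(parsed)    # 23 11
minute(parsed)  # 6 11
```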
I am trying to convert the string "2013-JAN-14" into a Date as follows:
sdate1 <- "2013-JAN-14"
ddate1 <- as.Date(sdate1,format="%Y-%b-%d")
ddate1
but I get:
[1] NA
What am I doing wrong? Should I install a package for this purpose? (I tried installing chron.)
Works for me. The reason it doesn't work for you probably has to do with your system locale.
?as.Date has the following to say:
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
Worth a try.
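Applied to the string from the question, the locale switch could be wrapped so that the previous setting is always restored. This is a sketch; the function name parse_in_c_locale is made up for illustration.

```r
# Hypothetical helper: parse dates with English month names regardless of
# the session's LC_TIME setting, restoring that setting afterwards.
parse_in_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(x, format = format)
}

parse_in_c_locale("2013-JAN-14", "%Y-%b-%d")
# [1] "2013-01-14"
```

Using on.exit() means the locale is reset even if the parsing step throws an error.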
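This can also happen when the format you pass to as.Date does not describe the string you are converting, which is easy to get wrong when the date is stored as a factor. The format argument describes the input, not the desired output. One way around it is to parse with strptime first, giving the original format, and then convert the resulting POSIXlt to Date.
Wrong way: direct conversion with a format that does not match the input: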
a<-as.factor("24/06/2018")
b<-as.Date(a,format="%Y-%m-%d")
You will get as an output:
a
[1] 24/06/2018
Levels: 24/06/2018
class(a)
[1] "factor"
b
[1] NA
Right way: parse the factor into POSIXlt using its original format, then convert to Date:
a <- as.factor("24/06/2018")
abis <- strptime(a, format = "%d/%m/%Y") # the format of the original string
b <- as.Date(abis) # POSIXlt to Date; no format is needed at this step
You will get as an output:
abis
[1] "2018-06-24 AEST"
class(abis)
[1] "POSIXlt" "POSIXt"
b
[1] "2018-06-24"
class(b)
[1] "Date"
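For comparison, as.Date also accepts a factor directly, provided the format describes the input string. A short sketch using the same value:

```r
a <- as.factor("24/06/2018")

# format describes how the input is written (day/month/year)
ok <- as.Date(a, format = "%d/%m/%Y")
ok
# [1] "2018-06-24"

# A format that does not match the input yields NA
bad <- as.Date(a, format = "%Y-%m-%d")
bad
# [1] NA

# To control how the date is displayed, apply format() to the parsed Date
format(ok, "%Y/%m/%d")
# [1] "2018/06/24"
```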
My solution below might not work for every problem that results in as.Date() returning NAs, but it does work for some, namely when the date variable is read in as a factor.
Simply read in the .csv with stringsAsFactors = FALSE (which is the default since R 4.0.0):
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date)
After trying (and failing) to solve the NA problem with my system locale, this solution worked for me.
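A self-contained way to try this without a file on disk is to pass the CSV content through read.csv's text argument. The column names and values below are invented for illustration.

```r
# Inline stand-in for data.csv (contents are made up)
csv_text <- "date,value\n2018-06-24,1\n2018-06-25,2"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(data$date)
# [1] "character"

# ISO-style yyyy-mm-dd strings parse without an explicit format
data$date <- as.Date(data$date)
class(data$date)
# [1] "Date"
```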
How would I convert the following character variables to dates?
strDates <- c("Jan.2008", "Feb.2008")
str(strDates)
chr [1:2] "Jan.2008" "Feb.2008"
dates <- as.Date(strDates, "%b %Y")
str(dates)
Date[1:2], format: NA NA
Any assistance would be greatly appreciated
To form a valid 'date', you also need a day, which your data lacks. So we add one, simply using an arbitrary day (here: the first of the month):
R> strDates <- c("Jan.2008", "Feb.2008")
R> strptime(paste("01", strDates), "%d %b.%Y")
[1] "2008-01-01" "2008-02-01"
R>
A Date requires a day element as well, so you can add that to the input string with paste:
full.dates <- paste("01", strDates, sep = ".")
Specify the template correctly, including separator tokens:
as.Date(full.dates, "%d.%b.%Y")
[1] "2008-01-01" "2008-02-01"