I have an Excel file that includes dates. I'm importing this file into a data frame.
After importing, I tried to convert one column to a date format, but it displays 'NA'.
What I tried:
str(df$Date_of_visit) # prints type before conversion
df$Date_of_visit # values in the column
df$Date_of_visit <- as.Date(df$Date_of_visit, origin = "1899-12-30", format="%m%d%y") #converting to date
str(df$Date_of_visit) # prints type after conversion
print(df$Date_of_visit) # values in the column
Output I got :
chr [1:4] "43503" "43319" "43473" "43473"
Date[1:4], format: NA NA NA NA
[1] NA NA NA NA
Can someone help me out? What mistake am I making here?
Thanks in advance!
Regards
Mouni.
Drop the format= argument from as.Date() and convert the character values to numeric first. The values are Excel serial day numbers, not text in a month/day/year layout, so format= cannot match them and you get NA. Example:
dte <- c("43503","43319","43473","43473")
dte <- as.Date(as.numeric(dte), origin = "1899-12-30")
dte
#[1] "2019-02-07" "2018-08-07" "2019-01-08" "2019-01-08"
format(dte, "%m%d%Y")
#[1] "02072019" "08072018" "01082019" "01082019"
You can use format() to convert the Date objects to character strings in the format of your choice. Note that format() returns a character object, not a Date.
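As a rough sketch, the same idea can be wrapped in a small function. The name excel_serial_to_date is made up for illustration, and the 1899-12-30 origin assumes the workbook uses Excel's 1900 date system:

```r
# Hypothetical helper: convert Excel serial day numbers (possibly stored as text) to Date.
# Assumption: the workbook uses the 1900 date system, hence origin 1899-12-30.
excel_serial_to_date <- function(x) {
  # Text that is not a number becomes NA instead of raising an error
  n <- suppressWarnings(as.numeric(as.character(x)))
  as.Date(n, origin = "1899-12-30")
}

serials <- c("43503", "43319", "not a date")
converted <- excel_serial_to_date(serials)
print(converted)
```

Entries that are not numeric come back as NA, which makes it easy to spot rows that need a different treatment.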
I have a data frame containing date time information as characters in the format dd/mm/yyyy hh:mm but I can't get it to convert e.g
$ LaserStart : chr "07/12/2014 11:21" "13/12/2014 05:37"
I am trying to convert them to date time using
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%B/%Y %H:%M")
this runs without producing any errors but when I review the dataframe I have only NA
$ Start : POSIXlt, format: NA NA NA ...
thanks in advance
> x <- "07/12/2014 11:21"
> y <- strptime(x, format='%m/%d/%Y %H:%M')
> strftime(y, '%d/%B/%Y %H:%M')
[1] "12/July/2014 11:21"
Just figured it out
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%B/%Y %H:%M")
should be
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%m/%Y %H:%M")
which gives
$ Start : POSIXlt, format: "2014-12-07 11:21:00" "2014-12-13 05:37:00"
sorry for bothering you all :)
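A minimal sketch of the difference between the two codes, using the value from the question: %m matches a month number, while %B matches a month name, so %B cannot match "12". The tz = "UTC" argument is an assumption added only to keep the result independent of the local time zone.

```r
x <- "07/12/2014 11:21"

# %B expects a month name such as "December", so a numeric month does not parse
bad <- strptime(x, format = "%d/%B/%Y %H:%M", tz = "UTC")

# %m expects a month number (01-12), which matches the input
good <- strptime(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

print(is.na(bad))
print(format(good, "%Y-%m-%d %H:%M"))
```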
I have difficulty converting dates from excel (reading from csv) to R. Help is much appreciated.
Here is what I'm doing:
df$date = as.Date(df$excel.date, format = "%d/%m/%Y")
However, some dates get converted but some not. Here is the output of:
head(df$date)
[1] NA NA NA "0006-01-05" NA NA
the first 5 entries imported from csv file are as follows:
7/28/05
7/28/05
12/16/05
5/1/06
4/21/05
and here is the output of:
head(df$excel.date)
[1] 7/28/05 7/28/05 12/16/05 5/1/06 4/21/05 1/25/07
1079 Levels: 1/1/00 1/1/02 1/1/97 1/10/96 1/10/99 1/11/04 1/11/94 1/11/96 1/11/97 1/11/98 ... 9/9/99
str(df)
.
.
$ excel.date : Factor w/ 1079 levels "1/1/00","1/1/02",..: 869 869 288 618 561 48 710 1022 172 241 ...
First of all, make sure the dates in your file are in an unambiguous format, using full years (not just the last two digits). %Y is for "year with century" (see ?strptime), but your data has no century. So you can either use %y (at your own risk, see ?strptime again) or reformat the dates in Excel.
It is also a good idea to use as.is=TRUE with read.csv when reading in these data -- otherwise character vectors are converted to factors which can lead to unexpected results.
And on Windows it may be easier to use RODBC to read the dates directly from the xls or xlsx file.
(edit)
The following may give a hint:
> as.Date("13/04/2014", format= "%d/%m/%Y")
[1] "2014-04-13"
> as.Date(factor("13/04/2014"), format= "%d/%m/%Y")
[1] "2014-04-13"
> as.Date(factor("13/04/14"), format= "%d/%m/%Y")
[1] "14-04-13"
> as.Date(factor("13/04/14"), format= "%d/%m/%y")
[1] "2014-04-13"
(So as.Date can actually take care of factors - the magic happens in the as.Date.factor method, defined as:
function (x, ...) as.Date(as.character(x), ...)
It is not a good idea to represent dates as factors but in this case it is not a problem either. I think the problem is excel which saves your years as 2-digit numbers in a CSV file, without asking you.)
-
The ?strptime help file says that using %y is platform specific - you can get different results on different machines. So if there's no way of going back to the source and saving the csv in a better way, you might use something like the following:
x <- c("7/28/05", "7/28/05", "12/16/05", "5/1/06", "4/21/05", "1/25/07")
repairExcelDates <- function(x, yearcol=3, fmt="%m/%d/%Y") {
x <- do.call(rbind, lapply(strsplit(x, "/"), as.numeric))
year <- x[,yearcol]
if(any(year>99)) stop("don't know what to do")
x[,yearcol] <- ifelse(year <= as.numeric(format(Sys.Date(), "%y")), year+2000, year + 1900)
# if the 2-digit year <= the current 2-digit year then add 2000, otherwise add 1900
x <- apply(x, 1, paste, collapse="/")
as.Date(x, format=fmt)
}
repairExcelDates(x)
# [1] "2005-07-28" "2005-07-28" "2005-12-16" "2006-05-01" "2005-04-21"
# [6] "2007-01-25"
Your data is formatted as Month/Day/Year so
df$date = as.Date(df$excel.date, format = "%d/%m/%Y")
should be
df$date = as.Date(df$excel.date, format = "%m/%d/%y")
(lowercase %y, because the years in the file have only two digits; %Y would read 05 as the year 5)
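For two-digit years like the ones in the question, a minimal sketch with %y looks like this. According to ?strptime, R prefixes 00 to 68 with 20 and 69 to 99 with 19, so check that this cutoff suits your data. The third value is taken from the factor levels shown in the question.

```r
x <- c("7/28/05", "12/16/05", "1/10/96")

# %m/%d/%y: month, day, two-digit year
d <- as.Date(x, format = "%m/%d/%y")
print(d)
```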
I am trying to convert the string "2013-JAN-14" into a Date as follows:
sdate1 <- "2013-JAN-14"
ddate1 <- as.Date(sdate1,format="%Y-%b-%d")
ddate1
but I get :
[1] NA
What am I doing wrong? Should I install a package for this purpose? (I tried installing chron.)
Works for me. The reason it doesn't work for you probably has to do with your system locale.
?as.Date has the following to say:
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
Worth a try.
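The locale switch from the help page can be sketched as a small wrapper that restores the previous setting when it finishes. The function name parse_in_c_locale is hypothetical; it relies only on Sys.getlocale(), Sys.setlocale() and as.Date() from base R.

```r
# Hypothetical helper: parse English month names regardless of the session locale.
parse_in_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  # Restore the original LC_TIME setting even if parsing fails
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(x, format = format)
}

ddate1 <- parse_in_c_locale("2013-JAN-14", "%Y-%b-%d")
print(ddate1)
```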
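This can also happen when you convert a factor to Date with a format= that does not describe the input. Note that format= tells as.Date how to read the string, not how to print the result; as.Date itself accepts factors. Parsing with strptime first, using the layout of the input, is one way to get it right.
Wrong way: direct conversion from factor to date, with a format that does not match the input: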
a<-as.factor("24/06/2018")
b<-as.Date(a,format="%Y-%m-%d")
You will get as an output:
a
[1] 24/06/2018
Levels: 24/06/2018
class(a)
[1] "factor"
b
[1] NA
Right way: parse the factor with strptime, using the format of the input, and then convert to Date
a<-as.factor("24/06/2018")
abis<-strptime(a,format="%d/%m/%Y") # format= describes how the input is laid out
b<-as.Date(abis) # Date objects print as %Y-%m-%d by default
You will get as an output:
abis
[1] "2018-06-24 AEST"
class(abis)
[1] "POSIXlt" "POSIXt"
b
[1] "2018-06-24"
class(b)
[1] "Date"
My solution below might not work for every problem that results in as.Date() returning NA's, but it does work for some, namely, when the Date variable is read in in factor format.
Simply read in the .csv with stringsAsFactors=FALSE
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date)
After trying (and failing) to solve the NA problem with my system locale, this solution worked for me.
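A small base-R sketch of the factor case, assuming day/month/year text as in the example above: as.Date() accepts a factor directly as long as format= describes the input, and it returns NA when it does not.

```r
f <- factor(c("24/06/2018", "01/07/2018"))

# format= matches the input layout, so the factor converts cleanly
d <- as.Date(f, format = "%d/%m/%Y")

# format= describes some other layout, so every element becomes NA
bad <- as.Date(f, format = "%Y-%m-%d")

print(d)
print(bad)
```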
I have this vector representing time recorded as hours (0 to 24) and minutes (0 to 59). I would like to transform it into a %H:%M time format in R so that I can use functions like difftime.
str(SF5$ES_TIME)
int [1:11452] 1940 600 5 1455 1443 2248 1115 900 200 420 ...
This is what I've tried, but in both cases, I got an error:
>SF5$time1<-as.POSIXct(SF5$ES_TIME, format = "%H:%M",tz="EST")
Error in as.POSIXct.numeric(SF5$ES_TIME, format = "%H:%M", tz = "EST") :
'origin' must be supplied
SF5$time1<-as.POSIXct(as.character(SF5$ES_TIME), format="%H:%M",tz="")
> str(SF5$time1)
POSIXct[1:11452], format: NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ...
Any help or reading suggestions would be much appreciated!
Thank you,
Aurelie
Well, the error message tells you to provide origin and a minute is 60 seconds, so:
SF5 <- list(ES_TIME=as.integer(c(1940,600,5,1455,1443,2248,1115,900,200,420)))
x <- as.POSIXct(SF5$ES_TIME*60, origin="1970-01-01")
format(x, format="%H:%M")
#[1] "08:20" "10:00" "00:05" "00:15" "00:03" "13:28" "18:35" "15:00" "03:20" "07:00"
Note that the POSIXct date is just a number (with a class), so you need the format call to print it as you want - the default printing of x would print the full date info (year/month/day etc).
...any origin date would do since you don't care about it, but 1970-01-01 is the usual origin...
I was able to crack the code! Thank you all for your tips!
#1) as suggested by Justin : put all numbers into four digits with zero padding
SF5$ES_TIME2<-sprintf("%04d",SF5$ES_TIME)
#2) Matched these %H%M with their corresponding date %y-%m-%d
SF5$ES.datetime <- paste(SF5$ES_TIME2,SF5$ES_DATE,sep=" ")
#3) Transform into Date-Time format
SF5$ES.datetime2 <- as.POSIXct(SF5$ES.datetime,format="%H%M %y-%m-%d", tz="")
# Did the same for my other time-date of interest
SF5$SH_TIME2<-sprintf("%04d",SF5$SH_TIME)
SF5$SH.datetime <- paste(SF5$SH_TIME2,SF5$SH_DATE,sep=" ")
SF5$SH.datetime2 <- as.POSIXct(SF5$SH.datetime,format="%H%M %y-%m-%d", tz="")
# Calculate the time difference between the 2 date-time in hours
SF5$duration<-difftime(SF5$SH.datetime2,SF5$ES.datetime2,units="hours",tz="")
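The same steps can be sketched on made-up data, since the SF5 data frame is not available here. The sample values, the single shared date and the helper name to_datetime are assumptions for illustration; tz = "UTC" is used only to avoid daylight-saving surprises.

```r
# Made-up HHMM integers standing in for SF5$ES_TIME and SF5$SH_TIME
es_time <- c(1940L, 600L, 5L)
sh_time <- c(2010L, 745L, 130L)
day <- "2014-01-15"  # assumed common date; the original pairs each time with its own date column

# Hypothetical helper: zero-pad to four digits, then parse as date plus %H%M
to_datetime <- function(hhmm, date) {
  as.POSIXct(paste(date, sprintf("%04d", hhmm)),
             format = "%Y-%m-%d %H%M", tz = "UTC")
}

duration <- difftime(to_datetime(sh_time, day), to_datetime(es_time, day),
                     units = "mins")
print(duration)
```

The zero padding matters: without it, a value like 5 (meaning 00:05) has no hour digits for %H to read.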
How would I convert the following character variables to dates?
strDates <- c("Jan.2008", "Feb.2008")
str(strDates)
chr [1:2] "Jan.2008" "Feb.2008"
dates <- as.Date(strDates, "%b %Y")
str(dates)
Date[1:2], format: NA NA
Any assistance would be greatly appreciated
To form a valid 'date', you also need a day which your data was lacking. So we add one, and we simply use an arbitrary day (here: first of the month):
R> strDates <- c("Jan.2008", "Feb.2008")
R> strptime(paste("01", strDates), "%d %b.%Y")
[1] "2008-01-01" "2008-02-01"
R>
A Date requires a day element as well, so you can add that to the input string with paste:
full.dates <- paste("01", strDates, sep = ".")
Specify the template correctly, including separator tokens:
as.Date(full.dates, "%d.%b.%Y")
[1] "2008-01-01" "2008-02-01"
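Because %b depends on the locale, a variant that avoids it may be useful when the month names are known to be English. This sketch looks the abbreviation up in R's built-in month.abb constant and, like the answers above, assumes the first of the month as the day:

```r
strDates <- c("Jan.2008", "Feb.2008")

# Split "Jan.2008" into month abbreviation and year
parts <- strsplit(strDates, ".", fixed = TRUE)
mon <- match(vapply(parts, `[`, character(1), 1), month.abb)  # month number 1-12
yr <- as.integer(vapply(parts, `[`, character(1), 2))

# Build ISO dates with an arbitrary day (first of the month)
dates <- as.Date(sprintf("%04d-%02d-01", yr, mon))
print(dates)
```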