I am working with data from CSV files that all have the same layout, so I am hoping to come up with code that can easily be applied to all of them.
However, sadly enough I am failing at step one :-(.
The csv files have the date and time saved in one column, so when I import them with read.csv that column gets read as a chr. How can I most easily convert this into a date that I then can use for plotting and analysis?
Here is what I tried:
load the data --> will save the date and time as chr under mydata$Date.Time (e.g. 1/1/15 0:00)
mydata<-read.csv(file.choose(), stringsAsFactors = FALSE,
strip.white = TRUE,
na.strings = c("NA",""), skip=16,
header=TRUE)
separate the Date.Time into Date and Time:
new <- do.call( rbind , strsplit( as.character( mydata$Date.Time ) , " " ) )
add these two back to the df mydata:
cbind( mydata , Date = new[,2] , Time = new[,1] )
convert Date into a date format via as.Date:
mydata$Date <- as.Date(new[,1], format="")
So this works fine for the date however I am stuck with the time, I tried this:
mydata$Time <- format(as.POSIXct(new[,2], format="%H:%M"))
this gives me the following error:
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format
Is there a smarter way of doing this? Reading in dates and times seems to be one of the fundamental tasks, and I would like to understand it. Is there a way for R to recognize the date and time directly from the csv? Or is it generally smarter to generate a time vector on its own, and if so, how would I do that?
Thanks so much for your help.
Sandra
If you want to use time only, consider using the chron package:
library(chron)
mytime <- times("21:19:37")
or in your case
times(new[,2])
assuming that that's a character vector.
I tried the chron approach but it wouldn't work for me :-(.
So what I ended up doing is just creating a time vector for the period that I am loading the data in for:
date <-seq(as.POSIXct("2015/1/1 00:00"), as.POSIXct("2015/1/31 23:00"), "hours")
and then adding it back to the df.
Not what I wanted but it will work until I find the ultimate solution :-)
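For strings like the "1/1/15 0:00" example in the question, a minimal base-R sketch is to parse the whole column in one step with an explicit format, then derive date and time from the result. The sample data frame below is made up for illustration, and the month/day order and the UTC time zone are assumptions; swap in "%d/%m/%y %H:%M" if the day comes first in your files.

```r
# Hypothetical sample mimicking the Date.Time column described in the question
mydata <- data.frame(
  Date.Time = c("1/1/15 0:00", "1/1/15 1:00", "1/31/15 23:00"),
  stringsAsFactors = FALSE
)

# Assumption: month/day/two-digit year, times in UTC
mydata$DateTime <- as.POSIXct(mydata$Date.Time,
                              format = "%m/%d/%y %H:%M", tz = "UTC")

# Optional: separate Date (class Date) and Time (character) columns
mydata$Date <- as.Date(mydata$DateTime)
mydata$Time <- format(mydata$DateTime, "%H:%M")

str(mydata)
```

Keeping the single POSIXct column is usually enough for plotting; the separate Date and Time columns are only there if you need them.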
I have a problem with the as.Date function.
I have a column of ordinary dates in Excel, but when I import it into R, the dates become numbers, like 33584. I understand that these count days from a specific day. I want my dates in the form "dd-mm-yy".
The original data is:
[Image: how the "date" variable looks in R]
I've tried:
as.date <- function(x, origin = getOption(date.origin)){
origin <- ifelse(is.null(origin), "1900-01-01", origin)
as.Date(date, origin)
}
and also simply
as.Date(43324, origin = "1900-01-01")
but neither of them works. It shows the error: do not know how to convert '.' to class “Date”
Thank you guys!
The janitor package has a pair of functions designed to deal with reading Excel dates in R. See the following links for usage examples:
https://www.rdocumentation.org/packages/janitor/versions/2.0.1/topics/excel_numeric_to_date
https://www.rdocumentation.org/packages/janitor/versions/2.0.1/topics/convert_to_date
janitor::excel_numeric_to_date(43324)
[1] "2018-08-12"
I've come across excel sheets read in with readxl::read_xls() that read date columns in as strings like "43488" (especially when there is a cell somewhere else that has a non-date value). I use
xldate <- function(x) {
  # Excel serial numbers (possibly read in as strings) count days from 1899-12-30
  xn <- as.numeric(x)
  as.Date(xn, origin = "1899-12-30")
}
d <- data.frame(date=c("43488"))
d$actual_date <- xldate(d$date)
print(d$actual_date)
# [1] "2019-01-23"
Dates are notoriously annoying. I would highly recommend the lubridate package for dealing with them. https://lubridate.tidyverse.org/
Use as_date() from lubridate to read numeric dates if you need to.
You can use format() to put it in dd-mm-yy.
library(lubridate)
# Excel serial numbers count from 1899-12-30; lubridate::origin is 1970-01-01
date_vector <- as_date(c(33584, 33585), origin = "1899-12-30")
formatted_date_vector <- format(date_vector, "%d-%m-%y")
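As a cross-check that needs no extra packages, the same conversion can be sketched in base R. It uses the serial from the question (33584) and the one from the janitor example (43324), and it assumes the numbers come from Excel's default 1900 date system, for which "1899-12-30" is the usual origin.

```r
# Excel serial numbers, assumed to come from the default 1900 date system
excel_serial <- c(33584, 43324)

# Day counts start at 1899-12-30 for that system
dates <- as.Date(excel_serial, origin = "1899-12-30")

# dd-mm-yy is only a display format; keep `dates` itself for any calculations
formatted <- format(dates, "%d-%m-%y")
print(formatted)
# [1] "12-12-91" "12-08-18"
```

The second value agrees with the 2018-08-12 that janitor returns for 43324.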
I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this doesn't incorporate the time. I tried adding the time to the format argument, but it still did not work. Is there an R function that will convert the data into a numeric value?
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class, and you can get all the gory details at ?DateTimeClasses. The help page for the formats used when parsing input, however, is at ?strptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01 (value depends on the session time zone)
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)
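Since the question asked for a numeric value, note that the number you get depends on the time zone used for parsing. A minimal base-R sketch that pins the zone is shown below; treating the timestamps as UTC is an assumption, and "%b" needs an English (or C) locale to match "JUN".

```r
pickup <- c("01JUN17:00:00:00", "01JUN17:11:00:00")

# Assumption: the timestamps are in UTC; change tz if they are local times
pickup_ct <- as.POSIXct(pickup, format = "%d%b%y:%H:%M:%S", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC
pickup_num <- as.numeric(pickup_ct)
print(pickup_num)
# [1] 1496275200 1496314800
```

The two values differ by 39600 seconds, which is the 11 hours between the two pickups.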
At the moment I'm trying to convert a string into a time format.
For example, my string looks like this: time <- '12:00'.
I already tried to use the chron package, and my code looks like this:
time <- paste(time, ':00', sep = '')
time <- times(time)
Instead of giving a value like "12:00:00", the function times() always translates the object time into "0.5".
Am I using the wrong approach?
regards
Your code works. If you check the 'class()' it is "times". However, if you want another way, try:
time <- '12:00:00'
newtime<-as.POSIXlt(time, format = "%H:%M:%S") # The whole date with time
t <- strftime(newtime, format="%H:%M:%S") # To extract the time part
t
#[1] "12:00:00"
Cheers !
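On where the 0.5 comes from: chron stores a time of day as a fraction of a day, so noon is 0.5. The arithmetic can be sketched in base R without chron; the variable names here are made up for illustration.

```r
time <- "12:00"

# Split "HH:MM" into numeric hours and minutes
parts <- as.numeric(strsplit(time, ":", fixed = TRUE)[[1]])

# Fraction of a day elapsed since midnight
day_fraction <- (parts[1] * 3600 + parts[2] * 60) / 86400
print(day_fraction)
# [1] 0.5

# Going back from the fraction to an "HH:MM:SS" string
secs <- round(day_fraction * 86400)
time_string <- sprintf("%02d:%02d:%02d",
                       secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
print(time_string)
# [1] "12:00:00"
```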
I have a dataset with date-times like this: "{datetime:2015-07-01 09:10:00". I want to remove the text and keep both the date and the time, since as.Date returns only the date. I wrote the code below, but the second line (the one with strsplit) returns only the date-time of the first row and overwrites all the others. I would love to get ALL my date-times, not only the first. I thought about sapply, or maybe a for loop, but I can't get it right and I get many errors. I am a novice in R, so I don't really know the best way to do this.
Could you help me please? If you have another idea for the time and date format, or a simpler way to do it, that would be very welcome too.
data$`Date Time`=as.character(data$`Date Time`)
data$`Date Time`=unlist(strsplit(data[,1], split='e:'))[2]
date=substr(data$`Date Time`,0,10)
date=as.Date(date)
time=substr(data$`Date Time`,12,19)
data$Date=date
data$Time=time
Thank you very much for your help!
You could use the format argument to avoid all the strsplit:
times <- as.POSIXct(data$`Date Time`, format='{datetime:%Y-%m-%d %H:%M:%S')
(The reason for the "{datetime:" in the format is because you mentioned this is the format of your strings).
This object has both date and time in it, and then you can just store it in the dataframe as a single column of type POSIXct rather than two columns of type string e.g.
data$datetime <- times
but if you do want to store the date as a Date and the time as a string (as in your example above):
data$Date <- as.Date(times)
data$Time <- strftime(times, format='%H:%M:%S')
See ?as.Date, ?as.POSIXct, ?strptime for more details on that format argument and various conversions between date and string.
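A self-contained sketch of the same idea on made-up rows shows that every row is converted, not just the first. It strips the prefix with sub() before parsing, which is an alternative to putting the prefix in the format string; the sample values and the UTC time zone are assumptions.

```r
# Hypothetical sample rows in the format described in the question
data <- data.frame(
  `Date Time` = c("{datetime:2015-07-01 09:10:00",
                  "{datetime:2015-07-02 18:45:30"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# sub() is vectorised, so the prefix is removed from every row
clean <- sub("^\\{datetime:", "", data$`Date Time`)

# Assumption: timestamps are in UTC
stamp <- as.POSIXct(clean, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

data$Date <- as.Date(stamp)
data$Time <- format(stamp, "%H:%M:%S")
print(data[, c("Date", "Time")])
```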
I am still learning R, and get very confused when using various data types, classes, etc. I have run into this issue of "Dates" not being in the right format for xts countless times now, and find a solution each time after searching long and hard for (what I consider) complicated solutions.
I am looking for a way to load a CSV into R and convert the date upon loading it each time I want to load a csv into R. 99% of my files contain Date as the first column, in format 01-31-1900 (xts wants YYYY-mm-dd).
Right now I have the following:
FedYieldCurve <- read.csv("Yield Curve.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
FedYieldCurve$Date <- format(as.Date(FedYieldCurve$Date), "%Y/%m/%d")
and I am getting:
Error in charToDate(x) :
character string is not in a standard unambiguous format
The format argument must be inside as.Date. Try this (if the dates in the files are stored in the 01-31-1900 format):
as.Date(FedYieldCurve$Date,format="%m-%d-%Y")
When you coerce a string to a Date object, you have to describe the layout of the string through the format argument of the as.Date call. You get the error you reported when the string is in a format other than the standard YYYY-mm-dd (or YYYY/mm/dd) and no format is given.
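Since the question asks for something reusable across files, one possible sketch is a small wrapper that converts the date column right after reading. The name read_dated_csv and its arguments are hypothetical, and the inline CSV text stands in for a real file.

```r
# Hypothetical helper: read a CSV and convert one column to Date on load
read_dated_csv <- function(..., date_col = "Date", date_format = "%m-%d-%Y") {
  df <- read.csv(..., stringsAsFactors = FALSE)
  df[[date_col]] <- as.Date(df[[date_col]], format = date_format)
  df
}

# Made-up data standing in for a file such as "Yield Curve.csv"
csv_text <- "Date,Value\n01-31-1900,3\n02-28-1900,4"
FedYieldCurve <- read_dated_csv(text = csv_text, header = TRUE)

str(FedYieldCurve)
```

For a real file you would pass the path instead of text, for example read_dated_csv("Yield Curve.csv", header = TRUE).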
Provide a few lines of the file when asking questions like this. In the absence of this we have supplied some data below in a self contained example.
Use read.zoo from the zoo package (which xts loads) specifying the format. (Replace the read.zoo line with the commented line to read from a file.)
Lines <- "Date,Value
01-31-1900,3"
library(xts)
# z <- read.zoo("myfile.csv", header = TRUE, sep = ",", format = "%m-%d-%Y")
z <- read.zoo(text = Lines, header = TRUE, sep = ",", format = "%m-%d-%Y")
x <- as.xts(z)
See ?read.zoo and Reading Data in zoo.