I have a large data frame with two POSIXct columns and need to calculate the length of each ride.
Dates are formatted as follows:
format: "2020-10-31 19:39:43"
Can I use the difftime function for this?
Thanks
Given your data is using the correct POSIXct format you can simply subtract two dates to get the difference. No need for additional functions.
date1 <- as.POSIXct(strptime("2020-10-31 19:39:43", format = "%Y-%m-%d %H:%M:%OS"))
date2 <- as.POSIXct(strptime("2020-10-31 19:20:43", format = "%Y-%m-%d %H:%M:%OS"))
date1 - date2
Output: Time difference of 19 mins
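Since the question asks about difftime specifically, here is a minimal sketch of applying it to a whole data frame. The data frame `rides` and the column names `started_at` and `ended_at` are assumptions for illustration, not names from the original data.

```r
# Hypothetical ride data; the column names are assumptions.
rides <- data.frame(
  started_at = as.POSIXct(c("2020-10-31 19:20:43", "2020-10-31 18:00:00"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2020-10-31 19:39:43", "2020-10-31 19:30:00"), tz = "UTC")
)

# difftime lets you fix the unit instead of letting R choose one.
rides$ride_length <- difftime(rides$ended_at, rides$started_at, units = "mins")

# Convert to a plain number if you need it for arithmetic or plotting.
rides$ride_minutes <- as.numeric(rides$ride_length)
```

Plain subtraction picks the unit automatically, so fixing `units` is probably the safer choice when the result is stored in a column.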
It depends what output format you want.
For example if you want month difference between two dates, you can use the "interval" function from library "lubridate"
library(lubridate)
interval(as.Date(df$date1), as.Date(df$date2)) %/% months(1)
It also works with years, weeks, days, hours
I currently have a data frame with a column for Start.Time (imported from a *.csv file), and the format is in 24 hour format (e.g., 20:00:00 equals 8pm). My goal is to capture observations with a start time in various intervals (e.g., between 9:00:00 and 10:00:00), which also meet other criteria. However, it seems that R sorts this 'character' variable in a way that does not align with how our day goes (e.g., 14:00:00 is considered a lower value than 9:00:00).
For example, below is a line of code that works as intended, where I am capturing observations on two different trail segments, which had a start time between 8:00:00 and 9:00:00.
RLLtoMist8.9<-sum((dataset1$Trail.Segment==52|dataset1$Trail.Segment==55) &
(dataset1$Start.Time>="8:00" & dataset1$Start.Time < "9:00"),
na.rm=TRUE)
RLLtoMist8.9
But, this code below does not work as intended, as R is 'valuing' 9:00:00 as greater than 10:00:00.
RLLtoMist9.10 <-
sum((dataset1$Trail.Segment==52|dataset1$Trail.Segment==55) &
(dataset1$Start.Time>="9:00:00 AM" & dataset1$Start.Time < "10:00:00 AM"),
na.rm=TRUE)
It's certainly true that character types are sorted so that "14:00" is less than "9:00". However R has a datetime class which would sort times correctly once a character representation has been parsed.
a <- as.POSIXct("14:00", format="%H:%M")
b <- as.POSIXct("8:00", format="%H:%M")
# test
> a < b
[1] FALSE
You would be able to convert an entire column with:
dataset1$Start.Time <- as.POSIXct(dataset1$Start.Time, format="%H:%M")
The dates of a and b were the system date at the time of conversion, so if you printed them you would see dates and times in the default format. There are packages, such as chron, that let you use just times, but POSIXt objects necessarily have both dates and times. See ?DateTimeClasses. The lubridate package also has an 'interval' class, and there exists a difftime function in base R.
There are also seq.POSIXt and cut.POSIXt functions, either of which could be used to create multiple time or date boundaries for categorical transformations of datetimes.
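As a rough sketch of the cut.POSIXt idea, assuming a small made-up vector of start times (the values and variable names below are hypothetical), hourly bins and a 9-to-10 count might look like this:

```r
# Hypothetical start times; the date part defaults to the current date.
start_time <- as.POSIXct(c("8:15", "9:05", "9:59", "14:00"),
                         format = "%H:%M", tz = "UTC")

# cut.POSIXt assigns each time to the hour it falls in.
hour_bin <- cut(start_time, breaks = "hour")
counts <- table(hour_bin)

# Count the observations that start between 9:00 and 10:00.
nine <- as.POSIXct("9:00", format = "%H:%M", tz = "UTC")
ten  <- as.POSIXct("10:00", format = "%H:%M", tz = "UTC")
n_9_to_10 <- sum(start_time >= nine & start_time < ten)
```

Because the comparison is done on parsed datetimes rather than strings, 9:00 sorts before 10:00 and 14:00 as expected.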
Using the data.table library:
library(data.table)
# convert to a data.table
dataset1 <- as.data.table(dataset1)
# parse the character times into POSIXct rather than leaving them as character
dataset1[, Start.Time := as.POSIXct(Start.Time, format="%H:%M:%S")]
# now do your filtering
dataset1[between(Start.Time, as.POSIXct("09:00:00", format="%H:%M:%S"), as.POSIXct("10:00:00", format="%H:%M:%S")) & (Trail.Segment==52 | Trail.Segment==55)]
I have a dataset, df with a column containing dates in yyyy format (ex: 2018). I’m trying to make a time series graph, and therefore need to convert them to a date format.
I initially tried, df$year <- as.Date(df$year) but was told I needed to specify an origin.
I then tried to convert to a character, then a date format:
df$year <- as.character(df$year)
df$year <- as.Date(df$year, format = "%Y")
This seems to have worked; however, it changed all the years to yyyy-mm-dd format and set the month and day to today's date, April 5th. For example, 2018 becomes 2018-04-05.
Does anyone have an idea for how to fix this? I would like it to start on January 1, not the day I am performing the conversion. I also tried strptime(as.character(beer_states$year), "%Y") with the same result.
Any help would be very much appreciated. Thanks!
Add an arbitrary date and month before converting to date.
df$Date <- as.Date(paste(df$year, 1, 1), '%Y %m %d')
We can use as.yearmon
library(zoo)
df$Date <- as.Date(as.yearmon(paste0(df$year, '-01'), '%Y-%m'))
Current situation:
mydate <- "14:45"
class(mydate)
The current class of this value is a character. I would like to convert it into a POSIXlt format.
I tried the strptime() function but it unfortunately adds the full date to my hours when I actually only need Hours:Minutes
mydate <- strptime(mydate, format = "%H:%M")
What can I do to get a POSIXlt object containing only hours and minutes?
Thanks in advance for your replies!
POSIXlt and POSIXct always contain date and time. You can use chron times class to represent times less than 24:00:00.
library(chron)
tt <- times(paste(mydate, "00", sep = ":"))
tt
## [1] 14:45:00
times class objects are represented internally as a fraction of a day so, for example, adding 1/24 will add an hour.
tt + 1/24 # add one hour
## [1] 15:45:00
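If installing chron is not an option, one possible base R workaround is to keep the full date-time internally and only display the hours and minutes with format(). This is a sketch under that assumption; the variable names are illustrative.

```r
mydate <- "14:45"

# The parsed object still carries a date (the current day), as noted above.
parsed <- strptime(mydate, format = "%H:%M", tz = "UTC")

# Show only hours and minutes when printing or exporting.
shown <- format(parsed, "%H:%M")

# Arithmetic is done in seconds, so 3600 adds one hour.
one_hour_later <- format(parsed + 3600, "%H:%M")
```

The result of format() is a character string again, so do any arithmetic on the parsed object first and format last.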
For me it works like this:
test <- "2016-04-10T12:21:25.4278624"
z <- as.POSIXct(test,format="%Y-%m-%dT%H:%M:%OS")
#output:
z
"2016-04-10 12:21:25 CEST"
The code is from here: R: convert date from character to datetime
I would like to perform the reverse of R converting a factor YYYY-MM to a date.
I have a data frame (df) with dates presented as YYYY-MM-DD format (column "Date1") and would like to convert them into YYYY-MM format. The days range between 0-31 but these aren't necessary for my YYYY-MM format.
The point is that I have another set of columns that are in YYYY-MM format (column "Date2") and want to calculate the differences between them in whole months (values x, y and z in column "Difference(months)")
Date1       Date2    Difference(months)
2008-08-11  1995-02  x
2010-06-18  1972-09  y
2011-04-22  1956-11  z
Doing df$Date1_new <- as.Date(df$Date1, "%Y-%m") just gives a series of "NA" values.
Please can you give me a way that does what I am asking for or another easier method that gives the same desired result?
Using format(as.Date(df$Date1, "%Y-%m-%d"), "%Y-%m") works for converting 2008-08-11 to 2008-08, as suggested by @docendo discimus.
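The month difference itself is not shown above. A minimal sketch, assuming "whole months" means the calendar-month difference with days ignored, is to turn each date into a count of months and subtract. The helper name `ym` is hypothetical.

```r
# Example values taken from the table in the question.
date1 <- as.Date(c("2008-08-11", "2010-06-18", "2011-04-22"))

# Date2 has no day, so append an arbitrary one before parsing.
date2 <- as.Date(paste0(c("1995-02", "1972-09", "1956-11"), "-01"))

# Hypothetical helper: number of months since year 0.
ym <- function(d) {
  as.integer(format(d, "%Y")) * 12L + as.integer(format(d, "%m"))
}

diff_months <- ym(date1) - ym(date2)
diff_months
## [1] 162 453 653
```

If partial months should count differently (for example, rounding by day of month), this definition would need adjusting.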
I have log files where the date is mentioned in the ordinal date format.
wikipedia page for ordinal date
For example, 14273 means the 273rd day of 2014, so 14273 is 30-Sep-2014.
Is there a function in R to convert an ordinal date (14273) to a calendar date (30-Sep-2014)?
I tried the date package but didn't come across a function that would do this.
Try as.Date with the indicated format:
as.Date(sprintf("%05d", 14273), format = "%y%j")
## [1] "2014-09-30"
Notes
For more information see ?strptime [link]
The 273 part is sometimes referred to as the day of the year (as opposed to the day of the month) or the day number or the julian day relative to the beginning of the year.
If the input were a character string of the form yyjjj (rather than numeric) then as.Date(x, format = "%y%j") will do.
Update Have updated to also handle years with one digit as per comments.
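To get the dd-Mon-yyyy form the question asks for, the parsed Date can be passed through format(). This is a small sketch on a hypothetical vector of inputs; note that "%b" gives the abbreviated month name in the current locale, so the exact text may vary between systems.

```r
# Hypothetical character inputs of the form yyjjj.
ordinal <- c("14273", "09001", "07031", "01033")

# Parse two-digit year and day of year together.
d <- as.Date(ordinal, format = "%y%j")
iso <- format(d)
iso
## [1] "2014-09-30" "2009-01-01" "2007-01-31" "2001-02-02"

# Locale-dependent display, e.g. 30-Sep-2014 in an English locale.
pretty <- format(d, "%d-%b-%Y")
```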
Data example
x<-as.character(c("14273", "09001", "07031", "01033"))
Data conversion
x1 <- substr(x, start=1, stop=2)  # two-digit year
x2 <- substr(x, start=3, stop=5)  # day of the year
# parse year and day of year together so leap years are handled correctly
date <- as.Date(paste0(x1, x2), format="%y%j")
You can use lubridate package as follows:
> library(lubridate)
# Create a template date object
> date <- as.POSIXlt("2009-02-10")
# Update the date using update()
> update(date, year=2014, yday=273)
[1] "2014-09-30 JST"