How to find weekday number from datetime format in R

I have a dataset that contains DateTime in (dd-mm-yy hh:mm) format. I want to find the weekday number. I have tried the wday function, but it returns the wrong output.
wday("13-06-21 19:32")
6

The datetime is not of class POSIXct; it is just a character string. We need to convert it first and then apply wday:
library(lubridate)
wday(dmy_hm("13-06-21 19:32"))
[1] 1
According to ?wday, the argument is
x - a POSIXct, POSIXlt, Date, chron, yearmon, yearqtr, zoo, zooreg, timeDate, xts, its, ti, jul, timeSeries, or fts object.
Thus, a character input may return an incorrect value.
If we check the source code:
> methods('wday')
[1] wday.default* wday.numeric*
> getAnywhere(wday.default)
function (x, label = FALSE, abbr = TRUE,
    week_start = getOption("lubridate.week.start", 7),
    locale = Sys.getlocale("LC_TIME"))
{
    wday(as.POSIXlt(x, tz = tz(x))$wday + 1, label, abbr, locale = locale,
        week_start = week_start)
}
It calls as.POSIXlt on the string without a format, so "13-06-21" is read as year-month-day (year 13, June 21) instead of day-month-year:
> as.POSIXlt("13-06-21 19:32", tz = tz("13-06-21 19:32"))
[1] "0013-06-21 19:32:00 UTC"
> as.POSIXlt("13-06-21 19:32", tz = tz("13-06-21 19:32"))$wday
[1] 5
> as.POSIXlt("13-06-21 19:32", tz = tz("13-06-21 19:32"))$wday + 1
[1] 6

You do not need extra packages such as lubridate.
Parse the date, convert to POSIXlt, and extract the weekday (adding 1, because $wday runs from 0 for Sunday to 6 for Saturday):
> D <- as.Date("13-06-21 19:32", "%d-%m-%y")
> D
[1] "2021-06-13"
>
> as.POSIXlt(D)$wday + 1 # see help(DateTimeClasses)
[1] 1
>
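As a small extension of the base R approach, the parsing and the weekday extraction can be wrapped in one vectorized helper. This is only a sketch: the name weekday_number and its monday_first argument are made up for illustration, tz = "UTC" is an arbitrary choice, and it assumes every input follows the dd-mm-yy hh:mm layout from the question.

```r
# Hypothetical helper (not from any package): weekday number for strings
# laid out as "dd-mm-yy hh:mm".
weekday_number <- function(x, monday_first = FALSE) {
  # Parse with an explicit format so "13-06-21" is read as day-month-year
  lt <- as.POSIXlt(x, format = "%d-%m-%y %H:%M", tz = "UTC")
  if (monday_first) {
    # %u gives 1 (Monday) to 7 (Sunday)
    as.integer(format(lt, "%u"))
  } else {
    # $wday is 0 (Sunday) to 6 (Saturday); shift to 1..7
    lt$wday + 1
  }
}

weekday_number("13-06-21 19:32")                       # Sunday, counted from Sunday
weekday_number("13-06-21 19:32", monday_first = TRUE)  # Sunday, counted from Monday
```

Strings that do not match the format come back as NA rather than as a silently wrong weekday, which is the failure seen in the question.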

Related

Adding a new column and calculating ride length from start to finish

I'm trying to add a new column to my data set and calculate the ride_length from start to finish.
Example of what glimpse returns:
$ started_at <chr> "23/01/2021 16:14",
$ ended_at <chr> "23/01/2021 16:24",
My code:
data_trip_cleaned$ride_length <- difftime(data_trip_cleaned$started_at,data_trip_cleaned$ended_at,units = "mins")
Error:
Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format
Your error suggests difftime can't interpret the format of your date/time automatically. From ?difftime:
"Function difftime calculates a difference of two date/time objects
and returns an object of class difftime with an attribute indicating
the units."
Are your started_at and ended_at columns date-time objects? The glimpse output shows <chr>, so they are not; look at ?as.POSIXct. Confirm this works as you expect:
as.POSIXct("23/01/2021 16:24", format = "%d/%m/%Y %H:%M")
# "2021-01-23 16:24:00 EST"
For each column:
data_trip_cleaned$started_at <- as.POSIXct(
data_trip_cleaned$started_at, format = "%d/%m/%Y %H:%M")
data_trip_cleaned$ended_at <- as.POSIXct(
data_trip_cleaned$ended_at, format = "%d/%m/%Y %H:%M")
# or, for many columns at once
datetimes <- c("started_at", "ended_at")
data_trip_cleaned[datetimes] <- lapply(data_trip_cleaned[datetimes], function(x) as.POSIXct(x, format = "%d/%m/%Y %H:%M"))
# Then calculate difference
data_trip_cleaned$diff <- data_trip_cleaned$ended_at - data_trip_cleaned$started_at
# Alternatively
difftime(data_trip_cleaned$ended_at, data_trip_cleaned$started_at, units = "secs")
# See ?difftime for the other values of units
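Putting the steps together, a self-contained sketch might look like the following. The two rows of data are invented to mirror the glimpse output, and fixing tz = "UTC" is an assumption made here to keep daylight-saving changes out of the result.

```r
# Toy data mirroring the glimpse() output in the question (values invented)
data_trip_cleaned <- data.frame(
  started_at = c("23/01/2021 16:14", "24/01/2021 09:05"),
  ended_at   = c("23/01/2021 16:24", "24/01/2021 09:50"),
  stringsAsFactors = FALSE
)

fmt <- "%d/%m/%Y %H:%M"
# tz = "UTC" is an assumption; use the zone the rides were recorded in
started <- as.POSIXct(data_trip_cleaned$started_at, format = fmt, tz = "UTC")
ended   <- as.POSIXct(data_trip_cleaned$ended_at,   format = fmt, tz = "UTC")

# Fail early if any string did not match the format
stopifnot(!anyNA(started), !anyNA(ended))

# End minus start, so ride lengths come out positive
data_trip_cleaned$ride_length <- as.numeric(difftime(ended, started, units = "mins"))
data_trip_cleaned$ride_length
```

Note that the argument order matters: the original call passed started_at first, which would give negative ride lengths once the parsing is fixed.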

Factor variable into weekdays

I have a variable for the date of medical admission. However, it is not properly formatted. It is a factor and formatted as "DDMMYEAR HRMN", like "01012016 1215", which should mean "01-01-2016 12:15". How can I reformat it and assign weekdays?
You can use lubridate to parse the date, then weekdays from base R to get the day of the week as a character string.
library(lubridate)
d <- dmy_hm("01012016 1215")
weekdays(d)
Use as.POSIXct/strptime to convert to a date-time and then use weekdays.
df$date <- as.POSIXct(df$date, format = '%d%m%Y %H%M', tz = 'UTC')
df$weekday <- weekdays(df$date)
For example,
string <- '01012016 1215'
date <- as.POSIXct(string, format = '%d%m%Y %H%M', tz = 'UTC')
date
#[1] "2016-01-01 12:15:00 UTC"
weekdays(date)
#[1] "Friday"
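Since the question starts from a factor and weekdays() returns locale-dependent names, a sketch that makes both points explicit may help. The sample values and the variable names (admit, admit_time, admit_wday, admit_day) are invented for illustration.

```r
# Example factor in "DDMMYYYY HHMM" layout (values invented for illustration)
admit <- factor(c("01012016 1215", "02012016 0830"))

# as.character() makes the factor-to-text step explicit
admit_time <- as.POSIXct(as.character(admit), format = "%d%m%Y %H%M", tz = "UTC")

# %u is 1 (Monday) to 7 (Sunday) and does not depend on the locale,
# unlike the names returned by weekdays()
admit_wday <- as.integer(format(admit_time, "%u"))
admit_wday

# Optional: an ordered factor with English labels chosen here by hand
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
admit_day <- factor(day_names[admit_wday], levels = day_names, ordered = TRUE)
admit_day
```

Keeping the weekday as an ordered factor means tables and plots list the days in calendar order instead of alphabetically.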

How can I test if 2 datetimes represent the same clock time (ignoring the date) in R?

In R, I have 2 POSIXct (i.e. date+time) vectors, and I would like to test if they represent the same clock time, regardless of date. For example:
library(lubridate)
# 1:12:57 PM in NYC
dttm1 <- ymd_hms("1993-10-15 13:12:57", tz = "America/New_York")
# Also 1:12:57 PM in NYC, on a different day
dttm2 <- ymd_hms("2007-2-27 17:12:57", tz = "UTC")
# 1:12:57 PM, but not in NYC
dttm3 <- ymd_hms("2007-2-27 13:12:57", tz = "UTC")
# Not 1:12:57 PM in any time zone
dttm4 <- ymd_hms("1963-1-15 01:12:57", tz = "UTC")
Is there a function that will tell me which of these datetimes represent the same clock time? Such a function should return TRUE for dttm1 and dttm2 and FALSE for any other pair of the above dates. Ideally this function would be vectorized to work element-wise on vectors of datetimes. If such a function only works for datetimes in the same time zone or requires specifying a time zone in which the comparison should be performed, that would be fine. (For bonus points, a 2nd function that returns TRUE for only dttm1 and dttm3 would be nice as well.)
I must be reinventing the wheel here, but you could do:
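utc_time <- function(x, convert = TRUE) {
  # Re-express the instant in UTC unless the caller opts out
  if (convert) x <- as.POSIXct(as.numeric(x), origin = "1970-01-01", tz = "UTC")
  # Extract HH:MM:SS in the time zone carried by x
  format(x, "%H:%M:%S")
}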
So that by default, you get the UTC-converted time as a string
utc_time(dttm1)
#> [1] "17:12:57"
utc_time(dttm2)
#> [1] "17:12:57"
utc_time(dttm1) == utc_time(dttm2)
#> [1] TRUE
utc_time(dttm3)
#> [1] "13:12:57"
utc_time(dttm3) == utc_time(dttm1)
#> [1] FALSE
But if you want the option to not convert date times, then specify convert = FALSE:
utc_time(dttm1, convert = FALSE) == utc_time(dttm3, convert = FALSE)
#> [1] TRUE
You could also compare the number of seconds since midnight UTC:
mytime <- function(x) as.numeric(x) %% 86400
mytime(dttm1) == mytime(c(dttm2, dttm3, dttm4))
#> [1] TRUE FALSE FALSE
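The question also allows specifying the time zone in which to compare. A base R sketch of that variant follows; clock_time and same_clock_time are hypothetical names, and the datetimes are rebuilt with as.POSIXct so the example runs without lubridate.

```r
# Hypothetical helper: clock time of day as "HH:MM:SS" in a chosen zone
clock_time <- function(x, tz = "UTC") format(x, "%H:%M:%S", tz = tz)

# Element-wise test: do a and b show the same clock time in zone tz?
same_clock_time <- function(a, b, tz = "UTC") clock_time(a, tz) == clock_time(b, tz)

dttm1 <- as.POSIXct("1993-10-15 13:12:57", tz = "America/New_York")
dttm2 <- as.POSIXct("2007-02-27 17:12:57", tz = "UTC")
dttm3 <- as.POSIXct("2007-02-27 13:12:57", tz = "UTC")

same_clock_time(dttm1, dttm2)                      # compared in UTC
same_clock_time(dttm1, dttm2, "America/New_York")  # compared in New York time

# Bonus variant: compare the wall-clock reading in each value's own zone
format(dttm1, "%H:%M:%S") == format(dttm3, "%H:%M:%S")
```

The choice of zone matters. In UTC, dttm1 and dttm2 both read 17:12:57, but in New York time they differ by an hour, because October 1993 falls in daylight time (UTC-4) and February 2007 in standard time (UTC-5).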

as.POSIXct does not recognise date format = "%Y-%W"

library(xts)
data <- data.frame(year_week = c("2016-46", "2016-47", "2016-48"),
satisfaction = c(0.25, 0.45, 0.58))
data = xts(data[-1], order.by = as.POSIXct(data$year_week, format = "%Y-%W"))
I want to create an xts object from the data.frame data where the dates keep the year-week format. When I run the code, the dates take the form 2016-12-05, which is incorrect and far from what I am trying to achieve.
This is a variant of the quasi-FAQ 'why can one not parse year and month as a date': because a date is a day and a month and a year.
Or a year and a week and a day. Otherwise it is indeterminate:
> as.Date(format(Sys.Date(), "%Y-%W-%d"), "%Y-%W-%d")
[1] "2017-12-04"
>
using
> Sys.Date()
[1] "2017-12-04"
> format(Sys.Date(), "%Y-%W-%d")
[1] "2017-49-04"
>
so %W works on input and output provided you also supply a day.
For input data that does not have a day, you could just add a given weekday, say 1:
> as.Date(paste0(c("2016-46", "2016-47", "2016-48"), "-1"), "%Y-%W-%w")
[1] "2016-11-14" "2016-11-21" "2016-11-28"
>
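A short sketch of the round trip, using only base R: append a weekday so that each string pins down one day, parse it, then format the result back to recover the year-week label. The variable names year_week and week_start are chosen here for illustration.

```r
# Year-week strings as in the question
year_week <- c("2016-46", "2016-47", "2016-48")

# Append weekday 1 (Monday) so each string identifies a single day
week_start <- as.Date(paste0(year_week, "-1"), format = "%Y-%W-%w")
week_start

# Formatting the dates back with %Y-%W recovers the original labels
format(week_start, "%Y-%W")
```

The parsed dates can serve as the time index, while the formatted label can be kept alongside for display.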

Convert to the day and time of the year in R

I have data for more than 3 years. For each date I want to find the day of the year, counting from January 1 of that year. For example:
> x <- c('5/5/2007','12/31/2007','1/2/2008')
> #Convert to day of year (julian date) –
> strptime(x,"%m/%d/%Y")$yday+1
[1] 125 365 2
I want to do the same thing with the time included, but I still get only the day, not the time. Can anyone suggest a better way to find the julian date from a date and time?
> x1 <- c('5/5/2007 02:00','12/31/2007 05:58','1/2/2008 16:25')
> #Convert to day of year (julian date) –
> strptime(x1,"%m/%d/%Y %H:%M")$yday+1
[1] 125 365 2
Rather than this result, I want the output in decimal days. For example the first example would be 125.0833333 and so on.
Thank you so much.
Are you hoping to get the day + a numerical part of a day as output? If so, something like this will work:
test <- strptime(x1,"%m/%d/%Y %H:%M")
(test$yday+1) + (test$hour/24) + (test$min/(24*60))
#[1] 125.083333 365.248611 2.684028
Although this matches what you ask for, I think removing the +1 might make more sense:
(test$yday) + (test$hour/24) + (test$min/(24*60))
#[1] 124.083333 364.248611 1.684028
Though my spidey senses are tingling that Dirk is going to show up and show me how to do this with a POSIXct date/time representation.
Here is an attempt at such an answer using base functions:
mapply(julian, as.POSIXct(test), paste(format(test,"%Y"),"01","01",sep="-"))
#[1] 124.083333 364.248611 1.684028
You can also use the POSIXct and POSIXlt representations along with the firstof function from xts.
x1 <- c("5/5/2007 02:00", "12/31/2007 05:58", "1/2/2008 16:25")
x1
## [1] "5/5/2007 02:00" "12/31/2007 05:58" "1/2/2008 16:25"
y <- as.POSIXlt(x1, format = "%m/%d/%Y %H:%M")
result <- mapply(julian, x = as.POSIXct(y), origin = firstof(y$year + 1900))
result
## [1] 124.083333 364.248611 1.684028
If you don't want to use xts, then perhaps something like this:
result <- mapply(julian,
                 x = as.POSIXct(x1, format = "%m/%d/%Y %H:%M", tz = "GMT"),
                 origin = as.Date(paste0(gsub(".*([0-9]{4}).*", "\\1", x1),
                                         "-01-01"),
                                  tz = "GMT"))
result
## [1] 124.083333 364.248611 1.684028
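The same numbers can also be obtained without mapply, by subtracting the start of each value's own year with difftime. This is a sketch that assumes the timestamps can be treated as UTC, so that no daylight-saving shift enters the result; the names ct, year_start and decimal_day are chosen here for illustration.

```r
x1 <- c("5/5/2007 02:00", "12/31/2007 05:58", "1/2/2008 16:25")

# Parse in UTC (an assumption) so no daylight-saving shift enters the result
ct <- as.POSIXct(x1, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Midnight on January 1 of each value's own year
year_start <- as.POSIXct(paste0(format(ct, "%Y"), "-01-01"), tz = "UTC")

# Elapsed days since the start of the year (0-based; add 1 for 1-based)
decimal_day <- as.numeric(difftime(ct, year_start, units = "days"))
decimal_day
```

Because every step is vectorized, this form scales to long vectors without a per-element function call.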
If you want to do the other way around (convert day of the year to date and time), you can use this little function:
doy2date = function(mydoy){
  mydate = as.Date(mydoy, origin = "2008-01-01 00:00:00", tz = "GMT")
  dech = (mydoy - as.integer(mydoy)) * 24
  myh = as.integer(dech)
  mym = as.integer( (dech - as.integer(dech)) * 60)
  mys = round(I( (((dech - as.integer(dech)) * 60) - mym) * 60), digits=0 )
  posixdate = as.POSIXct(paste(mydate, " ", myh,":",mym,":",mys, sep=""), tz = "GMT")
  return(posixdate)
}
As an example, if you try:
doy2date(117.6364)
The function will return "2008-04-27 15:16:25 GMT" as a POSIXct.
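The function above fixes the year at 2008. A shorter sketch that takes the year as an argument relies on POSIXct arithmetic being in seconds; the name doy_to_datetime is hypothetical, and the day of year is assumed to be 0-based, as in the julian results earlier.

```r
# Hypothetical helper: 0-based decimal day of year -> POSIXct in UTC.
# The year is an argument instead of being fixed inside the function.
doy_to_datetime <- function(doy, year) {
  year_start <- as.POSIXct(paste0(year, "-01-01"), tz = "UTC")
  # POSIXct arithmetic is in seconds; round to whole seconds
  year_start + round(doy * 86400)
}

format(doy_to_datetime(117.6364, 2008), "%Y-%m-%d %H:%M:%S")
```

Rounding the total number of seconds once also avoids building a time string with a seconds field of 60, which can happen when hours, minutes and seconds are rounded separately.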
