I have a variable for the date of medical admission. However, it is not properly formatted. It is a factor and formatted as "DDMMYEAR HRMN", like "01012016 1215", which should mean "01-01-2016 12:15". How can I reformat it and assign weekdays?
You can use lubridate to parse the date, then weekdays from base R to get the day of week as a character.
library(lubridate)
d <- dmy_hm("01012016 1215")
weekdays(d)
Use as.POSIXct/strptime to convert to date time and then use weekdays.
df$date <- as.POSIXct(df$date, format = '%d%m%Y %H%M', tz = 'UTC')
df$weekday <- weekdays(df$date)
For example,
string <- '01012016 1215'
date <- as.POSIXct(string, format = '%d%m%Y %H%M', tz = 'UTC')
date
#[1] "2016-01-01 12:15:00 UTC"
weekdays(date)
#[1] "Friday"
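If the column is a factor, as in the question, a minimal base R sketch could look like the one below. The data frame and the column name `admit` are made up for illustration. Note that `weekdays()` returns names in the current locale, so a numeric weekday via `format(x, "%u")` may be safer for grouping.

```r
# Hypothetical data: `admit` is an assumed column name, stored as a factor
df <- data.frame(admit = factor(c("01012016 1215", "15032016 0830")))

# Convert the factor to character explicitly, then parse as date-time
df$admit_dt <- as.POSIXct(as.character(df$admit),
                          format = "%d%m%Y %H%M", tz = "UTC")

# Weekday name (locale dependent) and ISO weekday number (1 = Monday, 7 = Sunday)
df$weekday     <- weekdays(df$admit_dt)
df$weekday_num <- as.integer(format(df$admit_dt, "%u"))

df
```

The numeric column does not change with the language settings of the machine, which helps when the result is used for filtering or modelling.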
Related
I'm having trouble converting my date variable in my data frame to a new date-time variable. I know parse_date_time(x, orders="ymd HMS") but I don't know what code is needed to say: use this dataframe (workHours) and grab this column (date) now change to a new column named (date_time) and convert to "ymd HMS"
The date column already holds the date and time in the format mm/dd/yyyy hh:mm:ss, and it's a factor (fct).
If you have data like this -
workHours <- data.frame(date = c('3/22/2020 04:51:12', '3/15/2019 10:12:32'))
workHours
#                date
#1 3/22/2020 04:51:12
#2 3/15/2019 10:12:32
You can use as.POSIXct in base R
workHours$date_time <- as.POSIXct(workHours$date, format = '%m/%d/%Y %T', tz = 'UTC')
Or lubridate::mdy_hms
workHours$date_time <- lubridate::mdy_hms(workHours$date)
Both of which would return -
#                date           date_time
#1 3/22/2020 04:51:12 2020-03-22 04:51:12
#2 3/15/2019 10:12:32 2019-03-15 10:12:32
class(workHours$date_time)
#[1] "POSIXct" "POSIXt"
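Values that do not match the format come back as NA without an error, so it may be worth checking for them after the conversion. A small sketch, assuming the same layout as above plus one deliberately bad value added for illustration:

```r
# Example data with one malformed entry (added here only to show the check)
workHours <- data.frame(
  date = factor(c("3/22/2020 04:51:12", "3/15/2019 10:12:32", "not a date"))
)

workHours$date_time <- as.POSIXct(as.character(workHours$date),
                                  format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Rows that had a value but failed to parse
bad <- is.na(workHours$date_time) & !is.na(workHours$date)
workHours[bad, ]
```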
We can use anytime
library(anytime)
workHours <- data.frame(date = c('3/22/2020 04:51:12', '3/15/2019 10:12:32'))
anytime(workHours$date)
#[1] "2020-03-22 04:51:12 EDT" "2019-03-15 10:12:32 EDT"
How does one convert a column from str to dtm? I've tried as.Date and strptime and none of those work. Say I have a table with a column with 3 values (2003/11/04 19:29, 2001/04/02 21:32, 2003/10/28 09:51) in the str format. How would I convert this column so that it is in the dtm format? Thank you in advance!
Check ?strptime for different format arguments. You can do:
x <- c('2003/11/04 19:29', '2001/04/02 21:32', '2003/10/28 09:51')
as.POSIXct(x, format = "%Y/%m/%d %H:%M", tz = "UTC")
#Can also be done with `strptime`
#strptime(x, format = "%Y/%m/%d %H:%M", tz = "UTC")
#[1] "2003-11-04 19:29:00 UTC" "2001-04-02 21:32:00 UTC" "2003-10-28 09:51:00 UTC"
Or with lubridate
lubridate::ymd_hm(x)
Replace x with column name df$column_name.
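Applied to a data frame, that could look roughly like this. The names `df`, `column_name` and `dtm` are placeholders. One design note: `strptime()` returns a POSIXlt object (list based), while `as.POSIXct()` returns a compact numeric-based class that is usually easier to keep as a data frame column.

```r
# Placeholder data frame; column_name holds the strings from the question
df <- data.frame(
  column_name = c("2003/11/04 19:29", "2001/04/02 21:32", "2003/10/28 09:51"),
  stringsAsFactors = FALSE
)

# Parse into a new date-time column
df$dtm <- as.POSIXct(df$column_name, format = "%Y/%m/%d %H:%M", tz = "UTC")

class(df$dtm)                                                  # POSIXct
class(strptime(df$column_name, "%Y/%m/%d %H:%M", tz = "UTC"))  # POSIXlt
```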
In my dataframe I have a date column and I would like to convert it from character to date in the format d/m/y.
The head of my data:
head(df$date)
[1] [17/Jun/2019:08:33:49 [17/Jun/2019:08:38:20 [17/Jun/2019:08:38:24 [17/Jun/2019:09:52:42
[5] [17/Jun/2019:09:52:44 [17/Jun/2019:09:52:45
I used this but it converts every value into NA
df$date = as.Date(df$date, "[%d%b%y")
Try:
df$date <- strptime(df$date, format = "[%d/%b/%Y:%H:%M:%S")
df$date <- as.Date(df$date) # the value is already a date-time here, so no format string is needed
Using a tidyverse approach, looks like the dmy_hms() function accommodates that atypical first colon:
library(dplyr) # provides %>% and mutate()
library(lubridate)
df <- df %>% mutate(date = dmy_hms(date), date = date(date))
Using your first value as an example:
date <- "17/Jun/2019:08:33:49"
date <- dmy_hms(date)
date
#[1] "2019-06-17 08:33:49 UTC"
date <- date(date) #or all in one line, date <- dmy_hms(date) %>% date()
date
#[1] "2019-06-17"
Assuming this is your input
x <- c("[17/Jun/2019:08:33:49", "[17/Jun/2019:08:38:20",
"[17/Jun/2019:08:38:24", "[17/Jun/2019:09:52:42")
First convert it into POSIXct format and then to Date
as.Date(as.POSIXct(x, format = "[%d/%b/%Y:%T"))
#[1] "2019-06-17" "2019-06-17" "2019-06-17" "2019-06-17"
or any other format
format(as.POSIXct(x, format = "[%d/%b/%Y:%T"), "%d/%m/%Y")
#[1] "17/06/2019" "17/06/2019" "17/06/2019" "17/06/2019"
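If some values might lack the leading bracket, one option is to strip it first and parse the rest. This is only a sketch under that assumption. Also keep in mind that `%b` matches month abbreviations in the current locale, so "Jun" may fail to parse on a non-English locale.

```r
x <- c("[17/Jun/2019:08:33:49", "17/Jun/2019:09:52:42")

# Remove an optional leading "[" so both variants share one format
clean <- sub("^\\[", "", x)

# Fixing tz avoids the date shifting when converting to Date later
dt <- as.POSIXct(clean, format = "%d/%b/%Y:%H:%M:%S", tz = "UTC")

as.Date(dt)              # Date class, printed as yyyy-mm-dd
format(dt, "%d/%m/%Y")   # character strings in d/m/y
```

`format()` returns character, not Date, so use it only for display and keep the Date or POSIXct column for calculations.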
If you want to convert into Date object try this.
df$date = as.Date(df$date,format="[%d/%b/%Y:%H:%M:%S")
If you want to retain time as well, then try the following.
df$date = as.POSIXct(df$date,format="[%d/%b/%Y:%H:%M:%S")
Best wishes.
I have a large data frame of over 100k rows. The date column contains dates in multiple formats such as "%m/%d/%Y", "%Y-%m", "%Y", and "%Y-%m-%d". I can convert these all to dates with parse_date_time() from lubridate.
dates <- c("05/10/1983","8/17/2014","1953-12","1975","2001-06-17")
parse_date_time(dates, orders = c("%m/%d/%Y","%Y-%m","%Y","%Y-%m-%d"))
[1] "1983-05-10 UTC" "2014-08-17 UTC" "1953-12-01 UTC" "1975-01-01 UTC" "2001-06-17 UTC"
But as you can see, this sets dates with missing day to the first of the month and dates with missing month and day to the first of the year. How can I set those to the 15th and June 15th, respectively?
Use nchar to check the dates vector and paste what is missing.
library(lubridate)
dates <- c("05/10/1983","8/17/2014","1953-12","1975","2001-06-17")
dates <- ifelse(nchar(dates) == 4, paste(dates, "06-15", sep = "-"),
ifelse(nchar(dates) == 7, paste(dates, 15, sep = "-"), dates))
dates
#[1] "05/10/1983" "8/17/2014" "1953-12-15" "1975-06-15"
#[5] "2001-06-17"
parse_date_time(dates, orders = c("%m/%d/%Y","%Y-%m","%Y","%Y-%m-%d"))
#[1] "1983-05-10 UTC" "2014-08-17 UTC" "1953-12-15 UTC"
#[4] "1975-06-15 UTC" "2001-06-17 UTC"
Another solution would be to use an index vector, also based on nchar.
n <- nchar(dates)
dates[n == 4] <- paste(dates[n == 4], "06-15", sep = "-")
dates[n == 7] <- paste(dates[n == 7], "15", sep = "-")
dates
#[1] "05/10/1983" "8/17/2014" "1953-12-15" "1975-06-15"
#[5] "2001-06-17"
As you can see, the result is the same as with ifelse.
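Relying on `nchar` alone could misfire if another format happens to have 4 or 7 characters. A slightly stricter variant, sketched here in base R and assuming only the four formats from the question occur, matches the patterns explicitly and then parses each group with its own format:

```r
dates <- c("05/10/1983", "8/17/2014", "1953-12", "1975", "2001-06-17")

# Identify incomplete dates by pattern rather than by length
year_only  <- grepl("^[0-9]{4}$", dates)
year_month <- grepl("^[0-9]{4}-[0-9]{2}$", dates)

dates[year_only]  <- paste0(dates[year_only], "-06-15")
dates[year_month] <- paste0(dates[year_month], "-15")

# Parse year-first and month-first strings separately
iso <- grepl("^[0-9]{4}-", dates)
out <- rep(as.Date(NA), length(dates))
out[iso]  <- as.Date(dates[iso],  format = "%Y-%m-%d")
out[!iso] <- as.Date(dates[!iso], format = "%m/%d/%Y")

out
```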
Here's another way of doing that - based on orders:
library(lubridate)
dates <- c("05/10/1983","8/17/2014","1953-12","1975","2001-06-17")
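parseDates <- function(x, orders = c('mdY', 'dmY', 'Ymd', 'Y', 'Ym')){
  # Only the first guessed format is used, so pass one date string at a time
  fmts <- guess_formats(x, orders = orders)
  dte <- parse_date_time(x, orders = fmts[1], tz = 'UTC')
  if(!grepl('m', fmts[1]) ){
    # Year only: move from January 1st to June 15th (days(165) is off by one in leap years)
    dte <- dte + months(5) + days(14)
    return(dte)
  }
  if(!grepl('d', fmts[1]) ){
    # Year and month only: move from the 1st to the 15th
    dte <- dte + days(14)
  }
  return(dte)
}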
output
> parseDates(dates[4])
[1] "1975-06-15 UTC"
> parseDates(dates[3])
[1] "1953-12-15 UTC"
This way for different date formats you only need to change the orders argument while the rest is done using lubridate.
Hope this is helpful!
I have this data frame which gives me Date and Time columns. I am trying to combine these 2 columns but strptime is returning NA. I want to understand why this is happening.
x <- data.frame(date = "1/2/2007", time = "00:00:02")
y <- strptime(paste(x$date,x$time,sep = " "), format = "%b/%d/%y %H:%M:%S")
We need %m and %Y in place of %b and %y (%b - Abbreviated month name in the current locale on this platform. %y - Year without century (00–99)).
strptime(paste(x$date,x$time,sep = " "), "%m/%d/%Y %H:%M:%S")
#[1] "2007-01-02 00:00:02 IST"
For understanding the format, it is better to check ?strptime
Or we can use mdy_hms from lubridate
library(lubridate)
with(x, mdy_hms(paste(date, time)))
#[1] "2007-01-02 00:00:02 UTC"
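To see the effect of each format code in isolation, here is a small base R sketch using the value from the question. Note that `%y` does not always produce NA: on a date-only string it reads just the first two digits of the year, which gives a wrong date without any warning.

```r
s <- "1/2/2007 00:00:02"

# Original format: %b expects a month name such as "Jan", so parsing fails
bad <- strptime(s, format = "%b/%d/%y %H:%M:%S", tz = "UTC")
is.na(bad)

# %y reads only "20" from "2007", so the year is silently misread as 2020
wrong_year <- as.Date("1/2/2007", format = "%m/%d/%y")
wrong_year

# Corrected format: numeric month (%m) and four-digit year (%Y)
good <- as.POSIXct(s, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
good
```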