I have a tibble containing some date columns formatted as strings:
library(tidyverse)
df<-tibble(dates1 = c("2020-08-03T00:00:00.000Z", "2020-08-03T00:00:00.000Z"),
dates2 = c("2020-08-05T00:00:00.000Z", "2020-08-05T00:00:00.000Z"))
I want to convert the strings from YMD-HMS to DMY-HMS. Can someone explain to me why this doesn't work:
df %>%
mutate_at(vars(starts_with("dates")), as.Date, format="%d/%m/%Y %H:%M:%S")
Whereas this does?
df %>% mutate(dates1 = format(as.Date(dates1), "%d/%m/%Y %H:%M:%S")) %>%
mutate(dates2 = format(as.Date(dates2), "%d/%m/%Y %H:%M:%S"))
Finally, is it possible to assign these columns as 'datetime' columns (e.g. dttm) rather than chr once the date formatting has taken place?
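The format argument you are passing goes to as.Date, which uses it to parse the input string. Your strings are not in that format, so parsing fails and you get NA. What you actually want is to pass it to the format function, which controls how an already parsed date is displayed. You can use an anonymous function for that or use the formula syntax.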
library(dplyr)
df %>%
mutate(across(starts_with("dates"), ~format(as.Date(.), "%d/%m/%Y %H:%M:%S")))
# A tibble: 2 x 2
# dates1 dates2
# <chr> <chr>
#1 03/08/2020 00:00:00 05/08/2020 00:00:00
#2 03/08/2020 00:00:00 05/08/2020 00:00:00
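The difference between the two calls can be checked on a single string in base R. This is only a minimal sketch; the variable names are illustrative and not part of the question.

```r
# One of the strings from the question
x <- "2020-08-03T00:00:00.000Z"

# `format` given to as.Date() describes the INPUT string. The string is
# not in day/month/year form, so parsing fails and the result is NA.
parsed_wrong <- as.Date(x, format = "%d/%m/%Y %H:%M:%S")

# Parse with the default ISO format first, then format() the OUTPUT.
parsed_ok <- as.Date(x)
formatted <- format(parsed_ok, "%d/%m/%Y %H:%M:%S")

print(parsed_wrong)
print(formatted)
```

Since as.Date drops the time of day, the time part of the formatted string will always read 00:00:00 with this approach.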
R has one standard way of representing dates and datetimes, which is Y-M-D H:M:S. You can change the representation using format, but then the output is character, as above. To get real datetime (dttm) columns, parse the strings instead:
df %>%
mutate(across(starts_with("dates"), lubridate::ymd_hms))
# dates1 dates2
# <dttm> <dttm>
#1 2020-08-03 00:00:00 2020-08-05 00:00:00
#2 2020-08-03 00:00:00 2020-08-05 00:00:00
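For comparison, here is a rough base R sketch of the same idea, assuming the timestamps are in UTC (the trailing Z suggests so). It shows that the parsed column is a datetime, while anything passed through format is plain text.

```r
x <- c("2020-08-03T00:00:00.000Z", "2020-08-05T00:00:00.000Z")

# Parse as UTC. The trailing ".000Z" is not covered by the format
# string and is ignored by the parser.
dttm <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Changing the display turns the values into character strings.
as_text <- format(dttm, "%d/%m/%Y %H:%M:%S")

print(class(dttm))
print(class(as_text))
```

So a column can be either a datetime or shown as day/month/year text, but not both at once.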
I am working with a large dataset in which one column is stored as character instead of as a datetime. I am trying to convert it, but so far I have been unable to.
Could you please suggest a solution? It would be very helpful.
Thanks in advance.
Code which I am using right now:
c_data$dt_1 <- lubridate::parse_date_time(c_data$started_at,"ymd HMS")
getting output:
2027- 05- 20 20:10:03
but desired output is
2020-05-20 10:03
Here is another way using lubridate. The strings contain hours and minutes but no seconds, so the matching parser is dmy_hm:
library(dplyr)
library(lubridate)
df <- tibble(start_at = c("27/05/2020 10:03", "25/05/2020 10:47"))
df %>%
mutate(start_at = dmy_hm(start_at))
# A tibble: 2 x 1
start_at
<dttm>
1 2020-05-27 10:03:00
2 2020-05-25 10:47:00
In R, dates and times have a single format. You can change its format to your required format, but then it would be of type character.
If you want to keep data in the format year-month-day hour:min you can use format as -
format(Sys.time(), '%Y-%m-%d %H:%M')
#[1] "2021-08-27 17:54"
For the entire column you can apply this as -
c_data$dt_2 <- format(c_data$dt_1, '%Y-%m-%d %H:%M')
Read ?strptime for different formatting options.
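A base R sketch of the whole round trip, assuming the input strings are day/month/year with hours and minutes as in the example data, and treating them as UTC for reproducibility:

```r
started_at <- c("27/05/2020 10:03", "25/05/2020 10:47")

# The format string describes the input: day/month/year hour:minute
dt <- as.POSIXct(started_at, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Display without seconds; the result is character, not a datetime
shown <- format(dt, "%Y-%m-%d %H:%M")

print(dt)
print(shown)
```

Keeping dt as the working column and using the formatted text only for display avoids having to parse the strings again later.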
Using anytime
library(dplyr)
library(anytime)
addFormats("%d/%m/%Y %H:%M")
df %>%
mutate(start_at = anytime(start_at))
-output
# A tibble: 2 x 1
start_at
<dttm>
1 2020-05-27 10:03:00
2 2020-05-25 10:47:00
I have a dataframe with a column named perioden. This column contains dates, but written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00, etc.
According to the structure, this column is of type character. I've tried multiple solutions but am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that only the year is shown?
You can try this approach. Use gsub() to remove the undesired text (as said by @AllanCameron), then use paste0() to add a month and day, and as.Date() to convert the result to a date:
#Data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'),stringsAsFactors = F)
#Remove string
df$Date <- gsub('JJ00','',df$Date)
#Format to date, you will need a day and month
df$Date2 <- as.Date(paste0(df$Date,'-01-01'))
Output:
Date Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
We can use ymd with truncated option
library(lubridate)
library(stringr)
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
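If only the year is needed, a plain integer column may be enough, since a Date always carries a month and a day. A small base R sketch, assuming every value starts with a four-digit year as in the sample data:

```r
perioden <- c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00")

# Keep the first four characters and store them as an integer year
year <- as.integer(substr(perioden, 1, 4))

# If a Date is required later, 1 January of each year can be built from it
year_date <- as.Date(sprintf("%d-01-01", year))

print(year)
print(format(year_date, "%Y"))
```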
I read in an array from Excel using read_excel, and get two datetime columns, but what I need is two columns of dates
User DOB Answer_dt Question Answer
<chr> <dttm> <dttm> <int> <int>
1 User1 1900-01-01 00:00:00 2017-01-26 00:00:00 1 7
2 User2 1900-01-01 00:00:00 2017-01-26 00:00:00 2 8
I would like the datetime columns to be converted to dates (the times are irrelevant), and have tried using mutate and lubridate in various combinations, but have succeeded only in getting an error message that I don't understand:
> library(lubridate)
> dt <- eML_daily[1, "DOB"]
> dt
# A tibble: 1 x 1
DOB
<dttm>
1 1900-01-01 00:00:00
Warning message:
`...` is not empty.
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
> as_date(dt)
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
> as_date(df[,"DOB"])
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
I don't understand the warning messages, and can't quite see what I am doing wrong. Surely it should be a simple matter to convert from dttm to date and discard the time, which I don't need.
I'd be very appreciative for a pointer.
Sincerely and with many thanks in advance
Thomas Philips
In as_date(dt) you are attempting to convert a whole tibble, not a column, to a date. That unsurprisingly fails. In as_date(df[,"DOB"]), I can't say what you are trying to do as you haven't given us df.
Working example:
library(tidyverse)
library(lubridate)
dt <- tibble(x=as_datetime("2017-01-26 00:00:00"))
dt
# A tibble: 1 x 1
x
<dttm>
1 2017-01-26 00:00:00
dt %>% mutate(x=as_date(x))
# A tibble: 1 x 1
x
<date>
1 2017-01-26
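The same distinction can be sketched in base R with an ordinary data frame (the sample data here is made up): single brackets with a column name keep the data frame wrapper, double brackets extract the column itself.

```r
df <- data.frame(DOB = as.POSIXct("1900-01-01 00:00:00", tz = "UTC"))

one_col <- df["DOB"]    # still a data frame with one column
dob_vec <- df[["DOB"]]  # the POSIXct vector inside it

# Converting the data frame fails; capture the message instead of stopping
err <- tryCatch(as.Date(one_col), error = function(e) conditionMessage(e))

# Converting the extracted column works
dob_date <- as.Date(dob_vec)

print(err)
print(dob_date)
```

A tibble behaves like one_col even when indexed as x[1, "DOB"], which is why the calls in the question were given a table rather than a vector.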
You can use as.Date to convert date-time columns to date.
If you want to change columns 2 and 3 to date, you can do.
eML_daily[2:3] <- lapply(eML_daily[2:3], as.Date)
Or with dplyr :
library(dplyr)
eML_daily %>% mutate(across(2:3, as.Date))
#For dplyr < 1.0.0
#eML_daily %>% mutate_at(2:3, as.Date)
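If the column positions are not fixed, one option is to pick the datetime columns by class instead. A sketch in base R with made-up data standing in for eML_daily; the column names follow the question, the values are only illustrative:

```r
eML_daily <- data.frame(
  User = c("User1", "User2"),
  DOB = as.POSIXct(c("1900-01-01", "1900-01-01"), tz = "UTC"),
  Answer_dt = as.POSIXct(c("2017-01-26", "2017-01-26"), tz = "UTC"),
  Question = 1:2
)

# TRUE for every column that is a datetime
is_dttm <- vapply(eML_daily, inherits, logical(1), what = "POSIXct")

# Convert only those columns to Date
eML_daily[is_dttm] <- lapply(eML_daily[is_dttm], as.Date)

str(eML_daily)
```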
Have you tried to convert it to character first?
Here's a quick sample:
x <- tibble(dt = c(Sys.time(),Sys.time() - 345767)) %>%
mutate(dt = as_date(as.character(dt)))
I'm trying to deal with a date and time variable (dttm) in a spark data frame. I'm using sparklyr and dplyr. Here is my issue...
Each row of the column in question is in this format:
2018-06-11 22:06:45
I want to split this date and time column (dttm) into two columns :
the first one with the date : 2018-06-11 (yyyy-mm-dd)
the second one with the time : 22:06:45 (hh:mm:ss)
So in the first place, I used regexp_replace and mutate to create the time column :
spark_df %>% mutate(time = regexp_replace(date_and_time, "^[^_]* ", ""))
Here is what I obtain in my new column "time":
00:06:45
So the code is nearly working; the only issue is that the first two digits are converted to 00.
Maybe this could be a good starting point if it doesn't solve your problem.
dates <- data.frame(date =
c("2018-06-11 22:06:45", "2018-06-11 22:07:45", "2019-06-11 22:06:45"))
tbl <- copy_to(sc, dates)
tbl %>% mutate(new_date = as.POSIXct(date)) %>%
mutate(day = as.Date(new_date),
time = paste0(hour(new_date), ":", minute(new_date), ":",
second(new_date)))
# date new_date day time
# <chr> <dttm> <date> <chr>
# 1 2018-06-11 22:06:45 2018-06-11 12:06:45 2018-06-11 22:6:45
# 2 2018-06-11 22:07:45 2018-06-11 12:07:45 2018-06-11 22:7:45
# 3 2019-06-11 22:06:45 2019-06-11 12:06:45 2019-06-11 22:6:45
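Note that pasting hour, minute and second together drops leading zeros (22:6:45 above). The following is a plain local R sketch of two ways to keep them; it is not Spark code, and whether these calls translate inside a sparklyr pipeline is not something assumed here.

```r
x <- as.POSIXct("2018-06-11 22:06:45", tz = "UTC")

# Date and time parts as text, zero padding handled by format()
day <- format(x, "%Y-%m-%d")
time_fmt <- format(x, "%H:%M:%S")

# Same time string built from the individual fields with sprintf()
lt <- as.POSIXlt(x)
time_padded <- sprintf("%02d:%02d:%02d", lt$hour, lt$min, as.integer(lt$sec))

print(day)
print(time_fmt)
print(time_padded)
```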
I have looked at different options from previous answers, but none has given me the correct output.
I would like to separate timestamp into date and time using R
sorted_transactions_table$TRANSACTION_DATE <- as.Date(sorted_transactions_table$TRANSACTION_TIME)
I have tried this but I get an error:
Error in charToDate(x) : character string is not in a standard
unambiguous format
Timestamp from my dataset is in the format:
01-OCT-18 12.01.23.000000 AM
Convert it into standard datetime format first and then use format
df$TRANSACTION_DATE <- as.POSIXct(df$TRANSACTION_DATE,
format = "%d-%b-%y %H.%M.%OS %p")
transform(df, Date = as.Date(TRANSACTION_DATE),
#Also Date = format(TRANSACTION_DATE, "%Y-%m-%d") would work
time = format(TRANSACTION_DATE, "%T"))
# col1 TRANSACTION_DATE Date time
#1 1 2018-10-01 12:01:23 2018-10-01 12:01:23
#2 2 2018-10-01 12:02:23 2018-10-01 12:02:23
#3 3 2018-10-01 12:03:23 2018-10-01 12:03:23
You could also do this in dplyr chain
library(dplyr)
df %>%
mutate(TRANSACTION_DATE = as.POSIXct(TRANSACTION_DATE,
format = "%d-%b-%y %H.%M.%OS %p"),
Date = as.Date(TRANSACTION_DATE),
time = format(TRANSACTION_DATE, "%T"))
Read ?strptime for all formatting options.
data
Using a reproducible example
df <- data.frame(col1 = 1:3, TRANSACTION_DATE = c("01-OCT-18 12.01.23.000000 AM",
"01-OCT-18 12.02.23.000000 AM", "01-OCT-18 12.03.23.000000 AM"))
df
# col1 TRANSACTION_DATE
#1 1 01-OCT-18 12.01.23.000000 AM
#2 2 01-OCT-18 12.02.23.000000 AM
#3 3 01-OCT-18 12.03.23.000000 AM
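One detail worth checking against your own data: ?strptime documents %p as being used together with %I (12-hour clock) rather than %H. With %I, a time such as 12.01.23 AM is read as one minute past midnight. A minimal sketch, assuming an English or C locale for the month abbreviation and the AM/PM marker, and UTC as the time zone:

```r
ts <- c("01-OCT-18 12.01.23.000000 AM", "01-OCT-18 12.01.23.000000 PM")

# %I = hour on the 12-hour clock, %p = AM/PM marker, %OS = seconds
# including the fractional part
parsed <- as.POSIXct(ts, format = "%d-%b-%y %I.%M.%OS %p", tz = "UTC")

print(format(parsed, "%Y-%m-%d %H:%M:%S"))
```

If the source system really means 12 AM as midday, the %H variant shown above is the one to keep; this depends on how the timestamps were produced.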
I would use the lubridate package:
library(lubridate)
library(dplyr)
df %>%
mutate(TRANSACTION_DATE = dmy_hms(TRANSACTION_DATE),
Date = date(TRANSACTION_DATE),
time = format(TRANSACTION_DATE, "%T"))