I have a column of dates that were read in as character. I want to produce a Date-class column in my desired format (say, US-style, 08/28/2020).
But all the solutions I have found for changing the format either produce a character class or produce a Date class in the standard format (2020-08-28).
This is a reproducible example:
library(lubridate) # needed for parse_date_time() below
df1 <- data.frame(date=c("08/27/2020", "08/28/2020", "08/29/2020"), cases=c(5,6,7))
class(df1$date)
df1$date1<- format(as.Date(df1$date, format = "%m/%d/%Y"), "%m/%d/%Y")
class(df1$date1)
df1$date2<-as.Date(parse_date_time(df1$date,"%m/%d/%Y"))
class(df1$date2)
df1$date3<- as.Date(df1$date, format = "%m/%d/%Y")
class(df1$date3)
df1
As you can see, date1 has my desired format but is not Date class, while date2 and date3 are Date class but display in the undesired format.
        date cases      date1      date2      date3
1 08/27/2020     5 08/27/2020 2020-08-27 2020-08-27
2 08/28/2020     6 08/28/2020 2020-08-28 2020-08-28
3 08/29/2020     7 08/29/2020 2020-08-29 2020-08-29
Where am I going wrong?
A Date object always prints like "2020-08-27" in R; that is R's standard Date representation. To display it differently you can use strftime, which takes a Date and returns a character vector in your desired format, e.g.
df1$date2
[1] "2020-08-27" "2020-08-28" "2020-08-29"
class(df1$date2)
[1] "Date"
strftime(df1$date2, format="%m/%d/%Y")
[1] "08/27/2020" "08/28/2020" "08/29/2020"
class(strftime(df1$date2, format="%m/%d/%Y"))
[1] "character"
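One way to live with this is to keep the column as Date for computation and apply the format only at display time. A minimal base R sketch of that idea (the variable names d, d_next, days_span and shown are illustrative and not from the question):

```r
# Parse the US-style strings once; the result is a true Date vector.
d <- as.Date(c("08/27/2020", "08/28/2020", "08/29/2020"), format = "%m/%d/%Y")
class(d)        # "Date"

# Arithmetic and comparisons work because d is still a Date.
d_next <- d + 1
days_span <- as.numeric(diff(range(d)))

# Format only for display; this returns character and leaves d untouched.
shown <- format(d, "%m/%d/%Y")
class(shown)    # "character"
```

The formatted strings can be produced again at any time from d, so nothing is lost by storing the standard representation.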
When dealing with dates and times, the lubridate package is really handy: https://lubridate.tidyverse.org/.
In this case we could use the mdy function (month, day, year) on date and date1.
library(lubridate)
library(dplyr)
df1 %>%
mutate(across(c(date, date1), mdy))
date cases date1 date3
<date> <dbl> <date> <date>
1 2020-08-27 5 2020-08-27 2020-08-27
2 2020-08-28 6 2020-08-28 2020-08-28
3 2020-08-29 7 2020-08-29 2020-08-29
I am working with large datasets in which one column is stored as character instead of a date-time type. I am trying to convert it but have not been able to.
Could you please suggest a solution for this problem? It would be very helpful for me.
Thanks in advance.
Code which I am using right now:
c_data$dt_1 <- lubridate::parse_date_time(c_data$started_at,"ymd HMS")
getting output:
2027- 05- 20 20:10:03
but desired output is
2020-05-20 10:03
Here is another way using lubridate. The strings contain hours and minutes but no seconds, so dmy_hm is the matching parser:
library(lubridate)
library(dplyr)
df <- tibble(start_at = c("27/05/2020 10:03", "25/05/2020 10:47"))
df %>%
mutate(start_at = dmy_hm(start_at))
# A tibble: 2 x 1
start_at
<dttm>
1 2020-05-27 10:03:00
2 2020-05-25 10:47:00
In R, dates and times have a single print format. You can change it to your required format, but the result will be of type character.
If you want to keep the data in the format year-month-day hour:minute you can use format as -
format(Sys.time(), '%Y-%m-%d %H:%M')
#[1] "2021-08-27 17:54"
For the entire column you can apply this as -
c_data$dt_2 <- format(c_data$dt_1, '%Y-%m-%d %H:%M')
Read ?strptime for different formatting options.
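The parse-then-format distinction can also be sketched in base R alone, using the two sample strings from the other answers. In strptime-style formats %H is the hour, %M the minute and %S the second. The time zone is fixed to UTC here as an assumption, so that the result does not depend on the local session:

```r
x <- c("27/05/2020 10:03", "25/05/2020 10:47")

# Parse day/month/year hour:minute into a date-time vector.
dt <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")
class(dt)    # "POSIXct" "POSIXt"

# Formatting for display returns character; dt itself stays a date-time.
shown <- format(dt, "%Y-%m-%d %H:%M")
shown        # "2020-05-27 10:03" "2020-05-25 10:47"
```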
Using anytime
library(dplyr)
library(anytime)
addFormats("%d/%m/%Y %H:%M")
df %>%
mutate(start_at = anytime(start_at))
-output
# A tibble: 2 x 1
start_at
<dttm>
1 2020-05-27 10:03:00
2 2020-05-25 10:47:00
I have data with dates in a format that is not directly usable. The data are either annual, quarterly or monthly. Annual dates are stored correctly, quarterly ones are in the form 1Q2010, and monthly ones in the form JAN2010.
So something like
library(tidyverse)
library(data.table)
MWE <- data.table(date=c("JAN2020","FEB2020","1Q2020","2020"),
value=rnorm(4,2,1))
> MWE
date value
1: JAN2020 2.5886057
2: FEB2020 0.5913031
3: 1Q2020 1.6237973
4: 2020 1.4093762
I want to have them in a standard format. I think a decently readable way to do that is to replace the non-standard elements, so that these elements:
Date_Brute <- c("JAN","FEB","MAR","APR","MAY","JUN","JUL","AUG","SEP","OCT","NOV","DEC","1Q","2Q","3Q","4Q")
Replaced by these ones
Date_Standardisee <- c("01-01","01-02","01-03","01-04","01-05","01-06","01-07", "01-08","01-09","01-10","01-11","01-12","01-01","01-04","01-07","01-10")
Now I think gsub does not work with vectors of patterns. I have found this answer that suggests using stringr::str_replace_all, but I have not been able to make it work in a data.table.
I am open to other functions for replacing one vector with another, but would like to avoid, for instance, slicing the data and using specific date-reading functions.
Desired output :
> MWE
date value
1: 01-01-2020 2.5886057
2: 01-02-2020 0.5913031
3: 01-01-2020 1.6237973
4: 2020 1.4093762
You can try lubridate::parse_date_time(), which takes a vector of candidate formats to attempt in the conversion:
library(lubridate)
library(data.table)
MWE[, date := parse_date_time(date, orders = c("bY","qY", "Y"))]
date value
1: 2020-01-01 -0.4948354
2: 2020-02-01 1.0227036
3: 2020-01-01 2.6285688
4: 2020-01-01 1.9158595
We can use grep with as.yearqtr and as.yearmon to convert those 'date' elements into Date class and then format them as specified:
library(zoo)
library(data.table)
MWE[grep('Q', date), date := format(as.Date(as.yearqtr(date,
'%qQ %Y')), '%d-%m-%Y')]
MWE[grep("[A-Z]", date), date := format(as.Date(as.yearmon(date)), '%d-%m-%Y')]
-output
MWE
# date value
#1: 01-01-2020 0.8931051
#2: 01-02-2020 2.9813625
#3: 01-01-2020 1.1918638
#4: 2020 2.8001267
Or another option is data.table's fcoalesce with myd from lubridate
library(lubridate)
MWE[, date := fcoalesce(format(myd(date, truncated = 2), '%d-%m-%Y'), date)]
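For the vector-by-vector replacement the question originally asked about, a base R loop over a named lookup vector is one possibility. The following is only a sketch under the assumption that each code appears as a prefix of the string; the names date_raw, lookup and date_std are illustrative:

```r
date_raw <- c("JAN2020", "FEB2020", "1Q2020", "2020")

# Named lookup: prefix to replace -> "dd-mm-" text that replaces it.
# month.abb holds the English month abbreviations regardless of locale.
lookup <- c(
  setNames(sprintf("01-%02d-", 1:12), toupper(month.abb)),
  setNames(sprintf("01-%02d-", c(1, 4, 7, 10)), paste0(1:4, "Q"))
)

# Apply each replacement in turn; "^" anchors the match at the start,
# so already-converted strings are not touched again.
date_std <- date_raw
for (key in names(lookup)) {
  date_std <- sub(paste0("^", key), lookup[[key]], date_std)
}
date_std
```

The result is still a character vector, and plain years such as "2020" pass through unchanged. Inside a data.table the same loop could be run on the column and assigned back with :=.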
I have a dataframe with a column named perioden. This column contains the date, but it is written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00, etc.
The column is also of type character when I look at the structure. I've tried multiple solutions but am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that only the year is shown?
You can try this approach: use gsub() to remove the undesired text (as suggested by @AllanCameron), then paste0() to add a day and month, and as.Date() to convert to a date:
#Data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
#Remove string
df$Date <- gsub('JJ00','',df$Date)
#Format to date, you will need a day and month
df$Date2 <- as.Date(paste0(df$Date,'-01-01'))
Output:
Date Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
We can use ymd with truncated option
library(lubridate)
library(stringr)
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
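If only the year matters, a full Date may be more than is needed, and storing the year as an integer is one alternative. A small base R sketch, assuming every value is a four-digit year followed by the literal suffix JJ00 (the names perioden, year and year_start are illustrative):

```r
perioden <- c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00")

# Strip the suffix and keep the year as an integer.
year <- as.integer(sub("JJ00$", "", perioden))

# If a Date is required, a day and month must be supplied;
# 1 January is used here as a convention.
year_start <- as.Date(sprintf("%d-01-01", year))

# Showing only the year of a Date gives character output.
format(year_start, "%Y")
```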
I have a tibble containing some date columns formatted as strings:
library(tidyverse)
df<-tibble(dates1 = c("2020-08-03T00:00:00.000Z", "2020-08-03T00:00:00.000Z"),
dates2 = c("2020-08-05T00:00:00.000Z", "2020-08-05T00:00:00.000Z"))
I want to convert the strings from YMD-HMS to DMY-HMS. Can someone explain to me why this doesn't work:
df %>%
mutate_at(vars(starts_with("dates")), as.Date, format="%d/%m/%Y %H:%M:%S")
Whereas this does?
df %>% mutate(dates1 = format(as.Date(dates1), "%d/%m/%Y %H:%M:%S")) %>%
mutate(dates2 = format(as.Date(dates2), "%d/%m/%Y %H:%M:%S"))
Finally, is it possible to assign these columns as 'datetime' columns (e.g. dttm) rather than chr once the date formatting has taken place?
The format argument you are passing goes to as.Date, whereas what you really want is to pass it to the format function. You can use an anonymous function or the formula syntax for that.
library(dplyr)
df %>%
mutate(across(starts_with("dates"), ~format(as.Date(.), "%d/%m/%Y %H:%M:%S")))
# A tibble: 2 x 2
# dates1 dates2
# <chr> <chr>
#1 03/08/2020 00:00:00 05/08/2020 00:00:00
#2 03/08/2020 00:00:00 05/08/2020 00:00:00
R represents dates and date-times in one standard way, Y-M-D H:M:S. You can change the representation using format, but then the output is character, as above. To get dttm columns, parse the strings instead:
df %>%
mutate(across(starts_with("dates"), lubridate::ymd_hms))
# dates1 dates2
# <dttm> <dttm>
#1 2020-08-03 00:00:00 2020-08-05 00:00:00
#2 2020-08-03 00:00:00 2020-08-05 00:00:00
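The difference between the two uses of a format string can be seen on a single value in base R. This is a sketch; the names x, parsed_wrong, parsed and shown are illustrative:

```r
x <- "2020-08-03T00:00:00.000Z"

# format= given to as.Date() describes the INPUT string.
# The string is not in d/m/Y form, so parsing fails and yields NA.
parsed_wrong <- as.Date(x, format = "%d/%m/%Y %H:%M:%S")
is.na(parsed_wrong)   # TRUE

# Without format=, as.Date() tries "%Y-%m-%d" first and ignores
# the trailing time part of the string.
parsed <- as.Date(x)

# format() controls the OUTPUT and returns character.
shown <- format(parsed, "%d/%m/%Y %H:%M:%S")
shown                 # "03/08/2020 00:00:00"
```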
I read in an array from Excel using read_excel, and get two datetime columns, but what I need is two columns of dates
User DOB Answer_dt Question Answer
<chr> <dttm> <dttm> <int> <int>
1 User1 1900-01-01 00:00:00 2017-01-26 00:00:00 1 7
2 User2 1900-01-01 00:00:00 2017-01-26 00:00:00 2 8
I would like the datetime columns to be converted to dates (the times are irrelevant), and have tried using mutate and lubridate in various combinations, but have succeeded only in getting an error message that I don't understand:
> library(lubridate)
> dt <- eML_daily[1, "DOB"]
> dt
# A tibble: 1 x 1
DOB
<dttm>
1 1900-01-01 00:00:00
Warning message:
`...` is not empty.
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
> as_date(dt)
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
> as_date(df[,"DOB"])
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
I don't understand the warning messages, and can't quite see what I am doing wrong. Surely it should be a simple matter to convert from dttm to date and discard the time, which I don't need.
I'd be very appreciative for a pointer.
Sincerely and with many thanks in advance
Thomas Philips
In as_date(dt) you are attempting to convert a whole tibble, rather than a column, to a date. That unsurprisingly fails. In as_date(df[,"DOB"]), I can't say what you are trying to do as you haven't given us df.
Working example:
library(tidyverse)
library(lubridate)
dt <- tibble(x=as_datetime("2017-01-26 00:00:00"))
dt
# A tibble: 1 x 1
x
<dttm>
1 2017-01-26 00:00:00
dt %>% mutate(x=as_date(x))
# A tibble: 1 x 1
x
<date>
1 2017-01-26
You can use as.Date to convert date-time columns to dates.
If you want to change columns 2 and 3 to dates, you can do:
eML_daily[2:3] <- lapply(eML_daily[2:3], as.Date)
Or with dplyr :
library(dplyr)
eML_daily %>% mutate(across(2:3, as.Date))
#For dplyr < 1.0.0
#eML_daily %>% mutate_at(2:3, as.Date)
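The error from the question can be reproduced and avoided in base R as well. The sketch below uses a small stand-in data frame (not the asker's eML_daily) and fixes the time zone to UTC as an assumption, so the calendar date cannot shift during conversion:

```r
df <- data.frame(
  User = c("User1", "User2"),
  Answer_dt = as.POSIXct(c("2017-01-26 00:00:00", "2017-01-26 00:00:00"), tz = "UTC")
)

# Passing a one-column data frame fails: as.Date() has no method for it.
res <- try(as.Date(df["Answer_dt"]), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Passing the column itself (a POSIXct vector) works.
df$Answer_dt <- as.Date(df$Answer_dt, tz = "UTC")
class(df$Answer_dt)          # "Date"
```

The point is the difference between df["Answer_dt"], which is still a data frame, and df$Answer_dt or df[["Answer_dt"]], which is the vector inside it.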
Have you tried to convert it to character first?
Here's a quick sample:
library(dplyr)
library(lubridate)
x <- tibble(dt = c(Sys.time(),Sys.time() - 345767)) %>%
mutate(dt = as_date(as.character(dt)))