How to convert character into date format in R

I have a csv file which has date in following format.
8/13/2016
8/13/2016
8/13/2016
2016-08-13T08:26:04Z
2016-08-13T14:30:23Z
8/13/2016
8/13/2016
When I import this into R, the column is read as character. I want to convert it to Date class, but my attempt returns NA for every value:
as.Date(df$create_date,format="%m%d%y")
The date field in the CSV is recorded in more than one format. How can I convert it to Date class in R?

A base R option (assuming there are only two formats in the OP's 'create_date' column) is to build a logical index ('i1') with grepl that flags the elements starting with a four-digit year. Convert the flagged and unflagged elements to Date class separately, each with its own format, and then assign both pieces into a single Date vector with one element per row of the dataset.
i1 <- grepl("^[0-9]{4}", df$create_date)
v1 <- as.Date(df$create_date[i1])
v2 <- as.Date(df$create_date[!i1], "%m/%d/%Y")
v3 <- rep(as.Date(NA), nrow(df))  # empty Date vector, one element per row
v3[i1] <- v1
v3[!i1] <- v2
df$create_date <- v3
Alternatively, as I noted in a comment on the OP's post, parse_date_time from lubridate can be used, since it accepts several candidate formats at once:
library(lubridate)
as.Date(parse_date_time(df$create_date, c('mdy', 'ymd_hms')))
#[1] "2016-08-13" "2016-08-13" "2016-08-13" "2016-08-13"
#[5] "2016-08-13" "2016-08-13" "2016-08-13"
data
df <- structure(list(create_date = c("8/13/2016", "8/13/2016",
"8/13/2016",
"2016-08-13T08:26:04Z", "2016-08-13T14:30:23Z", "8/13/2016",
"8/13/2016")), .Names = "create_date", class = "data.frame",
row.names = c(NA, -7L))
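The OP's original call returned NA because "%m%d%y" has no "/" separators and uses %y (two-digit year) where the data has a four-digit year. The "try each format in turn" idea can also be sketched as a small base R helper. The function name parse_mixed_dates and the default format list are assumptions made for illustration; they are not part of either answer above.

```r
# Hypothetical helper: try each candidate format in order and keep the
# first successful parse for every element.
parse_mixed_dates <- function(x, formats = c("%m/%d/%Y", "%Y-%m-%d")) {
  x <- as.character(x)
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                       # elements not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

create_date <- c("8/13/2016", "2016-08-13T08:26:04Z", "8/13/2016")
res <- parse_mixed_dates(create_date)
res
```

as.Date ignores trailing characters once the format has been matched, so the "T08:26:04Z" part of the ISO timestamps is simply dropped. Elements that match none of the formats stay NA.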

Related

max(DF1$EFFDT) <= DF2$EFFDT [duplicate]

I have two data frames: DF1 contains monthly data snapshots and DF2 contains a particular date. For each date in DF2, I want to retrieve only the row of DF1 whose date is the closest one on or before it (<=).
DF1
Account   Date
A1000001  1-JAN-2021
A1000002  1-FEB-2021
A1000003  1-MAR-2021
A1000004  1-APR-2021
DF2
Date
15-MAR-2021
Output Expected:
Account   Date
A1000003  1-MAR-2021
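Convert the dates to actual Date class; then, using sapply, you can find the closest date in df1 for each date in df2. Note that which.min(abs(...)) picks the nearest date in either direction, which for this data happens to be the one on or before 15-MAR-2021.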
df1$Date <- as.Date(df1$Date, '%d-%b-%Y')
df2$Date <- as.Date(df2$Date, '%d-%b-%Y')
result <- df1[sapply(df2$Date, function(x) which.min(abs(df1$Date - x))), ]
result
# Account Date
#3 A1000003 2021-03-01
data
It is easier to help if you provide data in a reproducible format
df1 <- structure(list(Account = c("A1000001", "A1000002", "A1000003",
"A1000004"), Date = c("1-JAN-2021", "1-FEB-2021", "1-MAR-2021",
"1-APR-2021")), row.names = c(NA, -4L), class = "data.frame")
df2 <- structure(list(Date = "15-MAR-2021"), row.names = c(NA, -1L),
class = "data.frame")
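If the match must be strictly on or before the df2 date, as the question asks, a variant along the following lines may help. This is a minimal sketch: the helper name latest_on_or_before is hypothetical, and the dates are written in ISO form here so that the example does not depend on the locale's month abbreviations.

```r
df1 <- data.frame(
  Account = c("A1000001", "A1000002", "A1000003", "A1000004"),
  Date = as.Date(c("2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01")),
  stringsAsFactors = FALSE
)
df2 <- data.frame(Date = as.Date("2021-03-15"))

# Hypothetical helper: index of the latest candidate date that is <= x,
# or NA if every candidate is later than x.
latest_on_or_before <- function(x, candidates) {
  ok <- which(candidates <= x)
  if (length(ok) == 0L) return(NA_integer_)
  ok[which.max(as.numeric(candidates[ok]))]
}

idx <- vapply(seq_along(df2$Date),
              function(i) latest_on_or_before(df2$Date[i], df1$Date),
              integer(1))
result <- df1[idx[!is.na(idx)], ]
result
```

The difference from the absolute-distance approach shows up for a date such as 2021-03-20: the nearest snapshot in either direction is 2021-04-01, whereas this helper still returns the 2021-03-01 row.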

Convert monthly data from one type to another

I have a column of monthly dates in the following form (month number, underscore, year):
Date
9_2018
1_2013
12_2014
etc.
I want to convert this date format to a date of the following form (year, month):
New_Date
201809
201301
201412
How can I do this?
We can use zoo::as.yearmon to convert the date and then use format to get data in the required format.
format(zoo::as.yearmon(df$Date, "%m_%Y"), "%Y%m")
#[1] "201809" "201301" "201412"
This can also be done in base R by pasting an arbitrary day onto the month-year value we have:
format(as.Date(paste0("1_", df$Date), "%d_%m_%Y"), "%Y%m")
data
df <- structure(list(Date = structure(c(3L, 1L, 2L), .Label = c("1_2013",
"12_2014", "9_2018"), class = "factor")),
class = "data.frame",row.names = c(NA, -3L))
Splitting at the "_" and then padding the month with zeros:
library(stringr)
# as.character() is needed because df$Date is a factor in the sample data
parts <- strsplit(as.character(df$Date), "_")
mon <- str_pad(sapply(parts, "[", 1), width = 2, pad = "0")
year <- sapply(parts, "[", 2)
paste0(year, mon)
#[1] "201809" "201301" "201412"
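The same split-and-pad idea can be sketched without any packages by letting sprintf do the zero padding. The vector name dates below is an assumption standing in for the question's Date column.

```r
dates <- c("9_2018", "1_2013", "12_2014")

# Split each value at the underscore into month and year parts
parts <- strsplit(dates, "_", fixed = TRUE)
mon   <- as.integer(vapply(parts, `[`, character(1), 1))
year  <- vapply(parts, `[`, character(1), 2)

# %02d pads the month to two digits with a leading zero
new_date <- sprintf("%s%02d", year, mon)
new_date
```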

Convert Date formats in base R [duplicate]

Given two dates in a data frame that are in this format:
df <- tibble(date = c('25/05/95', '21/09/18'))
df$date <- as.Date(df$date)
How can I convert the dates into this format - date = c('1995-05-25', '2018-09-21') with the year appearing first and in four digit format, and by only using base R?
Here is my attempt, I successfully reversed the order, but still wasn't able to express the year in 4 digit format:
df <- tibble(date_orig = c('25/05/1995', '21/09/2018'))
df$date <- as.Date(df$date_orig)
year_date <- format(df$date, '%d')
month_date <- format(df$date, '%m')
day_date <- format(df$date, '%y')
df$newdate <- as.Date(paste(paste(year_date, month_date, sep = '-'), day_date, sep = '-'))
df$newdate_final <- as.Date(df$newdate, '%Y-%m-%d')
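You need to know which format your dates follow and look up the matching conversion specifications in ?strptime to convert them to a Date object (%Y is a four-digit year, %y a two-digit one). Since your required output is the standard way dates are represented, you do not need format afterwards. Apply the conversion to the original character column:
as.Date(df$date_orig, "%d/%m/%Y")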
#[1] "1995-05-25" "2018-09-21"
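As a small illustration of the %y versus %Y distinction, the sketch below parses both variants of the question's dates. The vector names are assumptions made for the example.

```r
two_digit  <- c("25/05/95", "21/09/18")      # two-digit years: use %y
four_digit <- c("25/05/1995", "21/09/2018")  # four-digit years: use %Y

d2 <- as.Date(two_digit,  format = "%d/%m/%y")
d4 <- as.Date(four_digit, format = "%d/%m/%Y")

# Date objects already print as yyyy-mm-dd; format() gives the same
# thing as a character vector if that is what is needed.
format(d2, "%Y-%m-%d")
```

With %y, R follows the usual convention of mapping 00 to 68 onto 20xx and 69 to 99 onto 19xx, so "95" becomes 1995 and "18" becomes 2018.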

Changing date format (NA error)

So I have this data file which includes dates and other values. I've imported my data using the following code:
df <- read.csv(file.choose(), header=T, stringsAsFactors=F)
This is so that all the values in the data frame are in character. This makes the next step easier for me.
The data.frame (df) includes:
date x
20020102 1
20020102 2
The date changes every few thousand rows.
I want to change the date format so that it would be yyyy-mm-dd.
I've tried using the code:
df$date <- as.Date(df$date, format="%Y-%m-%d")
and have also used
df$date <- strptime(df$date, format="%Y-%m-%d")
but have always gotten NA values in the date column.
I'm a beginner at R so it would be very helpful if the solution could be simple or can be explained clearly.
Thanks so much!
You can use the correct format
df$date <- as.Date(df$date, format='%Y%m%d')
It is not clear whether your 'date' column is numeric or character. If it is numeric, convert it to character first:
df$date <- as.Date(as.character(df$date), format='%Y%m%d')
strptime, on the other hand, works even if the column is numeric, because it converts its input to character first.
Or using library(lubridate)
library(lubridate)
ymd(df$date)
The problem is that your column "date" is not of class 'Date'; it is a 'numeric' vector, so as.Date returns NAs.
You can check the class of the date column with this command:
class(df$date)
Following the advice from @akrun, you should first convert the date column to a 'character' vector; then you can convert it to the date format you want:
### your data example:
df <- data.frame(date = c(20020102, 20020102),
x = c(1,2))
class(df$date)
#> [1] "numeric"
# convert the date column to character
df$date <- as.character(df$date)
# Then, convert to the desired date format:
df$date <- as.Date(df$date, format='%Y%m%d')
# check the results:
df
#> date x
#> 1 2002-01-02 1
#> 2 2002-01-02 2
class(df$date)
#> [1] "Date"
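To see both effects side by side (the column type after import and the mismatched format string), here is a minimal sketch. It assumes the file looks like the two sample rows in the question and reads them from an inline string instead of file.choose().

```r
# Inline stand-in for the CSV file (assumed layout: date,x)
df <- read.csv(text = "date,x\n20020102,1\n20020102,2",
               stringsAsFactors = FALSE)
class(df$date)   # digits-only values are read as integer, not character

chr <- as.character(df$date)

# "%Y-%m-%d" expects dashes that the data does not contain, so parsing fails
bad  <- as.Date(chr, format = "%Y-%m-%d")

# "%Y%m%d" matches the data as it is actually written
good <- as.Date(chr, format = "%Y%m%d")
good
```

The format argument describes how the input is written, not how the output should look. Once the column is of class Date, it prints as yyyy-mm-dd by default.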

Converting multiple columns in an R dataframe to Date Format

I have a large data file where all the dates have been loaded as characters. I would like to change all the date columns to Date format. Most of the dates have "%y%m%d" format; some have "%Y%m%d" format. There are 25 columns of dates, so changing each one individually is inefficient.
I can do
df$DATE1 <- as.Date(df$DATE1, format ="%y%m%d")
df$DATE2 <- as.Date(df$DATE2, format ="%y%m%d")
etc., but that is very poor coding.
I tried the following code, but it is not working. It assumes all of the dates are in "%y%m%d" format. grep("DATE", names(df)) finds all of the date columns.
df[ , grep("DATE", names(df))] <- as.Date(df[ , grep("DATE", names(df))], "%y%m%d")
Try:
cols <- grep("^DATE", names(df))  # positions of the date columns
df[cols] <- lapply(df[cols], as.Date, format = "%y%m%d")  # df[cols] stays a data frame even for one column
Example:
df <- data.frame(DATE1 = c('910812', '900928'), DATE2 = c('890813', '890910'))
# Apply the above and you get:
# > df
# DATE1 DATE2
# 1 1991-08-12 1989-08-13
# 2 1990-09-28 1989-09-10
# > class(df[, 1])
# [1] "Date"
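The question also mentions that some columns use "%Y%m%d". One way to cover both cases might be to pick the format from the string length. This is only a sketch: the helper name to_date and the sample columns are assumptions, and it presumes every value is either six or eight digits long.

```r
df <- data.frame(DATE1 = c("910812", "900928"),      # yymmdd
                 DATE2 = c("19890813", "19890910"),  # yyyymmdd
                 ID    = 1:2,
                 stringsAsFactors = FALSE)

# Hypothetical helper: parse as %y%m%d first, then re-parse the
# eight-character values as %Y%m%d.
to_date <- function(x) {
  x <- as.character(x)
  out <- as.Date(x, format = "%y%m%d")
  long <- !is.na(x) & nchar(x) == 8
  out[long] <- as.Date(x[long], format = "%Y%m%d")
  out
}

cols <- grep("^DATE", names(df))
df[cols] <- lapply(df[cols], to_date)
str(df)
```

Columns that do not match "^DATE" (ID here) are left untouched.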
