R function from string cell YEARMonth as date?

I have a long dataset in which one column holds dates as character strings in the format YYYYMMM, with the month abbreviated: for example 1995MAR, 1995APR and so on. How can I convert that to a Date?
I tried as.Date, which did not work, and lubridate::ymd, which did not work either.

Using parse_date_time from lubridate
date <- "1995MAR"
library(lubridate)
parse_date_time(date, orders = "Yb")
Output:
[1] "1995-03-01 UTC"
Alternatively using zoo
library(zoo)
as.Date(as.yearmon(date, '%Y%b'))
Output:
"1995-03-01"
str(as.Date(as.yearmon(date, '%Y%b')))
Date[1:1], format: "1995-03-01"

In Base R, add a day number to parse:
date <- "1995MAR"
as.Date(paste(date, "01"), format = "%Y%b %d")
#[1] "1995-03-01"
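Because %b matches month abbreviations in the current LC_TIME locale, the %b-based calls above can return NA on a non-English system. A minimal locale-independent sketch, assuming the strings are always four year digits followed by a three-letter English abbreviation (parse_yyyymon is a hypothetical helper name, not part of any package):

```r
# Hypothetical helper: convert "1995MAR"-style strings to Date (first of month)
parse_yyyymon <- function(x) {
  x <- as.character(x)                      # also handles factor columns
  year <- substr(x, 1, 4)
  # month.abb is R's built-in vector of English abbreviations ("Jan", "Feb", ...)
  month <- match(toupper(substr(x, 5, 7)), toupper(month.abb))
  # with an explicit format, unknown abbreviations give NA instead of an error
  as.Date(sprintf("%s-%02d-01", year, month), format = "%Y-%m-%d")
}

parse_yyyymon(c("1995MAR", "1995APR"))
#[1] "1995-03-01" "1995-04-01"
```

On a data frame this would be used as df$date <- parse_yyyymon(df$period), where df and period stand in for your own object and column names.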

Related

lubridate mdy format issue

I have a date column whose date values are something like this:
date = c("1/6/2022", "1/6/2022", "1/19/2022", "1/20/2022")
When I try to convert it to new date column using lubridate::mdy I get:
library(lubridate)
date_new = lubridate::mdy(date)
print(date_new)
[1] "2022-01-06" "2022-01-06" "2022-01-19" "2022-01-20"
# Desired output (mm-dd-yyyy or mm/dd/yyyy):
[1] "01-06-2022" "01-06-2022" "01-19-2022" "01-20-2022"
How can I get the desired output using lubridate?
We can use format
format(lubridate::mdy(date), '%m-%d-%Y')
[1] "01-06-2022" "01-06-2022" "01-19-2022" "01-20-2022"
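Note that format returns character strings, not dates: a Date object always prints as yyyy-mm-dd, so the mm-dd-yyyy form is only available as text. A short base R sketch of the distinction (parsed and shown are illustrative names):

```r
date <- c("1/6/2022", "1/6/2022", "1/19/2022", "1/20/2022")

parsed <- as.Date(date, format = "%m/%d/%Y")  # real Date values, usable for arithmetic
shown  <- format(parsed, "%m-%d-%Y")          # character strings, for display only

class(parsed)
#[1] "Date"
class(shown)
#[1] "character"
```

Keeping the Date column for calculations and formatting only when printing or exporting avoids sorting the strings alphabetically by mistake.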

Getting Wrong date format as result in R

I am trying to separate a date from a column in a database, but the result date format is not proper.
column data = "24-01-2021 19:15"
Code used:
database_1$date <- format(as.Date(database_1$start_time), "%d-%m-%Y")
Result: 20-01-0024
Expected result: 24-01-2021
Update:
To get the expected output as a string, pipe the result into format("%d-%m-%Y") (%>% is the magrittr pipe, also available via dplyr):
as.Date(dmy_hm(column_date)) %>%
  format("%d-%m-%Y")
[1] "24-01-2021"
With the lubridate package you can:
read the character column_date into a date-time with dmy_hm,
then wrap the result in as.Date to drop the time of day.
library(lubridate)
column_date <- "24-01-2021 19:15"
as.Date(dmy_hm(column_date))
[1] "2021-01-24"
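The result in the question arises because as.Date was called without a format, so the string was read in the default year-month-day order: 24 became the year, 01 the month, and the first two digits of 2021 the day. A base R sketch that passes the input layout explicitly (start_time here is a stand-in for the database column):

```r
start_time <- "24-01-2021 19:15"

# Describe the input layout; the trailing time of day is ignored by as.Date
parsed <- as.Date(start_time, format = "%d-%m-%Y")

# Format the Date back into the day-month-year string the question asks for
format(parsed, "%d-%m-%Y")
#[1] "24-01-2021"
```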

How to parse both of these date formats?

Dates formatted as 4/29/2016 are parsed correctly, but dates formatted as 6242016 and 2042016 are not parsed.
Does R think that some of the dates without the slash have day first instead of month?
I've tried including dmy in lubridate and it still doesn't work.
I've tried looking at Sys.getlocale("LC_TIME") and it gives me "English_United States.1252".
demo$date <- as.character(demo$date)
demo <- demo %>%
  mutate(date = parse_date_time(date, "mdy"))
You can wrangle your dates all into the same format using stringr. Then convert to numeric and use lubridate to parse.
library(stringr)
library(lubridate)
dates <- c("6242016", "2042016", "4/29/2016")
dates <- str_remove_all(dates, "/")
dates <- as.numeric(dates)
lubridate::mdy(dates)
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
In base R, strip the slashes, zero-pad to eight digits with sprintf, and then parse:
as.Date(sprintf("%08d",
                as.numeric(gsub("/", "", c("6242016", "2042016", "4/29/2016")))),
        format = "%m%d%Y")
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
This
as.Date("2042016", "%m%d%Y")
returns NA, as opposed to
as.Date("02042016", "%m%d%Y")
This is because %m reads up to two digits, so "20" is taken as the month, which is outside the valid range 01-12.
Try adding a leading zero for months in the range [1,9].
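Stripping the slashes from every value only works while the slash-separated dates happen to have a two-digit day: something like 1/6/2022 would collapse to six digits and be padded incorrectly. A hedged base R sketch that treats the two layouts separately (parse_mixed_mdy is a hypothetical helper; it assumes digit-only values are month-day-year with only the month's leading zero missing, so 1112016 is read as January 11):

```r
# Hypothetical helper: parse a mix of "4/29/2016" and "6242016" style strings
parse_mixed_mdy <- function(x) {
  x <- as.character(x)
  out <- rep(as.Date(NA), length(x))
  has_slash <- grepl("/", x, fixed = TRUE)

  # Slash-separated values: separators make single-digit fields unambiguous
  out[has_slash] <- as.Date(x[has_slash], format = "%m/%d/%Y")

  # Digit-only values: restore the leading zero, then parse as mmddyyyy
  padded <- sprintf("%08d", as.integer(x[!has_slash]))
  out[!has_slash] <- as.Date(padded, format = "%m%d%Y")
  out
}

parse_mixed_mdy(c("6242016", "2042016", "4/29/2016", "1/6/2022"))
#[1] "2016-06-24" "2016-02-04" "2016-04-29" "2022-01-06"
```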

Convert factor ddmmyyyy:ss:mm:hh to ddmmyyyy format in R

I have factor of date format 30APR2019:00:00:00, how can I convert it to date format ddmmyy in R?
If you want the date back you can convert to date-time using as.POSIXct and then return date
as.Date(as.POSIXct("30APR2019:00:00:00", format = "%d%b%Y:%T"))
If you want it in the specific format, we can use format
format(as.POSIXct("30APR2019:00:00:00", format = "%d%b%Y:%T"), "%d%m%Y")
We can use sub to strip off the substring
sub(":.*", "","30APR2019:00:00:00")
#"30APR2019"
Or with substring
substr("30APR2019:00:00:00", 1, 9)
Or using strptime
format( strptime("30APR2019:00:00:00", format = "%d%b%Y:%T"), "%d%b%Y")
#[1] "30Apr2019"
Or another option with anydate from anytime (wrap the result in format if a specific output format is needed)
library(anytime)
anydate("30APR2019:00:00:00")
#[1] "2019-04-30"
Or with lubridate and format
library(lubridate)
format(dmy_hms("30APR2019:00:00:00"), "%d%b%Y")
#[1] "30Apr2019"
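Since the question starts from a factor, it may be safer to convert to character explicitly before parsing, and to fix the time zone so the calendar date does not shift when the time is dropped. A small sketch, assuming an English locale for %b (f and dt are illustrative names, and the second value is made up for the example):

```r
f <- factor(c("30APR2019:00:00:00", "01MAY2019:13:45:10"))

# as.character() recovers the labels; parsing in UTC keeps the calendar date stable
dt <- as.POSIXct(as.character(f), format = "%d%b%Y:%H:%M:%S", tz = "UTC")

as.Date(dt)            # Date class
#[1] "2019-04-30" "2019-05-01"
format(dt, "%d%m%Y")   # character, ddmmyyyy
#[1] "30042019" "01052019"
```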

How to convert a date to YYYYDDD?

I can't figure out how to turn Sys.Date() into a number in the format YYYYDDD, where DDD is the day of the year, i.e. Jan 1 would be 2016001 and Dec 31 would be 2016366 (2016 is a leap year).
Date <- Sys.Date() ## The Variable Date is created as 2016-01-01
SomeFunction(Date) ## Returns 2016001
You can just use the format function as follows:
format(Date, '%Y%j')
which gives:
[1] "2016161" "2016162" "2016163"
If you want to format it in other ways, see ?strptime for all the possible options.
Alternatively, you could use the year and yday functions from the data.table or lubridate packages and combine them with sprintf, zero-padding the day of the year to three digits (a plain paste0 would turn Jan 1 into 20161 instead of 2016001):
library(data.table) # or: library(lubridate)
sprintf("%d%03d", year(Date), yday(Date))
which will give you the same result.
The values that are returned by both options are of class character. Wrap the above solutions in as.numeric() to get real numbers.
Used data:
> Date <- Sys.Date() + 1:3
> Date
[1] "2016-06-09" "2016-06-10" "2016-06-11"
> class(Date)
[1] "Date"
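To check the edge cases from the question, the sketch below uses fixed dates instead of Sys.Date(). It shows that %j zero-pads the day of the year, and how the same string can be built in base R from as.POSIXlt components (where yday counts from 0 and year counts from 1900):

```r
d <- as.Date(c("2016-01-01", "2016-06-09", "2016-12-31"))

# %j is the zero-padded day of the year; 2016 is a leap year, so Dec 31 is day 366
format(d, "%Y%j")
#[1] "2016001" "2016161" "2016366"

# Numeric version, as asked for in the question
as.numeric(format(d, "%Y%j"))

# Same string without format(): pad the day of the year to three digits by hand
lt <- as.POSIXlt(d)
sprintf("%d%03d", lt$year + 1900, lt$yday + 1)
```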
Here's one option with lubridate:
library(lubridate)
x <- Sys.Date()
#[1] "2016-06-08"
sprintf("%d%03d", year(x), yday(x)) # zero-pads the day of the year, so Jan 1 gives 2016001
#[1] "2016160"
This should work for creating a new column with the specified date format (note %j, the day of the year; %d would give the day of the month):
Date <- Sys.Date()
df$Month_Yr <- format(as.Date(df$Date), "%Y%j")
But, especially when working with larger data sets, it can be more efficient to add the column by reference with data.table:
library(data.table)
setDT(df)[, NewDate := format(as.Date(Date), "%Y%j")]
Hope this helps. You may have to tinker if you only want one value and are not working with a data set.
