Date String conversion in R via as.Date() results in NAs [duplicate] - r

This question already has answers here:
Converting year and month ("yyyy-mm" format) to a date?
(9 answers)
Closed 12 months ago.
This question has appeared before, I know, but I haven't been able to derive the correct output after running Sys.setlocale("LC_TIME", "american").
I have a data frame with a column of dates that look something like "October 2020", "June 2021", etc.
I run as.Date(dataframe$month_column, format = "%B %Y"), and I do this after running Sys.setlocale() with the above inputs. But my output continues to be a bunch of NAs.
I suppose one work around I could try would be to split the Month and Year parts by the space into two columns, which might make it easier to coerce into date types, but I'd like to avoid that if possible, since I want to plot quantities by Month Year.
Any insight would be appreciated.

You can convert character strings of months and years with the my() function in the lubridate package. Here my stands for month-year; there are also other functions, such as mdy(), which expects entries in month-day-year order, and dmy(), which expects day-month-year order.
library(lubridate)
dates <- data.frame(
dt = c("October 2020", "June 2021")
)
dates$converted <- my(dates$dt)
dates
Output:
            dt  converted
1 October 2020 2020-10-01
2    June 2021 2021-06-01
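For readers who prefer to stay in base R, the NAs in the question are consistent with what ?strptime documents: when a month is specified, the day of the month has to be specified too. A minimal sketch, assuming an English locale for month names; month_column is a hypothetical stand-in for the question's column:

```r
# Hypothetical input mirroring the strings described in the question
month_column <- c("October 2020", "June 2021")

# No day component in the format, so as.Date() cannot build a complete date
incomplete <- as.Date(month_column, format = "%B %Y")

# Prepend a day (here the 1st) so the format describes a full date
converted <- as.Date(paste("01", month_column), format = "%d %B %Y")

print(incomplete)
print(converted)
```

Pinning every value to the first of the month is an arbitrary choice, but it keeps the values ordered correctly for plotting by month and year.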

Related

How can as.Date() convert fully written dates into ISO 8601? [duplicate]

This question already has an answer here:
Format for ordinal dates (day of month with suffixes -st, -nd, -rd, -th)
(1 answer)
Closed 1 year ago.
I currently have a vector of dates that are in the following format:
a <- c("Wednesday 26th May 2021","Thursday 27th May 2021")
I've tried to get it into ISO 8601 using the following:
as.Date(a, "%I %d%S %F %Y")
But I'm not 100% certain about the syntax of writing dates.
Any thoughts are appreciated!
You can remove the date suffixes and use as.Date -
# Added extra dates whose suffix is not "th".
a <- c("Wednesday 26th May 2021","Thursday 27th May 2021",
'Tuesday 1st June 2021', 'Monday 31st May 2021')
as.Date(sub('(?<=\\d)(th|rd|st|nd)', '', a, perl = TRUE), '%A %d %b %Y')
#[1] "2021-05-26" "2021-05-27" "2021-06-01" "2021-05-31"
See ?strptime for the available format specifications.
If you are open to using packages, lubridate::dmy works directly.
lubridate::dmy(a)
#[1] "2021-05-26" "2021-05-27" "2021-06-01" "2021-05-31"
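To see what the sub() step contributes on its own, here is a small sketch that keeps the intermediate result. The vector a2 and its two August dates are illustrative additions rather than data from the question, and an English locale is assumed for weekday and month names:

```r
# Illustrative input covering all four ordinal suffixes
a2 <- c("Wednesday 26th May 2021", "Tuesday 1st June 2021",
        "Sunday 22nd August 2021", "Monday 23rd August 2021")

# The lookbehind (?<=\\d) only matches a suffix that directly follows a digit,
# so the "nd" in "Monday" and the "st" in "August" are left alone
cleaned <- sub("(?<=\\d)(st|nd|rd|th)", "", a2, perl = TRUE)

# With the suffix gone, the strings fit a plain strptime format
parsed <- as.Date(cleaned, format = "%A %d %B %Y")

print(cleaned)
print(parsed)
```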

Year column to time series [duplicate]

This question already has answers here:
Convert four digit year values to class Date
(5 answers)
Closed 5 years ago.
OK, this should be really simple but I'm not 'getting it.' I have a data frame with a column "Year" that I want to convert to a time series, but the format is tripping me up. How do I convert the "Year" value to a date, with the actual date being the end of each respective year (e.g. 2015 -> December 31st 2015)?
  Year Production
1 1900   38400000
2 1901   43400000
3 1902   49000000
4 1903   44100000
5 1904   49800000
Goal is to get this to a time series data frame. (e.g. xts)
This is not quite the same as the previous question "Convert four digit year values to date type", which converted a vector of years to dates. The goal here is to index the data by date by converting it to an xts or similar object.
Edited:
This was the final solution:
df <- xts(x = df_original, order.by = as.Date(paste0(df_original[,1], "-12-31")))
whereby the "[,1]" indicates the first column of the original data frame.
If you want each full date to be 31 December, you could use paste along with as.Date to cast to a date:
df$date <- as.Date(paste0(df$Year, "-12-31"))
In addition to Tim Biegeleisen's answer, I will just add another way
df$final_date <- as.Date(ISOdate(df$Year, 12, 31))
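As a quick check that the two approaches agree, here is a self-contained sketch that rebuilds the sample data from the question and applies both. Only base R is used; the xts step from the question's final solution is left out so the example runs without extra packages:

```r
# Sample data copied from the question
df <- data.frame(
  Year = 1900:1904,
  Production = c(38400000, 43400000, 49000000, 44100000, 49800000)
)

# Approach 1: build an ISO 8601 string and parse it
df$date <- as.Date(paste0(df$Year, "-12-31"))

# Approach 2: build a date-time with ISOdate() (noon GMT) and drop the time part
df$final_date <- as.Date(ISOdate(df$Year, 12, 31))

print(df)
print(all(df$date == df$final_date))
```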

Need help converting date format [duplicate]

This question already has an answer here:
R formatting a date from a character mmm dd, yyyy to class date [duplicate]
(1 answer)
Closed 5 years ago.
I have a csv file where the dates look like this: Jan 31, 2017
I would like it to be 2017-01-31
I've been searching the site and I found a lot of similar questions but none with my strange date format.
EDIT: Was unclear. I need to change a lot of dates so doing it manually won't work. Thanks for suggestions, reading the help function of lubridate and strptime now.
Use the lubridate library:
library(lubridate)
date <- "Jan 31, 2017"
date2 <- mdy(date)
date2
[1] "2017-01-31"
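The same conversion can also be sketched in base R, assuming an English locale and that every value follows the "Jan 31, 2017" pattern. The vector csv_dates is a hypothetical stand-in for the column read from the CSV file:

```r
# Hypothetical values in the same layout as the question's CSV column
csv_dates <- c("Jan 31, 2017", "Feb 14, 2017", "Dec 25, 2016")

# %b = abbreviated month name, %d = day of month, %Y = four-digit year;
# the comma and spaces in the format must match the input literally
iso_dates <- as.Date(csv_dates, format = "%b %d, %Y")

# Date objects print in ISO 8601 (yyyy-mm-dd) form by default
print(iso_dates)
```

Because as.Date() is vectorised, the whole column is converted in one call, which addresses the concern about changing many dates by hand.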

in R how to convert a date in character format to numeric and then easily calculate the difference between two dates

I want to convert "October 2010" and "November 2010" to a numeric format so that taking the difference of the two gives the result 1.
I tried to use the as.Date() function, but it seems to work only for full dates such as month-day-year.
You can try formatting your raw date strings, and treating each one as being on the first day of that month.
dates <- c("October 2010", "November 2010")
# extract the first three letters for the month, and the last 4 digits for the year
dates.new <- paste0(substr(dates, 1, 3), "-01-", substr(dates, nchar(dates)-3, nchar(dates)))
> dates.new
[1] "Oct-01-2010" "Nov-01-2010"
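# convert to POSIXct; %b matches the abbreviated month name and %Y the four-digit year
# (tz = "UTC" avoids a daylight-saving shift changing the difference)
dates.posix <- as.POSIXct(dates.new, format="%b-%d-%Y", tz="UTC")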
diff <- dates.posix[2] - dates.posix[1]
> diff
Time difference of 31 days
In your question you want to calculate the difference in number of months and not in number of days. You could map your month-year character vector to a numeric number of months, starting at month 1 with the first month in your dataset and ending with month n with the last month in your dataset. Then it would be straightforward to calculate a difference in number of months.
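One way to sketch that mapping in base R is to turn each month into a running month count (year times 12 plus month number), so that consecutive months differ by exactly 1. This assumes an English locale for month names; the name month_index is illustrative:

```r
dates <- c("October 2010", "November 2010")

# Parse as the first day of each month so as.Date() has a complete date
parsed <- as.Date(paste("01", dates), format = "%d %B %Y")

# Running month count: 12 per year plus the month number within the year
month_index <- as.integer(format(parsed, "%Y")) * 12L +
  as.integer(format(parsed, "%m"))

# Difference in whole months between consecutive entries
print(diff(month_index))  # 1
```

This counts from an arbitrary origin rather than from the first month in the dataset, which does not affect the differences.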
Alternatively, to be able to manipulate date-time objects, you will have to create full dates, for example by adding a 01 in front of each date ("01 November 2010"), and then calculate the difference between the dates. This is the main part of the answer below.
Manipulating date-time objects
The lubridate package can calculate the difference between two dates. It deals with non-trivial issues such as February 29th. If it's not installed on your system:
install.packages("lubridate")
Then
library(lubridate)
ymd("20160301")-ymd("20160228")
# Time difference of 2 days
ymd("20150301")-ymd("20150228")
# Time difference of 1 days
To read full month names look at formatting details in help(parse_date_time)
d <- parse_date_time("November 01 2010", "Bdy") - parse_date_time("October 01 2010", "Bdy")
d
# Time difference of 31 days
d is a difftime object. Based on "converting a difftime to integer", you can convert it to a numeric number of days or weeks (but not to a number of months):
class(d)
# [1] "difftime"
as.numeric(d, units="days")
# [1] 31
as.numeric(d, units="weeks")
# [1] 4.428571

Formatting month abbreviations using as.Date [duplicate]

This question already has answers here:
Converting year and month ("yyyy-mm" format) to a date?
(9 answers)
Closed 6 years ago.
I'm working with monthly data and have a character vector of dates, formatted:
Sep/2012
Aug/2012
Jul/2012
and so on, back to 1981. I've tried using
as.Date(dates, "%b/%Y")
where %b represents month abbreviations, but this only returns NAs. What am I doing wrong?
Note: I already found a workaround using gsub() to add "01/" in front of each entry, like so:
01/Sep/2012
01/Aug/2012
01/Jul/2012
Then as.Date() works, but this seems a little inelegant, and isn't strictly accurate anyway.
You are looking for as.yearmon() in the zoo package. Given your dates
dates <- c("Sep/2012","Aug/2012","Jul/2012")
we load the package and convert to the "yearmon" class
require(zoo)
dates1 <- as.yearmon(dates, format = "%b/%Y")
dates1
Which gives
R> dates1
[1] "Sep 2012" "Aug 2012" "Jul 2012"
You can coerce to an object of class "Date" using the as.Date() method
R> as.Date(dates1)
[1] "2012-09-01" "2012-08-01" "2012-07-01"
This is a simpler way of getting what you did via gsub(). There is a frac argument which controls how far through the month the day component should be:
R> as.Date(dates1, frac = 0.5)
[1] "2012-09-15" "2012-08-16" "2012-07-16"
But that may not be sufficient for you.
If you really only want the dates stored as you have them, then they aren't really dates. However, if you are happy to work within the zoo package, the "yearmon" class can be used as the index of a zoo object, which is a time series.
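If the original "Sep/2012" strings should stay as they are and the dates are only needed for ordering, a base R sketch along the following lines may be enough. It assumes an English locale for the month abbreviations and reuses the questioner's idea of adding a day, but only in a temporary helper vector:

```r
dates <- c("Sep/2012", "Aug/2012", "Jul/2012")

# Temporary full dates, used only as a sort key
parsed <- as.Date(paste0("01/", dates), format = "%d/%b/%Y")

# Reorder the original strings chronologically without altering them
sorted <- dates[order(parsed)]

print(sorted)
```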
