as.POSIXct does not recognise date format = "%Y-%W" - r

library(xts)
data <- data.frame(year_week = c("2016-46", "2016-47", "2016-48"),
satisfaction = c(0.25, 0.45, 0.58))
data = xts(data[-1], order.by = as.POSIXct(data$year_week, format = "%Y-%W"))
I want to create an xts object from the data.frame data where the dates keep the format year-week. When I run the code, the dates take the form 2016-12-05, which is incorrect and far from what I am trying to achieve.

This is a variant on a quasi-FAQ on 'why can one not parse year and month as a date': because a date is day and month and year.
Or a year and week and day. Otherwise it is indeterminate:
> as.Date(format(Sys.Date(), "%Y-%W-%d"), "%Y-%W-%d")
[1] "2017-12-04"
>
using
> Sys.Date()
[1] "2017-12-04"
> format(Sys.Date(), "%Y-%W-%d")
[1] "2017-49-04"
>
so %W works on input and output provided you also supply a day.
For input data that does not have a day, you could just add a given weekday, say 1:
> as.Date(paste0(c("2016-46", "2016-47", "2016-48"), "-1"), "%Y-%W-%w")
[1] "2016-11-14" "2016-11-21" "2016-11-28"
>
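As a small base R sketch of the same idea (the names year_week, dates and labels are only illustrative), the weekday can be appended for parsing and dropped again with format() when the year-week label is needed for display. How %W is handled on input can depend on the platform, so treat this as a sketch rather than a guarantee:

```r
# Year-week strings, as in the question
year_week <- c("2016-46", "2016-47", "2016-48")

# Append weekday 1 (Monday) so that each string identifies a single day
dates <- as.Date(paste0(year_week, "-1"), format = "%Y-%W-%w")

# Formatting with %Y-%W drops the day again and gives back the labels
labels <- format(dates, "%Y-%W")
```

The dates vector is what would go into order.by; the year-week text is then only a matter of how the index is formatted for display.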

Related

Converting non-standard date format strings ("April-20") to date objects R

I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to this: Convert non-standard date format to date in R posted some years ago, but unfortunately, it has not answered my question.
I have tried the following as.Date() calls to do this, but it just returns a vector of NA. I.e.
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is in a non-standard date format (anything other than dd-mm-yyyy, or mm-dd-yy if you're in the US)?
For the record, my R locales are all UK and English.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
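To see the difference between the failing call and (3) side by side, here is a short base R sketch. The names no_day, with_day and back are only illustrative, and it assumes an English locale, since %B matches month names in the session's locale:

```r
a <- c("April-21", "March-21", "February-21", "January-21")

# Month and year alone do not identify a day, so every element is NA
no_day <- as.Date(a, format = "%B-%y")

# Prepending a day completes the date
with_day <- as.Date(paste0("01-", a), format = "%d-%B-%y")

# Formatting drops the day again and returns the original strings
back <- format(with_day, "%B-%y")
```

This is also why the format works "in reverse": format() can always discard the day, but parsing cannot invent one.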
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"

Convert Month Year to date r (from excel) [duplicate]


Can anyone help me to change factor to date in r?

I imported a csv file with dates into R. The data frame is named DT, and one of its columns, called date, contains the year and month.
class(DT$date)
[1] "factor"
head(DT$date)
[1] 2013年1月 2013年1月 2013年1月 2013年1月 2013年1月 2013年1月
60 Levels: 2013年10月 2013年11月 2013年12月 2013年1月 ... 2017年9月
And I tried to use as.Date to convert it to date format:
date <- as.Date(DT$date, format = "%Y/%m")
date <- as.Date(as.factor(DT$date), format = "%Y/%m")
date <- as.Date(as.factor(DT$date), format = "%Y/%m/%d")
During this operation I lose all my dates. Then I tried the lubridate and zoo packages:
date <- ymd(DT$date)
date <- as.yearmon( DT$date)
However, I lose all my dates again. Can anyone help me to change this factor to Date in R?
Thanks.
The following seems to work:
DT = data.frame(date = c("2013年1月", "2013年11月", "2017年9月"))
lubridate::parse_date_time(DT$date, orders = "ym")
You should generally start with the parse_date_time function.
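If you would rather stay in base R, one possible sketch is to convert the factor to character first, pull out the digits, and add a day yourself. The helper names x, parts and dates are only for illustration, and the sketch assumes every value has the year followed by the month:

```r
# A factor like the date column in the question
x <- factor(c("2013年1月", "2013年11月", "2017年9月"))

# Convert the factor to character, then extract the runs of digits
chr <- as.character(x)
parts <- regmatches(chr, gregexpr("[0-9]+", chr))

# Build "YYYY-MM-01" strings (first of the month) and parse them
iso <- vapply(
  parts,
  function(p) sprintf("%s-%02d-01", p[1], as.integer(p[2])),
  character(1)
)
dates <- as.Date(iso)
```

The as.character() step matters: passing a factor straight into numeric conversions works on the level codes, not on the text.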

Convert dates from Stata to R

I am having difficulty converting a vector of integers into dates.
I've imported a dataset from Stata using:
> dataire <- read.dta13("~/lcapm_ireland.dta", convert.factors = TRUE,
generate.factors = FALSE, encoding = "UTF-8", fromEncoding = NULL,
convert.underscore = FALSE, missing.type = FALSE, convert.dates = TRUE,
replace.strl = TRUE, add.rownames = FALSE)
My date variable is a monthly time series starting in January 2000 and formatted as "2000-Jan".
Like R, Stata stores dates as integers, but in Stata monthly dates count from January 1960 as month zero. Thus, when importing the dataset into R, I end up with a vector of dates of the form:
> c(478, 479, 480, ...)
In addition, my date variable is:
> class(datem)
[1] "Date"
How can I use as.Date or other functions to transform the time series of integers into a monthly date variable formatted as "%Y-%b"?
The short answer is that you can't get exactly what you want. This is because
in R, dates with numeric form must include a day.
To import a Stata date into R successfully, you can first convert the respective
variable in Stata from a monthly date to a date-time:
clear
set obs 1
generate date = monthly("2000-Jan", "YM")
display %tmCCYY-Mon date
2000-Jan
display date
480
replace date = dofm(date)
display %tdCCYY-Mon date
2000-Jan
display date
14610
replace date = cofd(date) + tc(00:00:35)
display %tc date
01jan2000 00:01:40
display %15.0f date
1262304100352
Then in R you can do the following:
statadatetime <- 1262304100352
rdatetime <- as.POSIXct(statadatetime/1000, origin = "1960-01-01")
rdatetime
[1] "2000-01-01 02:01:40 EET"
typeof(rdatetime)
[1] "double"
rdate <- as.Date(rdatetime)
rdate
[1] "2000-01-01"
typeof(rdate)
[1] "double"
You can get the Year-(abbreviated) Month form you want with the following:
rdate = format(rdate,"%Y-%b")
[1] "2000-Jan"
typeof(rdate)
[1] "character"
However, as you can see, this will change the type of rdate holding
the date.
Trying to change it back you get:
rdate <- as.Date(rdate)
Error in charToDate(x) :
character string is not in a standard unambiguous format
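As a side note, the daily value that dofm() produced above (14610) can also be converted on the R side without going through a date-time. This is only a sketch, assuming the variable arrives in R as a plain number of days since Stata's daily origin of 1960-01-01; stata_days is an illustrative name:

```r
# Daily value that dofm() gave for January 2000 in the Stata session above
stata_days <- 14610

# Stata counts days from 1960-01-01, so use that as the origin
rdate <- as.Date(stata_days, origin = "1960-01-01")
```

Applying format(rdate, "%Y-%b") to the result then gives the year-month text for display, with the same caveat that the formatted value is a character string.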
This is simpler, but you will get a date with a day, e.g. 1990-03-01.
You have a column vector of integers, DATE_IN_MONTHS, that are months since the origin of time in Stata, which is 1960-01-01. In R the origin of time is 1970-01-01.
With package lubridate one simply changes the origin of time and then adds months:
library(dplyr)      # for %>% and mutate()
library(lubridate)  # for ymd() and the period version of months()
db <- haven::read_dta('StataDatabase.dta') %>%
  mutate(DATE_IN_MONTHS = ymd("1960-01-01") + months(DATE_IN_MONTHS))
Now db$DATE_IN_MONTHS contains c(1990-03-01, 1990-04-01, 1990-05-01,...) where each element is a date in R.
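The same month arithmetic can be sketched in base R, without any packages, by splitting the count into whole years and a remaining month. The input values below are the ones from the question; stata_months, yr, mo and dates are illustrative names:

```r
# Monthly Stata dates: months elapsed since January 1960
stata_months <- c(478, 479, 480)

# Whole years since 1960, and the month within that year (1 to 12)
yr <- 1960 + stata_months %/% 12
mo <- stata_months %% 12 + 1

# First day of each month as a Date
dates <- as.Date(sprintf("%d-%02d-01", as.integer(yr), as.integer(mo)))
```

Here 480 maps to 2000-01-01, which agrees with the "2000-Jan" value in the question.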

Is there an inverse of the yday lubridate function?

I have a list of days in the format "day of the year" obtained by applying lubridate::yday() function to a list of dates. For instance, from the following dates (mm-dd-yyyy format):
01-01-2015
01-02-2015
...
by applying yday() you get
1
2
...
Is there a function that can do the reverse given the yday output and the year? I.e., from a yday value and a year, get back to a date in the mm-dd-yyyy format?
To do this with the lubridate package, you use the parse_date_time() function with j as the orders = argument.
Example:
my.yday <- "1"
parse_date_time(x = my.yday, orders = "j")
# [1] "2021-01-01 UTC"
The default appears to be the current year. If you want to specify the year just add it in!
my.yday <- "1"
parse_date_time(x = paste(1995, my.yday), orders = "yj")
# [1] "1995-01-01 UTC"
Note: make sure you supply the julian day as character, not numeric. If numeric it will fail when over 3 digits (weird bug that they don't plan to fix)
Any sequence added to a Date() type creates a new Date() sequence with just that offset.
Witness:
R> as.Date("2016-01-01") + 0:9
[1] "2016-01-01" "2016-01-02" "2016-01-03"
[4] "2016-01-04" "2016-01-05" "2016-01-06"
[7] "2016-01-07" "2016-01-08" "2016-01-09"
[10] "2016-01-10"
R> as.Date("2016-01-01") + 100:109
[1] "2016-04-10" "2016-04-11" "2016-04-12"
[4] "2016-04-13" "2016-04-14" "2016-04-15"
[7] "2016-04-16" "2016-04-17" "2016-04-18"
[10] "2016-04-19"
R>
So once again a so-called lubridate question has nothing to do with that package but simply requires knowing how the base R types work.
> yday ("1990-03-17") - 1 + as.Date ("1990-01-01")
[1] "1990-03-17"
Just to round out the answer by @DirkEddelbuettel, since I was having the same issue and it's been 3 years.
Suppose you want the mm-dd-yyyy date for a dataset where you have only the year and the day of the year, as produced by lubridate::yday().
You just need to take a Date() object for January 1st of the given year and offset it by your yday() output minus one.
Then, if you want the month or the day within that month, you can use lubridate's month() and day() functions to pull those parts.
(I assume you have the year, since leap years change which calendar date a given day of the year falls on. If you do not have it, any year will do, but the month/day may be off by one after February.)
library(dplyr)
# Example dataset with years on/around a leap year
my_df <- data.frame(
  year = c(2010, 2011, 2012, 2013, 2014, 2015),
  my_yday = c(150, 150, 150, 150, 150, 150)
)
# Skip straight back to the yyyy-mm-dd format
my_df %>% mutate(new_date = as.Date(paste0(year, "-01-01")) + (my_yday - 1))
# Get every component
my_df %>%
  mutate(
    new_day = lubridate::day(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    new_month = lubridate::month(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    # The year has four digits, so parse it with %Y rather than %y
    new_date = as.POSIXct(paste(new_month, new_day, year, sep = "/"),
                          format = "%m/%d/%Y"))
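For completeness, a base R sketch that goes straight from year plus day of the year to the mm-dd-yyyy text asked for in the question, using the %j (day of year) conversion. The variable names are illustrative, and two of the years from the example above are used so the leap-year shift is visible:

```r
year <- c(2011, 2012)      # 2012 is a leap year
my_yday <- c(150, 150)

# %Y = four-digit year, %j = day of the year (1 to 366)
dates <- as.Date(paste(year, my_yday), format = "%Y %j")

# Render in the mm-dd-yyyy layout from the question
mdy_text <- format(dates, "%m-%d-%Y")
```

Day 150 falls on May 30 in 2011 but on May 29 in 2012, which is why the year cannot be left out.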
