library(xts)
data <- data.frame(year_week = c("2016-46", "2016-47", "2016-48"),
satisfaction = c(0.25, 0.45, 0.58))
data = xts(data[-1], order.by = as.POSIXct(data$year_week, format = "%Y-%W"))
I want to create an xts object from the data.frame data where the dates keep the year-week format. When I run the code, the dates take the form 2016-12-05, which is incorrect and far from what I am trying to achieve.
This is a variant on a quasi-FAQ on 'why can one not parse year and month as a date': because a date is day and month and year.
Or a year and week and day. Otherwise you are indeterminate:
> as.Date(format(Sys.Date(), "%Y-%W-%d"), "%Y-%W-%d")
[1] "2017-12-04"
>
using
> Sys.Date()
[1] "2017-12-04"
> format(Sys.Date(), "%Y-%W-%d")
[1] "2017-49-04"
>
so %W works on input and output provided you also supply a day.
For input data that does not have a day, you could just add a given weekday, say 1:
> as.Date(paste0(c("2016-46", "2016-47", "2016-48"), "-1"), "%Y-%W-%w")
[1] "2016-11-14" "2016-11-21" "2016-11-28"
>
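A Date always carries a day, so the year-week label is probably best treated as a display format rather than as the stored value. The following is a minimal base R sketch of that round trip, assuming the same Monday-based %W weeks as above; the variable names are illustrative only.

```r
# Year-week labels from the question
year_week <- c("2016-46", "2016-47", "2016-48")

# Append weekday 1 (Monday) so each label identifies exactly one day
dates <- as.Date(paste0(year_week, "-1"), format = "%Y-%W-%w")

# Format back to year-week only when a label is needed for display
labels <- format(dates, "%Y-%W")
```

The dates vector can then be passed as order.by to xts(), while labels reproduces the original strings for printing.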
I am given a data frame where one of the columns holds a year and month value (e.g. "2019-05"). I need to display only the rows where that value is later than a given year-month, for instance later than "2018-11".
You can convert them to dates, but in R < and > work on characters too, so you could just do something like this (assuming single-digit months have a leading 0):
examp <- c('2011-01', '2013-08', '2018-04', '2018-12', '2019-05')
examp[examp > '2018-11']
#[1] "2018-12" "2019-05"
If you want to convert to dates, add a day to the end and use as.Date
examp <- as.Date(paste0(examp, '-01'))
examp
# [1] "2011-01-01" "2013-08-01" "2018-04-01" "2018-12-01" "2019-05-01"
examp[examp > as.Date('2018-11-01')]
# [1] "2018-12-01" "2019-05-01"
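Applied to a data frame, the character comparison becomes a row filter. This is a minimal sketch, assuming a hypothetical data frame df with a zero-padded year_month column; it also shows why the leading zero matters.

```r
# Hypothetical data frame with a zero-padded year-month column
df <- data.frame(
  year_month = c("2011-01", "2013-08", "2018-04", "2018-12", "2019-05"),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Keep only rows strictly later than the cutoff
cutoff <- "2018-11"
later <- df[df$year_month > cutoff, ]

# Without zero padding the string comparison misorders months:
# September ("2018-9") wrongly compares as later than November
unpadded_is_later <- "2018-9" > "2018-11"
```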
I am having difficulty converting a vector of integers into dates.
I've imported a dataset from Stata using:
> dataire <- read.dta13("~/lcapm_ireland.dta", convert.factors = TRUE,
generate.factors = FALSE, encoding = "UTF-8", fromEncoding = NULL,
convert.underscore = FALSE, missing.type = FALSE, convert.dates = TRUE,
replace.strl = TRUE, add.rownames = FALSE)
My date variable is a monthly time series starting on January 2000 and formatted as "2000-Jan".
Similarly to R, Stata handles dates as integers but in the latter January 1960 is origin zero for monthly dates. Thus, when importing the dataset into R, I end up with a vector of dates of the form:
> c(478, 479, 480, ...)
In addition, my date variable is:
> class(datem)
[1] "Date"
How can I use as.Date or other functions to transform the time-series of integers in a monthly date variable formatted as "%Y-%b"?
The short answer is that you can't get exactly what you want. This is because
in R, dates with numeric form must include a day.
To import a Stata date into R successfully, you can first convert the respective
variable in Stata from a monthly date to a date-time one:
clear
set obs 1
generate date = monthly("2000-Jan", "YM")
display %tmCCYY-Mon date
2000-Jan
display date
480
replace date = dofm(date)
display %tdCCYY-Mon date
2000-Jan
display date
14610
replace date = cofd(date) + tc(00:00:35)
display %tc date
01jan2000 00:01:40
display %15.0f date
1262304100352
Then in R you can do the following:
statadatetime <- 1262304100352
rdatetime <- as.POSIXct(statadatetime/1000, origin = "1960-01-01")
rdatetime
[1] "2000-01-01 02:01:40 EET"
typeof(rdatetime)
[1] "double"
rdate <- as.Date(rdatetime)
rdate
[1] "2000-01-01"
typeof(rdate)
[1] "double"
You can get the Year-(abbreviated) Month form you want with the following:
rdate = format(rdate,"%Y-%b")
[1] "2000-Jan"
typeof(rdate)
[1] "character"
However, as you can see, this will change the type of rdate holding
the date.
Trying to change it back you get:
rdate <- as.Date(rdate)
Error in charToDate(x) :
character string is not in a standard unambiguous format
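One way around this error is to keep the Date as the stored value and rebuild it from the year-month string by appending a day before parsing. A minimal sketch, assuming a numeric month (%m) instead of %b so that the result does not depend on the locale's month names:

```r
rdate <- as.Date("2000-01-01")

# Year-month label for display; this is a character vector, not a Date
label <- format(rdate, "%Y-%m")

# Append a day so the string is a complete date again
restored <- as.Date(paste0(label, "-01"), format = "%Y-%m-%d")
```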
This is simpler but you will get a date with day, 1990-03-01.
You have a column vector of integers, DATE_IN_MONTHS, that are months since the origin of time in Stata, which is 1960-01-01. In R the origin of time is 1970-01-01.
With package lubridate one simply changes the origin of time and then adds months:
library(dplyr)      # provides %>% and mutate()
library(lubridate)  # provides ymd() and months() for numeric input
db <- haven::read_dta('StataDatabase.dta') %>%
  dplyr::mutate(DATE_IN_MONTHS = ymd("1960-01-01") + months(DATE_IN_MONTHS))
Now db$DATE_IN_MONTHS contains c(1990-03-01, 1990-04-01, 1990-05-01,...) where each element is a date in R.
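The same shift can be sketched in base R with integer arithmetic on the month count, assuming the values are Stata monthly dates (months since January 1960). The function name stata_month_to_date is hypothetical.

```r
# Convert Stata monthly dates (months since 1960-01) to R Dates
# on the first day of each month
stata_month_to_date <- function(m) {
  year  <- 1960 + m %/% 12   # whole years elapsed since 1960
  month <- m %% 12 + 1       # remaining months, shifted to 1..12
  as.Date(sprintf("%d-%02d-01", as.integer(year), as.integer(month)))
}

dates <- stata_month_to_date(c(478, 479, 480))
```

Calling format(dates, "%Y-%b") afterwards gives the year-month label as a character vector, while dates itself stays a Date.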
I have a list of days in the format "day of the year" obtained by applying lubridate::yday() function to a list of dates. For instance, from the following dates (mm-dd-yyyy format):
01-01-2015
01-02-2015
...
by applying yday() you get
1
2
...
Is there a function that can do the reverse given the yday output and the year? I.e., from a yday value and a year, get back to a date in the mm-dd-yyyy format?
To do this with the lubridate package, you use the parse_date_time() function with j as the orders = argument.
Example:
my.yday <- "1"
parse_date_time(x = my.yday, orders = "j")
# [1] "2021-01-01 UTC"
The default appears to be the current year. If you want to specify the year just add it in!
my.yday <- "1"
parse_date_time(x = paste(1995, my.yday), orders = "yj")
# [1] "1995-01-01 UTC"
Note: make sure you supply the Julian day as a character, not a numeric. If numeric, it will fail when over 3 digits (a weird bug that they don't plan to fix).
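For comparison, base R can parse a day of the year with the %j conversion as well. A minimal sketch, assuming the year is known and zero-padding the day to three digits; the variable names are illustrative.

```r
year <- 2015
day_of_year <- 150

# Build a "YYYY-DDD" string and parse it with %j (day of year)
d <- as.Date(sprintf("%d-%03d", year, day_of_year), format = "%Y-%j")
```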
Any sequence added to a Date() type creates a new Date() sequence with just that offset.
Witness:
R> as.Date("2016-01-01") + 0:9
[1] "2016-01-01" "2016-01-02" "2016-01-03"
[4] "2016-01-04" "2016-01-05" "2016-01-06"
[7] "2016-01-07" "2016-01-08" "2016-01-09"
[10] "2016-01-10"
R> as.Date("2016-01-01") + 100:109
[1] "2016-04-10" "2016-04-11" "2016-04-12"
[4] "2016-04-13" "2016-04-14" "2016-04-15"
[7] "2016-04-16" "2016-04-17" "2016-04-18"
[10] "2016-04-19"
R>
So once again a so-called lubridate question has nothing to do with that package but simply requires knowing how the base R types function.
> yday ("1990-03-17") - 1 + as.Date ("1990-01-01")
[1] "1990-03-17"
Just to round out the answer by @DirkEddelbuettel, since I was having the same issue and it's been 3 years.
To get the desired mm-dd-yyyy date for a dataset where you have only the year and the day of the year from lubridate::yday(),
you just need to offset a Date object for January 1st of that year by your yday() output minus one.
Then, if you want the month or the day within that month, you can use lubridate's month() and day() functions to pull those parts.
(I assume you have the year, since leap years affect the calendar date and would otherwise throw off the month/day assignment. If not, any year will do.)
library(dplyr)
library(magrittr)
# Example dataset with years on/around a leap year
my_df <- data.frame(
  year = c(2010, 2011, 2012, 2013, 2014, 2015),
  my_yday = c(150, 150, 150, 150, 150, 150)
)
# Skip straight back to the yyyy-mm-dd format
my_df %>% mutate(new_date = as.Date(paste0(year, "-01-01")) + (my_yday - 1))
# Get every component
my_df %>%
  mutate(
    new_day = lubridate::day(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    new_month = lubridate::month(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    # %Y (four-digit year) is needed here; %y would read only two digits
    new_date = as.POSIXct(paste(new_month, new_day, year, sep = "/"),
                          format = "%m/%d/%Y"))
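If only the mm-dd-yyyy string from the original question is needed, the offset from January 1st can be wrapped in a small base R function. A minimal sketch; the name yday_to_mdy is hypothetical and the result is a character vector, not a Date.

```r
# Hypothetical helper: rebuild a date string from a year and a day of year
yday_to_mdy <- function(year, yday) {
  d <- as.Date(paste0(year, "-01-01")) + (yday - 1)  # offset from January 1st
  format(d, "%m-%d-%Y")                               # character, mm-dd-yyyy
}

res <- yday_to_mdy(c(2011, 2012), c(150, 150))

# Round trip: the day of year recovered from the result
back <- as.integer(format(as.Date(res, format = "%m-%d-%Y"), "%j"))
```

Day 150 falls one calendar day earlier in 2012 than in 2011 because 2012 is a leap year, which is why the year has to be supplied.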
Hey, I have some data aggregated at the quarter level, and there is a column that contains data like this:
> unique(data$fiscalyearquarter)
[1] "2012Q3" "2010Q3" "2012Q1" "2011Q4" "2012Q4" "2008Q1" "2008Q2" "2010Q4" "2010Q1"
[10] "2009Q2" "2012Q2" "2011Q3" "2013Q2" "2013Q1" "2011Q2" "2013Q4" "2009Q4" "2009Q3"
[19] "2011Q1" "2010Q2" "2013Q3" "2008Q4" "2009Q1" "2014Q1" "2008Q3" "2014Q2"
I am thinking about writing a function that turns a string into a timestamp.
Something like this: split the string into year and quarter, and then convert the quarter to a month (the middle of the quarter).
convert <- function(myinput = "2008Q2"){
  year <- substr(myinput, 1, 4)
  quarter <- substr(myinput, 6, 6)
  month <- 3 * as.numeric(quarter) - 1
  date <- as.Date(paste0(year, sprintf("%02d", month), '01'), '%Y%m%d')
  return(date)
}
I have to convert those strings to dates and then analyze the data from there.
> convert("2010Q3")
[1] "2010-08-01"
Is there any way, beyond my hard-coded solution, to analyze a time series problem at the quarterly level?
If x is your vector and you're okay having the date be the first day of the quarter:
library(zoo)
as.Date(as.yearqtr(x))
If you want the date to be the first day of the second month of the quarter like your example, you could hack together something like this:
as.Date(format(as.Date(as.yearqtr(x))+40, "%Y-%m-01"))
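For cases where adding the zoo dependency is not wanted, the same conversion can be sketched in base R. This assumes strings of the exact form "2008Q2" (four-digit year, then Q, then one digit); the function name quarter_to_date and its month_in_quarter argument are hypothetical.

```r
# Hypothetical helper: "2008Q2" -> Date in that quarter
# month_in_quarter = 1 gives the first month, 2 the middle month
quarter_to_date <- function(x, month_in_quarter = 1) {
  year    <- as.integer(substr(x, 1, 4))
  quarter <- as.integer(substr(x, 6, 6))
  month   <- 3 * (quarter - 1) + month_in_quarter
  as.Date(sprintf("%d-%02d-01", year, month))
}

first_day <- quarter_to_date(c("2010Q3", "2008Q2"))
mid_month <- quarter_to_date("2010Q3", month_in_quarter = 2)
```

Unlike the convert() function in the question, this version is vectorized, so it can be applied to the whole fiscalyearquarter column at once.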