I'm trying to insert the previous date for every date in a vector in R.
This is my current vector:
[1] "1990-02-08" "1990-03-28" "1990-05-16" "1990-07-05" "1990-07-13" "1990-08-22" "1990-10-03"
[8] "1990-10-29" "1990-11-14" "1990-12-07" "1990-12-18" "1991-01-08" "1991-02-01" "1991-02-07"
I'm trying to get the following:
[1] "1990-02-07" "1990-02-08" "1990-03-27" "1990-03-28" "1990-05-15" "1990-05-16" "1990-07-04"
etc.
I tried the following:
dates_lagged = as.Date(dates)-1
dates_combined = c(dates, dates_lagged)
However, with this method, some dates are not getting lagged.
Is there a better way to do this?
Edit: to answer the comment, this is my code (replaced CSV with its starting values):
FOMC <- read_csv(file = c("x", "1990-02-08", "1990-03-28", "1990-05-16", "1990-07-05", "1990-07-13", "1990-08-22", "1990-10-03",
"1990-10-29", "1990-11-14", "1990-12-07"))
FOMC$x <- as.Date(FOMC$x, format = "%Y-%m-%d")
colnames(FOMC) <- "Date"
dates_vector <- FOMC[["Date"]]
FOMC = as.vector(as.Date(dates_vector))
dates_lagged = as.Date(FOMC)-1
dates_combined = c(FOMC, dates_lagged)
as.Date(dates_combined)
For some reason, there is no "1990-10-28" before "1990-10-29" for example, and I can't figure out why.
You could try:
as.Date(c(rbind(dates - 1, dates)), origin = "1970-01-01")
#> [1] "1990-02-07" "1990-02-08" "1990-03-27" "1990-03-28" "1990-05-15"
#> [6] "1990-05-16" "1990-07-04" "1990-07-05" "1990-07-12" "1990-07-13"
#> [11] "1990-08-21" "1990-08-22" "1990-10-02" "1990-10-03" "1990-10-28"
#> [16] "1990-10-29" "1990-11-13" "1990-11-14" "1990-12-06" "1990-12-07"
#> [21] "1990-12-17" "1990-12-18" "1991-01-07" "1991-01-08" "1991-01-31"
#> [26] "1991-02-01" "1991-02-06" "1991-02-07"
Data
dates <- c("1990-02-08", "1990-03-28", "1990-05-16", "1990-07-05", "1990-07-13",
"1990-08-22", "1990-10-03", "1990-10-29", "1990-11-14", "1990-12-07",
"1990-12-18", "1991-01-08", "1991-02-01", "1991-02-07")
dates <- as.Date(dates)
Created on 2021-11-04 by the reprex package (v2.0.0)
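As a rough sketch of why the rbind approach interleaves correctly (the three-date vector and the variable names here are illustrative, not taken from the question): rbind stacks the lagged and the original dates into a two-row matrix of day counts, and c() reads a matrix column by column, so each previous day lands directly before its date. By contrast, c(FOMC, dates_lagged) in the question appends all lagged dates after the originals. Assuming the input is already in ascending order, sorting the concatenation should give the same result.

```r
# Illustrative subset of the dates from the question
dates <- as.Date(c("1990-02-08", "1990-03-28", "1990-10-29"))

# Row 1 holds the previous days, row 2 the original days (as day counts)
m <- rbind(as.numeric(dates - 1), as.numeric(dates))

# c() flattens column by column: previous day, day, previous day, day, ...
interleaved <- as.Date(c(m), origin = "1970-01-01")

# Alternative, assuming the input dates are sorted in ascending order
sorted <- sort(c(dates - 1, dates))

print(interleaved)
```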
Hi guys,
I want to build a date in R of the form:
02-06-year
for 15 years.
Here the code:
library(timeDate)
listHolidays
seq=0:5000
data.iniziale <- as.Date("2015-01-01")
calendario = data.iniziale + seq
l = length(calendario)
for (i in 1:l){
x[i]=as.Date(year(calendario[i]),06,02)
}
It does not work as is.
How can I build these dates?
Similar to Albin's solution, but I understood the question slightly differently:
format(seq(as.Date("2015-01-01"), as.Date("2030-01-01"), "year"), "%Y-06-02")
Output:
[1] "2015-06-02" "2016-06-02" "2017-06-02" "2018-06-02" "2019-06-02" "2020-06-02" "2021-06-02"
[8] "2022-06-02" "2023-06-02" "2024-06-02" "2025-06-02" "2026-06-02" "2027-06-02" "2028-06-02"
[15] "2029-06-02" "2030-06-02"
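Note that format returns character strings, not dates. If actual Date objects are needed, one possible sketch is to build the strings with sprintf and convert them with as.Date. The names years and june_second, and the choice of 2015 to 2029 as the 15 years, are assumptions for illustration.

```r
# Assumed range: 15 years starting in 2015
years <- 2015:2029

# Build "YYYY-06-02" strings and convert them to Date objects
june_second <- as.Date(sprintf("%d-06-02", years))

# The same dates printed day-month-year, as in the question
print(format(june_second, "%d-%m-%Y"))
```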
I suggest using some of R's existing functions to make this task easier.
With the seq function you can simply generate a sequence of dates, and format then prints them in the layout you want, as shown below:
format(seq(as.Date("2015-01-01"), as.Date("2030-01-01"), "days"), "%m-%d-%Y")
Output (partly):
[1] "01-01-2015" "01-02-2015" "01-03-2015" "01-04-2015" "01-05-2015" "01-06-2015" "01-07-2015" "01-08-2015"
[9] "01-09-2015" "01-10-2015" "01-11-2015" "01-12-2015" "01-13-2015" "01-14-2015" "01-15-2015" "01-16-2015"
Another possible solution, using lubridate:
library(tidyverse)
library(lubridate)
str_c("2-6-", 2000:2014) %>% dmy
#> [1] "2000-06-02" "2001-06-02" "2002-06-02" "2003-06-02" "2004-06-02"
#> [6] "2005-06-02" "2006-06-02" "2007-06-02" "2008-06-02" "2009-06-02"
#> [11] "2010-06-02" "2011-06-02" "2012-06-02" "2013-06-02" "2014-06-02"
I have two vectors, one that contains a list of variables, and one that contains dates, such as
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
I want to merge them into a vector with each variable indexed by each date; that is, my desired output is
> Colonnes_Pays_Principaux
[1] "PIB_2000" "PIB_2001" "ConsommationPrivee_2000"
[4] "ConsommationPrivee_2001" "ConsommationPubliques_2000" "ConsommationPubliques_2001"
[7] "FBCF_2000" "FBCF_2001" "ProductionIndustrielle_2000"
[10] "ProductionIndustrielle_2001" "Inflation_2000" "Inflation_2001"
[13] "InflationSousJacente_2000" "InflationSousJacente_2001" "PrixProductionIndustrielle_2000"
[16] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
Is there a simpler, more readable way than the double for loop below, which works?
Colonnes_Pays_Principaux <- vector()
for (Variable in (1:length(Variables_Pays))){
  for (Annee in (1:length(Annee_Pays))){
    Colonnes_Pays_Principaux=
      append(Colonnes_Pays_Principaux,
             paste(Variables_Pays[Variable],Annee_Pays[Annee],sep="_")
      )
  }
}
expand.grid will create a data frame with all combinations of the two vectors.
with(
expand.grid(Variables_Pays, Annee_Pays),
paste0(Var1, "_", Var2)
)
#> [1] "PIB_2000" "ConsommationPrivee_2000"
#> [3] "ConsommationPubliques_2000" "FBCF_2000"
#> [5] "ProductionIndustrielle_2000" "Inflation_2000"
#> [7] "InflationSousJacente_2000" "PrixProductionIndustrielle_2000"
#> [9] "CoutHoraireTravail_2000" "PIB_2001"
#> [11] "ConsommationPrivee_2001" "ConsommationPubliques_2001"
#> [13] "FBCF_2001" "ProductionIndustrielle_2001"
#> [15] "Inflation_2001" "InflationSousJacente_2001"
#> [17] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2001"
We can use outer:
c(t(outer(Variables_Pays, Annee_Pays, paste, sep = '_')))
# [1] "PIB_2000" "PIB_2001"
# [3] "ConsommationPrivee_2000" "ConsommationPrivee_2001"
# [5] "ConsommationPubliques_2000" "ConsommationPubliques_2001"
# [7] "FBCF_2000" "FBCF_2001"
# [9] "ProductionIndustrielle_2000" "ProductionIndustrielle_2001"
#[11] "Inflation_2000" "Inflation_2001"
#[13] "InflationSousJacente_2000" "InflationSousJacente_2001"
#[15] "PrixProductionIndustrielle_2000" "PrixProductionIndustrielle_2001"
#[17] "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
No real need to go beyond the basics here! Use paste for pasting the strings and rep to repeat either Annee_Pays or Variables_Pays to get all combinations:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
# To get this in the same order as in your example:
paste(rep(Variables_Pays, rep(2, length(Variables_Pays))), Annee_Pays, sep = "_")
# Alternative order:
paste(Variables_Pays, rep(Annee_Pays, rep(length(Variables_Pays), 2)), sep = "_")
# Or, if order doesn't matter too much:
paste(Variables_Pays, rep(Annee_Pays, length(Variables_Pays)), sep = "_")
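The same idea can also be written with the each argument of rep, which may read a little more clearly than a vector of repeat counts. This is only a sketch on a shortened, illustrative Variables_Pays; the names by_variable and by_year are made up here.

```r
# Shortened illustrative inputs
Variables_Pays <- c("PIB", "FBCF", "Inflation")
Annee_Pays <- c("2000", "2001")

# Each variable repeated once per year; the years are recycled alongside
by_variable <- paste(rep(Variables_Pays, each = length(Annee_Pays)),
                     Annee_Pays, sep = "_")

# Each year repeated once per variable; the variables are recycled alongside
by_year <- paste(Variables_Pays,
                 rep(Annee_Pays, each = length(Variables_Pays)), sep = "_")

print(by_variable)
print(by_year)
```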
In base R:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
cbind(paste(Variables_Pays, Annee_Pays,sep="_"),paste(Variables_Pays, rev(Annee_Pays),sep="_"))
I have a bunch of character vectors which I use to download some files (one for each month of the year), for which I have to change the date for every single link manually (at the end of the vector). It looks like this:
query_01_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.01.2019&to=31.01.2019"
query_02_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.02.2019&to=28.02.2019"
query_03_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.03.2019&to=31.03.2019"
query_04_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.04.2019&to=30.04.2019"
query_05_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.05.2019&to=31.05.2019"
query_06_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.06.2019&to=30.06.2019"
query_07_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.07.2019&to=31.07.2019"
query_08_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.08.2019&to=31.08.2019"
query_09_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.09.2019&to=30.09.2019"
query_10_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.10.2019&to=31.10.2019"
query_11_19 = "?format=Html&userId=1232&userHash=1277KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.11.2019&to=30.11.2019"
query_12_19 = "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.12.2019&to=31.12.2019"
This is already rather tedious for one year, but it becomes a real pain if I want to do this for all the following years (let's say until 2030).
Is there an easier way to do this?
Thanks in advance!
A few tricks to make this easy:
1. use seq.Date to generate the first day of each month (it is called as seq here, since R's S3 method dispatch picks seq.Date for Date arguments);
2. subtract 1 from those to get the last day of the previous months; and
3. join those together with paste0 after formatting them to the dot-separated date format.
## 1
dates <- seq(as.Date("2018-01-01"), as.Date("2019-01-01"), by = "month")
dates
# [1] "2018-01-01" "2018-02-01" "2018-03-01" "2018-04-01" "2018-05-01" "2018-06-01" "2018-07-01"
# [8] "2018-08-01" "2018-09-01" "2018-10-01" "2018-11-01" "2018-12-01" "2019-01-01"
dates_first <- format(dates[-length(dates)], format = "%d.%m.%Y")
## 2
dates_last <- format(dates[-1] - 1L, format = "%d.%m.%Y")
dates_last
# [1] "31.01.2018" "28.02.2018" "31.03.2018" "30.04.2018" "31.05.2018" "30.06.2018" "31.07.2018"
# [8] "31.08.2018" "30.09.2018" "31.10.2018" "30.11.2018" "31.12.2018"
## 3
paste0(
"?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=",
dates_first,
"&to=",
dates_last)
# [1] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.01.2018&to=31.01.2018"
# [2] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.02.2018&to=28.02.2018"
# [3] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.03.2018&to=31.03.2018"
# [4] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.04.2018&to=30.04.2018"
# [5] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.05.2018&to=31.05.2018"
# [6] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.06.2018&to=30.06.2018"
# [7] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.07.2018&to=31.07.2018"
# [8] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.08.2018&to=31.08.2018"
# [9] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.09.2018&to=30.09.2018"
# [10] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.10.2018&to=31.10.2018"
# [11] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.11.2018&to=30.11.2018"
# [12] "?format=Html&userId=1232&userHash=U127KfIHaiz3ks2gXEgNctA9n8P4c87o1SFcEu2weKpNdupQwmuRaMltEN7&query=ApplicationStatusByJob&from=01.12.2018&to=31.12.2018"
(This could just as easily have been done with sprintf or related functions.)
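A possible sprintf version covering 2019 through 2030 might look like the sketch below. The base_query string uses a placeholder instead of the real userHash, and the names starts, from, to and queries are made up for illustration; naming the elements after the month is just one way to keep the query_MM_YY labels from the question.

```r
# Placeholder query prefix; <hash> stands in for the real userHash (assumption)
base_query <- "?format=Html&userId=1232&userHash=<hash>&query=ApplicationStatusByJob"

# First day of every month from 2019-01 up to and including 2031-01
starts <- seq(as.Date("2019-01-01"), as.Date("2031-01-01"), by = "month")

# First day of each month, and last day (next month's first day minus one)
from <- format(starts[-length(starts)], "%d.%m.%Y")
to   <- format(starts[-1] - 1, "%d.%m.%Y")

# One query string per month, named like query_01_19, query_02_19, ...
queries <- sprintf("%s&from=%s&to=%s", base_query, from, to)
names(queries) <- format(starts[-length(starts)], "query_%m_%y")

print(head(queries, 2))
```

A single query can then be picked by name, for example queries[["query_02_20"]].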
I'm an experienced Pandas user and am having trouble plugging values from my R frame into a function.
The following function works with hard coded values
>seq.Date(as.Date('2018-01-01'), as.Date('2018-01-31'), 'days')
[1] "2018-01-01" "2018-01-02" "2018-01-03" "2018-01-04" "2018-01-05" "2018-01-06" "2018-01-07"
[8] "2018-01-08" "2018-01-09" "2018-01-10" "2018-01-11" "2018-01-12" "2018-01-13" "2018-01-14"
[15] "2018-01-15" "2018-01-16" "2018-01-17" "2018-01-18" "2018-01-19" "2018-01-20" "2018-01-21"
[22] "2018-01-22" "2018-01-23" "2018-01-24" "2018-01-25" "2018-01-26" "2018-01-27" "2018-01-28"
[29] "2018-01-29" "2018-01-30" "2018-01-31"
Here is an extract from a dataframe I'm using
>df[1,1:2]
# A tibble: 1 x 2
start_time end_time
<date> <date>
1 2017-04-27 2017-05-11
When plugging these values into the 'seq.Date' function I get an error
> seq.Date(from=df[1,1], to=df[1,2], 'days')
Error in seq.Date(from = df[1, 1], to = df[1, 2], "days") :
'from' must be a "Date" object
I suspect this is because subsetting using df[x,y] returns a tibble rather than the specific value
data.class(df[1,1])
[1] "tbl_df"
What I'm hoping to derive is a sequence of dates. I need to be able to point this at various places around the dataframe.
Many thanks for any help!
Just use double brackets:
seq.Date(from=df[[1,1]], to=df[[1,2]], 'days')
The extraction functions of tibble may return one-column tibbles rather than vectors. Use dplyr::pull to extract a column as a vector, as in this answer: Extract a dplyr tbl column as a vector
Another option is to set the drop argument in the `[` function to TRUE.
If TRUE the result is coerced to the lowest possible dimension
seq.Date(from = df[1, 1, drop = TRUE], to = df[1, 2, drop = TRUE], 'days')
# [1] "2017-04-27" "2017-04-28" "2017-04-29" "2017-04-30" "2017-05-01" "2017-05-02" "2017-05-03" "2017-05-04" "2017-05-05" "2017-05-06"
#[11] "2017-05-07" "2017-05-08" "2017-05-09" "2017-05-10" "2017-05-11"
data
df <- tibble(start_time = as.Date('2017-04-27'),
end_time = as.Date('2017-05-11'))
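To point this at several rows at once, one option is to index the columns as vectors and loop over the row numbers. The sketch below uses a plain data.frame with a made-up second row so that it runs without extra packages; df$start_time[i] should behave the same way on a tibble. The name date_seqs is an assumption.

```r
# Illustrative data; the second row is invented for the example
df <- data.frame(start_time = as.Date(c("2017-04-27", "2018-01-01")),
                 end_time   = as.Date(c("2017-05-11", "2018-01-03")))

# df$col[i] returns a length-one Date, which seq() accepts directly
date_seqs <- lapply(seq_len(nrow(df)), function(i) {
  seq(df$start_time[i], df$end_time[i], by = "day")
})

print(date_seqs[[2]])
```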
I have a time series in R that I would like to work with, spanning from 01-01-52 to 01-01-88 (1952 to 1988), 37 observations in total.
However, when I read it into R, the observations from 01-01-52 to 01-01-68 are interpreted as being in 2052 to 2068 rather than 1952 to 1968.
How do I force R to read all the data as being from 1952 to 1988?
Link to my data: https://www.dropbox.com/s/93foyc238skt3xj/AgricIndus.csv?dl=0
This is the code I have used. Do you know what I need to do with my code to make it read properly?
agri <- read.table("AgricIndus.csv",
sep = ",", header = TRUE, skip = 0,
stringsAsFactors = FALSE)
agri$time <- as.Date(agri$time, "%m-%d-%y")
agri.xts <- xts(agri[, 2:3], order.by = agri$time)
One way (hack) can be the following:
agri$time <- as.Date(paste0(substring(agri$time,1,6), '19', substring(agri$time,7,8)),
"%m-%d-%Y")
agri$time
# [1] "1952-01-01" "1953-01-01" "1954-01-01" "1955-01-01" "1956-01-01" "1957-01-01" "1958-01-01" "1959-01-01" "1960-01-01" "1961-01-01" "1962-01-01" "1963-01-01" "1964-01-01" "1965-01-01"
# [15] "1966-01-01" "1967-01-01" "1968-01-01" "1969-01-01" "1970-01-01" "1971-01-01" "1972-01-01" "1973-01-01" "1974-01-01" "1975-01-01" "1976-01-01" "1977-01-01" "1978-01-01" "1979-01-01"
# [29] "1980-01-01" "1981-01-01" "1982-01-01" "1983-01-01" "1984-01-01" "1985-01-01" "1986-01-01" "1987-01-01" "1988-01-01"
If you can be sure that your time series is regular, then it is probably easiest to generate a regular date sequence like so:
agri$time <- seq.Date(as.Date("1952-01-01"), as.Date("1988-01-01"), by = "years")
Another easy solution, which would work for irregular time series as well, is to read your data as the literal years 52 to 88 with format = "%m-%d-%Y" (capital "Y"!) and then add 1900 years:
df$time <- as.POSIXlt(as.Date(df$time,format = '%m-%d-%Y'))
df$time$year <-df$time$year + 1900
df$time <- as.Date(df$time)
df$time
[1] "1952-01-01" "1953-01-01" "1954-01-01" "1955-01-01"
[5] "1956-01-01" "1957-01-01" "1958-01-01" "1959-01-01"
[9] "1960-01-01" "1961-01-01" "1962-01-01" "1963-01-01"
[13] "1964-01-01" "1965-01-01" "1966-01-01" "1967-01-01"
[17] "1968-01-01" "1969-01-01" "1970-01-01" "1971-01-01"
[21] "1972-01-01" "1973-01-01" "1974-01-01" "1975-01-01"
[25] "1976-01-01" "1977-01-01" "1978-01-01" "1979-01-01"
[29] "1980-01-01" "1981-01-01" "1982-01-01" "1983-01-01"
[33] "1984-01-01" "1985-01-01" "1986-01-01" "1987-01-01"
[37] "1988-01-01"
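A third possibility is to keep the two-digit %y format and shift only the dates that landed in the wrong century. The sketch below assumes that every observation belongs to the 20th century; the sample vector raw and the names parsed, lt, too_late and fixed are illustrative.

```r
# Illustrative sample of the two-digit-year strings
raw <- c("01-01-52", "01-01-68", "01-01-69", "01-01-88")

# %y maps 00-68 to 20xx and 69-99 to 19xx, so 52 and 68 become 2052 and 2068
parsed <- as.Date(raw, format = "%m-%d-%y")

# POSIXlt stores the year as an offset from 1900
lt <- as.POSIXlt(parsed)

# Assumption: no date should be later than 1999, so move those back a century
too_late <- lt$year + 1900 > 1999
lt$year[too_late] <- lt$year[too_late] - 100

fixed <- as.Date(lt)
print(fixed)
```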