I have one year of financial data, collected on working days.
Is there a way in R to assign a date to each data point, given that the first data point was collected on, for example, January 3rd?
You need to take two steps to get to a solution:
Create a sequence of dates using seq.Date
Use wday to calculate the day of the week and remove all days with value 1 (Sunday) and 7 (Saturday)
The code and results:
library(lubridate)  # provides wday(); by default Sunday is 1 and Saturday is 7
startdate <- as.Date("2011-01-03")
dates <- seq(startdate, by="1 day", length.out=15)
dates[wday(dates) != 1 & wday(dates) != 7]
[1] "2011-01-03" "2011-01-04" "2011-01-05" "2011-01-06" "2011-01-07"
[6] "2011-01-10" "2011-01-11" "2011-01-12" "2011-01-13" "2011-01-14"
[11] "2011-01-17"
PS. You will have to keep a separate list of holidays in your region and remove these from the result.
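The two steps above, plus the holiday removal, can be sketched in base R as follows. This is only a minimal sketch: the function name business_days and the holidays argument are made up for illustration, and the holiday list is something you would have to supply yourself.

```r
# Hypothetical helper: return the first n business days on or after `start`,
# skipping Saturdays, Sundays and any dates listed in `holidays`.
business_days <- function(start, n, holidays = as.Date(character())) {
  start <- as.Date(start)
  holidays <- as.Date(holidays)
  # Generate enough calendar days to be sure n business days are included
  n_days <- ceiling(1.4 * (n + length(holidays))) + 7
  candidates <- seq(start, by = "day", length.out = n_days)
  # %u is the weekday number with Monday = 1 and Sunday = 7
  is_weekday <- as.integer(format(candidates, "%u")) <= 5
  keep <- candidates[is_weekday & !(candidates %in% holidays)]
  keep[seq_len(n)]
}

# One date per data point, for example for 11 observations
business_days("2011-01-03", 11)
# The same, treating 2011-01-17 as a holiday
business_days("2011-01-03", 11, holidays = "2011-01-17")
```

The result has exactly one date per data point, so it can be used directly as the index of the series.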
The timeDate package offers functions to extract business days in whatever financial center you happen to favor (there are almost 500 such financial centers in its classification).
I have the following two dates:
dates <- c("2019-02-01", "2019-06-30")
I want to create the following bins from above two dates:
2019-05-31, 2019-04-30, 2019-03-31, 2019-02-28
I used cut function along with seq,
dt <- as.Date(dates)
cut(seq(dt[1], dt[2], by = "month"), "month")
but this does not produce correct results.
Could you please shed some light on the use of cut function on dates?
We assume that what is wanted is every end-of-month date between, but not including, the two dates in dates. In the question dates[1] is the beginning of a month and dates[2] is the end of a month, but we do not rely on that, although doing so might simplify the code. We produce a descending series below, although ascending order is more usual in R.
The first approach below uses a monthly sequence and cut and the second approach below uses a daily sequence.
No packages are used.
1) We define a first of the month function, fom, which given a Date or character date gives the Date of the first of the month using cut. Then we calculate monthly dates between the first of the months of the two dates, convert those to end of the month and then remove any dates that are not strictly between the dates in dates.
fom <- function(x) as.Date(cut(as.Date(x), "month"))
s <- seq(fom(dates[2]), fom(dates[1]), "-1 month")
ss <- fom(fom(s) + 32) - 1
ss[ss > dates[1] & ss < dates[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
2) Another approach is to compute a daily sequence between the two elements of dates after converting to Date class and then only keep those for which the next day has a different month and is between the dates in dates. This does not use cut.
dt <- as.Date(dates)
s <- seq(dt[2], dt[1], "-1 day")
s[as.POSIXlt(s)$mon != as.POSIXlt(s+1)$mon & s > dt[1] & s < dt[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
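As for what cut itself does on dates, here is a small illustration with made-up input dates: it does not generate month ends, it labels each date with the start of the period it falls in, which is why fom above returns the first of the month.

```r
# Illustrative dates only; any Date vector works the same way
d <- as.Date(c("2019-02-01", "2019-02-15", "2019-03-10", "2019-06-30"))

# cut() on a Date vector returns a factor; each element is labelled
# with the first day of the month it falls in
bins <- cut(d, "month")
as.character(bins)

# Converting the labels back to Date gives the first-of-month dates
as.Date(as.character(bins))
```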
There is no need for cut here:
library(lubridate)
dates <- c("2019-02-01", "2019-06-30")
seq(min(ymd(dates)), max(ymd(dates)), by = "months") - 1
#> [1] "2019-01-31" "2019-02-28" "2019-03-31" "2019-04-30" "2019-05-31"
Created on 2021-11-25 by the reprex package (v2.0.1)
I have a data frame in R with a column that contains date/times such as 2018-06-01T19:55:57.000Z. The time is in GMT and the column is of type character. How can I convert this field to Central Time, showing the date, time and time zone in the column, and also drop the trailing .000Z?
I ran the code as.POSIXct(data$datetime, format="%Y-%m-%d %H:%M:%S") but it returns NA in all the fields. Any ideas?
[1] "2018-06-01T19:30:02.000Z" "" "2018-04-23T20:51:13.000Z" "2018-05-25T18:06:53.000Z"
[5] "2018-05-31T21:59:19.000Z" "2018-06-01T16:30:36.000Z" "2018-06-01T14:16:04.000Z" "2018-05-18T22:03:41.000Z"
[9] "2018-05-15T17:15:22.000Z" "2018-06-01T18:57:33.000Z" "2018-06-01T17:48:04.000Z" ""
[13] "2018-06-01T16:10:10.000Z" "2018-05-31T19:34:01.000Z" "2018-05-18T13:34:32.000Z" "2018-06-01T19:55:57.000Z"
We can use lubridate methods
library(lubridate)
ymd_hms(str1)
data
str1 <- "2018-06-01T19:55:57.000Z"
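For the time zone part of the question, a base R sketch could look like this. It assumes that "central time" means the America/Chicago zone and that the empty strings should become NA; the variable names are illustrative.

```r
# A few values from the question, including an empty string
datetime <- c("2018-06-01T19:30:02.000Z", "", "2018-04-23T20:51:13.000Z")

# Parse as UTC; the literal "T" must appear in the format string.
# Trailing characters (".000Z") are ignored and "" becomes NA.
utc <- as.POSIXct(datetime, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Render in Central Time (assumed to be America/Chicago) with the zone abbreviation
central <- format(utc, format = "%Y-%m-%d %H:%M:%S %Z", tz = "America/Chicago")
central
```

Note that central is a character vector. To keep a date-time class instead, lubridate's with_tz(ymd_hms(str1), "America/Chicago") changes only the displayed time zone.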
I know how to get the week from an index, but don't know the other way around: how to create an index if I have the calendar weeks (in this case, from an SAP system with 0CALWEEK as 201501, 201502 ... 201552, 201553).
Found this:
How to Parse Year + Week Number in R?
but the day is needed and it's not clear how to set it, especially at the end of the year (Year - week - day: YEAR-53-01 does not always exist, since the first day of week 53 might be a Monday, in which case day 01 (Sunday) is not in that week).
I could try to get in the source system the first day of the corresponding week (through SQL) but thought R might do it easier...
Do you have any suggestions?
(Which day counts as the first day of the week is not important, since I will create all objects the same way and then merge/cbind them before continuing the analysis. If zoo is easier, I'll go with it.)
Thanks!
The problem is that all indices end in 2015-07-29:
data <- 1:4
weeks <- c('201501','201502','201552','201553')
weeks_2 <- as.Date(weeks,format='%Y%w')
xts(data, order.by = weeks_2)
[,1]
2015-07-29 1
2015-07-29 2
2015-07-29 3
2015-07-29 4
test <- xts(data, order.by = weeks_2)
index(test)
[1] "2015-07-29" "2015-07-29" "2015-07-29" "2015-07-29"
You can use the as.Date() function, which I think is the easiest way:
weeks <- c('201501','201502','201552','201553')
as.Date(paste0(weeks,'1'),format='%Y%W%w') # paste a dummy day
## [1] "2015-01-05" "2015-01-12" "2015-12-28" NA
Where:
%W: Week 00-53 with Monday as first day of the week
or
%U: Week 00-53 with Sunday as first day of the week
%w: Weekday 0-6 Sunday is 0
With this week numbering, week 53 doesn't exist in 2015, hence the NA. If you want to start with 2015-01-01, just set the right weekday:
weeks <- c('201500','201501','201502','201551','201552')
as.Date(paste0(weeks,'4'),format='%Y%W%w')
## [1] "2015-01-01" "2015-01-08" "2015-01-15" "2015-12-24" "2015-12-31"
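If the week numbers follow ISO 8601 instead (this is an assumption about the source system, not something stated in the question), then 2015 does have a week 53, and the Monday of each week can be computed without %W. A minimal base R sketch, with a made-up function name:

```r
# Hypothetical helper: Monday of the ISO 8601 week given as "YYYYWW".
# ISO week 1 is the week (Monday to Sunday) that contains January 4th.
iso_week_start <- function(yyyyww) {
  year <- as.integer(substr(yyyyww, 1, 4))
  week <- as.integer(substr(yyyyww, 5, 6))
  jan4 <- as.Date(sprintf("%d-01-04", year))
  # %u is the weekday number with Monday = 1 and Sunday = 7
  monday_week1 <- jan4 - (as.integer(format(jan4, "%u")) - 1)
  monday_week1 + 7 * (week - 1)
}

weeks <- c('201501', '201502', '201552', '201553')
iso_week_start(weeks)
```

The resulting Date vector can be passed as order.by to xts. The function does not check that the week number is valid for the given year.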
You may try with substr() and lubridate
library(lubridate)
# a number from your list: 201502
# set the year
x <- ymd("2015-01-1")
# retrieve second week
week(x) <- 2
x
[1] "2015-01-08"
You can use the result for your index or rownames().
zoo and xts are great for time series once you have set the index;
be sure to remove any date columns from your data frame first.
The original data frame in R contains a column called OrdDate with values representing dates including 12/31/1999, 1/1/2007, and so on (over 230,000 of them) ranging from years 1999-2010.
However, when I order the data frame by date with the R code below, the dates in 1999 sort correctly, but the later ones do not: starting with 1/1/year, 1/1/2009 is displayed next, before dates like 1/1/2004 and 1/2/2000.
Any ideas what I could do to enforce the correct date sorting on this column?
R code:
sorted_frame<-frame1[order(as.Date(frame1$OrdDate, format="%m/%d/%y")),]
Say you have this:
a<-c("12/31/2010","12/31/1999","12/31/2008","12/31/1998")
Using %y will fail because it expects a 2-digit year:
order(as.Date(a,format="%m/%d/%y"))
[1] 2 4 1 3
That is because you have:
as.Date(a,format="%m/%d/%y")
[1] "2020-12-31" "2019-12-31" "2020-12-31" "2019-12-31"
Using %Y will achieve what you want because it expects a 4-digit year:
order(as.Date(a,format="%m/%d/%Y"))
[1] 4 2 3 1
As R analyst pointed out, using sort might be a better solution, as it returns your values as Dates:
a[order(as.Date(a,format="%m/%d/%Y"))]
[1] "12/31/1998" "12/31/1999" "12/31/2008" "12/31/2010"
sort(as.Date(a,format="%m/%d/%Y"))
[1] "1998-12-31" "1999-12-31" "2008-12-31" "2010-12-31"
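Applied to a data frame like the one in the question, this could look as follows. The sample rows are made up for illustration; only the column name OrdDate comes from the question.

```r
# Small stand-in for the real data frame (values are illustrative)
frame1 <- data.frame(
  OrdDate = c("1/1/2009", "12/31/1999", "1/2/2000", "1/1/2004"),
  stringsAsFactors = FALSE
)

# Parse with %Y (4-digit year), then reorder the rows by the parsed dates
parsed <- as.Date(frame1$OrdDate, format = "%m/%d/%Y")
sorted_frame <- frame1[order(parsed), , drop = FALSE]
sorted_frame$OrdDate
```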
I have log files where the date is given in the ordinal date format (see the Wikipedia page for ordinal date).
For example, 14273 means the 273rd day of 2014, so 14273 is 30-Sep-2014.
Is there a function in R to convert an ordinal date (14273) to a date (30-Sep-2014)?
I tried the date package but didn't come across a function that would do this.
Try as.Date with the indicated format:
as.Date(sprintf("%05d", 14273), format = "%y%j")
## [1] "2014-09-30"
Notes
For more information see ?strptime
The 273 part is sometimes referred to as the day of the year (as opposed to the day of the month) or the day number or the julian day relative to the beginning of the year.
If the input were a character string of the form yyjjj (rather than numeric) then as.Date(x, format = "%y%j") will do.
Update: this now also handles years with one digit, as per the comments.
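Wrapped into a small function, the same idea handles vectors as well as one-digit years. The function name is made up for illustration, and the month abbreviation produced by %b depends on the locale.

```r
# Hypothetical helper: convert yyjjj ordinal dates (numeric or character)
# to Date. sprintf("%05d") restores a leading zero, so 9001 becomes "09001".
ordinal_to_date <- function(x) {
  as.Date(sprintf("%05d", as.integer(x)), format = "%y%j")
}

d <- ordinal_to_date(c(14273, 9001))
d

# Display in the dd-Mon-yyyy style used in the question (locale-dependent)
format(d, "%d-%b-%Y")
```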
Data example
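x<-as.character(c("14273", "09001", "07031", "01033"))
Data conversion
# Split into the 2-digit year and the 3-digit day of the year
x1<-substr(x, start=1, stop=2)
x2<-substr(x, start=3, stop=5)
# Caveat: strptime() with only %j fills in the current year, so days after
# Feb 28 can be off by one if the leap-year status of that year differs
x3<-format(strptime(x2, format="%j"), format="%m-%d")
date<-as.Date(paste(x3, x1, sep="-"), format="%m-%d-%y")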
You can use the lubridate package as follows:
library(lubridate)
# Create a template date object
date <- as.POSIXlt("2009-02-10")
# Update the year and the day of the year
update(date, year=2014, yday=273)
## [1] "2014-09-30 JST"