Fill a vector with weekdays only - r

I know the start date, start, and the last date, maturity. How can I fill a vector with the dates in between, leaving out weekend dates?
For instance, let's say :
> start = as.Date("2013-02-28");
> maturity = as.Date("2013-03-07");
I would like to get the following vector as a result :
results
[1] "2013-03-01" "2013-03-04" "2013-03-05" "2013-03-06" "2013-03-07"
> start = as.Date("2013-02-28");
> maturity = as.Date("2013-03-07");
> x <- seq(start,maturity,by = 1)
> x
[1] "2013-02-28" "2013-03-01" "2013-03-02" "2013-03-03" "2013-03-04" "2013-03-05"
[7] "2013-03-06" "2013-03-07"
> x <- x[!weekdays(x) %in% c('Saturday','Sunday')]
> x
[1] "2013-02-28" "2013-03-01" "2013-03-02" "2013-03-03" "2013-03-04" "2013-03-05"
[7] "2013-03-06" "2013-03-07"
Same results... ?

There are probably a billion ways to do this with a variety of functions from multiple packages. But my first thought is to simply make a sequence and then remove the weekends:
x <- seq(as.Date('2011-01-01'),as.Date('2011-12-31'),by = 1)
x <- x[!weekdays(x) %in% c('Saturday','Sunday')]
This answer is valid only on a system with an English locale. For instance, with a French locale, 'Saturday' and 'Sunday' must be translated into 'samedi' and 'dimanche'.

This is less readable than @joran's answer :), but it does not depend on the locale. In POSIXlt, wday is 0 for Sunday and 6 for Saturday:
dd <- seq(as.Date('2011-01-01'),as.Date('2011-12-31'),by = 1)
dd[! (as.POSIXlt(dd)$wday %in% c(0,6))]
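PS: another option is to set the locale before applying weekdays
tt <- Sys.getlocale('LC_TIME')
# 'ENGLISH' is a Windows locale name; on Linux or macOS something like 'en_US.UTF-8' or 'C' is typically needed
Sys.setlocale('LC_TIME','ENGLISH')
dd <- dd[!weekdays(dd) %in% c('Saturday','Sunday')]
Sys.setlocale('LC_TIME',tt)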
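Applied to the dates from the question, the locale-independent approach can be sketched as follows. This is a minimal sketch that assumes the start date itself should be excluded, as the expected output in the question suggests; that assumption is why the sequence begins at start + 1.

```r
start <- as.Date("2013-02-28")
maturity <- as.Date("2013-03-07")

# Assumption: the start date is excluded, so the sequence begins one day later.
x <- seq(start + 1, maturity, by = "day")

# as.POSIXlt()$wday is numeric (0 = Sunday, 6 = Saturday), so no day names
# and therefore no locale are involved.
is_weekend <- as.POSIXlt(x)$wday %in% c(0, 6)
x <- x[!is_weekend]
x
```

Because the test is numeric, the same code should behave identically under a French, English, or any other LC_TIME setting.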

Related

Print all hours:minutes from 00:00 to 23:59

I would like to print all the hours:minutes in a day, from 00:00 to 23:59.
This part goes beyond the question, but if you want to help me, this is the whole idea:
Once that is done, I would like to calculate all the "curious" times that can be interpreted as serendipities. Patterns like: 00:00, 22:22, 01:10, 12:34, 11:44, and the like.
Later on, I would like to count all the "serendipities" and divide that count by the total number of minutes in a day, to get the probability of finding a "serendipity" each time a person looks at the time on their smartphone.
To be honest, I am pretty lost. I have not coded for some months. For the first part of the problem, I guess that a loop can do the task.
For the second part, an if conditional can probably make it.
For the first part of the problem I have tried loops like this
for(i in x){
for(k in y){
cat(i,":",k, ",")
}
}
For the second, something like
Assuming the digits of the time are ab:cd
if(a==b & a==c & a==d){
print(ab:cd)
TRUE
}
if(a==b & c==d){
print(ab:cd)
TRUE
}
I would like to get the whole list of numbers first. Then, the list of "serendipities", and finally the count of both to make the percentage.
I find it interesting how people find patterns in numbers when they look at the time, and I would like to know how probable it is to get one of these patterns out of the 24*60 = 1440 possible times.
I hope I have explained myself. (I used to be better with coding and maths, but after some months, I have forgotten almost everything).
Here's a way to generate the list of all possible times.
h <- seq(from=0, to=23)
m <- seq(from=0, to=59)
h <- sprintf('%02d', h)
m <- sprintf('%02d', m)
df <- data.frame(expand.grid(h, m))
df$times <- paste0(df$Var1, ':', df$Var2)
df <- df[order(df$times), ]
df$times
Partial output
df$times[1:25]
[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05" "00:06" "00:07" "00:08"
[10] "00:09" "00:10" "00:11" "00:12" "00:13" "00:14" "00:15" "00:16" "00:17"
[19] "00:18" "00:19" "00:20" "00:21" "00:22" "00:23" "00:24"
Length of variable
dim(df)
[1] 1440 3
We can create a sequence of 1 minute interval starting from 00:00:00 to 23:59:00 and then use format to get output in desired format.
format(seq(as.POSIXct("00:00:00", format = "%T"),
as.POSIXct("23:59:00", format = "%T"), by = "1 min"), "%H:%M")
#[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05" "00:06" "00:07" "00:08" "00:09"
# "00:10" "00:11" "00:12" "00:13" "00:14" "00:15" "00:16" "00:17" "00:18" "00:19" ...
Yet another way of doing it:
> result <- character(1440)
> for (i in 0:1439) result[i+1L] <- sprintf("%02d:%02d",
+ i %/% 60,
+ i %% 60
+ )
> head(result)
[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05"
> tail(result)
[1] "23:54" "23:55" "23:56" "23:57" "23:58" "23:59"
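For the second part of the question (counting the "curious" times), a vectorised sketch is given below. The two patterns used, aa:bb (such as 22:22 or 11:44) and ab:ba (such as 01:10), are only an assumed definition of a "serendipity"; other patterns such as 12:34 would need their own condition.

```r
# All 1440 times of the day as "HH:MM" strings.
times <- sprintf("%02d:%02d", rep(0:23, each = 60), rep(0:59, times = 24))

# Split each time into its four digits a, b, c, d (for ab:cd).
digits <- gsub(":", "", times)
a <- substr(digits, 1, 1)
b <- substr(digits, 2, 2)
c_ <- substr(digits, 3, 3)  # named c_ to avoid masking base::c
d <- substr(digits, 4, 4)

# Assumed patterns: repeated pairs (aa:bb) and mirrored times (ab:ba).
pairs <- a == b & c_ == d
mirrored <- a == d & b == c_
curious <- pairs | mirrored

times[curious]                 # the list of "serendipities"
sum(curious) / length(times)   # share of curious times in a day
```

Adding another pattern is a matter of defining one more logical vector and combining it with `|`.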

Access R Dataframe Values Rather than Tibble

I'm an experienced Pandas user and am having trouble plugging values from my R frame into a function.
The following function works with hard coded values
>seq.Date(as.Date('2018-01-01'), as.Date('2018-01-31'), 'days')
[1] "2018-01-01" "2018-01-02" "2018-01-03" "2018-01-04" "2018-01-05" "2018-01-06" "2018-01-07"
[8] "2018-01-08" "2018-01-09" "2018-01-10" "2018-01-11" "2018-01-12" "2018-01-13" "2018-01-14"
[15] "2018-01-15" "2018-01-16" "2018-01-17" "2018-01-18" "2018-01-19" "2018-01-20" "2018-01-21"
[22] "2018-01-22" "2018-01-23" "2018-01-24" "2018-01-25" "2018-01-26" "2018-01-27" "2018-01-28"
[29] "2018-01-29" "2018-01-30" "2018-01-31"
Here is an extract from a dataframe I'm using
>df[1,1:2]
# A tibble: 1 x 2
start_time end_time
<date> <date>
1 2017-04-27 2017-05-11
When plugging these values into the 'seq.Date' function I get an error
> seq.Date(from=df[1,1], to=df[1,2], 'days')
Error in seq.Date(from = df[1, 1], to = df[1, 2], "days") :
'from' must be a "Date" object
I suspect this is because subsetting using df[x,y] returns a tibble rather than the specific value
data.class(df[1,1])
[1] "tbl_df"
What I'm hoping to derive is a sequence of dates. I need to be able to point this at various places around the dataframe.
Many thanks for any help!
Just use double brackets:
seq.Date(from=df[[1,1]], to=df[[1,2]], 'days')
Single-bracket extraction on a tibble returns a one-column tibble rather than a vector. Use dplyr::pull to extract a column as a vector, as in this answer: Extract a dplyr tbl column as a vector
Another option is to set the drop argument in the `[` function to TRUE.
If TRUE the result is coerced to the lowest possible dimension
seq.Date(from = df[1, 1, drop = TRUE], to = df[1, 2, drop = TRUE], 'days')
# [1] "2017-04-27" "2017-04-28" "2017-04-29" "2017-04-30" "2017-05-01" "2017-05-02" "2017-05-03" "2017-05-04" "2017-05-05" "2017-05-06"
#[11] "2017-05-07" "2017-05-08" "2017-05-09" "2017-05-10" "2017-05-11"
data
library(tibble)
df <- tibble(start_time = as.Date('2017-04-27'),
end_time = as.Date('2017-05-11'))
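Since the question asks for something that can be pointed at various rows, a small helper is one way to wrap the extraction. This is a sketch only: the function name date_range_for_row is hypothetical, and a base data.frame stands in for the tibble so the example has no package dependencies. The `$` and `[[` extractions used here return plain vectors for tibbles as well.

```r
# Stand-in for the tibble from the question (assumption: same column names).
df <- data.frame(start_time = as.Date("2017-04-27"),
                 end_time   = as.Date("2017-05-11"))

# Hypothetical helper: daily sequence between start and end of row i.
date_range_for_row <- function(data, i) {
  from <- data$start_time[i]       # `$` yields a Date vector, not a data frame
  to   <- data[["end_time"]][i]    # `[[` does the same by column name
  seq(from, to, by = "day")
}

days <- date_range_for_row(df, 1)
length(days)
```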

Looping over dates with R

I need to write some code in R that builds a string by looping over dates, and I can't seem to find an example of this in my books or by Googling. Basically:
for theDate = 1Jan14 to 31Dec14{
"http://website.com/api/" + theDate
}
I thought about creating an input file that held the dates, but that seems inelegant. Does anybody know of a better solution?
This doesn't consume that much memory and doesn't need the julian function:
start <- as.Date("01-08-14",format="%d-%m-%y")
end <- as.Date("08-09-14",format="%d-%m-%y")
theDate <- start
while (theDate <= end)
{
print(paste0("http://website.com/api/",format(theDate,"%d%b%y")))
theDate <- theDate + 1
}
[1] "http://website.com/api/01Aug14"
[1] "http://website.com/api/02Aug14"
[1] "http://website.com/api/03Aug14"
[1] "http://website.com/api/04Aug14"
[1] "http://website.com/api/05Aug14"
[1] "http://website.com/api/06Aug14"
[1] "http://website.com/api/07Aug14"
[1] "http://website.com/api/08Aug14"
[1] "http://website.com/api/09Aug14"
[1] "http://website.com/api/10Aug14"
[1] "http://website.com/api/11Aug14"
[1] "http://website.com/api/12Aug14"
[1] "http://website.com/api/13Aug14"
[1] "http://website.com/api/14Aug14"
[1] "http://website.com/api/15Aug14"
[1] "http://website.com/api/16Aug14"
[1] "http://website.com/api/17Aug14"
[1] "http://website.com/api/18Aug14"
[1] "http://website.com/api/19Aug14"
[1] "http://website.com/api/20Aug14"
[1] "http://website.com/api/21Aug14"
[1] "http://website.com/api/22Aug14"
[1] "http://website.com/api/23Aug14"
[1] "http://website.com/api/24Aug14"
[1] "http://website.com/api/25Aug14"
[1] "http://website.com/api/26Aug14"
[1] "http://website.com/api/27Aug14"
[1] "http://website.com/api/28Aug14"
[1] "http://website.com/api/29Aug14"
[1] "http://website.com/api/30Aug14"
[1] "http://website.com/api/31Aug14"
[1] "http://website.com/api/01Sep14"
[1] "http://website.com/api/02Sep14"
[1] "http://website.com/api/03Sep14"
[1] "http://website.com/api/04Sep14"
[1] "http://website.com/api/05Sep14"
[1] "http://website.com/api/06Sep14"
[1] "http://website.com/api/07Sep14"
[1] "http://website.com/api/08Sep14"
You can use
> dates <- seq(as.Date("2014-01-01"), as.Date("2014-12-31"), by=1)
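to generate a vector of consecutive days. What you want to do with this is not entirely clear from your pseudo-code, but you can iterate directly over the vector (looping is generally not what you want in R; note also that for drops the Date class, so d is a plain number of days since 1970-01-01)
> for (d in dates) {
# Code goes here. d is numeric; as.Date(d, origin = "1970-01-01") restores the date.
}
The comment-solution by @Roland will give you a vector of the form: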
> paste0("http://website.com/api/", dates)
[1] "http://website.com/api/2014-01-01" "http://website.com/api/2014-01-02"
[3] "http://website.com/api/2014-01-03" "http://website.com/api/2014-01-04"
[5] "http://website.com/api/2014-01-05" "http://website.com/api/2014-01-06"
...
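The two points above (the vectorised form, and the loss of the Date class inside for) can be seen together in a short sketch. The URL is the placeholder from the question, and the "%Y%m%d" format is an assumption chosen because it does not depend on the locale, unlike "%b".

```r
dates <- seq(as.Date("2014-01-01"), as.Date("2014-01-05"), by = "day")

# Vectorised: one URL per date, no loop needed.
urls <- paste0("http://website.com/api/", format(dates, "%Y%m%d"))

# Inside a for loop the elements arrive as plain numbers, not Dates.
classes <- character(0)
for (d in dates) classes <- c(classes, class(d))

# Looping over indices keeps the Date class intact.
for (i in seq_along(dates)) {
  print(paste0("http://website.com/api/", format(dates[i], "%Y%m%d")))
}
```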
Of course after I ask the question I happen to find this.
days <- seq(from=as.Date('2011-02-01'), to=as.Date("2011-03-02"),by='days' )
for ( i in seq_along(days) )
{
print(paste(days[i],"T12:00:00", sep=""))
}
You could translate your dates into Julian days and then write a loop over the Julian days.
To convert to julian days you can use the code described here
And then you could write code using the Julian days like:
# as.Date gives whole days; note that %b depends on the locale
tmp <- as.Date("1Jan14", format = "%d%b%y")
strdate <- as.integer(julian(tmp))
tmp <- as.Date("31Dec14", format = "%d%b%y")
enddate <- as.integer(julian(tmp))
for (theDate in strdate:enddate){
print(paste ("http://website.com/api/", toString(theDate), sep = ""))
}
To convert back, as.Date(theDate, origin = "1970-01-01") should work, since julian() counts days from 1970-01-01 by default. I am not too sure about the julian function; maybe you should also have a look at "yday" in the lubridate package.

Two (supposedly) identical date objects in R are not equal?

I have a simple question. I have two Date objects in R that are supposed to be identical (they have the same value and class), but R is saying they are not equal. I am running on linux though I get the same result on a windows machine. Why is this happening?
code:
start=as.Date("2014-12-31")
finish=as.Date("2014-11-28")
dates = seq(start,finish,length=6)
christmasEve = as.Date("2014-12-24")
print(dates[2])
print(christmasEve)
print(class(dates[2]))
print(class(christmasEve))
(christmasEve==dates[2])
output:
[1] "2014-12-24"
[1] "2014-12-24"
[1] "Date"
[1] "Date"
[1] FALSE
Any help would be greatly appreciated!
-Paul
The problem is that seq() with length=6 splits the span into five equal steps, and the number of days (33) is not a multiple of five, so each step is 6.6 days. Check out:
as.numeric(dates)
# [1] 16435.0 16428.4 16421.8 16415.2 16408.6 16402.0
start - finish
# Time difference of 33 days
Since you are creating the dates as a sequence the dates are not exact round numbers.
> as.numeric(dates)
[1] 16435.0 16428.4 16421.8 16415.2 16408.6 16402.0
> as.numeric(christmasEve)
[1] 16428
> as.character(christmasEve) == as.character(dates[2])
[1] TRUE
It is not possible to test your code as there is no sampleRate. I assumed that sampleRate is 6. You could compare your dates with the code below:
all(as.character(christmasEve) == as.character(dates[2]))
The whole thing should work like this:
> sampleRate <- 6
>
> start=as.Date("2014-12-31")
> finish=as.Date("2014-11-28")
> dates = seq(start,finish,length=sampleRate)
> christmasEve = as.Date("2014-12-24")
> print(dates[2])
[1] "2014-12-24"
> print(christmasEve)
[1] "2014-12-24"
> print(class(dates[2]))
[1] "Date"
> print(class(christmasEve))
[1] "Date"
> (christmasEve==dates[2])
[1] FALSE
>
> all(christmasEve == dates[2])
[1] FALSE
> all(as.character(christmasEve) == as.character(dates[2])
+ )
[1] TRUE
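Instead of comparing through character strings, the fractional part can be dropped from the dates themselves. The sketch below assumes that flooring is the desired rounding, which matches what printing shows (16428.4 displays as "2014-12-24"); the name whole_dates is introduced here for illustration.

```r
start <- as.Date("2014-12-31")
finish <- as.Date("2014-11-28")
dates <- seq(start, finish, length.out = 6)
christmasEve <- as.Date("2014-12-24")

# Drop the fractional day so each element is a whole number of days.
whole_dates <- as.Date(floor(as.numeric(dates)), origin = "1970-01-01")

whole_dates[2] == christmasEve
```

After this step, ordinary `==` and `%in%` comparisons against dates created with as.Date() behave as expected.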

Get the second value in the row if the dates match in R

I am having trouble getting the values that have the same dates from two different data sources in R. The code is
#Monthly data
month_data <- c(580.11, 618.25, 641.24, 604.85, 580.86, 580.07, 632.97,
685.09, 754.50, 680.30, 698.37, 707.38, 480.11, 528.25,
541.24, 614.85, 680.86)
month_dates <- seq(as.Date("2001/06/01"), by = "1 months", length = 17)
month_data <- data.frame(month_dates, month_data)
#the dates_for_match is a list:
dates_for_match<-list(c( "2001-08-01","2001-09-01", "2001-10-01"),c("2001-11-01","2001-12-01","2002-01-01"),c("2002-02-01","2002-03-01","2002-04-01"),c("2002-05-01","2002-06-01","2002-07-01"),c( "2002-08-01","2002-09-01", "2002-10-01"))
Example:
> dates_for_match
[[1]]
[1] "2001-08-01" "2001-09-01" "2001-10-01"
[[2]]
[1] "2001-11-01" "2001-12-01" "2002-01-01"
[[3]]
[1] "2002-02-01" "2002-03-01" "2002-04-01"
[[4]]
[1] "2002-05-01" "2002-06-01" "2002-07-01"
[[5]]
[1] "2002-08-01" "2002-09-01" "2002-10-01"
I want to use the dates_for_match list to get the values from month_data that have the same dates.
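You need %in%. Converting the strings to Date first makes the comparison explicit rather than relying on implicit coercion between Date and character:
month_data[ month_data$month_dates %in% as.Date( unlist( dates_for_match ) ) , 2 ]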
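If the values should stay grouped the way dates_for_match is grouped, one option is to look them up per list element. This is a sketch under that assumption, using only the first two groups from the question; the column name values and the result name per_group are introduced here for illustration.

```r
values <- c(580.11, 618.25, 641.24, 604.85, 580.86, 580.07, 632.97,
            685.09, 754.50, 680.30, 698.37, 707.38, 480.11, 528.25,
            541.24, 614.85, 680.86)
month_dates <- seq(as.Date("2001-06-01"), by = "1 month", length.out = 17)
month_data <- data.frame(month_dates, values)

dates_for_match <- list(c("2001-08-01", "2001-09-01", "2001-10-01"),
                        c("2001-11-01", "2001-12-01", "2002-01-01"))

# For each group, find the row of every date and take the matching value.
per_group <- lapply(dates_for_match, function(d) {
  month_data$values[match(as.Date(d), month_data$month_dates)]
})
per_group
```

Using match() keeps the values in the order of the requested dates and returns NA for any date that is missing from month_data.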

Resources