I have a list of every day from 2018-01-01 to 2018-06-01. It is a vector and it looks like this:
dates <- c("2018-01-01", "2018-01-02", "2018-01-03", ... , "2018-05-30", "2018-06-01")
I want to make a data frame where the first column has each of those dates and the second column has their day of the week. I am assuming that 2018-01-01 is a Monday.
date day
2018-01-01 Monday
2018-01-02 Tuesday
2018-01-03 Wednesday
... ...
2018-06-01 Monday
I'm building a data frame towards that end, but I am curious whether there is a better way to recycle through the days of the week than the solution I put together.
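days <- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")
day <- NULL
for (i in seq_along(dates)) {
  x <- i
  # step back one week at a time until x is a valid index into days
  while (x > 7) {
    x <- x - 7
  }
  day <- c(day, days[x])
}
cbind(dates, day)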
We can use weekdays() to get the day of the week and put it in a data frame. Note that dates must be of class Date (see the data at the end of this answer).
data.frame(dates, day = weekdays(dates))
# dates day
#1 2018-01-01 Monday
#2 2018-01-02 Tuesday
#3 2018-01-03 Wednesday
#4 2018-05-30 Wednesday
#5 2018-06-01 Friday
EDIT
If we don't want to use a built-in function, we can create a vector of day names and look them up from there. Taking the first date to be a "Monday", we can use the modulo operator on the number of days elapsed since that date to find the day for the rest of the dates:
days <- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")
day <- days[(as.numeric(dates - dates[1]) %% 7) + 1]
day
#[1] "Monday" "Tuesday" "Wednesday" "Wednesday" "Friday"
and then put them in a data frame:
data.frame(dates, day)
# dates day
#1 2018-01-01 Monday
#2 2018-01-02 Tuesday
#3 2018-01-03 Wednesday
#4 2018-05-30 Wednesday
#5 2018-06-01 Friday
data
dates<-as.Date(c("2018-01-01","2018-01-02","2018-01-03","2018-05-30","2018-06-01"))
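If the dates really are consecutive, as described in the question, the recycling the asker had in mind can be sketched with rep_len(). This is an illustrative alternative rather than part of the answer above: it assumes an unbroken daily sequence whose first date is a Monday, and the names all_dates and day_recycled are made up for the example.

```r
# Assumption: an unbroken daily sequence whose first element is a Monday
all_dates <- seq(as.Date("2018-01-01"), as.Date("2018-06-01"), by = "day")
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")

# rep_len() recycles the seven names until there is one per date
day_recycled <- rep_len(days, length(all_dates))

res <- data.frame(date = all_dates, day = day_recycled)
head(res, 8)
tail(res, 1)
```

When the data has gaps (as in the five-date sample above), the modulo lookup on date differences stays correct, whereas plain recycling would drift out of step.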
I am working on a project and would be happy about your help.
I am working with stocks and the effect of weekdays on performance. Is there a way to take all the values (for instance, of the S&P 500) in a data frame (df) that fall on a specific weekday (e.g. Tuesday) and put them into a new column of a different data frame (df2)?
Thank you very much,
Ferdinand
df <- read.csv("AAPL.csv") # from Yahoo! Finance
> head(df)
Date Open High Low Close Adj.Close Volume
1 2019-07-10 201.85 203.73 201.56 203.23 200.8332 17897100
2 2019-07-11 203.31 204.39 201.71 201.75 199.3706 20191800
3 2019-07-12 202.45 204.00 202.20 203.30 200.9023 17595200
4 2019-07-15 204.09 205.87 204.00 205.21 202.7898 16947400
5 2019-07-16 204.59 206.11 203.50 204.50 202.0882 16866800
6 2019-07-17 204.05 205.09 203.27 203.35 200.9517 14107500
df$Day <- format(as.Date(df$Date), "%A") # Get the day
idx <- df$Day == "Tuesday" # Where are the Tuesdays ?
df2 <- df[idx, ] # Logical indexing
> head(df2)
Date Open High Low Close Adj.Close Volume Day
5 2019-07-16 204.59 206.11 203.50 204.50 202.0882 16866800 Tuesday
10 2019-07-23 208.46 208.91 207.29 208.84 206.3770 18355200 Tuesday
15 2019-07-30 208.76 210.16 207.31 208.78 206.3177 33935700 Tuesday
20 2019-08-06 196.31 198.07 194.04 197.00 194.6766 35824800 Tuesday
25 2019-08-13 201.02 212.14 200.48 208.97 207.2901 47218500 Tuesday
30 2019-08-20 210.88 213.35 210.32 210.36 208.6689 26884300 Tuesday
Your function:
myfunction <- function(mydf) {
  # work on the argument, not on the global df
  mydf$Day <- format(as.Date(mydf$Date), "%A")
  idx <- mydf$Day == "Tuesday"
  mydf[idx, ]  # return the Tuesday rows
}
Testing myfunction:
> out = myfunction(df)
> head(out)
Date Open High Low Close Adj.Close Volume Day
5 2019-07-16 204.59 206.11 203.50 204.50 202.0882 16866800 Tuesday
10 2019-07-23 208.46 208.91 207.29 208.84 206.3770 18355200 Tuesday
15 2019-07-30 208.76 210.16 207.31 208.78 206.3177 33935700 Tuesday
20 2019-08-06 196.31 198.07 194.04 197.00 194.6766 35824800 Tuesday
25 2019-08-13 201.02 212.14 200.48 208.97 207.2901 47218500 Tuesday
30 2019-08-20 210.88 213.35 210.32 210.36 208.6689 26884300 Tuesday
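One caveat with the approach above is that "%A" returns weekday names in the current locale, so the comparison with "Tuesday" only works in an English locale. A minimal sketch of a locale-independent variant, using the "%u" format (weekday number, Monday = 1), might look like this. The data frame prices and its values are made up purely for illustration.

```r
# Toy data standing in for the Yahoo! Finance file (values are made up)
prices <- data.frame(
  Date  = as.Date("2019-07-15") + 0:9,
  Close = seq(200, 209)
)

# "%u" gives the weekday number as text: "1" = Monday ... "7" = Sunday
prices$DayNum <- format(prices$Date, "%u")

# Keep only the Tuesdays, independent of the session's language settings
tuesdays <- prices[prices$DayNum == "2", ]
tuesdays
```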
I am interested in creating a dataframe of Date values for the year 2015. There would be one row per date. Also, these would have to correspond to their accurate weekday. For example weekdays() applied to 2015-01-01 would have a value of Thursday. This is because I ultimately want to extract the dates that correspond to Saturdays and Sundays.
try this:
dates <- seq(as.Date("2015-01-01"),as.Date("2015-12-31"),1)
weekdays <- weekdays(dates)
res <- data.frame(dates,weekdays)
res[res$weekdays=="Sunday" | res$weekdays=="Saturday",]
## EDIT: thanks to @Jaap
res[res$weekdays %in% c("Sunday","Saturday"),]
dates weekdays
3 2015-01-03 Saturday
4 2015-01-04 Sunday
10 2015-01-10 Saturday
11 2015-01-11 Sunday
17 2015-01-17 Saturday
18 2015-01-18 Sunday
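The same selection can be sketched without relying on weekday names, which depend on the locale. The example below is an assumption-labelled alternative, not the answer's method: it uses the wday component of POSIXlt (0 = Sunday, 6 = Saturday), and the names wday_num and weekend_dates are illustrative.

```r
dates <- seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by = "day")

# Numeric weekday: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
wday_num <- as.POSIXlt(dates)$wday

# Keep Saturdays and Sundays only
weekend_dates <- dates[wday_num %in% c(0, 6)]
head(weekend_dates)
length(weekend_dates)
```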
I have a dataset consisting of two columns (timestamp and power):
str(df2)
'data.frame': 720 obs. of 2 variables:
$ timestamp: POSIXct, format: "2015-08-01 00:00:00" "2015-08-01 01:00:00" " ...
$ power : num 124 149 118 167 130 ..
This dataset covers one entire month. I want to create two subsets of it: one containing the weekend data (Saturday and Sunday) and the other containing the weekday data (Monday to Friday). Both subsets should retain both columns. How can I do this in R?
I tried to use aggregate and split, but it is not clear to me how to specify such a division of the dataset through the function parameter (FUN) of aggregate.
You can do this with base R functions: first use strptime to extract the date from the first column, then apply weekdays to the result.
Example:
df1<-data.frame(timestamp=c("2015-08-01 00:00:00","2015-10-13 00:00:00"),power=1:2)
df1$day<-strptime(df1[,1], "%Y-%m-%d")
df1$weekday<-weekdays(df1$day)
df1
timestamp power day weekday
2015-08-01 00:00:00 1 2015-08-01 Saturday
2015-10-13 00:00:00 2 2015-10-13 Tuesday
Building on top of @ShruS's example:
df<-data.frame(timestamp=c("2015-08-01 00:00:00","2015-10-13 00:00:00", "2015-10-11 00:00:00", "2015-10-14 00:00:00"))
df$day<-strptime(df[,1], "%Y-%m-%d")
df$weekday<-weekdays(df$day)
df1 = subset(df,df$weekday == "Saturday" | df$weekday == "Sunday")
df2 = subset(df,df$weekday != "Saturday" & df$weekday != "Sunday")
> df
timestamp day weekday
1 2015-08-01 00:00:00 2015-08-01 Saturday
2 2015-10-13 00:00:00 2015-10-13 Tuesday
3 2015-10-11 00:00:00 2015-10-11 Sunday
4 2015-10-14 00:00:00 2015-10-14 Wednesday
> df1
timestamp day weekday
1 2015-08-01 00:00:00 2015-08-01 Saturday
3 2015-10-11 00:00:00 2015-10-11 Sunday
> df2
timestamp day weekday
2 2015-10-13 00:00:00 2015-10-13 Tuesday
4 2015-10-14 00:00:00 2015-10-14 Wednesday
Initially I tried more complex approaches using extra libraries, but in the end I came up with a basic approach in base R.
#adding day column to existing set
df2$day <- weekdays(as.POSIXct(df2$timestamp))
# creating two data_subsets, i.e., week_data and weekend_data
week_data<- data.frame(timestamp=factor(), power= numeric(),day= character())
weekend_data<- data.frame(timestamp=factor(),power=numeric(),day= character())
#Specifying weekend days in vector, weekend
weekend <- c("Saturday","Sunday")
for (i in 1:nrow(df2)) {
  if (is.element(df2[i, 3], weekend)) {
    weekend_data <- rbind(weekend_data, df2[i, ])
  } else {
    week_data <- rbind(week_data, df2[i, ])
  }
}
The resulting data frames, weekend_data and week_data, are the two subsets I needed.
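Growing a data frame with rbind() one row at a time gets slow on larger data, so the loop above can also be expressed with a single logical index. The following is a minimal sketch under stated assumptions: power_df is a made-up stand-in for df2 with placeholder power values, timestamps are taken to be hourly in UTC, and weekends are detected through the numeric wday component (0 = Sunday, 6 = Saturday) rather than through day names.

```r
# Stand-in for df2: hourly timestamps for August 2015 (placeholder power values)
timestamp <- seq(as.POSIXct("2015-08-01 00:00:00", tz = "UTC"),
                 by = "hour", length.out = 31 * 24)
power_df <- data.frame(timestamp = timestamp,
                       power = seq_along(timestamp))

# TRUE for rows that fall on a Saturday (6) or a Sunday (0)
is_weekend <- as.POSIXlt(power_df$timestamp)$wday %in% c(0, 6)

# Both subsets keep all columns of the original data frame
weekend_data <- power_df[is_weekend, ]
week_data    <- power_df[!is_weekend, ]

c(weekend = nrow(weekend_data), week = nrow(week_data))
```

split(power_df, is_weekend) would give the same two pieces as a list, if that form is more convenient.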
I have dates in year month day format that I want to convert to year month week format like so:
date dateweek
2015-02-18 -> 2015-02-8
2015-02-19 -> 2015-02-8
2015-02-20 -> ....
2015-02-21
2015-02-22
2015-02-23
2015-02-24 ...
2015-02-25 -> 2015-02-9
2015-02-26 -> 2015-02-9
2015-02-27 -> 2015-02-9
I tried
data$dateweek <- week(as.POSIXlt(data$date))
but that returns only weeks without the corresponding year and month.
I also tried:
data$dateweek <- as.POSIXct('2015-02-18')
data$dateweek <- format(data$dateweek, '%Y-%m-%U')
# data$dateweek <- format(as.POSIXct(data$date), '%Y-%m-%U')
but the corresponding columns look strange:
date datetime
2015-01-01 2015-01-00
2015-01-02 2015-01-00
2015-01-03 2015-01-00
2015-01-04 2015-01-01
2015-01-05 2015-01-01
2015-01-06 2015-01-01
2015-01-07 2015-01-01
2015-01-08 2015-01-01
2015-01-09 2015-01-01
2015-01-10 2015-01-01
2015-01-11 2015-01-02
You need to use the '%Y-%m-%V' format to get that:
mydate <- as.POSIXct('2015-02-18')
> format(mydate, '%Y-%m-%V')
[1] "2015-02-08"
From the documentation of strptime:
%V
Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1. (Accepted but ignored on input.)
and there is also the US convention:
%U
Week of the year as decimal number (00–53) using Sunday as the first day 1 of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.
It really depends on which one you want to use for your case.
mydate <- as.POSIXct('2015-02-18')
> format(mydate, '%Y-%m-%U')
[1] "2015-02-07"
In your case, since your expected output maps 2015-02-18 to week 8, the ISO variant is the one that matches:
data$dateweek <- format(as.POSIXct(data$date), '%Y-%m-%V')
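To make the difference between the two week conventions concrete, here is a small sketch that formats a few dates both ways and also builds the label without a leading zero, as in the question's expected output. The variable names are illustrative. The last date is included as an assumption-labelled caveat: around New Year the ISO week can belong to the previous ISO year (available via "%G"), so combining "%Y" with "%V" can give surprising labels there.

```r
d <- as.Date(c("2015-02-18", "2015-02-25", "2016-01-01"))

iso_week <- format(d, "%V")  # ISO 8601 week, weeks start on Monday
us_week  <- format(d, "%U")  # US convention, weeks start on Sunday

# Year-month-week label with the leading zero of the week dropped
dateweek <- paste0(format(d, "%Y-%m-"), as.integer(iso_week))

data.frame(date = d, iso_week, us_week, dateweek,
           iso_year = format(d, "%G"))
```

For the first two dates this yields "2015-02-8" and "2015-02-9", matching the format asked for in the question.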
Let's say that I have a date in R and it's formatted as follows.
date
2012-02-01
2012-02-01
2012-02-02
Is there any way in R to add another column with the day of the week associated with the date? The dataset is really large, so it would not make sense to go through manually and make the changes.
df = data.frame(date=c("2012-02-01", "2012-02-01", "2012-02-02"))
So after adding the days, it would end up looking like:
date day
2012-02-01 Wednesday
2012-02-01 Wednesday
2012-02-02 Thursday
Is this possible? Can anyone point me to a package that will allow me to do this?
Just trying to automatically generate the day by the date.
df = data.frame(date=c("2012-02-01", "2012-02-01", "2012-02-02"))
df$day <- weekdays(as.Date(df$date))
df
## date day
## 1 2012-02-01 Wednesday
## 2 2012-02-01 Wednesday
## 3 2012-02-02 Thursday
Edit: Just to show another way...
The wday component of a POSIXlt object is the numeric weekday (0-6 starting on Sunday).
as.POSIXlt(df$date)$wday
## [1] 3 3 4
which you could use to subset a character vector of weekday names
c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday")[as.POSIXlt(df$date)$wday + 1]
## [1] "Wednesday" "Wednesday" "Thursday"
Use the lubridate package and function wday:
library(lubridate)
df$date <- as.Date(df$date)
wday(df$date, label=TRUE)
[1] Wed Wed Thurs
Levels: Sun < Mon < Tues < Wed < Thurs < Fri < Sat
Look up ?strftime:
%A Full weekday name in the current locale
df$day = strftime(df$date,'%A')
If you additionally want the week to begin on Monday (instead of the default, Sunday), the following is helpful:
require(lubridate)
df$day = ifelse(wday(df$date)==1,6,wday(df$date)-2)
The result is a day number in the range 0 to 6 (Monday = 0, Sunday = 6).
If you want the range 1 to 7 instead, use the following:
df$day = ifelse(wday(df$date)==1,7,wday(df$date)-1)
... or, alternatively, add 1 to the previous result:
df$day = df$day + 1
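The same Monday-first renumbering can be sketched in base R with modular arithmetic on the POSIXlt wday component, which avoids the ifelse() branch. This is an illustrative alternative, not the method of the answer above; the names sun0, mon0 and mon1 are made up for the example.

```r
# A Wednesday, a Sunday and a Monday
d <- as.Date(c("2012-02-01", "2012-02-05", "2012-02-06"))

sun0 <- as.POSIXlt(d)$wday   # Sunday = 0, ..., Saturday = 6
mon0 <- (sun0 + 6) %% 7      # Monday = 0, ..., Sunday = 6
mon1 <- mon0 + 1             # Monday = 1, ..., Sunday = 7

data.frame(date = d, sun0, mon0, mon1)
```

Adding 6 and taking the remainder modulo 7 shifts every day back by one position and wraps Sunday around to the end of the week.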
This should do the trick
df = data.frame(date=c("2012-02-01", "2012-02-01", "2012-02-02"))
dow <- function(x) format(as.Date(x), "%A")
df$day <- dow(df$date)
df
#Returns:
date day
1 2012-02-01 Wednesday
2 2012-02-01 Wednesday
3 2012-02-02 Thursday
start = as.POSIXct("2017-09-01")
end = as.POSIXct("2017-09-06")
dat = data.frame(Date = seq.POSIXt(from = start,
to = end,
by = "DSTday"))
# see ?strptime for details of formats you can extract
# day of the week as numeric (Monday is 1)
dat$weekday1 = as.numeric(format(dat$Date, format = "%u"))
# abbreviated weekday name
dat$weekday2 = format(dat$Date, format = "%a")
# full weekday name
dat$weekday3 = format(dat$Date, format = "%A")
dat
# returns
Date weekday1 weekday2 weekday3
1 2017-09-01 5 Fri Friday
2 2017-09-02 6 Sat Saturday
3 2017-09-03 7 Sun Sunday
4 2017-09-04 1 Mon Monday
5 2017-09-05 2 Tue Tuesday
6 2017-09-06 3 Wed Wednesday
From JStrahl's comment: format(as.Date(df$date),"%w") gives the day of the week as a number (0-6, with Sunday as 0):
as.numeric(format(as.Date("2016-05-09"),"%w"))