I'm writing a function to plot, say, the price of any stock over the last 365 days. I mean calendar days, not trading days: a plain one-year period. So if I generate the chart on 15 February 2013, I want data from 16 February 2012 to 15 February 2013. How can I do that? last(), for example, gives me the last 365 trading days, i.e. days for which data is available.
library(quantmod)
getSymbols("AAPL")
plot(AAPL)
Use character subsetting.
AAPL[paste(Sys.Date()-365, Sys.Date(), sep="/")]
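Subtracting a fixed 365 days is off by one day whenever the window spans a 29 February. A minimal sketch of an alternative, assuming the same ISO-8601 range string is wanted: step back one calendar year with seq() and add a day. The helper name year_window is made up for illustration and is not part of quantmod or xts.

```r
# Hypothetical helper: build a "start/end" range string covering the
# one-year calendar window that ends on `end`.
year_window <- function(end = Sys.Date()) {
  # seq() with by = "-1 year" steps back one calendar year (not 365 days)
  start <- seq(end, by = "-1 year", length.out = 2)[2] + 1
  paste(start, end, sep = "/")
}

rng <- year_window(as.Date("2013-02-15"))
rng
# "2012-02-16/2013-02-15"
```

With the AAPL object from getSymbols() above, AAPL[year_window()] should then select the same kind of window as the paste() call.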
Related
I am trying to analyze time series data from the Wooldridge Econometrics book containing weekly data on the New York Stock Exchange, beginning in January 1976 and ending in 1989.
I have never worked with the ts() function before, but I already understand the general syntax. What I have difficulty with is how to define the frequency, since every 4th year has 366 days instead of 365. The book states that when a date fell on a holiday or weekend, the following day on which the stock exchange was open was used.
So how exactly do I deal with this problem when creating a time series object?
Here is a screenshot of the first rows of the data frame:
[Screenshot: data frame of nyse]
I have a question about declaring my dataset as a time series. I have weekly historical demand data for a certain product. Every week is labeled as 201401, 201402, and so on up to the current week, 201937.
The problem arises once I declare the set as ts <- ts(data, start=2014, frequency=52). Because a year averages 365.25 days, my set contains a week 201553. So from week 201601 onward, every week is shifted one week later, which causes problems when looking for seasonal patterns.
Do I have to delete week 201553 or what is the appropriate continuation?
If you set frequency=365.25/7, things should work out fine.
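A minimal sketch of that suggestion, using simulated numbers in place of the real demand data (the variable names and the rnorm() values are placeholders, not from the question):

```r
set.seed(1)
demand <- rnorm(300)  # placeholder weekly values, NOT real data

# 365.25 / 7 is the average number of weeks per year (about 52.18),
# so an occasional 53rd week no longer shifts the following years.
weekly <- ts(demand, start = 2014, frequency = 365.25 / 7)

frequency(weekly)
head(time(weekly), 3)
```

The time index then advances by 7/365.25 of a year per observation instead of 1/52.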
References
https://otexts.com/fpp2/ts-objects.html
https://www.rdocumentation.org/packages/stats/versions/3.6.1/topics/ts (interestingly, the trick with frequency=52.17857 is not mentioned here)
http://manishbarnwal.com/blog/2017/05/03/time_series_and_forecasting_using_R/
Hi, I am looking at commodity prices over a period of a few years. I want to summarize prices by work week, not by weeks defined as seven-day periods counted from Jan 1st. I tried:
data <- mutate(data, week = week(strptime(Date, "%m/%d/%Y")))
The lubridate week() function counts "1/13/10" (mdy) as week 2 and "1/14/10" as week 3. I want those to be in the same week: basically, any Mon-Fri run should count as one week. If the year starts on a Wednesday, I want week 1 to be Wed-Fri and week 2 to start the next Monday. I have no data on any weekends. Any thoughts? Thanks
This will give you week number assuming Date column is in Date format (you can use as.Date() to convert):
data <- mutate(data, week = format(Date, '%U'))
If you want week and year, you can use:
data <- mutate(data, week = format(Date, '%Y-%U'))
It will correctly number partial weeks.
Note: week numbers start at 00 (but that should be no problem).
You can also do it WITHOUT dplyr and its mutate, like this:
data$week <- format(data$Date, '%U')
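As a quick check of the numbering, here is a small sketch in base R using the two dates from the question plus two extra dates chosen for illustration (2010 started on a Friday):

```r
# "1/1/10" and "1/18/10" are added here for illustration only
dates <- as.Date(c("1/1/10", "1/13/10", "1/14/10", "1/18/10"),
                 format = "%m/%d/%y")

# %U: week of the year with Sunday as the first day of the week;
# days before the first Sunday fall in week "00"
wk <- format(dates, "%U")
wk
# "00" "02" "02" "03"
```

Wednesday 1/13 and Thursday 1/14 land in the same week, and the partial first week (Friday 1/1) gets its own number.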
I have an interesting problem arising. I have minute by minute data for 2014 of certain stocks and I want to analyze Fridays only and am using the code below to do so. It works great up until it gets to March. All of a sudden, Thursdays are being given a weekday value of 5 from the 4th line of the code below.
For example, 3/14/14 was this past Friday, however, the code below is setting 3/13/14 as Friday even though it was a Thurs.
My guess is this has something to do with leap years, but it is only a guess. Any idea what is causing this and how to fix it?
LNKD.csv, https://drive.google.com/file/d/0B4xAKSwsHiEBNVpEbHJGMU9QYXc/edit?usp=sharing
LNKD Clean.csv, https://drive.google.com/file/d/0B4xAKSwsHiEBVjBKcTM1VVg3aU0/edit?usp=sharing
data <- read.csv("LNKD.csv", stringsAsFactors=FALSE)
data$Up <- NULL
data$Down <- NULL
data$weekday <- as.POSIXlt(data$Date, format="%m/%d/%y")$wday
data <- subset(data, data$weekday==5)
write.csv(data, file="LNKD Clean.csv", row.names=FALSE)
Thank you.
It's because your date format uses '%y' and not '%Y'.
'%y' is the two-digit year ('14') but '%Y' is the 4-digit year, and your years have 4 digits.
e.g.
as.POSIXlt('03/13/2014', format="%m/%d/%y")
# "2020-03-13"
as.POSIXlt('03/13/2014', format="%m/%d/%Y")
# "2014-03-13"
All your dates are being interpreted as being in the year 2020, because the first two digits of '2014' are '20' and '%y' reads those as the two-digit year '20', i.e. 2020. The remaining '14' is ignored.
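This also accounts for the symptom in the question. 2014 and 2020 both start on a Wednesday, so the weekdays agree in January and February, but 2020 is a leap year, so from March onward every date falls one weekday later. A short sketch checking the weekday values directly:

```r
d <- "03/13/2014"

# Wrong format: parsed as 2020-03-13, which was a Friday
wrong_wday <- as.POSIXlt(d, format = "%m/%d/%y")$wday

# Right format: parsed as 2014-03-13, which was a Thursday
right_wday <- as.POSIXlt(d, format = "%m/%d/%Y")$wday

c(wrong = wrong_wday, right = right_wday)
# wrong right
#     5     4
```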
I have a data frame with two columns. Date, Gender
I want to change the Date column to the start of the week for each observation. For example, since Jun-28-2011 is a Tuesday, I'd like to change it to Jun-27-2011. Basically, I want to re-label the Date fields so that two data points in the same week have the same Date.
I also want to be able to do this bi-weekly, monthly and especially quarterly.
Update:
Let's use this as a dataset.
datset <- data.frame(date = as.Date("2011-06-28")+c(1:100))
One slick way to do this that I just learned recently is to use the lubridate package:
library(lubridate)
datset <- data.frame(date = as.Date("2011-06-28")+c(1:100))
#Add 1, since floor_date appears to round down to Sundays
floor_date(datset$date,"week") + 1
I'm not sure about how to do bi-weekly binning, but monthly and quarterly are easily handled with the respective base functions:
quarters(datset$date)
months(datset$date)
EDIT: Interestingly, floor_date from lubridate does not appear to be able to round down to the nearest quarter, but the function of the same name in ggplot2 does.
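If the goal is to relabel each date with the first day of its period, rather than get a quarter or month name, one possible base R sketch is below. It assumes weeks start on Monday; the example dates and variable names are chosen for illustration.

```r
d <- as.Date(c("2011-06-28", "2011-07-03", "2011-08-15"))
lt <- as.POSIXlt(d)

# wday is 0 for Sunday ... 6 for Saturday; (wday + 6) %% 7 is the
# number of days since the most recent Monday
week_start <- d - (lt$wday + 6) %% 7

# First day of the month
month_start <- as.Date(format(d, "%Y-%m-01"))

# First day of the quarter: mon is 0-based, so quarters start at months 1, 4, 7, 10
quarter_start <- as.Date(sprintf("%d-%02d-01",
                                 lt$year + 1900,
                                 3 * (lt$mon %/% 3) + 1))

data.frame(d, week_start, month_start, quarter_start)
```

Unlike floor_date(date, "week") + 1, this maps a Sunday (2011-07-03 here) back to the preceding Monday rather than forward to the next one.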
Look at ?strftime. In particular, the following formats:
%b: Abbreviated month name in the current locale. (Also matches full name on input.)
%B: Full month name in the current locale. (Also matches abbreviated name on input.)
%m: Month as decimal number (01–12).
%W: Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.
eg:
> strftime("2011-07-28","Month: %B, Week: %W")
[1] "Month: July, Week: 30"
> paste("Quarter:",ceiling(as.integer(strftime("2011-07-28","%m"))/3))
[1] "Quarter: 3"