Using Forecasting for Daily Data - r

I am forecasting in R with the forecast package, using the code below:
library(forecast)
t1$StartDate <- as.Date(t1$StartDate, origin = "1899-12-30")
## 10,1 indicates 10th Week & Sunday
ordervalu_ts <- ts(t1$Revenue, start = c(10,1), frequency = 7)
print(ordervalu_ts)
ordervalu_ts_decom <- HoltWinters(ordervalu_ts)
print(ordervalu_ts_decom)
ordervalu_ts_for <- forecast(ordervalu_ts_decom, h = 30)
print(ordervalu_ts_for)
t1 is the input file. It has two columns: Date and Revenue. I am trying to forecast the Revenue for the next 30 days. I am able to get output, but the dates in the output are not in the right format. I have the following questions:
Start date: I want it to be dynamic rather than static (i.e., the start should be taken from the "Date" column).
Output: the forecast does not show the actual date for each prediction (it shows 81.42857 instead of the first predicted date), as shown below.
print(ordervalu_ts_for)
         Point Forecast    Lo 80    Hi 80       Lo 95    Hi 95
81.42857      1390.4782 368.3917 2412.565 -172.668266 2953.625
81.57143      1351.3890 328.9055 2373.872 -212.364558 2915.142
81.71429      1355.7625 332.8507 2378.674 -208.646034 2920.171
Can someone help? I have tried reviewing the videos on YouTube and other online material. Thanks for your help.
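A minimal sketch of one way to address both points, using only base R (stats::HoltWinters and predict rather than the forecast package). The toy t1, the fixed smoothing parameters, and the names rev_ts, fit, and out are assumptions for illustration; drop alpha, beta, and gamma to let HoltWinters estimate them as in the question. Row labels such as 81.42857 are ts time units (week number plus a fraction of a week), not dates, so the calendar dates are rebuilt from the last observed date, assuming one row per consecutive day.

```r
# Toy stand-in for t1 (dates and revenue are made up for illustration)
set.seed(1)
t1 <- data.frame(StartDate = seq(as.Date("2023-03-05"), by = "day", length.out = 70))
t1$Revenue <- 1000 + 200 * sin(2 * pi * seq_len(70) / 7) + rnorm(70, sd = 50)

# Derive the start from the data instead of hard-coding c(10, 1)
first_day  <- min(t1$StartDate)
start_week <- as.integer(format(first_day, "%U"))      # week of year, weeks start on Sunday
start_dow  <- as.integer(format(first_day, "%w")) + 1  # 1 = Sunday ... 7 = Saturday
rev_ts <- ts(t1$Revenue, start = c(start_week, start_dow), frequency = 7)

# Fixed smoothing parameters are an assumption that keeps the sketch deterministic
fit  <- HoltWinters(rev_ts, alpha = 0.3, beta = 0.1, gamma = 0.2)
pred <- predict(fit, n.ahead = 30, prediction.interval = TRUE, level = 0.95)

# Attach real calendar dates: the 30 days following the last observation
out <- data.frame(
  Date     = seq(max(t1$StartDate) + 1, by = "day", length.out = 30),
  Forecast = as.numeric(pred[, "fit"]),
  Lo95     = as.numeric(pred[, "lwr"]),
  Hi95     = as.numeric(pred[, "upr"])
)
head(out)
```

The same date column can be bound to the output of forecast(): the point forecasts and intervals keep their order, so only the row labels change.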

Related

Converting a data-frame into time series in R

I have 3 years of daily data in a column and need to write the code in R to convert the data-frame into a time series object but I am unsure of the coding. I attach the raw data. I was wondering whether to set the frequency to monthly or leave it daily, or whether to adapt the raw data to make it more user friendly in R. Any advice/help would be appreciated.
Thanks
Martin.
I couldn't get the code to run. I then changed the frequency to 1 (yearly) and ts() accepted the data, but the result does not give the full picture.
This is the R code
install.packages("readxl")
install.packages("forecast")
install.packages("tseries")
library(readxl)
library(forecast)
library(tseries)
asb <- read_excel("C://Users//BCCAMNHY//OneDrive - Birmingham City Council//HomeFiles//My Documents//DATA ANALYST TRAINING//PROJECT 4//PROJECT DOCUMENTS//ASB_311022.xlsx")
View(asb)
class(
asbtime=ts(asb$`ASB Submitted`,start = min(asb$`Date for R`,end = max(asb$`Date for R`),frequency = 12)
class(asbtime)
library(forecast)
library(tseries)
plot(asbtime)
acf(asbtime)
pacf(asbtime)
adf.test(asbtime)
gdpmodel=auto.arima(gdptime,ic="aic",trace = TRUE) ## dont understand this line of code
acf(ts(asb$residuals)) # not sure if this code should be changed to asb$asb submitted
pacf(ts(asb$residuals))# as above
myasbforecast=forecast(asbmodel,level = c(95),h=10*4) ##### Don't understand this line of code. Want a monthly or daily forecast - think ideally monthly
mygdpforecast
plot(asbforecast)
Box.test(myasbforecast$resid, lag=5, type= "Ljung-Box")
Box.test(mygdpforecast$resid, lag=15, type= "Ljung-Box")
Box.test(myasbforecast$resid, lag=25, type= "Ljung-Box")
An extract of the raw data is:
Submitted Count of Submitted
01/03/2019 1
02/03/2019 0
03/03/2019 0
04/03/2019 0
05/03/2019 1
06/03/2019 0
07/03/2019 1
08/03/2019 2
09/03/2019 0
10/03/2019 0
11/03/2019 27
12/03/2019 54
13/03/2019 52
14/03/2019 46
15/03/2019 44
In your example, the names of the data columns do not match those used in the code. I think it's a coincidence but check it out anyway.
IMHO, this will be enough for the conversion into ts:
asbtime=ts(asb$`Count of Submitted`, start=2019, frequency = 365)
plot(forecast(asbtime), xlab = "year", ylab="Submitted")
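As a hedged refinement of the conversion above: start = 2019 places the first observation on 1 January, while the extract begins on 01/03/2019. The sketch below assumes the dates are in dd/mm/yyyy format and that there is one row per day; the toy asb data frame and the column names Submitted and Count are stand-ins, not the real spreadsheet. It also shows one way to build the monthly series the question mentions.

```r
# Toy stand-in for the extract shown in the question
asb <- data.frame(
  Submitted = seq(as.Date("01/03/2019", format = "%d/%m/%Y"), by = "day", length.out = 15),
  Count     = c(1, 0, 0, 0, 1, 0, 1, 2, 0, 0, 27, 54, 52, 46, 44)
)

# Daily series: start at (year, day of year) of the first observation
first_day <- min(asb$Submitted)
asbtime <- ts(
  asb$Count,
  start = c(as.integer(format(first_day, "%Y")), as.integer(format(first_day, "%j"))),
  frequency = 365
)

# Monthly series: sum the daily counts per calendar month first
month_key      <- format(asb$Submitted, "%Y-%m")
monthly_totals <- tapply(asb$Count, month_key, sum)
asb_monthly <- ts(
  as.numeric(monthly_totals),
  start = c(as.integer(format(first_day, "%Y")), as.integer(format(first_day, "%m"))),
  frequency = 12
)
```

With three years of data the monthly series has about 36 points, which is usually easier to model than a daily series with many zeros.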

Trying to extract the date of 52 weeks high and low for stocks

Both maxMATX and maxZIM return no observation, which I am very confused about.
Here is the code
library(tseries)
# tseries has all the financial data, hence we need to load it
data.ZIM <- get.hist.quote("ZIM")
data.MATX <- get.hist.quote("MATX")
data.ZIM <- data.ZIM[Sys.Date()-0:364]
data.MATX <- data.MATX[Sys.Date()-0:364]
head(data.ZIM)
head(data.MATX)
min(data.ZIM$Close)
max(data.ZIM$Close)
minZIM=data.ZIM[data.ZIM$Close==24.34]
maxZIM=data.ZIM[data.ZIM$Close==88.62]
data.ZIM[data.ZIM$Close==88.62]
minZIM
maxZIM
min(data.MATX$Close)
max(data.MATX$Close)
minMATX=data.MATX[data.MATX$Close==60.07,]
maxMATX=data.MATX[data.MATX$Close==121.47,]
minMATX
maxMATX
I was trying to extract the data with tseries and had difficulty printing the matching row; specifically, I was trying to find the dates on which the 52-week low and high occurred.
Use which.min and which.max to find indexes of minimum and maximum close and use those to look up the time.
library(tseries)
data.ZIM <- get.hist.quote("ZIM", start = Sys.Date() - 364)
tmin <- time(data.ZIM)[which.min(data.ZIM$Close)]; tmin
## [1] "2021-03-31"
data.ZIM[tmin]
## Open High Low Close
## 2021-03-31 24.75 24.99 24.15 24.34
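To illustrate the same idea without downloading quotes, here is a small base-R sketch. The quotes data frame and its prices are made up. It also shows one possible reason the == comparison in the question matches nothing: the printed close may be rounded while the stored value carries more digits. That is an assumption about the downloaded data, not something confirmed here.

```r
# Toy closing prices (made-up values, not real quotes)
quotes <- data.frame(
  Date  = seq(as.Date("2021-01-04"), by = "day", length.out = 6),
  Close = c(30.1200001, 24.3400002, 41.5, 88.6199989, 60.25, 55.0)
)

# Comparing against the rounded value that print() shows selects no rows here
nrow(quotes[quotes$Close == 88.62, ])

# which.min / which.max return the row position, so no value has to be retyped
low  <- quotes[which.min(quotes$Close), ]
high <- quotes[which.max(quotes$Close), ]
low$Date
high$Date
```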

What will be the frequency , start and end of ts function in Time series with observations in milliseconds per day

I have tried the solutions from previous posts. I am using R for time series analysis. My data has one data point every 100 milliseconds throughout the day, and the total time span is 2 months. What should the frequency be in this case? I used 864000. If I want start to be the starting time including the milliseconds part, how can I do that?
var starttime
1 281.9110 2020-08-10 08:35:04.000
2 281.9110 2020-08-10 08:35:04.100
3 281.9110 2020-08-10 08:35:04.200
4 281.8511 2020-08-10 08:35:04.300
5 281.7913 2020-08-10 08:35:04.400
The data has been collected for a month now, with a data point arriving every 100 milliseconds, so it is already large. I tried the code below but ended up with the output shown underneath:
newdf_Long<-newdf_Long[,1]
mymts = ts(newdf_Long,start =(as.POSIXct("2020-08-10 08:35:04.000",format="%Y-%m-%d %H:%M:%OS")),
end = (as.POSIXct("2020-09-07 13:01:41.000",format="%Y-%m-%d %H:%M:%OS")))
Time Series:
Start = 1597041304
End = 1599476501
Frequency = 1
[1] 136.5843 135.5664 133.5305
I want to know whether I am giving the right start and end to the ts function, and what the frequency should be in this case. After that, when I tried to decompose the series, I got:
"Error in decompose(mymts) : time series has no or less than 2 periods"
My objective with this data is to forecast drift on a daily or weekly basis.
Can anyone please guide me? I am a beginner.
Thanks
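A hedged sketch of the arithmetic and of the decompose error, assuming one "period" is one day. At 10 observations per second a day holds 10 * 60 * 60 * 24 = 864000 observations, which matches the value tried above. The output shown has Frequency = 1, and decompose refuses any series whose frequency is 1 or that covers fewer than two full periods. The demo below uses a made-up period of 20 observations so it runs quickly; x, no_freq, and with_freq are illustrative names.

```r
# Observations per day at one reading every 100 ms
obs_per_second <- 10
obs_per_day <- obs_per_second * 60 * 60 * 24

# Small stand-in series: pretend one "day" has only 20 observations
set.seed(42)
period <- 20
x <- sin(2 * pi * seq_len(5 * period) / period) + rnorm(5 * period, sd = 0.1)

# Without a frequency, ts() uses frequency = 1 and decompose() stops with an error
no_freq <- ts(x)
res <- tryCatch(decompose(no_freq), error = function(e) conditionMessage(e))
res

# With the frequency set and at least two full periods, decompose() works
with_freq <- ts(x, frequency = period)
parts <- decompose(with_freq)
```

The time index of a ts object is numeric, so it may be simpler to keep the POSIXct timestamps in their own column and, for a daily or weekly forecast, aggregate the readings to a coarser interval before building the series.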

The data in the time series is different from the data I entered. How do I get outputs in a similar scale as my inputs?

I had a column of data as follows:
141523
146785
143667
65560
88524
148422
151664
.
.
.
.
I used the ts() function to convert this data into a time series.
{
Aclines <- read.csv(file.choose())
Aclinests <- ts(Aclines[[1]], start = c(2013), end = c(2015), frequency = 52)
}
head(Aclines) gives me the following output:
X141.523
1 146785
2 143667
3 65560
4 88524
5 148422
6 151664
head(Aclinests) gives me the following output:
[1] 26 16 83 87 35 54
The output of all my further analysis, including graphs and predictions, is on the same scale as the head(Aclinests) output. How can I get the outputs back on the scale of the original input data? Am I missing something while converting the data to a ts?
It is typically recommended to provide a reproducible example (see "How to make a great R reproducible example?"). But I will try to help based on what I'm reading. If it isn't helpful, I'll delete the post.
First, read.csv defaults to header = TRUE. It doesn't look like you have a header in your file. Also, it looks like R is reading the data in as factors instead of numeric.
So you can try a couple of extra parameters when reading the file:
Aclines <- read.csv(file.choose(), header=FALSE, stringsAsFactors=FALSE)
Then to get your time series
Aclinests <- ts(Aclines[, 2], start = c(2013), end = c(2015), frequency = 52)
Since your data looks like it has 2 columns, this will read the second column of your data frame into a ts object.
Hope this helps.
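To show why factors produce small numbers like 26 16 83, here is a short sketch using the values from the question. It assumes the column was read as a factor; raw, as_factor, values, and weekly are illustrative names. Converting a factor straight to a number returns the internal level codes, so the conversion has to go through character first.

```r
# The sample values from the question, held as a factor
raw <- c("141523", "146785", "143667", "65560", "88524", "148422", "151664")
as_factor <- factor(raw)

# Level codes: positions in the sorted list of levels, not the data
as.integer(as_factor)

# Going through character recovers the original scale
values <- as.numeric(as.character(as_factor))
values

# The series built from the recovered values keeps that scale
weekly <- ts(values, start = 2013, frequency = 52)
```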

Select a value from time series by date in R

How can I select the value in a time series that corresponds to a given date?
I create a monthly time series object with the command:
producers.price <- ts(producers.price, start=2012+0/12, frequency=12)
Then I try the following:
value <- producers.price[as.Date("01.2015", "%m.%Y")]
But this doesn't do what I want, and value is equal to
[1] NA
instead of 10396.8212805739, where producers.price is:
producers.price <- structure(c(7481.52109434237, 6393.18959031561, 6416.63065650718,
5672.08354710121, 7606.24186413516, 5201.59247092013, 6488.18361474813,
8376.39182893415, 9199.50916585545, 8261.87133079494, 8293.8195347453,
8233.13630279516, 7883.17272003961, 7537.21001580393, 6566.60260432381,
7119.99345843556, 8086.40101607729, 9125.11104610046, 10134.0228610828,
10834.5732454454, 9410.35031874371, 9559.36933274129, 9952.38679679724,
10390.3628690951, 11134.8432864557, 11652.0075507499, 12626.9616107684,
12140.6698452193, 11336.8315981684, 10526.0309052316, 10632.1492109584,
8341.26367412737, 9338.95688558448, 9732.80173656971, 10724.5525831506,
11272.2273444623, 10396.8212805739, 10626.8428853062, 11701.0802817581,
NA), .Tsp = c(2012, 2015.25, 12), class = "ts")
I had a similar problem and looked all over for a solution. My solution is not as elegant as I would have liked, but it works. I tried it with your data and it seems to give the right result.
Explanation
It turns out that in R the values of a time series are stored as a plain sequence indexed from 1, not by their time stamps. For example, if you have a yearly time series that starts in 1950 and ends in 1960, the value for 1950 is ts[1] and the value for 1960 is ts[11].
Based on this logic, you need to subtract the start of the series from the date you want (in units of the series' frequency) and add 1 to get the position of the value.
This code in R gives you the result you expect (as.yearmon comes from the zoo package):
producers.price[((as.yearmon("2015-01")- as.yearmon("2012-01"))*12)+1]
If you need help with the time calculations, see the answer to "Get the difference between dates in terms of weeks, months, quarters, and years". You will need the zoo and lubridate packages.
Hope it helps :)
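The same position arithmetic can be sketched in base R without zoo. The helper ts_index is hypothetical (not part of any package), and x is a made-up stand-in for producers.price. Working with whole years and months avoids relying on a fractional difference being exactly an integer.

```r
# Small monthly series standing in for producers.price (values are made up)
x <- ts(1:40, start = c(2012, 1), frequency = 12)

# Hypothetical helper: position of a (year, month) pair in a monthly series
ts_index <- function(series, year, month) {
  first <- start(series)  # c(year, month) of the first observation
  (year - first[1]) * frequency(series) + (month - first[2]) + 1
}

i <- ts_index(x, 2015, 1)
i
x[i]
```

For January 2015 in a series starting in January 2012 this gives position 37, the same element the as.yearmon expression above selects.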
1) window.ts
The window.ts function is used to subset a "ts" time series by a time window. The window command produces a time series with one data point and the [[1]] makes it a straight numeric value:
window(producers.price, start = 2015 + 0/12, end = 2015 + 0/12)[[1]]
## [1] 10396.82
2) zoo We can alternately convert it to zoo and subscript it by a yearmon class variable and then use [[1]] or coredata to convert it to a plain number or we can use window.zoo much as we did with window.ts :
library(zoo)
as.zoo(producers.price)[as.yearmon("2015-01")][[1]]
## [1] 10396.82
coredata(as.zoo(producers.price)[as.yearmon("2015-01")])
## [1] 10396.82
window(as.zoo(producers.price), 2015 + 0/12 )[[1]]
## [1] 10396.82
coredata(window(as.zoo(producers.price), 2015 + 0/12 ))
## [1] 10396.82
3) xts The four lines in (2) also work if library(zoo) is replaced with library(xts) and as.zoo is replaced with as.xts.
Looking for a simple command, one line and no library needed?
You might try this.
as.numeric(window(producers.price, start = c(2015, 1), end = c(2015, 1)))

Resources