I am new to R and have just started using it. I have three years of weekly data that I want to decompose into trend, seasonal and other components. I have the following questions:
Which function should I use: ts() or decompose()?
How do I deal with leap years?
Please correct me if I am wrong, but I believe the frequency is 52.
Thanks in advance. I would really appreciate any kind of help.
Welcome to R!
Yes, the frequency is 52.
If the data is not already classed as a time series, you will need both ts() and decompose(): ts() creates the time-series object and decompose() splits it into components. To find the class of the dataset, use class(data). If it returns "ts", your data is already a time series as far as R is concerned. If it returns something else, like "data.frame", you will need to convert it. Assign the result of ts(data, frequency = 52) to a variable and check the class again to make sure.
There is a monthly time-series dataset, sunspot.month, that ships with R and that you can practice on. Here's an example. You can also read the help file for decompose by typing ?decompose.
> class(sunspot.month)
[1] "ts"
> decomp <- decompose(sunspot.month)
> summary(decomp)
         Length Class  Mode
x        2988   ts     numeric
seasonal 2988   ts     numeric
trend    2988   ts     numeric
random   2988   ts     numeric
figure     12   -none- numeric
type        1   -none- character
> names(decomp)
[1] "x" "seasonal" "trend" "random" "figure" "type"
> plot(decomp) # to see the plot of the decomposed time-series
The call to names indicates that you can also access the individual component data. This can be done with the $ operator. For example, if you want to look at the seasonal component only, use decomp$seasonal.
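For the weekly case in the question, a minimal sketch might look like the following. The data are simulated, and the column name value and the start year 2015 are assumptions made purely for illustration. Note that a calendar year is about 52.18 weeks long, so frequency = 52 is an approximation.

```r
# Simulated stand-in for three years of weekly data (hypothetical values)
set.seed(42)
n_weeks <- 3 * 52
week_id <- seq_len(n_weeks)
weekly <- data.frame(
  value = 10 + 0.05 * week_id + 2 * sin(2 * pi * week_id / 52) +
    rnorm(n_weeks, sd = 0.5)
)
class(weekly)        # a data.frame, so it must be converted first

# Convert to a time series with 52 observations per year
weekly_ts <- ts(weekly$value, start = c(2015, 1), frequency = 52)
class(weekly_ts)     # now a ts object

# Split into trend, seasonal and random components
decomp_weekly <- decompose(weekly_ts)
head(decomp_weekly$seasonal)
```

decompose() needs at least two full periods, so three years of weekly data is enough, although the trend estimate is missing for roughly the first and last half year.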
I have this data set which consists of two attributes, i.e. Year (2016, 2017, 2018) and Month (JAN to DEC). It contains the average sales value for every month of 2016, 2017 and 2018. When I import this data set, it is a "data.frame". However, I want it to be a "ts", so I ran this command
data.ts<- as.ts(myData)
to convert my data into "ts". The result is as follows:
class(data.ts)
[1] "mts" "ts" "matrix"
Now, I want my data set to be "ts" only, meaning that class(data.ts) should return just "ts". How can I convert my data to "ts" only? And do "mts" and "matrix" matter or not?
Also, when I plot my data using the command
plot(data.ts)
It shows a plot in which Time is on the x-axis while Year and Sales are on the y-axis. What I want instead is a graph with the Year on the x-axis and the monthly Sales values on the y-axis.
How do I arrange my data so that it is already a ts when I import it? Or is there another way to do it? Also, how do I arrange the data set so that the Year is on the x-axis by default? I'm really confused, as all the videos I have seen on YouTube have their data already in "ts" and their plots show the Year on the x-axis. I hope I have made myself clear. Any help would be appreciated.
How can I plot the graph such that Year is on x axis?
Reorder the data in a single variable:
data=as.matrix(data)
data= as.data.frame(t(data))
names(data)=c('x2016','x2017','x2018')
series=c(data$x2016,data$x2017,data$x2018)
Then set the time index according to the start point and the frequency of the data. In your case it looks like monthly data starting in 2016, hence:
data.ts=ts(series ,start=c(2016,1), frequency=12)
plot(data.ts)
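As a self-contained sketch of the same idea, assume the imported table has one row per year and one column per month. The object name sales and its values are made up for illustration.

```r
# Hypothetical wide table: rows are years, columns are months
set.seed(1)
sales <- matrix(
  round(runif(36, 100, 200)),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("2016", "2017", "2018"), month.abb)
)

# Flatten year by year: t() puts each year in a column,
# and as.vector() then reads the columns in order
series <- as.vector(t(sales))

# A single series gives class "ts" only (no "mts" or "matrix")
sales_ts <- ts(series, start = c(2016, 1), frequency = 12)
class(sales_ts)

# The x-axis now runs over the years 2016 to 2018
plot(sales_ts, xlab = "Year", ylab = "Average sales")
```

The "mts" and "matrix" classes appear when a table with several columns is converted at once. Passing a single numeric vector to ts() avoids them.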
I am trying to import time series data in R with the code below. The data runs from 1-7-2014 to 30-4-2017, which makes 1035 data points. But when I use the code below, I get 1093 observations.
series <- ts(data1, start=c(2014,7,1), end=c(2017,4,30), frequency = 365)
Can someone help me understand where I am going wrong?
ts doesn't accept start and end in this form. Only a single number or a vector of two integers is allowed. In the second case the values are the year and the day number within that year, counting from 1 January.
With the help of lubridate you can use the following. decimal_date converts a date to a decimal (fractional) year, which is suitable for ts.
library(lubridate)
series <- ts(data1, start=decimal_date(as.Date("2014-07-01")), end=decimal_date(as.Date("2017-04-30") + 1), frequency = 365)
> length(series)
[1] 1035
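The two-integer form of start can also be sketched in base R alone. In this illustration the values vector is simulated, and end is left out so that the length is set by the data itself.

```r
# All calendar days in the stated range
dates <- seq(as.Date("2014-07-01"), as.Date("2017-04-30"), by = "day")
length(dates)

# Simulated stand-in for the real observations (hypothetical values)
set.seed(7)
values <- rnorm(length(dates))

# Day number within the year of the first observation
start_day <- as.integer(format(dates[1], "%j"))

# start = c(year, day of year); omitting `end` keeps every observation
series <- ts(values, start = c(2014, start_day), frequency = 365)
length(series)
```

Because frequency = 365 ignores the leap day in 2016, the time labels drift by one day after February 2016. If exact dates matter, a date-indexed class such as zoo or xts may be a better fit.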
I have a time series object (ts / mts) called "mydata".
(The dates go from 1980 to 2014)
> class(mydata)
[1] "mts" "ts" "matrix"
> colnames(mydata)
[1] "inflation" "unemployment"
equation1 = lm(inflation ~ unemployment + lag(unemployment, 1), data = mydata)
Two questions:
1. Have I specified the lag() correctly? I seem to get lots of NA's.
2. How do I get the residuals to keep the same dates as the time-series?
(i.e: "1981 to 2014" instead of just "1 to 34")
You can try printing both unemployment and the lagged unemployment to see whether anything unusual is happening; otherwise the function specification looks fine to me.
You can use cbind(mydata, equation1$residuals) to bind the residuals to the rest of your time series, so that they share the same time index.
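One thing worth checking is that lm() works on the underlying numbers and ignores time-series attributes, so stats::lag() inside a formula may leave the values unchanged. A possible sketch that aligns the series first with ts.intersect() is shown below. The data are simulated, and the column name unemployment_lag1 is an assumption.

```r
# Simulated annual data for 1980 to 2014 (hypothetical values)
set.seed(1)
mydata <- ts(
  cbind(inflation = rnorm(35), unemployment = rnorm(35)),
  start = 1980
)

# stats::lag(x, -1) moves the series one year later, so each year is
# paired with the previous year's value; ts.intersect() keeps only the
# years present in every series
aligned <- ts.intersect(
  mydata[, "inflation"],
  mydata[, "unemployment"],
  stats::lag(mydata[, "unemployment"], -1)
)
colnames(aligned) <- c("inflation", "unemployment", "unemployment_lag1")

fit <- lm(inflation ~ unemployment + unemployment_lag1,
          data = as.data.frame(aligned))

# Put the residuals back on the time axis of the aligned data
res <- ts(residuals(fit), start = start(aligned),
          frequency = frequency(aligned))
time(res)
```

The first year is lost to the lag, so the residuals run from 1981 to 2014.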
I have a problem with the time-series designation of vectors in R.
I work with time series, and when I want to assign a vector to a certain period I feel quite confident about how to do it. I simply use name <- ts(name, frequency=12, start=c(2007,1)). As you can see, I have monthly data.
I am making an R template for colleagues to use, and I want them to be able to carry out a recursive ARIMA regression from any given starting point. That is, I have a range of in-sample predicted values and I want to designate a start value that is n monthly observations after 2007 (or whatever start date is used), where n is the start value of the recursive regression.
first and last from the xts time-series package do exactly what you want, e.g. to get the first 2 months of an object x:
first(x, '2 months')
or the last 6 weeks:
last(x, '6 weeks')
Valid period.types are: secs, seconds, mins, minutes, hours, days, weeks, months, quarters, and years. As always you can find much more detailed information using ?xts::first.
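As an alternative that stays with plain ts objects, base R's window() can start a series n observations after its first one. This is a different technique from the xts functions described above, and the series x and the value of n here are made up for illustration.

```r
# Hypothetical monthly series starting in January 2007
x <- ts(seq_len(60), frequency = 12, start = c(2007, 1))

# Number of monthly observations to skip (chosen by the user)
n <- 18

# Time stamp of observation n + 1, then cut the series from there
start_time <- time(x)[n + 1]
x_sub <- window(x, start = start_time)

start(x_sub)    # year and month where the shortened series begins
length(x_sub)   # 60 - n observations remain
```

Because the cut point is taken from time(x), the same code works whatever start date the original series uses.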
I have an hourly time series and would like to interpolate sub-hourly values, e.g. every 15 minutes. Linear interpolation will do, but if there is a way to specify Gaussian or polynomial interpolation, that would be great.
For example, if I have
a<-c(4.5,7,3.3), which is the data for the first three hours, how can I get 15-minute sub-hourly data, a total of 9 values in this case? I have been using the approx function and studying the zoo package and still don't know how to do it. Thank you very much!
How about this:
b<-xts(c(4.5,7,3.3), order.by=as.POSIXct(c('2013-07-26 0:00',
'2013-07-26 2:00',
'2013-07-26 3:00')))
approx(b, n=13)
adjusting n for the appropriate time interval?
With the xts package (which loads zoo), you can use either na.approx or na.spline.
1. Coerce your time series to an xts object.
2. Create a new index with 15-minute intervals.
3. Use this new index to create an empty xts object and merge it with your object.
4. Fill in the missing values using na.approx for a linear approximation or na.spline for a polynomial (spline) one.
Here is a complete example:
library(xts)
set.seed(21)
## you create the xts object
x <- xts(rnorm(10),
         seq(from=as.POSIXct(Sys.Date()),
             length.out=10,
             by=as.difftime(1,units='hours')))
## new index to be used
new.index <-
  seq(min(index(x)),max(index(x)), by=as.difftime(15,units='mins'))
## linear approx
na.approx(merge(x,xts(NULL,new.index)))
## polynomial approx
na.spline(merge(x,xts(NULL,new.index)))
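For the three-value example in the question, the same idea can be sketched in base R with approx() and spline(). This assumes the three observations fall at hours 0, 1 and 2, which is an interpretation of the question rather than something it states.

```r
# The three hourly observations from the question
a <- c(4.5, 7, 3.3)
hours <- 0:2                      # assumed observation times, in hours

# Target grid: every 15 minutes from hour 0 to hour 2 (9 points)
quarter_hours <- seq(0, 2, by = 0.25)

# Linear interpolation onto the 15-minute grid
lin <- approx(x = hours, y = a, xout = quarter_hours)$y
lin

# Cubic spline interpolation as a smoother alternative
spl <- spline(x = hours, y = a, xout = quarter_hours)$y
spl
```

Both results contain the original values at positions 1, 5 and 9. Only the points in between differ.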