Select a value from time series by date in R - r

How do I select the value in a time series that corresponds to a given date?
I create a monthly time series object with the command:
producers.price <- ts(producers.price, start=2012+0/12, frequency=12)
Then I try the following:
value <- producers.price[as.Date("01.2015", "%m.%Y")]
But this doesn't do what I want; value is equal to
[1] NA
instead of 10396.8212805739, given that producers.price is:
producers.price <- structure(c(7481.52109434237, 6393.18959031561, 6416.63065650718,
5672.08354710121, 7606.24186413516, 5201.59247092013, 6488.18361474813,
8376.39182893415, 9199.50916585545, 8261.87133079494, 8293.8195347453,
8233.13630279516, 7883.17272003961, 7537.21001580393, 6566.60260432381,
7119.99345843556, 8086.40101607729, 9125.11104610046, 10134.0228610828,
10834.5732454454, 9410.35031874371, 9559.36933274129, 9952.38679679724,
10390.3628690951, 11134.8432864557, 11652.0075507499, 12626.9616107684,
12140.6698452193, 11336.8315981684, 10526.0309052316, 10632.1492109584,
8341.26367412737, 9338.95688558448, 9732.80173656971, 10724.5525831506,
11272.2273444623, 10396.8212805739, 10626.8428853062, 11701.0802817581,
NA), .Tsp = c(2012, 2015.25, 12), class = "ts")

So, I had/have a similar problem and was looking all over to solve it. My solution is not as great as I'd have wanted it to be, but it works. I tried it out with your data and it seems to give the right result.
Explanation
It turns out that in R, time series data is really stored as a plain sequence indexed from 1, not by your time values. E.g. if you have a time series that starts in 1950 and ends in 1960 with one observation per year, the Y at 1950 will be ts[1] and the Y at 1960 will be ts[11].
Based on this logic, you need to take the target date minus the start of the series, counted in periods, and add 1 to get the position of the value you want.
This code in R gives you the result you expect.
producers.price[round((as.yearmon("2015-01") - as.yearmon("2012-01")) * 12) + 1]
If you need help with the time calculations, check this answer: Get the difference between dates in terms of weeks, months, quarters, and years
You will need the zoo and lubridate packages.
Hope it helps :)
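The same index arithmetic can be sketched in base R without zoo, reading the start and frequency from tsp(). The helper name ts_value_at is made up for this illustration, and it assumes a monthly series (frequency 12) so that "month" has its usual meaning:

```r
# Hypothetical helper: value of a monthly "ts" at a given year and month
ts_value_at <- function(x, year, month) {
  tsp_x <- tsp(x)  # c(start, end, frequency)
  # number of periods between the series start and the target, plus 1
  idx <- round((year + (month - 1) / tsp_x[3] - tsp_x[1]) * tsp_x[3]) + 1
  if (idx < 1 || idx > length(x)) return(NA_real_)  # outside the series
  as.numeric(x[idx])
}

# Toy series whose values equal their own positions
x <- ts(1:40, start = c(2012, 1), frequency = 12)
ts_value_at(x, 2015, 1)  # position 37
```

The round() is there because differences of fractional years are not always exact in floating point, and a non-integer index would be truncated.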

1) window.ts
The window.ts function is used to subset a "ts" time series by a time window. The window command produces a time series with one data point and the [[1]] makes it a straight numeric value:
window(producers.price, start = 2015 + 0/12, end = 2015 + 0/12)[[1]]
## [1] 10396.82
2) zoo We can alternately convert it to zoo and subscript it by a yearmon class variable and then use [[1]] or coredata to convert it to a plain number or we can use window.zoo much as we did with window.ts :
library(zoo)
as.zoo(producers.price)[as.yearmon("2015-01")][[1]]
## [1] 10396.82
coredata(as.zoo(producers.price)[as.yearmon("2015-01")])
## [1] 10396.82
window(as.zoo(producers.price), 2015 + 0/12 )[[1]]
## [1] 10396.82
coredata(window(as.zoo(producers.price), 2015 + 0/12 ))
## [1] 10396.82
3) xts The four lines in (2) also work if library(zoo) is replaced with library(xts) and as.zoo is replaced with as.xts.

Looking for a simple command, one line and no library needed?
You might try this.
as.numeric(window(producers.price, start = c(2015, 1), end = c(2015, 1)))

Related

converting bi-monthly Julian days to date in raster image

I have a few raster images that represent bi-monthly data. I want to convert the bi-monthly data into monthly data by averaging each pair of images.
There are 23 images in total (single band, i.e. single layer). If I stack the images using stack() on the output of list.files, for some reason it reads 46 layers, but if I open each raster image individually with the raster function and then stack them, it reads only 23 layers.
I opened the images individually and then stacked them, but when I convert the bi-monthly Julian days to dates, they are not read correctly after the 4th month.
library(raster)
setwd("F:/LANDSAT-NDVI/testAverage")
x1<-raster("landsatNDVISC05SLC2000001.tif")
x2<-raster("landsatNDVISC05SLC2000017.tif")
x3<-raster("landsatNDVISC05SLC2000033.tif")
x4<-raster("landsatNDVISC05SLC2000049.tif")
x5<-raster("landsatNDVISC05SLC2000065.tif")
x6<-raster("landsatNDVISC05SLC2000081.tif")
x7<-raster("landsatNDVISC05SLC2000097.tif")
x8<-raster("landsatNDVISC05SLC2000113.tif")
x9<-raster("landsatNDVISC05SLC2000129.tif")
x10<-raster("landsatNDVISC05SLC2000145.tif")
x11<-raster("landsatNDVISC05SLC2000161.tif")
x12<-raster("landsatNDVISC05SLC2000177.tif")
x13<-raster("landsatNDVISC05SLC2000193.tif")
x14<-raster("landsatNDVISC05SLC2000209.tif")
x15<-raster("landsatNDVISC05SLC2000225.tif")
x16<-raster("landsatNDVISC05SLC2000241.tif")
x17<-raster("landsatNDVISC05SLC2000257.tif")
x18<-raster("landsatNDVISC05SLC2000273.tif")
x19<-raster("landsatNDVISC05SLC2000289.tif")
x20<-raster("landsatNDVISC05SLC2000305.tif")
x21<-raster("landsatNDVISC05SLC2000321.tif")
x22<-raster("landsatNDVISC05SLC2000337.tif")
x23<-raster("landsatNDVISC05SLC2000353.tif")
data <- stack(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23)
julday <-c("landsatNDVISC05SLC2000001.tif","landsatNDVISC05SLC2000017.tif","landsatNDVISC05SLC2000033.tif",
"landsatNDVISC05SLC2000049.tif","landsatNDVISC05SLC2000065.tif","landsatNDVISC05SLC2000081.tif",
"landsatNDVISC05SLC2000097.tif","landsatNDVISC05SLC2000113.tif","landsatNDVISC05SLC2000129.tif",
"landsatNDVISC05SLC2000145.tif","landsatNDVISC05SLC2000161.tif","landsatNDVISC05SLC2000177.tif",
"landsatNDVISC05SLC2000193.tif","landsatNDVISC05SLC2000209.tif","landsatNDVISC05SLC2000225.tif",
"landsatNDVISC05SLC2000241.tif","landsatNDVISC05SLC2000257.tif","landsatNDVISC05SLC2000273.tif",
"landsatNDVISC05SLC2000289.tif","landsatNDVISC05SLC2000305.tif","landsatNDVISC05SLC2000321.tif",
"landsatNDVISC05SLC2000337.tif","landsatNDVISC05SLC2000353.tif")
julday <- as.numeric(substr(julday, 24,26)) #24 to 26th digit in the file name represents Julian days#
dates <- as.Date(julday, origin=as.Date("2000-01-01"))
combinddat <- setZ(data, dates)
monthly <- zApply(combinddat, by = format(dates,"%Y-%m"), fun = mean, na.rm = T)
The dates produced from that data are wrong; the result is as follows:
> dates
[1] "2000-01-02" "2000-01-18" "2000-02-03" "2000-02-19" "2000-03-06"
[6] "2000-03-22" "2000-04-07" "2000-01-14" "2000-01-30" "2000-02-15"
[11] "2000-03-02" "2000-03-18" "2000-04-03" "2000-01-10" "2000-01-26"
[16] "2000-02-11" "2000-02-27" "2000-03-14" "2000-03-30" "2000-01-06"
[21] "2000-01-22" "2000-02-07" "2000-02-23"
But I want the dates to span all 12 months, based on my Julian days.
This does not answer your question but you probably could improve your code a lot with:
setwd("F:/LANDSAT-NDVI/testAverage")
library(raster)
f <- list.files(pattern="\\.tif$")
f <- sort(f)
data <- stack(f)
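On the dates themselves, a likely cause, assuming the file names are exactly as shown in the question: the day-of-year digits sit at characters 23 to 25 of the name, not 24 to 26, so substr(julday, 24, 26) returns e.g. "13." for day 113, which matches the dates wrapping back to January. In addition, origin = "2000-01-01" maps day 1 to January 2, so 1 has to be subtracted. A minimal base R sketch, using three of the file names from the question:

```r
# Three of the file names from the question, used here only for illustration
f <- c("landsatNDVISC05SLC2000001.tif",
       "landsatNDVISC05SLC2000113.tif",
       "landsatNDVISC05SLC2000353.tif")

# Count from the end of the name: the 3 digits just before ".tif"
n <- nchar(f)
julday <- as.integer(substr(f, n - 6, n - 4))

# Day 1 of the year is the origin itself, hence the "- 1"
dates <- as.Date(julday - 1, origin = as.Date("2000-01-01"))
format(dates, "%Y-%m")
## [1] "2000-01" "2000-04" "2000-12"
```

Counting from the end of the name rather than from the start keeps the extraction working if the prefix length ever changes.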

XTS:: Help me on the usage & differences between period.apply() & to.period()

I am learning time series analysis with R and came across these two functions. I understand that both return periodic data at the frequency given by the period, and the only difference I can see is the OHLC output option in to.period().
Other than the OHLC output, when should each of these functions be used?
to.period and all the to.minutes, to.weekly, to.quarterly are indeed meant for OHLC data.
If you take the function to.period it will take the open from the first day of the period, the close of the last day of the period and the highest high / lowest low of the specified period. These functions work very well together with the quantmod / tidyquant / quantstrat packages. See code example 1.
If you give the to.period non-OHLC data, but a timeseries with 1 data column, you still get a sort of OHLC back. See code example 2.
Now period.apply is more interesting. Here you can supply your own functions to be applied to the data. Especially in combination with endpoints, this can be a powerful function for time series data if you want to aggregate your data over different time periods. The index is mostly specified with endpoints, since with endpoints you can create the index you need to get to higher time levels (from day to week, etc.). See code examples 3 and 4.
Remember to use matrix functions with period.apply if you have more than 1 column of data, since xts is basically a matrix and an index. See code example 5.
More info in this DataCamp course.
library(xts)
data(sample_matrix)
zoo.data <- zoo(rnorm(31)+10,as.Date(13514:13744,origin="1970-01-01"))
# code example 1
to.quarterly(sample_matrix)
sample_matrix.Open sample_matrix.High sample_matrix.Low sample_matrix.Close
2007 Q1 50.03978 51.32342 48.23648 48.97490
2007 Q2 48.94407 50.33781 47.09144 47.76719
# same as to.quarterly
to.period(sample_matrix, period = "quarters")
sample_matrix.Open sample_matrix.High sample_matrix.Low sample_matrix.Close
2007 Q1 50.03978 51.32342 48.23648 48.97490
2007 Q2 48.94407 50.33781 47.09144 47.76719
# code example 2
to.period(zoo.data, period = "quarters")
zoo.data.Open zoo.data.High zoo.data.Low zoo.data.Close
2007-03-31 9.039875 11.31391 7.451139 10.35057
2007-06-30 10.834614 11.31391 7.451139 11.28427
2007-08-19 11.004465 11.31391 7.451139 11.30360
# code example 3 using base standard deviation in the chosen period
period.apply(zoo.data, endpoints(zoo.data, on = "quarters"), sd)
2007-03-31 2007-06-30 2007-08-19
1.026825 1.052786 1.071758
# code example 4 using a self-defined function that sums x + x over the period
period.apply(zoo.data, endpoints(zoo.data, on = "quarters"), function(x) sum(x + x) )
2007-03-31 2007-06-30 2007-08-19
1798.7240 1812.4736 993.5729
# code example 5
period.apply(sample_matrix, endpoints(sample_matrix, on = "quarters"), colMeans)
Open High Low Close
2007-03-31 50.15493 50.24838 50.05231 50.14677
2007-06-30 48.47278 48.56691 48.36606 48.45318

"circular" mean in R

Given a dataset of months, how do I calculate the "average" month, taking into account that months are circular?
months = c(1,1,1,2,3,5,7,9,11,12,12,12)
mean(months)
## [1] 6.333333
In this dummy example, the mean should be in January or December. I see that there are packages for circular statistics, but I'm not sure whether they suit my needs here.
I think this should work:
months <- c(1,1,1,2,3,5,7,9,11,12,12,12)
library("CircStats")
conv <- 2*pi/12 ## months -> radians
Now convert from months to radians, compute the circular mean, and convert back to months. I'm subtracting 1 here assuming that January is at "0 radians"/12 o'clock ...
(res1 <- circ.mean(conv*(months-1))/conv)
The result is -0.3457. You might want:
(res1 + 12) %% 12
which gives 11.65, i.e. partway through December (since we are still on the 0=January, 11=December scale)
I think this is right but haven't checked it too carefully.
For what it's worth, the CircStats::circ.mean function is very simple -- it might not be worth the overhead of loading the package if this is all you need:
function (x)
{
    sinr <- sum(sin(x))
    cosr <- sum(cos(x))
    circmean <- atan2(sinr, cosr)
    circmean
}
Incorporating @A.Webb's clever alternative from the comments:
m <- mean(exp(conv*(months-1)*1i))
(12 + Arg(m)/conv) %% 12 ## 'direction', i.e. average month
Mod(m) ## 'intensity'
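For reference, the whole calculation can be wrapped in one base R function with no package needed. This is only a sketch: the name circular_mean_month and the period argument are made up here, and the result stays on the same 0 = January scale used above:

```r
# Hypothetical helper: circular mean of month numbers (1..12),
# returned on a 0-based scale where 0 = January and 11 = December
circular_mean_month <- function(months, period = 12) {
  theta <- 2 * pi * (months - 1) / period          # months -> radians
  ang <- atan2(sum(sin(theta)), sum(cos(theta)))   # mean direction in (-pi, pi]
  (ang * period / (2 * pi)) %% period              # radians -> months, wrapped to [0, period)
}

months <- c(1, 1, 1, 2, 3, 5, 7, 9, 11, 12, 12, 12)
circular_mean_month(months)
## [1] 11.65429
```

Wrapping with %% inside the function means negative angles (means that fall just before January) come back as late-December values without a separate correction step.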

Having trouble with R's time series objects

I have a column of 84 monthly expenditures from 1/2004 - 12/2010, which in Excel looks like...
12247815.55
11812697.14
13741176.13
21372260.37
27412419.28
42447077.96
55563235.3
45130678.8
54579583.53
43406197.32
34318334.64
25321371.4
...(74 more entries)
I am trying to run an stl() from the forecast package on this series, and so I load the data:
d <- ts(read.csv("deseason_vVectForTS.csv",
header = TRUE),
start=c(2004,1),
end=c(2010,12),
frequency = 12)
(If I do header=FALSE it will absorb the first entry - 122...- as the header for the second column, and name the first column's header 'X')
But instead of my environment being populated with a Time Series Object from 2004 to 2011 (as it has said before) it simply says ts[1:84, 1].
Probably related is the fact that,
fit <- stl(d)
throws
Error in stl(d) : only univariate series are allowed.
despite the fact that
head(d)
[1] 12247816 11812697 13741176 21372260 27412419 42447078
and
d
Jan Feb Mar Apr May Jun Jul Aug Sep Oct
2004 12247816 11812697 13741176 21372260 27412419 42447078 55563235 45130679 54579584 43406197
("years 2005-2010 look exactly the same, and all rows have columns for Jan-Dec; it just doesn't fit on here neatly - just trying to show the object has taken the ts labeling structure.")
What am I doing wrong? As far as I know this is the same way I have been building my time series objects in the past...
read.csv returns a data frame, which ts() converts to a matrix. Even if it has only one column, it is still a matrix. To make it a vector, extract the column:
d <- ts(read.csv("deseason_vVectForTS.csv",
header = TRUE)[,1],
start=c(2004,1),
end=c(2010,12),
frequency = 12)
Also, please check your facts. stl is in the stats package, not the forecast package. This is easily checked by using help(stl).
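The difference can be seen without the CSV file. The sketch below uses synthetic numbers in place of the real expenditures (the column name spend and the values are invented for illustration), and it also passes s.window, which stl() requires:

```r
# Synthetic monthly data standing in for the CSV column (not the asker's numbers)
set.seed(1)
df <- data.frame(spend = 1e7 * (2 + sin(2 * pi * (1:36) / 12)) + rnorm(36, sd = 1e5))

# Passing the whole data frame keeps a 36 x 1 matrix inside the ts object
m <- ts(df, start = c(2004, 1), frequency = 12)
dim(m)        # 36 1
is.matrix(m)  # TRUE, which is what stl() rejects

# Extracting the column first gives a plain univariate series
v <- ts(df[, 1], start = c(2004, 1), frequency = 12)
dim(v)        # NULL

# stl() has no default for s.window, so it must be supplied
fit <- stl(v, s.window = "periodic")
```

This also explains the ts[1:84, 1] shown in the environment pane: the object is a time series with a matrix dimension attached.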

R date to Excel based number

I know that I can get a date from an Excel based number (days since 1899-12-30) in the following way:
as.Date(41000, origin = "1899-12-30")
which will give me "2012-04-01". I want however the opposite. As a user I would like to input a date as a string and get the number of days since "1899-12-30".
Something along the lines
as.integer(as.Date('2012-04-01', origin="1899-12-30"))
which I hoped would result in 41000 and not in the R based days since 1970-01-01 which is 15431.
Maybe this is silly as I realize that I can add the days manually by writing something like:
as.integer(as.Date('2012-04-01')) + 25569
I just wondered if there is a function which does this?
I think you want difftime as in:
difftime(as.Date('2012-04-01'), as.Date("1899-12-30"))
## Time difference of 41000 days
Do it by hand, simpler and safer:
d0 <- as.Date('1899-12-30')
d1 <- as.Date('2014-10-28')
as.integer(d1 - d0)
##[1] 41940 # This is interpreted by Excel as '2014-10-28'
Of course, you can write a function to convert an R date to an Excel one:
convert_to_excel_date <- function(d) {
    # Converts an R date value to an Excel date value
    #
    # Parameters:
    # d: an R Date object
    d0 <- as.Date('1899-12-30')
    return(as.integer(d - d0))
}
# Example:
# convert_to_excel_date(as.Date('2018-10-28'))

Resources