How to calculate time spent below a threshold? (R)

I have a plot with time as a POSIXct object on the x-axis and a dependent variable "ODBA" on the y-axis. The experiment was a 600-second trial. How do I calculate the total time in seconds that ODBA was below a certain threshold (e.g. 0.25)?

We can use sum. If ODBA is sampled once per second, each observation below the threshold contributes one second, so counting them gives the total time:
sum(ODBA < 0.25)   # seconds with ODBA below the threshold (1 Hz sampling)
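If the sampling interval is irregular, a sketch under the assumption that time (POSIXct) and ODBA are aligned vectors is to weight each step by its duration:
dt <- diff(as.numeric(time))          # length of each step in seconds
sum(dt[ODBA[-length(ODBA)] < 0.25])   # total time spent below 0.25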

Related

How to determine the time lag at different time intervals between two correlated variables

I have two variables, x and y, measured at one minute intervals for over two years. The average daily values of x and y are almost 90% correlated. However, when I analyze x and y in one minute intervals they are only 50% correlated. How can I detect the time interval at which this correlation becomes 90%? Ideally I'd like to do this in R.
I'm new to statistics/econometrics, so my apologies if this question is very basic!
I'm not quite sure what you are asking here. What do you mean by x and y being 90 "percent" correlated? Do you mean you get a correlation coefficient of .9?
Beyond that clarification: you can absolutely have a situation where the averages of two variables are more strongly correlated than the underlying observations. In other words, order matters, so the correlation of the averages is not the average of the correlations.
For example, the R code below shows that if we took 3 measurements each hour for 2 hours (6 measurements total), the overall correlation is only about 0.5, while the correlation of the hourly averages is a perfect 1. When you correlate averages you effectively remove the impact of how the measurement values are ordered within each averaging interval, and that ordering turns out to matter a great deal for the correlation. Let me know if I missed something about your question, though.
X <- c(1, 2, 3, 4, 5, 6)
Y <- c(3, 2, 1, 6, 5, 4)
cor(X, Y)                      # about 0.54: within-hour order matters
HourAvgX <- c(mean(X[1:3]), mean(X[4:6]))
HourAvgY <- c(mean(Y[1:3]), mean(Y[4:6]))
cor(HourAvgX, HourAvgY)        # exactly 1: averaging removes within-hour order
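To find the aggregation interval at which the correlation first reaches 0.9, one option is to correlate non-overlapping window means over a grid of window sizes. A hedged sketch, assuming x and y are the full minute-level vectors (the window grid here is arbitrary):
library(zoo)
windows <- c(1, 5, 15, 60, 360, 1440)   # window sizes in minutes
agg_cor <- sapply(windows, function(w) {
  xa <- rollapply(x, width = w, by = w, FUN = mean)  # non-overlapping means
  ya <- rollapply(y, width = w, by = w, FUN = mean)
  cor(xa, ya)
})
windows[which(agg_cor >= 0.9)[1]]       # first window size reaching r >= 0.9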

Find period with lowest variability in time series (R)

I have a time series and would like to find the period that has the lowest contiguous variability, i.e. the period in which the rolling SD hovers around the minimum for the longest consecutive time steps.
library(zoo)
test <- c(10,12,14,16,13,13,14,15,15,14,16,16,16,16,16,16,16,15,14,15,12,11,10)
rol <- rollapply(test, width = 4, FUN = sd)   # rolling SD over windows of 4
rol
I can easily see from the data or the graph that the longest period with low variability starts at t = 11. Is there a function that can help me find this period of continued low variability, perhaps automatically trying different sizes for the rolling window? I am not interested in finding the time step with the lowest SD, but in a period where this low SD is more consistent than elsewhere.
All I can think of for now is looking at the differences rol[i] - rol[i+1], looping through the vector, and using a counter to find runs of consecutively low SD values. I was also thinking of cluster analysis, something like kmeans(rol, 5), but my time series can be long and complex, and I would have to pick the number of clusters manually.
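A hedged sketch of the run-counting idea, using rle() on a low-SD flag; the tolerance band around the minimum is an assumption you would tune:
tol <- min(rol) + 0.1                            # tolerance band (assumed)
runs <- rle(as.vector(rol) <= tol)               # runs of low-variability flags
i <- which(runs$values)[which.max(runs$lengths[runs$values])]  # longest TRUE run
start <- sum(runs$lengths[seq_len(i - 1)]) + 1   # start index within rol
c(start = start, length = runs$lengths[i])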

How are the intermediate values between observations in time series calculated in ts() in R?

I need help understanding how frequency affects my time series. I fit daily time series data with frequency = 7. When I view the time series, I get intermediate values between days. I have data for 60 days, and I created a time series from it:
ts.v1<- ts(V1, start = as.Date("2017-08-01"), end = as.Date("2017-09-30"), frequency = 7)
which gives me 421 values. I kind of understood that it has to do with the frequency, since 421 is roughly the product of 7 and 60. What I need to know is: how are these calculated? And why? Isn't frequency used only to tell your time series whether the data is daily/weekly/annual etc.? (I referred to this)
Similarly, in my ACF and PACF plots the lag values are < 1, meaning seven values make up one 'lag'. In that scenario, when I estimate arima(p,d,q) from these plots, would the values be taken as lag × frequency?
Normally one does not use Date class with ts. With ts, the frequency is the number of points in a unit interval. Just use:
ts(V1, frequency = 7)
The times will be 1, 1 + 1/7, 1 + 2/7, ... You can later match them to the proper dates if need be.
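As for where the 421 values come from: ts() coerces the Date start and end to plain numbers (days since 1970-01-01), so the series spans 60 time units at 7 points per unit, and ts() recycles V1 to fill those slots; no interpolation is performed. A quick check:
as.numeric(as.Date("2017-08-01"))   # 17379
as.numeric(as.Date("2017-09-30"))   # 17439, i.e. a span of 60 units
60 * 7 + 1                          # 421 points at frequency 7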

Censored count data model in JAGS

I'm coding a hierarchical Poisson model in JAGS+R for censored count data.
A is a matrix: rows are the different places and columns are the different time intervals in which I count the rainy days. As covariates I have a set of X_k matrices. For every place and time interval I have covariates such as mean temperature (stored in X_1), count of windy days (in X_2), and mean humidity (in X_3).
Should I separate the censored data from the non censored ones? How do I do that with matrices?
Thanks for your help!
Update:
I have the following in a loop up to the last-but-one registered time interval (for each place there might be censoring due to equipment failure):
mu[i,j] <- a[1]*x[i,1] + a[2]*x[i,2] + b[1,j]*varx[i,j] + b[2,j]*varx[i,j]
N[i,j] ~ dpois(lambda[i,j])            # '~', not '<-': N is a stochastic node
log(lambda[i,j]) <- mu[i,j] + alpha[i]
alpha[i] <- G0[latent[i]]
latent[i] ~ dcat(prob[])
This is repeated for the last time interval, where I added censor[i] <- step(-censored[i]); censored[i] is a vector indicating whether there was an equipment failure.
I am new to this and it doesn't work; any help? Thanks!
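On the censoring question: you normally do not separate censored cells from uncensored ones. The usual JAGS idiom is dinterval(): pass censored counts as NA in N, and supply an indicator and the censoring limit as data, keeping everything in the same matrices. A minimal, hedged sketch in JAGS syntax (is.censored, lim, Nplace and Ntime are hypothetical names, not the poster's variables):
for (i in 1:Nplace) {
  for (j in 1:Ntime) {
    is.censored[i,j] ~ dinterval(N[i,j], lim[i,j])  # 1 when N exceeds lim
    N[i,j] ~ dpois(lambda[i,j])                     # supply N as NA where censored
  }
}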

[R+zoo]: Operations on time series with different temporal resolutions

I have two time series (sensor data) with different temporal resolutions. One series, of class "xts"/"zoo" (TS1), holds hourly values; the other (TS2) has a finer temporal resolution, with one observation every 10 minutes. That is, for TS1 I have 24 data points (observations) per day and for TS2 I have 144 data points per day.
When I calculate TS1-TS2 for one day I get a result with 24 data points (the coarse resolution). What I would like is a result with 144 data points (the finer resolution of TS2).
Is it possible to achieve this in R?
P.S.:
This is not a trivial problem, because in an hourly interval I have just one observation from TS1 but six from TS2. I could imagine solving it by drawing a fitted line between every two points of TS1 and calculating the difference between that line and the data points of TS2, but I don't know an R function that does this.
You can fill in the missing values with na.approx for linear (or constant) interpolation, or with na.spline for spline interpolation.
library(xts)   # also loads zoo
## new 10-minute index covering the range of TS1
new.index <- seq(min(index(TS1)), max(index(TS1)), by = as.difftime(10, units = "mins"))
## merge onto the finer index, then interpolate linearly
TS1.new <- na.approx(merge(TS1, xts(NULL, new.index)))
Now you can subtract the series (although you may want to check first that they cover the same time range):
TS2 - TS1.new
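A self-contained sketch with made-up data, just to illustrate the mechanics (timestamps and values are invented):
library(xts)
hours  <- seq(as.POSIXct("2024-01-01 00:00"), by = "1 hour", length.out = 24)
mins10 <- seq(as.POSIXct("2024-01-01 00:00"), by = "10 min", length.out = 144)
TS1 <- xts(rnorm(24), hours)     # hourly sensor
TS2 <- xts(rnorm(144), mins10)   # 10-minute sensor
TS1.new <- na.approx(merge(TS1, xts(NULL, mins10)))  # TS1 on the 10-min grid
head(TS2 - TS1.new)              # 144 points; NAs after the last TS1 observation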
