I have the dataset below with half-hourly time series data.
Date <- c("2018-01-01 08:00:00", "2018-01-01 08:30:00",
"2018-01-01 08:59:59","2018-01-01 09:29:59")
Volume <- c(195, 188, 345, 123)
Dataset <- data.frame(Date, Volume)
I convert the Date column to POSIXct with:
Dataset$Date <- as.POSIXct(Dataset$Date)
Then I create an xts object:
library(xts)
Dataset.xts <- xts(Dataset$Volume, order.by=Dataset$Date)
When I try to decompose it, following the approach from another question, with:
attr(Dataset.xts, 'frequency')<- 48
decompose(ts(Dataset.xts, frequency = 48))
I get:
Error in decompose(ts(Dataset.xts, frequency = 48)) :
time series has no or less than 2 periods
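As I mentioned in the comments, you need as.ts instead of ts: as.ts picks up the frequency attribute you set on the xts object, whereas ts called without a frequency argument falls back to frequency = 1. You are also specifying a frequency of 48 with only 4 records, and decompose needs at least two full periods of data. Both lead to errors.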
This code works:
library(xts)
df1 <- data.frame(date = as.POSIXct(c("2018-01-01 08:00:00", "2018-01-01 08:30:00",
"2018-01-01 08:59:59","2018-01-01 09:29:59")),
volume = c(195, 188, 345, 123))
df1_xts <- xts(df1$volume, order.by = df1$date)
attr(df1_xts, 'frequency') <- 2
decompose(as.ts(df1_xts))
This doesn't (a frequency of 48 with only 4 records gives fewer than two full periods):
attr(df1_xts, 'frequency') <- 48
decompose(as.ts(df1_xts))
Error in decompose(as.ts(df1_xts)) :
time series has no or less than 2 periods
Neither does this (ts instead of as.ts):
attr(df1_xts, 'frequency') <- 2
decompose(ts(df1_xts))
Error in decompose(ts(df1_xts)) :
time series has no or less than 2 periods
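For what it's worth, both failures above come from the same check inside decompose: it stops unless the series has a frequency above 1 and at least two full periods of data. Here is a minimal base-R sketch of that condition. The helper name can_decompose is made up for illustration, and the xts step is left out so the snippet runs without extra packages.

```r
# Hypothetical helper mirroring the condition decompose() enforces:
# a frequency greater than 1 and at least two complete periods.
can_decompose <- function(n_obs, freq) {
  freq > 1 && n_obs >= 2 * freq
}

volume <- c(195, 188, 345, 123)

ok_freq_2  <- can_decompose(length(volume), 2)   # 4 records cover two periods of 2
ok_freq_48 <- can_decompose(length(volume), 48)  # 96 records would be needed

# ts() without a frequency argument defaults to frequency = 1,
# which is why the ts() call on the xts object fails as well.
default_freq <- frequency(ts(volume))

# With frequency = 2 the decomposition goes through.
dec <- decompose(ts(volume, frequency = 2))
```

For real half-hourly data with a daily cycle (frequency = 48) this means you need at least 96 observations, i.e. two full days.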
I have a data frame with time series data for several different groups. I want to apply different start and end cutoff dates to each group within the original data frame.
Here's a sample data frame:
date <- seq(as.POSIXct("2014-07-21 17:00:00", tz= "GMT"), as.POSIXct("2014-09-11 24:00:00", tz= "GMT"), by="hour")
group <- letters[1:4]
datereps <- rep(date, length(group))
attr(datereps, "tzone") <- "GMT"
sitereps <- rep(group, each = length(date))
value <- rnorm(length(datereps))
df <- data.frame(DateTime = datereps, Group = sitereps, Value = value)
and here's the data frame 'cut' of cutoff dates to use:
start <- c("2014-08-01 00:00:00 GMT", "2014-07-26 00:00:00 GMT", "2014-07-21 17:00:00 GMT", "2014-08-03 24:00:00 GMT")
end <- c("2014-09-11 24:00:00 GMT", "2014-09-01 24:00:00 GMT", "2014-09-07 24:00:00 GMT", "2014-09-11 24:00:00 GMT")
cut <- data.frame(Group = group, Start = as.POSIXct(start, tz = "GMT"), End = as.POSIXct(end, tz = "GMT"))
I can do it manually for each group, dropping the data I don't want at both ends of the time series with a negated logical index:
df2 <- df[!(df$Group == "a" & (df$DateTime < as.POSIXct("2014-08-01 00:00:00", tz = "GMT") | df$DateTime > as.POSIXct("2014-09-11 24:00:00", tz = "GMT"))),]
But, how can I automate this?
Just merge the cutoffs into the data frame, and then subset on the new Start and End columns, like below. df3 contains the removed records, df4 contains the retained ones.
df2 <- merge(x = df,y = cut,by = "Group")
df3 <- df2[df2$DateTime <= df2$Start | df2$DateTime >= df2$End,]
df4 <- df2[!(df2$DateTime <= df2$Start | df2$DateTime >= df2$End),]
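To see what the merge-then-subset approach does, here is a small self-contained sketch on made-up data (two groups, five daily timestamps each). The names obs, cutoffs, retained and removed are only for illustration, and the boundaries are treated as exclusive, as in the code above.

```r
# Toy data: the same five daily timestamps for groups "a" and "b".
obs <- data.frame(
  Group    = rep(c("a", "b"), each = 5),
  DateTime = rep(as.POSIXct("2014-08-01", tz = "GMT") + (0:4) * 86400, times = 2)
)

# One cutoff window per group.
cutoffs <- data.frame(
  Group = c("a", "b"),
  Start = as.POSIXct(c("2014-08-02", "2014-08-01"), tz = "GMT"),
  End   = as.POSIXct(c("2014-08-05", "2014-08-03"), tz = "GMT")
)

# After the merge every row carries its own group's Start and End,
# so a single vectorised comparison handles all groups at once.
merged <- merge(obs, cutoffs, by = "Group")
inside <- merged$DateTime > merged$Start & merged$DateTime < merged$End

retained <- merged[inside, c("Group", "DateTime")]
removed  <- merged[!inside, c("Group", "DateTime")]

kept_per_group <- table(retained$Group)
```

Group "a" keeps 3 and 4 August, group "b" keeps only 2 August. If the boundary timestamps should be kept, switch the comparisons to >= and <=.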
I have a time series object with the following dates:
my_data = rnorm(155)
my_data_ts <- ts(my_data, start = c(2002, 10), frequency = 12)
How do I get the date as a POSIXlt object, and the values?
my_data_date = STHGTOCONVERTTOPOSIXLT(my_data)???
my_data_values = STHGTOGETVALUES(my_data)???
To get the first day of the month, you can use:
as.POSIXlt( paste0(floor(time(my_data_ts)),'-', round(12*(time(my_data_ts)-floor(time(my_data_ts))))+1,'-01'), tz="UTC")
For the values, just use:
as.vector(my_data_ts)
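An alternative sketch that avoids the floating-point arithmetic on time(): build the first date from start() and step forward month by month with seq(). This assumes a monthly series (frequency = 12); the seed and the intermediate names first, first_day and month_starts are only for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
my_data_ts <- ts(rnorm(155), start = c(2002, 10), frequency = 12)

# start() returns c(year, month) for a monthly series.
first <- start(my_data_ts)
first_day <- as.Date(sprintf("%d-%02d-01",
                             as.integer(first[1]), as.integer(first[2])))

# One date per observation, each on the first day of its month.
month_starts <- seq(first_day, by = "month", length.out = length(my_data_ts))

my_data_date   <- as.POSIXlt(month_starts, tz = "UTC")
my_data_values <- as.vector(my_data_ts)
```

The series starts in October 2002 and has 155 monthly values, so the last date comes out as 2015-08-01.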
I have a data frame with daily data. I need to bind it to hourly data, but first I need to convert it to a suitable POSIXct format. The regular case looks like this:
set.seed(42)
df <- data.frame(
Date = seq.Date(from = as.Date("2015-01-01", "%Y-%m-%d"), to = as.Date("2015-01-29", "%Y-%m-%d"), by = "day"),
var1 = runif(29, min = 5, max = 10)
)
result <- data.frame(
Date = d <- seq.POSIXt(from = as.POSIXct("2015-01-01 00:00:00", "%Y-%m-%d %H:%M:%S", tz = ""),
to = as.POSIXct("2015-01-29 23:00:00", "%Y-%m-%d %H:%M:%S", tz = ""), by = "hour"),
var1 = rep(df$var1, each = 24) )
However, my real data is not as regular as the above. It has lots of missing dates, so I need to take the actual df$Date vector and expand it to an hourly POSIXct sequence with the matching daily values.
I've looked high and low but have been unable to find anything on this.
The way I went about this was to take the earliest and latest dates in the dataset and treat them as hour 0 of the first day and hour 23 of the last day.
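# paste() (not paste0()) puts the space between the date and the time
hourly <- data.frame(Hourly=seq(min(as.POSIXct(paste(df$Date, "00:00:00"),tz="")),max(as.POSIXct(paste(df$Date, "23:00:00"),tz="")),by="hour"))
# format() yields the local calendar date, so the match is not shifted by a UTC conversion
hourly[,"Var1"] <- df[match(x = as.Date(format(hourly$Hourly, "%Y-%m-%d")),df$Date),"var1"]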
This turns the daily values into hourly ones, with each day's var1 assigned to every hour of that day. Missing daily values are not a problem: where there is no match, the hourly rows simply get NA.
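As a quick check of how a gap behaves, here is a small sketch with made-up daily data in which 2 January is missing. It fixes the time zone to UTC, which is an assumption made only to keep the date conversion unambiguous, and the name daily is illustrative.

```r
# Made-up daily data with a missing day (2015-01-02).
daily <- data.frame(
  Date = as.Date(c("2015-01-01", "2015-01-03")),
  var1 = c(5.5, 7.2)
)

# Hourly grid from hour 0 of the first day to hour 23 of the last day.
hours <- seq(
  as.POSIXct(paste(min(daily$Date), "00:00:00"), tz = "UTC"),
  as.POSIXct(paste(max(daily$Date), "23:00:00"), tz = "UTC"),
  by = "hour"
)
hourly <- data.frame(Hourly = hours)

# Look up each hour's calendar date in the daily table;
# hours that fall on a missing day get NA.
hourly$var1 <- daily$var1[match(as.Date(hourly$Hourly), daily$Date)]

n_missing <- sum(is.na(hourly$var1))
```

The grid has 72 hourly rows, and the 24 rows belonging to the missing day are NA.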
I am subtracting dates in xts, i.e.:
library(xts)
# make data
x <- data.frame(x = 1:4,
BDate = c("1/1/2000 12:00","2/1/2000 12:00","3/1/2000 12:00","4/1/2000 12:00"),
CDate = c("2/1/2000 12:00","3/1/2000 12:00","4/1/2000 12:00","9/1/2000 12:00"),
ADate = c("3/1/2000","4/1/2000","5/1/2000","10/1/2000"),
stringsAsFactors = FALSE)
x$ADate <- as.POSIXct(x$ADate, format = "%d/%m/%Y")
# object we will use
xxts <- xts(x[, 1:3], order.by= x[, 4] )
#### The subtractions
# answer in days
transform(xxts, lag = as.POSIXct(BDate, format = "%d/%m/%Y %H:%M") - index(xxts))
# answer in hours
transform(xxts, lag = as.POSIXct(CDate, format = "%d/%m/%Y %H:%M") - index(xxts))
Question: How can I standardise the result so that I always get the answer in hours? Not by multiplying the days by 24, as I will not know beforehand whether the subtraction will come out in days or hours.
Unless I can somehow check whether the result is in days, perhaps using grep and a regex, and then multiply inside an if clause.
I have tried to work through this and went for the grep/regex approach, but this doesn't even keep the negative sign:
p <- transform(xxts, lag = as.POSIXct(BDate, format = "%d/%m/%Y %H:%M") - index(xxts))
library(stringr)
ind <- grep("days", p$lag)
p$lag[ind] <- as.numeric( str_extract_all(p$lag[ind], "\\(?[0-9,.]+\\)?")) * 24
p$lag
#2000-01-03 2000-01-04 2000-01-05 2000-01-10
# 36 36 36 132
I am convinced there is a more elegant solution...
OK, difftime works:
transform(xxts, lag = difftime(as.POSIXct(BDate, format = "%d/%m/%Y %H:%M"), index(xxts), units = "hours"))
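For reference, a minimal base-R sketch of the difference, using the first row of the data above (BDate 1/1/2000 12:00 against an index of 3/1/2000). The time zone is fixed to UTC as an assumption so the result does not depend on local settings, and the variable names are illustrative.

```r
b_date <- as.POSIXct("1/1/2000 12:00", format = "%d/%m/%Y %H:%M", tz = "UTC")
a_date <- as.POSIXct("3/1/2000", format = "%d/%m/%Y", tz = "UTC")

# Plain subtraction lets R choose the units from the size of the gap.
auto_lag   <- b_date - a_date
auto_units <- units(auto_lag)

# difftime() with an explicit units argument always reports hours
# and keeps the sign.
lag_hours <- difftime(b_date, a_date, units = "hours")
lag_value <- as.numeric(lag_hours)

# An existing difftime can also be converted in place.
units(auto_lag) <- "hours"
```

Here the automatic choice is days (the gap is a day and a half), while the explicit version gives -36 hours, so no regex on the printed output is needed.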
I have a data.frame holding a time series, and I would like to thin the data by keeping only the entries measured on even day numbers. For example:
set.seed(1)
RandData <- rnorm(100,sd=20)
Locations <- rep(c('England','Wales'),each=50)
today <- Sys.Date()
dseq <- (seq(today, by = "1 days", length = 100))
Date <- as.POSIXct(dseq, format = "%Y-%m-%d")
Final <- data.frame(Loc = Locations,
Doy = as.numeric(format(Date,format = "%j")),
Temp = RandData)
So, how would I reduce this data frame to only contain the entries measured on even-numbered days, i.e. Loc, Doy and Temp on day 172, day 174 and so on?
What about:
Final[Final$Doy%%2==0,]
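A small sketch with fixed dates so the result is reproducible (the example above depends on Sys.Date()). The data frame demo and its Temp values are made up for illustration.

```r
# Six consecutive days starting at day-of-year 171 (20 June 2018).
dates <- seq(as.Date("2018-06-20"), by = "day", length.out = 6)

demo <- data.frame(
  Loc  = "England",
  Doy  = as.numeric(format(dates, "%j")),
  Temp = c(14.2, 15.1, 13.8, 16.0, 15.5, 14.9)  # made-up values
)

# Keep only the rows whose day of year is divisible by 2.
even_days <- demo[demo$Doy %% 2 == 0, ]
```

This keeps days 172, 174 and 176. Note that parity of the day of year does not give strictly alternating days across a year boundary: in a non-leap year, day 365 and the following day 1 are both odd.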