I've run into trouble working with time series and time zones in R, and I can't quite figure out how to proceed.
I have time series data like this:
df <- data.frame(
Date = seq(as.POSIXct("2014-01-01 00:00:00"), length.out = 1000, by = "hours"),
price = runif(1000, min = -10, max = 125),
wind = runif(1000, min = 0, max = 2500),
temp = runif(1000, min = - 10, max = 25)
)
Now, the Date is in UTC-time. I would like to subset/filter the data, so for example I get the values from today (Today is 2014-05-13):
df[ as.Date(df$Date) == Sys.Date(), ]
However, when I do this, I get data that starts with:
2014-05-13 02:00:00
And not:
2014-05-13 00:00:00
Because I'm currently in CEST, which is two hours ahead of UTC. So I try to change the data:
df$Date <- as.POSIXct(df$Date, format = "%Y-%m-%d %H", tz = "Europe/Berlin")
Yet this doesn't work. I've tried several variations, such as converting to character and then back again, but I've hit a wall, and I'm guessing there is something simple I'm missing.
To avoid time zone issues like this, use format to get the character representation of the date:
df[format(df$Date,"%Y-%m-%d") == Sys.Date(), ]
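The difference between the two filters can be sketched as follows. This is a minimal illustration rather than the asker's actual data: the start date, the fixed `today` value (standing in for `Sys.Date()`) and the explicit `tz` arguments are assumptions made so the result does not depend on the machine's locale. It also assumes that changing the `tzone` attribute only changes how the timestamps are displayed, not the instants they represent.

```r
# Hourly timestamps stored in UTC (hypothetical data)
df <- data.frame(
  Date  = seq(as.POSIXct("2014-05-12 00:00:00", tz = "UTC"),
              length.out = 72, by = "hour"),
  price = seq_len(72)
)

# Display the same instants in local (Berlin) time
attr(df$Date, "tzone") <- "Europe/Berlin"

today <- as.Date("2014-05-13")  # fixed stand-in for Sys.Date()

# Filtering on the UTC calendar date starts at 02:00 local time
by_utc <- df[as.Date(df$Date, tz = "UTC") == today, ]

# Filtering on the formatted local date starts at local midnight
by_local <- df[format(df$Date, "%Y-%m-%d") == format(today), ]

format(by_utc$Date[1], "%H:%M")
format(by_local$Date[1], "%H:%M")
```

Both subsets contain 24 rows; they are simply shifted by the UTC offset.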
Related
I am currently trying to convert quarterly GDP data into monthly data and make it a time series. I figured out a way to do this before, but I lost the code and cannot reconstruct it. My data needs to span from 1997-01-01 to 2018-12-01; the original data came from the FRED database.
library(pdfetch)
GDP.1 <- pdfetch_FRED("GDP")
My code looked something like this...
GDP.working <- seq(as.Date(GDP.1, "1997/1/1"),
by = "month",
length.out = 252)
xts.1 <- xts(x = rep(NA,252),
order.by = seq(as.Date("1997/1/1"),
by = "month",
length.out = 252),
frequency = 12)
xts.GDP.1 <- merge(GDP.working, xts.1)
GDP.final <- na.locf(xts.GDP.1, fromLast = TRUE)
Any suggestions/methods would be very much appreciated.
Thank you for your time.
ERV
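One possible way to approach this, sketched in base R instead of xts so that it runs without a download: the quarterly values below are made up and only stand in for the FRED series, and the names `gdp_quarterly` and `gdp_monthly` are hypothetical. The sketch assumes each quarterly value should be carried forward to the following two months; note that 1997-01-01 to 2018-12-01 covers 264 months, not 252.

```r
# Hypothetical quarterly series standing in for the FRED download
quarter_dates <- seq(as.Date("1997-01-01"), by = "quarter", length.out = 88)
gdp_quarterly <- data.frame(
  Date = quarter_dates,
  GDP  = seq(8000, by = 100, length.out = 88)  # made-up values
)

# Target monthly index
month_dates <- seq(as.Date("1997-01-01"), as.Date("2018-12-01"), by = "month")

# For every month, find the most recent quarter start on or before it
idx <- findInterval(as.numeric(month_dates), as.numeric(gdp_quarterly$Date))

# Carry each quarterly value forward across its three months
gdp_monthly <- data.frame(Date = month_dates, GDP = gdp_quarterly$GDP[idx])

head(gdp_monthly, 6)
```

If the monthly values should instead be filled backwards from the next quarter (as `fromLast = TRUE` in the question suggests), the index would need to point at the following quarter instead.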
Dear Stackoverflow Community,
I have a dataset with datetimes (POSIXct, '%d.%m.%Y %H:%M') and sensor measurements in [A] and [V].
The datetime is one column and the sensors are the other columns, with one column per sensor.
I'd like to calculate a correction value from values within each sensor's column.
The correction value should be written into a new column, once per hour.
I'd like to calculate the correction as follows:
correction = |x - (0.5 * (y+z))|
x= value of sensor 1, if Minute =='00'
y= value of sensor 1, if Minute =='03'
z= value of sensor 1, if Minute =='06'
What I'd like to have is a function that calculates this formula for every hour, but only if values for all three minutes ('00', '03' and '06') are present in that hour, and writes the correction value into a new column (Data$correction).
I hope that explains what I'd like to do.
I tried several loops and the apply and mapply functions, but there was always a problem with the date format or with the function.
This is what seems to be the best approach to me. It doesn't work right now, but I hope there is a way to make it work.
I also think that writing out vectors and merging them back with melt or merge might not be the best way, but right now I'm just struggling and don't know how to solve the problem.
I really hope you can help me. Thanks so much.
Test_sub <- read.table(file= 'Test_sub.csv',
header=T, sep= ';', dec='.', stringsAsFactors= F)
sensor1_V_0 <- Test_sub[format(Test_sub$Datehour, format = '%M') == '00',]
sensor1_V_3 <- Test_sub[format(Test_sub$Datehour, format = '%M') == '03',]
sensor1_V_6 <- Test_sub[format(Test_sub$Datehour, format = '%M') == '06',]
test_sub2<- mapply(function(x, y, z) x-(0.5*(y+z)), sensor1_V_0$sensor1_V, sensor1_V_3$sensor1_V, sensor1_V_6$sensor1_V)
Let's start by creating some fake data:
dill<-data.frame(time=seq(as.POSIXct("2019-01-01 11:30"), as.POSIXct("2019-01-01 13:20"), by=180),val=runif(37,0,100))
Now we can do this:
require(tidyverse)
require(lubridate)
dill<- dill %>%
group_by(hour(time)) %>% # group by the hour -- note this assumes there's only one day in the data, you'll need to adjust this if there's more than one day
filter(any(minute(time)==3) & any(minute(time)==6) & any(minute(time)==0)) %>% # remove any hours in the data that don't have minutes 0, 3 and 6
mutate(correction=abs(val[minute(time)==0]-0.5*(val[minute(time)==3]+val[minute(time)==6]))) # calculate the correction
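The comment above notes that grouping by hour alone assumes a single day of data. A minimal sketch of one way to lift that restriction, written in base R and keyed on date plus hour, might look like this. The data, the names `dill2`, `hour_key` and `correction_for_hour`, and the use of UTC are all assumptions for illustration.

```r
# Hypothetical data: one reading every 3 minutes across two days
set.seed(1)
time  <- seq(as.POSIXct("2019-01-01 11:30", tz = "UTC"),
             as.POSIXct("2019-01-02 13:18", tz = "UTC"), by = 180)
dill2 <- data.frame(time = time, val = runif(length(time), 0, 100))

# Key every row by date AND hour so equal hours on different days stay apart
dill2$hour_key <- format(dill2$time, "%Y-%m-%d %H")
dill2$minute   <- as.integer(format(dill2$time, "%M"))

# |x - 0.5 * (y + z)| for one hour, or NA if minute 0, 3 or 6 is missing
correction_for_hour <- function(d) {
  x <- d$val[d$minute == 0]
  y <- d$val[d$minute == 3]
  z <- d$val[d$minute == 6]
  if (length(x) == 1 && length(y) == 1 && length(z) == 1) {
    abs(x - 0.5 * (y + z))
  } else {
    NA_real_
  }
}

# One correction per hour, then copied back onto every row of that hour
corr <- sapply(split(dill2, dill2$hour_key), correction_for_hour)
dill2$correction <- unname(corr[dill2$hour_key])
```

Unlike the filter above, this keeps incomplete hours in the data and marks them with NA; dropping them afterwards is a one-line subset on `!is.na(dill2$correction)`.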
An example of the data would be:
y <- seq(from= 0.1, to= 0.5, by= 0.1)
min <- as.POSIXct('2018-09-25 09:00:00')
max <- as.POSIXct('2018-09-26 17:45:00')
SEQ <- data.frame(Datehour = seq.POSIXt(min,max, by = 60*03))
str(SEQ)
# keep only the readings taken at minutes 00, 03, 06, 15, 30 and 45
keep <- format(SEQ$Datehour, format = '%M') %in% c('00', '03', '06', '15', '30', '45')
data <- data.frame(Datehour = SEQ$Datehour[keep], y = 0.1, z = 0.3)
I have a time series object with the following dates:
my_data = rnorm(155)
my_data_ts <- ts(my_data, start = c(2002, 10), frequency = 12)
How do I get the date as a POSIXlt object, and the values?
my_data_date = STHGTOCONVERTTOPOSIXLT(my_data)???
my_data_values = STHGTOGETVALUES(my_data)???
To get the first day of the month, you can use:
as.POSIXlt( paste0(floor(time(my_data_ts)),'-', round(12*(time(my_data_ts)-floor(time(my_data_ts))))+1,'-01'), tz="UTC")
For the values, just use :
as.vector(my_data_ts)
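An alternative sketch that avoids arithmetic on the fractional time index: it reads the starting year and month from `start()` and builds a monthly date sequence of the same length. This assumes a monthly series (`frequency = 12`); the names `s`, `first_day` and `my_data_date` are only illustrative.

```r
set.seed(1)
my_data_ts <- ts(rnorm(155), start = c(2002, 10), frequency = 12)

# start() returns c(year, month) for a monthly series
s <- start(my_data_ts)
first_day <- as.Date(sprintf("%d-%02d-01", s[1], s[2]))

# One date per observation, stepping a calendar month at a time
my_data_date <- as.POSIXlt(
  seq(first_day, by = "month", length.out = length(my_data_ts)),
  tz = "UTC"
)
my_data_values <- as.vector(my_data_ts)

format(my_data_date[1], "%Y-%m-%d")
```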
I have a data frame with daily data. I need to bind it to hourly data, but first I need to convert it to a suitable posixct format. This looks like this:
set.seed(42)
df <- data.frame(
Date = seq.Date(from = as.Date("2015-01-01", "%Y-%m-%d"), to = as.Date("2015-01-29", "%Y-%m-%d"), by = "day"),
var1 = runif(29, min = 5, max = 10)
)
result <- data.frame(
Date = d <- seq.POSIXt(from = as.POSIXct("2015-01-01 00:00:00", "%Y-%m-%d %H:%M:%S", tz = ""),
to = as.POSIXct("2015-01-29 23:00:00", "%Y-%m-%d %H:%M:%S", tz = ""), by = "hour"),
var1 = rep(df$var1, each = 24) )
However, my data is not as easy to work with as the above. I have lots of missing dates, so I need to be able to take the specific df$Date-vector and convert it to a posixct frame, with the matching daily values.
I've looked high and low but been unable to find anything on this.
The way I went about this was to find the min and max of the dataset and deem them hour 0 and hour 23.
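# paste() (not paste0) keeps a space between the date and the time
hourly <- data.frame(Hourly=seq(min(as.POSIXct(paste(df$Date, "00:00:00"),tz="")),max(as.POSIXct(paste(df$Date, "23:00:00"),tz="")),by="hour"))
# format() takes the local calendar date, so hours near midnight are not shifted to the UTC date
hourly[,"Var1"] <- df[match(x = as.Date(format(hourly$Hourly, "%Y-%m-%d")),df$Date),"var1"]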
This achieves a result of the daily values becoming hourly with the daily var1 assigned to each hour that contains the day. In this respect missing daily values should not be an issue and if there is no match, it will add in NA's.
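If the gaps should stay gaps instead of being filled with NA rows, another option is to expand only the dates that are actually present. The following is a minimal sketch under assumptions: the data frame `df_gaps` with three missing days is made up, and UTC is used so that every day has exactly 24 hours (a zone with daylight saving time would need more care).

```r
set.seed(42)
all_days <- seq.Date(as.Date("2015-01-01"), as.Date("2015-01-29"), by = "day")

# Hypothetical daily data with three missing dates
df_gaps <- data.frame(
  Date = all_days[-c(3, 10, 11)],
  var1 = runif(26, min = 5, max = 10)
)

# Midnight of every date that is present
midnight <- as.POSIXct(paste(df_gaps$Date, "00:00:00"), tz = "UTC")

# Repeat each midnight 24 times and add 0..23 hours (in seconds)
hourly_gaps <- data.frame(
  Date = rep(midnight, each = 24) + 3600 * rep(0:23, times = nrow(df_gaps)),
  var1 = rep(df_gaps$var1, each = 24)
)

nrow(hourly_gaps)
```

The result has 24 rows per available day and no rows at all for the missing days.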
I have a data.frame containing a time series, and I would like to thin the data by keeping only the entries measured on even-numbered days. For example:
set.seed(1)
RandData <- rnorm(100,sd=20)
Locations <- rep(c('England','Wales'),each=50)
today <- Sys.Date()
dseq <- (seq(today, by = "1 days", length = 100))
Date <- as.POSIXct(dseq, format = "%Y-%m-%d")
Final <- data.frame(Loc = Locations,
Doy = as.numeric(format(Date,format = "%j")),
Temp = RandData)
So, how would I reduce this data frame to contain only the entries measured on even-numbered days, i.e. Loc, Doy and Temp for day 172, day 174 and so on?
What about:
Final[Final$Doy%%2==0,]
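A small sketch of what that filter does, using fixed dates instead of `Sys.Date()` so the result is reproducible (the dates are an assumption chosen to span a year boundary). It shows one thing worth keeping in mind: day 365 is followed by day 1, both odd, so filtering on even day-of-year is not quite the same as keeping every second day across New Year.

```r
# Eight consecutive days across the 2014/2015 boundary (hypothetical dates)
dates <- seq(as.Date("2014-12-28"), by = "1 day", length.out = 8)
Final <- data.frame(
  Date = dates,
  Doy  = as.numeric(format(dates, "%j"))
)

# Keep rows whose day-of-year is even
even <- Final[Final$Doy %% 2 == 0, ]
even$Doy
```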