get monthly date in time series object in R as a POSIXlt

I have a time series object with the following dates:
my_data = rnorm(155)
my_data_ts <- ts(my_data, start = c(2002, 10), frequency = 12)
How do I get the date as a POSIXlt object, and the values?
my_data_date = STHGTOCONVERTTOPOSIXLT(my_data)???
my_data_values = STHGTOGETVALUES(my_data)???

To get the first day of the month, you can use:
tt <- time(my_data_ts)           # fractional years, e.g. 2002.75 for Oct 2002
yr <- floor(tt)                  # calendar year
mo <- round(12 * (tt - yr)) + 1  # month number, 1 to 12
as.POSIXlt(paste0(yr, "-", mo, "-01"), tz = "UTC")
For the values, just use:
as.vector(my_data_ts)
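A possible alternative that avoids the arithmetic on time(): a minimal sketch, assuming the series is monthly (frequency = 12), that reads the first period from start() and steps forward one calendar month at a time with seq(). The variable names follow the question; first_day and st are made up for the example.

```r
set.seed(1)
my_data <- rnorm(155)
my_data_ts <- ts(my_data, start = c(2002, 10), frequency = 12)

# start() returns c(year, month) for a monthly series
st <- start(my_data_ts)
first_day <- as.Date(sprintf("%d-%02d-01", st[1], st[2]))

# One date per observation, stepping a calendar month at a time
my_data_date <- as.POSIXlt(
  seq(first_day, by = "month", length.out = length(my_data_ts)),
  tz = "UTC"
)
my_data_values <- as.vector(my_data_ts)

format(my_data_date[1], "%Y-%m-%d")  # "2002-10-01"
```

This only works for a monthly series; other frequencies would need a different by = step.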

Related

Transforming data frame to time series in r

I am currently trying to convert a data.frame to a time series. The data frame looks like this:
All I want to do is be able to plot the doc data as a function of time and run a statistical test on it.
Any help would be greatly appreciated!
This is what my code currently looks like:
x=aggregate( doc ~ mo + yr , B , mean )
x$Date <- as.yearmon(paste(x$yr, x$mo), "%Y %m")
df_ts <- xts(x, order.by = x$Date)
keeps <- "doc"
df_ts <- df_ts[ , keeps, drop = FALSE]
df_ts_1 <- as.ts(df_ts, start = head(index(df_ts), 1), end = tail(index(df_ts), 1))
The issue I'm running into is that the months and years are not in sequential order, so when I try to apply the as.ts function, the data does not fill in correctly.
Using DF, defined reproducibly in the Note at the end, read the data frame into a zoo object, converting the Date column to yearmon class, and plot it using ggplot2. If there are duplicate dates (there are none in the example data), add the aggregate = mean argument to read.zoo.
library(ggplot2)
library(zoo)
z <- read.zoo(DF[c("Date", "doc")], FUN = as.yearmon, format = "%b %Y")
autoplot(z) + scale_x_yearmon()
This would also work:
tt <- as.ts(z)
plot(na.approx(tt), ylab = "tt")
Note
In the future please do not use images. I have retyped the first three rows this time.
DF <- data.frame(month = c("02", "10", "12"), year = c(1998, 2000, 2000),
doc = c(1.55, 2.2, 0.96), Date = c("Feb 1998", "Oct 2000", "Dec 2000"),
stringsAsFactors = FALSE)
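For comparison, here is a base-R-only sketch of the gap-filling step, assuming the month and year columns from the Note and no duplicate months (aggregate first if there are any). The names grid, full and doc_ts are made up for the example.

```r
DF <- data.frame(month = c("02", "10", "12"), year = c(1998, 2000, 2000),
                 doc = c(1.55, 2.2, 0.96), stringsAsFactors = FALSE)

# First day of each observed month as a Date
DF$Date <- as.Date(paste(DF$year, DF$month, "01", sep = "-"))

# Every month between the first and last observation
grid <- data.frame(Date = seq(min(DF$Date), max(DF$Date), by = "month"))

# Left join: months without data get NA; the result is sorted by Date
full <- merge(grid, DF[c("Date", "doc")], by = "Date", all.x = TRUE)

# Regular monthly series starting at the first observed month
first_year  <- as.integer(format(min(DF$Date), "%Y"))
first_month <- as.integer(format(min(DF$Date), "%m"))
doc_ts <- ts(full$doc, start = c(first_year, first_month), frequency = 12)
```

The resulting ts has one slot per month, with NA wherever no measurement exists, so it can be plotted or interpolated afterwards.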

"Week-Year" string to a meaningful date or numeric format that can be plotted

I have a set of data that includes dates in the "week-year" format. For example, "30-2010" represents the 30th week of 2010. I'm trying to plot the data, but I need to convert the date values to a date format or a numeric value so that ggplot2 will use them as the labels on my x axis. Any ideas on how this can be done?
dte = "30-2010"
Check out ?week in the lubridate package. This seems to do what you want:
library(lubridate)
str <- "30-2010"
wk <- substr(str, 1, 2)
yr <- substr(str, 4, 7)
dt <- as.Date(paste0(yr, "-01-01"))
week(dt) <- as.numeric(wk)
dt
[1] "2010-07-23"
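If you would rather not depend on lubridate, the same arithmetic can be sketched in base R. This assumes the same week convention as above (week 1 starts on January 1, each week is 7 days) and that the input is always "week-year"; week_year_to_date is a made-up helper name.

```r
# Convert "week-year" strings to the Date on which that week starts,
# counting week 1 as the 7 days beginning on January 1.
week_year_to_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  wk <- as.integer(vapply(parts, `[`, character(1), 1))
  yr <- vapply(parts, `[`, character(1), 2)
  as.Date(paste0(yr, "-01-01")) + 7 * (wk - 1)
}

week_year_to_date(c("30-2010", "1-2011"))
# "2010-07-23" "2011-01-01"
```

Splitting on the hyphen instead of using fixed substr() positions also handles single-digit weeks such as "1-2011". The result is a Date vector, which ggplot2 can put on the x axis directly.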

Convert daily (date) format to hourly (posixct)

I have a data frame with daily data. I need to bind it to hourly data, but first I need to convert it to a suitable POSIXct format. It looks like this:
set.seed(42)
df <- data.frame(
Date = seq.Date(from = as.Date("2015-01-01", "%Y-%m-%d"), to = as.Date("2015-01-29", "%Y-%m-%d"), by = "day"),
var1 = runif(29, min = 5, max = 10)
)
result <- data.frame(
Date = d <- seq.POSIXt(from = as.POSIXct("2015-01-01 00:00:00", "%Y-%m-%d %H:%M:%S", tz = ""),
to = as.POSIXct("2015-01-29 23:00:00", "%Y-%m-%d %H:%M:%S", tz = ""), by = "hour"),
var1 = rep(df$var1, each = 24) )
However, my data is not as easy to work with as the above. I have lots of missing dates, so I need to be able to take the specific df$Date vector and convert it to an hourly POSIXct data frame with the matching daily values.
I've looked high and low but been unable to find anything on this.
The way I went about this was to find the min and max of the dataset and deem them hour 0 and hour 23.
# paste() puts a space between the date and the time before parsing
hourly <- data.frame(Hourly = seq(min(as.POSIXct(paste(df$Date, "00:00:00"), tz = "")), max(as.POSIXct(paste(df$Date, "23:00:00"), tz = "")), by = "hour"))
# format() takes the calendar date in the series' own time zone before matching
hourly[, "Var1"] <- df[match(as.Date(format(hourly$Hourly, "%Y-%m-%d")), df$Date), "var1"]
This makes the daily values hourly: each hour gets the var1 value of the day it falls in. Missing days are not a problem; hours with no matching day get NA.
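A small self-contained sketch of the same matching idea on data with a gap, in case it helps to see the NA behaviour. The three-row df here is made up, and everything is pinned to UTC (an assumption) so that the date of each hour does not depend on the local time zone.

```r
# Daily data with two missing days (Jan 3 and Jan 4)
df <- data.frame(
  Date = as.Date(c("2015-01-01", "2015-01-02", "2015-01-05")),
  var1 = c(5.5, 7.1, 9.3)
)

# Hourly grid from hour 0 of the first day to hour 23 of the last day
hourly <- data.frame(
  Hourly = seq(as.POSIXct(paste(min(df$Date), "00:00:00"), tz = "UTC"),
               as.POSIXct(paste(max(df$Date), "23:00:00"), tz = "UTC"),
               by = "hour")
)

# Look up each hour's calendar day in the daily data; no match gives NA
hourly$var1 <- df$var1[match(as.Date(hourly$Hourly, tz = "UTC"), df$Date)]
```

Here the grid has 5 days of 24 hours, and the 48 hours that fall on the two missing days are NA.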

Changing time zones with POSIXct time series, R

I've run into trouble working with time series and time zones in R, and I can't quite figure out how to proceed.
I have time series data like this:
df <- data.frame(
Date = seq(as.POSIXct("2014-01-01 00:00:00"), length.out = 1000, by = "hours"),
price = runif(1000, min = -10, max = 125),
wind = runif(1000, min = 0, max = 2500),
temp = runif(1000, min = - 10, max = 25)
)
Now, the Date is in UTC-time. I would like to subset/filter the data, so for example I get the values from today (Today is 2014-05-13):
df[ as.Date(df$Date) == Sys.Date(), ]
However, when I do this, I get data that starts with:
2014-05-13 02:00:00
And not:
2014-05-13 00:00:00
Because I'm currently in CEST, which is two hours ahead of UTC. So I tried to change the data:
df$Date <- as.POSIXct(df$Date, format = "%Y-%m-%d %H", tz = "Europe/Berlin")
Yet this doesn't work. I've tried several variations, such as converting to character and back, but I keep running into a wall, and I'm guessing there is something simple I'm missing.
To avoid time zone issues like this, use format to get the character representation of the date:
df[format(df$Date,"%Y-%m-%d") == Sys.Date(), ]
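To illustrate why the date can shift, here is a minimal sketch with two made-up timestamps. It assumes the times are stored in UTC and that "Europe/Berlin" is the zone you want the calendar date in; both as.Date() and format() accept a tz argument for POSIXct input.

```r
x <- as.POSIXct(c("2014-05-12 22:30:00", "2014-05-13 10:00:00"), tz = "UTC")

# Calendar date in UTC: the first timestamp is still May 12
as.Date(x, tz = "UTC")

# Calendar date in Berlin (CEST, UTC+2 in May): 22:30 UTC is already 00:30 on May 13
as.Date(x, tz = "Europe/Berlin")

# The same thing as character strings
format(x, "%Y-%m-%d", tz = "Europe/Berlin")

# Hypothetical filter for a given local day
x[as.Date(x, tz = "Europe/Berlin") == as.Date("2014-05-13")]
```

Passing tz explicitly makes the subset independent of the time zone of the machine the code runs on.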

select even numbered days from data frame

I have a data.frame containing a time series, and I would like to thin the data by keeping only the entries measured on even-numbered days. For example:
set.seed(1)
RandData <- rnorm(100,sd=20)
Locations <- rep(c('England','Wales'),each=50)
today <- Sys.Date()
dseq <- (seq(today, by = "1 days", length = 100))
Date <- as.POSIXct(dseq, format = "%Y-%m-%d")
Final <- data.frame(Loc = Locations,
Doy = as.numeric(format(Date,format = "%j")),
Temp = RandData)
So, how would I reduce this data frame to only the entries measured on even-numbered days, such as Loc, Doy, and Temp on day 172, day 174, and so on?
What about:
Final[Final$Doy%%2==0,]
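One thing to watch if the series crosses a year boundary: day-of-year parity is not the same as "every second day", because day 365 is followed by day 1. A small sketch with made-up dates, assuming what you actually want is an even spacing; it uses the number of days since 1970-01-01 (which is what a Date stores) instead of Doy.

```r
dates <- seq(as.Date("2023-12-29"), by = "day", length.out = 6)

# Day of year: 363 364 365 1 2 3
doy <- as.numeric(format(dates, "%j"))
dates[doy %% 2 == 0]
# "2023-12-30" "2024-01-02"   (a 3-day gap across New Year)

# Days since the epoch keep alternating across the year boundary
day_index <- as.integer(dates)
kept <- dates[day_index %% 2 == 0]
diff(kept)  # always 2 days apart
```

If the data stay within a single year, the Doy version above gives the same spacing.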
