How to extract dates from a time series and convert them to Date in R - r

I have a dataset with four variables: date, gold price, crude price, and dollar price. I converted the data to a time series using the ts() function. After the conversion, the dates were changed into numeric values. I can retrieve the dates from those values using as.Date(). Now I want to replace the numeric values with the dates themselves in the ts object's date variable.
#CONVERTING DATA FRAME TO TIME SERIES OBJECT.
Gold.ts <- ts(Gold,start=Gold$DATE[1])
head(Gold.ts)
#OUTPUT.
DATE GOLD.PRICE CRUDE DOLLAR.INR
[1,] 13152 533.9 63.42 44.705
[2,] 13153 526.3 62.79 44.600
[3,] 13154 539.7 64.21 44.320
head(as.Date(index(Gold.ts)))
[1] "2006-01-04" "2006-01-05" "2006-01-06" "2006-01-07" "2006-01-08" "2006-01-09"
Gold.ts$DATE <- as.Date(index(Gold.ts)) # This does not work because $ cannot be used to extract variables from a time series object.
index(Gold.ts) <- as.Date(index(Gold.ts)) # This should work but gives an error.
How can I display dates instead of numeric values in the time series object Gold.ts? What is the right way to do it?
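A ts object stores its time index as plain numbers, so it cannot display Date values directly. Here is a minimal base-R sketch of one workaround, assuming a small made-up stand-in for the Gold data frame: keep only the numeric columns in the series and rebuild the dates from time() when they are needed. Packages such as zoo or xts, which index by Date, are a common alternative.

```r
# Hypothetical stand-in for the Gold data frame from the question
Gold <- data.frame(
  DATE = as.Date("2006-01-04") + 0:2,
  GOLD.PRICE = c(533.9, 526.3, 539.7),
  CRUDE = c(63.42, 62.79, 64.21),
  DOLLAR.INR = c(44.705, 44.600, 44.320)
)

# Leave the DATE column out of the data and use its numeric value
# (days since 1970-01-01) as the start of the series
Gold.ts <- ts(Gold[, -1], start = as.numeric(Gold$DATE[1]))

# time() returns the numeric index; convert it back to Date when needed
dates <- as.Date(as.numeric(time(Gold.ts)), origin = "1970-01-01")
print(dates)
```

This only recovers the original dates when the observations are consecutive days, because ts() assumes a regular spacing between observations.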

Related

Time series and how I can construct a ts object

I have a dataset with 3 variables:
the first is a date (example: "01/01/2019"),
the second is an hour (example: "01:00"), and
the third is numeric.
I want to construct a ts object, but I don't know how to do this. The first and second variables are characters.
I want an hourly time series.
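One possible approach, sketched in base R under stated assumptions: the data frame and its column names (date, hour, value) are made up, the date strings are taken to be day/month/year, and the observations are taken to be consecutive hours with no gaps. The two character columns are pasted together and parsed into a POSIXct timestamp, and the numeric column goes into a ts with 24 observations per day.

```r
# Hypothetical data in the shape described in the question
df <- data.frame(
  date  = c("01/01/2019", "01/01/2019", "01/01/2019"),
  hour  = c("01:00", "02:00", "03:00"),
  value = c(10.5, 11.2, 9.8),
  stringsAsFactors = FALSE
)

# Combine date and hour, then parse (day/month/year is an assumption)
stamp <- as.POSIXct(paste(df$date, df$hour),
                    format = "%d/%m/%Y %H:%M", tz = "UTC")

# ts() keeps no timestamps, only a regular frequency: 24 observations per day
hourly <- ts(df$value, frequency = 24)
print(stamp)
print(hourly)
```

The stamp vector is kept alongside the series because a ts object records only the start and the frequency, not the actual timestamps.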

Convert characters to dates in a panel data set

I am downloading and using the following panel data,
# load / install package
library(rsdmx)
library(dplyr)
# Total
Assets.PIT <- readSDMX("http://widukind-api.cepremap.org/api/v1/sdmx/IMF/data/IFS/..Q.BFPA-BP6-USD")
Assets.PIT <- as.data.frame(Assets.PIT)
names(Assets.PIT)[10]<-"A.PI.T"
names(Assets.PIT)[6]<-"Code"
AP<-Assets.PIT[c("WIDUKIND_NAME","Code","TIME_PERIOD","A.PI.T")]
AP<-rename(AP, Country=WIDUKIND_NAME, Year=TIME_PERIOD)
My goal is to convert the column vector Year in the dataframe AP into a vector of class dates. In other words, I want R to understand the time series part of my panel data. For your information, I have quarterly data, with unbalanced date range across cross sections (in my case countries).
head(AP$Year)
[1] "2008-Q2" "2008-Q3" "2008-Q4" "2009-Q1" "2009-Q2" "2009-Q3"
Or,
AP$Year<-as.factor(AP$Year)
head(AP$Year)
[1] 2008-Q2 2008-Q3 2008-Q4 2009-Q1 2009-Q2 2009-Q3
264 Levels: 1950-Q1 1950-Q2 1950-Q3 1950-Q4 1951-Q1 1951-Q2 1951-Q3 1951-Q4 1952-Q1 1952-Q2 1952-Q3 ... 2015-Q4
Is there any easy solution to convert these character dates into time-series dates?
library(zoo)
as.Date(as.yearqtr(AP$Year, format = "%Y-Q%q"))
This should do it.
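For comparison, here is a base-R sketch of the same conversion without zoo, assuming every string has the form "YYYY-Qn" as in the sample output. It maps each quarter to the first day of its first month; the vector name quarters is made up for illustration.

```r
# Sample strings in the format shown in the question
quarters <- c("2008-Q2", "2008-Q3", "2009-Q1")

# Split each string into its year and quarter number
yr  <- as.integer(substr(quarters, 1, 4))
qtr <- as.integer(substr(quarters, 7, 7))

# Quarter n starts in month 3 * (n - 1) + 1
first_day <- as.Date(sprintf("%d-%02d-01", yr, 3L * (qtr - 1L) + 1L))
print(first_day)
# [1] "2008-04-01" "2008-07-01" "2009-01-01"
```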

R - converting dataframe to xts adds quotations (in xts)

I am using a dataframe to create an xts.
The xts gets created, but all values (except the index) are wrapped in quotation marks. As a result I cannot use the data, since many functions, such as sum, do not work.
Any ideas how I can produce an xts without the quotation marks?
Here is my code [updated following a comment about inconsistent names of dataframes/xts]:
# creates a dataframe with dates and currency
mydf3 <- data.frame(date = c("2013-12-10", "2015-04-01",
"2016-01-03"), plus = c(4, 3, 2), minus = c(-1, -2, -4))
# transforms the column date to date-format
mydf3 = transform(mydf3,date=as.Date(as.character(date),format='%Y-%m-%d'))
# creates the xts, based on the dataframe mydf3
myxts3 <- xts(mydf3, order.by = mydf3$date)
# removes the column date, since date is now stored as index in the xts
myxts3$date <- NULL
You need to realize that the underlying data structure that stores your data in the xts object is an R matrix object, which can only be of one R type (e.g. all numeric or all character). The timestamps are stored as a separate vector (your date column in this case) which is used for indexing/subsetting the data by time.
The cause of your problem is that your date column is forcing the matrix of data to convert to character type matrix (in the xts object) instead of numeric. It seems that the date class converts to character when it is included in the matrix:
> as.matrix(mydf3)
date plus minus
[1,] "2013-12-10" "4" "-1"
[2,] "2015-04-01" "3" "-2"
[3,] "2016-01-03" "2" "-4"
Any time you have non-numeric data in your data you're converting to xts (in the x argument of xts), you'll get this kind of problem.
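The coercion can be checked in base R without xts. This small sketch rebuilds the data frame from the question and compares the matrix type with and without the date column; the variable names mixed and numeric_only are made up for illustration.

```r
# The data frame from the question, with date already converted to Date
mydf3 <- data.frame(
  date  = as.Date(c("2013-12-10", "2015-04-01", "2016-01-03")),
  plus  = c(4, 3, 2),
  minus = c(-1, -2, -4)
)

# Including the Date column forces every cell to character
mixed <- as.matrix(mydf3)
print(typeof(mixed))         # "character"

# Keeping only the numeric columns gives a numeric matrix
numeric_only <- as.matrix(mydf3[, c("plus", "minus")])
print(typeof(numeric_only))  # "double"
print(sum(numeric_only[, "plus"]))  # 9
```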
Your issue can be resolved as follows (which wici has shown in the comments)
myxts3 <- xts(x= mydf3[, c("plus", "minus")], order.by = mydf3[, "date"])
> coredata(myxts3)
plus minus
[1,] 4 -1
[2,] 3 -2
[3,] 2 -4
> class(coredata(myxts3))
[1] "matrix"
The date part should be in the index, not part of the data. read.zoo will do that for you producing a zoo object. You can then convert that to xts if you need it in that form.
library(xts)
as.xts(read.zoo(mydf3))
## plus minus
## 2013-12-10 4 -1
## 2015-04-01 3 -2
## 2016-01-03 2 -4

Extract numbers from string as numeric or dates in R

I am working with some hdf5 data sets. However, the dates are stored inside the files, and the file names give no hint of them. The attribute file consists of day of the year, month of the year, day of the month and year columns.
I would like to pull out this data to create a time series identity for each of the files, i.e. a year-month-day format that can be used for time series.
A sample of the data can be downloaded here:
[ ftp://l5eil01.larc.nasa.gov/tesl1l2l3/TES/TL3COD.003/2007.08.31/TES-Aura_L3-CO_r0000006311_F01_09.he5 ]
There is an attribute group file and a data group file.
I use the R library "rhdf5" to explore the hdf5 files. For example:
CO1<-h5ls ("TES-Aura_L3-CO_r0000006311_F01_09.he5")
Attr<-h5read("TES-Aura_L3-CO_r0000006311_F01_09.he5","HDFEOS INFORMATION/coremetadata")
Data<-h5read("TES-Aura_L3-CO_r0000006311_F01_09.he5", "HDFEOS/SWATHS/ColumnAmountNO2/Data Fields/ColumnAmountNO2Trop")
The Attr, when read, consists of a long string in which the only required information is "2007-08-31", the date of acquisition. I have been able to extract this using the stringr library:
library(stringr)
regexp <- "([[:digit:]]{4})([-])([[:digit:]]{2})([-])([[:digit:]]{2})"
Date<-str_extract(Attr,pattern=regexp)
which returns the Date as:
"2007-08-31"
The only problem left is that the Date isn't recognised as numeric or as a date. How do I change this? I need to bind the Date with the data for all days to create a time series (more like an identifier, as the data sets are irregular). A sample of how it looks after extracting the dates from the strings and binding them with the CO values for each date is below:
Dates CO3b
[1,] "2011-03-01" 1.625811e+18
[2,] "2011-03-04" 1.655504e+18
[3,] "2011-03-11" 1.690428e+18
[4,] "2011-03-15" 1.679871e+18
[5,] "2011-03-17" 1.705987e+18
[6,] "2011-03-17" 1.661198e+18
[7,] "2011-03-17" 1.662694e+18
[8,] "2011-03-20" 1.520328e+18
[9,] "2011-03-21" 1.510642e+18
[10,] "2011-03-21" 1.556637e+18
However, R recognises these dates as character and not as date. I need to convert them to a time series I can work with.
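For reference, the extraction step described above can also be sketched in base R with regmatches() and regexpr(), without stringr. The metadata text here is invented for illustration and is not the real content of the file's coremetadata.

```r
# Made-up metadata text standing in for the real attribute string
attr_text <- "some metadata ... acquired on 2007-08-31 ... more metadata"

# Same idea as the pattern in the question: four digits, two digits, two digits
pattern <- "[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"

# regexpr() locates the first match; regmatches() extracts it
date_string <- regmatches(attr_text, regexpr(pattern, attr_text))
print(date_string)         # "2007-08-31"
print(class(date_string))  # "character"
```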
Seems like you've already done all the hard work! Here's how you could take it across the finish line.
From your comment, it seems you have the strings in a good format. Given that your variable is named Date, simply run
dateObjects<-as.Date(Date) #where Date is your variable
and either the single value or the vector of character strings (in the format you gave in the comment) will now be Date objects, which you could use with a library like zoo to create time series.
If your strings are not necessarily in the format you've described, refer to the following link to see how to parse other string forms as dates.
http://www.statmethods.net/input/dates.html
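As a brief illustration of the format argument, here is a sketch with made-up input strings: as.Date() parses "YYYY-MM-DD" by default, and other layouts need an explicit format.

```r
# ISO-style strings need no format argument
iso <- as.Date("2007-08-31")

# Other layouts need an explicit format (month/day/year assumed here)
us <- as.Date("08/31/2007", format = "%m/%d/%Y")

print(class(iso))  # "Date"
print(iso == us)   # TRUE
print(us - as.Date("2007-08-01"))  # Time difference of 30 days
```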
Given your example data frame you can create a time series in the following way, using the package zoo.
library(zoo)
datavect<-as.zoo(df$CO3b)
index(datavect)<-as.Date(df$Dates) # the date column is named Dates in your sample
Here we take your CO data, convert it to a zoo object, then assign the appropriate date to each entry, converting it from a character to a Date object. Now if you print datavect, you'll see each data entry attached to a date. This allows you to take advantage of zoo methods, such as merge and window.
Here is one approach not using string extraction. If you know how long your time series should be, which you should based on the length of your dataset and knowledge of its periodicity, you could just create a regular date series and then add that into a data.frame with other variables of interest. Assuming you have daily data the below would work. Obviously your length.out would be different.
d1 <- ISOdate(year=2007,month=8,day=31)
d2 <- as.Date(format(seq(from=d1,by="day",length.out=10),"%Y-%m-%d"))
[1] "2007-08-31" "2007-09-01" "2007-09-02" "2007-09-03" "2007-09-04" "2007-09-05" "2007-09-06" "2007-09-07" "2007-09-08" "2007-09-09"
class(d2)
[1] "Date"
Edit of Original:
Oh, I see. After reading in your new data example, the code below worked for me. It was a pretty straightforward transform. Cheers.
library(magrittr) # Provides the pipe operator %>%, which makes it easy to string steps together.
dateData
Dates CO3b
1 2011-03-01 1.63e+18
2 2011-03-04 1.66e+18
3 2011-03-11 1.69e+18
4 2011-03-15 1.68e+18
5 2011-03-17 1.71e+18
6 2011-03-17 1.66e+18
7 2011-03-17 1.66e+18
8 2011-03-20 1.52e+18
9 2011-03-21 1.51e+18
10 2011-03-21 1.56e+18
dateData %>% sapply(class) # classes before transforming (character,numeric)
dateData[,1] <- as.Date(dateData[,1]) # Transform to date
dateData %>% sapply(class) # classes after transforming (Date,numeric)
str(dateData) # one more check
'data.frame': 10 obs. of 2 variables:
$ Dates: Date, format: "2011-03-01" "2011-03-04" "2011-03-11" "2011-03-15" ...
$ CO3b : num 1.63e+18 1.66e+18 1.69e+18 1.68e+18 1.71e+18 ...

Converting hour, minutes columns in dataframe to time format

I have a dataframe containing 3 columns including minutes and hours.
I want to convert these columns (namely Hour and Min) to a time format. Given the data in drame:
Score Hour Min
10 10 56
23 17 01
I would like to get:
Score Time
10 10:56:00
23 17:01:00
You could use ISOdatetime to convert the numbers in the Hour and Min columns to a POSIXct object. However, a POSIXct object is only defined when it also includes a year, month and day. So, depending on your needs, you can do several things:
If you need a real time object that is printed correctly in graphs, for example, and can be used in arithmetic (addition, subtraction), you need to use ISOdatetime. ISOdatetime returns a so-called POSIXct object, which is an R object that represents time. In ISOdatetime you then just use fixed values for year, month, and day. This of course only works if your dataset does not span multiple years.
If you just need a character column Time, you can convert the POSIXct output to a string using strftime, setting the format argument to "%H:%M:00". In this case, however, you could also use sprintf to create the new character column without converting to POSIXct: sprintf("%s:%s:00", drame$Hour, drame$Min) (use "%02d" in place of "%s" if the columns are numeric and you need zero-padded values).
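Both options can be sketched as follows, assuming Hour and Min are numeric columns and using an arbitrary placeholder date (2000-01-01) for the year, month and day that ISOdatetime requires.

```r
# The example data from the question
drame <- data.frame(Score = c(10, 23), Hour = c(10, 17), Min = c(56, 1))

# Option 1: a real time object, with a fixed placeholder date
stamp <- ISOdatetime(2000, 1, 1, drame$Hour, drame$Min, 0, tz = "UTC")

# Format the time part only, as a character column
drame$Time <- strftime(stamp, format = "%H:%M:%S", tz = "UTC")

# Option 2: build the string directly, zero-padding each part
time_text <- sprintf("%02d:%02d:00", as.integer(drame$Hour), as.integer(drame$Min))

print(drame[, c("Score", "Time")])
print(time_text)  # "10:56:00" "17:01:00"
```

Setting tz explicitly in both calls keeps the printed hours from shifting with the local time zone.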
You can use the paste() function to merge the two columns into a character vector and then use strptime() to convert it to a timestamp.
x<-1:6
##1 2 3 4 5 6
y<-8:13
## 8 9 10 11 12 13
timestamp <- paste(x,":",y,":00",sep="")
timestamp
will result in
#1:8:00 2:9:00 3:10:00 4:11:00 5:12:00 6:13:00
If you prefer to convert this to a timestamp object, try using
strptime(timestamp,"%H:%M:%S")
## uses current date by default
If you happen to have the date in another column, use paste() to build a character date-time string (mergedData below) and use the following to get a date-time:
##strptime(mergedData,"%d/%m/%Y %H:%M:%S")
