Generate an xts of numerics from .csv with some characters/"#N/A" - r

I read in a CSV file exported from Excel (with a header row) and examine the result with str(returns.xts). The following code produces character values within the xts.
file <- "~/GCS/returns_Q216.csv"
returns_Q216_ <- read.csv(file=file)
returns <- read.zoo(data.frame(returns_Q216_), FUN = as.Date, format='%d/%m/%Y')
returns.xts <- as.xts(returns)
What is the best way to convert the xts contents to numeric from character whilst preserving xts (and date column)?
> `str(returns)`
An ‘xts’ object on 2007-01-31/2015-05-31 containing:
Data: `chr` [1:101, 1:18] "-0.002535663" "-0.001687755" "0.032882512" "0.024199512" "0.027812955" ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:18] "UK.EQUITY" "EUR.EQUITY" "NA.EQUITY" "ASIA.EQUITY" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
> returns[8,9]
PROPERTY
2007-08-31 "-4.25063E-05"
When I try as.numeric(returns.xts) I get a plain numeric vector: the xts structure and the date index are dropped.
> str(as.numeric(returns))
num [1:1818] -0.00254 -0.00169 0.03288 0.0242 0.02781 ...

You should use the na.strings argument to read.csv (which can be passed via read.zoo), as I said in my answer to your previous question.
file <- "~/GCS/returns_Q216.csv"
returns <- read.zoo(file, FUN=as.Date, format='%d/%m/%Y', na.strings="#N/A")
returns.xts <- as.xts(returns)
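If the character xts already exists (for example because the file was read without na.strings), one possible way to convert it while keeping the date index is to convert the underlying matrix and rebuild the object. This is only a minimal sketch: the data and the names returns.chr and returns.num are made up for illustration, and it assumes "#N/A" is the only non-numeric string present.

```r
library(xts)

# Hypothetical stand-in for an xts that ended up holding character data
dates <- as.Date(c("2007-01-31", "2007-02-28", "2007-03-31"))
chr <- matrix(c("-0.0025", "#N/A", "0.0329",
                "0.0242", "-4.25063E-05", "#N/A"),
              ncol = 2, dimnames = list(NULL, c("UK.EQUITY", "PROPERTY")))
returns.chr <- xts(chr, order.by = dates)

# Work on the plain matrix so the conversion cannot touch the index
m <- coredata(returns.chr)
m[m == "#N/A"] <- NA            # turn the Excel error marker into a real NA
storage.mode(m) <- "double"     # character -> numeric, dim and dimnames kept

# Rebuild the xts on the original index
returns.num <- xts(m, order.by = index(returns.chr))
str(returns.num)
```

Reading the file with na.strings="#N/A", as in the answer above, avoids the problem at the source and is usually the simpler route.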

Related

Converting xts objects from FRED to data.table

I have an xts object from FRED, and would like to convert it to a data.table (or a dataframe) object instead. The relevant code is:
library(data.table)
library(quantmod)
library(Quandl)
library(zoo)
library(knitr)
library(ggplot2)
dataTableTemp <- getSymbols('DJIA', src='FRED')
dataTableTemp <- as.data.table(dataTableTemp)
And this is the content of the xts object it gets:
DJIA
2007-08-08 13657.86
2007-08-09 13270.68
2007-08-10 13239.54
... ...
str(DJIA), which is the name it is given when it downloads, gives
> str(DJIA)
An ‘xts’ object on 2007-08-08/2017-08-08 containing:
Data: num [1:2610, 1] 13658 13271 13240 13237 13029 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "DJIA"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "FRED"
$ updated: POSIXct[1:1], format: "2017-08-09 09:41:49"
It goes on like that for a few thousand rows. When I convert it to a data.table with the second line of code, however, this is all that there is (in a table format):
dataTableTemp
1 DJIA
I've tried using fortify(dataTableTemp) from ggplot2, in addition to
dataTableTemp <- data.frame(date=index(dataTableTemp), coredata(dataTableTemp)), and even the tribble() method, but none of them seem to work. What should I do to convert it to a dataframe/data.table?
Any help would be appreciated. Thank you.
So that others know how this issue was solved:
getSymbols('DJIA', src='FRED')
dataTableTemp <- as.data.table(DJIA)
You can get the result you expect, if you adjust the auto.assign parameter in getSymbols:
# Note the auto.assign = FALSE parameter specification (this avoids assigning the data to DJIA in the global environment):
dataTableTemp <- getSymbols('DJIA', src='FRED', auto.assign = FALSE)
x = data.table("date" = index(dataTableTemp), coredata(dataTableTemp))
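The index/coredata approach can be tried without downloading anything. The sketch below uses a small hand-built xts (the object name djia is hypothetical; the three values are the ones printed in the question) and a base data.frame instead of a data.table, so it only needs the xts package.

```r
library(xts)

# Small stand-in for the object returned by getSymbols(..., auto.assign = FALSE)
djia <- xts(matrix(c(13657.86, 13270.68, 13239.54), ncol = 1,
                   dimnames = list(NULL, "DJIA")),
            order.by = as.Date(c("2007-08-08", "2007-08-09", "2007-08-10")))

# The dates live in the index, not in a column, so pull both parts out explicitly
df <- data.frame(date = index(djia), coredata(djia))
print(df)
```

Wrapping the same two pieces in data.table() instead of data.frame() gives the data.table form shown in the answer.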

Merge output from quantmod::getSymbols

I looked at many entries on merging R data frames, but they are not clear to me. They talk about merging/joining using a common column, but in my case that column is missing, or maybe I don't know how to extract it. Here is what I am doing.
library(quantmod)
library(xts)
start = '2001-01-01'
end = '2015-08-14'
ticker = 'AAPL'
f = getSymbols(ticker, src = 'yahoo', from = start, to = end, auto.assign=F)
rsi14 <- RSI(f$AAPL.Adjusted,14)
The output I am expecting is all the columns of f and rsi14 matched by date. However, 'date' is not available as a column, so I am not sure how to join them. I have to join a few Moving Average columns as well.
The premise of your question is wrong. getSymbols returns an xts object, not a data.frame:
R> library(quantmod)
R> f <- getSymbols("AAPL", auto.assign=FALSE)
R> str(f)
An ‘xts’ object on 2007-01-03/2015-08-14 containing:
Data: num [1:2170, 1:6] 86.3 84 85.8 86 86.5 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "AAPL.Open" "AAPL.High" "AAPL.Low" "AAPL.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2015-08-15 00:46:49"
xts objects do not have a "Date" column. They have an index attribute that holds the datetime. xts extends zoo, so please see the zoo vignettes as well as the xts vignette and FAQ for information about how to use the classes.
Merging xts objects is as simple as:
R> f <- merge(f, rsi14=RSI(Ad(f), 14))
Or you could just use $<- to add/merge a column to an existing xts object:
R> f$rsi14 <- RSI(Ad(f), 14)
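To see how merge aligns xts objects on their index, here is a minimal sketch with made-up data that needs no download. The names px and sig and all values are hypothetical; the second series deliberately covers only some of the dates.

```r
library(xts)

# Made-up price series on five consecutive dates
px <- xts(c(100, 102, 101, 105, 107), order.by = as.Date("2015-08-10") + 0:4)
colnames(px) <- "Close"

# Made-up second series that exists on only two of those dates
sig <- xts(c(1, -1), order.by = as.Date(c("2015-08-11", "2015-08-13")))
colnames(sig) <- "Signal"

# merge() joins on the time index; dates missing from one series become NA
merged <- merge(px, sig)
print(merged)
```

No date column is involved at any point: the index is the join key.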

R - quantmod, how to reference getsymbol data later in script

I'm very new to programming in R, and I am stumped on this one:
I'd like to only have to enter stock symbol data once in the script, but can't figure out how to reference, e.g., the adjusted close later on using Ad(x) without having to type the stock name again. I've tried passing a variable in as below, but I get error messages:
#get stock series data
stockPair <- c("SPY","DIA")
look_per <- "2015-01-01"
stckA <- suppressWarnings(getSymbols(stockPair[1], from = look_per))
stckB <- suppressWarnings(getSymbols(stockPair[2], from = look_per))
#get Adjusted close data
adA <- Ad(stckA )
adB <- Ad(stckB )
Error in Ad(stckA) :
subscript out of bounds: no column name containing "Adjusted"
The first thing you should do when you get an error is to look at your data. In this case, stckA and stckB are not what you think they are.
R> stckA <- suppressWarnings(getSymbols(stockPair[1], from = look_per))
R> stckB <- suppressWarnings(getSymbols(stockPair[2], from = look_per))
R> str(stckA)
chr "SPY"
R> str(stckB)
chr "DIA"
As you can see, those two objects are only character strings of the symbols returned by getSymbols, not the data. You need to set auto.assign=FALSE if you want to assign the output of getSymbols to an object.
R> stckA <- getSymbols(stockPair[1], from = look_per, auto.assign = FALSE)
R> str(Ad(stckA)) # now stckA contains data
An ‘xts’ object on 2015-01-02/2015-08-05 containing:
Data: num [1:149, 1] 204 200 198 200 204 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "SPY.Adjusted"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2015-08-05 20:02:30"
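If you prefer to keep the default auto.assign behaviour, the returned symbol name can still be used to look the data up with base R's get(). The sketch below only simulates the download with assign() and made-up values, so the numbers and the object built here are hypothetical.

```r
library(xts)

stockPair <- c("SPY", "DIA")

# Simulate what getSymbols(stockPair[1]) does by default: it creates an object
# named after the symbol and returns only the name. Values are made up.
assign(stockPair[1],
       xts(matrix(c(204, 200, 198), ncol = 1,
                  dimnames = list(NULL, "SPY.Adjusted")),
           order.by = as.Date(c("2015-01-02", "2015-01-05", "2015-01-06"))))
stckA <- stockPair[1]   # what the original code ended up holding: "SPY"

# Look the object up by its name instead of typing the symbol again
dataA <- get(stckA)
adA <- dataA[, "SPY.Adjusted"]
print(adA)
```

Setting auto.assign = FALSE, as shown above, is the more direct route; get() is mainly useful when the symbols have already been loaded into the environment.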

Change xts object date indexing

I have two data files with stock returns. I'm trying to apply the same function to both but I get an error for one of them. I wanted to find out what's causing the error, so I compared the output of str for both xts objects and the only line that differs is:
Indexed by objects of class: [POSIXct,POSIXt] TZ: # this object errors
Indexed by objects of class: [Date] TZ: GMT # this object works
Is there a way to change the indexing of the dates in an xts object so that the output of str returns: Indexed by objects of class: [Date] TZ: GMT?
I generated the dates using: seq(as.Date("1963/07/01"), as.Date("2004/12/01"), by = "1 month",tzone="GMT").
A reproducible example:
library(xts)
library("PerformanceAnalytics")
load(url("https://dl.dropboxusercontent.com/u/22681355/data.Rdata"))
data(edhec)
data2 <- as.xts(french1)
The function I want to call is Return.portfolio() with the argument rebalance_on="months"
Return.portfolio(edhec["1997",1:10],rebalance_on="months") #this works
Return.portfolio(data2["1976",1:10],rebalance_on="months") #this does not work
xts:::as.xts.data.frame by default assumes that the rownames of your data.frame should be coerced to a POSIXct object/index. If you want to use a different class, specify it via the dateFormat= argument to as.xts.
> data2 <- as.xts(french1, dateFormat="Date")
> str(data2)
An ‘xts’ object on 1963-06-30/2004-11-30 containing:
Data: num [1:498, 1:10] -0.47 4.87 -1.68 2.66 -1.13 2.83 0.79 1.85 3.08 -0.45 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:10] "NoDur" "Durbl" "Manuf" "Enrgy" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
Though I'm not convinced this is the cause of whatever error you encounter, because I do not get an error without specifying dateFormat="Date".
> data2 <- as.xts(french1)
> Return.portfolio(data2["1976",1:10],rebalance_on="months")
portfolio.returns
1976-01-31 0.3980000
1976-02-29 0.1017811
1976-03-31 1.3408273
1976-04-30 -11.7395151
1976-05-31 8.0197492
1976-06-30 -0.2550812
1976-07-31 2.5732207
1976-08-31 1.3784635
1976-09-30 -1.6859705
1976-10-31 -21.4958124
1976-11-30 5.6863828
1976-12-31 -7.8071966
Warning message:
In Return.portfolio(data2["1976", 1:10], rebalance_on = "months") :
weighting vector is null, calulating an equal weighted portfolio
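The effect of dateFormat can be checked on a tiny data.frame, and an existing POSIXct index can also be replaced directly. This is a minimal sketch: the data.frame df and the object x are made up, and x is built with an explicit UTC time zone so that the conversion to Date cannot shift the day.

```r
library(xts)

# Made-up data.frame whose row names hold the dates
df <- data.frame(NoDur = c(-0.47, 4.87),
                 row.names = c("1963-06-30", "1963-07-31"))

a <- as.xts(df)                        # default: POSIXct index
b <- as.xts(df, dateFormat = "Date")   # Date index
class(index(a))
class(index(b))

# Changing the index class of an existing object
x <- xts(1:3, order.by = as.POSIXct(c("1976-01-31", "1976-02-29", "1976-03-31"),
                                    tz = "UTC"))
index(x) <- as.Date(index(x))
class(index(x))
```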

Sorting xts data to look like panel data in R

I need to use 'PerformanceAnalytics' package of R and to use this package, it requires me to convert the data into xts data. The data can be downloaded from this link: https://drive.google.com/file/d/0B8usDJAPeV85elBmWXFwaXB4WUE/edit?usp=sharing . Hence, I have created an xts data by using the following commands:
data<-read.csv('monthly.csv')
dataxts <- xts(data[,-1],order.by=as.Date(data$datadate,format="%d/%m/%Y"))
But after doing this, it loses the panel data structure. I tried to sort the xts data to get it back into panel data form but failed.
Can anyone please help me reorganize the xts data to look like panel data? I need to sort it by firm id (gvkey) and date (datadate).
xts objects are sorted by time index only. They cannot be sorted by anything else.
I would encourage you to split your data.frame into a list, by gvkey. Then convert each list element to xts and remove the columns that do not vary across time, storing them as xtsAttributes. You might also want to consider using the yearmon class, since you're dealing with monthly data.
You will have to determine how you want to encode non-numeric, time-varying values, since you cannot mix types in xts objects.
Data <- read.csv('monthly.csv', nrows=1000, as.is=TRUE)
DataList <- split(Data, Data$gvkey)
xtsList <- lapply(DataList, function(x) {
attrCol <- c("iid","tic","cusip","conm","exchg","secstat","tpci",
"cik","fic","conml","costat","idbflag","dldte")
numCol <- c("ajexm","ajpm","cshtrm","prccm","prchm","prclm",
"trfm", "trt1m", "rawpm", "rawxm", "cmth", "cshom", "cyear")
toEncode <- c("isalrt","curcdm")
y <- xts(x[,numCol], as.Date(x$datadate,format="%d/%m/%Y"))
xtsAttributes(y) <- as.list(x[1,attrCol])
y
})
Each list element is now an xts object, and is much more compact, since you do not repeat completely redundant data. And you can easily run analysis on each gvkey via lapply and friends.
> str(xtsList[["1004"]])
An ‘xts’ object on 1983-01-31/2012-12-31 containing:
Data: num [1:360, 1:13] 3.38 3.38 3.38 3.38 3.38 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:13] "ajexm" "ajpm" "cshtrm" "prccm" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 13
$ iid : int 1
$ tic : chr "AIR"
$ cusip : int 361105
$ conm : chr "AAR CORP"
$ exchg : int 11
$ secstat: chr "A"
$ tpci : chr "0"
$ cik : int 1750
$ fic : chr "USA"
$ conml : chr "AAR Corp"
$ costat : chr "A"
$ idbflag: chr "D"
$ dldte : chr ""
And you can access the attributes via xtsAttributes:
> xtsAttributes(xtsList[["1004"]])$fic
[1] "USA"
> xtsAttributes(xtsList[["1004"]])$tic
[1] "AIR"
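The split-then-lapply pattern can be tried on a toy panel before running it on the full file. In the sketch below, the data.frame panel, the second firm (gvkey 2000, ticker "XYZ") and all prices are made up; only the column names follow the question's data.

```r
library(xts)

# Toy long-format panel: two firms, three months each (values are made up)
panel <- data.frame(
  gvkey    = rep(c(1004, 2000), each = 3),
  datadate = rep(c("31/01/1983", "28/02/1983", "31/03/1983"), times = 2),
  tic      = rep(c("AIR", "XYZ"), each = 3),
  prccm    = c(3.38, 3.50, 3.25, 10, 11, 12),
  stringsAsFactors = FALSE
)

# One xts per firm; the time-invariant ticker is stored as an xts attribute
xtsList <- lapply(split(panel, panel$gvkey), function(x) {
  y <- xts(as.matrix(x[, "prccm", drop = FALSE]),
           order.by = as.Date(x$datadate, format = "%d/%m/%Y"))
  xtsAttributes(y) <- list(tic = x$tic[1])
  y
})

# Per-firm analysis via sapply: here simply the mean price
meanPrice <- sapply(xtsList, function(y) mean(coredata(y)))
print(meanPrice)
```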
An efficient way to achieve this goal is to convert the panel data (long format) into wide format using the 'reshape2' package. After performing the estimations, convert it back to long (panel data) format. Here is an example:
library(foreign)
library(reshape2)
dd <- read.dta("DDA.dta") # DDA.dta is Stata data; keep only date, id and variable of interest (i.e. three columns in total)
wdd <- dcast(dd, datadate ~ gvkey) # gvkey is the id
require(PerformanceAnalytics)
wddxts <- xts(wdd[,-1], order.by=as.Date(wdd$datadate, format="%Y-%m-%d"))
ssd60A <- rollapply(wddxts, width=60, SemiDeviation, by.column=TRUE, fill=NA) # e.g. a rolling-window calculation
ssd60A.df <- as.data.frame(ssd60A) # convert the xts result to a data.frame
ssd60A.df$datadate <- rownames(ssd60A.df) # insert the time index as a column
lssd60A.df <- melt(ssd60A.df, id.vars='datadate', variable.name='gvkey') # convert back to long (panel) format
write.dta(lssd60A.df, "ssd60A.dta", convert.factors="string") # export as a Stata file
Then simply merge it with the master database to perform some regression.
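The long-to-wide-to-long round trip can be checked on a few rows before applying it to real data. This is a minimal sketch assuming the xts and reshape2 packages are installed; the data.frame long, the column ret and all values are made up, and multiplying by 100 merely stands in for the rolling calculation.

```r
library(xts)
library(reshape2)

# Made-up long-format panel: two firms, three month-end dates
long <- data.frame(
  datadate = rep(c("2012-10-31", "2012-11-30", "2012-12-31"), times = 2),
  gvkey    = rep(c("1004", "2000"), each = 3),
  ret      = c(0.01, -0.02, 0.03, 0.04, 0.00, -0.01),
  stringsAsFactors = FALSE
)

# Long -> wide: one column per firm, then into xts
wide <- dcast(long, datadate ~ gvkey, value.var = "ret")
wide.xts <- xts(as.matrix(wide[, -1]), order.by = as.Date(wide$datadate))

# Placeholder for the real per-firm calculation
pct <- wide.xts * 100

# Wide -> long again; check.names = FALSE keeps "1004" from becoming "X1004"
back <- data.frame(datadate = index(pct), coredata(pct), check.names = FALSE)
long2 <- melt(back, id.vars = "datadate",
              variable.name = "gvkey", value.name = "ret_pct")
print(long2)
```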
