Selecting from xts by column name - r

I'm trying to operate on a specific column in an xts object by name within a function but I keep getting an error:
Error in if (length(c(year, month, day, hour, min, sec)) == 6 && all(c(year, :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In as_numeric(YYYY) : NAs introduced by coercion
2: In as_numeric(YYYY) : NAs introduced by coercion
If I have an xts object:
xts1 <- xts(x=1:10, order.by=Sys.Date()-1:10)
xts2 <- xts(x=1:10, order.by=Sys.Date()+1:10)
xts3 <- merge(xts1, xts2)
Then I can select a specific column with:
xts3$xts1
With a data frame, I can pass the object to another function and then select a specific column by name with:
xts3['xts1']
But if I try to do the same thing with an xts object I get the error above. e.g.
testfun <- function(xts_data){
print(xts_data['xts1'])
}
Called with:
testfun(xts3)
This works:
testfun <- function(xts_data){
print(xts_data[,1])
}
But I'd really like to select by name as I can't be certain of the column order.
Can anyone suggest how to solve this?
Thanks!

xts objects have class c("xts", "zoo"), which means they are matrices with special attributes that are assigned by their creation functions. Although $ will not succeed with a matrix, it works with xts and zoo objects thanks to the $.zoo method. (It's also not recommended to use $ inside functions because of the potential for confusion in name evaluation and for partial name matching.) See ?xts, and use str to examine the sample.xts object created in its first example:
> ?xts
starting httpd help server ... done
> data(sample_matrix)
> sample.xts <- as.xts(sample_matrix, descr='my new xts object')
>
> str(sample.xts)
An ‘xts’ object on 2007-01-02/2007-06-30 containing:
Data: num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "Open" "High" "Low" "Close"
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
List of 1
$ descr: chr "my new xts object"
class(sample.xts)
# [1] "xts" "zoo"
This explains why the earlier answer advising the use of xts3[ , "xts1"] or, equivalently, xts3[ , 1] should succeed. The [.xts function extracts the "Data" element first and then returns the column specified by the j argument, either by name or by number.
str(xts3)
An ‘xts’ object on 2018-05-24/2018-06-13 containing:
Data: int [1:20, 1:2] 10 9 8 7 6 5 4 3 2 1 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "xts1" "xts2"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
> xts3[ , "xts1"]
xts1
2018-05-24 10
2018-05-25 9
2018-05-26 8
2018-05-27 7
2018-05-28 6
2018-05-29 5
2018-05-30 4
2018-05-31 3
2018-06-01 2
2018-06-02 1
2018-06-04 NA
2018-06-05 NA
2018-06-06 NA
2018-06-07 NA
2018-06-08 NA
2018-06-09 NA
2018-06-10 NA
2018-06-11 NA
2018-06-12 NA
2018-06-13 NA
The merge.xts operation might not have delivered what you expected since the date ranges didn't overlap. It seems possible that you wanted:
> xts4 <- rbind(xts1, xts2)
> str(xts4)
An ‘xts’ object on 2018-05-24/2018-06-13 containing:
Data: int [1:20, 1] 10 9 8 7 6 5 4 3 2 1 ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
Note that the rbind.xts operation did not deliver an object with a shared column name, so numeric access would be needed. (I would have expected a named "Data" element, but we would need to read ?rbind.xts to be sure.)
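If the missing column name is a problem, one option is to assign a name after the rbind. This is only a minimal sketch: the fixed anchor date (used in place of Sys.Date()) and the column name "value" are assumptions made for illustration.

```r
library(xts)

# Fixed anchor date (an assumption, used instead of Sys.Date() for reproducibility)
d <- as.Date("2018-06-03")
xts1 <- xts(x = 1:10, order.by = d - 1:10)
xts2 <- xts(x = 1:10, order.by = d + 1:10)

# Stack the two series in time order, then name the single column
xts4 <- rbind(xts1, xts2)
colnames(xts4) <- "value"   # "value" is a hypothetical name

# The column can now be selected by name through the j argument
by_name <- xts4[, "value"]
```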

Type ?`[.xts` and you'll see that the function has an i and a j argument (among others).
i - the rows to extract. Numeric, timeBased or ISO-8601 style (see details)
j - the columns to extract, numeric or by name
You passed 'xts1' as the i argument, while it should be j. So your function should be
testfun <- function(xts_data){
print(xts_data[, 'xts1']) # or xts_data[j = 'xts1']
}
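Selecting through the j argument can be wrapped in a small helper that also checks that the column exists. The sketch below assumes fixed dates instead of Sys.Date(), and select_col is a hypothetical helper name, not part of xts.

```r
library(xts)

# Data mirroring the question's xts3, with a fixed anchor date (an assumption)
d <- as.Date("2018-06-03")
xts1 <- xts(x = 1:10, order.by = d - 1:10)
xts2 <- xts(x = 1:10, order.by = d + 1:10)
xts3 <- merge(xts1, xts2)

# select_col() is a hypothetical helper: it fails early on an unknown name
# and passes the name as j (columns), never as i (rows/dates)
select_col <- function(xts_data, col) {
  stopifnot(col %in% colnames(xts_data))
  xts_data[, col]
}

res <- select_col(xts3, "xts1")
```

Because the lookup is by name, the result does not depend on the column order of the object passed in.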

Related

Using the subsetting operator :: in quantmod with variables

How do I use user-initialized date variables as the start and end values of the subset operator :: from the R package quantmod?
For example, when I apply user initialized date variables,
end.date <- Sys.Date()
start.date <- end.date - 5*365 #5- years to-date
start.date.char <- as.character(start.date)
end.date.char <- as.character(end.date)
to get 5-years of stock data
library(quantmod)
getSymbols("GILD",src="yahoo")
GILD.5YTD <- GILD['start.date.char::end.date.char']
I get the following error:
Error in if (length(c(year, month, day, hour, min, sec))
== 6 && c(year, :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In as_numeric(YYYY) : NAs introduced by coercion
2: In as_numeric(MM) : NAs introduced by coercion
3: In as_numeric(DD) : NAs introduced by coercion
4: In as_numeric(YYYY) : NAs introduced by coercion
5: In as_numeric(MM) : NAs introduced by coercion
6: In as_numeric(DD) : NAs introduced by coercion
I'm sure this is a basic question, but I'm a newbie.
There are convenient high-level functions to subset an xts object as returned, e.g., by quantmod's getSymbols().
For a time-based subset, the last() function from the xts package (automatically loaded by quantmod) is quite handy:
library(quantmod)
getSymbols("GILD",src="yahoo")
GILD_last5Years <- last(GILD, "5 years")
#> head(GILD_last5Years)
# GILD.Open GILD.High GILD.Low GILD.Close GILD.Volume GILD.Adjusted
#2012-01-03 41.46 41.99 41.35 41.86 19564000 20.46895
#2012-01-04 41.95 42.06 41.70 42.02 16236000 20.54719
#2012-01-05 42.04 42.97 42.00 42.52 18431800 20.79168
#2012-01-06 42.38 43.10 42.20 42.78 15542000 20.91882
#2012-01-09 42.49 42.99 42.35 42.73 16801200 20.89437
#2012-01-10 43.10 45.04 42.94 44.25 30110000 21.63763
This can be combined with an equivalent function first() to select a specific time span within the series.
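As a rough illustration of combining last() and first(), here is a sketch on a small synthetic daily series, so that no download is needed. The series and the chosen periods are assumptions made for the example.

```r
library(xts)

# Synthetic daily series for January 2016 (made-up data)
idx <- seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day")
x <- xts(seq_along(idx), order.by = idx)

# Keep the last 10 days of the series ...
recent <- last(x, "10 days")

# ... and then the first 3 days of that tail
span <- first(recent, "3 days")
```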
Your current argument to [.xts is just the character value 'start.date.char::end.date.char' and would not be evaluated further, since R is not a macro language. Try instead to build the desired character value, which I believe is: "2011-08-28::2016-08-26". So this succeeds:
GILD.5YTD<-GILD[paste(start.date.char, end.date.char, sep="::")]
str(GILD.5YTD)
#-------
An ‘xts’ object on 2011-08-29/2016-08-25 containing:
Data: num [1:1257, 1:6] 39 39.7 40.2 39.8 39 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "GILD.Open" "GILD.High" "GILD.Low" "GILD.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2016-08-26 17:00:52"
So technically the :: is not acting as an R operator here; it is part of a character string that is parsed by the [.xts function. (The quantmod package is built on top of the xts package.) As an R operator, :: is for package-qualified access to the exported functions of installed packages.
The reason for your error is that you are putting the variable names inside a string, where they are never evaluated. (By the way, you do not have to convert the dates with as.character as in your example, since pasting will do that for you.) Using paste0 like so will subset your data accordingly:
GILD.5YTD<-GILD[paste0(start.date.char,'::',end.date.char)]
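The same idea can be tried without downloading anything. The sketch below builds the range string from two Date variables and applies it to a synthetic series; the data and dates are assumptions chosen for illustration.

```r
library(xts)

# Synthetic daily series standing in for the downloaded prices (made-up data)
idx <- seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day")
x <- xts(seq_along(idx), order.by = idx)

start.date <- as.Date("2016-01-10")
end.date   <- as.Date("2016-01-15")

# paste() converts the Dates to character, giving "2016-01-10::2016-01-15"
rng <- paste(start.date, end.date, sep = "::")

# The string is parsed by [.xts as a date range, inclusive at both ends
sub <- x[rng]
```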

Merge output from quantmod::getSymbols

I looked at many entries on merging R data frames, but they are not clear to me. They talk about merging/joining on a common column, but in my case that column is missing, or maybe I don't know how to extract it. Here is what I am doing.
library(quantmod)
library(xts)
start = '2001-01-01'
end = '2015-08-14'
ticker = 'AAPL'
f = getSymbols(ticker, src = 'yahoo', from = start, to = end, auto.assign=F)
rsi14 <- RSI(f$AAPL.Adjusted,14)
The output I am expecting is all the columns of f and rsi14 matched by date. However, 'date' is not available as a column, so I am not sure how to join them. I have to join a few Moving Average columns as well.
The premise of your question is wrong. getSymbols returns an xts object, not a data.frame:
R> library(quantmod)
R> f <- getSymbols("AAPL", auto.assign=FALSE)
R> str(f)
An ‘xts’ object on 2007-01-03/2015-08-14 containing:
Data: num [1:2170, 1:6] 86.3 84 85.8 86 86.5 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "AAPL.Open" "AAPL.High" "AAPL.Low" "AAPL.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2015-08-15 00:46:49"
xts objects do not have a "Date" column. They have an index attribute that holds the datetime. xts extends zoo, so please see the zoo vignettes as well as the xts vignette and FAQ for information about how to use the classes.
Merging xts objects is as simple as:
R> f <- merge(f, rsi14=RSI(Ad(f), 14))
Or you could just use $<- to add/merge a column to an existing xts object:
R> f$rsi14 <- RSI(Ad(f), 14)
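Both approaches can be sketched on a tiny synthetic series, with a simple day-over-day difference standing in for the RSI. The data and the column names px, chg and chg2 are assumptions made for the example.

```r
library(xts)

# Made-up price series with a Date index
idx <- as.Date("2015-08-10") + 0:4
f <- xts(c(10, 11, 12, 13, 14), order.by = idx)
colnames(f) <- "px"

# A derived series on the same index (the first value is NA)
chg <- diff(f[, "px"])
colnames(chg) <- "chg"

# merge() aligns the two objects on their index; no date column is needed
merged <- merge(f, chg)

# Alternatively, add the derived series as a new column with $<-
f$chg2 <- diff(f[, "px"])
```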

Change xts object date indexing

I have two data files with stock returns. I'm trying to apply the same function to both but I get an error for one of them. I wanted to find out what's causing the error, so I compared the output of str for both xts objects and the only line that differs is:
Indexed by objects of class: [POSIXct,POSIXt] TZ: # this object errors
Indexed by objects of class: [Date] TZ: GMT # this object works
Is there a way to change the indexing of the dates in an xts object so that the output of str returns: Indexed by objects of class: [Date] TZ: GMT?
I generated the dates using: seq(as.Date("1963/07/01"), as.Date("2004/12/01"), by = "1 month",tzone="GMT").
A reproducible example:
library(xts)
library("PerformanceAnalytics")
load(url("https://dl.dropboxusercontent.com/u/22681355/data.Rdata")) # load() needs a connection, not a bare URL string
data(edhec)
data2 <- as.xts(french1)
The function I want to call is Return.portfolio() with the argument rebalance_on="months"
Return.portfolio(edhec["1997",1:10],rebalance_on="months") #this works
Return.portfolio(data2["1976",1:10],rebalance_on="months") #this does not work
xts:::as.xts.data.frame by default assumes that the rownames of your data.frame should be coerced to a POSIXct object/index. If you want to use a different class, specify it via the dateFormat= argument to as.xts.
> data2 <- as.xts(french1, dateFormat="Date")
> str(data2)
An ‘xts’ object on 1963-06-30/2004-11-30 containing:
Data: num [1:498, 1:10] -0.47 4.87 -1.68 2.66 -1.13 2.83 0.79 1.85 3.08 -0.45 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:10] "NoDur" "Durbl" "Manuf" "Enrgy" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
Though I'm not convinced this is the cause of whatever error you encounter, because I do not get an error without specifying dateFormat="Date".
> data2 <- as.xts(french1)
> Return.portfolio(data2["1976",1:10],rebalance_on="months")
portfolio.returns
1976-01-31 0.3980000
1976-02-29 0.1017811
1976-03-31 1.3408273
1976-04-30 -11.7395151
1976-05-31 8.0197492
1976-06-30 -0.2550812
1976-07-31 2.5732207
1976-08-31 1.3784635
1976-09-30 -1.6859705
1976-10-31 -21.4958124
1976-11-30 5.6863828
1976-12-31 -7.8071966
Warning message:
In Return.portfolio(data2["1976", 1:10], rebalance_on = "months") :
weighting vector is null, calulating an equal weighted portfolio
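The effect of the dateFormat= argument can be checked on a toy data frame. The sketch below uses made-up data with date strings as row names; it is not the french1 data from the question.

```r
library(xts)

# Toy data frame with dates as row names (made-up values)
df <- data.frame(a = 1:3,
                 row.names = c("2004-10-01", "2004-11-01", "2004-12-01"))

# Default conversion: the row names become a POSIXct index
x_posix <- as.xts(df)

# With dateFormat = "Date", the row names become a Date index
x_date <- as.xts(df, dateFormat = "Date")

class(index(x_posix))
class(index(x_date))
```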

R from character to numeric

I have this csv file (fm.file):
Date,FM1,FM2
28/02/2011,14.571611,11.469457
01/03/2011,14.572203,11.457512
02/03/2011,14.574798,11.487183
03/03/2011,14.575558,11.487802
04/03/2011,14.576863,11.490246
And so on.
I run these commands:
fm.data <- as.xts(read.zoo(file=fm.file,format='%d/%m/%Y',tz='',header=TRUE,sep=','))
is.character(fm.data)
And I get the following:
[1] TRUE
How do I get fm.data to be numeric without losing its date index? I want to perform some statistical operations that require the data to be numeric.
I was puzzled by two things: it didn't seem that read.zoo should give you a character matrix, and it didn't seem that changing its class would affect the index values, since the data type should be separate from the indices. So I tried to replicate the problem and got a different result:
txt <- "Date,FM1,FM2
28/02/2011,14.571611,11.469457
01/03/2011,14.572203,11.457512
02/03/2011,14.574798,11.487183
03/03/2011,14.575558,11.487802
04/03/2011,14.576863,11.490246"
require(xts)
fm.data <- as.xts(read.zoo(file=textConnection(txt),format='%d/%m/%Y',tz='',header=TRUE,sep=','))
is.character(fm.data)
#[1] FALSE
str(fm.data)
#-------------
An ‘xts’ object from 2011-02-28 to 2011-03-04 containing:
Data: num [1:5, 1:2] 14.6 14.6 14.6 14.6 14.6 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "FM1" "FM2"
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
List of 2
$ tclass: chr [1:2] "POSIXct" "POSIXt"
$ tzone : chr ""
zoo- and xts-objects have their data in a matrix accessed with coredata and their indices are a separate set of attributes.
I think the problem is that you have some dirty data in your csv file. In other words, the FM1 or FM2 column contains a character, somewhere, that stops it from being interpreted as a numeric column. When that happens, xts (which is a matrix underneath) will force the whole thing to character type.
Here is one way to use R to find suspicious data:
s <- scan(fm.file, what="character", sep="\n")
# s is now a vector of character strings, one entry per line
s <- s[-1] # Chop off the header row
# Allowed characters: digits, minus, comma, dot, and the slash used in the dates
all(grepl('^[-0-9,./]*$', s, perl=TRUE)) # TRUE means all your data is clean
s[ !grepl('^[-0-9,./]*$', s, perl=TRUE) ]
which( !grepl('^[-0-9,./]*$', s, perl=TRUE) ) + 1
The second-to-last line prints out all the csv rows that contain characters you did not expect. The last line tells you which rows in the file they are (+1 because we removed the header row).
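The coercion described above can be reproduced on a small example, which also shows one way to locate the offending cell and rebuild a numeric object with the same index. The data, including the bad entry "n/a", is made up for illustration.

```r
library(xts)

# Made-up data: one non-numeric entry forces the whole matrix to character
idx <- as.Date(c("2011-02-28", "2011-03-01", "2011-03-02"))
m <- cbind(FM1 = c("14.571611", "14.572203", "14.574798"),
           FM2 = c("11.469457", "n/a", "11.487183"))
x <- xts(m, order.by = idx)
is.character(x)

# Convert the underlying matrix; entries that cannot be parsed become NA
core <- coredata(x)
num <- suppressWarnings(as.numeric(core))
dim(num) <- dim(core)
colnames(num) <- colnames(core)

# Row and column positions of values that failed to convert
bad <- which(is.na(num) & !is.na(core), arr.ind = TRUE)

# Rebuild a numeric xts object on the original index
x_num <- xts(num, order.by = index(x))
```

The index is untouched by the conversion, since it is stored separately from the data matrix.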
Why not simply use read.csv and then convert the first column to a Date object using as.Date?
> x <- read.csv(fm.file, header=T)
> x$Date <- as.Date(x$Date, format="%d/%m/%Y")
> x
Date FM1 FM2
1 2011-02-28 14.57161 11.46946
2 2011-03-01 14.57220 11.45751
3 2011-03-02 14.57480 11.48718
4 2011-03-03 14.57556 11.48780
5 2011-03-04 14.57686 11.49025

Adding column to xts object based on another xts object in R

I have one main XTS object "Data" with ~1M rows spanning 22 days. I have another XTS object "Set" with 22 rows, with 1 entry per day. I would like to combine this smaller XTS object into the larger one, such that it would have an additional column containing the value in Set for that day.
First I tried:
> Data=cbind(Data,as.numeric(Set[as.Date(index(Data[]))]))
Error in error(x, ...) :
improper length of one or more arguments to merge.xts
Then I tried:
> Data=cbind(Data,1)
> Data[,6]=as.numeric(Set[as.Date(index(Data[,6]))])
Error in NextMethod(.Generic) :
number of items to replace is not a multiple of replacement length
I also tried without the as.numeric but received the same error. I tried turning Data into a data.frame and got the error:
Error in `[<-.data.frame`(`*tmp*`, , 6, value = c(1, 397.16, 397.115, :
replacement has 22 rows, data has 835771
What am I doing wrong and how do I make this happen? I've only been using R the past two weeks.
Thanks!
> str(Data)
An ‘xts’ object from 2012-01-03 05:01:05 to 2012-01-31 14:59:59 containing:
Data: num [1:835771, 1:5] 397 397 397 397 397 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] "SYN" "\"WhitePack.BID_SIZE\"" "\"WhitePack.BID_PRICE\"" "\"WhitePack.ASK_PRICE\"" ...
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
NULL
> str(Set)
An ‘xts’ object from 2012-01-02 to 2012-01-31 containing:
Data: chr [1:22, 1] " 1.000" "397.160" "397.115" "397.175" "397.200" "397.390" "397.560" "397.580" "397.715" ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Settle"
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
NULL
Does this work?
df3 <- merge(Data, Set)
I did not fully understand the original problem at first. I think the only additional step is to carry each daily value forward over the intraday rows:
df3[, 6] <- na.locf( df3[, 6] )
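The merge followed by na.locf can be sketched on a few rows. The intraday and daily objects below are made-up stand-ins for Data and Set, with numeric values and a UTC time zone assumed for simplicity.

```r
library(xts)

# Made-up intraday series (stand-in for Data)
intraday <- xts(1:4, order.by = as.POSIXct(
  c("2012-01-03 09:00:00", "2012-01-03 15:00:00",
    "2012-01-04 09:00:00", "2012-01-04 15:00:00"), tz = "UTC"))
colnames(intraday) <- "px"

# Made-up daily series stamped at midnight (stand-in for Set)
daily <- xts(c(100, 200), order.by = as.POSIXct(
  c("2012-01-03", "2012-01-04"), tz = "UTC"))
colnames(daily) <- "Settle"

# Outer join on the index: the daily rows appear as extra midnight rows
m <- merge(intraday, daily)

# Carry each daily value forward over the intraday rows that follow it
m[, "Settle"] <- na.locf(m[, "Settle"])

# Drop the midnight rows that came only from the daily series
keep <- !is.na(coredata(m)[, "px"])
out <- m[keep, ]
```

In the question, Set holds character data, so it would presumably need converting to numeric before a step like this.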
