I want to run a function on all periods of an xts matrix. apply() is very fast but the returned matrix has transposed dimensions compared to the original object:
> dim(myxts)
[1] 7429 48
> myxts.2 = apply(myxts, 1 , function(x) { return(x) })
> dim(myxts.2)
[1] 48 7429
> str(myxts)
An 'xts' object from 2012-01-03 09:30:00 to 2012-01-30 16:00:00 containing:
Data: num [1:7429, 1:48] 4092500 4098500 4091500 4090300 4095200 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:48] "Open" "High" "Low" "Close" ...
Indexed by objects of class: [POSIXlt,POSIXt] TZ:
xts Attributes:
NULL
> str(myxts.2)
num [1:48, 1:7429] 4092500 4098500 4091100 4098500 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:48] "Open" "High" "Low" "Close" ...
..$ : chr [1:7429] "2012-01-03 09:30:00" "2012-01-03 09:31:00" "2012-01-03 09:32:00" "2012-01-03 09:33:00" ...
> nrow(myxts)
[1] 7429
> head(myxts)
Open High Low Close
2012-01-03 09:30:00 4092500 4098500 4091100 4098500
2012-01-03 09:31:00 4098500 4099500 4092000 4092000
2012-01-03 09:32:00 4091500 4095000 4090000 4090200
2012-01-03 09:33:00 4090300 4096400 4090300 4094900
2012-01-03 09:34:00 4095200 4100000 4095200 4099900
2012-01-03 09:35:00 4100000 4100000 4096500 4097500
How can I preserve myxts dimensions?
That's what apply is documented to do. From ?apply:
Value:
If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
returns an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’.
In your case, 'n'=48 (because you're looping over rows), so apply will return an array of dimension c(48, 7429).
Also note that myxts.2 is not an xts object. It's a regular array. You have a couple of options:
Transpose the results of apply before re-creating your xts object:
data(sample_matrix)
myxts <- as.xts(sample_matrix)
dim(myxts) # [1] 180 4
myxts.2 <- apply(myxts, 1 , identity)
dim(myxts.2) # [1] 4 180
myxts.2 <- xts(t(apply(myxts, 1 , identity)), index(myxts))
dim(myxts.2) # [1] 180 4
Vectorize your function so it operates on all the rows of an xts
object and returns an xts object. Then you don't have to worry
about apply at all.
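As a rough sketch of the second option, the example below divides every row by its own Open value, once with apply() and once in vectorized form. The row-wise function (divide by Open) is only an illustrative assumption, not something from the question; the names via_apply and via_vector are likewise made up for this example.

```r
library(xts)

data(sample_matrix)
myxts <- as.xts(sample_matrix)

# Row-wise version with apply(): needs t() and returns a plain matrix
via_apply <- t(apply(myxts, 1, function(r) r / r["Open"]))

# Vectorized version: a vector of length nrow(myxts) is recycled down
# each column, so every row is divided by its own Open value
open <- as.numeric(myxts[, "Open"])
via_vector <- myxts / open

dim(via_vector)     # same dimensions as myxts
is.xts(via_vector)  # the result is still an xts object
```

The vectorized form keeps both the dimensions and the index, so there is nothing to transpose or rebuild afterwards.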
Finally, please start providing reproducible examples. It's not that hard and it makes it a lot easier for people to help. I've provided an example above and I hope you can use it in your following questions.
I have two data frames and I need to put the line of my second data frame as the last line of my first data frame:
The first data frame is PETR3.SA:
tail(PETR3.SA)
PETR3.SA.Open PETR3.SA.High PETR3.SA.Low PETR3.SA.Close
2020-04-23 17.35522 17.63133 16.85232 17.09884
2020-04-24 16.86218 17.01009 15.30415 15.84650
2020-04-27 16.14233 16.68468 15.74789 16.56635
2020-04-28 17.49000 18.02000 17.11000 18.02000
2020-04-29 18.51000 19.30000 18.35000 19.00000
2020-04-30 18.73000 19.18000 18.43000 18.65000
PETR3.SA.Volume PETR3.SA.Adjusted
2020-04-23 19498900 17.09884
2020-04-24 39716700 15.84650
2020-04-27 25446600 16.56635
2020-04-28 24004700 18.02000
2020-04-29 26938000 19.00000
2020-04-30 23209200 18.65000
str(PETR3.SA)
An ‘xts’ object on 2015-01-02/2020-04-30 containing:
Data: num [1:1322, 1:6] 9.07 8.18 7.84 7.86 8.15 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "PETR3.SA.Open" "PETR3.SA.High" "PETR3.SA.Low" "PETR3.SA.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 3
$ src : chr "yahoo"
$ updated : POSIXct[1:1], format: "2020-05-04 17:14:02"
$ na.action: 'omit' int [1:3] 779 1038 1281
..- attr(*, "index")= num [1:3] 1.52e+09 1.55e+09 1.58e+09
My second df:
cotacao_xts
PETR3.SA.Open PETR3.SA.High PETR3.SA.Low PETR3.SA.Close
2020-05-04 19.02 19.02 19.02 19.02
PETR3.SA.Volume PETR3.SA.Adjusted
2020-05-04 0 19.02
> str(cotacao_xts)
An ‘xts’ object on 2020-05-04/2020-05-04 containing:
Data: num [1, 1:6] 19 19 19 19 0 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "PETR3.SA.Open" "PETR3.SA.High" "PETR3.SA.Low" "PETR3.SA.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
I need to put my second df (cotacao_xts) as the last line of my first df.
I tried bind_rows, but this is what I got:
> new_df <- PETR3.SA %>%
+ bind_rows(cotacao_xts)
Error: Argument 1 must have names
As these are xts objects, we can use rbind, assuming the index values are unique:
library(xts)
rbind(PETR3.SA, cotacao_xts)
methods(class = 'xts')[50]
#[1] "rbind.xts"
According to ?bind_rows:
... - Data frames to combine
The inputs can be data.table, data.frame or tbl_df objects. An xts object is none of those; it is a matrix with xts attributes. If we need to use bind_rows, the objects need to be converted to data.frame first.
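A minimal sketch of that conversion follows, assuming dplyr is installed. The objects hist_xts and new_xts are small made-up stand-ins for PETR3.SA and cotacao_xts, and xts_to_df is a hypothetical helper written for this example, not a function from xts or dplyr.

```r
library(xts)
library(dplyr)

# Toy stand-ins for PETR3.SA and cotacao_xts (two columns only)
hist_xts <- xts(
  matrix(c(17.49, 18.51, 18.02, 19.00), ncol = 2,
         dimnames = list(NULL, c("Open", "Close"))),
  order.by = as.Date(c("2020-04-28", "2020-04-29"))
)
new_xts <- xts(
  matrix(c(19.02, 19.02), ncol = 2,
         dimnames = list(NULL, c("Open", "Close"))),
  order.by = as.Date("2020-05-04")
)

# Hypothetical helper: move the index into a regular column
xts_to_df <- function(x) data.frame(date = index(x), coredata(x))

# bind_rows works once both inputs are data frames
combined_df <- bind_rows(xts_to_df(hist_xts), xts_to_df(new_xts))

# Rebuild an xts object if one is needed afterwards
combined_xts <- xts(combined_df[, -1], order.by = combined_df$date)
```

For xts inputs, rbind(hist_xts, new_xts) gives the same rows in one step, so the round trip through data frames is only worth it when the rest of the pipeline already works on data frames.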
I would like to subset dates from an xts object based on logical values in another xts object, but R returns an out-of-range error despite having in-range values.
For example I would like to filter dates and prices where RSI is above 60.
> strength <- RSI(d, 14)>60
> strength["2016-10-17::"]
RSI
2016-10-17 TRUE
2016-10-18 TRUE
2016-10-19 TRUE
2016-10-20 FALSE
2016-10-21 FALSE
> d["2016-10-17::"]
Open
2016-10-17 642.2760
2016-10-18 640.5988
2016-10-19 637.0000
2016-10-20 631.9800
2016-10-21 633.6470
> d["2016-10-17::"][strength == TRUE]
Error in `[.xts`(d["2016-10-17::"], strength == TRUE) :
'i' or 'j' out of range
This is not the output I expect because both my objects have data until 2016-10-21. What could be wrong? I would like something like :
> d["2016-10-17::"][strength == TRUE]
Open
2016-10-17 642.2760
2016-10-18 640.5988
2016-10-19 637.0000
This is the str of my xts objects :
> str(d)
An ‘xts’ object on 2013-09-02/2016-10-21 containing:
Data: num [1:1146, 1] 127 128 121 121 116 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Open"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ dateFormat: chr "Date"
$ na.action :Class 'omit' atomic [1:92] 1 2 3 4 5 6 7 8 9 10 ...
.. ..- attr(*, "index")= num [1:92] 1.37e+09 1.37e+09 1.37e+09 1.37e+09 1.37e+09 ...
> str(strength)
An ‘xts’ object on 2013-09-16/2016-10-21 containing:
Data: logi [1:1132, 1] FALSE FALSE FALSE FALSE FALSE FALSE ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "RSI"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
>
Thank you
You didn't provide a reproducible example, so here is some toy data. The problem is that you did not subset strength with the same time window as d. The logical series strength == TRUE therefore has far more rows than d["2016-10-17::"] (i.e. NROW(strength == TRUE) >> NROW(d["2016-10-17::"])), which produces the out-of-range error:
library(quantmod)
getSymbols("AAPL")
d <- AAPL
strength <- RSI(Cl(d)) > 60
You shouldn't get an error if you do this:
d["2016-10-17::"][strength["2016-10-17::"] == TRUE]
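The same idea can be tried offline with simulated prices. This is only a sketch: d is a made-up random walk, and strength is a stand-in for RSI(d, 14) > 60 that, like the real indicator, is shorter than d because it lacks the first 14 observations.

```r
library(xts)

set.seed(1)
# Simulated prices from 2016-09-22 to 2016-10-21 (30 daily rows)
d <- xts(cumsum(rnorm(30)) + 100, order.by = as.Date("2016-09-22") + 0:29)
colnames(d) <- "Open"

# Stand-in for RSI(d, 14) > 60: a logical xts missing the first 14 rows
strength <- d[-(1:14)] > 100
colnames(strength) <- "RSI"

# Apply the same window to both objects before filtering
win <- "2016-10-17::"
d_win <- d[win]
s_win <- strength[win]

# A plain logical vector of the same length as d_win selects the rows
res <- d_win[as.logical(coredata(s_win))]
```

Converting the windowed filter to a plain logical vector makes the length requirement explicit: it must have exactly one element per row of d_win.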
I have a datetime variable (vardt) stored as a character column in a large data table, e.g. "21/07/2011 15:54:57".
I can turn it into ITime class (e.g. 15:54:57) with DT[,newtimevar:=as.ITime(substr(DT$vardt,12,19))] but I would like to create groups of minutes, so from 21/07/2011 15:54:57 I would obtain 15:54:00 or 15:54.
I have tried: DT[,cuttime := as.ITime(cut(DT$vardt, breaks = "1 min",))]
but it didn't work. I am reading the zoo package documentation but I haven't found anything yet. Any idea/function that could be useful for this case in a large data table?
Here are two possible approaches:
library(data.table)
##
x <- Sys.time()+sample(seq(0,24*3600,60),101,TRUE)
x <- gsub(
"(\\d+)\\-(\\d+)\\-(\\d+)",
"\\3/\\2/\\1",
x)
##
DT <- data.table(vardt=x)
##
DT[,time:=as.ITime(substr(vardt,12,19))]
##
DT[,hour_min:=as.ITime(
gsub("(\\d+)\\:(\\d+)\\:(\\d+)",
"\\1\\:\\2\\:00",time))]
DT[,c_hour_min:=substr(time,1,5)]
##
R> head(DT)
vardt time hour_min c_hour_min
1: 28/01/2015 05:38:30 05:38:30 05:38:00 05:38
2: 27/01/2015 14:15:30 14:15:30 14:15:00 14:15
3: 28/01/2015 06:03:30 06:03:30 06:03:00 06:03
4: 28/01/2015 00:37:30 00:37:30 00:37:00 00:37
5: 27/01/2015 17:59:30 17:59:30 17:59:00 17:59
6: 28/01/2015 03:46:30 03:46:30 03:46:00 03:46
R> str(DT,vec.len=2)
Classes ‘data.table’ and 'data.frame': 101 obs. of 4 variables:
$ vardt : chr "28/01/2015 05:38:30" "27/01/2015 14:15:30" ...
$ time :Class 'ITime' int [1:101] 20310 51330 21810 2250 64770 ...
$ hour_min :Class 'ITime' int [1:101] 20280 51300 21780 2220 64740 ...
$ c_hour_min: chr "05:38" "14:15" ...
- attr(*, ".internal.selfref")=<externalptr>
The first case, hour_min, preserves the ITime class, while the second case, c_hour_min, is just a character vector.
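A third possibility, sketched here as an assumption rather than as part of the answer above, is to parse the string into POSIXct first and let format() drop the seconds. The column names ts and minute and the three sample timestamps are made up for illustration.

```r
library(data.table)

DT <- data.table(vardt = c("21/07/2011 15:54:57",
                           "21/07/2011 15:54:03",
                           "21/07/2011 15:55:10"))

# Parse the day/month/year string; UTC is assumed to avoid DST surprises
DT[, ts := as.POSIXct(vardt, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")]

# Format with the seconds fixed at 00, then convert to ITime
DT[, minute := as.ITime(format(ts, "%H:%M:00"))]

# Group by the truncated minute
counts <- DT[, .N, by = minute]
```

Keeping the parsed ts column around can be handy when the grouping later needs the date as well as the minute.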
I was about to blog about a useful R function I'd made, went to create some dummy data, but the dummy data behaves differently! Help!
library(xts)
data=xts(1:139,Sys.Date()-139:1)
Looking at it, it all looks good:
> head(data)
[,1]
2012-03-07 1
2012-03-08 2
2012-03-09 3
2012-03-10 4
2012-03-11 5
2012-03-12 6
> tail(data)
[,1]
2012-07-18 134
2012-07-19 135
2012-07-20 136
2012-07-21 137
2012-07-22 138
2012-07-23 139
> head(index(data))
[1] "2012-03-07" "2012-03-08" "2012-03-09" "2012-03-10" "2012-03-11" "2012-03-12"
> tail(index(data))
[1] "2012-07-18" "2012-07-19" "2012-07-20" "2012-07-21" "2012-07-22" "2012-07-23"
> range(index(data))
[1] "2012-03-07" "2012-07-23"
But, rollapply is weird. The range(index()) gives "1 40" instead of the strings.
> rollapply(data,width=40,by=30,FUN=function(x){print(range(index(x)));length(x)})
[1] 1 40
[1] 1 40
[1] 1 40
[1] 1 40
2012-03-26 40
2012-04-25 40
2012-05-25 40
2012-06-24 40
This is officially weird, because on my real data rollapply outputs a date range as strings. I compared str on my real data and on the above artificial data, and they are identical. In particular they both say 'Indexed by objects of class: [Date] TZ:' and they both say: 'tclass: chr "Date"'
Well, no, I exaggerate; the following artificial data has identical structure to my real data:
data=xts(data.frame(a=1:139,b=seq(3.14,by=0.01,length.out=139)),Sys.Date()-139:1)
It has exactly the same weird rollapply issue.
P.S. The useful function I mentioned is a rollapply wrapper; I've not shown it above because I don't need to: the core xts rollapply shows the problem too. But I'll post a link to it, in a comment, when I finally blog about it :-)
UPDATE
Here is some output with an xts object where it works:
> rollapply(data,width=40,by=30,FUN=function(x){print(class(x));print(range(index(x)));length(x)})
[1] "xts" "zoo"
[1] "2012-01-02" "2012-02-24"
...
> class(data)
[1] "xts" "zoo"
> str(data)
An ‘xts’ object from 2012-01-02 to 2012-07-18 containing:
Data: num [1:139, 1] 76.9 76.7 76.7 77.1 76.9 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Close"
Indexed by objects of class: [Date] TZ:
xts Attributes:
List of 2
$ tclass: chr "Date"
$ tzone : chr ""
Here is some output with my artificial xts object (except I've added: colnames(data)=c("Close"))
> rollapply(data,width=40,by=30,FUN=function(x){print(class(x));print(range(index(x)));length(x)})
[1] "integer"
[1] 1 40
...
> class(data)
[1] "xts" "zoo"
> str(data)
An ‘xts’ object from 2012-03-07 to 2012-07-23 containing:
Data: int [1:139, 1] 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Close"
Indexed by objects of class: [Date] TZ:
xts Attributes:
List of 2
$ tclass: chr "Date"
$ tzone : chr ""
I.e. identical str/class, identical function call, but different result. The xts object where it works is read from a csv file using this code:
d=read.table(fname,sep=',',header=T,stringsAsFactors=F)
x=as.xts(subset(d,select=-datestamp),order.by=as.Date(d$datestamp))
Observe the following:
rollapply(data,width=40,by=30,FUN=function(x){class(x)})
2012-03-26 integer
2012-04-25 integer
2012-05-25 integer
2012-06-24 integer
rollapply is passing the subsets of data as integer rather than xts objects.
The code for zoo:::rollapply.zoo appears to only use standard [ subsetting so it's not clear why the class information is being lost.
Edit
Actually there is a line:
dat <- mapply(f, seq_along(time(data)), width, MoreArgs = list(data = coredata(data),
...), SIMPLIFY = FALSE)
So only the coredata is passed to the eventual function; the index is dropped before FUN is called. This means you can't use rollapply to get the date range of each window.
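One way around this, offered as a sketch rather than as a feature of zoo or xts, is to compute the window end positions yourself and subset the xts object directly, so each window keeps its index. The names ends and ranges are hypothetical, and the end date is fixed so the toy data matches the dates shown in the question.

```r
library(xts)

# Same shape as the artificial data in the question, with fixed dates
data <- xts(1:139, as.Date("2012-07-24") - 139:1)

width <- 40
by <- 30

# Last row of each window: 40, 70, 100, 130
ends <- seq(width, nrow(data), by = by)

# Subsetting with [ keeps the xts class, so index() works per window
ranges <- lapply(ends, function(e) {
  w <- data[(e - width + 1):e]
  range(index(w))
})

ranges[[1]]
```

Each element of ranges is a pair of Date values, which is the output the question expected to see printed from inside FUN.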