I would like to apply the xts class to a list.
y <- list(1, 2, 3)
tm <- Sys.time() + 1:3
require(xts)
xts(x = y, order.by = tm)
## Error in coredata.xts(x) : currently unsupported data type
Fair enough. Is it reasonably straightforward to extend this so that it works for my own class (an extension of list)? Do I write methods for coredata, index and xts that apply to my own class, or do I first need to add similar methods for list?
I couldn't find anything in the documentation or vignettes on this, but I'm probably missing something obvious.
Primarily I would like to create a simple class based on a recursive vector, and then apply the xts tools of index and [ to that. The extraction tools allow indexing by time interval with simple character strings, i.e. ["2013-05-31 10"] means the interval between 10:00:00 and 10:59:59 on that day and this is the feature I'd like to get for free.
An xts object is (in essence) a numerical matrix plus an index attribute.
The constraints are hence a) to have a numeric matrix (which you know how to create from a list) and b) to have a time-based object (POSIXct, Date, and similar classes) for the index.
If you are beholden to lists, keep your data as a list of ... xts objects.
Exploring the source code shows that this is really not possible without substantial work (as Joshua says in the comment above).
The code that provides the general support for input types is in C in xts, so that alone makes it extra effort to apply this outside of atomic vectors, matrices and data.frames.
The analogous code in zoo is pure R so that could work a little more easily, but I wanted the support that allows indexing by time interval with simple character strings, i.e. ["2013-05-31 10"] means the interval between 10:00:00 and 10:59:59 on that day.
The best options I can see are:
1. Poach or recreate the code for time interval indexing and apply it to the new class.
2. Create an object containing xts and define methods to propagate the support to the recursive list component. (There are examples of this in an overall S4 context, e.g. in spacetime.)
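To make the first option a bit more concrete, here is a minimal base-R sketch of hour-level interval selection on a list. It is not the xts implementation: the names timed_list and select_hour are hypothetical, and it only handles strings of the form "YYYY-MM-DD HH", whereas xts understands many more granularities.

```r
# Hypothetical container: a plain list of values plus a POSIXct index.
timed_list <- function(values, index) {
  stopifnot(is.list(values), inherits(index, "POSIXct"),
            length(values) == length(index))
  structure(list(values = values, index = index), class = "timed_list")
}

# Hypothetical helper: keep the elements whose timestamp falls in the given hour,
# i.e. in [HH:00:00, HH:59:59]. Only the "YYYY-MM-DD HH" form is supported here.
select_hour <- function(x, hour_string, tz = "UTC") {
  start <- as.POSIXct(hour_string, format = "%Y-%m-%d %H", tz = tz)
  stopifnot(!is.na(start))
  keep <- x$index >= start & x$index < start + 3600
  timed_list(x$values[keep], x$index[keep])
}

idx <- as.POSIXct(c("2013-05-31 09:59:59", "2013-05-31 10:00:00",
                    "2013-05-31 10:59:59", "2013-05-31 11:00:00"), tz = "UTC")
tl <- timed_list(list("a", 1:3, list(x = 1), TRUE), idx)

hit <- select_hour(tl, "2013-05-31 10")
length(hit$values)  # number of elements stamped within the 10 o'clock hour
```

The list elements can be of any type, which is the point of the exercise; the price is that none of the other xts tools apply to such an object.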
I'm working on a remote sensing project in R. I have a RasterBrick (x) holding the rasters for all the dates I'm interested in, a time series of the corresponding dates (called time in the function), and a function that works as I want when run manually (z is the pixel I want):
function(x,z)
{
d<-bfastts(as.vector(x[as.numeric(z)]),time,type="16-day")
n<-bfast(d, h=0.15, season="harmonic", max.iter = 1)
l[[z]]<-list(n$output[[1]]$Tt)
}
The bfastts function creates a ts object containing the values of one pixel along the time series; bfast then computes several statistics, of which I only want one result (this is the third line). Neither of these two functions is mine; both are stable and available in the R package repository.
So, I would like to add "another level" of function (sorry if my vocabulary is not very precise) that would run this function automatically for every pixel. My expected result is a list of the results of the function above, in other words a list of each pixel's time series.
I've tried this (x is still the RasterBrick) :
function(x)
{
z<-nrow(x)*ncol(x)
j<-last(z[[1]])
l<-vector('list',length = j)
index<-function(x)
{
d<-bfastts(as.vector(x[as.numeric(z)]),time,type="16-day")
n<-bfast(d, h=0.15, season="harmonic", max.iter = 1)
l[[z]]<-list(n$output[[1]]$Tt) # this is to add the newly created element to the list
}
lapply(x, FUN='index')
}
but I get an error saying that it is not possible to coerce an S4 object to a vector. I guess the problem is that lapply doesn't like the RasterBrick class. Furthermore, I want a list of lists as output, not a list of RasterBricks (I think I understood that lapply returns a list of objects with the same class as x).
I've tried different workarounds, none successfully, which is not surprising given my low level in programming; this one seems to me the closest to what I need. I don't think I fully understand either how lapply works or how to use a function inside a function.
Thank you very much if you can help me.
Cheers
Guillaume
So, in case it is useful to someone, here is how I solved this problem (it turned out to be rather simple); the "brick" object is the RasterBrick:
pixelts<- as.list(as.data.frame(t(as.data.frame(brick))))
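For readers without the raster package at hand, the same reshaping idea can be sketched on a plain matrix. This assumes one row per pixel and one column per date, which is what as.data.frame(brick) appears to produce; pixel_values and trend_stub are made-up names, and trend_stub is only a placeholder for the bfastts/bfast step.

```r
# Toy stand-in for as.data.frame(brick): 3 pixels (rows) observed on 4 dates (columns).
pixel_values <- matrix(
  c(0.1, 0.2, 0.3, 0.4,
    0.5, 0.6, 0.7, 0.8,
    0.9, 1.0, 1.1, 1.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("pixel", 1:3), paste0("date", 1:4))
)

# Transpose so that each pixel becomes a column, then split the columns into a list:
# one numeric vector (the pixel's time series) per list element.
pixel_series <- as.list(as.data.frame(t(pixel_values)))

# Placeholder for the real per-pixel processing (bfastts followed by bfast).
trend_stub <- function(v) mean(v)

# lapply works here because pixel_series is an ordinary list, not an S4 object.
results <- lapply(pixel_series, trend_stub)
str(results)
```

The result is a named list with one entry per pixel, which is the "list of each pixel's result" shape asked for above.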
I want to know the differences between using the ts() and zoo() functions.
A zoo object has the time values (possibly irregular) in an index attribute displayed like a row name at the console by the print.zoo method and the values in a matrix or atomic vector which places constraints on the values that can be used (generally numeric, but necessarily all of a single mode, i.e. not as a list with multiple modes like a dataframe might hold). With pkg:zoo loaded, to get a list of functions that have zoo-methods:
library(zoo)
methods(class="zoo")
The yearmon class is added to allow monthly date indices. You can see the range of methods:
methods(class="yearmon")
The xts-class is an important extension to the zoo methods but an additional package is needed. There are many worked examples of zoo and xts functions on SO.
A ts-object has values of a single mode with attributes that always imply regular observations and those attributes support a recurring cycle such as years and months. Rather than storing the index item by item or row by row, the index is calculated on the fly using 'start', 'end' and 'frequency' values stored as attributes and accessible with functions by those names. The list of functions for ts-objects is distinctly small (and most people find them more difficult to work with):
methods(class="ts")
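A small base-R illustration of the "index calculated on the fly" point; the data values are made up:

```r
# Six monthly observations starting in November 2020.
monthly <- ts(c(10, 12, 9, 14, 11, 13), start = c(2020, 11), frequency = 12)

# Only start, end and frequency are stored (in the "tsp" attribute);
# no per-observation timestamps are kept.
names(attributes(monthly))
tsp(monthly)

# The time index and the position within the yearly cycle are derived on demand.
time(monthly)
cycle(monthly)
end(monthly)
```

Because the index is derived from three numbers, a ts object cannot represent irregularly spaced observations, which is where zoo comes in.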
There was also an its-package for irregular time series, but it was distinctly less popular than the zoo-package and has apparently been abandoned.
I am trying to learn how to use R. I can use it to do basic things like reading in data and running a t-test. However, I am struggling to understand the way R is structured (I have a very mediocre Java background).
What I don't understand is the way the functions are classified.
For example in is.na(someVector), is is a class? Or for read.csv, is csv a method of the read class?
I need an easier way to learn the functions than simply memorizing them randomly. I like the idea of things belonging to other things. To me it seems like this gives a language a tree structure which makes learning more efficient.
Thank you
Sorry if this is an obvious question I am genuinely confused and have been reading/watching quite a few tutorials.
Your confusion is entirely understandable, since R mixes two conventions of using (1) . as a general-purpose word separator (as in is.na(), which.min(), update.formula(), data.frame() ...) and (2) . as an indicator of an S3 method, method.class (i.e. foo.bar() would be the "foo" method for objects with class attribute "bar"). This makes functions like summary.data.frame() (i.e., the summary method for objects with class data.frame) especially confusing.
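To see convention (2) in action, here is a toy S3 generic with two methods. The generic area and the classes circle and square are invented for illustration only:

```r
# The generic: dispatches on the class attribute of its first argument.
area <- function(shape, ...) UseMethod("area")

# Methods follow the generic.class naming pattern.
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2

# Fallback used when no method matches the object's class.
area.default <- function(shape, ...) stop("no area method for this class")

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")

area(circle)  # dispatches to area.circle
area(square)  # dispatches to area.square

# By contrast, which.min() is just a function whose name happens to contain a dot;
# there is no generic called "which" dispatching on a class called "min".
which.min(c(3, 1, 2))
```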
As @thelatemail points out above, there are some other sets of functions that repeat the same prefix for a variety of different options (as in read.table(), read.delim(), read.fwf() ...), but these are entirely conventional, not specified anywhere in the formal language definition.
dotfuns <- apropos("[a-z]\\.[a-z]")
dotstart <- gsub("\\.[a-zA-Z]+","",dotfuns)
head(dotstart)
tt <- table(dotstart)
head(rev(sort(tt)),10)
##  as  is print Sys file summary dev format all sys
## 118  51    32  18   17      16  16     15  14  13
(Some of these are actually S3 generics, some are not. For example, Sys.*(), dev.*(), and file.*() are not.)
Historically _ was used as a shortcut for the assignment operator <- (before = was available as a synonym), so it wasn't available as a word separator. I don't know offhand why camelCase wasn't adopted instead.
Confusingly, methods("is") returns is.na() among many others, but it is effectively just searching for functions whose names start with "is."; it warns that "function 'is' appears not to be generic".
Rasmus Bååth's presentation on naming conventions is informative and entertaining (if a little bit depressing).
extra credit: are there any dot-separated S3 method names, i.e. cases where a function name of the form x.y.z represents the x.y method for objects with class attribute z ?
answer (from Hadley Wickham in comments): as.data.frame.data.frame() wins. as.data.frame is an S3 generic (unlike, say, as.numeric), and as.data.frame.data.frame is its method for data.frame objects. Its purpose (from ?as.data.frame):
If a data frame is supplied, all classes preceding ‘"data.frame"’
are stripped, and the row names are changed if that argument is
supplied.
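A quick way to see that class-stripping behaviour; the class name "myframe" is made up for the example:

```r
# A data frame with an extra, user-defined class in front of "data.frame".
d <- structure(data.frame(a = 1:2), class = c("myframe", "data.frame"))
class(d)

# No as.data.frame.myframe method exists, so dispatch falls through to
# as.data.frame.data.frame, which drops the classes preceding "data.frame".
plain <- as.data.frame(d)
class(plain)
```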
Recently I encountered the following problem in my R code. In a function, accepting a data frame as an argument, I needed to add (or replace, if it exists) a column with data calculated based on values of the data frame's original column. I wrote the code, but the testing revealed that data frame extract/replace operations, which I've used, resulted in a loss of the object's special (user-defined) attributes.
After realizing that and confirming that behavior by reading R documentation (http://stat.ethz.ch/R-manual/R-patched/library/base/html/Extract.html), I decided to solve the problem very simply - by saving the attributes before the extract/replace operations and restoring them thereafter:
myTransformationFunction <- function (data) {
# save object's attributes
attrs <- attributes(data)
# ... data frame transformations go here; they involve extract/replace operations on `data` ...
# restore the attributes
attributes(data) <- attrs
return (data)
}
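One caveat with restoring everything: attributes(data) also contains names, row.names and class, so writing the saved set back can clobber a newly added column name or a changed row count. A more cautious sketch restores only the custom attributes; restore_custom_attrs, add_double and the "source" attribute are hypothetical names used for illustration.

```r
# Copy every attribute from `old` to `new` except the structural ones
# that a transformation is expected to change.
restore_custom_attrs <- function(new, old) {
  keep <- setdiff(names(attributes(old)), c("names", "row.names", "class"))
  for (a in keep) attr(new, a) <- attr(old, a, exact = TRUE)
  new
}

# Example transformation: adds a column and drops rows, then restores attributes.
add_double <- function(data) {
  original <- data
  data$y <- data$x * 2                      # add (or replace) a column
  data <- data[data$x > 1, , drop = FALSE]  # row extraction, which loses custom attributes
  restore_custom_attrs(data, original)
}

df <- data.frame(x = 1:3)
attr(df, "source") <- "sensor A"

out <- add_double(df)
attributes(out)
```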
This approach worked. However, accidentally, I ran across another piece of R documentation (http://stat.ethz.ch/R-manual/R-patched/library/base/html/Extract.data.frame.html), which offers IMHO an interesting (and, potentially, a more generic?) alternative approach to solving the same problem:
## keeping special attributes: use a class with a
## "as.data.frame" and "[" method:
as.data.frame.avector <- as.data.frame.vector
`[.avector` <- function(x,i,...) {
r <- NextMethod("[")
mostattributes(r) <- attributes(x)
r
}
d <- data.frame(i = 0:7, f = gl(2,4),
u = structure(11:18, unit = "kg", class = "avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"
I would really appreciate if people here could help by:
Comparing the two above-mentioned approaches, if they are comparable (I realize that the second approach as defined is for data frames, but I suspect it can be generalized to any object);
Explaining the syntax and meaning in the function definition in the second approach, especially as.data.frame.avector, as well as what is the purpose of the line as.data.frame.avector <- as.data.frame.vector.
I'm answering my own question, since I have just found an SO question (How to delete a row from a data.frame without losing the attributes), answers to which cover most of my questions posed above. However, additional explanations (for R beginners) for the second approach would still be appreciated.
UPDATE:
Another solution to this problem has been proposed in an answer to the following SO question: indexing operation removes attributes. Personally, however, I prefer the approach based on creating a new class, as it is IMHO semantically cleaner.
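For what it's worth, here is how I currently read the second approach, tried on a bare vector rather than a data frame column. The `[.avector` definition is the one from the documentation; the comments are my interpretation. As far as I can tell, the alias as.data.frame.avector <- as.data.frame.vector is there only so that data.frame() knows how to store a column of class avector, which is why it is not needed below.

```r
# "[" method for class "avector": `[.avector` is generic "[" + class "avector".
`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")                # do the ordinary subsetting first
  mostattributes(r) <- attributes(x)  # then copy the attributes back (except names/dim/dimnames)
  r
}

# Plain vector: subsetting drops the custom attribute.
plain <- structure(11:18, unit = "kg")
attr(plain[2:4], "unit")

# Same data with class "avector": the method above keeps "unit".
u <- structure(11:18, unit = "kg", class = "avector")
sub <- u[2:4]
attr(sub, "unit")
```

Compared with the save/restore approach, the attribute handling lives with the class, so every function that subsets the object gets it for free.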
I would like to import a time-series where the first field indicates a period:
08:00-08:15
08:15-08:30
08:30-08:45
Does R have any features to do this neatly?
Thanks!
Update:
The most promising solution I found, as suggested by Godeke, was the chron package, using substring() to extract the start of the interval.
I'm still working on related issues, so I'll update with the solution when I get there.
CRAN shows a package that is actively updated called "chron" that handles dates. You might want to check that and some of the other modules found here: http://cran.r-project.org/web/views/TimeSeries.html
xts and zoo handle irregular time series data on top of that. I'm not familiar with these packages, but a quick look over indicates you should be able to use them fairly easily by splitting on the hyphen and loading into the structures they provide.
So you're given a character vector like c("08:00-08:15", "08:15-08:30") and you want to convert it to an internal R data type for consistency? Check out the help files for POSIXt and strftime.
How about a function like this:
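importTimes <- function(t){
  # split each "08:00-08:15" into c("08:00", "08:15")
  t <- strsplit(t,"-")
  # the input has hours and minutes only, so the format must not include seconds
  return(lapply(t,strptime,format="%H:%M"))
}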
This will take a character vector like you described, and return a list of the same length, each element of which is a POSIXt 2-vector giving the start and end times (on today's date). If you want you could add a paste("1970-01-01",x) somewhere inside the function to standardize the date you're looking at if it's an issue.
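If a flat table is easier to work with than a list, a variant along the same lines could look like this. It is only a sketch: the fixed date 1970-01-01 and the UTC time zone are arbitrary choices, and to_time and intervals are names made up for the example.

```r
periods <- c("08:00-08:15", "08:15-08:30", "08:30-08:45")

# Split each period on the hyphen and pull out the start and end pieces.
parts  <- strsplit(periods, "-", fixed = TRUE)
starts <- vapply(parts, `[`, character(1), 1)
ends   <- vapply(parts, `[`, character(1), 2)

# Attach a fixed date so that every time lands on the same day.
to_time <- function(hhmm, date = "1970-01-01") {
  as.POSIXct(paste(date, hhmm), format = "%Y-%m-%d %H:%M", tz = "UTC")
}

intervals <- data.frame(start = to_time(starts), end = to_time(ends))
intervals$minutes <- as.numeric(difftime(intervals$end, intervals$start, units = "mins"))
intervals
```

The start column could then serve as the index of a zoo or xts series, with the period length kept alongside it.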
Does that help at all?