I am looking for an approach to convert time series data into vectors. An example of what I am trying to achieve is given below.
Data x = x1, x2, x3, ..., x100
Required vectors = V1(x1,x2,x3), V2(x2,x3,x4), V3(x3,x4,x5), ..., V98(x98,x99,x100)
I can convert the complete time series into a single vector, but I do not know how to achieve the above result.
Thanks for all leads.
I am trying this in R.
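Use embed(x, 3)[, 3:1]. embed(x, 3) builds the 98 overlapping windows of length 3, one per row, but it lists each window in reverse order, so the column index 3:1 puts them back in time order.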
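A minimal sketch of the sliding-window construction in base R, assuming a window length of 3 and using 1:100 as a stand-in for the real series. Note that embed() takes the window length as its second argument and returns each window newest-first, hence the column reversal. The names w, V and V_list are only illustrative.

```r
x <- 1:100   # stand-in for x1, ..., x100
w <- 3       # window length (assumed from the example in the question)

# embed() returns each window newest-first, so reverse the columns
V <- embed(x, w)[, w:1]

dim(V)     # one row per vector V1..V98, one column per element
V[1, ]     # first window
V[98, ]    # last window

# The same windows as a list of vectors, if a list is more convenient
V_list <- lapply(seq_len(length(x) - w + 1), function(i) x[i:(i + w - 1)])
```

The matrix form is usually easier to pass to other functions; the list form is closer to the V1, V2, ... notation in the question.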
I have 2 input data frames:
One holds the origin (starting values) of some numerical features
The second holds the evolutions of those features over time
origin <- data.frame("year"=2017,"a"=5,"b"=2)
evolutions <- data.frame("year"=c(2018,2019,2020),"a"=c(1.2,1.5,2),"b"=c(0.5,2,2))
What I need is to recover the values of the numerical features for each period
output <- data.frame("year"=c(2017,2018,2019,2020),"a"=c(5,6,9,18),"b"=c(2,1,2,4))
This seems simple: I basically need to multiply the origin by the evolutions step by step. However, I can't find an efficient way to do this. I am currently using a for loop, but I am not satisfied with it.
Any ideas? Any package solution (dplyr, data.table or even base) would be great.
Thank you !
Following @Henrik's suggestion in the comments, here is a solution that loops over the column names, assuming the year column comes first and the two inputs share the same column names:
output <- data.frame("year" = c(origin$year, evolutions$year),
sapply(colnames(origin)[-1], function(x) cumprod(c(origin[[x]], evolutions[[x]]))))
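As a quick check, here is a sketch of the same cumulative-product idea run on the data from the question. It differs slightly from the snippet above in that it selects the feature columns by name rather than by position; feature_cols and values are illustrative names, not part of the original answer.

```r
origin <- data.frame(year = 2017, a = 5, b = 2)
evolutions <- data.frame(year = c(2018, 2019, 2020),
                         a = c(1.2, 1.5, 2),
                         b = c(0.5, 2, 2))

# Every column except "year" is treated as a feature (assumption)
feature_cols <- setdiff(names(origin), "year")

# Prepend the origin value, then take the running product of the factors
values <- lapply(feature_cols, function(col) {
  cumprod(c(origin[[col]], evolutions[[col]]))
})
names(values) <- feature_cols

output <- data.frame(year = c(origin$year, evolutions$year), values)
print(output)
```

cumprod() replaces the explicit loop because each value is simply the previous value times the current evolution factor.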
I am new to R and need to use the function getnfac from the PANICr package. And it seems that the function only takes an xts object as its first argument. However, after I went through some reading I still don't understand what an xts object is. Could anyone please tell me how I can convert a matrix into an xts object?
Below I use return matrix as the first argument. Therefore I just need to convert return to an xts object.
getnfac(return,143,"BIC3")
Error in getnfac(return, 143, "BIC3") :
x must be an xts object so lags and differences are taken properly
xts is an extensible time series object, essentially a regular ts object (or more correctly a zoo object) with some bits added.
The 'extensible' part of the name refers to how you can add attributes of your own choice.
A matrix can be converted into a multivariate time series quite easily:
m <- matrix(1:16, 4)
m.ts <- ts(m)
time(m.ts)
An xts object, however, requires its index (a vector describing at what time each sample was taken) to be in a date or time format:
library(xts)
m <- matrix(1:16, 4)
d <- as.Date(seq_len(nrow(m)), origin = "1970-01-01") # dummy daily index; older R versions require an explicit origin
m.xts <- xts(m, order.by=d)
index(m.xts)
If your data is sampled at evenly spaced intervals a dummy index like the one above is probably ok. If not, you'll need to supply a vector corresponding to the sampling times.
In my opinion, the first argument to the getnfac() function should be a matrix containing the data.
In addition to the above answers:
You can convert an xts object back to a plain matrix with coredata().
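A small round-trip sketch of the two conversions discussed above, assuming the xts package (and its dependency zoo) is installed. The matrix ret and the dates in when are made-up illustration data, not the asker's actual returns; the requireNamespace() guard is only there so the snippet does nothing if xts is missing.

```r
# Hypothetical return matrix: 4 observations of 2 series
ret <- matrix(c(0.01, -0.02, 0.03, 0.00,
                0.02,  0.01, -0.01, 0.04),
              ncol = 2, dimnames = list(NULL, c("s1", "s2")))

# Hypothetical, unevenly spaced sampling dates (one per row of ret)
when <- as.Date(c("2020-01-02", "2020-01-03", "2020-01-06", "2020-01-10"))

if (requireNamespace("xts", quietly = TRUE)) {
  # matrix -> xts: the dates become the time index
  ret_xts <- xts::xts(ret, order.by = when)

  # xts -> matrix: coredata() drops the index and keeps the values
  back <- zoo::coredata(ret_xts)
  print(is.matrix(back))
}
```

If the real data has no timestamps, a dummy index like the one in the answer above can be used in place of when.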
I want to create a Time Series data frame by doing this:
x <- xts(data$length,data$Time.Elapsed)
Then I got an error message:
Error in xts(data$length, data$Time.Elapsed) :
order.by requires an appropriate time-based object
So I think the problem is that my "Time.Elapsed" column is numeric. How can I convert its data type?
>data$Time Elapsed
Time Elapsed
0
1
2
3
4
5
I want to create a time series data frame, so I need a time-based object in R. Here, "Time Elapsed" is a numeric variable (the numbers represent seconds); how can I convert it to a time type in seconds? I looked at the date-time conversion functions, such as as.POSIX* {base}, but I don't think they suit my case. Can anyone help me with this? Thank you very much!
I believe you're not going low-level enough on this. xts provides some convenience functions to help determine if you can convert something to xts or not.
xtsible(data) #Will probably tell you it fails with your current setup.
xts builds on zoo, and zoo is a bit more flexible though harder to work with.
library(zoo)
zooData <- zoo(data$length, data$Time.Elapsed)
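xtsible(zooData) # Will probably still be FALSE: xts needs a time-based index, and this one is numeric.
# That may not matter much, since most of xts's functions also work on zoo objects.
# To get a real xts object, treat the seconds as offsets from an arbitrary origin first:
index(zooData) <- as.POSIXct(data$Time.Elapsed, origin = "1970-01-01", tz = "UTC")
xtsData <- as.xts(zooData)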
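library(lubridate)
# strptime(..., format = "%S") only parses values from 0 to 61, so treat the numbers as seconds since an arbitrary origin instead
x <- as.POSIXct(data$Time.Elapsed, origin = "1970-01-01", tz = "UTC")
# If you only need durations rather than timestamps:
as.duration(data$Time.Elapsed)
This should do the trick.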
My knowledge and experience of R is limited, so please bear with me.
I have measurements of duration in the following form:
d+h:m:s.s
e.g. 3+23:12:11.931139, where d=days, h=hours, m=minutes, and s.s=decimal seconds. I would like to create a histogram of these values.
Is there a simple way to convert such string input into a numerical form, such as seconds? All the information I have found seems to be geared towards date-time objects.
Ideally I would like to be able to pipe a list of data to R on the command line and so create the histogram on the fly.
Cheers
Loris
Another solution based on SO:
op <- options(digits.secs=6) # 6 is the maximum number of fractional digits R will print
z <- strptime("3+23:12:11.931139", "%d+%H:%M:%OS")
vec_z <- z + rnorm(100000)
hist(vec_z, breaks=20)
Short explanation: First, I set the option so that fractional seconds are shown. Then I parse your string into a date-time object with strptime(); if you now type z into the console you get "2012-05-03 23:12:11.93113". Then I create some more dates and plot a histogram. I think the important step for you is the parsing, and strptime() should help you with that. Note that %d is parsed as a day of the month, so this only works for day counts from 1 to 31.
I would do it like this:
str = "3+23:12:11.931139"
result = sum(as.numeric(unlist(strsplit(str, "[:\\+]", perl = TRUE))) * c(24*60*60, 60*60, 60, 1))
> result
[1] 342731.9
Then, you can wrap it into a function and apply over the list or vector.
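Wrapping the calculation in a function could look roughly like this. It is a sketch that assumes every input string has exactly the four fields d+h:m:s.s; duration_to_seconds is a hypothetical helper name and the sample strings are made up.

```r
# Hypothetical helper: convert "d+h:m:s.s" strings to seconds
duration_to_seconds <- function(s) {
  # Split each string on "+" and ":" into days, hours, minutes, seconds
  parts <- strsplit(s, "[+:]")
  # Weight each field by its length in seconds and add them up
  vapply(parts,
         function(p) sum(as.numeric(p) * c(24 * 60 * 60, 60 * 60, 60, 1)),
         numeric(1))
}

durations <- c("3+23:12:11.931139", "0+00:00:01.5", "1+00:00:00")
secs <- duration_to_seconds(durations)
print(secs)

# hist(secs) would then draw the histogram.
# When run with Rscript and data piped in, the input could be read with:
# secs <- duration_to_seconds(readLines(file("stdin")))
```

vapply() is used instead of sapply() so that the result is always a plain numeric vector, one value per input string.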
New to R and having a problem with a very simple task! I have read a few columns of .csv data into R. The variables take values in the natural numbers plus zero, and some values are missing. After trying to use the non-parametric package (np), I have two problems. First, if I use the simple command bw=npregbw(ydat=y, xdat=x, na.omit), where x and y are column vectors, I get the error "number of regression data and response data do not match". Why do I get this, given that I have the same number of elements in each vector?
Second, I would like to treat the data as ordered and tell npregbw this, using the command bw=npregbw(ydat=y, xdat=ordered(x)). When I do that, I get the error that x must be atomic for sort.list. But how is x not atomic? It is just a vector of natural numbers and NAs.
Any clarifications would be greatly appreciated!
1) You probably have a different number of NA's in y and x.
2) I can't be sure about this, since there is no example. If x is of the following type:
x <- c(3,4,NA,2)
Then ordered(x) should work fine. Please provide an example of your case.
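EDIT: You did of course try bw=npregbw(ydat=y, xdat=x)? ordered() turns your vector into an ordered factor (see ?ordered and ?factor, and the 2.1.1 link). Note that is.atomic() still returns TRUE for a factor, so a "must be atomic" error usually means that x was not a plain vector in the first place, for example because it is a data frame or a list.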
EDIT2: So the problem was the way of subsetting data. Note the difference in various ways of subsetting. data$x and data[,i] (where i = column number of column x) give you vectors, while data[c("x")] and data[i] give a data frame. Functions expect vectors, unless they call for data = (your data). In that case they work with column names
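The points above can be checked on a toy data frame. This is only a sketch with made-up values; the column names x and y mirror the question but the data does not come from it.

```r
data <- data.frame(x = c(3, 4, NA, 2), y = c(1, NA, 5, 6))

# $ and [, i] return plain vectors
is.numeric(data$x)         # TRUE
is.numeric(data[, 1])      # TRUE

# [ with a single name or index returns a one-column data frame
is.data.frame(data["x"])   # TRUE
is.data.frame(data[1])     # TRUE

# NAs can sit in different rows of x and y, so drop every row where
# either value is missing before passing the vectors on
ok <- complete.cases(data$x, data$y)
x <- data$x[ok]
y <- data$y[ok]
length(x) == length(y)
```

Filtering both vectors with the same logical index keeps the observations paired, which dropping NAs from each vector separately would not.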