Rolling dataset having different row numbers [closed] - r

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
My sample data is as follows.
It has three variables: event, datetime and ten_minute.
The class of datetime is "POSIXlt" "POSIXt". ten_minute is just a substring of datetime: the first digit of the minute.
I'd like to generate multiple rolling datasets using R. For example, Data_1 has the rows with ten_minute values of 0, 1, 2, and Data_2 has the rows with ten_minute values of 1, 2, 3. (Finally, Data_n would have the values 3, 4, 5.) I also want to be able to change the width of the window. In this example the width is 3; I want to change it to 5, 10, etc.
I've been trying to code this in R myself for over a week, but I can't figure out how to do it.

First, a function to generate the windows you need:
# vec.start and vec.numberofsets are the first and last window start values
generate.windows <- function(vec.start, vec.numberofsets, vec.width) {
  vec.sets <- vec.start:vec.numberofsets
  lapply(vec.sets, function(n) {
    seq(from = n, length.out = vec.width)
  })
}
Next, extract the data.frames that correspond to each window:
# Assume your original data set is called df.data
list.windows <- generate.windows(1, 10, 3)
list.data.frames <- lapply(list.windows, function(n) {df.data[df.data[,"Ten_minute"] %in% n,]})
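To see the windowing end to end, here is a minimal sketch on made-up data. The toy data frame, the column name ten_minute, and the helper rolling.subsets are assumptions for illustration only; they are not part of the original question or answer.

```r
# Hypothetical toy data standing in for the asker's data set
df.data <- data.frame(
  event = paste0("e", 1:12),
  ten_minute = rep(0:5, each = 2)
)

# Return one data frame per window of `width` consecutive ten_minute values.
# `values` is the full range of ten_minute values to slide over.
rolling.subsets <- function(df, width, values = 0:5) {
  n.windows <- max(0, length(values) - width + 1)
  starts <- values[seq_len(n.windows)]
  lapply(starts, function(s) {
    df[df$ten_minute %in% seq(from = s, length.out = width), ]
  })
}

list.w3 <- rolling.subsets(df.data, width = 3)
length(list.w3)                   # 4 windows: 0-2, 1-3, 2-4, 3-5
unique(list.w3[[4]]$ten_minute)   # 3 4 5
```

Changing the window width is then a matter of passing a different width, e.g. rolling.subsets(df.data, width = 5).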


Change dates to number [closed]

Closed 1 year ago.
tsReturns = xts(x = returns, order.by = dates)
I have a time series ordered by date. How can I change it to be ordered by numbers (1 to n)? The first date should correspond to 1 and the last date to n, without changing the data values.
1) xts does not support a plain numeric index. It requires one of several date or datetime classes; however, a plain numeric index could be achieved with zoo or ts. If x is an xts object and we want the plain numeric index to be consecutive numbers from 1 to nrow(x) then:
zoo(coredata(x))
ts(coredata(x))
2) If instead:
the desired index of the ith row is to be the number of days since the first date plus 1 and
the index of x is of Date class and
the dates are not consecutive but are unique, e.g. there are gaps for weekends
then this will give a non-consecutive index for zoo. Since ts can only represent regularly spaced series the ts solution below will fill in the values with NA where there is no date in the input.
tt <- as.numeric(time(x))
z <- zoo(coredata(x), tt - tt[1] + 1)
as.ts(z)
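The index arithmetic in case 2 can be checked in base R without zoo or xts. This is only a sketch of the day-offset calculation; the three dates below are made up for illustration (a Friday followed by the next Monday and Tuesday).

```r
# Hypothetical dates with a weekend gap
dates <- as.Date(c("2021-01-01", "2021-01-04", "2021-01-05"))

tt <- as.numeric(dates)        # days since 1970-01-01
day.index <- tt - tt[1] + 1    # days since the first date, plus 1
day.index                      # 1 4 5
```

The gap between 1 and 4 is what as.ts would fill with NA rows, since ts needs a regularly spaced index.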

How to split a list in R? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 2 years ago.
I have an R list where all of the values are in the first position (i.e. list[1]), while I want all the values to be spread evenly throughout the list (list[1] contains one value, list[2] contains the next, etc.). I have been trying unsuccessfully for a while to split the values one position into separate values (each value is a string of characters separated by spaces) but nothing has worked.
Below is an illustration of the sort of situation I am in.
Say "test" is the name of a list in R. test is an object of length 1, and if you enter test[1] in the console, the output is thousands of values formatted like so:
[1] "value1" "value2" "value3" ... etc.
Now I want to somehow split the contents of test[1] so that each character string is in a separate position, so that test[1] is "value1", test[2] is "value2", etc. I have looked for and tried many purported solutions to this sort of issue (recent example here: List to integer or double in R) but nothing has worked for me so far.
Here's a simple way:
l1 <- list(l1 = round(rnorm(100, 0, 5), 0))
v <- unlist(l1)
l2 <- as.list(v)
The length of l1 is 1 and the length of l2 is 100. Is this what you are after?
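The same idea applied to character values like those in the question might look as follows. The contents of test are a made-up stand-in, and the second half assumes a different reading of the question (one long space-separated string), which the original post does not confirm.

```r
# Hypothetical stand-in: a list of length 1 holding a character vector
test <- list(c("value1", "value2", "value3"))
length(test)                      # 1

test2 <- as.list(unlist(test))    # one list element per value
length(test2)                     # 3
test2[[2]]                        # "value2"

# If the single element is instead one space-separated string
one.string <- list("value1 value2 value3")
test3 <- as.list(strsplit(one.string[[1]], " ", fixed = TRUE)[[1]])
```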

Loop over NA values in subsequent rows in R [closed]

Closed 4 years ago.
I want to replace each missing value in the first column of my dataframe with the previous one multiplied by a scalar (e.g. 3).
nRowsDf <- nrow(df)
for(i in 1:nRowsDf){
df[i,1] =ifelse(is.na(df[i,1]), lag(df[i,1])+3*lag(df[i,1]), df[i,1])
}
The above code does not give me an error but does not do the job either.
In addition, is there a better way to do this instead of writing a loop?
Update and Data:
Here is an example of data. I want to replace each missing value in the first column of my dataframe with the previous one multiplied by a scalar (eg. 3). The NA values are in subsequent rows.
df <- mtcars
df[c(2,3,4,5),1] <-NA
IND <- is.na(df[,1])
df[IND,1] <- df[dplyr::lead(IND,1L, F),1] * 3
The last line of the above code does the job row by row (I should run it 4 times to fill the 4 missing rows). How can I do it once for all rows?
Reproducible data, which YOU should provide:
df <- mtcars
df[c(1,5,8),1] <-NA
Code:
IND <- is.na(df[,1])
df[IND,1] <- df[dplyr::lag(IND,1L, F),1] * 3
Since you use lag, I use lag. You say "previous", so maybe you want to use lead instead.
What happens if the first value (in the lead case) or the last value (in the lag case) is missing remains a mystery.
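For runs of consecutive NAs, each fill depends on the value just filled before it, so a single vectorised lag/lead assignment cannot do it in one pass. One possible approach (a sketch, not the answerer's method) is a plain loop over a copy of the column, using the asker's own example data. It assumes the first value is not missing; a leading NA would simply stay NA.

```r
df <- mtcars
df[c(2, 3, 4, 5), 1] <- NA

x <- df[, 1]
# Walk forward so each filled value is available to the next row
for (i in seq_along(x)[-1]) {
  if (is.na(x[i])) x[i] <- x[i - 1] * 3
}
df[, 1] <- x

df[1:5, 1]   # 21 63 189 567 1701
```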

I am using the data set "mhw.csv" from https://datahub.io/nl/dataset/mercer-and-hall-wheat-yield-data [closed]

Closed 6 years ago.
I am currently working with the data set "mhw.csv" , located at https://datahub.io/nl/dataset/mercer-and-hall-wheat-yield-dat
Which is a data frame pertaining
The data frame is separated into 4 columns:
"r" "c" "wheat" "straw"
Column r is a row number and c is a column number, together identifying an individual plot in the field. The field is 20 x 25, so the data frame has 500 rows.
I want to divide the data into 4 quadrants: NorthWest (rows 1:5 and columns 1:12), NorthEast (rows 1:5 and columns 13:25), SouthWest (rows 5:10 and columns 1:12) and SouthEast (rows 5:10 and columns 13:25).
Then I want to add a 5th column to the data.frame that denotes the quadrant in which each plot is located.
Any help would be greatly appreciated. This is my first question, I hope I gave enough information.
Thank you!
I'm not going to go download that data, but using sample data:
test1 <- data.frame(r = sample(1:10, 10), c = sample(1:25, 10))
The simplest no-frills answer is probably:
test1$Quadrant[test1$r<=5 & test1$c<=12] <- "Northwest"
test1$Quadrant[test1$r>5 & test1$c<=12] <- "Southwest"
...
Et cetera. Do it for your four quadrants and the dataframe should now have the new column you're looking for.
PS: Generally you'll get quicker answers if you provide a sample dataframe like I did above with 'test1'.
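As an alternative sketch, all four labels can be built in one step with ifelse and paste0. The grid below is hypothetical sample data (not the mhw.csv file), and the cut points 5 and 12 are taken from the answer's conditions above.

```r
# Hypothetical grid of plots: 10 rows by 25 columns
test1 <- expand.grid(r = 1:10, c = 1:25)

ns <- ifelse(test1$r <= 5, "North", "South")   # row half
ew <- ifelse(test1$c <= 12, "west", "east")    # column half
test1$Quadrant <- paste0(ns, ew)

table(test1$Quadrant)
```

Adjust the cut points if the real field should be split at different rows or columns.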

R: sort rows, query them and add results as column [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
I have an R dataframe with the dimension 32 x 11. For each row I would like to determine the highest, the second highest, and the third highest value and add these values as extra columns to the initial dataframe (32 x 14). Many thanks in advance!
library(car)
data(mtcars)
mtcars
First, create a function to get the nth highest value for a vector. Then, create a copy of the dataframe, since the second highest value may change as you add more columns. Then apply your function using apply and 1 to operate row-wise. I'm not sure what would happen if there are NAs in the data. I haven't tested it...
Something like this...
nth_highest <- function(x, n) sort(x, decreasing = TRUE)[n]
tmp <- mtcars
mtcars$highest <- apply(tmp, 1, function(x) nth_highest(x, 1))
mtcars$second_highest <- apply(tmp, 1, function(x) nth_highest(x, 2))
mtcars$third_highest <- apply(tmp, 1, function(x) nth_highest(x, 3))
rm(tmp)
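A possible variation, sketched here rather than taken from the answer, sorts each row once and takes the top three values together, which also avoids the temporary copy. The names top3 and result are illustrative. Note that sort drops NAs by default, so a row with fewer than three non-missing values would get NA in the trailing columns.

```r
# Sort each row once and keep its three largest values
top3 <- t(apply(mtcars, 1, function(x) sort(x, decreasing = TRUE)[1:3]))
colnames(top3) <- c("highest", "second_highest", "third_highest")

result <- cbind(mtcars, top3)
dim(result)   # 32 14
```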
