R Fill backwards with flexible window based on number of rows in a separate column

R Fill backwards with flexible window based on number of rows in a separate column - r

I am trying to carry a value in one column backwards by a number of rows given in a second column and fill everything in between.
So column y mainly has 1s in it but might have individual numbers up to about 20 (in my real data, up to 3 in my example below). If the number in y is 20, I need the 19 rows before that row and that row itself to equal the value of x for the row where y is 20. If the value in y is 1 the output will just equal x.
y also has many NAs, these NAs are either legitimate NAs where I want an NA output or are placeholders where the filling should occur if a y value afterwards is > 1.
I thought I could use dplyr::lead but I cannot have a variable n value to look forwards a different number of steps, and it wouldn't fill inbetween, and I wondered about making a new, always increasing column and using RcppRoll::roll_max but have similar problems with the flexible window size.
Typically y-values in the lead up to a y > 1 will be 0 or NA, but if there were conflicts I would want to adopt the later value still eg in row 8 of my data frame y is 1 followed by y = 2 in row 9 so I want the value associated with row 9 in both cases. If y in NA and there is not covered by filling backwards, I want it to remain NA (or 0 would be fine)
Thanks for any thoughts
set.seed(1)
test <- data.frame(x = sample(1:15,replace = F), y = c(NA,NA,NA,1,NA,NA,3,1,2,1,1,NA,NA,NA,2))
desired_out <- test
desired_out$out <- c(NA,NA,NA,1,11,11,11,8,8,12,5,NA,NA,14,14)
desired_out
#> x y out
#> 1 9 NA NA
#> 2 4 NA NA
#> 3 7 NA NA
#> 4 1 1 1
#> 5 2 NA 11
#> 6 13 NA 11
#> 7 11 3 11
#> 8 3 1 8
#> 9 8 2 8
#> 10 12 1 12
#> 11 5 1 5
#> 12 6 NA NA
#> 13 15 NA NA
#> 14 10 NA 14
#> 15 14 2 14
#try adopting #sirius answer before I specified about the extra NAs
test$y <- ifelse(is.na(test$y),0,test$y)
test$out <- with( test, rep( x, y ) )
#> Error in `$<-.data.frame`(`*tmp*`, out, value = c(1L, 11L, 11L, 11L, 3L, : replacement has 11 rows, data has 15
Created on 2021-04-08 by the reprex package (v0.3.0)

Things got a bit complex, but essentially calculate all the repeated x's for each y > 0, and then let subsequent x'es overwrite earlier ones
set.seed(1)
test <- data.frame(x = sample(1:15,replace = F), y = c(NA,NA,NA,1,NA,NA,3,1,2,1,1,NA,NA,NA,2))
desired_out <- test
desired_out$out <- c(NA,NA,NA,1,11,11,11,8,8,12,5,NA,NA,14,14)
desired_out
test %<>% mutate( id = seq(n()) ) %>%
filter( !is.na(y) & y != 0 ) %>%
group_by(id) %>%
slice( rep(1,y) ) %>%
mutate( id = rev( max(id)+1-1:n() ) ) %>%
group_by(id) %>%
summarize( out = as.numeric(last(x)) ) %>%
right_join( test %>% mutate( id=seq(n()) ) ) %>%
arrange( id ) %>% select( -id ) %>% relocate( x, y, out )
identical( as.data.frame(test), desired_out ) ## TRUE
test
Output:
> test
# A tibble: 15 x 3
x y out
<int> <dbl> <dbl>
1 9 NA NA
2 4 NA NA
3 7 NA NA
4 1 1 1
5 2 NA 11
6 13 NA 11
7 11 3 11
8 3 1 8
9 8 2 8
10 12 1 12
11 5 1 5
12 6 NA NA
13 15 NA NA
14 10 NA 14
15 14 2 14
What the algorithm does, which after a few piped lines is no longer very clear, is the following:
temporarily add id as original row number
take away 0 and NA rows for y
repeat each row y times
within each such repeated row, create a new id that counts backwards (these will be the new row numbers for the x-values to
go)
group by id again this time to let later values overwrite earlier values (so keep only the highest row number for any collision)
join these data back on the original data, using the newly calculated row numbers, repeated x's will now be inserted
sort and clean up

Sequencing and indexing to the rescue:
test$rn <- seq_len(nrow(test))
src <- with(test[!is.na(test$y),],
list(val = rep(x,y), idx = rep(rn,y) - sequence(y) + 1) )
test$out[src$idx] <- src$val
test$rn <- NULL
# x y out
#1 9 NA NA
#2 4 NA NA
#3 7 NA NA
#4 1 1 1
#5 2 NA 11
#6 13 NA 11
#7 11 3 11
#8 3 1 8
#9 8 2 8
#10 12 1 12
#11 5 1 5
#12 6 NA NA
#13 15 NA NA
#14 10 NA 14
#15 14 2 14
I'm generating a row number, getting the row numbers prior to the key rows, and then overwriting those rows with repeats of the selected rows. Sometimes they specify the same location, but the later value will be taken as you can see in the output.
Should be pretty efficient as everything is vectorised and there's only one major assignment operation back to the original dataset for updating all the rows at once. Here's 4.5M rows processed in a fraction of a second:
test <- test[rep(1:15, 3e5),]
system.time({
test$rn <- seq_len(nrow(test))
src <- with(test[!is.na(test$y),],
list(val = rep(x,y), idx = rep(rn,y) - sequence(y) + 1) )
test$out[src$idx] <- src$val
test$rn <- NULL
})
# user system elapsed
# 0.28 0.00 0.28

Related

R - Merging rows with numerous NA values to another column

I would like to ask the R community for help with finding a solution for my data, where any consecutive row with numerous NA values is combined and put into a new column.
For example:
df <- data.frame(A= c(1,2,3,4,5,6), B=c(2, "NA", "NA", 5, "NA","NA"), C=c(1,2,"NA",4,5,"NA"), D=c(3,"NA",5,"NA","NA","NA"))
A B C D
1 1 2 1 3
2 2 NA 2 NA
3 3 NA NA 5
4 4 5 4 NA
5 5 NA 5 NA
6 6 NA NA NA
Must be transformed to this:
A B C D E
1 1 2 1 3 2 NA 2 NA 3 NA NA 5
2 4 5 4 NA 5 NA 5 NA 6 NA NA NA
I would like to do the following:
Identify consecutive rows that have more than 1 NA value -> combine entries from those consecutive rows into a single combined entiry
Place the above combined entry in new column "E" on the prior row
This is quite complex (for me!) and I am wondering if anyone can offer any help with this. I have searched for some similar problems, but have been unable to find one that produces a similar desired output.
Thank you very much for your thoughts--

Using tidyr and dplyr:
Concatenate values for each row.
Keep the concatenated values only for rows with more than one NA.
Group each “good” row with all following “bad” rows.
Use a grouped summarize() to concatenate “bad” row values to a single string.
df %>%
unite("E", everything(), remove = FALSE, sep = " ") %>%
mutate(
E = if_else(
rowSums(across(!E, is.na)) > 1,
E,
""
),
new_row = cumsum(E == "")
) %>%
group_by(new_row) %>%
summarize(
across(A:D, first),
E = trimws(paste(E, collapse = " "))
) %>%
select(!new_row)
# A tibble: 2 × 5
A B C D E
<dbl> <dbl> <dbl> <dbl> <chr>
1 1 2 1 3 2 NA 2 NA 3 NA NA 5
2 4 5 4 NA 5 NA 5 NA 6 NA NA NA

Removing NAs from a large dataframe

I have a very large dataframe, with number of rows = 10 703 009. I want to remove NAs but getting this error, 'Colloc couldnot allocate memory of 10703009 bytes.
My input dataframe is 'a' with many rows with NAs,
IDs
Codes
1
C493
1
NA
2
E348
3
NA
I need a output with rows without NAs
IDs
Codes
1
C493
2
E348
I tried both, but getting memory error,
drop_na(a,Codes)
subset(a,Codes)
Please suggest the solution to this in R.

A frame of 10,703,009 lines is no problem for R. See below. I generated a tibble with exactly the number of lines where the variable Codes contains NA with a probability of probNA = 0.3.
library(tidyverse)
n=10703009
probNA = 0.3
df = tibble(IDs = 1:n,
Codes = paste0(sample(LETTERS[1:10], n, replace = TRUE),
sample(100:999, n, replace = TRUE))) %>%
mutate(Codes = ifelse(sample(c(T,F), n, replace = TRUE,
prob = c(probNA, 1-probNA)), NA, Codes))
df
output
# A tibble: 10,703,009 x 2
IDs Codes
<int> <chr>
1 1 I586
2 2 A188
3 3 H674
4 4 D641
5 5 A793
6 6 B455
7 7 B837
8 8 A805
9 9 NA
10 10 E380
# ... with 10,702,999 more rows
The size of such a tibble is object.size (df) return 12 894 1096 bytes.
We will try to get rid of the lines with NA values.
df %>% filter(!is.na(Codes))
output
# A tibble: 7,490,809 x 2
IDs Codes
<int> <chr>
1 1 I586
2 2 A188
3 3 H674
4 4 D641
5 5 A793
6 6 B455
7 7 B837
8 8 A805
9 10 E380
10 11 C231
# ... with 7,490,799 more rows
Now let's replace all NA values with an empty string.
df %>% mutate(Codes = ifelse(is.na(Codes), "", Codes))
output
# A tibble: 10,703,009 x 2
IDs Codes
<int> <chr>
1 1 "I586"
2 2 "A188"
3 3 "H674"
4 4 "D641"
5 5 "A793"
6 6 "B455"
7 7 "B837"
8 8 "A805"
9 9 ""
10 10 "E380"
# ... with 10,702,999 more rows
As you can see, everything works smoothly and without any problems.

How to divide all previous observations by the last observation iteratively within a data frame column by group in R and then store the result

I have the following data frame:
data <- data.frame("Group" = c(1,1,1,1,1,1,1,1,2,2,2,2),
"Days" = c(1,2,3,4,5,6,7,8,1,2,3,4), "Num" = c(10,12,23,30,34,40,50,60,2,4,8,12))
I need to take the last value in Num and divide it by all of the preceding values. Then, I need to move to the second to the last value in Num and do the same, until I reach the first value in each group.
Edited based on the comments below:
In plain language and showing all the math, starting with the first group as suggested below, I am trying to achieve the following:
Take 60 (last value in group 1) and:
Day Num Res
7 60/50 1.2
6 60/40 1.5
5 60/34 1.76
4 60/30 2
3 60/23 2.60
2 60/12 5
1 60/10 6
Then keep only the row that has the value 2, as I don't care about the others (I want the value that is greater or equal to 2 that is the closest to 2) and return the day of that value, which is 4, as well. Then, move on to 50 and do the following:
Day Num Res
6 50/40 1.25
5 50/34 1.47
4 50/30 1.67
3 50/23 2.17
2 50/12 4.17
1 50/10 5
Then keep only the row that has the value 2.17 and return the day of that value, which is 3, as well. Then, move on to 40 and do the same thing over again, move on to 34, then 30, then 23, then 12, the last value (or Day 1 value) I don't care about. Then move on to the next group's last value (12) and repeat the same approach for that group (12/8, 12/4, 12/2; 8/4, 8/2; 4/2)
I would like to store the results of these divisions but only the most recent result that is greater than or equal to 2. I would also like to return the day that result was achieved. Basically, I am trying to calculate doubling time for each day. I would also need this to be grouped by the Group. Normally, I would use dplyr for this but I am not sure how to link up a loop with dyplr to take advantage of group_by. Also, I could be overlooking lapply or some variation thereof. My expected dataframe with the results would ideally be this:
data2 <- data.frame(divres = c(NA,NA,2.3,2.5,2.833333333,3.333333333,2.173913043,2,NA,2,2,3),
obs_n =c(NA,NA,1,2,2,2,3,4,NA,1,2,2))
data3 <- bind_cols(data, data2)
I have tried this first loop to calculate the division but I am lost as to how to move on to the next last value within each group. Right now, this is ignoring the group, though I obviously have not told it to group as I am unclear as to how to do this outside of dplyr.
for(i in 1:nrow(data))
data$test[i] <- ifelse(!is.na(data$Num), last(data$Num)/data$Num[i] , NA)
I also get the following error when I run it:
number of items to replace is not a multiple of replacement length
To store the division, I have tried this:
division <- function(x){
if(x>=2){
return(x)
} else {
return(FALSE)
}
}
for (i in 1:nrow(data)){
data$test[i]<- division(data$test[i])
}
Now, this approach works but only if i need to run this once on the last observation and only if I apply it to 1 group. I have 209 groups and many days that I would need to run this over. I am not sure how to put together the first for loop with the division function and I also am totally lost as to how to do this by group and move to the next last values. Any suggestions would be appreciated.

You can modify your division function to handle vector and return a dataframe with two columns divres and ind the latter is the row index that will be used to calculate obs_n as shown below:
division <- function(x){
lenx <- length(x)
y <- vector(mode="numeric", length = lenx)
z <- vector(mode="numeric", length = lenx)
for (i in lenx:1){
y[i] <- ifelse(length(which(x[i]/x[1:i]>=2))==0,NA,x[i]/x[1:i] [max(which(x[i]/x[1:i]>=2))])
z[i] <- ifelse(is.na(y[i]),NA,max(which(x[i]/x[1:i]>=2)))
}
df <- data.frame(divres = y, ind = z)
return(df)
}
Check the output of division function created above using data$Num as input
> division(data$Num)
divres ind
1 NA NA
2 NA NA
3 2.300000 1
4 2.500000 2
5 2.833333 2
6 3.333333 2
7 2.173913 3
8 2.000000 4
9 NA NA
10 2.000000 9
11 2.000000 10
12 3.000000 10
Use cbind to combine the above output with dataframe data1, use pipes and mutate from dplyr to lookup the obs_n value in Day using ind, select appropriate columns to generate the desired dataframe data2:
data2 <- cbind.data.frame(data, division(data$Num)) %>% mutate(obs_n = Days[ind]) %>% select(-ind)
Output
> data2
Group Days Num divres obs_n
1 1 1 10 NA NA
2 1 2 12 NA NA
3 1 3 23 2.300000 1
4 1 4 30 2.500000 2
5 1 5 34 2.833333 2
6 1 6 40 3.333333 2
7 1 7 50 2.173913 3
8 1 8 60 2.000000 4
9 2 1 2 NA NA
10 2 2 4 2.000000 1
11 2 3 8 2.000000 2
12 2 4 12 3.000000 2

You can create a function with a for loop to get the desired day as given below. Then use that to get the divres in a dplyr mutation.
obs_n <- function(x, days) {
lst <- list()
for(i in length(x):1){
obs <- days[which(rev(x[i]/x[(i-1):1]) >= 2)]
if(length(obs)==0)
lst[[i]] <- NA
else
lst[[i]] <- max(obs)
}
unlist(lst)
}
Then use dense_rank to obtain the row number corresponding to each obs_n. This is needed in case the days are not consecutive, i.e. have gaps.
library(dplyr)
data %>%
group_by(Group) %>%
mutate(obs_n=obs_n(Num, Days), divres=Num/Num[dense_rank(obs_n)])
# A tibble: 12 x 5
# Groups: Group [2]
Group Days Num obs_n divres
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 10 NA NA
2 1 2 12 NA NA
3 1 3 23 1 2.3
4 1 4 30 2 2.5
5 1 5 34 2 2.83
6 1 6 40 2 3.33
7 1 7 50 3 2.17
8 1 8 60 4 2
9 2 1 2 NA NA
10 2 2 4 1 2
11 2 3 8 2 2
12 2 4 12 2 3
Explanation of dense ranks (from Wikipedia).
In dense ranking, items that compare equally receive the same ranking number, and the next item(s) receive the immediately following ranking number.
x <- c(NA, NA, 1,2,2,4,6)
dplyr::dense_rank(x)
# [1] NA, NA, 1 2 2 3 4
Compare with rank (default method="average"). Note that NAs are included at the end by default.
rank(x)
[1] 6.0 7.0 1.0 2.5 2.5 4.0 5.0

R - delete consecutive (ONLY) duplicates

I need to eliminate rows from a data frame based on the repetition of values in a given column, but only those that are consecutive.
For example, for the following data frame:
df = data.frame(x=c(1,1,1,2,2,4,2,2,1))
df$y <- c(10,11,30,12,49,13,12,49,30)
df$z <- c(1,2,3,4,5,6,7,8,9)
x y z
1 10 1
1 11 2
1 30 3
2 12 4
2 49 5
4 13 6
2 12 7
2 49 8
1 30 9
I would need to eliminate rows with consecutive repeated values in the x column, keep the last repeated row, and maintain the structure of the data frame:
x y z
1 30 3
2 49 5
4 13 6
2 49 8
1 30 9
Following directions from help and some other posts, I have tried using the duplicated function:
df[ !duplicated(x,fromLast=TRUE), ] # which gives me this:
x y z
1 1 10 1
6 4 13 6
7 2 12 7
9 1 30 9
NA NA NA NA
NA.1 NA NA NA
NA.2 NA NA NA
NA.3 NA NA NA
NA.4 NA NA NA
NA.5 NA NA NA
NA.6 NA NA NA
NA.7 NA NA NA
NA.8 NA NA NA
Not sure why I get the NA rows at the end (wasn't happening with a similar table I was testing), but works only partially on the values.
I have also tried using the data.table package as follows:
library(data.table)
dt <- as.data.table(df)
setkey(dt, x)
dt[J(unique(x)), mult ='last']
Works great, but it eliminates ALL duplicates from the data frame, not just those that are consecutive, giving something like this:
x y z
1 30 9
2 49 8
4 13 6
Please, forgive if cross-posting. I tried some of the suggestions but none worked for eliminating only those that are consecutive.
I would appreciate any help.
Thanks

How about:
df[cumsum(rle(df$x)$lengths),]
Explanation:
rle(df$x)
gives you the run lengths and values of consecutive duplicates in the x variable. Then:
rle(df$x)$lengths
extracts the lengths. Finally:
cumsum(rle(df$x)$lengths)
gives the row indices which you can select using [.
EDIT for fun here's a microbenchmark of the answers given so far with rle being mine, consec being what I think is the most fundamentally direct answer, given by #James, and would be the answer I would "accept", and dp being the dplyr answer given by #Nik.
#> Unit: microseconds
#> expr min lq mean median uq max
#> rle 134.389 145.4220 162.6967 154.4180 172.8370 375.109
#> consec 111.411 118.9235 136.1893 123.6285 145.5765 314.249
#> dp 20478.898 20968.8010 23536.1306 21167.1200 22360.8605 179301.213
rle performs better than I thought it would.

You just need to check in there is no duplicate following a number, i.e x[i+1] != x[i] and note the last value will always be present.
df[c(df$x[-1] != df$x[-nrow(df)],TRUE),]
x y z
3 1 30 3
5 2 49 5
6 4 13 6
8 2 49 8
9 1 30 9

A cheap solution with dplyr that I could think of:
Method:
library(dplyr)
df %>%
mutate(id = lag(x, 1),
decision = if_else(x != id, 1, 0),
final = lead(decision, 1, default = 1)) %>%
filter(final == 1) %>%
select(-id, -decision, -final)
Output:
x y z
1 1 30 3
2 2 49 5
3 4 13 6
4 2 49 8
5 1 30 9
This will even work if your data has the same x value at the bottom
New Input:
df2 <- df %>% add_row(x = 1, y = 10, z = 12)
df2
x y z
1 1 10 1
2 1 11 2
3 1 30 3
4 2 12 4
5 2 49 5
6 4 13 6
7 2 12 7
8 2 49 8
9 1 30 9
10 1 10 12
Use same method:
df2 %>%
mutate(id = lag(x, 1),
decision = if_else(x != id, 1, 0),
final = lead(decision, 1, default = 1)) %>%
filter(final == 1) %>%
select(-id, -decision, -final)
New Output:
x y z
1 1 30 3
2 2 49 5
3 4 13 6
4 2 49 8
5 1 10 12

Here is a data.table solution. The trick is to create a shifted version of x with the shift function and compare it with x
library(data.table)
dattab <- as.data.table(df)
dattab[x != shift(x = x, n = 1, fill = -999, type = "lead")] # edited to add closing )
This way you compare each value of x with its immediately following value and throw out where they match. Make sure to set fill to something that is not in x in order for correct handling of the last value.

if condition is true find max in 3 consecutive rows and report it in a new column - r

Reproducible example:
Label<-c(0,0,1,1,1,2,2,3,3,3,4,5,5,5,6,6)
Value<-c(NA,NA,1,2,3,1,2,3,2,1,"NC",1,3,2,1,NA)
dat1<-as.data.frame(cbind(Label, Value))
The output I am after is a new column "test" that gets the maximum of the column "Value" for each value of the column "Label" when there are 3 consecutives values that are the same and otherwise just report the values of the column "Value".
I do not mind about the missing values at the beggining and at the end, they can stay.
Expected result of the column test: NA, NA, 3,3,3,1,2,3,3,3,NC,3,3,3,NA,NA
in excel it was very easy and I coded successfully as follow:
=IF(AND(BN6=BN5,BN6=BN4),X4,Y6)
but in R I cannot.
I tried several methods, the closest to a result is the following:
test <-c(NA,NA)
test_tot <-NULL
for(i in 3:length(dat1$Label)){
test_tot<-c(test_tot, test)
if( dat1$Label[i]==dat1$Label[i+1]&& dat1$Label[i]==dat1$Label[i+2] ){
test<-max(as.numeric(c(dat1$Value[i],dat1$Value[i+1],dat1$Value[i+2])))
}
if(dat1$Label[i]==dat1$Label[i-1]&& dat1$Label[i]==dat1$Label[i+1]){
test<-max(as.numeric(c(dat1$Value[i],dat1$Value[i-1],dat1$Value[i+1])))
}
if(dat1$Label[i]==dat1$Label[i-1]&& dat1$Label[i]==dat1$Label[i-2]){
test<-max(as.numeric(c(dat1$Value[i],dat1$Value[i-1],dat1$Value[i-2])))
}
else {test<-dat1$Value[i]}
}
test_tot<-c(test_tot,NA,NA)
dat1$test<-test_tot
EDIT:
The difficulty apparently is that the column "Value" has character based values. Any solution able to deal with it is greatly appreciated.

Edit: The OP has pointed out that column Value may contain character-based values which are important to identify a specific behaviour happened at a specific time.
Consequently, the whole vector or column is of type character in R (or factor). The code below has been amended to handle this by extracting numeric values to a separate column, computing the maximum values per group, coercing the result back to character and to copy the character-based values into the result.
The data.table solution below
Label<-c(0,0,1,1,1,2,2,3,3,3,4,5,5,5,6,6)
Value<-c(NA,NA,1,2,3,1,2,3,2,1,"NC",1,3,2,1,NA)
Expected <- c(NA, NA, 3,3,3,1,2,3,3,3,"NC",3,3,3,NA,NA)
dat1<-data.frame(Label, Value, Expected)
library(data.table) # CRAN version 1.10.4 used
# coerce to data.table
setDT(dat1)[
# create temporary column with only numeric values
, Value_num := as.numeric(as.character(Value))][
# create temp cols for group id and group size
, `:=`(grp = .GRP, N = .N), by = rleid(Label)][
# for sufficiently large groups compute max values and coerce to char
N >= 3, new := as.character(max(Value_num)), by = grp][
# copy missing values
is.na(new), new := as.character(Value)][
# clean up
, c("grp", "N", "Value_num") := NULL][]
returns the expected result
Label Value Expected new
1: 0 NA NA NA
2: 0 NA NA NA
3: 1 1 3 3
4: 1 2 3 3
5: 1 3 3 3
6: 2 1 1 1
7: 2 2 2 2
8: 3 3 3 3
9: 3 2 3 3
10: 3 1 3 3
11: 4 NC NC NC
12: 5 1 3 3
13: 5 3 3 3
14: 5 2 3 3
15: 6 1 NA 1
16: 6 NA NA NA
except for row 15 where I believe the expected result should be 1 if we follow the words of the OP otherwise just report the values of the column "Value"
The warning message:
In eval(jsub, SDenv, parent.frame()) : NAs introduced by coercion
can be ignored as it's intended to convert non-numbers to NA, here.

Here is a dplyr solution. . NOTE: NC was changed to NA
Label<-c(0,0,1,1,1,2,2,3,3,3,4,5,5,5,6,6)
Value<-c(NA,NA,1,2,3,1,2,3,2,1,NA,1,3,2,1,NA)
dat1<-as.data.frame(cbind(Label, Value))
library(dplyr)
dat1 %>%
filter(!is.na(Value)) %>%
group_by(Label) %>%
summarize(n = n(), max_Value = max(Value)) %>%
mutate(test = if_else(n>=3, max_Value, as.numeric(NA))) %>%
right_join(dat1, by = "Label") %>%
mutate(test = if_else(is.na(test), Value, test)) %>%
select(Label, Value, test)
# # A tibble: 16 × 3
# Label Value test
# <dbl> <dbl> <dbl>
# 1 0 NA NA
# 2 0 NA NA
# 3 1 1 3
# 4 1 2 3
# 5 1 3 3
# 6 2 1 1
# 7 2 2 2
# 8 3 3 3
# 9 3 2 3
# 10 3 1 3
# 11 4 NA NA
# 12 5 1 3
# 13 5 3 3
# 14 5 2 3
# 15 6 1 1
# 16 6 NA NA

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

R Fill backwards with flexible window based on number of rows in a separate column - r

Related

R - Merging rows with numerous NA values to another column

Removing NAs from a large dataframe

How to divide all previous observations by the last observation iteratively within a data frame column by group in R and then store the result

R - delete consecutive (ONLY) duplicates

if condition is true find max in 3 consecutive rows and report it in a new column - r

Categories

Resources