This question already has answers here:
Create counter within consecutive runs of certain values
(6 answers)
Closed 5 years ago.
I have a vector, for example
ind <- c(TRUE,FALSE,TRUE,TRUE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE)
and I want to count consecutive "TRUE" values: the counter should increase within each run of "TRUE" values, restart at 1 after any "FALSE", and be 0 at every "FALSE" position. The result for the example above should be
result <- c(1,0,1,2,0,0,0,1,2,3,0)
Any ideas how to do this nicely?
rle computes "the lengths and values of runs of equal values in a vector"
sequence creates for "each element of nvec the sequence seq_len(nvec[i])"
logical values are automatically coerced to 0/1 when multiplied with numbers
All these functions together:
sequence(rle(ind)$lengths) * ind
#[1] 1 0 1 2 0 0 0 1 2 3 0
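To see how the three pieces combine, here is a step-by-step sketch of the same one-liner on the example vector. The intermediate names runs and counter are only illustrative and are not part of the original answer.

```r
ind <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE)

# Step 1: lengths of the runs of consecutive equal values
runs <- rle(ind)
runs$lengths
#[1] 1 1 2 3 3 1

# Step 2: restart a 1..n counter inside every run (TRUE and FALSE runs alike)
counter <- sequence(runs$lengths)
counter
#[1] 1 1 1 2 1 2 3 1 2 3 1

# Step 3: multiplying by ind (FALSE counts as 0) zeroes out the FALSE positions
result <- counter * ind
result
#[1] 1 0 1 2 0 0 0 1 2 3 0
```

The multiplication in the last step is what discards the counters that sequence also produces for the "FALSE" runs.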
This question already has an answer here:
Include levels of zero count in result of table()
(1 answer)
Closed 2 years ago.
I have a list of integers that are all between 1 and 365. Some integers appear multiple times and some do not appear at all. I would like to use a function like count to build a data frame with the number of occurrences of each integer, including a count of 0 for integers that never appear.
df
x freq
1 0
2 1
3 3
4 0
Currently, the rows for 1 and 4 are missing from the output of my count call, df=count(list)
We can use factor with the levels specified explicitly, so that the missing elements are included and reported with a count of 0
table(factor(df$x, levels = 1:4))
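As a rough sketch of how this could produce the data frame from the question, the example below uses a made-up input vector x (an assumption, chosen to match the counts shown in the question) and renames the default Var1 and Freq columns that as.data.frame gives a table. For the full problem the levels would presumably be 1:365 instead of 1:4.

```r
# Hypothetical input: 2 appears once, 3 appears three times, 1 and 4 never
x <- c(2, 3, 3, 3)

# Fixing the levels makes table() report a count for every value in 1:4
counts <- table(factor(x, levels = 1:4))

# Convert to a data frame; Var1 and Freq are the default column names
out <- as.data.frame(counts, stringsAsFactors = FALSE)
names(out) <- c("x", "freq")
out$x <- as.integer(out$x)

out$freq
#[1] 0 1 3 0
```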
This question already has answers here:
How to count the frequency of a string for each row in R
(4 answers)
Counting number of instances of a condition per row R [duplicate]
(1 answer)
Closed 3 years ago.
I have the following data frame and I want to count the occurrences of the word "High" in each row and append the counts as another column, say "countHigh", to the data frame
a b c
1 High High High
2 High Low High
3 Low Low High
So I should get a vector of counts (3,2,1).
I've tried apply() and stringr::str_count as follows:
> apply(test.df[,1:3],1,str_count,"High" )
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 1 0 0
[3,] 1 1 1
and I used the apply() function twice:
> apply(apply(test.df[,1:3],1,str_count,"High" ),2,sum)
[1] 3 2 1
Is there a better way to do this, particularly using apply() just once and using grep() or which() ?
Thanks
If it is a fixed string, create a logical matrix with == and take its rowSums in base R (this should be fast compared to apply)
test.df$countHigh <- rowSums(test.df == "High")
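A self-contained sketch of this approach, rebuilding the question's data by hand, might look as follows. Restricting the comparison to columns a, b and c is an addition here, so that re-running the line after countHigh exists still compares only the original columns.

```r
# The example data from the question, entered manually
test.df <- data.frame(
  a = c("High", "High", "Low"),
  b = c("High", "Low", "Low"),
  c = c("High", "High", "High"),
  stringsAsFactors = FALSE
)

# test.df[...] == "High" is a logical matrix; rowSums counts the TRUEs per row
test.df$countHigh <- rowSums(test.df[, c("a", "b", "c")] == "High")
test.df
```

For this data the new column holds the counts 3, 2 and 1. If the columns can contain NA, rowSums(..., na.rm = TRUE) would count only the non-missing matches.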
Say I have 100 numerical values in a column. I want to create a vector that returns 0 when the numerical value of an entry is greater than or equal to 2, and returns 1 if the numeric value is less than 2. I suppose it's trivial but still don't know how...
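One possible way to do this, sketched on a short made-up vector x standing in for the column (an assumption), is to coerce the comparison x < 2 to integer:

```r
# Hypothetical stand-in for the numeric column
x <- c(0.5, 2, 3.7, 1.9, 2.0, -1)

# x < 2 is TRUE/FALSE; as.integer() maps TRUE to 1 and FALSE to 0
flag <- as.integer(x < 2)
flag
#[1] 1 0 0 1 0 1
```

ifelse(x < 2, 1, 0) gives the same values as doubles; in both versions an NA in x stays NA in the result.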
This question already has answers here:
Cumulative number of unique values in a column up to current row
(2 answers)
Closed 4 years ago.
My data frame looks something like this:
USER URL
1 homepage.com
1 homepage.com/welcome
1 homepage.com/overview
1 homepage.com/welcome
What I want is a vector with the following values:
UNIQUE
1
2
3
3
How do I do that?
We could use cumsum and duplicated
df$unique <- cumsum(!duplicated(df$URL))
df$unique
#[1] 1 2 3 3
duplicated gives us a logical vector indicating whether each value is a duplicate of an earlier one; we negate it (!) and then apply cumsum, which yields the cumulative number of unique values.
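The intermediate results can be traced on the question's data, entered here by hand as a small sketch:

```r
df <- data.frame(
  USER = c(1, 1, 1, 1),
  URL = c("homepage.com", "homepage.com/welcome",
          "homepage.com/overview", "homepage.com/welcome"),
  stringsAsFactors = FALSE
)

# TRUE only where a URL has already appeared earlier in the vector
duplicated(df$URL)
#[1] FALSE FALSE FALSE  TRUE

# First occurrences become TRUE (counted as 1); repeats become FALSE (0)
!duplicated(df$URL)
#[1]  TRUE  TRUE  TRUE FALSE

# The running total increases only at first occurrences
df$unique <- cumsum(!duplicated(df$URL))
df$unique
#[1] 1 2 3 3
```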
Using dplyr to add a new column:
library(dplyr)
df %>%
  mutate(Dups = cumsum(!duplicated(URL)))
This question already has an answer here:
How to create a TRUE or FALSE column based on regexpr() findings in R?
(1 answer)
Closed 5 years ago.
df <-
SUB CONC
1 baseline (predose)
2 screen
2 predose
I want to add a flag: if the CONC column contains "predose", regardless of anything else in the cell, the flag is 1; otherwise it is 0.
dfout <-
SUB CONC PREDOSE
1 baseline (predose) 1
2 screen 0
2 predose 1
How can I do this in R? I used RStudio.
We can use grepl with the pattern 'predose' to create a logical vector and then coerce it to binary with as.integer
df$PREDOSE <- as.integer(grepl('predose', df$CONC))
df$PREDOSE
#[1] 1 0 1
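For a reproducible version, the sketch below rebuilds the question's data by hand. Using fixed = TRUE (match the text literally rather than as a regular expression) and the case-insensitive variant at the end are additions that go beyond the original answer, and only matter if the search text contains regex metacharacters or the capitalisation varies.

```r
df <- data.frame(
  SUB = c(1, 2, 2),
  CONC = c("baseline (predose)", "screen", "predose"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE wherever the pattern occurs anywhere in the string
df$PREDOSE <- as.integer(grepl("predose", df$CONC, fixed = TRUE))
df$PREDOSE
#[1] 1 0 1

# Assumption: if capitalisation may vary (e.g. "Predose"), ignore case
as.integer(grepl("predose", c("Predose", "screen"), ignore.case = TRUE))
#[1] 1 0
```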