For loop with condition in R

I'm new to programming in R.
I have a list which contains numbers between 0 and 5, stored in results2. I want to count how many times 1 appears directly before 5. I have done this:
counting<-function(lista,n,m){
p=2
for (p in data_list){
if(results2[p]==n && results2[p-1]==m){
length(p)
}
p<-p+1
}
}
counting(results2,5,1)
Can anyone please give me some helpful advice to improve my code, since it does not work?

We loop over the list with sapply. For each element we find the positions of every 5 with which, drop a 5 in position 1 (it has no predecessor), and count how many of the preceding values equal 1 by building a logical vector with == and calling sum on it. If there is no usable 5, NA is returned.
sapply(data_list, function(x) {
i1 <- which(x == 5)
i2 <- i1[i1 > 1]
if(length(i2) > 0) {
sum(x[i2-1] == 1)
} else NA_real_
})
#[1] 3 3
Or in tidyverse, we can make use of lag
library(dplyr)
library(purrr)
map_dbl(data_list, ~ sum(.x == 5 & lag(.x) == 1, na.rm = TRUE))
#[1] 3 3
data
data_list <- list(c(3,4,1,5 ,2,3,1,5,4,1,5),
c(3,4,1,5 ,2,3,1,5,4,1,5))
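The loop from the question can also be repaired directly. The following is only a minimal sketch, assuming results2 is a plain numeric vector; the function name count_before and its argument names are made up for illustration. It iterates over positions rather than values, starts at position 2 so that x[p - 1] always exists, and keeps a running counter that is returned at the end.

```r
# Hypothetical helper: count how often `before` sits directly in front of `after`
count_before <- function(x, after, before) {
  n <- 0
  # skip position 1, it has no predecessor
  for (p in seq_along(x)[-1]) {
    if (x[p] == after && x[p - 1] == before) {
      n <- n + 1
    }
  }
  n
}

# Assumed example data (same values as one element of data_list)
results2 <- c(3, 4, 1, 5, 2, 3, 1, 5, 4, 1, 5)
count_before(results2, 5, 1)
#[1] 3
```

For a list of vectors, sapply(data_list, count_before, after = 5, before = 1) would apply the same function to each element.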

Related

find the best combination of rows in a matrix

I need help writing a function in R that selects, from a matrix of 0s and 1s, as many rows as possible such that each column has a single 1 among the selected rows and as many columns as possible contain a 1.
in the following example
m<- matrix(c(1,1,0,0,0,0,0,1,0,1,1,0,0,1,1,0,1,0,1,0), ncol = 3 , nrow = 5)
print(m)
the result would be the rows 2,3,4
Here is a brute-force approach (assuming you always have fewer columns than rows)
Filter(
length,
combn(1:nrow(m),
ncol(m),
function(k) {
mat <- m[k, , drop = FALSE]
if (all(rowSums(mat) == 1) && all(colSums(mat) == 1)) {
return(k)
}
},
simplify = FALSE
)
)
which gives
[[1]]
[1] 2 3 4
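As a quick sanity check of that result, the selected rows can be inspected directly. This is a sketch under one assumption: the matrix below is built from only the first 15 values of the vector in the question, since a 5 x 3 matrix has room for no more than that.

```r
# Assumption: only the first 15 values of the question's vector fill the 5 x 3 matrix
m <- matrix(c(1,1,0,0,0, 0,0,1,0,1, 1,0,0,1,1), ncol = 3, nrow = 5)

# Rows reported by the brute-force search
sel <- m[c(2, 3, 4), , drop = FALSE]

rowSums(sel)  # every selected row holds exactly one 1
colSums(sel)  # every column is covered exactly once
```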

how to select values from a list within a Range using while and if loop?

I have a list of values and I need to select the values that are greater than or equal to 3 and less than or equal to 4, but I don't know how to do so using while and if. Could anyone give me a clue on how to solve this?
If I understood correctly what you have in mind, you can use the following solution. Imagine we have a vector of length 20 called vec:
We first create an empty vector out to collect any matching values found during the iterations.
Then we set the iterator i to its first value (here 1).
A while loop begins by testing its condition (i <= length(vec)); if it holds, the body runs: the if clause checks the current value and appends it to out when it meets our requirement (>= 3 and <= 4). The iterator is then increased by one and the condition is evaluated again, and so on until it fails.
vec <- sample(1:10, size = 20, replace = TRUE)
out <- c()
i <- 1
while(i <= length(vec)) {
if(vec[i] <= 4 & vec[i] >= 3) {
out <- c(out, vec[i])
}
i <- i + 1
}
out
[1] 4 3
Actually you don't need any while/if statements, you can simply apply
x[x >= 3 & x <= 4]
If you have to use while and if to make it, below is one option
k <- 1
res <- c()
while(k <= length(x)) {
if (x[k] >= 3 & x[k] <= 4) {
res <- append(res,x[k])
}
k <- k + 1
}
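To see that the loop and the vectorised subset agree, here is a small self-contained sketch. The values in x are made up for illustration, and the last element is deliberately a match so that the loop condition (k <= length(x)) matters.

```r
x <- c(1, 3, 7, 4, 10, 3.5, 4)  # made-up data; the last element is a match

res <- c()
k <- 1
while (k <= length(x)) {        # <= so the final element is tested too
  if (x[k] >= 3 && x[k] <= 4) {
    res <- c(res, x[k])
  }
  k <- k + 1
}

# Both approaches should select the same values in the same order
identical(res, x[x >= 3 & x <= 4])
```

Inside if, && is used because each test involves a single value; the vectorised subset needs & because it compares whole vectors.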

Repeated conditional change with sapply or a loop in R

I am trying to do a conditional change for a list of 11 columns in R. My conditional is always the same survey$only0 == 1. I wrote the following code:
survey$w.house[survey$only0 == 1] <- 1
survey$w.inc[survey$only0 == 1] <- 1
survey$w.jobs[survey$only0 == 1] <- 1
survey$w.com[survey$only0 == 1] <- 1
survey$w.edu[survey$only0 == 1] <- 1
survey$w.env[survey$only0 == 1] <- 1
survey$w.health[survey$only0 == 1] <- 1
survey$w.satisf[survey$only0 == 1] <- 1
survey$w.safe[survey$only0 == 1] <- 1
survey$w.bal[survey$only0 == 1] <- 1
survey$w.civic[survey$only0 == 1] <- 1
My code works well, but I would like to shorten it using a loop or a function such as sapply or lapply. Does anyone know how to do it?
Thank you for your help !
David
We can do this easily with lapply by looping through the columns of interest ('nm1') and replacing their values with 1 where 'only0' is 1.
survey[nm1] <- lapply(survey[nm1], function(x) replace(x, survey$only0==1, 1))
Or, as @Vlo mentioned, the anonymous function call is not needed
survey[nm1] <- lapply(survey[nm1], replace, list = survey$only0==1, values=1)
where
nm1 <- c("w.house", "w.inc", "w.jobs", "w.com", "w.edu", "w.env",
"w.health", "w.satisf", "w.safe", "w.bal", "w.civic")
You can try,
survey[survey$only0 == 1, cols] <- 1
where cols are the columns for which you want to check the condition.
cols <- c("w.house", "w.inc", "w.jobs", "w.com", "w.edu", "w.env",
"w.health", "w.satisf", "w.safe", "w.bal", "w.civic")
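Both approaches can be compared on a toy data frame. This is only a sketch: the three-row survey below and its values are invented, and only two of the eleven columns are included to keep it short.

```r
# Invented toy data with two of the weight columns
survey <- data.frame(only0   = c(1, 0, 1),
                     w.house = c(5, 6, 7),
                     w.inc   = c(2, 3, 4))
cols <- c("w.house", "w.inc")

# Approach 1: lapply + replace, column by column
a <- survey
a[cols] <- lapply(a[cols], function(x) replace(x, a$only0 == 1, 1))

# Approach 2: one indexed assignment over rows and columns
b <- survey
b[b$only0 == 1, cols] <- 1

all.equal(a, b)  # the two results should match
```

Note that if only0 contained NA values, the indexed assignment would need which(survey$only0 == 1) or a similar guard, since NA is not allowed in a subscripted data frame assignment.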

Possible combinations using R

I have edited my question and changed certain lines in my script to make it clearer that I want to find the number of times the output is 1 or 0.
I have 19 variables. I tried to generate all possible combinations of these 19 binary variables, i.e. 2 to the power of 19 (524,288) rows, each giving a binary output of 0 or 1. But I couldn't display the truth table in R for all 524,288 combinations because of limited memory. Is there any way to find the number of combinations that give the output 1 and the number that give 0? Below is the script, in which the inputs are combined with logical AND and OR. Kindly give me ideas or suggestions for finding the number of times I get 0 or 1 as output.
n <- 19
l <- rep(list(0:1), n)
inputs <- expand.grid(l)
len <-dim(inputs)
len <-len[1]
output <- 1;
for(i in 1:len)
{
if((inputs[i,1] == 1 & inputs[i,2] == 1 & inputs[i,3] == 1 & (inputs[i,4] == 1 & inputs[i,5] == 1 | inputs[i,6] == 1 & inputs[i,7] == 0)) | (inputs[i,1] == 1 & inputs[i,2] == 1 & inputs[i,8] == 1 & inputs[i,9] == 1) | (inputs[i,1] == 1 & inputs[i,10] == 0 & inputs[i,11] == 0) |(inputs[i,1] == 1 & inputs[i,12] == 1 & inputs[i,13] == 1 & inputs[i,14] == 1) | (inputs[i,1] == 1 & inputs[i,15] == 1 & inputs[i,16] == 1) | (inputs[i,1] == 1 & inputs[i,17] == 0) | (inputs[i,1] == 1 & inputs[i,18] == 1 & inputs[i,19] == 1)){
output[i] <- 1
}
else
{
output[i] <- 0
}
}
data <- cbind(inputs, output)
write.csv(data, "data.csv", row.names=FALSE)
1048576 isn't absurdly big. If all you want are the 20 0/1 columns it takes about 80 Mb if you use integers:
x = replicate(n = 20, expr = c(0L, 1L), simplify = FALSE)
comb = do.call(expand.grid, args = x)
dim(comb)
# [1] 1048576 20
format(object.size(comb), units = "Mb")
# [1] "80 Mb"
In your question you use && a lot. && is good for comparing something of length 1. Use & for a vectorized comparison so you don't need a for loop.
For example:
y = matrix(c(1, 1, 0, 0, 1, 0, 1, 0), nrow = 4)
y[, 1] & y[, 2] # gives the truth table for & applied across columns
# no for loop needed
# R will interpret 0 as FALSE and non-zero numbers as TRUE
# so you don't even need the == 1 and == 0 parts.
It seems like you're really after the number of combinations where all the values are 1. (Or where they all have specific values.) I'm not going to give away the answer here because I suspect this is for homework, but I will say that you shouldn't need to program a single line of code to find that out. If you understand what the universe of 'all possible combinations' is, the answer will be quite clear logically.
I guess this is what you want:
key <- c(1,0,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,1,1,1) # based on your if condition
inputs <- expand.grid(rep(list(0:1), 20))
len <- nrow(inputs)
output <- sapply(1:len, function(i) all(inputs[i,]==key))
data <- cbind(inputs, as.numeric(output))
write.csv(data, "data.csv", row.names=FALSE)
Although, as stressed by others, key can be found only in one row out of all 1048576 rows.
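For the 19-variable condition as it now stands in the question, the counts can be obtained without a for loop by evaluating the expression on whole columns with & and |. The following is a sketch, not a tested replacement for the original script: v is a hypothetical helper introduced here, and the expression is a transcription of the if condition with the common inputs[i,1] == 1 term factored out.

```r
n <- 19
inputs <- expand.grid(rep(list(0:1), n))

# Hypothetical helper: TRUE where column j equals 1
v <- function(j) inputs[[j]] == 1

# Vectorised form of the question's condition (column 1 is required by every term)
output <- v(1) & (
  (v(2) & v(3) & ((v(4) & v(5)) | (v(6) & !v(7)))) |
  (v(2) & v(8) & v(9)) |
  (!v(10) & !v(11)) |
  (v(12) & v(13) & v(14)) |
  (v(15) & v(16)) |
  !v(17) |
  (v(18) & v(19))
)

# Number of combinations giving 0 and 1
table(as.integer(output))
```

Because every term requires the first input to be 1, at least half of all rows give 0; table reports the exact split.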

How to create a conditional dummy in R?

I have a dataframe of time series data with daily observations of temperatures. I need to create a dummy variable that counts each day that has a temperature above a threshold of 5C. This would be easy in itself, but an additional condition exists: counting starts only after ten consecutive days above the threshold occur. Here's an example dataframe:
df <- data.frame(date = seq(365),
temp = -30 + 0.65*seq(365) - 0.0018*seq(365)^2 + rnorm(365))
I think I got it done, but with too many loops for my liking. This is what I did:
df$dummyUnconditional <- 0
df$dummyHead <- 0
df$dummyTail <- 0
for(i in 1:nrow(df)){
if(df$temp[i] > 5){
df$dummyUnconditional[i] <- 1
}
}
for(i in 1:(nrow(df)-9)){
if(sum(df$dummyUnconditional[i:(i+9)]) == 10){
df$dummyHead[i] <- 1
}
}
for(i in 9:nrow(df)){
if(sum(df$dummyUnconditional[(i-9):i]) == 10){
df$dummyTail[i] <- 1
}
}
df$dummyConditional <- ifelse(df$dummyHead == 1 | df$dummyTail == 1, 1, 0)
Could anyone suggest simpler ways for doing this?
Here's a base R option using rle:
df$dummy <- with(rle(df$temp > 5), rep(as.integer(values & lengths >= 10), lengths))
Some explanation: The task is a classic use case for the run length encoding (rle) function, imo. We first check if the value of temp is greater than 5 (creating a logical vector) and apply rle on that vector resulting in:
> rle(df$temp > 5)
#Run Length Encoding
# lengths: int [1:7] 66 1 1 225 2 1 69
# values : logi [1:7] FALSE TRUE FALSE TRUE FALSE TRUE ...
Now we want to find those runs where values is TRUE (i.e. temp is greater than 5) and where at the same time lengths is at least 10 (i.e. at least ten consecutive temp values are greater than 5). We do this by running:
values & lengths >= 10
And finally, since we want to return a vector of the same length as nrow(df), we use rep(..., lengths), and as.integer in order to return 1/0 instead of TRUE/FALSE.
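The same steps can be followed on a short made-up vector, which makes the intermediate pieces easy to inspect. The values in temp below are invented for illustration: a run of two warm days, one cold day, a run of ten warm days, one cold day and one final warm day.

```r
# Invented temperatures: runs of length 2, 1, 10, 1, 1
temp <- c(6, 6, 2, rep(7, 10), 1, 8)

r <- rle(temp > 5)
r$lengths  # 2 1 10 1 1
r$values   # TRUE FALSE TRUE FALSE TRUE

# Keep only TRUE runs of at least 10 days, then expand back to full length
dummy <- rep(as.integer(r$values & r$lengths >= 10), r$lengths)
dummy
# [1] 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0
```

Only the ten-day run is flagged; the shorter warm spells at the start and end stay 0.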
I think you could use a combination of a simple ifelse and the rollapply function in the zoo package to achieve what you are looking for. The final step just involves padding the result to account for the first N-1 days, where there isn't enough information to fill the window.
library(zoo)
df <- data.frame(date = seq(365),
temp = -30 + 0.65*seq(365) - 0.0018*seq(365)^2 + rnorm(365))
df$above5 <- ifelse(df$temp > 5, 1, 0)
temp <- rollapply(df$above5, 10, sum)
df$conseq <- c(rep(0, 9),temp)
I would do this:
set.seed(42)
df <- data.frame(date = seq(365),
temp = -30 + 0.65*seq(365) - 0.0018*seq(365)^2 + rnorm(365))
thr <- 5
df$dum <- 0
#find first 10 consecutive values above threshold
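test1 <- stats::filter(df$temp > thr, rep(1,10), sides = 1) == 10L  # stats:: avoids masking by dplyr::filter
test1[1:9] <- FALSE  # the first 9 positions are NA (incomplete window)
n <- which(cumsum(test1) == 1L)[1]  # end of the first 10-day run; [1] keeps a single index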
#count days above threshold after that
df$dum[(n+1):nrow(df)] <- cumsum(df$temp[(n+1):nrow(df)] > thr)
