I was hoping to create a function with the if statements in the following code:
data <- data.frame(
id = c(1, 5, 6, 11, 15, 21),
intervention = c(2, 2, 2, 1, 1, 1),
death = c(0, 1, 0, 1, 0, 0)
)
test <- c()
for (i in data$id[which(data$intervention == 1)]) {
print(paste0("id = ", i, ": "))
for (j in data$id[which(data$intervention == 2)]) {
if (data$death[data$id == i] < data$death[data$id == j]) {
test <- c(test, -1)
} else if (data$death[data$id == i] > data$death[data$id == j]) {
test <- c(test, 1)
} else if (data$death[data$id == i] == data$death[data$id == j]) {
test <- c(test, 0)
}
}
print(test)
test <- c()
}
I tried to do it as follows, but the code does not write the result to the vector. If I replace return with print, however, it prints the result. Does anyone have any suggestions on what I might be doing wrong? Many thanks!
first <- function () {
if(data$death[data$id == i]<data$death[data$id ==j]){
return (test <- c(test,-1))}
else if(data$death[data$id == i]>data$death[data$id == j]){
return (test <- c(test,1))}
else if(data$death[data$id == i]==data$death[data$id == j]){
return (test <- c(test,0))}
}
for (i in data$id[which(data$intervention == 1)]){
for (j in data$id[which(data$intervention == 2)]){
first()
}
test
}
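One likely reason the attempt above leaves test unchanged: the assignment inside first() only modifies a copy that is local to the function, and the value handed back by return() is never stored by the caller. A minimal sketch of one way around this, assuming a hypothetical helper compare_death() (not part of the original code) that returns the comparison so that the loop itself can append it:

```r
data <- data.frame(
  id = c(1, 5, 6, 11, 15, 21),
  intervention = c(2, 2, 2, 1, 1, 1),
  death = c(0, 1, 0, 1, 0, 0)
)

# Hypothetical helper: returns -1, 0 or 1 for one pair of ids
compare_death <- function(data, i, j) {
  sign(data$death[data$id == i] - data$death[data$id == j])
}

results <- list()
for (i in data$id[data$intervention == 1]) {
  test <- c()
  for (j in data$id[data$intervention == 2]) {
    # the assignment happens in the calling scope, so test really grows
    test <- c(test, compare_death(data, i, j))
  }
  results[[as.character(i)]] <- test
}
results
```

The function only computes a value; storing it is left to the caller, which avoids relying on variables outside the function.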
The following function returns a list of the wanted vectors.
first <- function(data, interv1 = 1, interv2 = 2) {
inx <- which(data[["intervention"]] == interv1)
jnx <- which(data[["intervention"]] == interv2)
out <- lapply(inx, \(i) {
sapply(jnx, \(j) sign(data[["death"]][i] - data[["death"]][j]))
})
setNames(out, data[["id"]][inx])
}
first(data)
#> $`11`
#> [1] 1 0 1
#>
#> $`15`
#> [1] 0 -1 0
#>
#> $`21`
#> [1] 0 -1 0
Created on 2022-11-22 with reprex v2.0.2
You can then access the return values as
test <- first(data)
# any of the following extracts the 1st vector
test[['11']]
#> [1] 1 0 1
# notice the backticks
test$`11`
#> [1] 1 0 1
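As an alternative to the nested lapply/sapply, the same comparisons can be sketched with base R's outer(), which returns one matrix instead of a list. This is a different technique from the answer above, shown under the assumption that a matrix with ids as row and column names is an acceptable result format:

```r
data <- data.frame(
  id = c(1, 5, 6, 11, 15, 21),
  intervention = c(2, 2, 2, 1, 1, 1),
  death = c(0, 1, 0, 1, 0, 0)
)

is1 <- data$intervention == 1
is2 <- data$intervention == 2

# every death value in group 1 minus every death value in group 2
cmp <- sign(outer(data$death[is1], data$death[is2], "-"))
dimnames(cmp) <- list(as.character(data$id[is1]), as.character(data$id[is2]))

cmp["11", ]  # comparisons of id 11 against ids 1, 5 and 6
```

Each row of cmp corresponds to one element of the list returned by first(data).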
I have a data frame that looks like this:
is_severe encoding sn_id
1 1 1
0 2 1
1 2 2
0 1 2
1 1 2
I am applying the following function with by():
catt <-
function(y, x, score = c(0, 1, 2)) {
miss <- unique(c(which(is.na(y)), which(is.na(x))))
n.miss <- length(miss)
if(n.miss > 0) {
y <- y[-miss]
x <- x[-miss]
}
if(!all((y == 0) | (y == 1)))
stop("y should be only 0 or 1.")
if(!all((x == 0) | (x == 1) |(x == 2)))
stop("x should be only 0, 1 or 2.")
ca <- x [y == 1]
co <- x [y == 0]
htca <- table(ca)
htco <- table(co)
A <- matrix(0, 2, 3)
colnames(A) <- c(0, 1, 2)
rownames(A) <- c(0, 1)
A[1, names(htca)] <- htca
A[2, names(htco)] <- htco
ptt <- prop.trend.test(A[1, ], colSums(A), score = score)
#list(#"2x3-table" = A,
#chisq = as.numeric(ptt$statistic),
#df = as.numeric(ptt$parameter),
res= p.value = as.numeric(ptt$p.value)
#n.miss = n.miss)
return(res)
}
When I run it:
by(es_test,es_test$sn_id, function (es_test) {catt(es_test$ï..is_severe,es_test$encoding)})
I get these results:
es_test$sn_id: 1
[1] 0.1572992
------------------------------------------------------------------------
es_test$sn_id: 2
[1] 0.3864762
This is not a very convenient format, because I want to work with the results further. Is there any way to get them as a list: [0.157, 0.386]?
I tried this:
result_pv=c(by(es_test,es_test$sn_id, function (es_test) {catt(es_test$ï..is_severe,es_test$encoding)}))
but it produced a double, and I want it as a vector or list.
The double:
Browse[6]> result_pv
1 2
0.1572992 0.3864762
> typeof(result_pv)
[1] "double"
What I want to do later is add result_pv to a data frame as a column, and when it is a double I can't do that.
Thank you.
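A note on the types involved: typeof() returning "double" means the result is already a numeric vector, only with names attached. A minimal sketch of getting a plain vector, a list, and a per-row column, assuming a cleaned column name is_severe and a hypothetical stand-in statistic group_stat() in place of catt():

```r
es_test <- data.frame(
  is_severe = c(1, 0, 1, 0, 1),
  encoding  = c(1, 2, 2, 1, 1),
  sn_id     = c(1, 1, 2, 2, 2)
)

# Hypothetical stand-in for catt(): any function returning one number per group
group_stat <- function(y, x) mean(x[y == 1])

# one value per sn_id, as a named numeric vector
res <- sapply(split(es_test, es_test$sn_id),
              function(d) group_stat(d$is_severe, d$encoding))

plain_vec <- unname(res)   # numeric vector without names
as_list   <- as.list(res)  # list, one element per sn_id

# repeat each group's value on every row of that group
es_test$stat <- unname(res[as.character(es_test$sn_id)])
es_test
```

Looking the values up by name (the sn_id as a character string) keeps each row aligned with its own group even if the rows are not sorted.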
Can someone tell me what is wrong with this function in R? The function works on a single input, but when I pass a vector I get warnings and only one result:
input_check3 <- function(x){
if (is.finite(x)) {
if (x %% 2 == 0){
print(TRUE)
} else {
print(FALSE)
}
} else {
NA
}
}
data_for_e2 <- c(1, 2, 4, 5, 3)
input_check3(data_for_e2)
#> [1] FALSE
#> Warning messages:
#> 1: In if (is.finite(x)) { : The length of the condition is greater than one, so only its first element can be used
#> 2: In if (x%%2 == 0) { : The length of the condition is greater than one, so only its first element can be used
You could use ifelse, which is a vectorized function:
input_check3 <- function(x){
ifelse(is.finite(x),
x %% 2 == 0, # equiv to ifelse(x %% 2 == 0, TRUE, FALSE), thanks Martin Gal!
NA)
}
Result
[1] FALSE TRUE TRUE FALSE FALSE
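The original version fails because if() expects a condition of length one; older R versions warn and use only the first element, and from R 4.2.0 on this is an error. A short sketch of how the ifelse approach behaves with non-finite input, using a hypothetical name input_check_vec() so that it does not overwrite the function above, together with an equivalent version based on plain indexing:

```r
# Hypothetical name; same logic as the ifelse answer above
input_check_vec <- function(x) {
  ifelse(is.finite(x), x %% 2 == 0, NA)
}

# Equivalent sketch without ifelse: compute everything, then blank out non-finite inputs
input_check_idx <- function(x) {
  out <- x %% 2 == 0
  out[!is.finite(x)] <- NA
  out
}

input_check_vec(c(1, 2, 4, 5, 3))
#> [1] FALSE  TRUE  TRUE FALSE FALSE
input_check_vec(c(1, 2, Inf, NA))
#> [1] FALSE  TRUE    NA    NA
```

Both versions return a logical vector of the same length as x, so the result can be used directly for subsetting or as a data frame column.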
I have a question regarding a for-loop within R's dplyr. Imagine I have the following dataframe:
id <- c(rep(8, 9))
check <- c(0,1,1,0,0,1,0,0,0)
df <- data.frame(id, check)
df$count_x <- cumsum(df$check)
df$count_y <- NA
df$count_y[1] <- ifelse(df$check[1] == 0, 0, 1)
co <- df$count_y[1]
I want to fill the variable count_y based on an adjusted cumulative function below:
for (idx in 2:nrow(df)){
if(df[idx, 2] == 1 & df[idx - 1, 2] == 0){
co <- 1
df[idx, 4] <- co
} else if (df[idx, 2] == 1 & df[idx - 1, 2] == 1){
co <- co + 1
df[idx, 4] <- co
} else if (df[idx, 2] == 0){
df[idx, 4] <- co
}
}
The output of this for-loop is correct. However, in my current data set, I have many IDs, and using a for loop to iterate over the IDs will take too much time. I'm trying to use the functionality of dplyr to speed up the process.
id <- c(rep(8, 9))
check <- c(0,1,1,0,0,1,0,0,0)
df <- data.frame(id, check)
df <- df %>% group_by(id) %>% mutate(count_x = cumsum(check),
count_y = NA) %>% ungroup()
df <- df %>% group_by(id) %>% mutate(count_y = replace(count_y, 1, ifelse(check[1] == 0, 0 , 1)))
count_n <- function(df){
co <- df$count_y[1]
for (idx in 2:nrow(df)){
if(df[idx, 2] == 1 & df[idx - 1, 2] == 0){
co <- 1
df[idx, 4] <- co
} else if (df[idx, 2] == 1 & df[idx - 1, 2] == 1){
co <- co + 1
df[idx, 4] <- co
} else if (df[idx, 2] == 0){
df[idx, 4] <- co
}
}
}
I want to use mutate to call the function count_n to fill count_y as described above. I'm aware that I'm passing just one variable, whereas I have to pass a data frame because the function relies on the columns 'check' (column 2) and 'count_y' (column 4). I have tried multiple options (mutate_at, all, etc.) but I couldn't make it work. What can I do differently?
df <- df %>% group_by(id) %>% mutate_at(vars(count_y), ~count_n(.))
I think this is the perfect case to use purrr::accumulate2().
purrr::accumulate() is often used to calculate conditional cumulative sums. It takes a function as the second argument. This function should have 2 arguments: the cumulative output co, and the currently evaluated value x.
purrr::accumulate2() allows us to use a second variable to iterate on, and here we use lag(check) as lx. The tricky part is that this second variable should be one item shorter, as it does not matter for the initial value.
Here is the code, matching your expected output.
library(tidyverse)
df = structure(list(id = c(8, 8, 8, 8, 8, 8, 8, 8, 8),
check = c(0, 1, 1, 0, 0, 1, 0, 0, 0),
count_x = c(0, 1, 2, 2, 2, 3, 3, 3, 3)),
row.names = c(NA, -9L), class = "data.frame")
df %>%
mutate(
count_y = accumulate2(check, lag(check)[-1], function(co, x, lx){
case_when(
x==0 ~ co,
x==1 & lx==0 ~ 1,
x==1 & lx==1 ~ co+1,
TRUE ~ 999 #error value in case of unexpected input
)
})
)
#> id check count_x count_y
#> 1 8 0 0 0
#> 2 8 1 1 1
#> 3 8 1 2 2
#> 4 8 0 2 2
#> 5 8 0 2 2
#> 6 8 1 3 1
#> 7 8 0 3 1
#> 8 8 0 3 1
#> 9 8 0 3 1
Created on 2021-05-05 by the reprex package (v2.0.0)
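For comparison, the same accumulation can be sketched in base R with Reduce(..., accumulate = TRUE), which plays the role of purrr::accumulate(). This is an alternative to the purrr answer, not the answerer's code; count_y_base() is a hypothetical name, and ave() is assumed here as the way to apply it per id:

```r
# Hypothetical helper: running counter that restarts at 1 after a 0 -> 1 change
count_y_base <- function(check) {
  prev <- c(NA, head(check, -1))          # lagged check
  step <- function(co, k) {
    if (check[k] == 0) co                 # carry the counter forward
    else if (prev[k] == 0) 1              # a new run of 1s starts
    else co + 1                           # the run of 1s continues
  }
  start <- if (check[1] == 0) 0 else 1
  Reduce(step, seq_along(check)[-1], start, accumulate = TRUE)
}

df <- data.frame(id = rep(8, 9), check = c(0, 1, 1, 0, 0, 1, 0, 0, 0))
df$count_y <- ave(df$check, df$id, FUN = count_y_base)  # applied within each id
df$count_y
#> [1] 0 1 2 2 2 1 1 1 1
```

Because ave() splits check by id before calling the helper, the counter restarts for every id without an explicit loop over the ids.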
The first issue is that your function doesn't return anything. The second issue is that you don't need mutate_at (or even mutate, which would be more appropriate for a single variable) when the function you are writing modifies the entire tibble. The simplest way to get it working is to add a return statement and call the function in the pipeline like so:
count_n <- function(df){
co <- df$count_y[1]
for (idx in 2:nrow(df)){
if(df[idx, 2] == 1 & df[idx - 1, 2] == 0){
co <- 1
df[idx, 4] <- co
} else if (df[idx, 2] == 1 & df[idx - 1, 2] == 1){
co <- co + 1
df[idx, 4] <- co
} else if (df[idx, 2] == 0){
df[idx, 4] <- co
}
}
return(df)
}
df %>% group_by(id) %>% count_n(.)
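Note that count_n() receives the whole tibble here, so the loop runs across all rows rather than restarting for each id; with a single id, as in the example, that makes no difference. However, I would use Dan's answer above because it's much cleaner and has the advantage of not running a for loop, which isn't very "R". :)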
I received the error
Error in if (x[i] == 0 && x[i - 1] > 0) { :
missing value where TRUE/FALSE needed
when running this function on a numeric vector
number_rn <- function(x) {
a <- 0
for (i in 1:length(x)) {
if (x[i] == 0 && x[i-1] > 0) {
a <- a +1
}
}
print(a)
}
However, the following function works fine:
number_rr <- function(x) {
a <- 0
for (i in 1:length(x)) {
if (x[i] > 0 && x[i-1] > 0) {
a <- a +1
}
}
print(a)
}
I note from previous answers to similar questions that this can occur if the if conditional does not have either a TRUE or FALSE result, but I do not believe this to be the case in my example. What could be causing this error?
There are several issues with the for loop (even if x does not contain any NA values):
In the first iteration (i == 1), x[i-1] refers to x[0], which is a zero-length vector because indexing in R starts at 1, so the comparison x[0] > 0 has no TRUE/FALSE value.
The code is using a for loop where vectorized functions can be used.
Unfortunately, starting the loop at i == 2, i.e., for (i in 2:length(x)), is not error-proof in case of a one element vector where length(x) == 1.
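To make the first point concrete, here is a small sketch with a hypothetical input whose first element is 0. This would be consistent with the reported behaviour, but it is an assumption about the asker's data:

```r
x <- c(0, 2, 0)            # hypothetical input starting with 0

x[0]                       # numeric(0): index 0 selects nothing
first_rhs <- x[0] > 0      # logical(0), neither TRUE nor FALSE
length(first_rhs)
#> [1] 0

# number_rr: x[1] > 0 is FALSE, so && short-circuits and never evaluates x[0] > 0
rr_first <- (x[1] > 0) && (x[0] > 0)
rr_first
#> [1] FALSE

# number_rn: x[1] == 0 is TRUE, so x[0] > 0 must be evaluated,
# and if() then has no TRUE/FALSE value to work with
```

Under that assumption, the different left-hand sides of && explain why one function fails in the first iteration while the other does not.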
My suggestion is to use the vectorized version
number_rn_vec <- function(x) {
n <- length(x)
sum(x[2:n] == 0 & x[1:(n - 1)] > 0, na.rm = TRUE)
}
This returns the count (a in the original function) without error for many use cases:
sapply(
list(
c(),
c(1),
c(1, 0),
c(1, 0, 3),
c(0, 1, 0, 3),
c(NA, 1, 0, 3),
c(1, NA, 0, 3),
c(1, 0, NA, 3),
c(1, 0, 3, NA)
),
number_rn_vec
)
[1] 0 0 1 1 1 1 0 1 1
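A variant of the same idea can be sketched with head() and tail(), which drop the last and first element respectively and therefore avoid the descending sequences 2:n and 1:(n - 1) for very short vectors. number_rn_ht() is a hypothetical name used only for this illustration:

```r
# Hypothetical variant: compare each element with its predecessor
number_rn_ht <- function(x) {
  current  <- tail(x, -1)   # elements 2..n (empty if n <= 1)
  previous <- head(x, -1)   # elements 1..(n - 1) (empty if n <= 1)
  sum(current == 0 & previous > 0, na.rm = TRUE)
}

number_rn_ht(c(0, 1, 0, 3))
#> [1] 1
number_rn_ht(c(1))
#> [1] 0
```

For vectors of length zero or one both pieces are empty, so the sum is 0 without any special casing.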
This is most likely occurring because your vector x has NULL or NA values. See what happens when I try to run an if condition with NULL values:
x <- NULL
if (x == 0 && x > 5) print("yes")
Make sure to remove any NAs or NULLs using is.na() or is.null() and you should be fine.
Say we have some vector:
someVector = c(1, 3, 4, 6, 3, 9, 2, -5, -2)
I want to get a vector that has the locations in someVector of all the odd elements
so in this case it would look like...
resultVector = c(1, 2, 5, 6, 8)
> which(someVector %% 2 == 1)
[1] 1 2 5 6 8
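A short sketch of why the modulo test also finds the negative odd element: in R the result of %% takes the sign of the divisor, so -5 %% 2 is 1 rather than -1. The variable name odd_positions is only illustrative:

```r
someVector <- c(1, 3, 4, 6, 3, 9, 2, -5, -2)

-5 %% 2                     # remainder takes the sign of the divisor
#> [1] 1

odd_positions <- which(someVector %% 2 == 1)
odd_positions
#> [1] 1 2 5 6 8

someVector[odd_positions]   # the odd elements themselves
#> [1]  1  3  3  9 -5
```

which() returns the integer positions of the TRUE values, which is the vector of locations asked for.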
library(schoolmath)
which(is.odd(someVector))
[1] 1 2 5 6 8
Just for fun, here is the code of the is.odd function:
function (x)
{
start <- 1
end <- length(x) + 1
while (start < end) {
y <- x[start]
if (y == 0) {
cat("Please enter a number > 0")
end
}
test1 <- y/2
test2 <- floor(test1)
if (test1 != test2) {
if (start == 1) {
result = TRUE
}
else {
result <- c(result, TRUE)
}
}
else {
if (start == 1) {
result = FALSE
}
else {
result <- c(result, FALSE)
}
}
start <- start + 1
}
return(result)
}
Definitely don't use this function!