R: adding logical to sapply - r

I am trying to run this code:
check_zeros <- function(x) { # WIP
if (x == 0) {
!(df[gsub('\\b0+','',format(as.Date(formation$study_start_dates_list[i]),'%m/%d/%Y')), names(x)] == df[gsub('\\b0+','',format(as.Date(formation$study_end_dates_list[i]),'%m/%d/%Y')), names(x)])
}
}
remove_undesired_stocks2 <- function(n) {
i = 1
listofdfs_filtered <- list()
for (i in 1:n) {
a <- subset(average_returns, row.names(average_returns) == i)
b <- as.data.frame(sapply(subset(average_returns, row.names(average_returns) == i), function(x) all(x == 0 | is.nan(x) | check_zeros(x) )))
c <- a[, !b]
listofdfs_filtered[[i]] <- c
}
return(listofdfs_filtered)
}
Error comes out as:
Error in if (x == 0) { : missing value where TRUE/FALSE needed
I think it is because a NaN is reaching the x == 0 test in the check_zeros function.
How can I overcome this? Thanks in advance.

I think I solved it myself:
The check_zeros function is built around if, whose condition must be a single TRUE or FALSE. It cannot handle an object of length > 1, and it fails with exactly this error when the condition is NA (which is what NaN == 0 evaluates to).
Since I was passing in an object of length > 1 that contained NaN, there was an error.
You should use ifelse in this case:
check_zeros <- function(x) {
ifelse(x == 0, (df[gsub('\\b0+','',format(as.Date(formation$study_start_dates_list[i]),'%m/%d/%Y')), names(x)] == df[gsub('\\b0+','',format(as.Date(formation$study_end_dates_list[i]),'%m/%d/%Y')), names(x)]), FALSE)
}
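To make the difference concrete, here is a minimal sketch on a made-up vector (x, msg, is_zero and flags are illustrative names, not part of the original code). It shows that if stops on a missing condition, while a vectorized test with an explicit is.nan() guard does not:

```r
# Made-up input containing a NaN, like the column that triggered the error
x <- c(0, NaN, 3)

# `if` needs a single non-missing TRUE/FALSE; NaN == 0 is NA, so it errors.
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch(
  if (NaN == 0) "zero" else "other",
  error = function(e) conditionMessage(e)
)
print(msg)

# Vectorized alternative: guard the NaN case first, then compare
is_zero <- !is.nan(x) & x == 0
flags <- ifelse(is_zero, "zero", "non-zero or NaN")
print(flags)
```

Note that ifelse(x == 0, ...) on its own still returns NA wherever x is NaN, so whether the guard is needed depends on how the caller treats NA.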
Cheers.

Related

"object 'i' not found" the for loop

I am trying to write a simple for loop that adds a new column to an existing data frame by assigning a time class to each observation. I am getting "object 'i' not found" and am not sure where the mistake is. I am not that experienced with R, so thank you for any help!
for (i in 1:nrow(my.data)) {
if(my.data$RTime[i] <= 3600){
my.data$RTimeHour[i] <- 1
}ifelse (my.data$RTime[i] > 3601 & my.data$RTime[i] < 7200){
my.data$RTimeHour[i] <- 2
}esle {
my.data$RTimeHour[i] <- 3
}
}
I think you can simply use a nested ifelse statement:
my.data$RTimeHour <- ifelse(my.data$RTime <= 3600, 1,
ifelse(my.data$RTime > 3600 & my.data$RTime <= 7200, 2, 3))
The error is probably coming from the ifelse() you have:
for (i in 1:nrow(my.data)) {
if(my.data$RTime[i] <= 3600){
my.data$RTimeHour[i] <- 1
}ifelse (my.data$RTime[i] > 3601 & my.data$RTime[i] < 7200){
my.data$RTimeHour[i] <- 2
}esle {
my.data$RTimeHour[i] <- 3
}
}
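Try this instead:
for (i in 1:nrow(my.data)) {
  if (my.data$RTime[i] <= 3600) {
    my.data$RTimeHour[i] <- 1
  } else if (my.data$RTime[i] > 3600 & my.data$RTime[i] < 7200) {
    # lower bound is 3600 (not 3601) so that a value of 3601 lands in class 2
    my.data$RTimeHour[i] <- 2
  } else {
    my.data$RTimeHour[i] <- 3
  }
}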
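ifelse() is a vectorized function, not part of the if/else syntax, so "}ifelse (...) {" (and the misspelled "esle") does not parse as a branch of the loop. Most likely the lines after the failed loop are then evaluated on their own, outside any loop, where i does not exist, which is where "object 'i' not found" comes from.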
Vectorized with tidyverse we can just do:
library(tidyverse)
my.data %>% mutate(RTimeHour = case_when(
RTime <= 3600 ~ 1,
RTime < 7200 ~ 2,
TRUE ~ 3)
)
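If a tidyverse dependency is not wanted, the same binning can be sketched in base R with findInterval(). This is a different technique from the answers above, shown on made-up data (the five RTime values are invented). The break points follow the nested ifelse answer, where exactly 7200 falls in class 2:

```r
# Made-up sample data; only the RTime column name comes from the question
my.data <- data.frame(RTime = c(100, 3600, 3601, 7200, 7201))

# Nested ifelse, as in the answer above
by_ifelse <- ifelse(my.data$RTime <= 3600, 1,
                    ifelse(my.data$RTime > 3600 & my.data$RTime <= 7200, 2, 3))

# findInterval with left.open = TRUE uses intervals of the form (a, b],
# returning 0 for values <= 3600, 1 for (3600, 7200], and 2 above 7200
by_interval <- findInterval(my.data$RTime, c(3600, 7200), left.open = TRUE) + 1

my.data$RTimeHour <- by_interval
print(my.data)
```

Both approaches work on the whole column at once, so no loop index is involved at all.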

I want to fix the behavior of my function

I'm writing a function that categorizes customers into 3 classes (A, B and C) based on their sales, but the function gives me weird results and I don't know why.
f <- function(x)
{
for(j in 1:length(x))
{
if(x[j] > 0 & x[j] < 501 )
{
x[j] = "C"
}
else if(x[j] > 500 & x[j] < 1001 )
{
x[j] = "B"
}
else if(x[j] > 1000 )
{
x[j] = "A"
}
}
return(x)
}
This is the function.
print(f(c(2000,2000,2000)))
When I run this, for example, it gives me A, C, C when it should be all A.
print(f(c(600,600)))
This gives B first, which is right, but then A!
As noted by @shwan, you were overwriting the x vector with character values. To avoid defining another vector for the result, and also to avoid the loop, you could just use the vectorized ifelse function and write your function as:
f=function(x){ifelse(x>0 & x<501,"C",ifelse(x>500 & x <1001,"B","A"))}
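By using x[j] = "C", you are coercing the whole of x to class 'character'. From then on the remaining elements are compared as text rather than as numbers (for example "2000" < "501" is TRUE), which gives the unexpected results.
You need to save the result in some other, character vector ('ret' below).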
f <- function(x) {
ret <- NA_character_
for(j in 1:length(x)) {
if(x[j] > 0 & x[j] < 501 ) {
ret <- c(ret,"C")
} else if(x[j] > 500 & x[j] < 1001 ) {
ret <- c(ret,"B")
} else if(x[j] > 1000 ) {
ret <- c(ret,"A")
}
}
ret <- ret[2:length(ret)] # remove the first element
return(ret)
}
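The coercion is easy to reproduce in isolation. The sketch below first shows it on the vector from the question, then gives a variant of the loop that preallocates the result instead of growing it. The name classify_sales and the NA result for values that match no class are assumptions made for illustration:

```r
# Reproduce the problem: one character value coerces the whole vector
x <- c(2000, 2000, 2000)
x[1] <- "A"
print(class(x))                  # the vector is now character
print(x[2] > 0 & x[2] < 501)     # TRUE, because "2000" is compared as text

# Hypothetical variant of the loop: x stays numeric, results go into ret
classify_sales <- function(x) {
  ret <- rep(NA_character_, length(x))   # NA where no class applies
  for (j in seq_along(x)) {
    if (x[j] > 0 && x[j] <= 500) {
      ret[j] <- "C"
    } else if (x[j] > 500 && x[j] <= 1000) {
      ret[j] <- "B"
    } else if (x[j] > 1000) {
      ret[j] <- "A"
    }
  }
  ret
}

print(classify_sales(c(2000, 2000, 2000)))
print(classify_sales(c(600, 600)))
```

Preallocating with rep() keeps the output the same length as the input even when a value matches none of the branches.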

Testing a prime number with loop 1: sqrt(x)

I am learning about loops and I have this code to check if a number is prime or not, but doesn't work. Where is the bug?
x <- 7
y <- seq(1,sqrt(x),by=1)
for(i in 1: sqrt(x)){
if(y[x%%y == 0]) {
print("FALSE")
}else{
print("TRUE")
}
}
This gives me the right solution, but it repeats the answer once for every element in i. I would also like to ask how to use a function inside a for loop together with if:
i <- c(1: sqrt(x))
y3 <- x%%i == 0
y4 <- y3[-1]
for(value in i){
if(y4 == FALSE) {
print("TRUE")
}else{
print("FALSE")
}
}
Version 3 gives me the solution, but once for every element in i:
x <- 107
i <- c(1: sqrt(x))
y3 <- c(x%%i == 0)
y4 <- y3[-1]
for(value in i){
if(all(y4==F)) {
print("TRUE")
}else{
print("FALSE")
}
}
Since you mentioned that you must use a loop, the following code will work:
x <- 7
# Candidate divisors up to floor(sqrt(x)); 1 is skipped inside the loop
y <- seq(1, floor(sqrt(x)), by = 1)
# has_factor records whether any candidate divides x
# (named has_factor so it does not mask base::is.factor)
has_factor <- FALSE
for (i in seq_along(y)) {
  # 1 divides every number, so only test candidates greater than 1
  if (y[i] > 1 && x %% y[i] == 0) {
    has_factor <- TRUE
  }
}
# Print once, after the loop: prime means x > 1 and no factor was found
print(if (x > 1 && !has_factor) "TRUE" else "FALSE")
You can do this-
check_prime <- function(num) {
if (num == 2) {
TRUE
} else if (any(num %% 2:(num-1) == 0)) {
FALSE
} else {
TRUE
}
}
> check_prime(7)
[1] TRUE
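For the "repeated answer" part of the question, one option is to wrap the loop in a function and return as soon as a divisor is found, so there is exactly one result per number. This is a sketch under the assumption that the input is a single positive whole number; is_prime is an illustrative name:

```r
# Loop-based primality test that stops at the first divisor it finds
is_prime <- function(x) {
  if (x < 2) return(FALSE)   # 0 and 1 are not prime
  if (x < 4) return(TRUE)    # 2 and 3 are prime; also avoids the range 2:1
  for (d in 2:floor(sqrt(x))) {
    if (x %% d == 0) {
      return(FALSE)          # one divisor is enough, no need to keep looping
    }
  }
  TRUE                       # no divisor found up to sqrt(x)
}

print(is_prime(7))
print(is_prime(107))
print(Filter(is_prime, 1:20))
```

Only divisors up to sqrt(x) are tried, which is the same bound the question uses.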

How can I write a function in the same way they are written in standard library?

EDIT: although this question will be relevant to many statistics students (especially ones with experience with other programming languages), it has been closed. The hint given was very helpful, though. You can use the ifelse function as shown below:
qbern24 <- function(x, p) {
ifelse(x <= 0, -Inf,
ifelse(x > 0 & x <= 1-p, 0,
ifelse(x > 1-p & x <= 1, 1,
Inf)
)
)
}
If you want to vectorize something else, I wish you luck.
/EDIT
The functions in the standard library take numeric vectors and return numeric vectors, but mine don't. How can I write qbern04 so that I can use it in the same way as qnorm, without having to use as.numeric and lapply?
qbern <- function(x, p) {
if (x <= 0) {
-Inf
} else if (x > 0 & x <= 1-p) {
0
} else if (x > 1-p & x <= 1) {
1
} else {
Inf
}
}
qbern04 <- function(x) {
qbern(x, 0.4)
}
U <- runif(1e6)
X <- as.numeric(lapply(U, qbern04))
Z <- qnorm(U)
qbern <- function(x, p) {
  OTHERWISE <- TRUE
  # case_when() needs element-wise conditions, so use & rather than &&
  dplyr::case_when(
    (x <= 0) ~ -Inf,
    ((x > 0) & (x <= (1-p))) ~ 0,
    ((x > 1-p) & (x <= 1)) ~ 1,
    OTHERWISE ~ Inf
  )
}
qbern04 <- function(x) { qbern(x, 0.4) }
case_when() is from dplyr. If you don't want that dependency, then read up on vectorization and learn to use ifelse(), sapply() and vapply() since lapply() is designed to return lists.
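As a sketch of the vapply() route mentioned above: the scalar if/else function from the question can be kept as-is and applied element by element, with vapply() guaranteeing a numeric vector back. The names qbern_scalar and qbern_vec are illustrative, and the four test values are made up:

```r
# Scalar version, same logic as the question's qbern
qbern_scalar <- function(x, p) {
  if (x <= 0) {
    -Inf
  } else if (x <= 1 - p) {
    0
  } else if (x <= 1) {
    1
  } else {
    Inf
  }
}

# Wrapper: vapply() calls the scalar function once per element and
# checks that each call returns exactly one numeric value
qbern_vec <- function(x, p) {
  vapply(x, qbern_scalar, numeric(1), p = p)
}

u <- c(-0.5, 0.3, 0.9, 1.5)
print(qbern_vec(u, 0.4))
```

This is usually slower than the nested ifelse() version on a million values, since it still calls an R function per element, but it returns a plain numeric vector without as.numeric().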

switch statement help in R

I've got the following code in R:
func.time <- function(n){
times <- c()
for(i in 1:n){
r <- 1 #r is the room the mouse is in
X <- 0 #time, starting at 0
while(r != 5){
if(r == 1){
r <- sample(c(2,3),1) }
else if(r == 2){
r <- sample(c(1,3), 1) }
else if(r == 3){
r <- sample(c(1,2,4,5), 1) }
else if (r == 4){
r <- sample(c(3,5), 1) }
X <- X + 1
}
times <- c(X, times)
}
mean(times)
}
func.time(10000)
It works fine, but I've been told that using switch() can speed it up seeing as I've got so many if else statements but I can't seem to get it to work, any help is appreciated in advance.
Edit
I've tried this:
func.time <- function(n) {
times <- c()
for(i in 1:n) {
r <- 1 #r is the room the mouse is in
X <- 0 #time, starting at 0
while(r != 5) {
switch(r, "1" = sample(c(2,3), 1),
"2" = sample(c(1,3), 1),
"3" = sample(c(1,2,4,5), 1),
"4" = sample(c(3,5)))
X <- X + 1
}
times <- c(X, times)
}
mean(times)
}
func.time(10000)
But it was a basic attempt, I'm not sure I've understood the switch() method properly.
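I thought Dominic's assessment was very useful, but when I went to examine the edit it was being held up on what I thought was an incorrect basis, so I decided to just fix the code. When using a numeric argument for the EXPR parameter, you do not use the item=value formalism but rather just put in the expressions in order. The attempt also never assigns the result of switch() back to r, so the while loop cannot terminate, and the last sample() call is missing its size argument of 1: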
func.time <- function(n) {
  times <- c()
  for (i in 1:n) {
    r <- 1
    X <- 0
    while (r != 5) {
      r <- switch(r,
        sample(c(2,3), 1),      # r=1
        sample(c(1,3), 1),      # r=2
        sample(c(1,2,4,5), 1),  # r=3
        sample(c(3,5), 1))      # r=4
      X <- X + 1
    }
    times <- c(X, times)
  }
  mean(times)
}
func.time(1000)
#[1] 7.999
For another example of how to use switch with a numeric argument to EXPR, consider my answer to this question: R switch statement with varying outputs throwing error
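The two ways of calling switch() can be compared side by side in a small sketch. The values and names below (by_position, by_name, no_match) are made up for illustration:

```r
# Numeric EXPR: the alternatives are picked by position
by_position <- switch(2, "first", "second", "third")
print(by_position)

# Character EXPR: the alternatives are picked by name
by_name <- switch("b", a = "first", b = "second", c = "third")
print(by_name)

# A numeric EXPR outside the range of alternatives yields NULL
no_match <- switch(4, "first", "second", "third")
print(is.null(no_match))
```

In both forms switch() only returns the selected value, so it has to be assigned (as in r <- switch(r, ...)) for the loop state to change.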
