I wrote a function to remove outliers recursively (any data points more than 3 SD away from the median):
rm.outlier <- function (var) {
  has.3sd =1
  while (has.3sd>0) {
    for (l in var) {
      if ( (l-median(var))> 3*sd(var) & !is.na(l)) {
        var[var==l] <- NA
      }
    }
    has.3sd <- sum(var > 3*sd(var))
    if (has.3sd==0) {
      break
    }
  }
  return (var)
}
However, I always got the error message:
Error in if ((l - median(var)) > 3 * sd(var) & !is.na(l)) { :
missing value where TRUE/FALSE needed
Why do I get this error message? I spent a long time trying to figure it out but couldn't. I would appreciate any help with this. Thanks a lot.
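The most likely cause: as soon as var contains an NA (either from the input or because the loop has just set one), median(var) and sd(var) both return NA, so the comparison evaluates to NA and if() has nothing to branch on. Below is a minimal sketch of one way around that, not a drop-in fix. The name rm_outlier_sketch is made up for illustration, and it assumes that distance should be measured in both directions (hence abs()) and that missing values should be ignored when computing the median and SD.

```r
# NA propagates through the comparison, which is what if() complains about
NA > 1 & TRUE   # NA

# Hypothetical helper: repeatedly set values more than 3 SD from the median to NA
rm_outlier_sketch <- function(x) {
  repeat {
    centre <- median(x, na.rm = TRUE)
    spread <- sd(x, na.rm = TRUE)
    if (is.na(spread)) break          # fewer than two non-missing values left
    is_out <- !is.na(x) & abs(x - centre) > 3 * spread
    if (!any(is_out)) break           # nothing left to remove
    x[is_out] <- NA
  }
  x
}

rm_outlier_sketch(c(1:20, 1000))      # the 1000 becomes NA, 1..20 are kept
```

Working on the whole vector at once also avoids the for loop and the var[var==l] lookup.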
I have tried the following, but it fails with this error:
Error in append("0") : argument "values" is missing
for (rowz in final_data$Ingridients) {
  Cobalt_row<-lst()
  if (sum(str_detect(rowz, 'Cobalt'))>0) {
    Cobalt_row.append(1)
  } else {
    Cobalt_row<-append(0)
  }
  print(Cobalt_row)
}
I intended to loop through the list and generate a vector of ones and zeros depending on whether or not the value was present.
Please help
Without the data, I can't test it, but this should work:
Cobalt_row <- list()   # base R list; create it once, outside the loop
k <- 1
for (rowz in final_data$Ingridients) {
  # 1 if this row mentions Cobalt, 0 otherwise
  Cobalt_row[[k]] <- ifelse(str_detect(rowz, 'Cobalt'), 1, 0)
  k <- k+1
}
or even simpler if you need a list:
Cobalt_row <- as.list(as.numeric(str_detect(final_data$Ingridients, "Cobalt")))
I am using RStudio 2022.07.1 on Windows. I am new to R and am trying to sort a portfolio of stocks with the following steps:
I sort the portfolio based on the value of be_me in June of each year.
If the value of be_me in June is smaller than the average value of be_me for that year, I label the stock "S" (small). Otherwise, the stock is labelled "B" (big).
Here is the code I wrote; the error is "Error in sort_size[[x]] : subscript out of bounds":
for (i in seq_along(year)) {
  for (j in seq_along(month)) {
    if (j == 06) {
      average_me <- mean(be_me)
      sort_size <- vector('list', length=1)
      for (x in seq_along(be_me)) {
        if (isTRUE(x<= average_me)==TRUE) {
          sort_size[[x]] == "S"
        }else{sort_size[[x]] == "B"}
      }
    }
  }
}
lapply(sort_size, print)
Could you please show me how to fix the error, and recommend a better way to do the task if there is one?
Thank you very much for your help!
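Two things in the loop stand out. sort_size is created with length 1, and sort_size[[x]] == "S" is a comparison, not an assignment, so for x = 2 R tries to read an element that does not exist, which gives the "subscript out of bounds" error. Also, x is an index into be_me, not a be_me value. A vectorised approach avoids the loops altogether. The sketch below assumes a data frame with one row per stock and month; the name stocks, its columns, and the numbers are all invented for illustration, and "average value of be_me in the year" is read here as the mean over all rows of that year.

```r
# Hypothetical data: column names and values are assumptions, not the real data
stocks <- data.frame(
  stock = rep(c("A", "B"), each = 4),
  year  = rep(c(2020, 2020, 2021, 2021), times = 2),
  month = rep(c(6, 12), times = 4),
  be_me = c(0.5, 0.7, 0.9, 1.1, 1.5, 1.3, 0.4, 0.6)
)

# Mean of be_me within each year, repeated on every row of that year
stocks$year_avg <- ave(stocks$be_me, stocks$year, FUN = mean)

# Keep the June rows and label each one against its year's average
june <- stocks[stocks$month == 6, ]
june$size <- ifelse(june$be_me < june$year_avg, "S", "B")

june[, c("stock", "year", "be_me", "year_avg", "size")]
```

If the average should instead be taken per stock and year, passing both columns to ave() (for example ave(stocks$be_me, stocks$stock, stocks$year)) would be one way to adapt it.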
I'm working with panel data in R and am endeavoring to build a function that returns every user ID where PCA==1. I've largely gotten this to work, with one small problem: it only returns the values when I end the function with print() but does not do so when I end the function with return(). As I want the ids in a vector so I can later subset the data to only include those IDs, that's a problem. Code reflected below - can anyone advise on what I'm doing wrong?
The version that works (but doesn't do what I want):
retrievePCA<-function(data) {
  for (i in 1:dim(data)[1]) {
    if (data$PCA[i] == 1) {
      id<-data$CPSIDP[i]
      print(id)
    }
  }
}
retrievePCA(data)
The version that doesn't:
retrievePCA<-function(data) {
  for (i in 1:dim(data)[1]) {
    if (data$PCA[i] == 1) {
      id<-data$CPSIDP[i]
      return(id)
    }
  }
}
vector<-retrievePCA(data)
vector
Your problem is a simple misunderstanding of what returning from a function does.
Take the small example below:
f <- function(x){
  x <- x * x
  return(x)   # the function exits here
  x <- x * x  # never reached
  return(x)
}
f(2)
[1] 4
4 is returned, 16 is not. That is because return exits the function immediately with the given value. So your function hits the first row where PCA[i] == 1 and then exits. Instead, you should collect the ids in a vector, list or similar structure and return that at the end.
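retrievePCA <- function(data) {
  # one slot per row; slots for non-matching rows stay NULL
  ids <- vector('list', nrow(data))
  for (i in seq_len(nrow(data))) {
    if (data$PCA[i] == 1) {
      ids[[i]] <- data$CPSIDP[i]
    }
  }
  # unlist() drops the NULL slots and returns a plain vector
  return(unlist(ids))
}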
However, you could just do this in one line:
data$CPSIDP[data$PCA == 1]
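One caveat with logical subsetting is that rows where PCA is NA come back as NA in the result. The small example below uses made-up values for CPSIDP and PCA to show the difference; wrapping the condition in which() is one way to drop those rows.

```r
# Made-up data for illustration only
data <- data.frame(
  CPSIDP = c(101, 102, 103, 104),
  PCA    = c(1, 0, NA, 1)
)

data$CPSIDP[data$PCA == 1]          # 101 NA 104: the NA row leaks through
data$CPSIDP[which(data$PCA == 1)]   # 101 104
```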
I know there is a function 'unique' that does something similar to what I want, but I want to write this function myself.
I want the function to return 'result', which contains the unique elements of the input vector.
But I don't know why the function's result is totally different from what I expect.
Why does c, which should combine the previous result with the new unique element, not work?
Please tell me how to fix my code.
Thank you.
I think what you expect might be something like below, where result should be an argument of m_uni:
m_uni <- function(x,result = c()) {
  if (class(x)=='numeric'| class(x)=='character') {
    if (length(x) <= 1){
      return(result)
    } else {
      if (x[length(x)] %in% result) {
        x <- x[-length(x)]
        m_uni(x,result)
      } else {
        result <- c(result,x[length(x)])
        x <- x[-length(x)]
        m_uni(x,result)
      }
    }
  } else {
    return('This function only gets numeric or character vector')
  }
}
such that
> m_uni(x)
[1] 0 4 5 -2
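One edge case worth checking: because the recursion above stops when one element is left, x[1] itself is never examined, so it only shows up in the result if it also occurs later in the vector. The variant below is a sketch of the same idea that recurses until x is empty. The name m_uni_sketch and the test vector are made up, and it raises an error with stop() instead of returning a message string, which is a design choice rather than a requirement.

```r
# Hypothetical variant: same recursion, but the base case is an empty vector
m_uni_sketch <- function(x, result = c()) {
  if (!(is.numeric(x) || is.character(x))) {
    stop("This function only accepts numeric or character vectors")
  }
  if (length(x) == 0) {
    return(result)
  }
  last <- x[length(x)]
  if (!(last %in% result)) {
    result <- c(result, last)   # first time this value is seen: keep it
  }
  m_uni_sketch(x[-length(x)], result)
}

m_uni_sketch(c(-2, 5, 4, 0, 4, 5, 0))   # 0 5 4 -2
```

Like the version above, it walks the vector from the end, so the unique values come out in reverse order of their last occurrence.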
y <- as.integer(readline(prompt ="Enter a number: "))
factorial = 1
if (y< 0){
  print("Error")
} else if (y== 0)
{
  print("1")
} else
{
  for(i in 1:y) {
    factorial = factorial * i
  }
  return(factorial)
}
I'm wondering why this gives:
Error in if (y< 0) { : missing value where TRUE/FALSE needed
Is it because the first line produces a value of type NA_integer_?
There are three possible ways to pass values to the if statement.
y <- 1
if (y > 0) print("more")
This one works as expected.
y <- 1:3
if (y > 0) print("ignores all but 1st element")
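In R versions before 4.2.0, a warning tells you that only the first element was used to evaluate the condition; from R 4.2.0 onwards, a condition of length greater than one is an error. You could use any or all to make this right.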
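For instance, any and all each collapse the element-wise comparison to a single TRUE or FALSE, which is what if expects. A small illustration (the variable names are arbitrary):

```r
y <- 1:3

all_positive  <- all(y > 0)   # TRUE: every element passes
any_above_two <- any(y > 2)   # TRUE: at least one element passes

if (all_positive) print("every element is positive")
if (any_above_two) print("at least one element exceeds 2")
```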
y <- NA
if (y > 0) print("your error")
This case actually gives you your error. I would wager that y is somehow NA. You will probably need to provide a reproducible example (with data and the whole shebang) if you want more assistance. Note also that structuring your code visually helps readability.
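One plausible way for y to end up NA, offered as a guess: as.integer() returns NA for anything that is not a number, including the empty string that readline() yields when the script is run non-interactively. Checking with is.na() before the comparison avoids the error. The sketch below takes the input as a string so it can be tried without a prompt; the function name and the messages are made up for illustration.

```r
# Hypothetical helper: validate the input before comparing it
factorial_from_input <- function(input) {
  y <- suppressWarnings(as.integer(input))   # NA if input is not a number
  if (is.na(y)) {
    return("Error: not a whole number")
  }
  if (y < 0) {
    return("Error: negative input")
  }
  prod(seq_len(y))                           # prod of an empty sequence is 1, so 0! = 1
}

factorial_from_input("5")    # 120
factorial_from_input("")     # "Error: not a whole number"
```

In an interactive session the call would be factorial_from_input(readline(prompt = "Enter a number: ")).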