I feel a bit embarrassed to ask this rather simple question, but I've been searching for a couple of hours now and can't get my head around it.
I'm trying to build a switch for my function:
output <- "both"
if (output== "both" | "partone")
{cat("partone")}
if (output=="both" | "parttwo")
{cat("parttwo")}
This should produce partone and parttwo, whereas output <- "partone" should produce just partone.
How could this work?
Use something like this.
if (output %in% c("both","partone"))
{cat("partone")}
if (output %in% c("both","parttwo"))
{cat("parttwo")}
It will produce your desired output.
If we check the logical condition directly, R raises an error:
output== "both" | "partone"
Error in output == "both" | "partone" : operations are possible
only for numeric, logical or complex types
Since we need to check for either 'both' or 'partone', use %in% on a vector of strings:
output %in% c('both', 'partone')
#[1] TRUE
Now, create a function for reusability
f1 <- function(out, vec) {
if(out %in% vec) cat(setdiff(vec, 'both'), '\n')
}
output <- 'both'
f1(output, c('both', 'partone'))
#partone
f1(output, c('both', 'parttwo'))
#parttwo
output <- 'partone'
f1(output, c('both', 'partone'))
#partone
f1(output, c('both', 'parttwo'))
This condition is incorrect, because | is applied to the bare string "partone" rather than to a second comparison:
if (output== "both" | "partone")
{cat("partone")}
You can write like this:
if (output == "both" || output == "partone")
{cat("partone")}
Or like this:
if (output %in% c("both", "partone"))
{cat("partone")}
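If the switch lives inside a function, the same %in% checks can be combined with match.arg() so that misspelled options are rejected. The following is only a sketch: the function name report and its argument layout are made up for illustration, and it returns the selected parts instead of printing them.

```r
# Hypothetical helper: `report` and its argument are illustrative names.
report <- function(output = c("both", "partone", "parttwo")) {
  # match.arg() uses the first choice by default and errors on unknown values
  output <- match.arg(output)
  parts <- character(0)
  if (output %in% c("both", "partone")) parts <- c(parts, "partone")
  if (output %in% c("both", "parttwo")) parts <- c(parts, "parttwo")
  parts
}

report("both")     # "partone" "parttwo"
report("partone")  # "partone"
```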
B <- 10000
results <- replicate(B, {
hand <- sample(hands1, 2)
(hand[1] %in% aces & hand[2] %in% facecard) | (hand[2] %in% aces & hand[1] %in% facecard)
})
mean(results)
This piece of code works perfectly and does the desired thing.
This is a Monte Carlo simulation. I don't understand the way the curly brackets {} are used inside the replicate function. I can understand what the code does, but I can't understand why it is written this way.
The reason is that we have multiple expressions
hand <- sample(hands1, 2)
is the first expression and the second is
(hand[1] %in% aces & hand[2] %in% facecard) | (hand[2] %in% aces & hand[1] %in% facecard)
That is, if there were only a single expression, we wouldn't need to wrap it in {}.
This is a general rule and not specific to replicate: a for loop with a single expression doesn't need {} either
for(i in 1:5)
print(i)
and the same holds for if/else
n <- 5
if(n == 5)
print(n)
The braces are only needed when the body contains more than one expression.
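A braced block is itself an expression whose value is the value of its last statement, which is why replicate() can collect one result per run. Here is a small sketch of that behaviour; the dice example and the seed are arbitrary choices for illustration, not part of the original code.

```r
# A braced block evaluates to the value of its last expression
val <- {
  a <- 2
  a * 3
}
val  # 6

# Same pattern inside replicate(): roll two dice, is the sum 7?
set.seed(1)  # arbitrary seed, only for reproducibility
hits <- replicate(1000, {
  roll <- sample(1:6, 2, replace = TRUE)  # first expression
  sum(roll) == 7                          # last expression: value of the block
})
mean(hits)  # estimated probability, should be near 1/6
```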
I'm running several tests for a given object x. For a single test (a test being a function that returns TRUE or FALSE when applied to an object) it is quite easy, as you can do lapply(x, test). For example:
# This returns a list containing TRUE
lapply('a', is.character)
However, I would like to create a function pass_tests, which would be able to combine multiple tests, i.e. that it could run something like this:
pass_tests('a', is.character | is.numeric)
Therefore, it should combine multiple functions given in an argument of the function, combining its result when testing an object x. In this case, it would return whether 'a' is character OR numeric, which would be TRUE. The following line should return FALSE:
pass_tests('a', is.character & is.numeric)
The idea is that it could be flexible for different combinations, e.g.:
pass_tests(x, test1 & (test2 | test3))
Any idea if functions can be logically combined this way?
Another option would be to use pipes:
library(magrittr) # or dplyr
"a" %>% {is.character(.) & is.numeric(.)}
#FALSE
"a" %>% {is.character(.) | is.numeric(.)}
#TRUE
1 %>% {is.finite(.) & (is.character(.) | is.numeric(.))}
#TRUE
Edit: the same idea wrapped in a function that takes the expression as a string:
pass_test <- function(x, expr) {
x %>% {eval(parse(text = expr))}
}
pass_test(1, "is.finite(.) & (is.character(.) | is.numeric(.))")
#TRUE
The argument expr can be a string or an expression as in expression(is.finite(.) & (is.character(.) | is.numeric(.))).
Here's another way to do it by creating infix operators.
`%and%` <- function(lhs, rhs) {
function(...) lhs(...) & rhs(...)
}
`%or%` <- function(lhs, rhs) {
function(...) lhs(...) | rhs(...)
}
(is.character %and% is.numeric)('a')
#> [1] FALSE
(is.character %or% is.numeric)('a')
#> [1] TRUE
These can be chained together. However, it will not have the normal AND/OR precedence. It will be evaluated left-to-right.
(is.double %and% is.numeric %and% is.finite)(12)
#> [1] TRUE
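The syntax from the question, pass_tests(x, test1 & (test2 | test3)), can also be approximated by capturing the unevaluated argument with substitute() and walking the expression. This is only a sketch under some assumptions: it handles just &, |, ! and parentheses, and every leaf must evaluate to a function that returns a single TRUE or FALSE. The helper name walk is made up for illustration.

```r
pass_tests <- function(x, tests) {
  expr <- substitute(tests)   # capture e.g. is.character | is.numeric unevaluated
  env <- parent.frame()       # where the test functions are looked up

  # Recursively evaluate the expression tree, applying each leaf function to x
  walk <- function(e) {
    op <- if (is.call(e) && is.symbol(e[[1]])) as.character(e[[1]]) else ""
    if (op == "(") return(walk(e[[2]]))
    if (op == "!") return(!walk(e[[2]]))
    if (op == "&") return(walk(e[[2]]) & walk(e[[3]]))
    if (op == "|") return(walk(e[[2]]) | walk(e[[3]]))
    isTRUE(eval(e, env)(x))   # leaf: a function (name or definition) applied to x
  }
  walk(expr)
}

pass_tests("a", is.character | is.numeric)                 # TRUE
pass_tests("a", is.character & is.numeric)                 # FALSE
pass_tests(1, is.finite & (is.character | is.numeric))     # TRUE
```

Because the expression is parsed by R itself, the usual precedence of & over | applies here, unlike with the left-to-right infix operators.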
I was given a large csv that is 115 columns across and 1000 rows. The columns have a variety of data, some is character-based, some is integer, etc. However, the data has a LOT of null variables of varying types (NA, -999, NULL, etc.).
What I want to do is write a script that will generate a LIST of columns where over 30% of the data in the column is a NULL of some type.
To do this, I wrote a script to give me the null percentage (as decimal) for one column. This script works fine for me.
length(which(indata$ObservationYear == "" | is.na(indata$ObservationYear) |
indata$ObservationYear == "NA" | indata$ObservationYear == "-999" |
indata$ObservationYear == "0"))/nrow(indata)
I want to write a script to do this for all columns. I believe I need to use the lapply function.
I attempted to do this here, however, I can't seem to get this script to work at all:
Null_Counter <- lapply(indata, 2, length(x),
length(which(indata == "" | is.na(indata) | indata == "NA" | indata == "-999" | indata == "0")))
names(indata(which(0.3>=Null_Counter / nrow(indata))))
I get the following errors:
Error in match.fun(FUN) : '2' is not a function, character or symbol
and:
Error: could not find function "indata"
Ideally, what I want it to give me is a vector LIST of all column names where the percentage of all null variables (NA, -999, 0, NULL) is over 30%.
Can anyone help?
I believe you want to use apply rather than lapply, which applies a function to each element of a list.
Try this:
Null_Counter <- apply(indata, 2, function(x) length(which(x == "" | is.na(x) | x == "NA" | x == "-999" | x == "0"))/length(x))
Null_Name <- colnames(indata)[Null_Counter >= 0.3]
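One caveat: apply() first converts the data frame to a matrix, and with mixed column types numeric values may be reformatted as padded strings, so comparisons such as x == "0" can miss. A column-wise sapply() avoids that conversion. The sketch below uses a tiny made-up data frame in place of indata, and the helper name null_share is hypothetical.

```r
# Hypothetical helper: share of "null-like" values in one column
null_share <- function(x) {
  mean(is.na(x) | as.character(x) %in% c("", "NA", "-999", "0"))
}

# Small stand-in for the real `indata` (illustrative values only)
indata <- data.frame(
  a = c(1, -999, 0, 4),        # 2 of 4 null-like
  b = c("x", "", "y", "z"),    # 1 of 4 null-like
  c = c(NA, NA, 3, 4),         # 2 of 4 null-like
  stringsAsFactors = FALSE
)

shares <- sapply(indata, null_share)  # one share per column, no matrix conversion
names(shares)[shares > 0.3]           # "a" "c"
```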
Here's a different way to do this in data.table:
#first, make a reproducible example:
library(data.table)
#make it so that all columns have ~30% "NA" as you define it
dt<-as.data.table(replicate(
115,sample(c(1:100,"",NA,"NA",-999,0),size=1000,replace=T,
prob=c(rep(.007,100),rep(.06,5)))))
Now, figure out which are troublesome:
x<-as.matrix(dt[,lapply(.SD,function(x){
mean(is.na(x) | x %in% c("","NA","-999","0"))})])
colnames(x)[x>.3]
There's probably a more concise way of doing this, but it's eluding me.
If you're trying to drop those columns, this could be adjusted:
dt[,!colnames(x)[x>.3],with=F]
I would like to use if to load data from CSV files that I specify at the beginning of my script.
I use this statement:
if(which_data == "data1") {tbl <- read.csv("aaa.csv")}
but I would like to add the OR operator | so that the same data is loaded when which_data is set to either of two names.
The statement should look like:
if(which_data == "data1" | "data2") {tbl <- read.csv("aaa.csv")}
but the problem is that this operator can only be used with numeric, logical or complex types. What else can I do?
Test if your variable is "in" one of the values:
if(which_data %in% c("data1" ,"data2")) {tbl <- read.csv("aaa.csv")}
Note that | may not do what you expect with numeric types either:
> 3 == 2|3
[1] TRUE
> 3 == 2|1
[1] TRUE
It's testing (3==2) or (1), and in R, 1 evaluates as TRUE, so the expression 3==2|1 is TRUE.
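To make the precedence explicit, here is a short sketch comparing the three spellings. The variable names are only for illustration.

```r
x <- 3

parsed      <- x == 2 | 1        # parsed as (x == 2) | 1, and 1 counts as TRUE
spelled_out <- x == 2 | x == 1   # each side is a full comparison
member      <- x %in% c(2, 1)    # membership test with the same meaning

parsed       # TRUE
spelled_out  # FALSE
member       # FALSE
```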
I have a dataframe where the dates are given as hydrological years (October to September). To change this I am trying to use an if statement:
if(cet$month== 10|cet$month==11|cet$month==12)
cet$year <- substr(as.character(cet[,2]),1,4) else
cet$year <- substr(as.character(cet[,2]),6,9)
but I get an error:
the condition has length > 1 and only the first element will be used
Reading the "if" help file I realized that the condition has to be a length-one logical vector. Is there no way of using an "or" with an "if"? All I want is to apply that expression if the month is October, November or December.
ifelse is the vectorised version. You can also use %in% to reduce the number of statements.
cet$year <- ifelse(cet$month%in%(10:12), substr(as.character(cet[,2]),1,4), substr(as.character(cet[,2]),6,9))
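A reproducible sketch of the same idea follows. The original data is not shown, so the column name hydro_year and the "1999-2000" format are assumptions inferred from the substr() positions used above.

```r
# Made-up stand-in for `cet`; the second column holds the hydrological year
cet <- data.frame(
  month = c(10, 12, 1, 9),
  hydro_year = rep("1999-2000", 4),   # assumed format, 9 characters
  stringsAsFactors = FALSE
)

# Vectorised choice: October to December take the first year, other months the second
cet$year <- ifelse(cet$month %in% 10:12,
                   substr(cet$hydro_year, 1, 4),
                   substr(cet$hydro_year, 6, 9))
cet$year  # "1999" "1999" "2000" "2000"
```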
Ok, here's a reproducible example that should help to clarify things:
# generate some vector
x <- c(1,2,4,4,5,5,6,6,6)
# have a check using OR, return values
x[x == 2 | x == 1]
## or return TRUE / FALSE
(x == 2 | x == 1)
or check ?ifelse
EDIT: Note that for characters you need to use "", like x == "yourchars" | x == "someotherchars"
Here's also a simple reference on how to work with operators: QuickR
Inside if(), where the condition must be a single value, the OR operator is the double pipe:
| => || in the if()
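Keep in mind that || is meant for length-one conditions, so it does not help when the condition is a whole column. A brief sketch of the difference, with arbitrary example values:

```r
x <- c(1, 2, 4)
flags <- x == 1 | x == 2   # element-wise OR: TRUE TRUE FALSE

# if() needs a single TRUE/FALSE, so reduce the vector first
if (any(flags)) message("at least one match")

# || works on length-one conditions and short-circuits:
# the right-hand side is never evaluated here
ok <- TRUE || stop("not reached")
ok  # TRUE
```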