I'm trying to create a factor from a vector d that indicates whether each value of d is missing, less than a threshold, or greater than/equal to that threshold. I attempted this with the following code, using the cases function from the memisc package.
threshhold = 5
d <- sample(c(1:10, NaN), 50, replace=TRUE)
d_case <- cases(
is.na(d),
d > threshhold,
d <= threshhold
)
and got a warning: In cases(is.na(d), d > threshhold, :
condition is.na(d) is never satisfied.
I've also tried using assignment operators,
d_case <- cases(
is.na(d) -> 0,
d > threshhold -> 1,
d <= threshhold -> 2
)
and got the same warning. I've checked, and d does contain NaN values, which is.na() should be returning as true (and is, when I check it outside of cases). Does anyone know why cases isn't working, or what I can do to get the indicator factor I need?
Look at ?cases. According to the documentation, you can do it like this:
d <- c(-1:3,NA,1:2)
fun <- function(x){
cases(
is.na(x) -> 0,
x > threshhold -> 1,
x <= threshhold -> 2
)
}
d_case <- fun(d)
d_case
How about:
addNA(as.factor(d > threshhold))
....
Levels: FALSE TRUE <NA>
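If memisc is not a requirement, the three-way indicator can also be sketched in base R with nested ifelse calls. This is only an illustration: the level labels "missing", "above" and "at_or_below" are made up here, and the comparisons follow the code in the question (> and <=).

```r
threshhold <- 5
d <- c(1, 7, NaN, 5, NA, 10)

# is.na() is TRUE for both NA and NaN, so it is tested first;
# the inner ifelse() only decides the non-missing positions.
d_case <- factor(
  ifelse(is.na(d), "missing",
         ifelse(d > threshhold, "above", "at_or_below")),
  levels = c("missing", "above", "at_or_below")
)

table(d_case)
```

Fixing the levels explicitly keeps all three categories in the factor even when one of them does not occur in the data.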
I have code in which I want to be able to specify a certain condition, and then fill in this condition at a later point, executing it as regular code. A simple example shows it. The following code returns a certain value for d depending on the values sampled for a and b.
a <- as.numeric(sample(1:2,1))
b <- as.numeric(sample(1:2,1))
d <- ifelse(a==1 & b==1,3,0)
But let's say I want to make it more flexible, and allow any condition to be specified, and then simply fill it in within the ifelse. So for example we could have:
a <- as.numeric(sample(1:2,1))
b <- as.numeric(sample(1:2,1))
c <- as.numeric(sample(1:2,1))
And I would like to specify two conditions:
condition_1 <- "a==1"
condition_2 <- "b==1"
or
condition_1 <- "a==1"
condition_2 <- "c==1"
and so on. Then I would like to fill these conditions into ifelse. This does not work:
d <- ifelse(noquote(condition_1) & noquote(condition_1),3,0)
This also does not work:
d <- ifelse(paste(noquote(condition_1)) & paste(noquote(condition_1)),3,0)
I have tried everything I could think of, but with no success. Is there a way to do this? More generally, how can I store parts of code, then paste them into the code at a later point and have them executed like the rest of the code?
Please do not provide workarounds that only work for this specific example. I need to do something analogous in a much more complex code.
"Storing parts of code [for later use]" sounds to me like using functions. You can pass functions as arguments to other functions. So you could do something like:
dFunc1 <- function(aVal, bVal) {
ifelse(a == aVal & b == bVal, 3, 0)
}
set.seed(1234)
a <- as.numeric(sample(1:2,1))
b <- as.numeric(sample(1:2,1))
d <- dFunc1(1, 1)
a
b
d
> a
[1] 2
> b
[1] 2
> d
[1] 0
and then
set.seed(1234)
dFunc2 <- function(aVal, cVal) {
ifelse(a == aVal & c == cVal, 3, 0)
}
c <- as.numeric(sample(1:2,1))
d <- dFunc2(1, 1)
c
d
> c
[1] 2
> d
[1] 0
If your derivations are embedded in another function, that's not a problem.
doItAll <- function(f, ...) {
set.seed(1234)
a <- as.numeric(sample(1:2,1))
b <- as.numeric(sample(1:2,1))
c <- as.numeric(sample(1:2,1))
d <- f(...)
return(list("a"=a, "b"=b, "c"=c, "d"=d))
}
doItAll(dFunc1, aVal=1, bVal=1)
$a
[1] 2
$b
[1] 2
$c
[1] 2
$d
[1] 0
and
doItAll(dFunc2, aVal=1, cVal=1)
$a
[1] 2
$b
[1] 2
$c
[1] 2
$d
[1] 0
The use of the ellipsis (...) is key to passing arbitrary arguments to functions that are called from inside another function.
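A different technique from the function-passing approach above, closer to the original wording of the question, is to store the conditions as unevaluated expressions and evaluate them later. The sketch below uses base R's quote(), parse() and eval(); the fixed values of a, b and c and the name condition_3 are assumptions made for illustration.

```r
a <- 1; b <- 1; c <- 2

# Store conditions as unevaluated calls rather than as strings
condition_1 <- quote(a == 1)
condition_2 <- quote(b == 1)

# eval() runs the stored code in the current environment
d <- ifelse(eval(condition_1) & eval(condition_2), 3, 0)

# If a condition arrives as a string, parse() turns it into code first
condition_3 <- "c == 1"
d2 <- ifelse(eval(condition_1) & eval(parse(text = condition_3)), 3, 0)

d   # 3
d2  # 0
```

Evaluating parsed strings runs whatever code they contain, so this is best kept to conditions you write yourself.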
In the end I decided to solve this with a set of if and else if conditions. It seemed more practical than setting up a function as suggested by Limey.
I wrote a simple function to hopefully calculate the total of negative values but failed. Ideally, when I passed a vector into the function, it should give the total count of negative values. Can anyone help, please?
My code:
arg <- c(rnorm(50, 0))
neg <- 0
count.negative.fun <- function(x) {
ifelse(x <= 0, neg = neg +1,)
return(neg)
}
When I called:
count.negative.fun(arg)
It gives me this error message: "Error in ifelse(x <= 0, neg = neg + 1, ) :
unused argument (neg = neg + 1)"
When using ifelse and defining a function, one might do
count.negative.fun <- function(x) sum(ifelse(x <= 0, 1, 0))
count.negative.fun(arg)
# [1] 26
See ?ifelse. It returns 1 for those cases when an element of x is nonpositive and 0 otherwise. Then we may sum the result.
However, you may also simply write
sum(arg < 0)
# [1] 26
Another possible way is to subset the negative values and take the length of the result:
length(arg[arg<0])
#[1] 26
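The counting idiom can be wrapped in a function, as the question intended. A minimal sketch, assuming missing values should simply be ignored (the name count_negative and the na.rm choice are assumptions for illustration, not part of the answers above):

```r
# x < 0 gives a logical vector; sum() counts the TRUE values.
# na.rm = TRUE skips NA/NaN instead of returning NA.
count_negative <- function(x) sum(x < 0, na.rm = TRUE)

count_negative(c(-2, 0, 3, NA, -0.5))  # 2
```

Unlike the attempt in the question, nothing is accumulated in a global variable; the comparison is vectorized over the whole input.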
Let's say I have a function defined as follows in R:
> f <- function(x) 0.5*sin(x)*(x>=0)*(x<=pi)
I can do this to integrate it between 0 and pi:
> Integrate <- function(f,a,b) integrate(Vectorize(f),a,b)$value
> F <- Integrate(f,0,pi)
But if I want to evaluate and return some values of F, I get this error:
> F(c(-100,0,1,2,pi,100))
Error in F(c(-100, 0, 1, 2, pi, 100)) :
function "F" is not found
I understand that this happens because my Integrate <- function(f,a,b) returns a constant, namely the result of integrating f between a and b. But how can I return F as a function, so that I can evaluate it on a vector of values and plot it?
In this case, F should give 0 for any value less than 0, 1 for any value greater than pi, and vary in between.
Thanks.
Edit: just to sum it up more clearly: how can I define a function f(x) in [a,b] that will give me f(x) if x is in [a,b], 0 if x < a, and 1 if x > b?
Try wrapping your function call in an sapply and have Integrate return a function.
Integrate <- function(f, a, b) function(x) if (x < a) 0 else if (x > b) 1 else integrate(Vectorize(f), a, x)$value
F <- Integrate(f, 0, pi)
sapply(c(-100,0,1,2,pi,100), F)
gives
[1] 0.0000000 0.0000000 0.2298488 0.7080734 1.0000000 1.0000000
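As a variation on the answer above, the returned function can be wrapped in Vectorize so that it accepts a vector directly, without an explicit sapply. This is a sketch under two assumptions: the names make_cdf and cdf are made up here, and returning 1 above b is only correct because this particular f integrates to 1 over [a, b].

```r
f <- function(x) 0.5 * sin(x) * (x >= 0) * (x <= pi)

# Returns a function of x; Vectorize() lets it take a whole vector.
make_cdf <- function(f, a, b) {
  Vectorize(function(x) {
    if (x < a) 0
    else if (x > b) 1
    else integrate(f, a, x)$value
  })
}

cdf <- make_cdf(f, 0, pi)
vals <- cdf(c(-100, 0, 1, 2, pi, 100))
vals
```

Because cdf is vectorized, it can also be passed to curve(), for example curve(cdf, from = -1, to = 4), to plot it.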
My question is, does there exist a function that, given a logical statement, identifies the source of FALSE (if it is false)?
For example,
x=1; y=1; z=1;
x==1 & y==1 & z==2
Obviously it is the value of z that makes the statement false. In general though, is there a function that lets me identify the variable(s) in a logical statement whose value makes the statement false?
Instead of writing x==1 & y==1 & z==2 you could define
cn <- c(x == 1, y == 1, z == 2)
or
cn <- c(x, y, z) == c(1, 1, 2)
and use all(cn). Then
which(!cn)
# [1] 3
gives the source(s) of FALSE.
In general, no, there is no such function that you are looking for, but for different logical statements a similar approach should work, although it might be too lengthy to pursue.
Considering (!(x %in% c(1,2,3)) & y==3) | z %in% c(4,5), we get FALSE if z %in% c(4,5) is FALSE and (!(x %in% c(1,2,3)) & y==3) is FALSE simultaneously. So, if (!(x %in% c(1,2,3)) & y==3) | z %in% c(4,5) returns FALSE, we are sure about z and still need to check x and y, so that the list of problematic variables can be obtained as follows:
if(!((!(x %in% c(1,2,3)) & y==3) | z %in% c(4,5)))
c("x", "y", "z")[c(x %in% c(1,2,3), !y == 3, TRUE)]
# [1] "x" "y" "z"
or
a <- !(x %in% c(1,2,3))
b <- y == 3
c <- z %in% c(4,5)
if(!((a & b) | c))
c("x", "y", "z")[c(!a, !b, TRUE)]
# [1] "x" "y" "z"
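For a plain conjunction like the first example, naming the elements of the logical vector makes the failing conditions readable directly. A minimal sketch (the names checks and failed are made up for illustration):

```r
x <- 1; y <- 1; z <- 1

# Each element is one condition, named by its own text
checks <- c("x == 1" = x == 1,
            "y == 1" = y == 1,
            "z == 2" = z == 2)

all(checks)                       # FALSE
failed <- names(checks)[!checks]  # conditions that are FALSE
failed                            # "z == 2"
```

This only covers conditions joined by &; nested combinations with | still need the case-by-case reasoning shown above.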
I like @julius's answer but there is also the stopifnot function.
x <- 1; y <- 1; z <- 2
stopifnot(x == 1, y == 1, z == 1)
#Error: z == 1 is not TRUE
Note that the result is an error if there are any false statements and nothing if they're all true. It also stops at the first false statement, so if you had something like
x <- T; y <- F; z <- F
stopifnot(x, y, z)
#Error: y is not TRUE
you would not be told that z is FALSE in this case.
So the result isn't a logical or an index but instead is either nothing or an error. This doesn't seem desirable but it is useful if the reason you're using it is for checking inputs to a function or something similar where you want to produce an error on invalid inputs and just keep on moving if everything is fine. I mention stopifnot because it seems like this might be the situation you're in. I'm not sure.
Here is a silly example where you might use it. In this case you apparently only want positive numbers as input and reject everything else:
doublePositiveNumber <- function(x){
stopifnot(is.numeric(x), x >= 0)
return(2*x)
}
which results in
> doublePositiveNumber("hey")
Error: is.numeric(x) is not TRUE
> doublePositiveNumber(-2)
Error: x >= 0 is not TRUE
> doublePositiveNumber(2)
[1] 4
So here you guarantee you get the inputs you want and produce an error message for the user that hopefully tells them what the issue is.
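If the default "... is not TRUE" messages are too terse, stopifnot also accepts named arguments, and the name is used as the error message. This sketch assumes R 4.0.0 or later, where that feature is available; the message texts are made up for illustration.

```r
doublePositiveNumber <- function(x) {
  stopifnot(
    "x must be numeric"      = is.numeric(x),
    "x must be non-negative" = x >= 0
  )
  2 * x
}

# Capture the error message instead of stopping the script
msg <- tryCatch(doublePositiveNumber(-2),
                error = function(e) conditionMessage(e))
msg  # "x must be non-negative"
```

As before, checking stops at the first failing condition, so only one message is reported per call.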
Let me describe the problem setting. The function I am defining is a probability function, so its integral has to equal 1. I will therefore divide 1 by the result of the integration to get the value of C, which means I can't assign a value to C beforehand.
Have a look at the below code and error message -
> f <- function(x) (C*x*(exp(-x)))
> z=integrate(f, lower = 0, upper=Inf)
Error in C * x : non-numeric argument to binary operator
How am I supposed to define C here ?
Second Question- Can somebody figure what's wrong with value of z?
> f <- function(x) (x*(exp(-x)))
> z=integrate(f, lower = 0, upper=Inf)
> z
1 with absolute error < 6.4e-06
> 1/z
Error in 1/z : non-numeric argument to binary operator
Make C = 1 for when you compute the integral of the function. For that, you can make it an optional argument to your function with a default value:
f <- function(x, C = 1) C * x * exp(-x)
Then, compute:
z <- integrate(f, lower = 0, upper = Inf)
For the integral to be 1 with the real value for C, you need C * z == 1, i.e.:
C <- 1 / z$value
C
# [1] 1
As it turns out, the integral z is already equal to 1 so picking C = 1 was a lucky choice. You have nothing to do and you can just start using f as-is. Had it not been the case, I would have suggested to redefine f:
f_final <- function(x) f(x, C = 1 / z$value)
(Regarding your second question: integrate returns a list of class "integrate" rather than a number, which is why 1/z fails. The numeric result is stored in z$value, as described in the "Value" section of ?integrate.)
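To see the same recipe when the integral is not already 1, here is a sketch with a different, hypothetical function g(x) = x^2 * exp(-x), whose integral over [0, Inf) is 2, so the normalizing constant comes out near 0.5. The names g and g_final are assumptions for illustration.

```r
# Unnormalized function with C defaulting to 1
g <- function(x, C = 1) C * x^2 * exp(-x)

z <- integrate(g, lower = 0, upper = Inf)
C <- 1 / z$value   # approximately 0.5

# Fix C at the value that makes the integral equal to 1
g_final <- function(x) g(x, C = C)

integrate(g_final, lower = 0, upper = Inf)$value  # approximately 1
```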