Say there is a formula:
f1 = as.formula(y~ var1 + var2 + var3)
f1
## y ~ var1 + var2 + var3
Then I want to update the formula by adding a term whose name is stored in the character variable a.
a = 'aaabbbccc'
f2 = update(f1, ~ . + a)
f2
## y ~ var1 + var2 + var3 + a
This is not what I expected. I want a to be evaluated in the formula. Then I tried this:
f3 = update(f1, ~ . + get(a))
f3
## y ~ var1 + var2 + var3 + get(a)
Also failed. What I expected is this:
y ~ var1 + var2 + var3 + aaabbbccc
Any help will be highly appreciated!
If you are evaluating these statements in your global environment, then you could do:
f <- y ~ var1 + var2 + var3
a <- as.name("aaabbbccc")
update(f, substitute(~ . + a, env = list(a = a)))
## y ~ var1 + var2 + var3 + aaabbbccc
Otherwise, you could do:
update(f, substitute(~ . + a, env = environment()))
## y ~ var1 + var2 + var3 + aaabbbccc
The important thing is that the value of a in env is a symbol, not a string: as.name("aaabbbccc") or quote(aaabbbccc), but not "aaabbbccc".
Somewhat unintuitively, substitute(expr, env = .GlobalEnv) is equivalent to substitute(expr, env = NULL). That is the only reason why it is necessary to pass list(a = a) (or similar) in the first case.
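To see this behaviour in isolation, here is a small sketch. The function name in_function and the symbol zzz are made up for illustration; everything else is base R.

```r
a <- as.name("aaabbbccc")

# With env = .GlobalEnv, substitute() leaves symbols untouched.
unchanged <- substitute(~ . + a, env = .GlobalEnv)

# With an explicit list, a is replaced by the symbol it holds.
replaced <- substitute(~ . + a, env = list(a = a))

# Inside a function, environment() is the function's own frame,
# so the local variable a is substituted.
in_function <- function() {
  a <- as.name("zzz")
  substitute(~ . + a, env = environment())
}
local_result <- in_function()
```

Printing unchanged, replaced and local_result side by side shows which calls actually performed the substitution.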
I should point out that, in this situation, it is not too difficult to create the substitute result yourself, "from scratch":
update(f, call("~", call("+", quote(.), a)))
## y ~ var1 + var2 + var3 + aaabbbccc
This approach has the advantage of being environment-independent, and, for that reason, is probably the one I would use.
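If you need this in more than one place, the same construction can be wrapped in a small function. This is only a sketch: add_term is a hypothetical helper name, not something provided by base R, and it assumes the term is given as a single string.

```r
# add_term() is a hypothetical helper, not part of base R.
add_term <- function(f, name) {
  stopifnot(is.character(name), length(name) == 1L)
  # Build the call `~ . + <name>` by hand, with <name> as a symbol.
  update(f, call("~", call("+", quote(.), as.name(name))))
}

f1 <- y ~ var1 + var2 + var3
a <- "aaabbbccc"
f2 <- add_term(f1, a)
f2
```

Because the symbol is created inside the helper, the caller can keep a as a plain string, as in the question.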
I need to convert the probabilities to true or false (> 0.5), and then print a confusion matrix. I can't find an example of how to do this.
In my attempts, I'm also having difficulty referencing the transformed "Success" target, which is now "SuccessTRUE."
Success is Boolean, Comp is a factor (4 levels), the others are numeric.
require(neuralnet)
m <- model.matrix(~Success + Comp + Var2 + Var3 + Var4, data=Train3)
m1 <- model.matrix(~Success + Comp + Var2 + Var3 + Var4, data=Test3)
nn <- neuralnet(SuccessTRUE ~ Comp2 + Comp3 + Comp4 + Var2 + Var3 + Var4, data = m, hidden = 4, act.fct = "logistic", linear.output = FALSE)
pred <- compute(nn,m1)
I figured this out.
library(caret)  # provides confusionMatrix()
predicted.classes <- ifelse(pred$net.result > 0.5, "TRUE", "FALSE")
tab <- table(predicted.classes, Test3$Success)
confusionMatrix(tab)
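For readers without caret installed, the same thresholding and cross-tabulation can be sketched in base R. The probabilities and labels below are toy values made up for illustration; in the setting above they would be pred$net.result and Test3$Success.

```r
# Toy values, made up for illustration only.
probs  <- c(0.9, 0.2, 0.7, 0.4)
actual <- c(TRUE, FALSE, FALSE, TRUE)

# Convert probabilities to TRUE/FALSE at the 0.5 threshold.
predicted <- probs > 0.5

# Fixing the factor levels keeps the table 2 x 2 even if one class
# is never predicted.
lv <- c(FALSE, TRUE)
cm <- table(Predicted = factor(predicted, levels = lv),
            Actual    = factor(actual, levels = lv))
cm

# Share of observations on the diagonal (correct predictions).
accuracy <- sum(diag(cm)) / sum(cm)
```

This gives only the raw counts and accuracy; confusionMatrix() reports further statistics on top of the same table.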
How can I dynamically update a formula?
Example:
myvar <- "x"
update(y ~ 1 + x, ~ . -x)
# y ~ 1 (works as intended)
update(y ~ 1 + x, ~ . -myvar)
# y ~ x (doesn't work as intended)
update(y ~ 1 + x, ~ . -eval(myvar))
# y ~ x (doesn't work as intended)
You can use paste() within the update() call.
myvar <- "x"
update(y ~ 1 + x, paste(" ~ . -", myvar))
# y ~ 1
Edit
As @A.Fischer noted in the comments, this won't work if myvar is a vector of length > 1:
myvar <- c("k", "l")
update(y ~ 1 + k + l + m, paste(" ~ . -", myvar))
# y ~ l + m
# Warning message:
# Using formula(x) is deprecated when x is a character vector of length > 1.
# Consider formula(paste(x, collapse = " ")) instead.
Just "k" gets removed, but "l" remains in the formula.
In this case we could convert the formula to strings, add or remove the terms we want to change, and rebuild the formula using reformulate, something like:
FUN <- function(fo, x, negate=FALSE) {
  foc <- as.character(fo)
  s <- el(strsplit(foc[3], " + ", fixed=TRUE))
  if (negate) {
    reformulate(s[!s %in% x], foc[2], env=.GlobalEnv)
  } else {
    reformulate(c(s, x), foc[2], env=.GlobalEnv)
  }
}
fo <- y ~ 1 + k + l + m
FUN(fo, c("n", "o")) ## add variables
# y ~ 1 + k + l + m + n + o
FUN(fo, c("k", "l"), negate=TRUE) ## remove variables
# y ~ 1 + m
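For the removal case only, a lighter option that follows the hint in the warning message is to collapse the names into one string before handing it to update(). This is a sketch under the assumption that every element of myvar is a plain term label already present in the formula.

```r
myvar <- c("k", "l")

# Build the single string "~ . - k - l" so update() sees one formula
# rather than a character vector of length 2.
rhs <- paste("~ . -", paste(myvar, collapse = " - "))
res <- update(y ~ 1 + k + l + m, rhs)
res
```

Because the string has length one, no deprecation warning is triggered and both k and l are dropped.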
It may look like an easy question but is there any fast and robust way to expand a formula like
f=formula(y ~ a * b )
to
y~a+b+ab
I'd try this:
f = y ~ a * b
reformulate(labels(terms(f)), f[[2]])
# y ~ a + b + a:b
It works on more complicated formulas as well, because it lets terms() do the expansion rather than manipulating strings. (I'm assuming you want a useful formula object out, so in the result a:b is nicer than the ab in the question or a*b in d.b's answer.)
f = y ~ a + b * c
reformulate(labels(terms(f)), f[[2]])
# y ~ a + b + c + b:c
f = y ~ a + (b + c + d)^2
reformulate(labels(terms(f)), f[[2]])
# y ~ a + b + c + d + b:c + b:d + c:d
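To see what reformulate() is being fed here, it can help to inspect the terms object directly. A minimal sketch using only base R (the variable name tt is arbitrary):

```r
f <- y ~ a * b
tt <- terms(f)

# Expanded term labels: main effects first, then the interaction.
labels(tt)

# Interaction order of each term (1 = main effect, 2 = two-way).
attr(tt, "order")
```

The labels vector is exactly what gets pasted together with " + " on the right-hand side of the rebuilt formula.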
vec = all.vars(f)
reformulate(c(vec[2:3], paste(vec[2:3], collapse = "*")), vec[1])
#y ~ a + b + a * b
I have a formula in R for example
y ~ x + z + xx + zz + tt + x:xx + x:zz + xx:z + zz:xx + xx:zz:tt
or even more complicated (y~x*z*xx*zz*tt)
Note that the names on the right-hand side of the formula are intentionally selected to be somehow similar to at least one other term.
The question is now how to remove the interaction terms that are related to a specific main effect. For example, if I remove the term x (main effect) I want to remove the interaction terms that also include x, here x:xx.
I have tried grepl(), but it removes any term whose label contains the pattern, whether partially or fully. In my example it removes x,xx,x:xx,xx:z,zz:xx,xx:zz:tt
Any ideas for a function that does this?
Update:
What I have already tried:
f = y ~ x + z + xx + zz + tt + x:xx + x:zz + xx:z + zz:xx + xx:zz:tt
modelTerms = attr(terms(f) , which = 'term.labels')
modelTerms[!grepl(pattern = 'x', x = modelTerms)]
Use update.formula:
f <- y~x*z*xx*zz*tt
update(f, . ~ . - x - x:.)
#y ~ z + xx + zz + tt + z:xx + z:zz + xx:zz + z:tt + xx:tt + zz:tt +
# z:xx:zz + z:xx:tt + z:zz:tt + xx:zz:tt + z:xx:zz:tt
f <- y ~ x + z + xx + zz + tt + x:xx + x:zz + xx:z + zz:xx + xx:zz:tt
update(f, . ~ . - x - x:.)
#y ~ z + xx + zz + tt + z:xx + xx:zz + xx:zz:tt
Are you looking for this?
> modelTerms[!grepl(pattern='^x\\:x+', x=modelTerms)]
[1] "x" "z" "xx" "zz" "tt" "x:zz" "z:xx" "xx:zz"
[9] "xx:zz:tt"
Simple:
f = y~x*z*xx*zz*tt
modelTerms = attr(terms(f) , which = 'term.labels')
l = sapply(
  strsplit(x = modelTerms, split = '[:*]'),
  FUN = function(x) {
    'x' %in% x
  }
)
modelTerms[!l]
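The filtered labels can then be turned back into a formula. The sketch below applies the same exact-match idea to the shorter formula from the question and rebuilds it with reformulate(); the names labs, has_x and f_new are made up for illustration.

```r
f <- y ~ x + z + xx + zz + tt + x:xx + x:zz + xx:z + zz:xx + xx:zz:tt
labs <- attr(terms(f), "term.labels")

# Split each label on ":" and test for an exact match on "x",
# so that xx and zz are not caught by accident.
has_x <- vapply(strsplit(labs, ":", fixed = TRUE),
                function(parts) "x" %in% parts,
                logical(1))

kept <- labs[!has_x]
f_new <- reformulate(kept, response = "y")
f_new
```

Unlike grepl(pattern = 'x', ...), this drops only x, x:xx and x:zz and leaves terms such as xx:zz alone.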
What is the proper string parsing required to use reformulate() when the termlabels have embedded spaces?
This works:
reformulate(c("A", "B"), "Y")
Y ~ A + B
These all fail:
reformulate(c("A var", "B"), "Y")
reformulate(quote(c("A var", "B")), "Y")
reformulate(as.formula(quote(c("A var", "B"))), "Y")
Expected results:
Y ~ `A var` + B
# or
Y ~ `A var` + `B`
NOTE
I cannot hard code the backticks. This is part of a larger shiny application, therefore, if backticks are the answer, I need a method to do this programmatically.
Here are a few other ways that work with symbols rather than strings (so no need for explicit backticks).
input <- "A var"
eval(bquote( Y ~ .(as.name(input)) + B))
# Y ~ `A var` + B
eval(substitute( Y ~ INPUT + B, list(INPUT = as.name(input))))
# Y ~ `A var` + B
library(rlang)
eval(expr(Y ~ !!sym(input) + B))
# Y ~ `A var` + B
Use backticks, e.g.
reformulate(c("`A var`", "B"), "Y")
#Y ~ `A var` + B
Or better yet, don't use spaces in variable names.
Or with a helper function:
var1 <- "A var"
var2 <- "B"
bt <- function(x) sprintf("`%s`", x)
reformulate(c(bt(var1), var2), "Y")
#Y ~ `A var` + B
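When the names arrive at run time, as in a shiny app, you may not know in advance which ones need backticks. One possible sketch is to quote only the names that make.names() would have to alter. quote_if_needed is a hypothetical helper name, and it assumes the names themselves contain no backticks.

```r
# quote_if_needed() is a hypothetical helper, not part of base R.
# A name is treated as syntactic if make.names() leaves it unchanged.
quote_if_needed <- function(x) {
  ifelse(x == make.names(x), x, sprintf("`%s`", x))
}

vars <- c("A var", "B")
quoted <- quote_if_needed(vars)
f <- reformulate(quoted, response = "Y")
f
```

Here only "A var" is wrapped in backticks, while "B" is passed through unchanged.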