I am trying to add a column to my data frame for wOBA, an advanced baseball statistic, but when I call mutate() R reports an "unexpected input" error.
This is my code:
library(baseballr)
library(dplyr)  # provides mutate()
output <- bref_daily_batter("2015-05-10", "2015-05-30")
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB – IBB + SF + HBP))
Error: unexpected input in "output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB –"
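The problem is the minus sign you are using: the denominator contains an en dash (–) rather than the ASCII hyphen-minus (-) that R parses as subtraction. The error message stops exactly at that character.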
This does not work:
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB – IBB + SF + HBP))
but this does:
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB - IBB + SF + HBP))
I just retyped the minus sign from the keyboard; the new one looks slightly smaller on my screen.
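Characters like this are hard to spot by eye, so it can help to scan the offending line for anything outside ASCII. The following is a minimal sketch, not part of the original answer: it uses Python purely for illustration, and the helper name find_non_ascii is hypothetical.

```python
import unicodedata

def find_non_ascii(source):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(source)
        if ord(ch) > 127
    ]

# Denominator from the failing line (en dash written as \u2013) and the fixed one
bad = "(AB + BB \u2013 IBB + SF + HBP)"
good = "(AB + BB - IBB + SF + HBP)"

print(find_non_ascii(bad))   # one hit: the en dash at position 9
print(find_non_ascii(good))  # no hits
```

Any hit inside code (as opposed to inside a string literal or comment) is a likely cause of an "unexpected input" parse error.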
Suppose I have a vector of variable names, x = c('a','b','c','d','e'), for a statistical model. When building the formula, it's nice to use something like paste('y ~',paste(x,collapse=' + ')) to get y ~ a + b + c + d + e, especially when x may change.
Now I'd like to do the same thing with interaction terms, but paste(x,collapse=' : ') produces a : b : c : d : e, which is only one term, and paste(x,collapse=' * ') produces a * b * c * d * e, which includes all possible interactions across all orders, i.e. a + b + c + ... + a:b + a:c + ... + a:b:c + a:b:d + ... + a:b:c:d:e. How can I limit the interaction terms to a given maximum order, say 2nd order (e.g. a:b)?
The most straightforward way to achieve this, assuming you want to cross all terms to a specified degree, is to use the ^ operator in the formula.
x = c('a','b','c','d','e')
# Build formula using reformulate
(fm <- reformulate(x, "y"))
y ~ a + b + c + d + e
# Cross to second degree
(fm2 <- update(fm, ~ .^2))
y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
c:d + c:e + d:e
# Terms of fm2 as character:
attr(terms.formula(fm2), "term.labels")
[1] "a" "b" "c" "d" "e" "a:b" "a:c" "a:d" "a:e" "b:c" "b:d" "b:e" "c:d" "c:e" "d:e"
# Cross to third degree
(fm3 <- update(fm, ~ .^3))
y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
c:d + c:e + d:e + a:b:c + a:b:d + a:b:e + a:c:d + a:c:e +
a:d:e + b:c:d + b:c:e + b:d:e + c:d:e
reformulate handles this problem quite naturally, though how you would apply it is context-dependent.
If you want to drop interactions of order greater than order_max from an existing formula, then you can do:
f1 <- function(formula, order_max) {
    a <- attributes(terms(formula))
    reformulate(termlabels = a$term.labels[a$order <= order_max],
                response = if (r <- a$response) a$variables[[1L + r]],
                intercept = a$intercept,
                env = environment(formula))
}
f1(y ~ a * b * c * d * e, 2L)
## y ~ a + b + c + d + e + a:b + a:c + b:c + a:d + b:d + c:d + a:e +
## b:e + c:e + d:e
If you have a character vector x listing names of variables, and you want to construct a formula containing their interactions up to order order_max, then you can do:
Edit: Never mind. Follow @RitchieSacramento's suggestion and use the ^ operator in this case.
f2 <- function(x, order_max, response = NULL, intercept = TRUE, env = parent.frame()) {
    paste1 <- function(x) paste0(x, collapse = ":")
    combn1 <- function(n) if (n > 1L) combn(x, n, paste1) else x
    termlabels <- unlist(lapply(seq_len(order_max), combn1), FALSE, FALSE)
    reformulate(termlabels = termlabels, response = response,
                intercept = intercept, env = env)
}
f2(letters[1:5], 2L, response = quote(y))
## y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
## c:d + c:e + d:e
To be parsed correctly, nonsyntactic variable names must be protected with backquotes:
f2(c("`!`", "`?`"), 1L, response = quote(`#`))
## `#` ~ `!` + `?`
Here is another solution to create the : terms:
iterms = function(x, n, lower = TRUE) {
  # orders to include: 1..n when lower is TRUE, otherwise only n
  orders = (if (lower) 1 else n):n
  paste(lapply(orders, function(ni) {
    paste(apply(combn(x, ni), 2, paste, collapse = ':'), collapse = ' + ')
  }), collapse = ' + ')
}
Testing with:
x = c('a','b','c','d')
print(iterms(x,1))
print(iterms(x,2))
print(iterms(x,3))
print(iterms(x,3,lower=FALSE))
yields:
[1] "a + b + c + d"
[1] "a + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d"
[1] "a + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d + a:b:c + a:b:d + a:c:d + b:c:d"
[1] "a:b:c + a:b:d + a:c:d + b:c:d"
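For readers who build model formulas as strings outside R, the same enumeration can be sketched in Python with itertools.combinations. This is an illustrative sketch only, not something from the answers above: the function name interaction_terms and its arguments are hypothetical, chosen to mirror iterms.

```python
from itertools import combinations

def interaction_terms(names, max_order, lower=True):
    """Join all interactions of `names` up to `max_order` with ' + '.

    With lower=False, only terms of exactly `max_order` are produced.
    """
    start = 1 if lower else max_order
    terms = []
    for order in range(start, max_order + 1):
        # combinations() yields each subset once, in the order of `names`
        terms.extend(":".join(combo) for combo in combinations(names, order))
    return " + ".join(terms)

x = ["a", "b", "c", "d"]
print("y ~ " + interaction_terms(x, 2))
print(interaction_terms(x, 3, lower=False))
```

The resulting strings match the iterms output shown above for the same inputs.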
I have this recurrence relation:
L^2 G[p] = 2(p-1)(2p-1) G[p-1] + ((p-1)(p-2) + a^2) G[p-2], where L and a are parameters.
Could anyone help me find the solution? Thanks.
This looks like quite a complicated calculation, especially when no start values are given.
To get some more insight, one could use sympy, Python's symbolic math library, to print the formulas for small values of p, treating the start values G0 and G1 as symbols:
from sympy import symbols

# Symbolic parameters and the two unspecified start values
a, L, G0, G1 = symbols('a L G0 G1')

def func_G(p):
    if p == 0:
        return G0
    elif p == 1:
        return G1
    else:
        return (2 * (p - 1) * (2 * p - 1) * func_G(p - 1) + ((p - 1) * (p - 2) + a ** 2) * func_G(p - 2)) / L ** 2

for p in range(8):
    print(p, ':', func_G(p).simplify())
Prints out:
0 : G0
1 : G1
2 : (G0*a**2 + 6*G1)/L**2
3 : (20*G0*a**2 + G1*L**2*(a**2 + 2) + 120*G1)/L**4
4 : (840*G0*a**2 + 42*G1*L**2*(a**2 + 2) + 5040*G1 + L**2*(a**2 + 6)*(G0*a**2 + 6*G1))/L**6
5 : (60480*G0*a**2 + 3024*G1*L**2*(a**2 + 2) + 362880*G1 + 72*L**2*(a**2 + 6)*(G0*a**2 + 6*G1) + L**2*(a**2 + 12)*(20*G0*a**2 + G1*L**2*(a**2 + 2) + 120*G1))/L**8
6 : (G0*L**4*a**6 + 26*G0*L**4*a**4 + 120*G0*L**4*a**2 + 10960*G0*L**2*a**4 + 90720*G0*L**2*a**2 + 6652800*G0*a**2 + 158*G1*L**4*a**4 + 2620*G1*L**4*a**2 + 5040*G1*L**4 + 398400*G1*L**2*a**2 + 1209600*G1*L**2 + 39916800*G1)/L**10
7 : (248*G0*L**4*a**6 + 7488*G0*L**4*a**4 + 38880*G0*L**4*a**2 + 1770240*G0*L**2*a**4 + 15966720*G0*L**2*a**2 + 1037836800*G0*a**2 + G1*L**6*a**6 + 44*G1*L**6*a**4 + 444*G1*L**6*a**2 + 720*G1*L**6 + 28224*G1*L**4*a**4 + 526080*G1*L**4*a**2 + 1088640*G1*L**4 + 62513280*G1*L**2*a**2 + 199584000*G1*L**2 + 6227020800*G1)/L**12
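Once concrete start values are chosen, the recurrence can also be evaluated iteratively instead of recursively, which avoids recomputing earlier terms. The sketch below is an addition under assumptions: the function name recurrence_values and the sample numbers (L = 2, a = 1, G0 = G1 = 1) are made up for illustration, and exact fractions are used so the results can be compared with the symbolic formulas above.

```python
from fractions import Fraction

def recurrence_values(p_max, L, a, G0, G1):
    """Return [G[0], ..., G[p_max]] for
    L^2 G[p] = 2(p-1)(2p-1) G[p-1] + ((p-1)(p-2) + a^2) G[p-2]."""
    L, a = Fraction(L), Fraction(a)
    G = [Fraction(G0), Fraction(G1)]
    for p in range(2, p_max + 1):
        G.append((2 * (p - 1) * (2 * p - 1) * G[p - 1]
                  + ((p - 1) * (p - 2) + a ** 2) * G[p - 2]) / L ** 2)
    return G[:p_max + 1]

# Hypothetical sample values, chosen only to exercise the function
values = recurrence_values(3, L=2, a=1, G0=1, G1=1)
print(values)  # G[2] = 7/4 and G[3] = 19/2 for these inputs
```

Substituting the same numbers into the printed formulas for p = 2 and p = 3 gives the same values, which is a quick consistency check on both.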
I need help reducing the following expression to its simplest form. Boolean algebra just doesn't quite click with me yet, so any help is appreciated.
(!A!B!C)+(!AB!C)+(!ABC)+(A!B!C)+(A!BC)+(AB!C)
I got it to the following, but I don't know where to go from here:
!A(!B!C + B!C + BC) + A(!B!C + B(XOR)C)
If you are curious and want to check my previous work, I got the original equation from the truth table:
Writing ~ for NOT, we initially have A(~B~C + ~BC + ~CB) + ~A(~B~C + B~C + BC)
First Term: A(~B~C + ~BC + ~CB)
= A(~B(~C + C) + ~CB)
= A(~B(True) + ~CB)
= A(~B + ~CB)
= A((~B + ~C)(~B + B))
= A((~B + ~C)(True))
= A(~B + ~C)
Second Term: ~A(~B~C + B~C + BC)
= ~A(~C(~B + B) + BC)
= ~A(~C(True) + BC)
= ~A(~C + BC)
= ~A((~C + C) (~C + B))
= ~A((True) (~C + B))
= ~A(~C + B)
So First Term + Second Term becomes: ~A(~C + B) + A(~B + ~C)
= ~A~C + ~AB + A~B + A~C
= AxorB + ~A~C + A~C   (since ~AB + A~B = AxorB)
= AxorB + ~C(~A + A)
= AxorB + ~C(True)
= AxorB + ~C
Hence we end up with AxorB + ~C
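A simplification like this can be checked by brute force, since three variables give only eight input rows. The following is a small sketch added for verification, not part of the original derivation; the function names original and simplified are just labels for the two expressions.

```python
from itertools import product

def original(A, B, C):
    # (!A!B!C)+(!AB!C)+(!ABC)+(A!B!C)+(A!BC)+(AB!C)
    return ((not A and not B and not C) or (not A and B and not C) or
            (not A and B and C) or (A and not B and not C) or
            (A and not B and C) or (A and B and not C))

def simplified(A, B, C):
    # AxorB + ~C  (for booleans, A != B is exclusive or)
    return (A != B) or (not C)

# Collect every input row on which the two expressions disagree
mismatches = [row for row in product([False, True], repeat=3)
              if bool(original(*row)) != bool(simplified(*row))]
print(mismatches)  # prints [] because the expressions agree on all 8 rows
```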
I've been using quosures with dplyr:
library(dplyr)
library(ggplot2)
thing <- quo(clarity)
diamonds %>% select(!!thing)
print(paste("looking at", thing))
[1] "looking at ~" "looking at clarity"
I really want to print out the string value put into the quo, but can only get the following:
print(thing)
<quosure: global>
~clarity
print(thing[2])
clarity()
substr(thing[2],1, nchar(thing[2]))
[1] "clarity"
Is there a simpler way to "unquote" a quo()?
We can use quo_name:
print(paste("looking at", quo_name(thing)))
quo_name does not work if the quosure is too long:
> q <- quo(a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z)
> quo_name(q)
[1] "+..."
rlang::quo_text (not exported by dplyr) works better, but introduces line breaks (which can be controlled with parameter width):
> rlang::quo_text(q)
[1] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + \n q + r + s + t + u + v + w + x + y + z"
Otherwise, as.character can also be used, but returns a vector of length two. The second part is what you want:
> as.character(q)
[1] "~"
[2] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
> as.character(q)[2]
[1] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
If you use this within a function, you will need to capture the argument with enquo() first. Note also that in newer versions of rlang, as_name() seems to be preferred.
library(rlang)
fo <- function(arg1 = name) {
  print(rlang::quo_text(enquo(arg1)))
  print(rlang::as_name(enquo(arg1)))
  print(rlang::quo_name(enquo(arg1)))
}
fo()
#> [1] "name"
#> [1] "name"
#> [1] "name"