I am trying to add a column to my data frame for wOBA, an advanced baseball stat, but when I try to mutate the column I get an "unexpected input" error.
This is my code:
library(baseballr)
library(dplyr)  # provides mutate()
output <- bref_daily_batter("2015-05-10", "2015-05-30")
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB – IBB + SF + HBP))
Error: unexpected input in "output<- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB –"
The problem is the minus sign you are using: the expression contains an en dash (–) rather than the ASCII hyphen-minus (-) that R expects.
This does not work:
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB – IBB + SF + HBP))
but this does:
output <- mutate(output, wOBA = (0.687*uBB + 0.718*HBP + 0.881*X1B + 1.256*X2B + 1.594*X3B + 2.065*HR) / (AB + BB - IBB + SF + HBP))
I just replaced the minus sign with a freshly typed one, which looks slightly smaller on my screen.
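If the offending character is hard to spot by eye, one option is to scan the code text for non-ASCII characters. The following is only a minimal sketch in base R: the variable names are made up for illustration, and the string is just the denominator from the question with the en dash written as the escape \u2013.

```r
# Denominator from the question; \u2013 is the en dash that R's parser rejects
expr_text <- "(AB + BB \u2013 IBB + SF + HBP)"

# Split into single characters and look up each Unicode code point
chars <- strsplit(expr_text, "")[[1]]
codepoints <- vapply(chars, utf8ToInt, integer(1), USE.NAMES = FALSE)

# Anything above 127 is outside ASCII and worth inspecting
bad <- codepoints > 127L
suspects <- data.frame(
  position  = which(bad),
  codepoint = sprintf("U+%04X", codepoints[bad])
)
print(suspects)

# Replace the en dash with the ASCII hyphen-minus that R accepts
fixed <- gsub("\u2013", "-", expr_text, fixed = TRUE)
print(fixed)
```

This kind of character typically sneaks in when code is copied from a word processor or a web page that auto-formats dashes.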
Suppose I have a list of variable names x = c('a','b','c','d','e') for a statistical model. When building the formula, it's nice to use something like paste('y ~',paste(x,collapse=' + ')) to get y ~ a + b + c + d + e, especially when x may change.
Now I'd like to do the same thing with interaction terms, but paste(x,collapse=' : ') produces a : b : c : d : e, which is only one term, and paste(x,collapse=' * ') produces a * b * c * d * e, which includes all possible interactions across all orders, i.e. a + b + c + ... + a:b + a:c + ... + a:b:c + a:b:d + ... + a:b:c:d:e. How can I limit the interaction terms to a given order, say 2nd (e.g. a:b)?
The most straightforward way to achieve this, assuming you want to cross all terms to a specified degree, is to use the ^ operator in the formula.
x = c('a','b','c','d','e')
# Build formula using reformulate
(fm <- reformulate(x, "y"))
y ~ a + b + c + d + e
# Cross to second degree
(fm2 <- update(fm, ~ .^2))
y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
c:d + c:e + d:e
# Terms of fm2 as character:
attr(terms.formula(fm2), "term.labels")
[1] "a" "b" "c" "d" "e" "a:b" "a:c" "a:d" "a:e" "b:c" "b:d" "b:e" "c:d" "c:e" "d:e"
# Cross to third degree
(fm3 <- update(fm, ~ .^3))
y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
c:d + c:e + d:e + a:b:c + a:b:d + a:b:e + a:c:d + a:c:e +
a:d:e + b:c:d + b:c:e + b:d:e + c:d:e
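For readers who prefer to stay with the paste() approach from the question, the same ^ operator can be written directly into the formula string. This is a small sketch rather than part of the answer above; order_max is a name chosen here for illustration.

```r
x <- c("a", "b", "c", "d", "e")

# Wrap the summed terms in parentheses and raise them to the desired order
order_max <- 2
fm <- as.formula(paste0("y ~ (", paste(x, collapse = " + "), ")^", order_max))
print(fm)

# Expand the formula to list the individual terms it stands for
labels <- attr(terms(fm), "term.labels")
print(labels)
```

The formula prints in its compact (...)^2 form, while terms() shows the expanded main effects and two-way interactions.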
reformulate handles this problem quite naturally, though how you would apply it is context-dependent.
If you want to drop interactions of order greater than order_max from an existing formula, then you can do:
f1 <- function(formula, order_max) {
  a <- attributes(terms(formula))
  reformulate(termlabels = a$term.labels[a$order <= order_max],
              response = if (r <- a$response) a$variables[[1L + r]],
              intercept = a$intercept,
              env = environment(formula))
}
f1(y ~ a * b * c * d * e, 2L)
## y ~ a + b + c + d + e + a:b + a:c + b:c + a:d + b:d + c:d + a:e +
## b:e + c:e + d:e
If you have a character vector x listing names of variables, and you want to construct a formula containing their interactions up to order order_max, then you can do:
Edit: Never mind - follow #RitchieSacramento's suggestion and use the ^ operator in this case.
f2 <- function(x, order_max, response = NULL, intercept = TRUE, env = parent.frame()) {
  paste1 <- function(x) paste0(x, collapse = ":")
  combn1 <- function(n) if (n > 1L) combn(x, n, paste1) else x
  termlabels <- unlist(lapply(seq_len(order_max), combn1), FALSE, FALSE)
  reformulate(termlabels = termlabels, response = response,
              intercept = intercept, env = env)
}
f2(letters[1:5], 2L, response = quote(y))
## y ~ a + b + c + d + e + a:b + a:c + a:d + a:e + b:c + b:d + b:e +
## c:d + c:e + d:e
To be parsed correctly, nonsyntactic variable names must be protected with backquotes:
f2(c("`!`", "`?`"), 1L, response = quote(`#`))
## `#` ~ `!` + `?`
Here is another solution to create the : terms:
iterms = function(x, n, lower = TRUE) {
  return(paste(lapply(ifelse(lower, 1, n):n, function(ni) {
    paste(apply(combn(x, ni), 2, paste, collapse = ':'), collapse = ' + ')
  }), collapse = ' + '))
}
Testing with:
x = c('a','b','c','d')
print(iterms(x,1))
print(iterms(x,2))
print(iterms(x,3))
print(iterms(x,3,lower=FALSE))
yields:
[1] "a + b + c + d"
[1] "a + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d"
[1] "a + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d + a:b:c + a:b:d + a:c:d + b:c:d"
[1] "a:b:c + a:b:d + a:c:d + b:c:d"
How can I find out how many observations were used in a regression?
model_simple <- as.formula("completion_yesno ~ ac + ov + UCRate + FirstWeek + LastWeek + DayofWeekSu + DayofWeekMo + DayofWeekTu + DayofWeekWe + DayofWeekTh + DayofWeekFr + MonthofYearJan + MonthofYearFeb + MonthofYearMar + MonthofYearApr +MonthofYearMay+ MonthofYearJun + MonthofYearJul + MonthofYearAug + MonthofYearSep + MonthofYearOct + MonthofYearNov")
clog_simple1 = glm(model_simple,data=cllw,family = binomial(link = cloglog))
summary(clog_simple1)
I have tried the fitted() function, but it did not give me a concrete number of observations N.
Use the built-in nobs() function:
nobs(clog_simple1)
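To illustrate why nobs() can differ from the number of rows in the data, here is a small sketch using the built-in mtcars data as a stand-in (the cllw data from the question is not available, and the default logit link is used instead of cloglog). Rows with missing values are dropped before fitting, and nobs() reports what is left.

```r
# Stand-in data: mtcars with three predictor values set to missing
d <- mtcars
d$mpg[1:3] <- NA

# glm() drops incomplete rows by default (na.action = na.omit)
fit <- glm(am ~ mpg, data = d, family = binomial)

print(nrow(d))                # rows supplied: 32
print(nobs(fit))              # rows actually used in the fit: 29
print(length(fit$na.action))  # rows dropped because of missing values: 3
```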
I've been using quosures with dplyr:
library(dplyr)
library(ggplot2)
thing <- quo(clarity)
diamonds %>% select(!!thing)
print(paste("looking at", thing))
[1] "looking at ~" "looking at clarity"
I really want to print out the string value put into the quo, but can only get the following:
print(thing)
<quosure: global>
~clarity
print(thing[2])
clarity()
substr(thing[2],1, nchar(thing[2]))
[1] "clarity"
Is there a simpler way to "unquote" a quo()?
We can use quo_name:
print(paste("looking at", quo_name(thing)))
quo_name does not work if the quosure is too long:
> q <- quo(a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z)
> quo_name(q)
[1] "+..."
rlang::quo_text (not exported by dplyr) works better, but introduces line breaks (which can be controlled with parameter width):
> rlang::quo_text(q)
[1] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + \n q + r + s + t + u + v + w + x + y + z"
Otherwise, as.character can also be used, but returns a vector of length two. The second part is what you want:
> as.character(q)
[1] "~"
[2] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
> as.character(q)[2]
[1] "a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
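Another way to get a single-line string, sketched here under the assumption that rlang is installed, is to pull the expression out of the quosure with rlang::quo_get_expr() and deparse it with base R. The long quosure mirrors the one above; the name txt is chosen here for illustration.

```r
library(rlang)

q <- quo(a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z)

# Extract the bare expression, then deparse it with the widest allowed cutoff;
# paste() joins the pieces in case deparse() still splits a very long expression
txt <- paste(deparse(quo_get_expr(q), width.cutoff = 500L), collapse = " ")
print(txt)
```

Unlike as.character(), this returns a single string rather than a vector whose second element has to be picked out.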
If you use it within a function, you will need to enquo() the argument first. Note also that with newer versions of rlang, as_name() seems to be preferred.
library(rlang)
fo <- function(arg1 = name) {
  print(rlang::quo_text(enquo(arg1)))
  print(rlang::as_name(enquo(arg1)))
  print(rlang::quo_name(enquo(arg1)))
}
fo()
#> [1] "name"
#> [1] "name"
#> [1] "name"
I have a little problem with this expression:
x = (A'+B)(A+C)
I know it can be simplified to:
A'C+AB
since I've used some software to simplify it, but I simply can't see how it is done.
This is what I've done so far:
(A'+B)(A+C) =>
A'A + AB + A'C + BC =>
0 + AB + A'C + BC =>
AB + A'C + BC
I just fail to see how I can do this differently and get to the correct result.
So we are trying to prove:
AB + A'C + BC = AB + A'C
Using the Identity Law X = X1, the left side can become:
AB + A'C + BC1
Inverse Law 1 = X' + X
AB + A'C + BC(A + A')
Distributive Law X(Y + Z) = XY + XZ
AB + A'C + BCA + BCA'
Commutative Law XY = YX (together with the Associative Law (XY)Z = X(YZ)) to reorder the factors
AB + A'C + ABC + A'BC
Commutative Law X + Y = Y + X
AB + ABC + A'C + A'BC
Distributive Law again, factoring AB out of the first two terms and A'C out of the last two
AB(1 + C) + A'C(1 + B)
Finally, the Null Law 1 + X = 1
AB(1) + A'C(1)
AB + A'C = AB + A'C
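The algebra can also be sanity-checked by brute force over all eight input combinations. This is just an illustrative sketch in R (the variable names are chosen here), with ' written as ! and + and juxtaposition written as | and &.

```r
# Every combination of truth values for A, B and C
tt <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE))

original   <- with(tt, (!A | B) & (A | C))              # (A'+B)(A+C)
expanded   <- with(tt, (A & B) | (!A & C) | (B & C))    # AB + A'C + BC
simplified <- with(tt, (A & B) | (!A & C))              # AB + A'C

# Show the truth table and confirm all three columns agree on every row
print(cbind(tt, original, expanded, simplified))
print(all(original == expanded) && all(expanded == simplified))
```

The BC term never changes the result because whenever B and C are both true, either AB or A'C is already true; this identity is known as the consensus theorem.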