The Simple Request:
I'd like to take formulas/strings similar to the following:
"A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
and change them to treat functions as character as follows:
"A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`"
There may be any number of "L(diff(___), -____:____)" in a string.
The Background:
This is so that I may take the output models made with dynlm and run them with functions that rely on "lm" objects only.
# package
library(dynlm)
# data
A <- as.ts(rnorm(20, 10, 2))
B <- as.ts(A + rnorm(20, 6, 2))
C <- as.ts(rnorm(20, 3, 1))
# lags/leads
k <- 1
# dynlm model
dyn.mod <- dynlm(A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k))
# capture the formula and data
dyn.mod.call <- gsub(" ", "", paste(deparse(dyn.mod$call$formula), collapse = "")) # just in case formula is too long
dyn.mod.model <- dyn.mod$model # the matrix that was created from the call formula
# Do the following
lm(dyn.mod.call, data = dyn.mod.model) # Will not run obviously,
lm(A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`, data = dyn.mod.model) # will run
# how do I change
dyn.mod.call
# [1] "A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
# to ad " ` " around each dynlm "L()" function so the process is not manual?
Thanks for your help.
Note that we don't really want to replace all function calls since + is a function (and even ~ can be regarded as a function) but we only want to replace just certain ones. Suppose that the only function call that we want to process is L. Modify the second if appropriately according to what it is that is to be matched. The function shown works recursively. No packages are used.
enquote_L <- function(x) {
if (length(x) == 1) return(x)
if (x[[1]] == as.name("L")) return(as.name(format(x)))
replace(x, -1, lapply(x[-1], enquote_L))
}
s <- "A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
enquote_L(as.formula(s))
## A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`
ADDED
If there were a variety of functions and + and ~ were the only ones not to be processed then a variation might be to replace the second if with:
if (x[[1]] != as.name("+") && x[[1]] != as.name("~")) return(as.name(format(x)))
You can use string manipulation to change the formula.
x <- deparse(A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k))
parts <- unlist(strsplit(x, " \\+ "))
parts <- c(parts[1:3], paste0("`", parts[4:5], "`"))
as.formula(paste(parts, collapse = " + "))
Related
With Ryacas, one has:
> yac_str("Simplify(2*x^2*4 + x^2*5)")
[1] "13*x^2"
Now, instead of 4 and 5, I would like to give two letters treated as constants. That is, Ryacas does:
> yac_str("Simplify(2*x^2*a + x^2*b)")
[1] "2*x^2*a+x^2*b"
but I would like that it treats x as an unknown variable and a and b as constants, i.e. I would like to get the result:
(2*a+b)*x^2
I spent one hour to try with no luck. Is it possible? Otherwise, with another package?
This seems to be just how yacas wants to simplify that expression. SymPy does it the way you want though, and it has an R wrapper called caracas
library(caracas)
a <- as_sym("a")
b <- as_sym("b")
x <- as_sym("x")
caracas::simplify(2*x^2*a+x^2*b)
#> [caracas]: 2
#> x *(2*a + b)
I don't like the way it prints x^2, but you can't have everything.
I found the way with caracas.
library(caracas)
def_sym(x, y, z, a, b)
as.character(sympy_func(x^2 + a*x^2 + 2*y + b*y + x*z + a*x*z, "Poly", domain = "RR[a,b]"))
# "Poly((1.0*a + 1.0)*x^2 + (1.0*a + 1.0)*x*z + (1.0*b + 2.0)*y, x, y, z, domain='RR[a,b]')"
You can also work with rational numbers:
poly <-
sympy_func(x^2 + a*x^2 + 2/3*y + b*y + x*z + a*x*z, "Poly", domain = "QQ[a,b]")
as.character(poly)
# "Poly((a + 1)*x^2 + (a + 1)*x*z + (b + 2/3)*y, x, y, z, domain='QQ[a,b]')"
To get the coefficient of a term, e.g. xz (i.e. x^1y^0z^1):
sympy <- get_sympy()
sympy$Poly$nth(poly$pyobj, 1, 0, 1)
# a + 1
I am trying to update a formula for a linear model in R, based on names of variables that I have stored in an array. I am using substitute() for that and the code is as follows.
var = 'a'
covar = c('b', 'c')
covar = paste(c(var, covar), collapse = ' + ')
formula = substitute(condition ~ (1|subject) + v, list(v = as.name(covar)))
print(formula)
Output
condition ~ (1 | subject) + `a + b + c`
How do I remove the extra `` around a + b + c?
If I don't concatenate with paste, then it works, but I need those extra variables...
var = 'a'
formula = substitute(condition ~ (1|subject) + v, list(v = as.name(var)))
print(formula)
Output
condition ~ (1 | subject) + a
Both var and covar are char type.
Another solution that lets iteratively change v in formula that could also work
Assume that v is a term by itself (which is the case in the question) and the inputs shown in the Note at the end. Then here are two approaches.
1) update Use reformulate to create the formula ~ . - v + a + b + c and update the input formula with it.
update(fo, reformulate(c(". - v", var, covar)))
## condition ~ (1 | subject) + a + b + c
2) getTerms Another approach is to decompose the formula into terms using getTerms from this post, remove v, append var and covar and reformulate it back into a formula:
reformulate(c(setdiff(sapply(getTerms(fo[[3]]), format), "v"), var, covar), fo[[2]])
## condition ~ (1 | subject) + a + b + c
Note
The inputs are assumed to be:
var <- 'a'
covar <- c('b', 'c')
fo <- condition ~ (1 | subject) + v
Maybe I misunderstood what you are doing, but the following seems to work:
form <- 'condition ~ (1|subject) + v'
var <- 'a'
covar <- c('b', 'c')
Then combine with paste and turn to formula directly:
covar <- paste(var, paste(covar, collapse=" + "), sep=" + ")
form <- formula(paste(form, covar, sep=" + "))
Output:
condition ~ (1 | subject) + v + a + b + c
The following problem is given: https://imgur.com/a/2OpPKfW
Write a function polyPrint, which prints the polynomial of Equation (1) in a nice way. For example,
if a corresponds to the vector c(2, 0, 1, -2, 5), then polyPrint(a) is supposed to print:
2 + x^2 + (-2)*x^3 + 5*x^4
Note how polyPrint is supposed to handle the cases where an element of a is either 0, 1 or negative!
At the moment I am using a loop and cat. Thats pretty rough because I need to do a lot of exeptions with ifelse statements for 1,0 and any negative.
Are there any other functions I could be using to streamline this process?
# Regular function
polyFunction <- function(x,a){
n <- seq(1, length(a))
sum(ax^(n-1))
}
# Printing the calculation method
a <- c(2,0,1,-2,5)
polyPrint <- function(a){
n <- seq(1, length(a))
for(i in n){
result <- c()
result <- c(result, cat(a[i],'x^', 1+i, '+ '))
}
}
polyPrint(a)
The expected output is a polynomial function that has this format:
p(x) = a0 + a1x + a2x^2 + ... + anxn
At the moment I am getting 2 *x^ 2 + 0 *x^ 3 + 1 *x^ 4 + -2 *x^ 5 + 5 *x^ 6 +
with the vector c(2, 0, 1, -2, 5)
I think you have an error in the way you are aggregating values and also the way you are taking in considerations special values (0,1, negative values).
Here an example of how you can do that (maybe not the best solution, but it does the job):
polyPrint <- function(a){
formule = NULL
for(i in 1:length(a))
{
if(a[i]==0){}
else{
if(i == 1) {formule = a[i]}
else{
if(a[i]<0){formule = paste0(formule, " + (",a[i],")x^",i)}
else{
if(a[i] == 1){formule = paste0(formule," + x^",i)}
else{formule = paste0(formule, " + ", a[i],"x^",i)}
}
}
}
}
print(formule)
}
And the result for a <- c(2,0,1,-2,5):
> polyPrint(a)
[1] "2 + x^3 + (-2)x^4 + 5x^5"
I can construct a formula that does what I desire starting with the character versions of terms in a formula, but I'm stumbling in starting with a formula object:
form1 <- Y ~ A + B
form1[-c(1,2)][[1]]
#A + B
Now how to build a formula object that looks like:
Y ~ poly(A, 2) + poly(B, 2) + poly(C, 2)
Or:
Y ~ pspline(A, 4) + pspline(B, 4) + pspline(C, 4)
Seems that it might involve a recursive walk along the RHS but I'm not getting progress. It just occurred to me that I might use
> attr( terms(form1), "term.labels")
[1] "A" "B"
And then use the as.formula(character-expr) approach, but I's sorly of like to see an lapply (RHS_form, somefunc) version of a polyize (or perhaps polymer?) function.
If I borrow some functions I originally wrote here, you could do something like this. First, the helper functions...
extract_rhs_symbols <- function(x) {
as.list(attr(delete.response(terms(x)), "variables"))[-1]
}
symbols_to_formula <- function(x) {
as.call(list(quote(`~`), x))
}
sum_symbols <- function(...) {
Reduce(function(a,b) bquote(.(a)+.(b)), do.call(`c`, list(...), quote=T))
}
transform_terms <- function(x, f) {
symbols_to_formula(sum_symbols(sapply(extract_rhs_symbols(x), function(x) do.call("substitute",list(f, list(x=x))))))
}
And then you can use
update(form1, transform_terms(form1, quote(poly(x, 2))))
# Y ~ poly(A, 2) + poly(B, 2)
update(form1, transform_terms(form1, quote(pspline(x, 4))))
# Y ~ pspline(A, 4) + pspline(B, 4)
There's a formula.tools package that provides various utility functions for working with formulas.
f <- y ~ a + b
rhs(f) # a + b
x <- get.vars(rhs(f)) # "a" "b"
r <- paste(sprintf("poly(%s, 4)", x), collapse=" + ") # "poly(a, 4) + poly(b, 4)"
rhs(f) <- parse(text=r)[[1]]
f # y ~ poly(a, 4) + poly(b, 4)
Suppose I have four doubles a, b, c, d that at various points in my script will assume different real numbers. Assume also that all four doubles have values that center around another double called X. Namely, the following relationships must always hold:
a = X + 1
b = X + 5
c = X + 10
d = X + 15
In my script, the value of X is always changing. How do I write a function such that a, b, c, d change alongside X?
Creating the setAll function below and calling whenever X changes will of course not work but is in the spirit of what I want.:
setAll <- function(X) {
a = X + 1
b = X + 5
c = X + 10
d = X + 15
}
setAll(100) #if X = 100
If you'd want to keep the clutter at a minimal level in the .GlovalEnv, it might be better to keep all these variables in a separate environment, e.g.:
> setAll <- function(X) {
+ if (!(exists('myParams') && is.environment(myParams))) {
+ myParams <- new.env()
+ }
+ myParams$a = X + 1
+ myParams$b = X + 5
+ myParams$c = X + 10
+ myParams$d = X + 15
+ }
> setAll(100) #if X = 100
> myParams$a
[1] 101
Or you might just create a reference class in the means of OO programming as an alternative solution:
> myParam <- setRefClass('myParam', fields = list('X' = 'numeric', 'a' = 'numeric', 'b' = 'numeric', 'c' = 'numeric', 'd' = 'numeric'))
> myParam$methods(initialize = function(X, ...) {
+ .self$a <- X + 1
+ .self$b <- X + 5
+ .self$c <- X + 10
+ .self$d <- X + 15
+ callSuper(...)
+ })
> foo <- myParam(pi)
> foo$a
[1] 4.141593
> foo$b
[1] 8.141593
...
Sure, these are just initial and dummy wire-frames, but hopefully this would be useful for further ideas.
If you are working with scripts and you want these global variables in your workspace then use the <<- operator: ?"<<-" Be careful though - this approach assumes that your critical variables don't get changed by any means other than what you intend, and are not very portable.
Update: Your setAll function should work if you change it to setAll <- function() - no argument is needed if X is reset each time with the <<- operator.