Suppose I have four doubles a, b, c, d that at various points in my script will assume different real numbers. Assume also that all four doubles have values that center around another double called X. Namely, the following relationships must always hold:
a = X + 1
b = X + 5
c = X + 10
d = X + 15
In my script, the value of X is always changing. How do I write a function such that a, b, c, d change alongside X?
Creating the setAll function below and calling it whenever X changes will of course not work, but it is in the spirit of what I want:
setAll <- function(X) {
a = X + 1
b = X + 5
c = X + 10
d = X + 15
}
setAll(100) #if X = 100
If you want to keep clutter in .GlobalEnv to a minimum, it might be better to keep all these variables in a separate environment, e.g.:
> setAll <- function(X) {
+ if (!(exists('myParams') && is.environment(myParams))) {
+ myParams <<- new.env() # <<- so the environment outlives the call; a plain <- would only create a local one
+ }
+ myParams$a = X + 1
+ myParams$b = X + 5
+ myParams$c = X + 10
+ myParams$d = X + 15
+ }
> setAll(100) #if X = 100
> myParams$a
[1] 101
Alternatively, you could create a reference class, in the spirit of OO programming:
> myParam <- setRefClass('myParam', fields = list('X' = 'numeric', 'a' = 'numeric', 'b' = 'numeric', 'c' = 'numeric', 'd' = 'numeric'))
> myParam$methods(initialize = function(X, ...) {
+ .self$X <- X
+ .self$a <- X + 1
+ .self$b <- X + 5
+ .self$c <- X + 10
+ .self$d <- X + 15
+ callSuper(...)
+ })
> foo <- myParam(pi)
> foo$a
[1] 4.141593
> foo$b
[1] 8.141593
...
Sure, these are just initial and dummy wire-frames, but hopefully this would be useful for further ideas.
If you are working with scripts and you want these global variables in your workspace, then use the <<- operator (see ?"<<-"). Be careful though: this approach assumes that your critical variables don't get changed by any means other than what you intend, and it is not very portable.
Update: Your setAll function should work if its assignments use <<- instead of =. You can also define it as setAll <- function(): no argument is needed if X is itself a global variable that is reset before each call.
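As a minimal sketch of the zero-argument variant described in the update (assuming X, a, b, c and d all live in the workspace; the values 100 and 200 are only for illustration):

```r
# Create the variables up front so that <<- finds them here rather than
# searching further up, where a name like c already refers to base R's c().
a <- b <- c <- d <- NA_real_
X <- 100

# Recompute a, b, c, d from the current value of X.
# <<- assigns to the existing outer variables instead of creating
# local ones that vanish when the function returns.
setAll <- function() {
  a <<- X + 1
  b <<- X + 5
  c <<- X + 10
  d <<- X + 15
  invisible(NULL)
}

setAll()
a          # 101
X <- 200
setAll()   # must be called again after every change to X
d          # 215
```

The relationships only hold right after a call to setAll(), so every change to X has to be followed by one.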
I want to plot a continuous graph of a piecewise function involving log, in particular:
f <- function(x) {
if (0 < x && x <= 2) {
return (log(x));
} else {
return (x^2*log(x));
}
}
Now, I want to plot it:
ggplot(data.frame(x=seq(0,5,length.out=1000)), aes(x)) + geom_function(fun=f)
This code draws the graph correctly for x from 2 onwards, but for some reason it also uses the second function, x^2*log(x), for x between 0 and 2 instead of the first one.
This is more easily seen if I change the seq to seq(0, 2.1, length.out=1000).
But why does this happen and how can I fix it? I think I'm fine with the if statement with multiple conditions...
if is not vectorized. Try ifelse instead:
f <- function(x) {
ifelse(test = 0 < x & x <= 2,
yes = log(x),
no = (x^2) * log(x))
}
df <- data.frame(x = seq(0, 5, length.out = 10000))
ggplot(df, aes(x)) +
geom_function(fun = f)
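Vectorized means that ifelse takes a logical vector as test and, for each element, returns the corresponding element of yes where test is TRUE and of no where it is FALSE.
if, in contrast, can handle only a single TRUE or FALSE. In older versions of R, when a vector is supplied only its first element is considered (recent versions raise an error instead). Here the first value passed is presumably x = 0, for which 0 < x is FALSE, so the else branch is applied to the whole vector. Hence you probably received the following warning: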
#> Warning messages:
#> 1: In 0 < x && x <= 2 : 'length(x) = 1000 > 1' in coercion to 'logical(1)'
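A quick way to confirm that the vectorized version behaves as intended, without plotting, is to evaluate it on a few points on either side of the breakpoint. This is only a sketch; the test values below are arbitrary:

```r
# Vectorized piecewise function from the answer above
f <- function(x) {
  ifelse(0 < x & x <= 2, log(x), x^2 * log(x))
}

x <- c(0.5, 2, 3)
f(x)

# Each element is handled by its own branch:
# log(x) for 0.5 and 2, x^2 * log(x) for 3
all.equal(f(x), c(log(0.5), log(2), 3^2 * log(3)))  # TRUE
```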
I've got a bit of code that works, but which I understand relies on bad practice to do so. As a simple representation of the problem, take the code:
operation <- function(index){
a <- 0
if(data[index] == FALSE){
data[index] <<- TRUE
a <- a + 1}
a <- a + 1
return(a)
}
data <- c(FALSE, FALSE, FALSE)
x <- 0
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x
The "operation" function has two purposes - firstly, to output 2 if the value specified by the inputs is FALSE or 1 if TRUE, and importantly to change the input to TRUE so that future calls of the same input return 1.
The problems with this are that the operation function references a global variable which I know for my use case will always exist, but hypothetically may not, and that the function writes to the global variable with the <<- command, which I understand is incredibly bad practice.
Is there a better-practice way to achieve the same functionality without the function writing to the global variable?
By design, an R function returns only one object. To return several values, store them in a list, then pass the updated data from one call's result as input to the next call.
operation <- function(index, data){
a <- 0
if(data[index] == FALSE) {
data[index] <- TRUE
a <- a + 1}
a <- a + 1
return(list(a = a, data = data))
}
data <- c(FALSE, FALSE, FALSE)
x <- 0
set.seed(999)
res <- operation(sample(1:3, 1), data)
x <- x + res$a
res <- operation(sample(1:3, 1), res$data)
x <- x + res$a
res <- operation(sample(1:3, 1), res$data)
x <- x + res$a
x
#> [1] 5
res$data
#> [1] TRUE FALSE TRUE
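The same pattern can be written as a loop that threads the state through each call. This is a sketch only: the indices c(1, 3, 1) are fixed, made-up values used in place of sample() so that the result is reproducible.

```r
# Pure version of operation(): returns both the score and the updated data
operation <- function(index, data) {
  a <- 1
  if (!data[index]) {
    data[index] <- TRUE  # modifies the local copy only
    a <- 2
  }
  list(a = a, data = data)
}

data <- c(FALSE, FALSE, FALSE)
x <- 0
for (index in c(1, 3, 1)) {
  res <- operation(index, data)
  x <- x + res$a       # accumulate the score
  data <- res$data     # carry the updated state into the next call
}
x     # 5: two first visits (2 each) plus one repeat visit (1)
data  # TRUE FALSE TRUE
```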
Another option would be to create an R6 object with two bindings, x and data, and to modify them through self.
We can use object oriented programming (OOP). Compare this to using lists in another answer to see the increased clarity of using OOP once the object has been defined -- the actual code which runs the op method hardly changes from the question. 1a, 2 and 3 do not require any addon packages.
1) proto First we use the proto package for OOP. proto objects are environments with certain added methods. Here p is a proto object that contains data and also a method op. Note that with proto we can avoid the use of <<- and unlike class-based object oriented systems proto allows definitions of objects, here p is an object, without needing classes.
library(proto)
p <- proto(op = function(., index) {
a <- 0
if( ! .$data[index] ) {
.$data[index] <- TRUE
a <- a + 1
}
a <- a + 1
return(a)
})
p$data <- c(FALSE, FALSE, FALSE)
x <- 0
x <- x + p$op(sample(c(1,2,3),1))
x <- x + p$op(sample(c(1,2,3),1))
x
p$data
1a) A variation of this is to use just plain environments.
e <- local({
op <- function(index) {
a <- 0
if( ! data[index] ) {
data[index] <<- TRUE
a <- a + 1
}
a <- a + 1
return(a)
}
environment()
})
e$data <- c(FALSE, FALSE, FALSE)
x <- 0
x <- x + e$op(sample(c(1,2,3),1))
x <- x + e$op(sample(c(1,2,3),1))
x
e$data
2) Reference Classes Reference classes for OOP come with R and do not require any packages. This may be overkill since it requires creating a class which only ever instantiates one object whereas with proto we can directly generate an object without this extra step.
MyClass <- setRefClass("MyClass", fields = "data",
methods = list(
op = function(index) {
a <- 0
if( ! data[index] ) {
data[index] <<- TRUE
a <- a + 1
}
a <- a + 1
return(a)
}
)
)
obj <- MyClass$new(data = c(FALSE, FALSE, FALSE))
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x
obj$data
3) scoping It is possible to devise a poor man's OOP system that works with R by making use of function scoping. Try demo(scoping) for another example. This also does not require any packages. It does have the disadvantage of (2) that it requires the definition of a class which is only used once.
cls <- function(data = NULL) {
list(
put_data = function(x) data <<- x,
get_data = function() data,
op = function(index) {
a <- 0
if( ! data[index] ) {
data[index] <<- TRUE
a <- a + 1
}
a <- a + 1
return(a)
}
)
}
obj <- cls(data = c(FALSE, FALSE, FALSE))
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x
obj$get_data()
4) You can also explore R6, R.oo and oops which are other CRAN packages that implement OOP in R.
I have a vector of fish weight. I have written a function to check on the weight of the fish in the vector. If the weight is above 20, then update y (i.e., fish count) and z (i.e., fish total lbs). Below is the code:
function(x, y, z) {
for (fish in 1:x) {
if (x >= 20) {
y <- y + 1
z <- z + 1
return (y, z)
}
}
}
When I call the function with
funcy_fish(BF_caught, BF_kept_ct, BF_kept_lbs)
I get several warning messages but not the result I want: the variables should be updated with the count of fish kept and the total lbs of fish kept. The messages:
funcy_fish(BF_caught, BF_kept_ct, BF_kept_lbs)
[1] 1
Warning messages:
1: In 1:x : numerical expression has 4 elements: only the first used
2: In if (x >= 20) { :
the condition has length > 1 and only the first element will be used
Please advise.
UPDATE
I have updated the function, and got the right output. But now, the variables I passed to the function are not updated:
> # function to determine which fish from caught var to keep
> funcy_fish <- function(x, y, z) {
+ y <- y + (sum(x > 20))
+ z <- z + (x %>% sum())
+ return (c(y, z))
+ # return (z)
+ }
> funcy_fish(BF_caught, BF_kept_ct, BF_kept_lbs)
[1] 4.0000 190.8728
The output is right, but BF_kept_ct and BF_kept_lbs are not updated in the global scope.
Return a named list from the function:
funcy_fish <- function(x, y, z) {
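keep <- x > 20 # fish heavy enough to keep
y <- y + sum(keep) # add the number of kept fish
z <- z + sum(x[keep]) # add the weight of the kept fish only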
return(list(BF_kept_ct = y, BF_kept_lbs = z))
}
res <- funcy_fish(BF_caught, BF_kept_ct, BF_kept_lbs)
If you want to update the original BF_kept_ct and BF_kept_lbs variables, use list2env:
list2env(res, .GlobalEnv)
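If you would rather not write into the global environment from a helper, the same idea works with explicit reassignment. This is a minimal sketch: the weights in BF_caught are made-up sample values, and it assumes that only the weight of the kept fish should be added to the total.

```r
# Return the updated count and total weight as a named list
funcy_fish <- function(x, y, z) {
  keep <- x > 20
  list(BF_kept_ct = y + sum(keep), BF_kept_lbs = z + sum(x[keep]))
}

BF_caught <- c(25, 18, 30, 22)  # hypothetical weights in lbs
BF_kept_ct <- 0
BF_kept_lbs <- 0

res <- funcy_fish(BF_caught, BF_kept_ct, BF_kept_lbs)

# Update the originals explicitly from the returned list
BF_kept_ct <- res$BF_kept_ct
BF_kept_lbs <- res$BF_kept_lbs

BF_kept_ct   # 3
BF_kept_lbs  # 77
```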
The Simple Request:
I'd like to take formulas/strings similar to the following:
"A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
and change them to treat functions as character as follows:
"A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`"
There may be any number of "L(diff(___), -____:____)" in a string.
The Background:
This is so that I may take the output models made with dynlm and run them with functions that rely on "lm" objects only.
# package
library(dynlm)
# data
A <- as.ts(rnorm(20, 10, 2))
B <- as.ts(A + rnorm(20, 6, 2))
C <- as.ts(rnorm(20, 3, 1))
# lags/leads
k <- 1
# dynlm model
dyn.mod <- dynlm(A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k))
# capture the formula and data
dyn.mod.call <- gsub(" ", "", paste(deparse(dyn.mod$call$formula), collapse = "")) # just in case formula is too long
dyn.mod.model <- dyn.mod$model # the matrix that was created from the call formula
# Do the following
lm(dyn.mod.call, data = dyn.mod.model) # Will not run obviously,
lm(A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`, data = dyn.mod.model) # will run
# how do I change
dyn.mod.call
# [1] "A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
# to add " ` " around each dynlm "L()" function so the process is not manual?
Thanks for your help.
Note that we don't really want to replace all function calls since + is a function (and even ~ can be regarded as a function) but we only want to replace just certain ones. Suppose that the only function call that we want to process is L. Modify the second if appropriately according to what it is that is to be matched. The function shown works recursively. No packages are used.
enquote_L <- function(x) {
if (length(x) == 1) return(x)
if (x[[1]] == as.name("L")) return(as.name(format(x)))
replace(x, -1, lapply(x[-1], enquote_L))
}
s <- "A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
enquote_L(as.formula(s))
## A ~ 1 + B + C + `L(diff(B), -k:k)` + `L(diff(C), -k:k)`
ADDED
If there were a variety of functions and + and ~ were the only ones not to be processed then a variation might be to replace the second if with:
if (x[[1]] != as.name("+") && x[[1]] != as.name("~")) return(as.name(format(x)))
You can use string manipulation to change the formula.
x <- deparse(A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k))
parts <- unlist(strsplit(x, " \\+ "))
parts <- c(parts[1:3], paste0("`", parts[4:5], "`"))
as.formula(paste(parts, collapse = " + "))
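The string approach can be made independent of the term positions by testing each term instead of hard-coding indices 4:5. The sketch below uses a hypothetical helper, quote_L_terms, and assumes that terms are separated by " + ", that no L(...) term itself contains " + ", and that the first term after ~ is not an L(...) call.

```r
# Wrap every term that starts with "L(" in backticks (hypothetical helper)
quote_L_terms <- function(s) {
  parts <- strsplit(s, " + ", fixed = TRUE)[[1]]
  is_L <- startsWith(parts, "L(")
  parts[is_L] <- paste0("`", parts[is_L], "`")
  paste(parts, collapse = " + ")
}

s <- "A ~ 1 + B + C + L(diff(B), -k:k) + L(diff(C), -k:k)"
quote_L_terms(s)
```

The result is a string again, so it can be passed to as.formula() in the same way as above.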
I'm trying to adjust the names of an argument inside a function. I want to create a procedure that takes the body of a function, looks for x, changes every x into x0, and then restores the function to what it was before. To provide an example:
f = function(x, y) -x^2 + x + -y^2 + y
# Take old names
form_old = names(formals(f))
# Make new names
form_new = paste0(form_old, 0)
# Give f new formals
formals(f) = setNames(vector("list", length(form_new)), form_new)
# Copy function body
bod = as.list(body(f))
for (i in 1:length(form_new)) {
bod = gsub(form_old[i], form_new[i], bod)
}
# return from list to call ?
body(f) = as.call(list(bod))
f(1, 1) # produces an error
So far, this code will change all variable names from x to x0 and from y to y0. However, the final output of bod is a character vector and not a call. How can I now change this back to a call?
Thanks in advance!
Surely there is a better way to do what you are trying to do that doesn't require modifying functions. That being said, you definitely don't want to replace variables with regular expressions, which can go wrong in many ways. Generally, manipulating code as strings leads to problems: for example, a function like tricky <- function(x, y) { tst <- "x + y"; -xx*x + yy*y }, which contains a string and has overlapping variable names, will give the wrong results.
Here is a function that takes a recursive approach (Recall) to traverse the expression tree (recursion could be avoided using a 'stack' type structure, but it seems more difficult to me).
## Function to replace variables in function body
## expr is `body(f)`, keyvals is a lookup table for replacements
rep_vars <- function(expr, keyvals) {
if (!length(expr)) return()
for (i in seq_along(expr)) {
if (is.call(expr[[i]])) expr[[i]][-1L] <- Recall(expr[[i]][-1L], keyvals)
if (is.name(expr[[i]]) && deparse(expr[[i]]) %in% names(keyvals))
expr[[i]] <- as.name(keyvals[[deparse(expr[[i]])]])
}
return( expr )
}
## Test it
f <- function(x, y) -x^2 + x + -y^2 + y
newvals <- c('x'='x0', 'y'='y0') # named lookup vector
newbod <- rep_vars(body(f), newvals)
newbod
# -x0^2 + x0 + -y0^2 + y0
## Rename the formals, and update the body
formals(f) <- alist(x0 = , y0 = ) # arguments without defaults
body(f) <- newbod
## The new 'f'
f
# function (x0, y0)
# -x0^2 + x0 + -y0^2 + y0
f(2, 2)
# [1] -4
With a more difficult function, where you want to avoid modifying strings or the other variables named yy and xx for example,
tricky <- function(x, y) { tst <- "x + y"; -xx*x + yy*y }
formals(tricky) <- alist(x0 = , y0 = ) # arguments without defaults
body(tricky) <- rep_vars(body(tricky), newvals)
tricky
# function (x0, y0)
# {
# tst <- "x + y"
# -xx * x0 + yy * y0
# }
#
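For comparison, base R's substitute() can do the symbol replacement without a hand-written tree walk. The sketch below is an alternative, not the method above; rename_args is a hypothetical helper name. It replaces every symbol with a matching name, so it assumes the old argument names are not also used as function names or local variables in the body.

```r
# Rename arguments of a function; map is a named character vector,
# e.g. c(x = "x0"), mapping old names to new ones (hypothetical helper)
rename_args <- function(fun, map) {
  syms <- lapply(map, as.name)  # list(x = x0, ...) as symbols
  # substitute() swaps matching symbols in the body; strings are untouched
  body(fun) <- do.call(substitute, list(body(fun), syms))
  # Rename the formals, keeping any default values
  fmls <- formals(fun)
  hit <- names(fmls) %in% names(map)
  names(fmls)[hit] <- map[names(fmls)[hit]]
  formals(fun) <- fmls
  fun
}

f <- function(x, y) -x^2 + x + -y^2 + y
f2 <- rename_args(f, c(x = "x0", y = "y0"))
f2(2, 2)  # -4, same value as f(2, 2)

tricky <- function(x, y) { tst <- "x + y"; -xx*x + yy*y }
body(rename_args(tricky, c(x = "x0", y = "y0")))
```

Because the replacement works on symbols rather than text, the string "x + y" and the variables xx and yy in tricky are left alone.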
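There are a few ways to go here. Following your code, I would go with something like the code below. Note that gsub replaces every occurrence of the old name as a substring, so this is only safe when the argument names do not occur inside other names or strings in the body: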
f = function(x, y) -x^2 + x + -y^2 + y
# Take old names
form_old = names(formals(f))
# Make new names
form_new = paste0(form_old, 0)
# Turn the body into text so gsub can work on it
bod <- deparse(body(f))
for (i in 1:length(form_new)) {
bod = gsub(form_old[i], form_new[i], bod, fixed = TRUE)
}
formals(f) = setNames(vector("list", length(form_new)), form_new)
body(f) <- parse(text = bod)
f(1, 1)