R: optim() when function has inputs in different formats - r

I have data
dat1 <- data.frame(a=1:3, b=rnorm(3))
dat2 <- data.frame(a=c(rep(1,3),rep(2,5),rep(3,4)), c=runif(12,1,50))
and a function that takes both data frames as inputs
foo <- function(dat1,dat2,par){
if(par< 25){return(dat1$b*par)}
if(par>=25){return(sum(dat2$c>par))}
}
which might work if it was embedded in a loop over different values of a.
However, I would like to find the value of par that minimizes the output of foo across all values of a. The optim() function should be able to do just this, but my problem is that I need to pass it two data frames of different dimensions. I suspect some form of list could help, but I wouldn't know how.

From the help documentation on optim:
fn - A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.
Your function does not return a scalar when par < 25. Since you are not changing the data frames during the optimization, you do not have to pass them in again; foo can read them from the environment in which it was defined. Below is an example usage of optim in your case:
foo <- function(par) {
if(par < 25) {
return(sum(dat1$b*par))
} else {
return(sum(dat2$c>par))
}
}
optim(0, foo, method="Brent", lower=-1e6, upper=1e6)
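If you would rather pass the data frames explicitly, optim() forwards any extra named arguments to fn through its ... argument, as long as par stays the first argument of fn. A minimal sketch, assuming the same foo as above with two extra parameters (the seed and the data are only illustrative):

```r
set.seed(1)  # illustrative data with the same shape as in the question
dat1 <- data.frame(a = 1:3, b = rnorm(3))
dat2 <- data.frame(a = c(rep(1, 3), rep(2, 5), rep(3, 4)), c = runif(12, 1, 50))

# par comes first; the data frames follow as ordinary named arguments
foo <- function(par, dat1, dat2) {
  if (par < 25) {
    sum(dat1$b * par)
  } else {
    sum(dat2$c > par)
  }
}

# dat1= and dat2= are not arguments of optim(), so they are forwarded to foo via ...
res <- optim(0, foo, dat1 = dat1, dat2 = dat2,
             method = "Brent", lower = -1e6, upper = 1e6)
res$par    # parameter value found by the optimizer
res$value  # foo evaluated at that value
```

Keep in mind that this objective is piecewise and discontinuous at par = 25, so the result should be treated with some caution.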

Related

Using variable from another script (inter-file function closure)

I have script main.R, where I create inv_cov_mat variable. I later load metrics.R and use it to calculate function value (I use it as kind of inter-script function closure). I get error "object 'inv_cov_mat' not found". My code:
main.R:
knn <- function(...)
{
# some code
source("./source/metrics.R")
if (metric == "mahalanobis")
inv_cov_mat <- solve(cov(training_set))
# other code
# calculate distance in given metric between current vector and every row vector from training set matrix
distances <- apply(training_set, 1, metric, vec2=curr_vec) # error
metrics.R:
mahalanobis <- function(vec1, vec2)
{
diff <- vec1 - vec2
sqrt(t(diff) %*% inv_cov_mat %*% diff)
}
I've found a simple, if not elegant, answer: define inv_cov_mat as a global variable rather than inside the knn function. Then functions from other scripts can see it.
It's not entirely clear what you want, but if I understand you correctly, you have a character string identifying the metric you want to use, and a function with the same name. So you should be able to use get to retrieve the function based on the name.
metric <- "mahalanobis"
metric.fun = get(metric)
distances <- apply(training_set, 1, metric.fun, vec2=curr_vec)
That said, there are probably better ways to organize your code that would avoid this problem entirely, e.g. create a named list of functions for accessing metrics.
EDIT regarding the issue of inv_cov_mat, either pass it as an argument to your metric function or use get inside that function to access variables from the parent environment using the envir argument. Passing the variable as an argument to your metric function is definitely the better and cleaner approach.
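A minimal sketch of the "pass it as an argument" approach combined with a named list of metric functions. The names mahalanobis_dist and metrics and the random training_set are made up for illustration (the function is renamed so that it does not mask stats::mahalanobis):

```r
# Metric that receives the inverse covariance matrix explicitly
mahalanobis_dist <- function(vec1, vec2, inv_cov_mat) {
  d <- vec1 - vec2
  as.numeric(sqrt(t(d) %*% inv_cov_mat %*% d))
}

# Named list of available metrics, looked up by a character string
metrics <- list(mahalanobis = mahalanobis_dist)

set.seed(1)
training_set <- matrix(rnorm(40), ncol = 2)  # illustrative data
curr_vec <- c(0, 0)
inv_cov_mat <- solve(cov(training_set))

metric <- "mahalanobis"
# apply() forwards vec2 and inv_cov_mat to the metric for every row
distances <- apply(training_set, 1, metrics[[metric]],
                   vec2 = curr_vec, inv_cov_mat = inv_cov_mat)
```

Because inv_cov_mat travels as an argument, the metric no longer depends on where it was sourced or on any global variable.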

R argument is missing, with no default

I want to calculate the log return of my data. I defined a function and loaded the data, but R always reports that the second argument is missing. Otherwise it just calculates the log of the row number.
#read data
data <- read.csv(file="E:/Lect-1-TradingTS.csv",header=TRUE)
mode(data)
p<-data["Price"]
#func1
func1 <- function(x1,x2)
{
result <- log(x2)-log(x1)
return(result)
}
#calculate log return
log_return<-vector(mode="numeric", length=(nrow(data)-1))
for(i in 2:nrow(p))
{
log_return[i-1] <- func1(p[(i-1):i])
}
Error in func1(p[(i - 1):i]) : argument "x2" is missing, with no default
Your function func1 was defined to accept two arguments, but you are passing it a single argument: the vector p[(i-1):i], which has two elements but is still considered a single object. To fix this you need to pass two separate arguments, p[i-1] and p[i]. Alternatively, modify the definition of func1 to accept a two-element vector:
func1 <- function(v)
{
x1 <- v[1]
x2 <- v[2]
result <- log(x2)-log(x1)
return(result)
}
Thank you guys, all your answers inspired me. I think I found a solution.
log_return[i-1] <- func1(p[(i-1),"Price"],p[(i),"Price"])
Basically, you do not need a function for this calculation in R. R's vectorization comes in handy in cases like this:
data <- read.csv(file="E:/Lect-1-TradingTS.csv",header=TRUE)
mode(data)
p <- data[["Price"]]
logrets <- log(p[2:length(p)]) - log(p[1:(length(p)-1)])
This vectorized computation will usually also heavily outperform any function you define "by hand".
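The same differences can also be written with diff(). A small sketch with made-up prices (the vector p below is illustrative, not the data from the question):

```r
p <- c(100, 102, 101, 105)   # made-up prices
logrets <- diff(log(p))      # log(p[i]) - log(p[i-1]) for i = 2, ..., length(p)
logrets
```

diff() returns one element fewer than its input, which matches the number of returns you get from a price series.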

When passing a function as an argument to another function, how do I loop over that function?

I currently set up a function which takes another function as an argument. I am trying to create this function to loop over the function it took as an argument.
However, with the way I set it up, it seems like it only really runs the function once, as it should be dependent on a random seed but it keeps coming back with the exact same value.
The only way I found to solve this was to not have the function as an argument but to hard code it in to the function, but this is not really what I want to do.
Here is my code:
runModel <- function(model,runs){
b = c()
for (i in 1:runs){
a <- model$lambda.min
b <- append(b,a)
}
return (c(mean(b),sd(b)))
}
cvModel = cv.glmnet(predictors,outcome,family=c("binomial"),alpha=.9,nfolds=20)
runModel(cvModel,20)
I don't know what your objects 'predictors' and 'outcome' are, but it looks like you're passing the result of cv.glmnet rather than the function itself to runModel. The model is fitted once, before runModel is called, so model$lambda.min is the same value on every iteration.
One way to run the function on each iteration is to rewrite runModel so that it takes the function and its arguments, and calls the function inside the loop:
runModel <- function(fun,runs,predictors,outcome,family="binomial",alpha=.9,nfolds=20){
b = c()
for (i in 1:runs){
# refit on every iteration so that each run draws new random folds;
# the options are passed by name so they are not matched by position
model <- fun(predictors,outcome,family=family,alpha=alpha,nfolds=nfolds)
a <- model$lambda.min
b <- append(b,a)
}
return (c(mean(b),sd(b)))
}
runModel(cv.glmnet,20, predictors,outcome,family="binomial",alpha=.9,nfolds=20)
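More generally, the function has to be called inside the loop for each run to differ. A minimal, self-contained sketch of that pattern using ... to forward arbitrary arguments; noisy_fit is a hypothetical stand-in for a fit whose result depends on random sampling, not part of glmnet:

```r
runModel <- function(fun, runs, ...) {
  # Call fun() afresh on every iteration so each run draws new random numbers
  results <- vapply(seq_len(runs), function(i) fun(...), numeric(1))
  c(mean = mean(results), sd = sd(results))
}

# Hypothetical stand-in: returns a single number that varies from call to call
noisy_fit <- function(n) mean(rnorm(n))

set.seed(42)
res <- runModel(noisy_fit, runs = 20, n = 50)
res
```

With this signature, fun only needs to return a single number, so a wrapper that fits the model and extracts lambda.min could be passed in the same way.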

Use optimize() for a function which returns several values

Easier to ask by example. If I have a function
fn <- function(x) {
...
return(c(a,b,c))
}
and I wish to maximize (or minimize) with respect to a, but also get the values of b and c at the optimal value.
Of course I can use fn2 <- function(x) fn(x)[1] to determine the optimal value, then call the function again, but I wonder if there is a smarter way of doing this.
optim needs the return value to be a scalar. The documentation says so
fn: A function to be minimized (or maximized), with first
argument the vector of parameters over which minimization is
to take place. It should return a scalar result.
You could write the values of interest to a global variable inside your function though. This isn't necessarily best practice but it could work.
f <- function(x){
.vals <<- c(x, x+1)
x^2
}
optim(1, f)
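Afterwards we can look at what is stored in .vals. Note that it holds the values from the last call optim made to f, which is not guaranteed to be the call at the reported optimum: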
> .vals
[1] 9.765625e-05 1.000098e+00
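A variation on the same idea that avoids the global variable: wrap the function in a closure that remembers the best evaluation seen so far. This is only a sketch; make_objective and the toy fn below are made-up names, and it assumes the first element of the return value is the one to minimize.

```r
# Toy function returning several values; the first one is the objective
fn <- function(x) {
  c(a = (x - 2)^2, b = x + 1, c = 2 * x)
}

make_objective <- function(fn) {
  best <- list(value = Inf, x = NA, out = NULL)
  objective <- function(x) {
    out <- fn(x)
    # Keep the full output whenever the objective improves
    if (out[[1]] < best$value) {
      best <<- list(value = out[[1]], x = x, out = out)
    }
    out[[1]]
  }
  list(objective = objective, best = function() best)
}

tracker <- make_objective(fn)
opt <- optimize(tracker$objective, interval = c(-10, 10))
tracker$best()$out   # a, b and c at the best point evaluated
```

Here `<<-` updates best in the enclosing function's environment, so nothing leaks into the global workspace.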

how to "test" a function and repeat it if is not executed? in R

I have to run a function over a list of elements (in R). This function estimates the parameters of a model by maximum likelihood. To run the function, I need to give a set of initial parameter values, which then will be optimised by the algorithm.
Then here is my problem:
I want to randomise the initial parameter values every time I execute the function over each element of my list. I do it with the function runif(). HOWEVER, not all combinations of the initial parameters seem to work (it gives an error). In this case, I would like to execute again and again the function over that element of the list, trying different starting values, until I get the combination of initial parameter values that make the function work. Only then the loop can continue to the next element of the list.
Could you please give me some ideas of how can I tackle this problem?
Thank you very much in advance.
Tina.
Wrap the call that generates the error in try or tryCatch.
In pseudo code:
out <- vector(mode = "list", length = 10)
for (i in seq_along(out)) {
res <- try(...function call here...)
## res contains the actual error msg in object of class "try-error"
while(inherits(res, "try-error")) {
res <- try(...function call here...)
}
out[[i]] <- res
}
or using repeat
out <- vector(mode = "list", length = 10)
for (i in seq_along(out)) {
repeat {
res <- try(...function call here...)
if(!inherits(res, "try-error"))
break
}
out[[i]] <- res
}
You'll need to arrange for out to be the required length and type of object to hold the results, as you see fit. Whether you also need to record that the function failed will depend on what you are doing. Without an explicit example, that is the best you'll get.
Read ?try and also look at ?tryCatch; the latter is a more general mechanism for trapping errors. try itself is implemented using tryCatch, for example.
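For illustration, here is a runnable sketch of the retry idea using tryCatch, with random starting values from runif() and a cap on the number of attempts so the loop cannot run forever. fit_with_retry and toy_fit are hypothetical names; toy_fit just stands in for a likelihood fit that errors on some starting values.

```r
fit_with_retry <- function(fit_fun, draw_start, max_tries = 100) {
  for (attempt in seq_len(max_tries)) {
    start <- draw_start()                       # new random starting values
    res <- tryCatch(fit_fun(start), error = function(e) NULL)
    if (!is.null(res)) {
      return(list(result = res, start = start, attempts = attempt))
    }
  }
  stop("no starting values worked after ", max_tries, " attempts")
}

# Hypothetical fit that fails for some starting values
toy_fit <- function(start) {
  if (start < 0.7) stop("bad starting value")
  start^2
}

set.seed(1)
elements <- 1:3   # stands in for the list of elements to process
out <- lapply(elements, function(el) {
  fit_with_retry(toy_fit, function() runif(1))
})
```

This sketch treats a NULL result as a failure, so it assumes the fitting function never returns NULL on success.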
