fminsearch on a single variable - r

Using R's help page example on fminsearch as a starting point:
# Rosenbrock function
rosena <- function(x, a) 100*(x[2]-x[1]^2)^2 + (a-x[1])^2 # min: (a, a^2)
fminsearch(rosena, c(-1.2, 1), a = sqrt(2))
# x = (1.414214 2.000010) , fval = 1.239435e-11
I want to evaluate something like this but with only one variable such as:
rosena <- function(x, a) 100*(x[1]-x[1]^2)^2 + (a-x[1])^2
but when I run
fminsearch(rosena, c(1), a = sqrt(2))
It gives the error: Error in X[2:d1, ] : incorrect number of dimensions
fminsearch seems to require a vector of length 2 or more, but for this example the vector must have length 1.
Note: fminsearch is in the "pracma" package

It looks like a bug in the pracma package.
The internal anms function drops a dimension when subscripting a matrix. Relevant excerpts:
d <- length(x0) # i.e. 1
d1 <- d + 1 # i.e. 2
...
X <- matrix(0, nrow = d1, ncol = d)
...
X <- X[o, ] # could put drop = FALSE here
I think you should file a bug report with the package author.
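Until that is fixed, a possible workaround for the one-variable case (a sketch; rosena1 is just the objective above rewritten with a scalar argument) is base R's optimize(), which is designed for one-dimensional minimisation and needs no simplex:

rosena1 <- function(x, a) 100*(x - x^2)^2 + (a - x)^2
# optimize() passes extra named arguments (here a) through to the objective
optimize(rosena1, interval = c(-5, 5), a = sqrt(2))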

Apply function to cartesian product of numeric and function type

I have a function
eval_ = function(f, i) f(i)
for a list of functions, say
fns = list(function(x) x**2, function(y) -y)
and a vector of integers, say
is = 1:2
I would like to get eval_ evaluated at all combinations of fns and is.
I tried the following:
cross = expand.grid(fns, is)
names(cross) = c("f", "i")
results = sapply(1:nrow(cross), function(i) do.call(eval_, cross[i,]))
This throws an error:
Error in f(i) : could not find function "f"
I think the underlying problem is that cross is a data.frame, which cannot hold functions directly. Hence it wraps each function in a list and carries the list around (indeed, class(cross[1,][[1]]) yields "list"). My ugly hack is to change the third line to:
results = sapply(
  1:nrow(cross),
  function(i) do.call(eval_, list(f = cross[i, 1][[1]], i = cross[i, 2]))
)
results
#[1] 1 -1 4 -2
This works, but it defeats the purpose of do.call and is very cumbersome.
Is there a nice solution for this kind of problem?
Note: I would like a solution that generalizes well to cases where the cross product is not only over two, but possibly an arbitrary number of lists, e.g. functions that map R^n into R.
Edit:
For a more involved example, I think of the following:
fns = list(mean, sum, median)
is1 = c(1, 2, 4, 9), ..., isn = c(3,6,1,2) and my goal is to evaluate the functions on the cartesian product spanned by is1, ..., isn, e.g. on the n-dimensional vector c(4, ..., 6).
You can use mapply() for this:
eval_ <- function(f, i) f(i)
fns <- list(function(x) x**2, function(y) -y)
is <- 1:2
cross <- expand.grid(fns = fns, is = is)
cross$result <- mapply(eval_, cross$fns, cross$is)
print(cross)
#> fns is result
#> 1 function (x) , x^2 1 1
#> 2 function (y) , -y 1 -1
#> 3 function (x) , x^2 2 4
#> 4 function (y) , -y 2 -2
An attempt for my "more involved example" with n = 2.
Let X = expand.grid(c(1, 2, 4, 9), c(3,6,1,2)).
The following pattern generalizes to higher dimensions:
nfns <- length(fns)
nn <- nrow(X)
res <- array(0, c(nfns, nn))
for (i in 1:nfns) {
  res[i, ] <- apply(X, MARGIN = 1, FUN = fns[[i]])
}
The margin of X (i.e. nrow(X)) must correspond to the shape of the slice res[i,] (i.e. nn), and each function must map a row X[i,] to a scalar. For a function that is not scalar-valued, each scalar component would be handled the same way, i.e. with a further loop over the components.
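For what it's worth, here is a minimal sketch of how that pattern might generalize to an arbitrary number of input vectors (the names inputs, is1 and is2 are illustrative, not from the original):

fns <- list(mean, sum, median)
inputs <- list(is1 = c(1, 2, 4, 9), is2 = c(3, 6, 1, 2))  # any number of vectors
X <- do.call(expand.grid, inputs)  # cartesian product, one row per combination
# apply each scalar-valued function to every row of the grid
res <- sapply(fns, function(f) apply(X, MARGIN = 1, FUN = f))
dim(res)  # nrow(X) combinations by length(fns) functions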

plot coordinates from two vectors into a 0,1 matrix

I have two vectors of equal length
a <- 1:10
b <- sample.int(10,size=10)
I would like to plot them into a matrix of the same length (10) where a is the row coordinate, b the column coordinate, with the value 1 for the coordinates and 0 for everything else. I have below a way to do this using a for loop, but was hoping to do this without a loop. Thanks!
matrix01 <- matrix(0, nrow = 10, ncol = 10)
for (i in 1:10) {
  matrix01[i, b[i]] <- 1
}
1) Use replace and cbind like this:
replace(matrix(0, 10, 10), cbind(a, b), 1)
2) In the question a equals 1:10, and in that special case (but not more generally) another possibility is the following (the leading + coerces the logical matrix to numeric 0/1):
+ outer(b, 1:10, `==`)
or
+ sapply(1:10, `==`, b)
3) In the question a is 1:10 and b is a permutation of 1:10. In that special case (but not more generally) this works:
diag(10)[b, ]
4) In the question both a and b are permutations of 1:10, and in that special case (but not more generally) this works to give a table:
table(a, b)
This would also work and is similar:
xtabs(~ a + b)
Note
Note that the question:
- uses seq(1:10), which should be just 1:10
- needs to add set.seed to make the input reproducible
- defines a but then does not use it in the code
You can create a row/column index matrix using cbind and assign the value 1 to those positions:
matrix01[cbind(a, b)] <- 1
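Putting the pieces together, a minimal self-contained version of this answer (set.seed() added here only to make the sample reproducible):

set.seed(1)  # for reproducibility
a <- 1:10
b <- sample.int(10, size = 10)
matrix01 <- matrix(0, nrow = 10, ncol = 10)
matrix01[cbind(a, b)] <- 1  # one vectorized assignment via matrix indexing, no loop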

Error number of items to replace is not a multiple of replacement length

I realize this error is touched on in other posts, but I still can't figure out how it applies to my particular situation. I have the following code.
myfun <- function(x, g, o){
  y <- x
  fs <- ((g-1)/o) * (o*g/((g-1)*(1+o)))^g
  xb <- o/(g-1)
  y[x>=xb] <- ((x+o)/(1+o))^g
  y[x<xb] <- x*fs
  return(y)
}
x <- seq(0,1,length=5)
y <- myfun(x, 1.5, 0.05)
My code is returning the following errors.
Warning messages:
1: In y[y >= xb] <- ((x + o)/(1 + o))^g :
number of items to replace is not a multiple of replacement length
2: In y[y < xb] <- x * fs :
number of items to replace is not a multiple of replacement length
In addition the results seem to be incorrect.
I expect
y =
0 0.152720709664243 0.379105500429200 0.665044998814453 1
but get :
y =
[1] 0.00000000 0.01039133 0.15272071 0.37910550 0.66504500
This leads me to believe I'm doing something incorrect in my indexing, or there's something going on with the math on the vector x. Any help would be much appreciated.
By construction, x is of length 5, such that y and ((x+o)/(1+o))^g are of length 5 as well.
However, the test x>=xb is only true for 4 elements out of 5, so y[x>=xb] is 4 elements long. Your assignment y[x>=xb] <- ((x+o)/(1+o))^g therefore warns because the two sides do not have the same length.
I guess what you want to do is something like
y[x>=xb] <- ((x[x>=xb]+o)/(1+o))^g
y[x<xb] <- x[x<xb]*fs
I get
>y
[1] 0.0000000 0.1527207 0.3791055 0.6650450 1.0000000
which is close to what you want; I'll let you figure out the rest.
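Alternatively (a sketch, not from the original answer), ifelse() sidesteps the length mismatch entirely: it evaluates both expressions over the full vector and picks elementwise, so no explicit subsetting is needed:

myfun2 <- function(x, g, o) {
  fs <- ((g-1)/o) * (o*g/((g-1)*(1+o)))^g
  xb <- o/(g-1)
  # both branches have length(x); ifelse() selects per element
  ifelse(x >= xb, ((x + o)/(1 + o))^g, x * fs)
}
myfun2(seq(0, 1, length = 5), 1.5, 0.05)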

Using standard evaluation and do_ to run simulations on a grid of parameters without do.call

Goals
I want to use dplyr to run simulations on grids of parameters. Specifically, I'd like a function that I can use in another program that:
- gets passed a data.frame
- for every row, calculates some simulation using each column as an argument
- is also passed some extra data (e.g., initial conditions)
Here's my approach
require(dplyr)
run <- function(data, fun, fixed_parameters, ...) {
  ## ....
  ## argument checking
  ##
  fixed_parameters <- as.environment(fixed_parameters)
  grouped_out <- do_(rowwise(data), ~ do.call(fun, c(., fixed_parameters, ...)))
  ungroup(grouped_out)
}
This works. For example, for
growth <- function(n, r, K, b) {
  # some dynamical simulation
  # this is an obviously-inefficient way to do this ;)
  n + r - exp(n) / K - b - rnorm(1, 0, 0.1)
}
growth_runner <- function(r, K, b, ic, ...) {
  # a wrapper to run the simulation with some fixed values
  n0 <- ic$N0
  T <- ic$T
  reps <- ic$reps
  data.frame(n_final = replicate(reps, {
    for (t in 1:T) {
      n0 <- growth(n0, r, K, b)
    }
    n0
  }))
}
I can define and run,
data <- expand.grid(b = seq(0.01, 0.5, length.out = 10),
                    K = exp(seq(0.1, 5, length.out = 10)),
                    r = seq(0.5, 3.5, length.out = 10))
initial_data <- list(N0 = 0.9, T = 5, reps = 20)
output <- run(data, growth_runner, initial_data)
Question
Even though this seems to work, I wonder if there's a way to do it without do.call. (In part because of issues with do.call.)
I really am interested in a way to replace the line grouped_out <- do_(rowwise(data), ~ do.call(fun, c(., fixed_parameters, ...))) with something that does the same thing but without do.call. Edit: An approach that somehow avoids the performance penalties of using do.call outlined at the above link would also work.
Notes and References
- this question on do.call and standard evaluation in dplyr is helpful, but I'm looking for a way to avoid do.call if possible
- dplyr's nse vignette was helpful in writing this, and makes me think .values could work in place of do.call
I found it a little tricky to follow your code, but I think this is equivalent.
First I define a function that does the computation you're interested in:
growth_t <- function(n0, r, K, b, T) {
  n <- n0
  for (t in 1:T) {
    n <- n + r - exp(n) / K - b - rnorm(1, 0, 0.1)
  }
  n
}
Then I define the data that you want to vary, including a "dummy" variable for reps:
data <- expand.grid(
  b = seq(0.01, 0.5, length.out = 5),
  K = exp(seq(0.1, 5, length.out = 5)),
  r = seq(0.5, 3.5, length.out = 5),
  rep = 1:20
)
Then I can feed it into purrr::pmap_dbl(). pmap_dbl() does a "parallel" map - i.e. it takes a list (or data frame) as input, and calls the function varying all the named arguments for each iteration. The fixed parameters are supplied after the function name.
library(purrr)
data$output <- pmap_dbl(data[1:3], growth_t, n0 = 0.9, T = 5)
This really doesn't feel like a dplyr problem to me, because it's not really about data manipulation.
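That said, once the simulations are in the data frame, summarising the 20 replicates per parameter combination is ordinary data manipulation (an illustrative follow-up, assuming the data and output columns above):

library(dplyr)
data %>%
  group_by(b, K, r) %>%
  summarise(mean_n_final = mean(output))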
The below avoids using do.call and presents the output in the same way as the OP.
First, replace the individual parameters of the function with a single vector argument - this is the row that apply will pass through.
growth_runner <- function(data.in, ic, ...) {
  # a wrapper to run the simulation with some fixed values
  n0 <- ic$N0
  T <- ic$T
  reps <- ic$reps
  data.frame(n_final = replicate(reps, {
    for (t in 1:T) {
      n0 <- growth(n0, data.in[3], data.in[2], data.in[1])
    }
    n0
  }))
}
Set up the grid you want to search over, just as you did before.
data <- expand.grid(b = seq(0.01, 0.5, length.out = 10),
                    K = exp(seq(0.1, 5, length.out = 10)),
                    r = seq(0.5, 3.5, length.out = 10))
initial_data <- list(N0 = 0.9, T = 5, reps = 20)
Use apply to go through the rows of your grid, then collect the results:
output.mid <- apply(data, 1, ic = initial_data, FUN = growth_runner)
output <- data.frame(n_final = unlist(output.mid))
And you have your output without any calls to do.call or any external library. (One caveat: apply coerces its input to a matrix, so this pattern assumes all columns of data are numeric.)
> dim(output)
[1] 20000 1
> head(output)
n_final
1 -0.6375070
2 -0.7617193
3 -0.3266347
4 -0.7921655
5 -0.5874983
6 -0.4083613
You can replace the line with do.call with the following (thanks to @shorpy for pointing out purrr::invoke_rows()):
grouped_out <- purrr::invoke_rows(fun, dplyr::rowwise(data), fixed_parameters)
Without any other changes, this gives a data frame with a column of data.frames, like:
Source: local data frame [1,000 x 4]
b K r .out
(dbl) (dbl) (dbl) (chr)
1 0.01000000 1.105171 0.5 <data.frame [20,1]>
2 0.06444444 1.105171 0.5 <data.frame [20,1]>
3 0.11888889 1.105171 0.5 <data.frame [20,1]>
To recover something closer to the original behavior, replace the final line of run with
dplyr::ungroup(tidyr::unnest(grouped_out, .out))
which gives
Source: local data frame [20,000 x 4]
b K r n_final
(dbl) (dbl) (dbl) (dbl)
1 0.01 1.105171 0.5 -0.6745470
2 0.01 1.105171 0.5 -0.7500365
3 0.01 1.105171 0.5 -0.6568312
No other changes to the code are needed :)
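A version note (my addition, not from the original answer): invoke_rows was later moved out of purrr into the purrrlyr package, so on newer setups the same call would be:

grouped_out <- purrrlyr::invoke_rows(fun, dplyr::rowwise(data), fixed_parameters)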

Avoiding empty and small groups when using pretty_breaks with cut2

I'm working with variables resembling the data val values created below:
# data --------------------------------------------------------------------
data("mtcars")
val <- c(mtcars$wt, 10.55)
I'm cutting this variable in the following manner:
# Cuts --------------------------------------------------------------------
cut_breaks <- pretty_breaks(n = 10, eps.correct = 0)(val)
res <- cut2(x = val, cuts = cut_breaks)
which produces the following results:
> table(res)
res
[ 1, 2) [ 2, 3) [ 3, 4) [ 4, 5) [ 5, 6) 6 7 8 9 [10,11]
4 8 16 1 3 0 0 0 0 1
In the created output I would like to change the following:
- I'm not interested in creating groups with one value. Ideally, each group would have at least 3-4 values. Paradoxically, I can live with groups having 0 values, as those will be dropped later on when merging on my real data.
- Any changes to the cutting mechanism have to work on a variable with integer values.
- The cuts have to be pretty. I'm trying to avoid something like 1.23 - 2.35, even if those values would be most sensible considering the distribution.
In effect, what I'm trying to achieve is this: make more or less even, pretty groups; if a really tiny group turns up, bump it together with the next group; do not worry about empty groups.
Full code
For convenience, the full code is available below:
# Libs --------------------------------------------------------------------
Vectorize(require)(package = c("scales", "Hmisc"),
character.only = TRUE)
# data --------------------------------------------------------------------
data("mtcars") val <- c(mtcars$wt, 10.55)
# Cuts --------------------------------------------------------------------
cut_breaks <- pretty_breaks(n = 10, eps.correct = 0)(val)
res <- cut2(x = val, cuts = cut_breaks)
What I've tried
First approach
I tried playing with the eps.correct argument of pretty_breaks, as in:
cut_breaks <- pretty_breaks(n = cuts, eps.correct = 0)(variable)
but none of the values got me anywhere close.
Second approach
I've also tried using the m = 5 argument in the cut2 function, but I keep arriving at the same result.
Comment replies
My breaks function
I tried the mybreaks function, but I would have to put some work into it to get nice cuts for more bizarre variables. Broadly speaking, pretty_breaks cuts well for me; it is just the tiny groups that occur from time to time that are not desired.
> set.seed(1); require(scales)
> mybreaks <- function(x, n, r=0) {
+ unique(round(quantile(x, seq(0, 1, length=n+1)), r))
+ }
> x <- runif(n = 100)
> pretty_breaks(n = 5)(x)
[1] 0.0 0.2 0.4 0.6 0.8 1.0
> mybreaks(x = x, n = 5)
[1] 0 1
You could use the quantile() function as a relatively easy way to get similar numbers of observations in each of your groups.
For example, here's a function that takes a vector of values x, a desired number of groups n, and a desired rounding off point r for the breaks, and gives you suggested cut points.
mybreaks <- function(x, n, r=0) {
  unique(round(quantile(x, seq(0, 1, length=n+1)), r))
}
cut_breaks <- mybreaks(val, 5)
res <- cut(val, cut_breaks, include.lowest=TRUE)
table(res)
[2,3] (3,4] (4,11]
8 16 5
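To get the "bump a tiny group into the next one" behaviour from the question while keeping the breaks pretty, one option is to post-process the pretty breaks, dropping interior breaks until no non-empty bin is smaller than a threshold. A sketch (merge_small_bins and min_n are illustrative names, not from any package; val and pretty_breaks are as above):

merge_small_bins <- function(x, breaks, min_n = 3) {
  repeat {
    counts <- as.vector(table(cut(x, breaks, include.lowest = TRUE)))
    small <- which(counts > 0 & counts < min_n)
    if (length(small) == 0 || length(breaks) <= 2) break
    i <- small[1]
    # drop the break between bin i and a neighbour, merging the two;
    # the surviving breaks are a subset of the original pretty breaks
    breaks <- breaks[-(if (i == length(counts)) i else i + 1)]
  }
  breaks
}
cut_breaks <- merge_small_bins(val, pretty_breaks(n = 10, eps.correct = 0)(val))
res <- cut2(x = val, cuts = cut_breaks)

Empty bins are left alone (you said you can live with those), and because breaks are only removed, never invented, the remaining cut points stay pretty.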
