Optimize function in R with a boundary value

I have the following function and want to find $x$ satisfying this requirement:
$$\frac{X^{2}(1.5)^{2}\exp(1.5X)}{2} < 1$$
I wrote the following R function:
f <- function(X) 0.5*X^2 * 1.5^2 * exp(X*1.5) < 1
optimize(f, c(0, 1))
But it is giving me an error. I want to find the value of X satisfying the requirement. Thank you for the help.

If we define ff as
ff <- function(X) 0.5*X^2 * 1.5^2 * exp(X*1.5)
then graphing it
curve(ff)
we see that ff(0) = 0 and that ff(x) is monotonically increasing in x. The largest value of x for which ff(x) <= 1 can therefore be found by solving ff(x) = 1; squaring the residual turns this root-finding problem into a minimization, so the solution is the minimum of g:
g <- function(x) (ff(x) - 1)^2
optimize(g, c(0, 1))
giving:
$minimum
[1] 0.6008074
$objective
[1] 1.058761e-09
Thus any value of x between 0 and 0.6008074 gives a value of ff in the closed interval [0, 1].
# create graph
curve(ff)
opt <- optimize(g, c(0, 1))
abline(h = 0:1)
abline(v = c(0, opt$minimum))
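Since this boils down to solving ff(x) = 1, base R's uniroot gives the same boundary more directly (a minimal sketch reusing ff from above; uniroot needs a sign change over the interval, and ff(x) - 1 changes sign on [0, 1]):
# root of ff(x) - 1 = 0; ff(0) - 1 < 0 and ff(1) - 1 > 0
uniroot(function(x) ff(x) - 1, c(0, 1))$root
# approximately 0.6008, matching the optimize result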

Related

Is there an R function to derive a "kink"?

Suppose I have a function with a kink. I want to derive the kink point, which in this case is 0.314. I tried optim, but it does not work.
Here is an example. In general, I want to derive c. Of course, I could use brute force, but that is slow.
# function with a kink
f <- function(x, c) {
  (x >= 0 & x < c) * 0 +
    (x >= c & x <= 1) * sin(3 * (x - c)) +
    (x < 0 | x > 1) * 100
}
# plot
x_vec <- seq(0, 1, .01)
plot(x_vec, f(x_vec, c = pi/10), type = "l")
# does not work
optim(.4, f, c = pi/10)
This function has no unique minimum, so optim has nothing to find.
The trick is to transform the function slightly so that the kink becomes a unique minimum: subtracting x makes the flat piece on [0, c) strictly decreasing, so the minimizer is exactly the kink point c.
g <- function (x, c) f(x, c) - x
x_vec <- seq(0, 1, 0.01)
plot(x_vec, g(x_vec, c = pi/10), type = "l")
# now works
optim(0.4, g, c = pi/10, method = "BFGS")
#$par
#[1] 0.3140978
#
#$value
#[1] -0.3140978
#
#$counts
#function gradient
# 34 5
#
#$convergence
#[1] 0
#
#$message
#NULL
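For a one-dimensional problem like this, optimize can be used as well (a sketch; g as defined above). g is not strictly unimodal on [0, 1], but the basin around the kink dominates the interval, so the golden-section search converges to it here:
optimize(function(x) g(x, c = pi/10), c(0, 1))
# $minimum is approximately 0.314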
Note:
In mathematics, if we want to find something, we must first define it precisely. So what exactly is a "kink"? In this example you refer to the parameter c = pi/10, but what is it in general? Without a clear definition, there is no algorithm/function to compute it.
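That said, one common working definition of a kink is a jump in the first derivative. Under that assumption, a grid-based sketch like the following (the helper find_kink, the grid size, and the use of the largest slope jump are my own choices) locates the kink up to grid resolution:
find_kink <- function(f, lower, upper, n = 1000, ...) {
  x <- seq(lower, upper, length.out = n)
  slope <- diff(f(x, ...)) / diff(x) # finite-difference approximation of f'
  jump <- abs(diff(slope))           # change in slope between adjacent grid steps
  x[which.max(jump) + 1]             # grid point nearest the largest jump
}
find_kink(f, 0, 1, c = pi/10)
# approximately 0.314, up to the grid resolution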

How to integrate a function when the lower limit is unknown?

I use RStudio at university, and I got this task where I need help:
Response times of people were measured, and the following density function was found:
f(x) = 0.62 * (1/x)
Only positive reaction times between 1 and 5 seconds were measured.
In which interval [c, 5] do the top 30 percent of the response times fall? Calculate c.
Normally I would integrate in this way:
integrand <- function(x) { 0.62 * (1/x) }
integrate(integrand, lower = , upper = 5)
But as you can see, I have the problem that the lower limit is unknown. How can I find this unknown lower limit (c)?
Are you looking for this?
f <- function(z) integrate(function(x) 0.62 / x, z, 5)$value - 0.3
res <- uniroot(f, c(1, 5))$root
then we have
> res
[1] 3.081973
> integrate(function(x) 0.62 / x, res, 5)$value
[1] 0.2999982
Here is another way; the result differs slightly from the one in ThomasIsCoding's answer, because that answer solves P(c <= X <= 5) = 0.3 directly while this one solves P(1 <= X <= c) = 0.7, and with the rounded constant 0.62 the total probability on [1, 5] is not exactly 1, so the two conditions are not quite equivalent.
pdf <- function(x) {
  0.62 / x
}
cdf <- function(x) {
  integrate(pdf, lower = 1, upper = x)$value
}
u <- uniroot(function(x) cdf(x) - 0.7, c(1, 5))
u$root
#[1] 3.092671
1 - cdf(u$root)
#[1] 0.2999982
But if the exact normalizing constant 1/log(5) is used instead of the rounded value 0.62 (the density must integrate to 1 over [1, 5]), the result gets closer to the exact answer.
pdf <- function(x) {
  1 / log(5) / x
}
# Same cdf
u <- uniroot(function(x) cdf(x) - 0.7, c(1, 5))
u$root
#[1] 3.085178
1 - cdf(u$root)
#[1] 0.2999982
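For reference, with the exact density the answer has a closed form: the CDF is $F(x) = \log(x)/\log(5)$ for $1 \le x \le 5$, so solving $F(c) = 0.7$ gives
$$c = 5^{0.7} \approx 3.08517,$$
which matches the numeric result above.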

no sign change found in 1000 iterations in R

My question is: "Find the maximum of the function myfun = -(sin(x) - 3)^2 + 1 on the interval (0, 5), and report x and f(x)."
Here is my code in R:
f <- function(x) { return((-1*sin(x)-3)^2+1 }
result <- uniroot(f, c(0, 5), extendInt = "yes")
result$root
result$f.root
but the console shows:
Error in uniroot(f, c(0, 5), extendInt = "yes") :
  no sign change found in 1000 iterations
What's wrong with my code?
Thanks a lot.
optimize is the standard function for finding the max or min of a one-dimensional function. uniroot is used for finding a root (zero) of a function, not max or min values.
optimize(f, interval = c(0, 5), maximum = TRUE)
$maximum
[1] 1.570796
$objective
[1] 17
See ?optimize for examples and details.
(Note: I added a ) to the f in your question to avoid syntax errors.)
The question uses the wrong base R function. From the documentation:
uniroot
The function uniroot searches the interval from lower to upper for a
root (i.e., zero) of the function f with respect to its first
argument.
optimize
The function optimize searches the interval from lower to upper for
a minimum or maximum of the function f with respect to its first
argument.
Code
f <- function(x) { (-1*sin(x) - 3)^2 + 1 }
m <- optimize(f, c(0, 5), maximum = TRUE)
m
#$maximum
#[1] 1.570796
#
#$objective
#[1] 17
curve(f, 0, 5)
points(m$maximum, m$objective, pch = 16, col = "red")
Also, the function f is identical to
g <- function(x) { (sin(x) + 3)^2 + 1 }
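A quick numeric check of that identity (a sketch):
xx <- seq(0, 5, 0.01)
all.equal(f(xx), g(xx))
#[1] TRUE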
For your purpose, you should use optim() or optimize() instead of uniroot(), i.e.:
Given f <- function(x) -(sin(x)-3)^2+1 (the objective function in your code is not the one you described at the beginning of your post), you will get the result via
> optim(0, f, method = "L-BFGS-B", lower = 0, upper = 5, control = list(fnscale = -1))
$par
[1] 1.570796
$value
[1] -3
$counts
function gradient
7 7
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
or
> optimize(f, interval = c(0, 5), maximum = TRUE)
$maximum
[1] 1.570796
$objective
[1] -3
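As a sanity check: since $\sin(x) \le 1$, we have $(\sin(x) - 3)^2 \ge 4$ and hence $f(x) \le -3$, with equality exactly where $\sin(x) = 1$, i.e. at $x = \pi/2 \approx 1.5708$ on $(0, 5)$. This matches both results above.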

How to efficiently do complex row operations with nested functions in R?

Given a multidimensional array, e.g. a zoo object z with columns a, b, c, x, and a function W(w = c(1,1,1), x) which, for example, weights every column individually but also DEPENDS on the row's value in column x: how can row operations, e.g. calculating the rowWeightedMeans, be done efficiently here?
It is known that the zoo package is very fast and efficient for row operations if the function is simple, e.g.:
W <- function(w) { return(w); }
z[,"wmean"] <- rowWeightedMeans(z[,1:3], w=W(c(0.1,0.5,0.3)))
But what if W() depends on a value in that row? E.g.:
W <- function(w, x) { return(w*x); }
z[,"wmean"] <- rowWeightedMeans(z[,1:3], w=W(c(0.1,0.5,0.3), z[,4]))
R complains here because it does not know how to handle the multi-dimensional arguments in the nested function.
The solution could be a for(i in 1:nrow(z)) loop, computing the values individually for every row i. However, for large data sets this takes an enormous amount of extra computation time.
EDIT
Ok guys, thanks for your time and the critiques. I tried and tested all your answers but must admit that the actual problem was neither solved nor understood. For example, I hadn't asked for my weight function or calculations to be rewritten, because what I presented was already a minimal version of much more complex calculations. The issue here lies much deeper. So I sat back, boiled the problem down to its root, and found a minimal working example for you without any zoos, weightedMeans, and so on. Here you go:
z <- data.frame(matrix(1:20, nrow = 4))
colnames(z) <- c("a", "b", "c", "x", "y")
z
# a b c x y
#1 1 5 9 13 17
#2 2 6 10 14 18
#3 3 7 11 15 19
#4 4 8 12 16 20
W <- function(abc, w, p) {
  ifelse(w[1] == p, return(length(p)), return(0))
  # Please do not complain! I know this is stupid, but it is an MWE
  # and my calculations contained in W() are much more complex!
}
z[, "y"] <- W(z[, 1:3], c(14, 7, 8), z[, "x"])
# same result: z[, "y"] <- apply(z[, 1:3], 1, W, c(14, 7, 8), z[, "x"])
z
# a b c x y
#1 1 5 9 13 4
#2 2 6 10 14 4
#3 3 7 11 15 4
#4 4 8 12 16 4
# expected outcome:
# a b c x y
#1 1 5 9 13 0
#2 2 6 10 14 4
#3 3 7 11 15 0
#4 4 8 12 16 0
The problem I am facing is that R passes the whole column z[,"x"] to the function, whereas I expect it to receive only the entry corresponding to the row of z[,"y"] that is currently being processed. In this example, I expect 14 == 14 only in row 2!
So: how do I tell R to pass arguments to functions row by row?
SOLUTION
Besides the awarded and accepted answer, I would like to summarize the solution here to improve clarity and provide a better overview of the discussion.
This question was not about rewriting the specific function W (e.g. the weighting). It was only about R's inability to pass multiple row-by-row arguments to a general function. With either z$y <- f(z$a, z$x) or z$y <- apply(z$a, 1, f, z$x), only the first argument is passed row by row; the second argument arrives as a complete column with all rows. It seems that this is intrinsic behaviour of R which we need to work around.
To solve this, the whole row needs to be passed as a single argument to a wrapper function, which in turn then applies the specific calculations on that row. Solution for the problem with the weights:
f <- function(x) weighted.mean(x[1:3], W(c(0.1,0.5,0.3), x[4]))
z[,"wmean"] <- apply(z[,1:4], 1, f)
Solution for the general problem with the data frame:
f <- function(x) W(x[1:3], c(14,7,8), x[4])
z$y <- apply(z, 1, f)
Brian also presents even faster methods using compiled C++ code in his accepted answer. Thanks to #BrianAlbertMonroe, #jaimedash and #inscaven for dealing with the poorly clarified question and for hinting at this solution.
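As a further base R alternative (my own sketch, not from the original discussion): mapply() and Map() iterate over several arguments in parallel, so each call receives one row's values at a time. With a hypothetical scalar function g(a, x):
g <- function(a, x) if (x == 14) a else 0 # hypothetical per-row function
z$y <- mapply(g, z$a, z$x)                # each call sees one row's a and one row's x
Note that each call then receives scalars, so functions written for whole columns may need adjusting.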
I haven't really worked with zoo or rowWeightedMeans, but if you simply apply weights to the row elements before taking their mean, and require the weights to depend on one of the elements of the row:
z <- matrix(rnorm(100), ncol = 4)
W <- function(row, weights) {
  weights <- weights * row[4]
  row2 <- row[1:3] * weights
  sum(row2) / sum(weights)
}
w.means <- apply(z, 1, W, weights = c(0.1, 0.5, 0.3))
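Incidentally (my own observation, not part of the original answer): because W multiplies every weight by the same row value row[4], that common factor cancels in the weighted mean, so for this particular W the result equals a plain weighted mean:
# the common factor row[4] cancels: sum(r*w*x)/sum(w*x) == sum(r*w)/sum(w)
all.equal(w.means, apply(z[, 1:3], 1, weighted.mean, w = c(0.1, 0.5, 0.3)))
# should be TRUE (up to floating-point tolerance)
A weight function where x enters non-uniformly would not cancel this way.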
If the above gives the correct answer but you're worried about speed, write the W function in Rcpp or use the built-in compiler::cmpfun:
N <- 10000
z <- matrix(rnorm(N),ncol=4)
# Interpreted R function
W1 <- function(row, weights) {
  weights <- weights * row[4]
  row2 <- row[1:3] * weights
  sum(row2) / sum(weights) # note: sum/sum, matching W above and Wcpp below
}
# Compiled R function
W2 <- compiler::cmpfun(W1)
# C++ function imported into R via Rcpp
Rcpp::cppFunction('double Wcpp(NumericVector row, NumericVector weights){
  int x = row.size();
  NumericVector wrow(x - 1);
  NumericVector nweights(x - 1);
  nweights = weights * row[x - 1];
  for (int i = 0; i < (x - 1); i++) {
    wrow[i] = row[i] * nweights[i];
  }
  double res = sum(wrow) / sum(nweights);
  return res;
}')
w.means0 <- apply(z, 1, W1, weights = c(0.1, 0.5, 0.3))
w.means1 <- apply(z, 1, W2, weights = c(0.1, 0.5, 0.3))
w.means2 <- apply(z, 1, Wcpp, weights = c(0.1, 0.5, 0.3))
identical(w.means0, w.means1) && isTRUE(all.equal(w.means1, w.means2))
#[1] TRUE
Or
# Write the whole thing in C++
Rcpp::cppFunction('NumericVector WM(NumericMatrix z, NumericVector weights){
  int x = z.ncol();
  int y = z.nrow();
  NumericVector res(y);
  NumericVector wrow(x - 1);
  NumericVector nweights(x - 1);
  double nwsum;
  double mult;
  for (int row = 0; row < y; row++) {
    mult = z(row, x - 1);
    nweights = weights * mult;
    nwsum = sum(nweights);
    for (int i = 0; i < (x - 1); i++) {
      wrow[i] = z(row, i) * nweights[i];
    }
    res[row] = sum(wrow) / nwsum;
  }
  return res;
}')
microbenchmark::microbenchmark(
  w.means0 <- apply(z, 1, W1, weights = c(0.1, 0.5, 0.3)),
  w.means1 <- apply(z, 1, W2, weights = c(0.1, 0.5, 0.3)),
  w.means2 <- apply(z, 1, Wcpp, weights = c(0.1, 0.5, 0.3)),
  w.means3 <- WM(z = z, weights = c(0.1, 0.5, 0.3))
)
Unit: microseconds
expr min lq mean median uq max neval
w.means0 <- apply(z, 1, W1, weights = c(0.1, 0.5, 0.3)) 12114.834 12536.9330 12995.1722 12838.2805 13163.4835 15796.403 100
w.means1 <- apply(z, 1, W2, weights = c(0.1, 0.5, 0.3)) 9941.571 10286.8085 10769.7330 10410.9465 10788.6800 19526.840 100
w.means2 <- apply(z, 1, Wcpp, weights = c(0.1, 0.5, 0.3)) 10919.112 11631.5530 12849.7294 13262.9705 13707.7465 17438.524 100
w.means3 <- WM(z = z, weights = c(0.1, 0.5, 0.3)) 94.172 107.9855 146.2606 125.0075 140.2695 2089.933 100
EDIT:
Incorporating the weighted.mean function slows the computation down dramatically, and according to the help file it does not handle missing values specially, so you will still need to write code to manage them.
> z <- matrix(rnorm(100),ncol=4)
> W <- function(row, weights){
+ weights <- weights * row[4]
+ row2 <- row[1:3] * weights
+ sum(row2) / sum(weights)
+
+ }
> W1 <- compiler::cmpfun(W)
> W2 <- function(row, weights){
+ weights <- weights * row[4]
+ weighted.mean(row[1:3],weights)
+ }
> W3 <- compiler::cmpfun(W2)
> w.means1 <- apply(z, 1, W, weights = c(0.1, 0.5, 0.3))
> w.means2 <- apply(z, 1, W2, weights = c(0.1, 0.5, 0.3))
> identical(w.means1,w.means2)
[1] TRUE
> microbenchmark(
+ w.means1 <- apply(z, 1, W, weights = c(0.1, 0.5, 0.3)),
+ w.means1 <- apply(z, 1, W1, weights = c(0.1, 0.5, 0.3)),
+ w.means2 < .... [TRUNCATED]
Unit: microseconds
expr min lq mean median uq max neval
w.means1 <- apply(z, 1, W, weights = c(0.1, 0.5, 0.3)) 145.315 167.4550 172.8163 172.9120 180.6920 194.673 100
w.means1 <- apply(z, 1, W1, weights = c(0.1, 0.5, 0.3)) 124.087 134.3365 143.6803 137.8925 148.7145 225.459 100
w.means2 <- apply(z, 1, W2, weights = c(0.1, 0.5, 0.3)) 307.311 346.6320 356.4845 354.7325 371.7620 412.110 100
w.means2 <- apply(z, 1, W3, weights = c(0.1, 0.5, 0.3)) 280.073 308.7110 323.0156 324.1230 333.7305 407.963 100
Here's a solution with zoo::rollapply. It produces the same answer as matrixStats::rowWeightedMeans for the simpler case.
if (!require(matrixStats)) {
  install.packages('matrixStats')
  library(matrixStats)
}
if (!require(zoo)) {
  install.packages('zoo')
  library(zoo)
}
z <- zoo(matrix(1:20, nrow = 5))
colnames(z) <- c("a", "b", "c", "x")
z$x <- 0 # so we can see an effect below...
z
## a b c x
## 1 1 6 11 0
## 2 2 7 12 0
## 3 3 8 13 0
## 4 4 9 14 0
## 5 5 10 15 0
weights <- c(0.1,0.5,0.3)
W <- function (w) { return(w); }
z$wmean <- rowWeightedMeans(z[,1:3], w=W(weights))
## z[, new] <- doesn't work to create new columns in zoo objects; use $
rowWeightMean_zoo <- function(r, W, weights) {
  s <- sum(W(weights))
  return(sum(r[1:3] * W(weights) / s))
}
z$wmean_zoo <- rollapply(z, width = 1, by.column = FALSE,
                         function(r) rowWeightMean_zoo(r, W, weights))
z
For the requirement in the question, that the return value depend on some ancillary data in the row, rowWeightedMeans doesn't work. But the function passed to rollapply can be modified to use other elements of the row.
W2 <- function (w, x) { return(w * x); }
# z$wmean2 <- rowWeightedMeans(z[,1:3], w=W2(c(0.1,0.5,0.3), z[,4]))
## doesn't work
## Error in rowWeightedMeans(z[, 1:3], w = W2(c(0.1, 0.5, 0.3), z[, 4])) :
##   The length of argument 'w' does not match the number of columns in 'x': 5 != 3
## In addition: Warning message:
## In `*.default`(w, x) :
## longer object length is not a multiple of shorter object length
## Calls: rowWeightedMeans -> W -> Ops.zoo -> NextMethod
rowWeightMean_zoo_dependent <- function(r, W, weights) {
  s <- sum(W(weights, r[4]))
  return(sum(r[1:3] * W(weights, r[4]) / s))
}
z$wmean2_zoo <- rollapply(z, width = 1, by.column = FALSE,
                          function(r) rowWeightMean_zoo_dependent(r, W2, weights))
z
## a b c x wmean wmean_zoo wmean2_zoo
## 1 1 6 11 0 7.111111 7.111111 NaN
## 2 2 7 12 0 8.111111 8.111111 NaN
## 3 3 8 13 0 9.111111 9.111111 NaN
## 4 4 9 14 0 10.111111 10.111111 NaN
## 5 5 10 15 0 11.111111 11.111111 NaN
(The NaNs arise because x was set to 0 above, so all effective weights w * x are zero and the weighted mean becomes 0/0. For nonzero x the common factor x cancels, and wmean2_zoo equals wmean.)
I think this can be solved by clever reshaping. I would use dplyr for that, but the workflow should work similarly with plyr or data.table; all these packages are heavily optimized.
For this example I assume the weight function is w(x) = w0^x.
Here I create some sample data z and generic weights w (note that I add a row number r to z):
library(dplyr)
library(tidyr)
N <- 10
z <- data.frame(r=1:N, a=rnorm(N), b=rnorm(N), c=rnorm(N), x=rpois(N, 5))
w <- data.frame(key=c('a','b','c'), weight=c(0.1,0.5,0.3))
Now the calculation would be:
res <- z %>% gather(key, value, -r, -x) %>% # convert to long format, keeping row numbers and x
  left_join(w, 'key') %>%                   # add generic weights
  mutate(eff_weight = weight^x) %>%         # calculate effective weights
  group_by(r) %>%                           # group by the original rows for the weighted mean
  summarise(ws = sum(value*eff_weight), ww = sum(eff_weight)) %>% # calculate two helper values
  mutate(weighted_mean = ws/ww) %>%         # effectively calculate the weighted mean
  select(r, weighted_mean)                  # remove unnecessary output
left_join(z, res) # add to the original data
I added some comments, but if anything is unclear you can evaluate res stepwise (cut the pipeline before any %>%) and inspect the intermediate results.
Update
I took up the challenge to find a way to do the same in base R:
N <- 10
z <- data.frame(a=rnorm(N), b=rnorm(N), c=rnorm(N), x=rpois(N, 5))
w <- data.frame(key=c('a','b','c'), weight=c(0.1,0.5,0.3))
long.z <- reshape(z, idvar = "row", times = c('a','b','c'),
                  timevar = 'key',
                  varying = list(c('a','b','c')), direction = "long")
compose.z <- merge(long.z, w, by = 'key')
compose.z2 <- within(compose.z, eff.weight <- weight^x)
# reshape() names the long value column after the first varying name, so it is "a" here
sum.stat <- by(compose.z2, compose.z2$row,
               function(x) sum(x$a * x$eff.weight) / sum(x$eff.weight))
nice.data <- c(sum.stat)
It requires a somewhat more verbose function, but the same pattern applies.

Triple integral in R (how to specify the domain)

I would like to compute the triple integral of a function of three variables f(x,y,z) in R. I'm using the package cubature and the function adaptIntegrate(). The integrand is equal to 1 only on a certain domain (x < y < z, and 0 otherwise), which I don't know how to specify. I'm trying two different implementations of the function, but neither works:
# First implementation
fxyz <- function(w) {
  x <- w[1]
  y <- w[2]
  z <- w[3]
  x*y*z*(x < y)&(y < z)
}
# Second implementation
fxyz <- function(w) {
  x <- w[1]
  y <- w[2]
  z <- w[3]
  if (x < y & y < z)
    out <- 1
  else
    out <- 0
  out
}
#Computation of integral
library(cubature)
lower <- rep(0,3)
upper <- rep(1, 3)
adaptIntegrate(f=fxyz, lowerLimit=lower, upperLimit=upper, fDim = 3)
Any idea on how to specify the domain correctly?
I don't know about the cubature package, but you can do this by repeated application of base R's integrate function for one-dimensional integration.
f.xyz <- function(x, y, z) ifelse(x < y & y < z, 1, 0)
f.yz <- Vectorize(function(y, z) integrate(f.xyz, 0, 1, y=y, z=z)$value,
vectorize.args="y")
f.z <- Vectorize(function(z) integrate(f.yz, 0, 1, z=z)$value,
vectorize.args="z")
integrate(f.z, 0, 1)
# 0.1666632 with absolute error < 9.7e-05
You'll probably want to play with the control arguments to set the numeric tolerances; small errors in the inner integration can turn into big ones on the outside. (For reference, the exact value is 1/6 ≈ 0.1667: by symmetry, each of the 3! = 6 orderings of three i.i.d. uniform variables is equally likely.)
In your first function the return value is wrong: it should be as.numeric(x <= y) * as.numeric(y <= z). In your second function you should also use <= instead of <, otherwise adaptIntegrate won't work correctly. You also need to specify a maximum number of evaluations. Try this:
library(cubature)
lower <- rep(0,3)
upper <- rep(1,3)
# First implementation (modified)
fxyz <- function(w) {
  x <- w[1]
  y <- w[2]
  z <- w[3]
  as.numeric(x <= y) * as.numeric(y <= z)
}
adaptIntegrate(f = fxyz, lowerLimit = lower, upperLimit = upper, doChecking = TRUE,
               maxEval = 2000000, absError = 10e-5, tol = 1e-5)
#$integral
#[1] 0.1664146
#$error
#[1] 0.0001851699
#$functionEvaluations
#[1] 2000031
#$returnCode
#[1] 0
The domain 0 <= x <= y <= z <= 1 is the "canonical" simplex. To integrate over a simplex, use the SimplicialCubature package.
library(SimplicialCubature)
f <- function(x) 1
S <- CanonicalSimplex(3)
> adaptIntegrateSimplex(f, S)
$integral
[1] 0.1666667
$estAbsError
[1] 1.666667e-13
$functionEvaluations
[1] 55
$returnCode
[1] 0
$message
[1] "OK"
Note that integrating the constant function f(x) = 1 over the simplex simply gives the volume of the simplex, which is 1/6, so numerical integration is not really needed for this example:
> SimplexVolume(S)
[1] 0.1666667
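As a quick sanity check (my own sketch, independent of the packages above), a Monte Carlo estimate of the volume of the region x < y < z inside the unit cube:
set.seed(1)
n <- 1e6
x <- runif(n); y <- runif(n); z <- runif(n)
mean(x < y & y < z) # should be close to 1/6 ~ 0.1667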
