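As you know, the gradient of a function $f(x_1, \dots, x_n)$ is the following vector:
$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right)$$
and the Hessian is the following matrix:
$$H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}, \quad i, j = 1, \dots, n$$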
Now, I wonder, is there any way to calculate these in R for a user defined function at a given point?
First, I've found a package named numDeriv, which seems to have the necessary functions grad and hessian but now I can't get the correct results... Thus, here's my workflow:
Let's say that we are given the function f(x,y) = x^2 * y^3, and we need to calculate the gradient and the Hessian at the point (x=1, y=2).
That said, I define this function within R:
dummy <- function(x,y) {
rez <- (x^2)*(y^3)
rez
}
and then use grad the following way:
grad(func=dummy, x=1, y=2)
which gives me the result 16. The problem is that this is only the first value of the gradient vector, the correct version of which is
[16, 12]
Same goes with the hessian:
hessian(func=dummy, x=1, y=2)
which gives me a 1x1 matrix with the value 16 instead of the 2x2 matrix
     [,1] [,2]
[1,]   16   24
[2,]   24   12
So, the question is what am I doing wrong?
Thank you.
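You can use the pracma library. The key change is that the function takes a single vector argument and unpacks its components, instead of taking x and y as separate arguments: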
library(pracma)
dummy <- function(x) {
z <- x[1]; y <- x[2]
rez <- (z^2)*(y^3)
rez
}
grad(dummy, c(1,2))
[1] 16 12
hessian(dummy, c(1,2))
     [,1] [,2]
[1,]   16   24
[2,]   24   12
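To see why the function has to take one vector argument: a numerical gradient perturbs each component of the input vector in turn and looks at how the function value changes. The following is a minimal base R sketch using central differences. The helper names num_grad and num_hess are hypothetical, and this is not the pracma or numDeriv implementation, only an illustration of the idea.

```r
# f takes a single vector v = c(x, y)
f <- function(v) v[1]^2 * v[2]^3

# central-difference gradient: perturb one component at a time
num_grad <- function(f, x0, h = 1e-5) {
  sapply(seq_along(x0), function(i) {
    e <- replace(numeric(length(x0)), i, h)
    (f(x0 + e) - f(x0 - e)) / (2 * h)
  })
}

# central-difference Hessian: perturb pairs of components
num_hess <- function(f, x0, h = 1e-4) {
  n <- length(x0)
  H <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    ei <- replace(numeric(n), i, h)
    ej <- replace(numeric(n), j, h)
    H[i, j] <- (f(x0 + ei + ej) - f(x0 + ei - ej) -
                f(x0 - ei + ej) + f(x0 - ei - ej)) / (4 * h^2)
  }
  H
}

g <- num_grad(f, c(1, 2))
H <- num_hess(f, c(1, 2))
```

With separate arguments x and y, a routine like this only ever sees the first argument as the thing to differentiate, which is why the original call returned a single number.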
The following code extends the answer above to the case where you have sampled values of the function rather than the function itself. Here the function has 1 parameter. pracma::grad computes the gradient at a single point, so it is called once per index. If you have 3 parameters, pass them to x0 as c(x1,x2,x3).
#i is an index, s_carvone$val contains the values of the function
dummy <- function(i)
{
return (s_carvone$val[i])
}
#function that calculates the gradient in a specific point i
calc_grad <- function(i)
{
return (pracma::grad(dummy, x0=i, heps=1))
}
#calculates the derivative from point 2 to 61
first_derivative = unlist(purrr::map(calc_grad, .x = c(2:61)));
plot(first_derivative);
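If, as I believe, pracma::grad uses a central difference, then with heps=1 on an integer index the computation above reduces to (val[i+1] - val[i-1]) / 2. Under that assumption the same derivative can be sketched in base R without any packages. The vector vals below is hypothetical stand-in data for s_carvone$val.

```r
# hypothetical sampled values standing in for s_carvone$val
vals <- (1:62)^2

# interior points only, so that i - 1 and i + 1 both exist
idx <- 2:61

# central difference with unit step between samples
first_derivative <- (vals[idx + 1] - vals[idx - 1]) / 2
```

If the samples are not one unit apart, divide by twice the actual spacing instead of 2.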
I want to port my Excel Solver model to R. I need to find 3 sets of coordinates that minimize the distance to 5 other given coordinates. I've written a function that builds a distance matrix and returns, for each input point, the minimal distance to the given coordinates. I want to minimize this function by changing the input; that is, I want to find the coordinates for which the sum of the minimal distances is smallest. I tried several methods, see the code below. (Yes, my distance matrix function is somewhat clunky, but I had to reduce the input to a single vector in order to run some algorithms such as nloptr, which would give warnings otherwise.) I've also looked at other questions (such as GRG Non-Linear Least Squares (Optimization)), but they did not improve the solution.
# First half of p holds the x coordinates, second half the y coordinates (yes, that's clunky)
p<-c(2,4,6,5,3,2) # initial points
x_given <- c(2,2.5,4,4,5)
y_given <- c(9,5,7,1,2)
f <- function(Coordinates){
# Predefining
Term_1 <- NULL
Term_2 <- NULL
x <- NULL
Distance <- NULL
min_prob <- NULL
l <- length(Coordinates)
l2 <- length(x_given)
half_length <- l/2
s <- l2*half_length
Distance_Matrix <- matrix(c(rep(1,s)), nrow=half_length)
# Creating the distance matrix
for (k in 1:half_length){
for (i in 1:l2){
Term_1[i] <- (Coordinates[k]-x_given[i])^2
Term_2[i] <- (Coordinates[k+half_length]-y_given[i])^2
Distance[i] <- sqrt(Term_1[i]+Term_2[i])
Distance_Matrix[k,i] <- Distance[i]
}
}
d <- Distance_Matrix
# Find the minimum in each row; the sum of these minima is what we want to minimize
for (l in 1:nrow(d)){
min_prob[l] <- min(d[l,])
}
som<-sum(min_prob)
return(som)
}
# Minimise
sol<-optim(p,f)
x<-sol$par[1:3]
y<-sol$par[4:6]
plot(x_given,y_given)
points(x,y,pch=19)
The solution however is clearly not that optimal. I've tried to use the nloptr function, but I'm not sure which algorithm to use. Which algorithm can I use or can I use/program another function which solves this problem? Thanks in advance (and sorry for the detailed long question)
Look at the output of optim. The convergence code of 1 means it reached the iteration limit (maxit) and had not yet converged.
> optim(p, f)
$`par`
[1] 2.501441 5.002441 5.003209 5.001237 1.995857 2.000265
$value
[1] 0.009927249
$counts
function gradient
501 NA
$convergence
[1] 1
$message
NULL
Although the result is not that different you will need to increase the number of iterations to get convergence. If that is still unacceptable then try different starting values.
> optim(p, f, control = list(maxit = 1000))
$`par`
[1] 2.502806 4.999866 5.000000 5.003009 1.999112 2.000000
$value
[1] 0.005012449
$counts
function gradient
755 NA
$convergence
[1] 0
$message
NULL
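The objective and the convergence check can also be written more compactly. The sketch below is my own vectorized rewrite of the distance function using outer (f_vec is a hypothetical name); it is meant to compute the same sum of row minima as f, and the maxit value is an arbitrary choice.

```r
x_given <- c(2, 2.5, 4, 4, 5)
y_given <- c(9, 5, 7, 1, 2)

# first half of coords: x coordinates, second half: y coordinates
f_vec <- function(coords) {
  k  <- length(coords) / 2
  px <- coords[seq_len(k)]
  py <- coords[k + seq_len(k)]
  # k x 5 matrix of Euclidean distances from each point to each given point
  d <- sqrt(outer(px, x_given, "-")^2 + outer(py, y_given, "-")^2)
  sum(apply(d, 1, min))
}

p <- c(2, 4, 6, 5, 3, 2)
sol <- optim(p, f_vec, control = list(maxit = 5000))

# 0 means converged, 1 means the iteration limit was hit
sol$convergence
```

Checking sol$convergence after every call is a cheap way to catch a run that stopped early.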
I wrote a simple negative log-likelihood function and would like it to return different results for different parameter values, using a for loop in R. That is, my function contains an expression evaluated inside a for loop. The function works well and the results are saved in a list. Since I get two different results, I would like to apply optim to each part of my function separately. For example,
ff <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
out <- vector("list",2)
for (i in 1:2){
out[[i]] <- -sum(log(dnorm(x,mu[[i]],sd[[i]]))) ## two different parts of my function, wrapped as one using a for loop
}
return(out)
}
set.seed(123)
x <- rnorm(10,2,0.5)
x
Then the result of my function is:
> ff(x)
[[1]]
[1] 25.33975
[[2]]
[1] 101.4637
Since my function wraps two different parts in one for loop, I would like to apply optim to each part separately. I tried several approaches of my own and none of them worked. Here is one of my attempts:
op <- vector("list",2)
for(i in 1:2){
op <- optim(c(0.5,0.5),fn=ff[[i]],i=i)
}
That is, I want optim to evaluate my function for the first value of the index, i=1, and then for the second one, i=2.
So my functions without the wrapper are as follows:
ff_1 <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
out <- -sum(log(dnorm(x,mu[[1]],sd[[1]])))
return(out)
}
ff_2 <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
out <- -sum(log(dnorm(x,mu[[2]],sd[[2]])))
return(out)
}
and I then need to call optim separately for each function.
I searched many websites and R help pages but could not find a solution to this question.
Any help, please?
Try this one, it's just the way of passing the arguments to optim, I suppose
# given data
set.seed(123)
x <- rnorm(10,2,0.5)
# use a single vector parOpt instead of two separate arguments, for convenience
# with optim
ff <- function(x, parOpt){
out <- -sum(log(dnorm(x, parOpt[1], parOpt[2])))
return(out)
}
# parameters in mu,sd vectors arranged in list
params <- list(set1 = c(2, 0.2), set2 = c(0.5, 0.3))
# output list
out <- list()
for(i in 1:2){
# pass params (mu and sd) to optim, function ff and the data
# note, since function ff has x argument, specify that in optim
out[[i]] <- optim(par = params[[i]], fn=ff ,x=x)
}
Should give something like this:
[[1]]
[[1]]$par
[1] 2.0372546 0.4523918
[[1]]$value
[1] 6.257931
[[1]]$counts
function gradient
55 NA
[[1]]$convergence
[1] 0
[[1]]$message
NULL
[[2]]
[[2]]$par
[1] 2.037165 0.452433
[[2]]$value
[1] 6.257932
[[2]]$counts
function gradient
73 NA
[[2]]$convergence
[1] 0
[[2]]$message
NULL
Hope this helps.
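As a sanity check on the output above: for a normal model the maximum likelihood estimates have a closed form, the sample mean and the square root of the mean squared deviation, so both optim runs should land near those values. A minimal sketch follows; the name nll, the use of log = TRUE in dnorm, and the guard against non-positive sd are my own choices, not part of the answer above.

```r
set.seed(123)
x <- rnorm(10, 2, 0.5)

# negative log-likelihood; par = c(mean, sd), parameters first as optim expects
nll <- function(par, x) {
  if (par[2] <= 0) return(Inf)   # keep the search away from invalid sd
  -sum(dnorm(x, par[1], par[2], log = TRUE))
}

fit <- optim(par = c(2, 0.2), fn = nll, x = x)

# closed-form MLE for comparison (note the divisor n, not n - 1)
mle <- c(mean(x), sqrt(mean((x - mean(x))^2)))

rbind(optim = fit$par, closed_form = mle)
```

The two rows should agree to several decimal places, which is also why both starting points in the loop above end up at essentially the same estimate.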
As an alternative, you can find the same solution using the command fitdist of the fitdistrplus package:
library(fitdistrplus)
set.seed(123)
x <- rnorm(10,2,0.5)
mu.start <- c(2,0.5)
sd.start <- c(0.2,0.3)
op <- vector("list",2)
for(i in 1:2){
op[[i]] <- fitdist(x,"norm", start=c(mu.start[i],sd.start[i]))
}
op
The result is:
[[1]]
Fitting of the distribution ' norm ' by maximum likelihood
Parameters:
estimate Std. Error
1 2.0372546 0.1430588
2 0.4523918 0.1011464
[[2]]
Fitting of the distribution ' norm ' by maximum likelihood
Parameters:
estimate Std. Error
1 2.037165 0.1430719
2 0.452433 0.1011694
I have a system of equations and I want to solve it numerically. I want to get a nearby solution given a starting seed. Let me explain.
I have a vector of constants, X, with values:
X <- (c(1,-2,3,4))
and a vector W of weights:
W <- (c(0.25,0.25,0.25,0.25))
I want the components of W to sum to 1 (sum(W)=1), and the element-wise product of X and W to sum to a given number N (sum(W*X)=N).
Is there an easy way to do this in R? I have it in Excel, using Solver, but I need to automate it.
Here is your constant and your target value:
x <- c(1, -2, 3, 4)
n <- 10
You need a function to minimize. The first line contains each of your conditions, and the second line provides a measure of how to combine the errors into a single score. You may want to change the second line. For example, you could make one error term be more heavily weighted than the other using sum(c(1, 5) * errs ^ 2).
fn <- function(w)
{
errs <- c(sum(w) - 1, sum(x * w) - n)
sum(errs ^ 2)
}
The simplest thing is to start with all the weights the same value.
init_w <- rep.int(1 / length(x), length(x))
Use optim to optimize.
optim(init_w, fn)
## $par
## [1] 0.1204827 -1.2438883 1.1023338 1.0212406
##
## $value
## [1] 7.807847e-08
##
## $counts
## function gradient
## 111 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
The par element contains your weights.
There is no unique solution for this problem. If you try other initial values for w you will most likely get different results from optim.
The problem can be formulated as solving an underdetermined system of linear equations.
A <- matrix(c(rep(1,4),x), nrow=2,byrow=TRUE)
b <- matrix(c(1,n), nrow=2)
We seek a solution that satisfies A %*% w = b but which one? Minimum norm solution? Or maybe some other one? There are infinitely many solutions. Solutions can be given using the pseudo-inverse of the matrix A. Use package MASS for this.
library(MASS)
Ag <- ginv(A)
The minimum norm solution is
wmnorm <- Ag %*% b
And check with A %*% wmnorm - b and fn(wmnorm).
See the section "Matrix solutions" of the Wikipedia page "System of linear equations".
The solutions are given by
Az <- diag(nrow=nrow(Ag)) - Ag %*% A
w <- wmnorm + Az %*% z
where z is an arbitrary vector of ncol(Az) elements.
And now generate some solutions and check
xb <- wmnorm
z <- runif(4)
wsol.2 <- xb + Az %*% z
wsol.2
A %*% wsol.2 - b
fn(wsol.2)
z <- runif(4)
wsol.3 <- xb + Az %*% z
wsol.3
A %*% wsol.3 - b
fn(wsol.3)
You'll see that both are valid solutions when passed to fn, and that they are quite different from the solution found by optim. You can also observe the non-uniqueness with optim itself by choosing a different starting point, for example init_w1 <- runif(4)/4.
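When A has full row rank, as it does here, the minimum norm solution can also be written without MASS as t(A) (A t(A))^(-1) b. A minimal base R sketch of that formula, with w_min and residual as hypothetical names:

```r
x <- c(1, -2, 3, 4)
n <- 10

# two constraints: sum(w) = 1 and sum(x * w) = n
A <- rbind(rep(1, length(x)), x)
b <- c(1, n)

# minimum norm solution of the underdetermined system A w = b
w_min <- t(A) %*% solve(A %*% t(A), b)

# should be numerically zero in both rows
residual <- A %*% w_min - b
```

This should agree with ginv(A) %*% b up to rounding, since both compute the pseudo-inverse solution.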
I have defined a distance function as follows
jaccard.rules.dist <- function(x,y) ({
# implements feature distance. Feature "Airline" gets a different treatment, the rest
# are booleans coded as 1/0. Airline column distance = 0 if same airline, 1 otherwise
# the rest of the attributes' distance is zero iff both are 1, 1 otherwise
airline.column <- which(colnames(x)=="Aerolinea")
xmod <- x
ymod <-y
xmod[airline.column] <-ifelse(x[airline.column]==y[airline.column],1,0)
ymod[airline.column] <-1 # if they are the same, they are both ones, else they are different
andval <- sum(xmod&ymod)
orval <- sum(xmod|ymod)
return (1-andval/orval)
})
which slightly modifies the Jaccard distance for data frames of the form
t <- data.frame(Aerolinea=c("A","B","C","A"),atr2=c(1,1,0,0),atr3=c(0,0,0,1))
Now, I would like to perform some k-means clustering on my dataset, using the distance just defined. If I try to use the function kmeans, there is no way to specify my distance function. I tried to use hclust, which accepts a distance matrix, which I calculated as follows
distmat <- matrix(nrow=nrow(t),ncol=nrow(t))
for (i in 1:nrow(t))
for (j in i:nrow(t))
distmat[j,i] <- jaccard.rules.dist(t[j,],t[i,])
distmat <- as.dist(distmat)
and then invoked hclust
hclust(distmat)
Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") :
missing value where TRUE/FALSE needed
What am I doing wrong? Is there another way to do clustering that accepts an arbitrary distance function as its input?
Thanks in advance.
I think distmat (from your code) has to be a distance structure (which is different from a matrix). Try this instead:
require(proxy)
d <- dist(t, jaccard.rules.dist)
clust <- hclust(d=d)
clust#centers
[,1] [,2]
[1,] 0.044128322 -0.039518142
[2,] -0.986798495 0.975132418
[3,] -0.006441892 0.001099211
[4,] 1.487829642 1.000431146
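For reference, the general pattern of feeding a custom distance into hclust can be sketched in base R alone: fill a full symmetric matrix, convert it with as.dist, cluster, and cut the tree into flat groups with cutree. The toy data frame df and the function my_dist below are hypothetical stand-ins, not the Jaccard variant from the question.

```r
# hypothetical toy data: rows 1-2 and rows 3-4 are identical
df <- data.frame(a = c(1, 1, 0, 0), b = c(0, 0, 1, 1))

# stand-in for any custom distance between two numeric rows
my_dist <- function(u, v) sum(abs(u - v))

n <- nrow(df)
m <- matrix(0, n, n)
for (i in seq_len(n)) for (j in seq_len(n)) {
  m[i, j] <- my_dist(unlist(df[i, ]), unlist(df[j, ]))
}

d  <- as.dist(m)          # distance structure expected by hclust
hc <- hclust(d)
groups <- cutree(hc, k = 2)   # flat cluster labels, one per row
```

cutree gives cluster memberships rather than centers, which fits here because a mean-based center is not generally defined for an arbitrary distance.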
I'm trying to compute the -0.5 power of the following matrix:
S <- matrix(c(0.088150041, 0.001017491 , 0.001017491, 0.084634294),nrow=2)
In Matlab, the result is (S^(-0.5)):
S^(-0.5)
ans =
3.3683 -0.0200
-0.0200 3.4376
> library(expm)
> solve(sqrtm(S))
[,1] [,2]
[1,] 3.36830328 -0.02004191
[2,] -0.02004191 3.43755429
After some time, the following solution came up:
"%^%" <- function(S, power)
with(eigen(S), vectors %*% (values^power * t(vectors)))
S%^%(-0.5)
The result gives the expected answer:
[,1] [,2]
[1,] 3.36830328 -0.02004191
[2,] -0.02004191 3.43755430
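The eigendecomposition approach relies on S being symmetric with positive eigenvalues, which holds for this S. A quick base R check that the result really behaves like S^(-1/2) is to multiply it on both sides of S and compare with the identity. The names S_inv_sqrt and check are hypothetical.

```r
S <- matrix(c(0.088150041, 0.001017491, 0.001017491, 0.084634294), nrow = 2)

# symmetric eigendecomposition: S = V diag(lambda) V'
e <- eigen(S, symmetric = TRUE)

# apply the power to the eigenvalues only
S_inv_sqrt <- e$vectors %*% diag(e$values^(-0.5)) %*% t(e$vectors)

# S^(-1/2) S S^(-1/2) should be the identity matrix
check <- S_inv_sqrt %*% S %*% S_inv_sqrt
```

For a matrix with negative eigenvalues, values^power would produce NaN for fractional powers, so this sketch should not be used there.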
The square root of a matrix is not necessarily unique (every positive real number already has two square roots, so this is not specific to matrices). There are multiple algorithms for generating a square root of a matrix. Others have shown the approach using expm and eigenvalues, but the Cholesky decomposition is another possibility (see the chol function); note that chol returns a triangular factor R with t(R) %*% R equal to the matrix, rather than the symmetric root.
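To illustrate the non-uniqueness, here is a minimal base R sketch comparing the Cholesky factor with the symmetric root obtained from the eigendecomposition. Both reproduce S, but they are different matrices. The names R_chol and S_half are hypothetical.

```r
S <- matrix(c(0.088150041, 0.001017491, 0.001017491, 0.084634294), nrow = 2)

# Cholesky factor: upper triangular, with t(R_chol) %*% R_chol == S
R_chol <- chol(S)

# symmetric square root: S_half %*% S_half == S
e <- eigen(S, symmetric = TRUE)
S_half <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)

err_chol <- max(abs(crossprod(R_chol) - S))
err_sym  <- max(abs(S_half %*% S_half - S))
```

Which one is appropriate depends on the use: the symmetric root matches what Matlab's S^0.5 returns for this kind of matrix, while the triangular factor is often enough for whitening or simulation.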
To extend this answer beyond square roots, the following function exp.mat() generalizes the "Moore–Penrose pseudoinverse" of a matrix and allows for one to calculate the exponentiation of a matrix via a Singular Value Decomposition (SVD) (even works for non square matrices, although I don't know when one would need that).
exp.mat() function:
#The exp.mat function can calculate the pseudoinverse of a matrix (EXP=-1)
#and other exponents of matrices, such as square roots (EXP=0.5) or square root of
#its inverse (EXP=-0.5).
#The function arguments are a matrix (MAT), an exponent (EXP), and a tolerance
#level for non-zero singular values.
exp.mat<-function(MAT, EXP, tol=NULL){
MAT <- as.matrix(MAT)
matdim <- dim(MAT)
if(is.null(tol)){
tol=min(1e-7, .Machine$double.eps*max(matdim)*max(MAT))
}
if(matdim[1]>=matdim[2]){
svd1 <- svd(MAT)
keep <- which(svd1$d > tol)
res <- t(svd1$u[,keep]%*%diag(svd1$d[keep]^EXP, nrow=length(keep))%*%t(svd1$v[,keep]))
}
if(matdim[1]<matdim[2]){
svd1 <- svd(t(MAT))
keep <- which(svd1$d > tol)
res <- svd1$u[,keep]%*%diag(svd1$d[keep]^EXP, nrow=length(keep))%*%t(svd1$v[,keep])
}
return(res)
}
Example
S <- matrix(c(0.088150041, 0.001017491 , 0.001017491, 0.084634294),nrow=2)
exp.mat(S, -0.5)
# [,1] [,2]
#[1,] 3.36830328 -0.02004191
#[2,] -0.02004191 3.43755429
Other examples can be found here.