I have been asked to implement the Levenberg-Marquardt algorithm in RStudio, with λ initially set to 10. The algorithm must stop when the norm of the gradient falls below the tolerance. I also need to print the values of x1, x2, λ, ∇f(x), d1 and d2 at each iteration. Any ideas on how to do it? Many thanks in advance.
This is what I have:
library(pracma)
library(matlib)
MetodeLM<-function(f,xi,t)
{
l=10
stop=FALSE
x<-xi
k=0
while (stop==FALSE){
dk<- inv(hessian(f,x)+l*diag(diag(hessian(f,x))))
x1<-x+dk
if (Norm(grad(f,x1))<t){
stop<-TRUE
}
else{
if (f(x1) < f(x)){
l<-l/10
k<-k+1
stop<-FALSE
}
else{
l<-l*10
stop<-FALSE
}
}
}
}
After correcting a few mistakes in your code, the following implementation of the Levenberg-Marquardt algorithm should work (note that the update rule for the algorithm, shown in the accompanying figure, is x_{k+1} = x_k - (H + λI)^(-1) ∇f(x_k), where H is the Hessian of f at x_k):
library(pracma)
# tolerance = t, λ = l
LM <- function(f, x0, t, l=10, r=10) {
x <- x0
k <- 0
while (TRUE) {
H <- hessian(f, x)
G <- grad(f, x)
dk <- inv(H + l * diag(nrow(H))) %*% G # dk <- solve(H + l * diag(nrow(H)), G)
x1 <- x - dk # update rule
print(k) # iteration
# print(l) # λ
print(x1) # x1, x2
print(G) # ∇f(x)
print(dk) # d1, d2
if (Norm(G) < t) break
l <- ifelse(f(x1) < f(x), l / r, l * r)
k <- k + 1
x <- x1 # update the old point
}
}
For example, with the following function, the non-linear optimization algorithm quickly finds a local minimum point (at the 10th iteration), as shown below:
f <- function(x) {
return ((x[1]^2+x[2]-25)^2 + (x[1]+x[2]^2-25)^2)
}
x0 <- rep(0,2)
LM(f, x0, t=1e-3, l=400, r=2)
# [1] 0
# [,1]
# [1,] 0.165563
# [2,] 0.165563
# [1] -50 -50
# [,1]
# [1,] -0.165563
# [2,] -0.165563
# [1] 1
# [,1]
# [1,] 0.7986661
# [2,] 0.7986661
# [1] -66.04255 -66.04255
# [,1]
# [1,] -0.6331031
# [2,] -0.6331031
# ...
# [1] 10
# [,1]
# [1,] 4.524938
# [2,] 4.524938
# [1] 0.0001194898 0.0001194898
# [,1]
# [1,] 5.869924e-07
# [2,] 5.869924e-07
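One detail of the loop above is that it moves to x1 even when f(x1) is not smaller than f(x). A common variant only accepts the step when f decreases and otherwise just increases λ. Below is a minimal sketch of that variant in base R. It is an illustration under stated assumptions, not the method from the answer: lm_sketch, g, h, max_iter and verbose are hypothetical names, and the gradient and Hessian of the example function are written out by hand instead of being computed with pracma.

```r
# Example function from the answer, with hand-derived gradient and Hessian
f <- function(x) (x[1]^2 + x[2] - 25)^2 + (x[1] + x[2]^2 - 25)^2
g <- function(x) {
  a <- x[1]^2 + x[2] - 25
  b <- x[1] + x[2]^2 - 25
  c(4 * x[1] * a + 2 * b, 2 * a + 4 * x[2] * b)
}
h <- function(x) {
  a <- x[1]^2 + x[2] - 25
  b <- x[1] + x[2]^2 - 25
  off <- 4 * x[1] + 4 * x[2]
  matrix(c(4 * a + 8 * x[1]^2 + 2, off,
           off, 4 * b + 8 * x[2]^2 + 2), nrow = 2)
}

# Hypothetical helper: damped Newton step with acceptance test on f
lm_sketch <- function(f, g, h, x0, tol = 1e-3, lambda = 10, r = 10,
                      max_iter = 200, verbose = TRUE) {
  x <- x0
  for (k in seq_len(max_iter)) {
    G <- g(x)
    if (sqrt(sum(G^2)) < tol) break            # stop on small gradient norm
    d <- -solve(h(x) + lambda * diag(length(x)), G)
    if (verbose) {
      cat(sprintf("k=%d x=(%.5f, %.5f) lambda=%g grad=(%.4g, %.4g) d=(%.4g, %.4g)\n",
                  k - 1, x[1], x[2], lambda, G[1], G[2], d[1], d[2]))
    }
    if (f(x + d) < f(x)) {
      x <- x + d                               # accept the step, reduce damping
      lambda <- lambda / r
    } else {
      lambda <- lambda * r                     # reject the step, increase damping
    }
  }
  list(x = x, lambda = lambda, grad_norm = sqrt(sum(g(x)^2)))
}

res <- lm_sketch(f, g, h, x0 = c(0, 0))
res$x
```

Each printed line carries the quantities the question asks for (x1, x2, λ, ∇f(x), d1, d2). Note that solve() is used instead of forming the inverse explicitly, which is usually the more stable choice.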
[Animation: convergence to the local minimum point for the function above]
[Animation: the same convergence for the LoG function]
Just some smaller changes which do not need to be considered.
This for loop may be helpful.
1. Run all of your code
s <- 60000
t <- 20
mu <- function(x, t) {
A <- .00022
B <- 2.7*10^(-6)
c <- 1.124
mutemp <- A + B*c^(x + t)
out <- ifelse(t <= 2, 0.9^(2 - t)*mutemp, mutemp)
out}
f <- function(x) (s - x - 0.05*(0.04*x + 1810.726 - mu(40, t)*(s - x)))
2. Run the for loop below for iteration
2.1 Predefine the length of the output. In your case it is 400 (t/0.05 = 400).
output <- vector(mode = "numeric", length = t/0.05)
2.2 Run the for loop from 1 to 400. Save each uniroot result in the vector from step 2.1, then update both s and t accordingly.
for (i in 1:400) {
output[i] <- uniroot(f, lower=0.1, upper=100000000)$root
s <- output[i]
t <- 20 - i * 0.05
}
3. Inspect the result
output
Hope this is helpful.
You could use vapply on a defined t sequence.
s <- 6e4
tseq <- seq.int(19.95, 0, -.05)
x <- vapply(tseq, \(t) {
s <<- uniroot(\(x) (s - x - 0.05*(0.04*x + 1810.726 - mu(40, t)*(s - x))), lower=0.1, upper=100000000)$root
}, numeric(1L))
Note that <<- modifies s in the global environment, so at the end s holds the last root.
s
# [1] 2072.275
res <- cbind(t=tseq, x)
head(res)
# t x
# [1,] 19.95 59789.92
# [2,] 19.90 59580.25
# [3,] 19.85 59371.01
# [4,] 19.80 59162.18
# [5,] 19.75 58953.77
# [6,] 19.70 58745.77
I'm having some issues with the following problem:
I want to maximize the function U subject to the constraint B using solnp()
I've defined the U function as:
U = function(x){
#P is a vector of parameters
#X is a vector of variables
-P[1]*(x[1]^P[2])*(x[2]^P[3])
}
And the function B as follows:
B = function(x){
#M is a vector of parameters
#X is a vector of variables
#The constant M[3] is an argument of CCM
-M[1]*x[1]-M[2]*x[2]
}
Then I've defined the optimizer function as a wrapper around solnp():
CCM = function(P,M){
solnp(c(0,0), #starting values (random - obviously need to be positive and sum to 15)
U, #function to optimise
eqfun=B, #equality function
eqB=-M[3], #the equality constraint
LB=c(0,0), #lower bound for parameters i.e. greater than zero
UB=c(10000,10000)) #upper bound for parameters (I just chose 10000 randomly)
}
I've written an optimizer function so that I can choose different parameters for this problem.
The issue is that when I run the CCM function it does not seem to be maximizing (it gives me (0,0) as the solution). solnp() minimizes by default, so I've changed the sign of U and B because of that.
By the way, I'm trying to program a Consumer Choice model.
Edit:
Test run with the values in the OP's comment.
P <- c(2,0.1,0.9)
M <- c(1,2,15)
CCM(P, M)
#
#solnp-->The linearized problem has no feasible
#solnp-->solution. The problem may not be feasible.
#
#Iter: 1 fn: 0 Pars: 0 0
#solnp--> Solution not reliable....Problem Inverting Hessian.
#$pars
#[1] 0 0
#
#$convergence
#[1] 2
#
#$values
#[1] 0 0
#
#$lagrange
#[1] 0
#
#$hessian
# [,1] [,2]
#[1,] 1 0
#[2,] 0 1
#
#$ineqx0
#NULL
#
#$nfuneval
#[1] 7
#
#$outer.iter
#[1] 1
#
#$elapsed
#Time difference of 0.6487024 secs
#
#$vscale
#[1] 1.0e-08 1.5e+01 1.0e+00 1.0e+00
#
CCM(P, M)$convergence
#
#solnp-->The linearized problem has no feasible
#solnp-->solution. The problem may not be feasible.
#
#Iter: 1 fn: 0 Pars: 0 0
#solnp--> Solution not reliable....Problem Inverting Hessian.
#[1] 2
You can consider the following approach:
U <- function(val_x, val_P)
{
val_P[1] * (val_x[1] ^ val_P[2]) * (val_x[2] ^ val_P[3])
}
B <- function(val_x, val_M)
{
val_M[1] * val_x[1] - val_M[2] * val_x[2]
}
fn_Opt <- function(val_x, val_P, val_M)
{
val <- (B(val_x, val_M) - val_M[3]) ^ 2 + U(val_x, val_P)
if(is.na(val) | is.nan(val) | is.infinite(val))
{
return(10 ^ 30)
}else
{
return(val)
}
}
library(DEoptim)
P <- c(2, 0.1, 0.9)
M <- c(1, 2, 15)
obj_DEoptim <- DEoptim(fn = fn_Opt, lower = c(0,0), upper = c(10000,10000),
control = list(itermax = 1000), val_P = P, val_M = M)
obj_DEoptim$optim$bestmem
par1 par2
1.500000e+01 5.661556e-134
B(obj_DEoptim$optim$bestmem, M)
par1
15
U(obj_DEoptim$optim$bestmem, P)
par1
3.135316e-120
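As a cross-check, note that the objective above adds U to the penalty (so U is minimized) and that B uses a minus sign, which is why the reported point has almost zero utility. If the goal is to maximize U on the budget line M[1]*x1 + M[2]*x2 = M[3] as in the question, the constraint can be substituted into U and the resulting one-dimensional problem solved with base R's optimize(). This is a minimal sketch under that assumption; U_on_budget, x_num and x_closed are hypothetical names.

```r
P <- c(2, 0.1, 0.9)   # utility parameters from the test run
M <- c(1, 2, 15)      # prices M[1], M[2] and income M[3]

U <- function(x, P) P[1] * x[1]^P[2] * x[2]^P[3]

# Utility along the budget line: x2 is implied by x1
U_on_budget <- function(x1, P, M) {
  x2 <- (M[3] - M[1] * x1) / M[2]
  U(c(x1, x2), P)
}

opt <- optimize(U_on_budget, interval = c(0, M[3] / M[1]),
                P = P, M = M, maximum = TRUE, tol = 1e-8)
x_num <- c(opt$maximum, (M[3] - M[1] * opt$maximum) / M[2])

# Closed-form Cobb-Douglas demand for comparison
x_closed <- c(P[2] / (P[2] + P[3]) * M[3] / M[1],
              P[3] / (P[2] + P[3]) * M[3] / M[2])

x_num     # numerical maximiser
x_closed  # 1.5 and 6.75 for these parameters
```

The two vectors should agree closely, which gives a reference point for checking any solnp() or DEoptim() setup.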
I'm trying to calculate the Euclidean distance between the rows of two matrices. I have already done that using two for loops, but I am trying to vectorize the calculation to speed it up. I'm using pdist as a benchmark to validate that the distance is calculated correctly.
Thanks to this post, https://medium.com/#souravdey/l2-distance-matrix-vectorization-trick-26aa3247ac6c, I tried to achieve the same thing in r with this code:
dist <- sqrt(rowSums(xtest**2)+rowSums(xtrain**2)-2*xtrain %*% t(xtest))
But the result is different from what comes out of pdist. I am not sure what's wrong with this.
Here is some code.
Create some data:
xtest=matrix(cbind(c(0,0),c(1,31)),2,2,byrow=TRUE)
xtrain=matrix(cbind(c(9,2),c(4,15),c(7,8),c(-22,-2)),4,2,byrow=TRUE)
Calculate using double loops
mydist <- function(xtest,xtrain) {
euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))
dist <- matrix(,nrow=nrow(xtrain),ncol=nrow(xtest))
for (i in 1:nrow(xtrain)){
for (j in 1:nrow(xtest)){
dist[i,j] <- euc.dist(xtrain[i,], xtest[j,])
}
}
return (dist)
}
> mydist(xtest,xtrain)
[,1] [,2]
[1,] 9.219544 30.08322
[2,] 15.524175 16.27882
[3,] 10.630146 23.76973
[4,] 22.090722 40.22437
The result is the same as with pdist
> libdists <- pdist(xtrain,xtest)
> as.matrix(libdists)
[,1] [,2]
[1,] 9.219544 30.08322
[2,] 15.524175 16.27882
[3,] 10.630146 23.76973
[4,] 22.090721 40.22437
But if I use the matrix multiplication method, the result is wrong
> mydist2 <- function(xtest,xtrain) {
+ dist <- sqrt(rowSums(xtest**2)+rowSums(xtrain**2)-2*xtrain %*% t(xtest))
+ return (dist)
+ }
> mydist2(xtest,xtrain)
[,1] [,2]
[1,] 9.219544 NaN
[2,] 34.684290 16.27882
[3,] 10.630146 NaN
[4,] 38.078866 40.22437
I have also tried the mapply function
> mydist3 <- function(xtest,xtrain) {
+ euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))
+ dist <- mapply(euc.dist, xtest,xtrain)
+ return (dist)
+ }
> mydist3(xtest,xtrain)
[1] 9 3 7 53 2 14 8 33
I think it works element-wise rather than taking each row as a vector and calculating the distance between two vectors.
Any suggestions will be appreciated!
Use two apply instances with the second nested in the first:
d1 <- apply(xtest, 1, function(x) apply(xtrain, 1, function(y) sqrt(crossprod(x-y))))
Check against pdist:
library(pdist)
d2 <- as.matrix(pdist(xtrain, xtest))
all.equal(d1, d2, tolerance = 1e-7)
## [1] TRUE
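For a fully vectorized version, the matrix formula from the question can be kept. The likely problem there is that rowSums(xtest**2) + rowSums(xtrain**2) adds two vectors of different lengths element-wise with recycling, instead of forming every pairwise sum. A minimal sketch using outer() for that term is below; dist_vectorized is a hypothetical name, and pmax() is an added guard against tiny negative values from rounding.

```r
xtest  <- matrix(c(0, 0, 1, 31), ncol = 2, byrow = TRUE)
xtrain <- matrix(c(9, 2, 4, 15, 7, 8, -22, -2), ncol = 2, byrow = TRUE)

dist_vectorized <- function(xtrain, xtest) {
  # |a|^2 + |b|^2 for every (train row, test row) pair, minus 2 * a.b
  sq <- outer(rowSums(xtrain^2), rowSums(xtest^2), "+") - 2 * xtrain %*% t(xtest)
  sqrt(pmax(sq, 0))
}

d3 <- dist_vectorized(xtrain, xtest)
d3
```

The result has one row per row of xtrain and one column per row of xtest, the same layout as as.matrix(pdist(xtrain, xtest)).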
I am a newbie to Particle Swarm Optimization. I read a research paper on clustering based on PSO and K-means, but I did not find any working example of it. Any kind of help is much appreciated. Thanks in advance!
I want to perform text document clustering using PSO and K-means in R. The basic idea is that PSO first gives me optimised values for the cluster centroids, and I then use those optimised centroids as the initial cluster centroids for k-means to get the clusters of documents.
Below is the code that shows what I have done so far.
#Import library
library(pdist)
library(hydroPSO)
#Create a matrix and suppose it is our document-term matrix, which we get after
#cleaning the corpus
#(In my actual data I have 20 documents with 951 terms, i.e. dim(dtm) = 20*951)
matri <- matrix(data = seq(1, 20, 1), nrow = 4, ncol = 7, byrow = TRUE)
matri
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 2 3 4 5 6 7
[2,] 8 9 10 11 12 13 14
[3,] 15 16 17 18 19 20 1
[4,] 2 3 4 5 6 7 8
#Initially select first and second row as centroids
cj <- matri[1:2,]
#Calculate Euclidean Distance of each data point from centroids
vm <- as.data.frame(t(as.matrix(pdist(matri, cj))))
vm
V1 V2 V3 V4
1 0.00000 18.52026 34.81379 2.645751
2 18.52026 0.00000 21.51744 15.874508
#Create binary matrix S in which 1 means Instance Ii is allocated to the cluster Cj otherwise 0.
S <- matrix(data = NA, nrow = nrow(vm), ncol = ncol(vm))
for(i in 1:nrow(vm)){
for(j in 1:ncol(vm)){
cd <- which.min(vm[, j])
ifelse(cd==i, S[i,j] <-1, S[i,j] <-0)
}
}
S
[,1] [,2] [,3] [,4]
[1,] 1 0 0 1
[2,] 0 1 1 0
#Apply `hydroPSO()` to get optimised values of centroids.
set.seed(5486)
D <- 4 # Dimension
lower <- rep(0, D)
upper <- rep(10, D)
m_s <- matrix(data = NA, nrow = nrow(S), ncol = ncol(matri))
Fn= function(y) { #Objective Function which has to be minimised
for(j in 1:ncol(matri)){
for(i in 1:nrow(matri)){
for(k in 1:nrow(y)){
for(l in 1:ncol(y)){
m_s[k,] <- colSums(matri[y[k,]==1,])/sum(y[k,])
}
}
}
}
sm <- sum(m_s)/ nrow(S)
return(sm)
}
hh1 <- hydroPSO(S,fn=Fn, lower=lower, upper=upper,
control=list(write2disk=FALSE, npart=3))
But the above hydroPSO() call does not work. It fails with the error "Error in 1:nrow(y) : argument of length 0". I searched for it but did not find a solution that works for me.
I also made some changes to my objective function, and this time hydroPSO() ran, but I guess not correctly. I am passing my initial centroid matrix, whose dimension is 2*7, as a parameter, but the function returns only 1*7 optimised values. I do not understand why.
set.seed(5486)
D <- 7# Dimension
lower <- rep(0, D)
upper <- rep(10, D)
Fn = function(x){
vm <- as.data.frame(t(as.matrix(pdist(matri, x))))
S <- matrix(data = NA, nrow = nrow(vm), ncol = ncol(vm))
for(i in 1:nrow(vm)){
for(j in 1:ncol(vm)){
cd <- which.min(vm[, j])
ifelse(cd==i, S[i,j] <-1, S[i,j] <-0)
}
}
m_s <- matrix(data = NA, nrow = nrow(S), ncol = ncol(matri))
for(j in 1:ncol(matri)){
for(i in 1:nrow(matri)){
for(k in 1:nrow(S)){
for(l in 1:ncol(S)){
m_s[k,] <- colSums(matri[S[k,]==1,])/sum(S[k,])
}
}
}
}
sm <- sum(m_s)/ nrow(S)
return(sm)
}
hh1 <- hydroPSO(cj,fn=Fn, lower=lower, upper=upper,
control=list(write2disk=FALSE, npart=2, K=2))
Output of the above function.
## $par
## Param1 Param2 Param3 Param4 Param5 Param6 Param7
## 8.6996174 2.1952303 5.6903588 0.4471795 3.7103161 1.6605425 8.2717574
##
## $value
## [1] 61.5
##
## $best.particle
## [1] 1
##
## $counts
## function.calls iterations regroupings
## 2000 1000 0
##
## $convergence
## [1] 3
##
## $message
## [1] "Maximum number of iterations reached"
I guess I am passing parameters to the hydroPSO() in a wrong way. Please correct me where I'm doing it wrong.
Thank you very much!
Instead of passing cj to hydroPSO(), I passed as.vector(t(cj)) in my second approach and it worked fine for me. I got 14 optimised values.
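To make the flattening explicit: the optimiser works on a plain parameter vector, so a 2 x 7 centroid matrix becomes 14 parameters, and the objective has to rebuild the matrix before using it. Presumably lower and upper then also need one entry per parameter (length 14 here). The sketch below uses base R only and does not call hydroPSO(); to_centroids and Fn_flat are hypothetical names, and the objective shown is a generic k-means style sum of squared distances, not the objective from the question.

```r
# Toy document-term matrix from the question (values recycled from 1:20)
matri <- matrix(rep_len(1:20, 28), nrow = 4, ncol = 7, byrow = TRUE)
cj <- matri[1:2, ]                 # initial 2 x 7 centroid matrix

par0 <- as.vector(t(cj))           # row-wise flattening: 14 parameters
length(par0)

# Inside the objective, rebuild the centroid matrix from the flat vector
to_centroids <- function(par, k, n_terms) {
  matrix(par, nrow = k, ncol = n_terms, byrow = TRUE)
}

# Illustrative objective: total squared distance of each row to its nearest centroid
Fn_flat <- function(par, data = matri, k = 2) {
  cen <- to_centroids(par, k, ncol(data))
  # one column of squared distances per centroid
  d2 <- apply(cen, 1, function(ce) rowSums(sweep(data, 2, ce)^2))
  sum(apply(d2, 1, min))
}

Fn_flat(par0)
```

After the optimisation, the same to_centroids() call on the returned parameter vector gives a k x n_terms matrix that can be passed as the centers argument of kmeans().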
The trace of a matrix can be computed as shown below:
sum(diag(matrix))
This may silently give an incorrect result if the matrix is not square (i.e. not of size "n x n"). Are there any other built-in functions for computing the trace of a matrix?
Package: psych
Function: tr()
Example:
> x <- matrix(replicate(9,1), ncol = 3, nrow = 3)
> x
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 1
[3,] 1 1 1
> tr(x)
[1] 3
> x <- matrix(replicate(12,1), ncol = 4, nrow = 3)
> x
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 1 1 1 1
[3,] 1 1 1 1
> tr(x)
Error in tr(x) : m must be a square matrix
Moreover
Package: matrixcalc
Function: matrix.trace
Below is a quick function to test if the object is a matrix and then test if it is also square.
tr <- function (m)
{
total_sum <- 0
if(is.matrix(m))
{
row_count <- nrow(m)
col_count <- ncol(m)
if(row_count == col_count)
{
total_sum <-sum(diag(m))
total_sum
}
else
{
message ('Matrix is not square')
}
}
else
{
message( 'Object is not a matrix')
}
}
I also found the following package, which provides matrix.trace():
matrixcalc
You can try using eigenvalues
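# H is assumed to be a square numeric matrix
# first find the eigenvalues
e <- eigen(H)
# Calculate the trace of the matrix and compare it with the sum of the eigenvalues.
# function to calculate the trace as the sum of the diagonal
# (note: defining trace() masks base::trace(), R's debugging helper)
trace <- function(data) sum(diag(data))
trace(H)
# using the sum of the eigenvalues (may be complex if H is not symmetric)
sum(e$values)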
Hope it helps.
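To tie the answers together, here is a small base R sketch that checks squareness before summing the diagonal and compares the result with the eigenvalue sum. matrix_trace is a hypothetical name chosen to avoid masking base::trace(), and the example matrices are made up for illustration.

```r
# Hypothetical helper: refuse non-square input instead of silently summing
matrix_trace <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  sum(diag(m))
}

H <- matrix(c(2, 1, 1, 3), nrow = 2)        # symmetric example
matrix_trace(H)                             # 5
sum(eigen(H)$values)                        # same value up to rounding

R <- matrix(c(0, 1, -1, 0), nrow = 2)       # rotation: eigenvalues are +/- 1i
sum(eigen(R)$values)                        # complex result
Re(sum(eigen(R)$values))                    # real part recovers the trace, 0

wide <- matrix(1, nrow = 3, ncol = 4)
sum(diag(wide))                             # 3, although a 3 x 4 matrix has no trace
res <- try(matrix_trace(wide), silent = TRUE)
inherits(res, "try-error")                  # TRUE
```

Using stopifnot() raises an error for bad input, whereas message() in the function above only prints a note and lets the calling code continue.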