Apply the optim function to a function using a for loop in R

I wrote a simple function for maximum likelihood and would like it to give different results for different values of its parameters, using a for loop in R. That is, my function includes an expression inside a for loop. The function works well and the results are saved in a list. Since I get two different results, I would like to apply the optim function to each part of my function. For example,
ff <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
out <- vector("list",2)
for (i in 1:2){
out[[i]] <- -sum(log(dnorm(x,mu[[i]],sd[[i]]))) ## the two parts of my function, wrapped as one using a for loop
}
return(out)
}
set.seed(123)
x <- rnorm(10,2,0.5)
x
Then the result of my function is:
> ff(x)
[[1]]
[1] 25.33975
[[2]]
[1] 101.4637
Since my function has two different parts wrapped as one using a for loop, I would like to apply the optim function to each part of it. I tried many approaches of my own and none of them worked. Here is one of my attempts:
op <- vector("list",2)
for(i in 1:2){
op <- optim(c(0.5,0.5),fn=ff[[i]],i=i)
}
That is, I want the optim function to evaluate my function for the first value of my index, i=1, and then evaluate it for the second one, i=2.
Without the wrapper, my functions are as follows:
ff_1 <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
-sum(log(dnorm(x,mu[[1]],sd[[1]])))
}
ff_2 <- function(x,mu=c(2,0.5),sd=c(0.2,0.3)){
-sum(log(dnorm(x,mu[[2]],sd[[2]])))
}
and I would then need a separate optim call for each function.
I searched many websites and R help pages but could not find a solution to this question.
Any help, please?

Try this one, it's just the way of passing the arguments to optim, I suppose
# given data
set.seed(123)
x <- rnorm(10,2,0.5)
# use a single vector parOpt instead of separate mu and sd arguments;
# this is convenient with optim
ff <- function(x, parOpt){
out <- -sum(log(dnorm(x, parOpt[1], parOpt[2])))
return(out)
}
# parameters in mu,sd vectors arranged in list
params <- list(set1 = c(2, 0.2), set2 = c(0.5, 0.3))
# output list
out <- list()
for(i in 1:2){
# pass params (mu and sd) to optim, function ff and the data
# note, since function ff has x argument, specify that in optim
out[[i]] <- optim(par = params[[i]], fn=ff ,x=x)
}
Should give something like this:
[[1]]
[[1]]$par
[1] 2.0372546 0.4523918
[[1]]$value
[1] 6.257931
[[1]]$counts
function gradient
55 NA
[[1]]$convergence
[1] 0
[[1]]$message
NULL
[[2]]
[[2]]$par
[1] 2.037165 0.452433
[[2]]$value
[1] 6.257932
[[2]]$counts
function gradient
73 NA
[[2]]$convergence
[1] 0
[[2]]$message
NULL
Hope this helps.
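As a cross-check on the optim output above: for a normal sample the maximum likelihood estimates have a closed form (the sample mean, and the square root of the mean squared deviation from it), so both starting points should land near the same values. The following is only a sketch, not the answerer's code. The names nll, starts, fits, est and closed are made up for illustration, and optimising over log(sd) instead of sd is an assumption added here so that the optimiser can never propose a negative standard deviation.

```r
set.seed(123)
x <- rnorm(10, 2, 0.5)

# negative log-likelihood; par = c(mean, log(sd)) so that sd = exp(par[2]) > 0
nll <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# the two starting points from the question, with sd on the log scale
starts <- list(set1 = c(2, log(0.2)), set2 = c(0.5, log(0.3)))

# one optim run per starting point, data passed through optim's ... argument
fits <- lapply(starts, function(s) optim(par = s, fn = nll, x = x))

# back-transform the estimates: one row per starting point
est <- t(sapply(fits, function(f) c(mean = f$par[1], sd = exp(f$par[2]))))

# closed-form maximum likelihood estimates for comparison
closed <- c(mean = mean(x), sd = sqrt(mean((x - mean(x))^2)))

print(est)
print(closed)
```

Both rows of est should agree with closed to a few decimal places; if they do not, check the convergence element of each fit.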

As an alternative, you can find the same solution using the command fitdist of the fitdistrplus package:
library(fitdistrplus)
set.seed(123)
x <- rnorm(10,2,0.5)
mu.start <- c(2,0.5)
sd.start <- c(0.2,0.3)
op <- vector("list",2)
for(i in 1:2){
op[[i]] <- fitdist(x,"norm", start=c(mu.start[i],sd.start[i]))
}
op
The result is:
[[1]]
Fitting of the distribution ' norm ' by maximum likelihood
Parameters:
estimate Std. Error
1 2.0372546 0.1430588
2 0.4523918 0.1011464
[[2]]
Fitting of the distribution ' norm ' by maximum likelihood
Parameters:
estimate Std. Error
1 2.037165 0.1430719
2 0.452433 0.1011694

Related

how to draw the log-likelihood graph

I am learning how to draw a log-likelihood graph. Please allow me to briefly describe what I want to do:
Assume we have the data/vector as below:
set.seed(123)
sample <- rpois(50, 1.65)
And the log_like function is given as below:
log_like_graph <- function(lambda){
X <- as.matrix(sample) # not sure whether this is necessary for one-parameter distribution.
N <- nrow(X)
logLik <- N*log(lambda) - lambda*N*mean(X)
return(loglik)
}
log_like_graph <- Vectorize(log_like_graph)
# set range of lambda
lambda_vals <- seq(-10,10,by=1)
log_vals <- outer(lambda_vals,log_like_graph)
Based on the above lambda_vals and log_vals, I expect to produce a plot of the log-likelihood curve.
However, when I execute the last command, log_vals <- outer(lambda_vals,log_like_graph), I get the error
Error in as.vector(x, mode) :
cannot coerce type 'closure' to vector of type 'any'
Could you please help me solve this problem? Thank you very much!
(FYI: I mainly follow the youtube video https://www.youtube.com/watch?v=w3drLH-DFpE&ab_channel=CalebLikesR that teaches to draw the curve for a log-likelihood function, although it uses normal distribution for demonstration.)
A couple of things I see. There is no need to vectorise log_like_graph, as you can just pass the lambda values into it with sapply rather than outer (outer expects vectors as its first two arguments and takes the function as a third, which is why passing the function second gives the coercion error). You are passing lambda_vals < 0, but the support of lambda is >= 0. And I don't think your log-likelihood function is correct: I think it should be -N * lambda - sum(lfactorial(X)) + log(lambda) * sum(X), but it is easier and more accurate to use dpois(..., log=TRUE).
So, fixing these things:
# data
set.seed(123)
samples <- rpois(50, 1.65)
# The log-likelihood becomes
log_like_graph <- function(X, lambda){
N <- NROW(X)
logLik <- -N * lambda - sum(lfactorial(X)) + log(lambda) * sum(X)
return(logLik)
}
# set lambda >= 0 and take smaller steps (0.01) for a smoother curve
lambda_vals <- seq(0,10,by=0.01)
# loop through lambda values calculating the log-likelihood at each value
ll1 <- sapply(lambda_vals, function(i) log_like_graph(samples, i))
plot(lambda_vals, ll1, type="l")
This can also be done with dpois(..., log=TRUE) :
ll2 <- sapply(lambda_vals, function(i) sum(dpois(samples, lambda=i, log=TRUE)))
all.equal(ll1, ll2)
# [1] TRUE
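To locate the peak of the curve numerically rather than by eye, one option is base R's optimize. For a Poisson sample the maximum likelihood estimate is the sample mean, so the two numbers should agree. This is a sketch only: loglik and opt are illustrative names, and the search interval c(0.01, 10) is an arbitrary choice matching the plotted range.

```r
set.seed(123)
samples <- rpois(50, 1.65)

# Poisson log-likelihood of the sample as a function of lambda
loglik <- function(lambda) sum(dpois(samples, lambda = lambda, log = TRUE))

# one-dimensional maximisation over a positive interval
opt <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)

opt$maximum    # numerical maximiser
mean(samples)  # closed-form maximum likelihood estimate

# optionally mark the maximiser on the existing plot:
# abline(v = opt$maximum, lty = 2)
```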

Trying to store estimates when using "replicate" in R

I am new to RStudio and have a question I was hoping one could help me with. I am using the "replicate" function to simulate a log-GARCH model, what I get is 100 replications and what I want to do is store the estimates such that I can calculate the average. How can I do that?
Code:
library(lgarch)
replicate(n=100,{
x <- lgarchSim(500, constant=0.5)
mymod <- lgarch(x)
}, simplify=FALSE )
This prints the fitted model for each replication. What I essentially want is to store "intercept", "arch1" and "garch1" from every replication in a list.
The result is a list of models. You can use lapply to extract the coefficients from each model, rbind them together into a matrix, and take the colMeans to get the average coefficients:
library(lgarch)
my_list <- replicate(n = 100, {
x <- lgarchSim(500, constant = 0.5)
mymod <- lgarch(x)
}, simplify = FALSE )
colMeans(do.call(rbind, lapply(my_list, coefficients)))
#> intercept arch1 garch1 Elnz2
#> 1.15879953 0.04965793 0.82525587 -1.26277188
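The same store-then-average pattern can be tried without the lgarch package. The sketch below swaps in a simple linear regression fitted with lm as a stand-in model; this is an assumption made purely so the example runs in base R, and the simulated data, fits, coef_mat and avg are illustrative names.

```r
set.seed(1)

# simulate and fit 100 times, keeping every fitted model in a list
fits <- replicate(n = 100, {
  x <- rnorm(50)
  y <- 1 + 2 * x + rnorm(50)
  lm(y ~ x)
}, simplify = FALSE)

# one row of coefficients per replication
coef_mat <- do.call(rbind, lapply(fits, coef))

# average of each coefficient across replications
avg <- colMeans(coef_mat)
avg
```

Keeping coef_mat around, rather than only the averages, also makes it easy to look at the spread of the estimates, for example with apply(coef_mat, 2, sd).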

GRG Nonlinear R

I want to transform my Excel Solver model into a model in R. I need to find 3 sets of coordinates which minimize the distance to 5 other given coordinates. I've written a function which builds a distance matrix and returns the minimal distance from each input point to the given coordinates. I want to minimize this function by changing the input; that is, I want to find the coordinates such that the sum of the minimal distances is minimized. I tried several methods to do so; see the code below. (Yes, my distance matrix function might be somewhat clunky, but this is because I had to reduce the input to one variable in order to run some algorithms such as nloptr, which gave warnings otherwise.) I've also seen some other questions (such as "GRG Non-Linear Least Squares (Optimization)"), but they did not improve the solution.
# First half of p holds the x coordinates, second half the y coordinates (yes, that's clunky)
p<-c(2,4,6,5,3,2) # initial points
x_given <- c(2,2.5,4,4,5)
y_given <- c(9,5,7,1,2)
f <- function(Coordinates){
# Predefining
Term_1 <- NULL
Term_2 <- NULL
x <- NULL
Distance <- NULL
min_prob <- NULL
l <- length(Coordinates)
l2 <- length(x_given)
half_length <- l/2
s <- l2*half_length
Distance_Matrix <- matrix(c(rep(1,s)), nrow=half_length)
# Creating the distance matrix
for (k in 1:half_length){
for (i in 1:l2){
Term_1[i] <- (Coordinates[k]-x_given[i])^2
Term_2[i] <- (Coordinates[k+half_length]-y_given[i])^2
Distance[i] <- sqrt(Term_1[i]+Term_2[i])
Distance_Matrix[k,i] <- Distance[i]
}
}
d <- Distance_Matrix
# Find the minimum in each row; that's what we want to obtain and minimize
for (l in 1:nrow(d)){
min_prob[l] <- min(d[l,])
}
som<-sum(min_prob)
return(som)
}
# Minimise
sol<-optim(p,f)
x<-sol$par[1:3]
y<-sol$par[4:6]
plot(x_given,y_given)
points(x,y,pch=19)
The solution, however, is clearly not optimal. I've tried to use the nloptr function, but I'm not sure which algorithm to use. Which algorithm can I use, or can I program another function which solves this problem? Thanks in advance (and sorry for the long, detailed question).
Look at the output of optim. It reached the iteration limit and had not yet converged.
> optim(p, f)
$`par`
[1] 2.501441 5.002441 5.003209 5.001237 1.995857 2.000265
$value
[1] 0.009927249
$counts
function gradient
501 NA
$convergence
[1] 1
$message
NULL
Although the result is not that different, you will need to increase the number of iterations to get convergence. If that is still unacceptable, then try different starting values.
> optim(p, f, control = list(maxit = 1000))
$`par`
[1] 2.502806 4.999866 5.000000 5.003009 1.999112 2.000000
$value
[1] 0.005012449
$counts
function gradient
755 NA
$convergence
[1] 0
$message
NULL
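The objective can also be written without the nested loops, and the convergence code can be checked in code rather than by reading the printout. The following is a sketch under assumptions: f_vec is a hypothetical vectorised rewrite of f (not the asker's function), and maxit = 5000 is an arbitrary, generous limit.

```r
x_given <- c(2, 2.5, 4, 4, 5)
y_given <- c(9, 5, 7, 1, 2)
p <- c(2, 4, 6, 5, 3, 2)  # first half x coordinates, second half y coordinates

# sum over the free points of the distance to the nearest given point
f_vec <- function(coords) {
  k  <- length(coords) / 2
  cx <- coords[seq_len(k)]
  cy <- coords[k + seq_len(k)]
  # k x 5 matrix of Euclidean distances between free and given points
  d <- sqrt(outer(cx, x_given, "-")^2 + outer(cy, y_given, "-")^2)
  sum(apply(d, 1, min))
}

sol <- optim(p, f_vec, control = list(maxit = 5000))

# convergence code 1 means the iteration limit was reached
if (sol$convergence != 0) {
  warning("optim did not converge; code ", sol$convergence)
}
sol$par
sol$value
```

By construction the objective is 0 whenever every free point sits exactly on one of the given points, so a value close to 0 means the search has essentially reached a minimum.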

Adding random numbers to the end of a vector

I am trying to repeatedly compute the median of a random vector, constructing a vector of all the medians. Why do I get a NULL result from my code below, in which I repeatedly compute the median and add it to the end of the vector m?
medainfunc<-function(n,mu,sigma,iterate){
m<-c()
for (i in itreate){
x<-rnorm(n,mu,sigma)
y<-median(x)
m<-c(m,y)
}
m
}
medianfunc(10,3,15,10000)
NULL
There are multiple typos in the OP's code: iterate vs. itreate, and the function is defined as medainfunc but called as medianfunc. In addition, only a single value is passed for 'iterate', so seq(iterate) is probably what we want inside the function. Otherwise the loop runs only once and we get a single value as output.
medianfunc<-function(n,mu,sigma,iterate){
m<-c()
for (i in seq(iterate)){
x<-rnorm(n,mu,sigma)
y<-median(x)
m<-c(m,y)
}
m
}
set.seed(25)
medianfunc(10,3,15, 5)
#[1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
This could be vectorized by getting the rnorm of 'n*iterate' values. Convert this to a matrix and use colMedians from library(matrixStats).
medianfunc1 <- function(n, mu, sigma, iterate){
m1 <- matrix(rnorm(n*iterate, mu, sigma), ncol=iterate)
library(matrixStats)
colMedians(m1)
}
set.seed(25)
medianfunc1(10,3,15, 5)
#[1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
Building a vector one-by-one is both inefficient (see the second circle of the R inferno) and code-intensive. Instead, you can repeat a randomized operation a certain number of times in a single line of code with the replicate function:
medianfunc <- function(n, mu, sigma, iterate) {
replicate(iterate, median(rnorm(n, mu, sigma)))
}
set.seed(25)
medianfunc(10,3,15, 5)
# [1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
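Two other base R ways to avoid growing the vector are sketched below: vapply, which states the type and length of each result up front, and a loop over a preallocated vector. The names medianfunc2 and medianfunc3 are illustrative only. seq_len(iterate) is used instead of seq(iterate) because it gives an empty sequence when iterate is 0.

```r
# vapply: each call must return exactly one numeric value
medianfunc2 <- function(n, mu, sigma, iterate) {
  vapply(seq_len(iterate),
         function(i) median(rnorm(n, mu, sigma)),
         numeric(1))
}

# explicit loop, but filling a preallocated vector instead of using c()
medianfunc3 <- function(n, mu, sigma, iterate) {
  m <- numeric(iterate)
  for (i in seq_len(iterate)) {
    m[i] <- median(rnorm(n, mu, sigma))
  }
  m
}

set.seed(25)
a <- medianfunc2(10, 3, 15, 5)
set.seed(25)
b <- medianfunc3(10, 3, 15, 5)
identical(a, b)
```

Both versions draw the random numbers in the same order, so with the same seed they return the same medians.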
I don't know if this is just the code you wrote in the post, but the function is defined as medainfunc and then you are calling medianfunc. That might be just one error. The other thing I noticed is that you should add return at the end of your function, like so (with the function name and the loop variable corrected as well):
medianfunc<-function(n,mu,sigma,iterate){
m<-c()
for (i in seq_len(iterate)){ # run the loop 'iterate' times
x<-rnorm(n,mu,sigma)
y<-median(x)
m<-c(m,y)
}
return(m)
}

Uniform kernel function returning NaN

I'm writing my own uniform kernel function like so:
uniform.kernel <- function(data, predict.at, iv.name, dv.name, bandwidth){
#Load in the DV/IV and turn them into vectors
iv <- data$iv.name
dv <- data$dv.name
#Given the point we're predicting,
#what kernel weights does each observation of the iv receive?
kernelvalue <- ifelse(abs((iv - predict.at)/bandwidth)<= 1, 0.5,0)
#Given these kernel values and the dv,
#what is our estimate of the conditional expectation?
conditional.expectation <-sum(kernelvalue*dv)/sum(kernelvalue)
#Return the expectation
return(conditional.expectation)
}
And then applying it to this data:
set.seed(101)
x <- seq(from=0, to=100, by=.1)
errors <- runif(min=.5, max=5000, n=length(x))
y <- x^2 - 3*x + errors^1.1
combo.frame <- cbind.data.frame(x,y)
Only, when I apply the function to the data (like below), I get "NaN".
uniform.kernel(combo.frame, 20, "x","y", 4)
However, when I just write out the steps within my function to the data set directly (without using the function), I get the correct answer. For example, I do the following and get the correct results:
kernelvalue <- ifelse(abs((combo.frame$x - 20)/4)<= 1, 0.5,0)
conditional.expectation <- sum(kernelvalue*combo.frame$y)/sum(kernelvalue)
Why am I getting NaN when I use the function?
You can't use the $ operator with character objects like that. Use the [ operator instead. Replace the first two lines in your function like this:
iv <- data[,iv.name]
dv <- data[,dv.name]
and it works as expected.
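Putting the fix into the whole function, a sketch could look like this. It uses data[[name]], which, like data[, name], looks a column up by the string stored in the variable; the variable names inside the function are shortened for illustration.

```r
uniform.kernel <- function(data, predict.at, iv.name, dv.name, bandwidth) {
  # look the columns up by the names held in iv.name / dv.name
  iv <- data[[iv.name]]
  dv <- data[[dv.name]]
  # uniform kernel weight: 0.5 inside the bandwidth window, 0 outside
  w <- ifelse(abs((iv - predict.at) / bandwidth) <= 1, 0.5, 0)
  # weighted average of the dv inside the window
  sum(w * dv) / sum(w)
}

set.seed(101)
x <- seq(from = 0, to = 100, by = .1)
errors <- runif(min = .5, max = 5000, n = length(x))
y <- x^2 - 3 * x + errors^1.1
combo.frame <- cbind.data.frame(x, y)

# data$iv.name looks for a column literally called "iv.name", which does not exist
is.null(combo.frame$iv.name)

res <- uniform.kernel(combo.frame, 20, "x", "y", 4)

# same calculation written out directly, for comparison
w <- ifelse(abs((combo.frame$x - 20) / 4) <= 1, 0.5, 0)
direct <- sum(w * combo.frame$y) / sum(w)
all.equal(res, direct)
```

With the original data$iv.name lookup, iv and dv are NULL, both sums are 0, and 0/0 gives the NaN from the question.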
