Optimizing a function in a loop - r

I am trying to deal with a likelihood function from an AR(2) model.
I have to maximize a function with respect to two variables, alpha1 and alpha2.
Since it is about time series, I have the variable x stored in a matrix for 200 time periods.
I have 10000 simulations of this. So I have the x values in a 200x10000 matrix.
I want to directly have a matrix of 2x10000 in which the results of the optimization for every simulation are stored. I have set a for loop and I have specified the function inside of it, but it is not working, i.e. when I run it it tells me:
Error in A <- -optim(c(1.5, 0.75), log_lik) : argument not valid for operator
I attach my code here. Before running the loop I created an object to hold the results and called it A:
for (i in 1:R) {
for (t in 3:N) {
log_lik <- function (α) {
α1 <- α[1]
α2 <- α[2]
L = -1/2*((N-2)*log(pi*2)+(N-2)*log(1)+sum((x[t,i]-c-α1*x[t-1,i]-α2*x[t-2,i])^2))
}
A <- -optim(c(1.5, 0.75), log_lik)$par
}
return(A)
}
Thanks a lot!!
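One possible way to structure this is sketched below; it is not a definitive fix. Two points are certain: optim minimizes by default, so it is easier to negate the log-likelihood than the result of optim, and the objective should be defined once, outside the loops, with the sum taken over t inside it. Everything else is an assumption made for illustration: the data are simulated with arima.sim, the intercept c is set to 0, the innovation variance is 1 (as log(1) in the posted formula suggests), and R is reduced from 10000 to 50 so the example runs quickly.

```r
# Sketch: conditional Gaussian likelihood of an AR(2) model, one fit per column.
# Assumptions (not from the post): simulated data, c0 = 0, sigma^2 = 1, R = 50.
set.seed(1)
N <- 200
R <- 50
x <- replicate(R, as.numeric(arima.sim(list(ar = c(0.5, 0.2)), n = N)))

# Negative log-likelihood for one series xi; optim() minimizes this.
neg_log_lik <- function(alpha, xi, c0 = 0) {
  n <- length(xi)
  resid <- xi[3:n] - c0 - alpha[1] * xi[2:(n - 1)] - alpha[2] * xi[1:(n - 2)]
  0.5 * ((n - 2) * log(2 * pi) + sum(resid^2))
}

# 2 x R matrix of estimates, filled one column per simulation.
A <- matrix(NA_real_, nrow = 2, ncol = R)
for (i in seq_len(R)) {
  A[, i] <- optim(c(0, 0), neg_log_lik, xi = x[, i])$par
}
rowMeans(A)  # average estimates of alpha1 and alpha2
```

If you prefer to keep a function that returns the log-likelihood itself, optim can maximize it when you pass control = list(fnscale = -1).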

Related

Grid Search in R for Nonparametric Quantile Regression

I use a library called "quantreg" in R and try to estimate a fully nonparametric quantile regression on a time series. To get statistically significant results I try lots of variables and smoothing parameter values (lambda), but this is exhausting and very time consuming. Therefore, I want to apply a grid search, but it is a little hard for me. I want to determine the best smoothing values, so I should construct a for loop, and I want that loop to try every combination. At the end I want to have the lambda values of the best model or models (those where all variables' p-values are < 0.05).
For example if I have three variables in my equation I've written something like that:
lambdas1<-rbind(1,2,3)
lambdas2<-rbind(1,2,3)
lambdas3<-rbind(1,2,3)
mylist<-list()
for (i in 1:3) {
for (j in 1:3) {
for (n in 1:3) {
f <-try(rqss(Y~qss(X1,lambda = lambdas1[i])+qss(X2,lambda = lambdas2[j])+qss(X3,lambda = lambdas3[n]), tau=0.05))
sf<-summary(f)
if( (sf[["qsstab"]]['X1','Pr(>F)']<0.05)&(sf[["qsstab"]]['X2','Pr(>F)']<0.05)&(sf[["qsstab"]]['X3','Pr(>F)']<0.05) ){
mylist[[i]]<-f$lambdas
}
}
}
}
How can I rearrange this code?
Is there any shortcut?
Any help will be appreciated.
Thank you in advance.
You can use base R's expand.grid to create a data.frame of all possible combinations and then use apply(grid, MARGIN = 1, ...) to loop over its rows. I also simplified the check: instead of testing each p-value separately, it uses all(p.vals < .05).
library(quantreg)
lambdas <- expand.grid(1:3, 1:3, 1:3)
check_lambdas <- function(lambdas){
  f <- try(rqss(Y ~ qss(X1, lambda = lambdas[1]) + qss(X2, lambda = lambdas[2]) + qss(X3, lambda = lambdas[3]), tau = 0.05))
  if (inherits(f, "try-error")) return(NULL)  # skip combinations where the fit fails
  if (all(summary(f)$qsstab[, 'Pr(>F)'] < 0.05)) f$lambdas else NULL
}
apply(lambdas, 1, check_lambdas)  # MARGIN = 1: one call per row, i.e. per combination
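To see the mechanics without needing quantreg or your data, here is a toy sketch of the same pattern. The score function is a hypothetical stand-in for the model fit, and the column names l1, l2, l3 are made up for the example; the point is only that expand.grid produces one row per combination and that apply with MARGIN = 1 visits each row once.

```r
# Toy grid search: 3 x 3 x 3 = 27 combinations, one per row.
grid <- expand.grid(l1 = 1:3, l2 = 1:3, l3 = 1:3)
nrow(grid)

# Hypothetical stand-in for "fit the model and score it".
score <- function(l) sum((l - c(2, 1, 3))^2)

# MARGIN = 1 passes each row (one combination) to score().
scores <- apply(grid, 1, score)

# The combination with the lowest score.
best <- grid[which.min(scores), ]
best
```

With MARGIN = 2 the function would instead be called once per column, i.e. once per variable rather than once per combination.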

How to create my own binomial coefficient function in R

I have created a factorial function which is then used to create a function for the binomial coefficient. My factorial function works but the binomial function does not.
I needed to create a factorial function which was then to be used to create a binomial coefficient function using R.
I was not allowed to use the base program's functions such as factorial nor choose.
I had to use for statements, logics etc. even though it is inefficient.
I had to print the factorial of zero and ten, then the binomial coefficient with n = 5, and k = 2
fact <- function(n) {
x <- 1
if(n == 0) {
print(1)
} else {
for(i in 1:n) {
x <- x*i
}
}
print(x)
}
fact(0)
fact(10)
bc <- function(n, k) {
y <- fact(n) / fact(n - k) * fact(k)
print(y)
}
bc(5, 2)
For the factorial function I got the correct answer
But for the binomial function I was way off.
If someone can show me where I have made the mistake I would be most appreciative.
There are quite a few issues here, both relating to basic R coding and coding in general. Let's go through some of them step-by-step:
Your function fact prints its result instead of returning it explicitly (and for n = 0 it prints twice: once inside the if branch and once at the end). The only reason a value comes back at all is that print is the last call; from help("print"):
‘print’ prints its argument and returns it invisibly (via ‘invisible(x)’).
It is cleaner to return the value and leave printing to the caller:
fact <- function(n) {
x <- 1
if (n > 0) {
for (i in 1:n) x <- x * i
}
return(x)
}
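I have tidied up your code by replacing the separate n == 0 branch with a single n > 0 guard. The guard is still needed because 1:0 counts down (it yields 1, 0), so an unguarded loop would return 0 for fact(0).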
Note that there is still room for improvement. For example, there are better ways to calculate the factorial of a number. Secondly, your function currently does not properly deal with negative numbers. Generally, the factorial is only defined for non-negative integers. So you can either change fact to return NA for negative numbers, or -- perhaps more interesting -- generalise the factorial function to the Gamma function to allow for any real (or even complex) number. Either way, I'll leave this up to you.
Similarly, your function bc prints y rather than returning it explicitly. More importantly, you need to be careful with brackets so that both (n - k)! and k! end up in the denominator: / and * have equal precedence and are evaluated left to right, so fact(n) / fact(n - k) * fact(k) computes (n! / (n - k)!) * k!. Both issues can be fixed by writing
bc <- function(n,k) return(fact(n)/(fact(n - k) * fact(k)))
To confirm, we calculate the coefficient for 5 choose 2:
bc(5, 2)
#10
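As a quick sanity check, the sketch below puts the original and the corrected expression side by side and compares the corrected one with base R's choose. The name bc_wrong is made up for the example, and choose and gamma are used for verification only, since the exercise does not allow them in the solution itself.

```r
fact <- function(n) {
  x <- 1
  if (n > 0) {
    for (i in 1:n) x <- x * i
  }
  x
}

# Original expression: evaluated left to right as (n! / (n - k)!) * k!
bc_wrong <- function(n, k) fact(n) / fact(n - k) * fact(k)

# Corrected expression: both factorials in the denominator
bc <- function(n, k) fact(n) / (fact(n - k) * fact(k))

bc_wrong(5, 2)  # 120 / 6 * 2 = 40
bc(5, 2)        # 120 / (6 * 2) = 10

# Compare against base R for all k with n = 10 (verification only)
all.equal(sapply(0:10, function(k) bc(10, k)), choose(10, 0:10))

# The Gamma function generalises the factorial: gamma(n + 1) equals n!
all.equal(fact(10), gamma(11))
```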

Using a loop in an ODE to graphically compare different parameters R

I'm using the deSolve package to plot a couple differential equations (read if interested http://www.maa.org/press/periodicals/loci/joma/the-sir-model-for-spread-of-disease-the-differential-equation-model).
My eventual goal is to create an iterative function or process (for loop) to plot how changes in certain parameters (beta and gamma) will affect the solution. The preferred output would be a list that contains the ode solution for each specified value of beta in the loop. I'm running into issues integrating a loop into the setup that the deSolve package requires for the ode function.
In the code below, I am trying to plot how the range of values (1 to 2 by increments of 0.1) in parameter beta will affect the plot of the differential equations.
for(k in seq(1,2,by=0.1)){ #range of values for beta
init <- c(S=1-1e-6, I=1e-6, R=0) #initial conditions for odes
time <- seq(0,80,by=1) #time period
parameters <- c(beta=k, gamma=0.15) #parameters in ode
SIR <- function(time,state,parameters){ #function containing equaations
with(as.list(c(state,parameters)),{
dS <- -beta*S*I
dI <- beta*S*I-gamma*I
dR <- gamma*I
return(list(c(dS,dI,dR)))
})
}
ode(y=init,times=time,func=SIR()[beta],parms=parameters[k])}
}
The first error I'm getting states that the argument parameters in the SIR function is missing
Error in as.list(c(init, parameters)) : argument "parameters" is missing, with no default
I don't understand why this error is being reported when I've assigned parameters in the lines previous.
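The error arises because func=SIR()[beta] calls SIR with no arguments; ode expects the function itself (func=SIR) and the whole named parameter vector in parms. You might as well define your gradient function (and the other non-changing elements) outside the loop: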
SIR <- function(time,state,parameters) {
with(as.list(c(state,parameters)),{
dS <- -beta*S*I
dI <- beta*S*I-gamma*I
dR <- gamma*I
return(list(c(dS,dI,dR)))
})
}
init <- c(S=1-1e-6, I=1e-6, R=0) #initial conditions for odes
time <- seq(0,80,by=1) #time period
Now define the vector of values to try (not necessary but convenient):
betavec <- seq(1,2,by=0.1)
and define a list to hold the results:
res <- vector("list", length(betavec))
library(deSolve)
for (k in seq_along(betavec)){ #range of values for beta
res[[k]] <- ode(y=init,times=time,func=SIR,
parms=c(beta=betavec[k], gamma=0.15))
}
Now you have a list, each element of which contains the results from one run. You can sapply or lapply over this list, e.g. to get a matrix of the last states from each run:
t(sapply(res,tail,1))
Or if you want the results as one long data frame ...
names(res) <- betavec ## to get beta value incorporated in results
dd <- dplyr::bind_rows(lapply(res,as.data.frame),.id="beta")
dd$beta <- as.numeric(dd$beta)
do.call(rbind,...) would work nearly as well as bind_rows(), but bind_rows's .id argument is convenient for adding the beta values to each data frame. You could also leave the results as a list and loop over them while plotting with separate lines() calls, or (e.g.) bind just the infective columns together and use matplot() to plot them all at the same time. This is just a matter of style and idiom.
library(ggplot2); theme_set(theme_bw())
library(viridis)
ggplot(dd,aes(x=time,y=I,colour=beta))+
geom_line(aes(group=beta))+
scale_color_viridis()+
scale_y_log10()
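For comparison, here is a rough sketch of the matplot() alternative mentioned above: bind just the infective columns together and plot them in one call. It repeats the definitions so that it runs on its own, and uses lapply in place of the for loop. The names I_mat and cols are made up for the example, and hcl.colors assumes R 3.6 or later.

```r
library(deSolve)

SIR <- function(time, state, parameters) {
  with(as.list(c(state, parameters)), {
    dS <- -beta * S * I
    dI <- beta * S * I - gamma * I
    dR <- gamma * I
    list(c(dS, dI, dR))
  })
}
init <- c(S = 1 - 1e-6, I = 1e-6, R = 0)  # initial conditions
time <- seq(0, 80, by = 1)                # time period
betavec <- seq(1, 2, by = 0.1)            # values of beta to try

# One ode() solution per value of beta, collected in a list.
res <- lapply(betavec, function(b) {
  ode(y = init, times = time, func = SIR, parms = c(beta = b, gamma = 0.15))
})

# One column of infectives per beta value: a length(time) x length(betavec) matrix.
I_mat <- sapply(res, function(r) r[, "I"])

cols <- hcl.colors(length(betavec))
matplot(time, I_mat, type = "l", lty = 1, col = cols, xlab = "time", ylab = "I")
legend("topright", legend = betavec, col = cols, lty = 1, title = "beta", cex = 0.7)
```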

Adding random numbers to the end of a vector

I am trying to repeatedly compute the median of a random vector, constructing a vector of all the medians. Why do I get a NULL result from my code below, in which I repeatedly compute the median and add it to the end of the vector m?
medainfunc<-function(n,mu,sigma,iterate){
m<-c()
for (i in itreate){
x<-rnorm(n,mu,sigma)
y<-median(x)
m<-c(m,y)
}
m
}
medianfunc(10,3,15,10000)
NULL
There are multiple typos in the OP's code: iterate vs. itreate, and the function is defined as medainfunc but called as medianfunc. In addition, only a single value is supplied for 'iterate', so we probably want seq(iterate) (or seq_len(iterate)) inside the function. Otherwise, the loop runs once and we get a single value as output.
medianfunc<-function(n,mu,sigma,iterate){
m<-c()
for (i in seq(iterate)){
x<-rnorm(n,mu,sigma)
y<-median(x)
m<-c(m,y)
}
m
}
set.seed(25)
medianfunc(10,3,15, 5)
#[1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
This could be vectorized by getting the rnorm of 'n*iterate' values. Convert this to a matrix and use colMedians from library(matrixStats).
medianfunc1 <- function(n, mu, sigma, iterate){
m1 <- matrix(rnorm(n*iterate, mu, sigma), ncol=iterate)
library(matrixStats)
colMedians(m1)
}
set.seed(25)
medianfunc1(10,3,15, 5)
#[1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
Building a vector one-by-one is both inefficient (see the second circle of the R inferno) and code-intensive. Instead, you can repeat a randomized operation a certain number of times in a single line of code with the replicate function:
medianfunc <- function(n, mu, sigma, iterate) {
replicate(iterate, median(rnorm(n, mu, sigma)))
}
set.seed(25)
medianfunc(10,3,15, 5)
# [1] 0.9770646 -6.4852741 4.6768291 -6.4167869 5.3176253
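If you do want to keep an explicit loop, a common middle ground is to preallocate the result vector and fill it by index, or to use vapply, which also checks that each result is a single number. The sketch below is only an illustration; the function names medianfunc_prealloc and medianfunc_vapply are made up here. With the same seed, both draw the random numbers in the same order, so they should agree with each other and with the replicate version.

```r
# Preallocate the full result vector, then fill it by index.
medianfunc_prealloc <- function(n, mu, sigma, iterate) {
  m <- numeric(iterate)
  for (i in seq_len(iterate)) {
    m[i] <- median(rnorm(n, mu, sigma))
  }
  m
}

# Same idea with vapply, which enforces one numeric value per iteration.
medianfunc_vapply <- function(n, mu, sigma, iterate) {
  vapply(seq_len(iterate), function(i) median(rnorm(n, mu, sigma)), numeric(1))
}

set.seed(25)
a <- medianfunc_prealloc(10, 3, 15, 5)
set.seed(25)
b <- medianfunc_vapply(10, 3, 15, 5)
set.seed(25)
r <- replicate(5, median(rnorm(10, 3, 15)))
identical(a, b)
identical(a, r)
```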
I don't know if this is just a typo in the post, but the function is defined as medainfunc and then you call medianfunc. The loop also refers to itreate rather than iterate. Once those are fixed, you can optionally make the return value explicit by adding return at the end of your function, like so:
medianfunc <- function(n, mu, sigma, iterate){
  m <- c()
  for (i in seq_len(iterate)){
    x <- rnorm(n, mu, sigma)
    y <- median(x)
    m <- c(m, y)
  }
  return(m)
}

optimization using "nlminb"

I am fitting a location model, using non-parametric smoothing to estimate the parameters. One of the smoothing parameters, lambda, has to be optimized, so I decided to use the nlminb function.
However, my program returns the same $par value even after 150 iterations and 200 function evaluations (the defaults): it simply returns the start value (0.000001) as $par. I think there must be something wrong with my program.
My program looks like this (note: w is the parameter that I want to optimize, and LOO stands for leave-one-out):
BEGIN
Myfunc <- function(w, n1, n2, v1, v2, g)
{ ## open loop for main function
## DATA generation
# generate data from group 1 and 2
# for each group: discretise the continuous to binary
# newdata <- combine the groups 1 and 2
## MODEL construction
countError <- 0
n <- nrow(newdata)
for (k in 1:n)
{# open loop for leave-one-out
# construct model based on n-1 object using smoothing method
# classify omitted object
countError <- countError + countE
} # close loop for LOO process
Error <- countError / n # error rate counted from LOO procedure
return(Error) # The Average ERROR Rate from LOO procedure
} # close loop for Myfunc
library(stats)
nlminb(start=0.000001, Myfunc, lower=0.000001, upper=0.999999,
control=list(eval.max=100, iter.max=100))
END
Could someone help me?
Your guidance is highly appreciated and really needed.
Hashibah,
Statistic PhD Student
In your question, you provide nlminb with a univariate starting value. If you are doing univariate optimisation, it is probably worth looking at optimize. If your function is multivariate, then you need to call nlminb slightly differently.
You need to define the objective function so that the parameters to optimize over are passed as a vector in the first argument. Other inputs to the objective function should be provided as subsequent arguments.
For example (modified from the nlminb help page):
X <- rnbinom(100, mu = 10, size = 10)
hdev <- function(par, x) {
-sum(dnbinom(x, mu = par[1], size = par[2], log = TRUE))
}
nlminb(start = c(9, 12), hdev, x = X)
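For the univariate case, a minimal sketch with optimize might look like the following. The objective loo_error is a hypothetical smooth stand-in for your leave-one-out error rate (its minimum is placed at 0.3 purely for illustration); the interval reuses the bounds from your nlminb call.

```r
# Hypothetical smooth stand-in for the leave-one-out error rate as a function of w.
loo_error <- function(w, target = 0.3) (w - target)^2 + 0.05

# Univariate search over the interval; no starting value is needed.
opt <- optimize(loo_error, interval = c(0.000001, 0.999999))
opt$minimum    # location of the minimum
opt$objective  # value of the objective there

# The same problem with nlminb, started away from the boundary.
fit <- nlminb(start = 0.5, objective = loo_error,
              lower = 0.000001, upper = 0.999999)
fit$par
```

One thing that may be worth checking in your own function: if the error rate is a count divided by n, it changes in steps as w varies, and if the data are regenerated on every call it is also noisy. A derivative-based routine such as nlminb can then see no improvement near the start value and stop there, in which case evaluating the function on a grid of w values, with the data generated once outside the objective, might be more informative.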
