Optimizing a Log-Likelihood Function in R with optim

I have a log-likelihood function that I would like to optimize, and I understand I can do so with optim() in R. My function requires two parameters: a vector of probabilities (of length N) and a symmetric matrix of size N*N, of which only N-choose-2 values matter because of the symmetry (right now N=5).
When I try using optim() I receive the following error:
Error in optim(params, L) : (list) object cannot be coerced to type 'double'
Why do I receive this error and how can I make this work?
(If there is a better solution in Matlab or Python, references or suggestions for functions in these languages are welcome too)
Here is the code:
numerator <- function(P, Gamma, y, U, N) {
  expr = 1
  for (i in 1:N-1) {
    for ( j in i+1:N) {
      if ((y[i] == y[j]) & (y[i] == 1)) {
        expr = expr*P[i]*P[j]*exp(Gamma[i,j])
      }
      if ((y[i] != y[j]) & (y[i] == 1)) {
        expr = expr*P[i]*(1 - P[j])
      }
      if ((y[i] != y[j]) & (y[i] == 0)) {
        expr = expr*(1 - P[i])*P[j]
      }
      if ((y[i] == y[j]) & (y[i] == 0)) {
        expr = expr*(1 - P[i]*P[j]*exp(Gamma[i,j]) - P[i]*(1 - P[j]) - (1 - P[i])*P[j])
      }
    }
  }
  return(expr)
}
denominator <- function(params, y, U, N) {
  P <- params$probs
  val <- 1
  for (i in 1:N-1) {
    val <- val*(y[i]*P[i]^(N-3) + (1-y[i])*(1 - P[i])^(N-3))
  }
  val <- val * y%*%P + (1 - y)%*%(1 - P)
  return(val)
}
L <- function(params, y, U, N) {
  P <- params$probs
  Gamma <- params[,2:(N+1)]
  n <- log(numerator(P, Gamma, y, U, N))
  d <- log(denominator(P, y, U, N))
  l <- n-d
  return(l)
}
y <- readRDS(file="purchase_records_df.rds")
N <- ncol(y)
params <- data.frame('probs'=rep(0.001, N), 'gamma'=matrix(0,nrow=N,ncol=N))
optim(params, L)
Briefly, the setting: y is a vector of purchases, and we want to use our purchase data to find the underlying probabilities.
Thank you very much!
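The error in the question arises because optim() expects its first argument to be a numeric vector of starting values, while a data.frame is a list. The sketch below shows one possible way to flatten the probabilities and the N-choose-2 free entries of Gamma into a single vector and to pass extra data through optim's `...` argument. The helper names pack, unpack, and toy_loglik are hypothetical, the objective is a placeholder rather than the likelihood above, the data are simulated, and the box constraints on the probabilities are an assumption.

```r
# Hypothetical helper names (pack, unpack, toy_loglik); only optim() is base R.
N <- 5
n_gamma <- choose(N, 2)

# Flatten probabilities and the upper triangle of Gamma into one numeric vector.
pack <- function(P, Gamma) c(P, Gamma[upper.tri(Gamma)])

# Rebuild P and a symmetric Gamma (zero diagonal) from that vector.
unpack <- function(par, N) {
  Gamma <- matrix(0, N, N)
  Gamma[upper.tri(Gamma)] <- par[-seq_len(N)]
  list(P = par[seq_len(N)], Gamma = Gamma + t(Gamma))
}

# Placeholder objective, NOT the likelihood from the question: independent
# Bernoulli log-likelihood for P plus a ridge penalty that pulls Gamma to 0.
toy_loglik <- function(par, y, N) {
  p <- unpack(par, N)
  sum(y %*% log(p$P) + (1 - y) %*% log(1 - p$P)) -
    sum(p$Gamma[upper.tri(p$Gamma)]^2)
}

# Simulated 0/1 purchase matrix standing in for purchase_records_df.rds.
set.seed(42)
y <- matrix(rbinom(200 * N, 1, rep(c(0.2, 0.3, 0.4, 0.5, 0.6), each = 200)),
            nrow = 200, ncol = N)

start <- pack(rep(0.5, N), matrix(0.1, N, N))
fit <- optim(par = start, fn = toy_loglik, y = y, N = N,
             method = "L-BFGS-B",
             lower = c(rep(1e-3, N), rep(-Inf, n_gamma)),
             upper = c(rep(1 - 1e-3, N), rep(Inf, n_gamma)),
             control = list(fnscale = -1))  # fnscale = -1 makes optim maximize
est <- unpack(fit$par, N)
```

Here est$P should land close to colMeans(y) and est$Gamma close to zero, which is all this toy objective can recover. Note also that in R `1:N-1` means `(1:N) - 1` and `i+1:N` means `i + (1:N)`, so the loop bounds in the question probably need parentheses before the real likelihood can be plugged in.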

Related

Unexpected symbol error when trying to run a function

I'm trying to run the function below, but I get:
Error: unexpected symbol in:
"MMod<-function (Pmat, cycle, n, Init, halfcycle=F, measure, discount=.03)
{cumul<-0 i"
MMod<-function (Pmat, cycle, n, Init, halfcycle=F, measure, discount=.03)
{cumul<-0 i<-1 istate<-Init m<-measure*cycle
if (halfcycle) {cumul<-0.5*(Init%*%m)} while (i <= n)
{ istate<-istate%*%Pmat imeasure<-istate%*%m cumul<-cumul+imeasure
#print(paste(c(i,round(istate,2),cumul))) i<-i+1
m<-m*(1-discount*cycle)
}
if (halfcycle) {cumul<-cumul - 0.5*imeasure} return(cumul)}
Then I get more errors, but I believe fixing this one is the key to running it.
R (among many programming languages) is very particular about this: different expressions must be separated either by a newline or a semi-colon. Try this:
MMod <- function(Pmat, cycle, n, Init, halfcycle = FALSE, measure, discount = 0.03) {
  cumul <- 0
  i <- 1
  istate <- Init
  m <- measure * cycle
  if (halfcycle) {
    cumul <- 0.5 * (Init %*% m)
  }
  while (i <= n) {
    istate <- istate %*% Pmat
    imeasure <- istate %*% m
    cumul <- cumul + imeasure
    # print(paste(c(i, round(istate, 2), cumul)))
    i <- i + 1  # must sit on its own line, outside the comment, or the loop never ends
    m <- m * (1 - discount * cycle)
  }
  if (halfcycle) {
    cumul <- cumul - 0.5 * imeasure
  }
  return(cumul)
}
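As a small illustration of the separator rule, the sketch below puts two statements on one line with a semicolon and then asks the parser to read the same text without it. The variable names are taken from the function above; wrapping parse() in try() is just a way to show the failure without stopping the script.

```r
# Two statements on one line need a semicolon between them.
cumul <- 0; i <- 1

# Without a separator the parser gives up at the second name
# ("unexpected symbol"), which try() captures here.
res <- try(parse(text = "cumul <- 0 i <- 1"), silent = TRUE)
failed <- inherits(res, "try-error")
```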

How to use the output of an r function in another function?

I want to create a script that calculates probabilities for a role-playing game.
I'm new to programming and I'm stuck with return values and nested functions. What I want is to use the values returned by the first function in the next one.
I have two functions, dice(k, n) and fight(a, b) (for the example, the functions are only partly written):
dice <- function (k, n) {
  if (k > 3 && n > 2){
    a <- 3
    b <- 2
    attack <- sample(1:6, a)
    deff <- sample(1:6, b)
  }
  return(c(attack, deff))
}
So I want to use the vectors attack and deff in the next function:
fight <- function(a, b){
  if (a == 3 && b == 2){
    if(sort(attack,T)[1] > sort(deff,T)[1]){
      n <- n - 1}
    if (sort(attack,T)[1] <= sort(deff,T)[1]) {
      k <- k - 1}
    if (sort(attack,T)[2] > sort(deff,T)[2]) {
      n <- n - 1}
    if (sort(attack,T)[2]<= sort(deff,T)[2]){
      k <- k - 1}
  }
  return(c(k, n))
}
But this gives me the following error:
Error in sort(attack, T) : object 'attack' not found
Any ideas? Thanks!
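The error appears because attack and deff exist only inside dice(); once that function returns, fight() cannot see them. One possible approach, sketched below, is to return both vectors in a named list and pass that list on as an argument. The names roll and resolve are hypothetical, the rule that ties count against the attacker is read off the comparisons in the question, and replace = TRUE is an assumption (real dice can show the same face twice).

```r
# Hypothetical sketch: roll() returns both dice vectors in a named list.
roll <- function(a = 3, b = 2) {
  list(attack = sample(1:6, a, replace = TRUE),  # assumption: dice can repeat
       deff   = sample(1:6, b, replace = TRUE))
}

# resolve() receives the troop counts and the rolls explicitly as arguments,
# so nothing has to be looked up outside the function.
resolve <- function(k, n, rolls) {
  att <- sort(rolls$attack, decreasing = TRUE)
  def <- sort(rolls$deff, decreasing = TRUE)
  # Compare the highest dice pairwise; a tie counts against the attacker.
  for (i in seq_len(min(length(att), length(def)))) {
    if (att[i] > def[i]) {
      n <- n - 1
    } else {
      k <- k - 1
    }
  }
  c(k = k, n = n)
}

rolls <- roll()
result <- resolve(k = 5, n = 4, rolls = rolls)
```

Passing the list explicitly keeps each function self-contained, which also makes it easy to test resolve() with fixed dice values.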

Nested for loops with linked indexes

I have a simple question about nested for loops in R.
I need to implement this kind of algorithm:
N = 180; k = 1 ... N; l = 1...k; alpha = 1
So I tried this nested for loop, but it was a disaster.
N <- 180
nu <- 0
for (k in 1:N) {
  for (l in 1:k) {
    nu<- nu + 1/(N-l+1)
  }
  E<- N*nu
  print(E)
}
I got 180 E values, but all of them are wrong.
For example, for k = 10 I expect to get nu = 0.057 and a corresponding E[10] = 10.26, whereas the script returns E[10] = 55.94.
I really cannot figure out why.
Thank you.
You keep accumulating into nu across iterations of the outer loop. To fix this, reset nu to zero at the start of each outer iteration, before computing the new E.
This returns the expected result:
N <- 180
nu <- 0
for (k in 1:N) {
  nu <- 0
  for (l in 1:k) {
    nu <- nu + 1/(N-l+1)
  }
  E<- N*nu
  print(E)
}
And, below is a somewhat better solution using the sum function instead of looping.
### Functions
vk <- function(k, alpha, N) {
  return(sum(1 / (N - 1:k + 1)^alpha))
}
E <- function(N, k) {
  return(N * vk(k, 1, N))
}
N <- 180
k <- N
result.vec <- rep(NA, k)
for (i in 1:k) {
  result.vec[i] <- E(N, i)
}
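Because each nu for a given k is just a running total of the same terms, the remaining loop can also be replaced by cumsum(). This is a minimal sketch under the same definitions (N = 180, alpha = 1); the vector names nu_all and E_all are hypothetical.

```r
N <- 180
alpha <- 1

# Term l of the sum is 1 / (N - l + 1)^alpha; cumsum() gives nu for every k at once.
nu_all <- cumsum(1 / (N - seq_len(N) + 1)^alpha)
E_all <- N * nu_all

round(E_all[10], 2)  # 10.26, the value expected in the question
```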

NaN values while computing the probability in binomial distribution

I would like to integrate the following function
riskFunction <- function(theta, n, r, s)
{
  risk <- 0
  for (j in 1:n)
  {
    risk <- risk + abs(theta - r * j - s) * dbinom(j, n, theta)
  }
  return(risk)
}
using the trapezoidal method on the interval [0, 1]. Here is my code:
trapeizodalMethod <- function(a, b, m, n, r, s)
{
  intValue <- 0
  h <- (b - a)/m
  for (i in 0:m-1)
  {
    intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
  }
  return(intValue)
}
After calling trapeizodalMethod
trapeizodalMethod(a=0, b=1, m=100, n=100, r=0.01, s=0)
more than 50 warnings occur: In dbinom(j, 100, theta) : NaN produced.
I have no idea what might have gone wrong. I would appreciate any hints or tips.
That warning arises when dbinom(x, size, prob, log = FALSE) has prob outside [0, 1]. In your case, theta = -0.01 occurs because the loop is not running as you expected.
The binary operator : has higher precedence than binary -. So for example 1:5-1 is evaluated as (1:5) - 1, not 1:(5 - 1). You want
trapeizodalMethod <- function(a, b, m, n, r, s)
{
  intValue <- 0
  h <- (b - a)/m
  for (i in 0:(m-1)) {
  #         ^^^^^^^
    intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
  }
  return(intValue)
}
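The precedence rule is easy to check at the console. The short sketch below uses a small m chosen purely for illustration and compares the two spellings; seq_len(m) - 1 is shown as one alternative that avoids the ambiguity.

```r
m <- 4

wrong <- 0:m-1           # parsed as (0:m) - 1, so it starts at -1
right <- 0:(m-1)         # the intended 0, 1, ..., m - 1
alt   <- seq_len(m) - 1  # same values as `right`, no precedence trap

wrong  # -1 0 1 2 3
right  # 0 1 2 3
```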

Mixture modeling - trouble with infinite values from exp() and log()

I'm writing a function for Gaussian mixture models with spherical covariance structures, i.e. $\Sigma_k = \sigma_k^2 I$. This particular function is similar to the model with identifier VII in the mclust package.
http://en.wikipedia.org/wiki/Mixture_model
Anyway, the problem I'm having is running into infinite values for the weight matrix. Definition: let W be an n x m matrix with n rows (the number of observations) and m columns (the number of mixtures). Each element of W (i.e. w_ij) can essentially be defined as a specific form of:
w_{im} = \frac{a / b * \exp(c)}{\sum_{i=1}^{m} [a_i / b_i * \exp(c_i)]}
Computing this numerically is giving me infinite values. So I'm trying to use the log identity log(x+y) = log(x) + log(1 + y/x). But the issue is that it's not as simple as log(x+y) but rather log(\sum_{i=1}^{m} [a_i / b_i * exp(c_i)]).
Here's some code. Define:
n_im = a / b * exp(c);
d_.m = \sum_{i=1}^{m} [a_i / b_i * exp(c_i)]; and
c_mat[i,j] as the value of the exponent for the [i,j]th term.
n_mat[, i] <- log(a[i]) - log(b[i]) - c[,i] # numerator of w_im
internal_vec1[i] <- (a[i] * b[1])/ (a[1] * b[i]) # an internal for the step below
c_mat2 <- cbind(rep(1, n), c_mat[,1] - c_mat[,-1]) # since e^a / e^b = e^(a-b)
for (i in 1:n) {
  d_vec[i] <- n_mat[i,1] + log(sum(internal_vec1 * exp(c_mat2[i, ])))
} ## still getting infinite values
I'm trying to define the problem as briefly as possible. The entire function is obviously much larger than this. But since the problem I'm running into deals specifically with infinite (and 1/infinity) values, I'm hoping this snippet is sufficient. Anyone with a coding trick here?
Here is the solution!! (I've spent way too damn long on this)
The first function log_plus() solves the simple problem where you want log(\sum_{i=1}^n x_i).
The second function log_plus2() solves the more complicated problem described above where you want log(\sum_{i=1}^n [a_i / b_i * exp(c_i)]).
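log_plus <- function(xvec) {
  m <- length(xvec)
  x <- log(xvec[1])
  # seq_len(m)[-1] is 2, ..., m and is empty when m == 1
  for (j in seq_len(m)[-1]) {
    # parentheses are needed: 1:j-1 would be parsed as (1:j) - 1
    sum_j <- sum(xvec[1:(j - 1)])
    x <- x + log(1 + xvec[j] / sum_j)
  }
  return(x)
}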
log_plus2 <- function(a, b, c) {
  # assumes intended input of form sum(a/b * e^c)
  if ((length(a) != length(b)) || (length(a) != length(c))) {
    stop("Input equal length vectors")
  }
  if (!(all(c > 0) || all(c < 0))) {
    stop("All values of c must be either > 0 or < 0.")
  }
  m <- length(a)
  # initialize log sum
  x <- log(a[1]) - log(b[1]) + c[1]
  # aggregate / loop log sum
  for (j in seq_len(m)[-1]) {
    idx <- 1:(j - 1)
    b2 <- b[idx]
    # shift exponents by the value closest to zero among the first j - 1
    c_min <- if (all(c > 0)) min(c[idx]) else max(c[idx])
    # build denominator: accumulate over all previous terms
    d1 <- 0
    for (i in idx) {
      d1 <- d1 + a[i] * prod(b2[-i]) * exp(c[i] - c_min)
    }
    den <- b[j] * d1
    num <- a[j] * prod(b2) * exp(c[j] - c_min)
    x <- x + log(1 + num / den)
  }
  return(x)
}
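For comparison, a different and more common technique for this kind of sum is the log-sum-exp trick: compute the log of every term, subtract the largest one before exponentiating, and add it back afterwards. The sketch below is not the method used in the answer above; log_sum_terms is a hypothetical name, and a and b are assumed to be strictly positive.

```r
# log(sum(a / b * exp(c))) evaluated on the log scale (log-sum-exp trick).
# Hypothetical helper; assumes all a > 0 and all b > 0.
log_sum_terms <- function(a, b, c) {
  stopifnot(length(a) == length(b), length(a) == length(c))
  lt <- log(a) - log(b) + c    # log of each term a_i / b_i * exp(c_i)
  M <- max(lt)                 # largest log-term, factored out of the sum
  M + log(sum(exp(lt - M)))    # every exp() argument is now <= 0
}

a <- c(1, 2, 3)
b <- c(2, 2, 2)
c_big <- c(1000, 1001, 999)

naive  <- log(sum(a / b * exp(c_big)))  # exp() overflows, so this is Inf
stable <- log_sum_terms(a, b, c_big)    # stays finite
```

With this helper the log of a weight is the log of its own term minus the log of the sum, log(a[j]) - log(b[j]) + c[j] - log_sum_terms(a, b, c), so the weights can be exponentiated only at the very end.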
