NaN values while computing the probability in binomial distribution - r

I would like to integrate the following function
riskFunction <- function(theta, n, r, s)
{
risk <- 0
for (j in 1:n)
{
risk <- risk + abs(theta - r * j - s) * dbinom(j, n, theta)
}
return(risk)
}
using the trapezoidal method on the interval [0, 1]. Here is my code:
trapeizodalMethod <- function(a, b, m, n, r, s)
{
intValue <- 0
h <- (b - a)/m
for (i in 0:m-1)
{
intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
}
return(intValue)
}
After calling trapeizodalMethod
trapeizodalMethod(a=0, b=1, m=100, n=100, r=0.01, s=0)
I get more than 50 warnings: In dbinom(j, 100, theta) : NaNs produced.
I have no idea what might have gone wrong. I would appreciate any hints or tips.

That warning arises when dbinom(x, size, prob, log = FALSE) has prob outside [0, 1]. In your case, theta = -0.01 occurs because the loop is not running as you expected: it starts at i = -1 instead of i = 0, so the first node is a + i * h = -0.01.
The binary operator : has higher precedence than binary -. So for example 1:5-1 is evaluated as (1:5) - 1, not 1:(5 - 1). You want
trapeizodalMethod <- function(a, b, m, n, r, s)
{
intValue <- 0
h <- (b - a)/m
for (i in 0:(m-1)) {
# ^^^^^
intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
}
return(intValue)
}
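The precedence rule is easy to check in isolation. The snippet below is a minimal sketch with a small m (the names wrong, right and bad are only for illustration); it shows that the unparenthesized range starts at -1 and that a negative prob is what makes dbinom() return NaN.

```r
m <- 4
h <- 1 / m

wrong <- 0:m-1      # parsed as (0:m) - 1
right <- 0:(m-1)    # the intended index range

print(wrong)        # -1 0 1 2 3
print(right)        # 0 1 2 3

# The first node of the "wrong" loop lies outside [0, 1] ...
print(wrong[1] * h) # -0.25

# ... and dbinom() with prob outside [0, 1] yields NaN (plus a warning)
bad <- suppressWarnings(dbinom(1, 10, wrong[1] * h))
print(bad)          # NaN
```

Writing the loop as for (i in seq_len(m) - 1) is another way to avoid the trap, since no : operator is left next to the subtraction.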

Related

Linear optimization with R

I'm trying to maximize the following function over x[1], x[2]:
(a - 1) * x[1] * c + (a - 1) * x[2] * b * d
Where a, b, c, d are known positive constants and
0 < x[1] < 1, 0 < x[2] < 1,
x[1] + x[2] = 1
Using NlcOptim, I did:
solver_1 <- function(a, b, c, d){
obj = function(x){
return((a - 1) * x[1] * c + (a - 1) * x[2] * b * d )
}
con = function(x){
f = NULL
f = rbind(f, x[1] + x[2] - 1)
return(list(ceq = f, c = NULL))
}
x0 = c(1, 0)
solnl(x0, objfun = obj, confun = con)
}
solver_1(1.2, 5.2, 0.8, 0.1)
But it gives me:
Error in if (norm(H, "I") == 0) { : missing value where TRUE/FALSE needed
Does anyone know what I'm doing wrong?
Using an optimization package is massive overkill.
Either by following up on the comment of @Roland and doing some elementary calculus, or by using the corner point theorem of linear programming, your objective function is optimized when x[1] = 0 and x[2] = 1 or vice versa. Thus you only have to evaluate two constant expressions in a, b, c, d and pick the larger of the two:
max(c*(a-1), b*d*(a-1))
For any a greater than 1, which is larger is determined by whether or not c > b*d; for a less than 1 the comparison is reversed.
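As a quick check of the corner argument, here is a minimal sketch that evaluates the objective at the two end points for the values used in the question's call. The helper name corner_solution is hypothetical, and the bounds are treated as closed, as the answer does.

```r
corner_solution <- function(a, b, c, d) {
  at_x1 <- (a - 1) * c       # objective at x = (1, 0)
  at_x2 <- (a - 1) * b * d   # objective at x = (0, 1)
  if (at_x1 >= at_x2) {
    list(x1 = 1, x2 = 0, value = at_x1)
  } else {
    list(x1 = 0, x2 = 1, value = at_x2)
  }
}

# Same constants as solver_1(1.2, 5.2, 0.8, 0.1)
res <- corner_solution(1.2, 5.2, 0.8, 0.1)
print(res)   # x1 = 1, x2 = 0, value about 0.16 (versus about 0.104 at the other corner)
```

Here c = 0.8 exceeds b*d = 0.52 and a > 1, so the corner x = (1, 0) wins.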

Optimizing Log-Likelihood Function in R with optim

I have a log-likelihood function I would like to optimize and understood I could do so with optim() in R. The parameters my function requires are a vector of probabilities (of length N) as well as a symmetric matrix of size N*N (where only N-choose-2 (right now N=5) values matter, due to the symmetry).
When I try using optim() I receive the following error:
Error in optim(params, L) : (list) object cannot be coerced to type 'double'
Why do I receive this error and how can I make this work?
(If there is a better solution in Matlab or Python, references or suggestions for functions in these languages are welcome too)
Here is the code:
numerator <- function(P, Gamma, y, U, N) {
expr = 1
for (i in 1:(N-1)) {
for ( j in (i+1):N) {
if ((y[i] == y[j]) & (y[i] == 1)) {
expr = expr*P[i]*P[j]*exp(Gamma[i,j])
}
if ((y[i] != y[j]) & (y[i] == 1)) {
expr = expr*P[i]*(1 - P[j])
}
if ((y[i] != y[j]) & (y[i] == 0)) {
expr = expr*(1 - P[i])*P[j]
}
if ((y[i] == y[j]) & (y[i] == 0)) {
expr = expr*(1 - P[i]*P[j]*exp(Gamma[i,j]) - P[i]*(1 - P[j]) - (1 - P[i])*P[j])
}
}
}
return(expr)
}
denominator <- function(params, y, U, N) {
P <- params$probs
val <- 1
for (i in 1:(N-1)) {
val <- val*(y[i]*P[i]^(N-3) + (1-y[i])*(1 - P[i])^(N-3))
}
val <- val * y%*%P + (1 - y)%*%(1 - P)
return(val)
}
L <- function(params, y, U, N) {
P <- params$probs
Gamma <- params[,2:(N+1)]
n <- log(numerator(P, Gamma, y, U, N))
d <- log(denominator(P, y, U, N))
l <- n-d
return(l)
}
y <- readRDS(file="purchase_records_df.rds")
N <- ncol(y)
params <- data.frame('probs'=rep(0.001, N), 'gamma'=matrix(0,nrow=N,ncol=N))
optim(params, L)
Briefly, the setting is y is a vector of purchases, but here we want to take our purchase data and find the underlying probabilities.
Thank you very much!
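On the error itself: optim() expects par to be a plain numeric vector, and a data.frame is a list, which is why the coercion to 'double' fails. A minimal sketch of one way around it is to pack the probabilities and the upper triangle of the symmetric matrix into a single vector and unpack them inside the objective. The helpers pack() and unpack() and the objective toy_loglik() are hypothetical stand-ins, not the likelihood from the question.

```r
N <- 5

# Flatten N probabilities plus the choose(N, 2) upper-triangle entries of Gamma
pack <- function(P, Gamma) c(P, Gamma[upper.tri(Gamma)])

# Inverse of pack(): rebuild P and a symmetric Gamma with zero diagonal
unpack <- function(par, N) {
  P <- par[seq_len(N)]
  Gamma <- matrix(0, N, N)
  Gamma[upper.tri(Gamma)] <- par[-seq_len(N)]
  Gamma <- Gamma + t(Gamma)
  list(P = P, Gamma = Gamma)
}

# Toy objective (an assumption for illustration only): peaks at P = target, Gamma = 0
toy_loglik <- function(par, N, target) {
  p <- unpack(par, N)
  -sum((p$P - target)^2) - sum(p$Gamma^2)
}

par0 <- pack(rep(0.001, N), matrix(0, N, N))

# optim() minimizes by default; fnscale = -1 turns it into a maximization.
# Extra arguments (N, target) are passed through to the objective.
fit <- optim(par0, toy_loglik, N = N, target = rep(0.3, N),
             method = "BFGS", control = list(fnscale = -1))
est <- unpack(fit$par, N)
```

If the probabilities must stay inside (0, 1), method = "L-BFGS-B" with lower and upper bounds is one option to consider.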

constrOptim Function - Error in Argument

I'm trying to replicate the Excel Solver in R, which is basically a constrained optimization problem.
I'm trying to minimize the cost per action (CPA), which is total spend / total actions and equals the function below, subject to a few constraints.
CPA function:
(a+b+c+d)/((consta+(Ba*ln(a)))+(constb+(Bb*ln(b)))+(constc+(Bc*ln(c)))+(constd+(Bd*ln(d))))
where the unknown variables are a, b, c, d; const* stands for a constant from a regression and B* stands for a coefficient from a regression (so they are values that I have).
Here is the simplified filled in function that I'm trying to minimize:
(a+b+c+d)/((69.31*ln(a))+(14.885*ln(b))+(21.089*ln(c))+(9.934*ln(d))-(852.93))
Constraints:
a + b + c + d >= 0
a + b + c + d <= 130000 (total spend)
a <= 119000 (maxa)
a >= 272.56 (mina)
b <= 11000 (maxb)
b >= 2.04 (minb)
c <= 2900 (maxc)
c >= 408.16 (minc)
d <= 136800 (maxd)
d >= 55.02 (mind)
I'm doing this using the constraints optimization function. My code is below:
g<-function(a,b,c,d) { (a+b+c+d)/((consta+(Ba*log(a)))+ (constb+(Bb*log(b)))+ (constc+(Bc*log(c)))+ (constd+(Bd*log(d)))) }
gb<-function(a) g(a[1], a[2], a[3],a[4])
A<-matrix(c(1,0,0,0,-1,0,0,0,0,1,0,0,0,-1,0,0,0,0,1,0,0,0,-1,0,0,0,0,1,0,0,0,-1,-1,-1,-1,-1,1,1,1,1),4,10)
B<- c(mina, -maxa, minb, -maxb, minc, -maxc, mind, -maxd,-totalspend, 0)
constrOptim(c(273,6,409,56),g,gb,A,B)
When I run the optimization function, it states that something is wrong with my arguments (Error in ui %*% theta : non-conformable arguments). I think it is the gradient of the function that is coded wrong but I'm not sure. Any help is appreciated.
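Regarding the non-conformable error: constrOptim() checks ui %*% theta - ci >= 0, so ui needs one row per constraint and one column per variable (10 x 4 here), whereas matrix(..., 4, 10) builds a 4 x 10 matrix filled column by column. The sketch below rebuilds the constraint matrix with byrow = TRUE and checks that the starting point is strictly feasible; the bound values are the ones listed in the question, and the names ui, ci and theta0 simply follow the argument names of constrOptim().

```r
# Bounds and budget from the constraints listed in the question
mina <- 272.56; maxa <- 119000
minb <- 2.04;   maxb <- 11000
minc <- 408.16; maxc <- 2900
mind <- 55.02;  maxd <- 136800
totalspend <- 130000

# One row per constraint, one column per variable, filled row by row
ui <- matrix(c( 1,  0,  0,  0,
               -1,  0,  0,  0,
                0,  1,  0,  0,
                0, -1,  0,  0,
                0,  0,  1,  0,
                0,  0, -1,  0,
                0,  0,  0,  1,
                0,  0,  0, -1,
               -1, -1, -1, -1,
                1,  1,  1,  1),
             nrow = 10, ncol = 4, byrow = TRUE)
ci <- c(mina, -maxa, minb, -maxb, minc, -maxc, mind, -maxd, -totalspend, 0)

theta0 <- c(273, 6, 409, 56)

# The starting value must lie strictly inside the feasible region
feasible <- all(ui %*% theta0 - ci > 0)
print(dim(ui))    # 10 4
print(feasible)   # TRUE

# The call would then take the single-vector objective as f and, without an
# analytic gradient, grad = NULL (not run here):
# res <- constrOptim(theta0, f = gb, grad = NULL, ui = ui, ci = ci)
```

Note that in the original call g (four separate arguments) is passed as the objective and gb as the gradient; constrOptim() expects the objective to take one parameter vector, which is what gb does.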
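You can consider the following approach, which uses differential evolution (DEoptim) with box bounds and a large penalty whenever the total-spend constraint is violated: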
library(DEoptim)
fn_Opt <- function(param)
{
a <- param[1]
b <- param[2]
c <- param[3]
d <- param[4]
bool_Cond <- a + b + c + d <= 130000
if(bool_Cond == FALSE)
{
return(10 ^ 30)
}else
{
val <- (a + b + c + d) / (((69.31 * log(a)) + (14.885 * log(b)) + (21.089 * log(c)) + (9.934 * log(d)) - (852.93)))
return(val)
}
}
obj_DEoptim <- DEoptim(fn = fn_Opt, lower = c(272.56, 2.04, 408.16, 55.02),
upper = c(119000, 11000, 2900, 136800),
control = list(itermax = 10000))

How to use the HSL (Hue Saturation Lightness) cylindric color model

I would like to use the HSL (Hue Saturation Lightness) color model (cylindric). Maybe I missed it, but I was not able to find a function that returns colors according to that scheme. hcl from the colorspace packages uses chroma not saturation. I would like to use a model, where maximal luminance/lightness will always return the color white as in the image below. How can I specify colors using the HSL model in R?
TIA
Add on
What I get for different levels of luminance using hcl is this (code here).
What I want is the HSL model.
Adapted from wikipedia & java code, which (honestly) didn't take more than 90s (I timed it):
# specify h as whole input degrees (e.g 0-360)
# s = 0.0 - 1 (0 - 100%)
# l = 0.0 - 1, (0 - 100%)
# returns output from R's rgb() function
hsl_to_rgb <- function(h, s, l) {
h <- h / 360
r <- g <- b <- 0.0
if (s == 0) {
r <- g <- b <- l
} else {
hue_to_rgb <- function(p, q, t) {
if (t < 0) { t <- t + 1.0 }
if (t > 1) { t <- t - 1.0 }
if (t < 1/6) { return(p + (q - p) * 6.0 * t) }
if (t < 1/2) { return(q) }
if (t < 2/3) { return(p + ((q - p) * ((2/3) - t) * 6)) }
return(p)
}
q <- ifelse(l < 0.5, l * (1.0 + s), l + s - (l*s))
p <- 2.0 * l - q
r <- hue_to_rgb(p, q, h + 1/3)
g <- hue_to_rgb(p, q, h)
b <- hue_to_rgb(p, q, h - 1/3)
}
return(rgb(r,g,b))
}
# r, g, b = 0.0 - 1 (0 - 100%)
# returns h/s/l in a vector, h = 0-360 deg, s = 0.0 - 1 (0-100%), l = 0.0 - 1 (0-100%)
rgb_to_hsl <- function(r, g, b) {
val_max <- max(c(r, g, b))
val_min <- min(c(r, g, b))
h <- s <- l <- (val_max + val_min) / 2
if (val_max == val_min){
h <- s <- 0
} else {
d <- val_max - val_min
s <- ifelse(l > 0.5, d / (2 - val_max - val_min), d / (val_max + val_min))
if (val_max == r) { h <- (g - b) / d + (ifelse(g < b, 6, 0)) }
if (val_max == g) { h <- (b - r) / d + 2 }
if (val_max == b) { h <- (r - g) / d + 4 }
h <- (h / 6) * 360
}
return(c(h=h, s=s, l=l))
}
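For plotting many colors at once, a vectorized variant can be handy. The following is a minimal sketch based on the closed-form HSL to RGB formula given on Wikipedia, not a rewrite of the functions above; the name hsl_to_hex is hypothetical. It also shows the property asked for in the question: lightness 1 gives white for every hue.

```r
# h in degrees, s and l in [0, 1]; all arguments may be vectors
hsl_to_hex <- function(h, s, l) {
  a <- s * pmin(l, 1 - l)
  channel <- function(n) {
    k <- (n + h / 30) %% 12
    v <- l - a * pmax(-1, pmin(k - 3, 9 - k, 1))
    pmin(pmax(v, 0), 1)   # guard against tiny floating-point overshoot
  }
  rgb(channel(0), channel(8), channel(4))
}

print(hsl_to_hex(c(0, 120, 240), 1, 0.5))  # "#FF0000" "#00FF00" "#0000FF"
print(hsl_to_hex(c(0, 120, 240), 1, 1))    # "#FFFFFF" for every hue
print(hsl_to_hex(200, 0.7, 0))             # "#000000"
```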

Mixture modeling - trouble with infinite values from exp() and log()

I'm writing a function for Gaussian mixture models with spherical covariance structures--ie $\Sigma_k = \sigma_k^2 I$. This particular function is similar to the mclust package with identifier VII.
http://en.wikipedia.org/wiki/Mixture_model
Anyway, the problem I'm having is running into infinite values for the weight matrix. Definition: let W be an n x m matrix, where n is the number of observations and m is the number of mixtures. Each element of W (i.e. w_im) can essentially be defined as a specific form of:
w_im = \frac{a / b * exp(c)}{\sum_{i=1}^m [a_i / b_i * exp(c_i)]}
Computing this numerically is giving me infinite values. So I'm trying to use the log-identity log(x+y) = log(x) + log(1 + y/x). But the issue is that it's not as simple as log(x+y) but rather log(\sum_{i=1}^m [a_i / b_i * exp(c_i)]).
Here's some code. Define:
n_im = a / b * exp(c) ;
d_.m = \sum_{i=1}^m [a_i / b_i * exp(c_i)] ; and
c_mat[i,j] as the value of the exponent for the [i,j]th term.
n_mat[, i] <- log(a[i]) - log(b[i]) - c[,i] # numerator of w_im
internal_vec1[i] <- (a[i] * b[1])/ (a[1] * b[i]) # an internal for the step below
c_mat2 <- cbind(rep(1, n), c_mat[,1] - c_mat[,-1]) # since e^a / e^b = e^(a-b)
for (i in 1:n) {
d_vec[i] <- n_mat[i,1] + log(sum(internal_vec1 * exp(c_mat2[i,])))
} ## still getting infinite values
I'm trying to define the problem as briefly as possible. The entire function is obviously much larger than this. But, since the problem I'm running into is specifically dealing with infinite (and 1/infinity) values, I'm hoping this snippet is sufficient. Anyone with a coding trick here?
Here is the solution!! (I've spent way too damn long on this)
The first function, log_plus(), solves the simple problem where you want log(\sum_{i=1}^n x_i).
The second function, log_plus2(), solves the more complicated problem described above where you want log(\sum_{i=1}^n [a_i / b_i * exp(c_i)]).
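log_plus <- function(xvec) {
m <- length(xvec)
x <- log(xvec[1])
# seq_len(m)[-1] is empty when m = 1, unlike 2:m
for (j in seq_len(m)[-1]) {
# parentheses matter: 1:j-1 parses as (1:j) - 1
sum_j <- sum(xvec[1:(j-1)])
x <- x + log(1 + xvec[j]/sum_j)
}
return(x)
}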
log_plus2 <- function(a, b, c) {
# assumes intended input of form sum(a/b * e^c)
if ((length(a) != length(b)) || (length(a) != length(c))) {
stop("Input equal length vectors")
}
if (!(all(c > 0) || all(c < 0))) {
stop("All values of c must be either > 0 or < 0.")
}
m <- length(a)
# initialize log sum with the first term
x <- log(a[1]) - log(b[1]) + c[1]
# aggregate / loop log sum: log(S_j) = log(S_(j-1)) + log(1 + term_j / S_(j-1))
for (j in seq_len(m)[-1]) {
idx <- 1:(j-1)
b2 <- b[idx]
# shift the exponents by a common offset before calling exp()
c_min <- if (all(c > 0)) min(c[idx]) else max(c[idx])
c2 <- c[idx] - c_min
# build denominator over all previous terms
d1 <- 0
for (i in idx) {
d1 <- d1 + a[i] * prod(b2[-i]) * exp(c2[i])
}
den <- b[j] * d1
num <- a[j] * prod(b2) * exp(c[j] - c_min)
x <- x + log(1 + num / den)
}
return(x)
}
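A different and shorter route to the same quantity is the standard log-sum-exp trick: take the log of each term, subtract the largest one, and add it back after summing. This is a swapped-in technique rather than the method above, and the names log_sum_terms and log_weights are hypothetical. The second helper sketches how the weights w_im could be normalized row by row in log space, assuming log_terms[i, k] holds log(a) - log(b) + c for observation i and mixture k.

```r
# log(sum(a / b * exp(c))) without overflowing exp()
log_sum_terms <- function(a, b, c) {
  stopifnot(length(a) == length(b), length(a) == length(c))
  lt <- log(a) - log(b) + c          # log of each term
  lt_max <- max(lt)
  lt_max + log(sum(exp(lt - lt_max)))
}

# Row-wise log weights: log_terms[i, k] minus the log of the row sum
log_weights <- function(log_terms) {
  row_max <- apply(log_terms, 1, max)
  shifted <- log_terms - row_max     # recycling subtracts row_max[i] from row i
  shifted - log(rowSums(exp(shifted)))
}

# exp(1000) overflows to Inf, but the shifted version stays finite
big <- log_sum_terms(c(1, 1), c(1, 1), c(1000, 1001))
print(is.finite(big))                # TRUE

lw <- log_weights(matrix(c(1000, 1001, -5, -7), nrow = 2, byrow = TRUE))
print(rowSums(exp(lw)))              # each row sums to 1
```

Unlike log_plus2(), this version places no sign restriction on c.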
