How can I optimize the expected value of a function in R?

I have derived a survival function for a system of components (ignore the details of how this system is set up) and I am trying to maximize its expected value, or more specifically, to maximize the expected value of the function:
surv_func = function(x, mu) {(exp(-(x/mu)^(1/3))*((1-exp(-(4/3)*x^(3/2)))+exp(-(-(4/3)*x^(3/2)))))*exp(-(x/(3-mu))^(1/3))}
and I am supposed (since the PDF containing my tasks gives a hint about it) to use the function
optimize()
and the expected value for a function can be computed with
# Computes the expected value of the function "function"
E <- integrate(function, 0, Inf)
but my function depends on both x and mu. The expected value could (obviously) be computed if the integral had no mu and instead depended only on x. For those interested: the mu comes from the fact that one of the components has a Weibull distribution with parameters (1/3, mu), and the 3 - mu comes from the fact that another component has a Weibull distribution with parameters (1/3, lambda). In the task there is a constraint mu + lambda = 3, so I thought that substituting the lambda parameter in the second Weibull distribution with lambda = 3 - mu and maximizing would yield not only mu but also lambda.
If I try, just for the sake of learning about R, to compute the expected value using the code below (in the console window), it just gives me the following:
> E <- integrate(surv_func,0,Inf)
Error in (function (x, mu) : argument "mu" is missing, with no default
I am new to R and seem to be a little bit "slow" at learning. How can I approach this problem?
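For what it's worth, one standard pattern (a minimal sketch, not a full solution) is to wrap the integral so that the expected value becomes a function of mu alone, and then hand that wrapper to optimize(). integrate() forwards extra named arguments to the integrand, which is what makes the wrapper possible. The search interval c(0, 3) below is an assumption based on the constraint mu + lambda = 3 with both scale parameters positive, and the sketch presumes the integral converges for the values of mu being tried:
# Expected value as a function of mu alone; integrate() passes
# the named argument mu through to surv_func
expected_value <- function(mu) {
  integrate(surv_func, lower = 0, upper = Inf, mu = mu)$value
}
# Maximize over the assumed admissible range 0 < mu < 3
optimize(expected_value, interval = c(0, 3), maximum = TRUE)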

Related

Weird R behavior with indexing function arrays

I'm having some unexpected behaviour in R with function arrays, and I've reduced the problem to a minimal working example:
theory = c(function(p) p)
i = 1
posterior = function(p) theory[[i]](p)
i = 2
posterior(0)
This gives me an error saying the subscript i is out of bounds.
So I guess that i is somehow being used as a "free" variable in the definition of posterior so it gets updated when I redefine i. Oddly enough, this works:
theory = c(function(p) p)
i = 1
posterior = theory[[i]]
i = 2
posterior(0)
How can I avoid this? Note that not redefining i is not an option, as this stuff is going in a for loop where i is the index.
The reason this doesn't work is that you redefine i = 2 and are then out of bounds of theory, which contains a single element. The body of posterior is evaluated lazily: theory[[i]] only executes when the function is called, and at that point the free variable i is looked up in the enclosing environment, where it now equals 2.
You can read some more about lazy evaluation here.
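A common way to avoid the problem (a minimal sketch of one fix; the name i_fixed is just an illustrative choice) is to snapshot the current value of i in a new enclosing environment at definition time, for example with local():
theory <- c(function(p) p)
i <- 1
# local() creates a private environment; i_fixed captures the value
# of i at definition time, so later changes to i have no effect
posterior <- local({
  i_fixed <- i
  function(p) theory[[i_fixed]](p)
})
i <- 2
posterior(0)  # returns 0 instead of a subscript-out-of-bounds error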

Error when fitting a beta distribution: the function mle failed to estimate the parameters with error code 100

I'm trying to use the fitdist() function from the fitdistrplus package to fit my data to different distributions. Let's say that my data looks like:
x = c(1.300000, 1.220000, 1.160000, 1.300000, 1.380000, 1.240000,
1.150000, 1.180000, 1.350000, 1.290000, 1.150000, 1.240000,
1.150000, 1.120000, 1.260000, 1.120000, 1.460000, 1.310000,
1.270000, 1.260000, 1.270000, 1.180000, 1.290000, 1.120000,
1.310000, 1.120000, 1.220000, 1.160000, 1.460000, 1.410000,
1.250000, 1.200000, 1.180000, 1.830000, 1.670000, 1.130000,
1.150000, 1.170000, 1.190000, 1.380000, 1.160000, 1.120000,
1.280000, 1.180000, 1.170000, 1.410000, 1.550000, 1.170000,
1.298701, 1.123595, 1.098901, 1.123595, 1.110000, 1.420000,
1.360000, 1.290000, 1.230000, 1.270000, 1.190000, 1.180000,
1.298701, 1.136364, 1.098901, 1.123595, 1.316900, 1.281800,
1.239400, 1.216989, 1.785077, 1.250800, 1.370000)
Next, if I run fitdist(x, "gamma") everything is fine, but if I use fitdist(x, "beta") instead I get the following error:
Error in start.arg.default(data10, distr = distname) :
values must be in [0-1] to fit a beta distribution
OK, so I'm not a native English speaker, but as far as I understand this method requires the data to be in the range [0,1], so I scale it using x_scaled = (x-min(x))/max(x). This gives me a vector with values in that range that correlates perfectly with the original vector x.
Because x_scaled is of class matrix, I convert it into a numeric vector using as.numeric(), and then fit the model with fitdist(x_scale, "beta").
This time I get the following error:
Error in fitdist(x_scale, "beta") :
the function mle failed to estimate the parameters, with the error code 100
After that I did some search engine queries but didn't find anything useful. Does anybody have an idea of what's going wrong here? Thank you.
Reading the source code shows that the default estimation method of fitdist is mle, which calls mledist from the same package; that function constructs a negative log-likelihood for the distribution you have chosen and uses optim or constrOptim to numerically minimize it. If anything goes wrong in the numerical optimization process, you get the error message you got.
It seems the error occurs because when x_scaled contains a 0 or a 1, the negative log-likelihood of the beta distribution cannot be evaluated there, so the numerical optimization simply breaks. One dirty trick is to let x_scaled <- (x - min(x) + 0.001) / (max(x) - min(x) + 0.002), so that x_scaled contains neither 0 nor 1, and fitdist will work.
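Putting the dirty trick together with the data above, a minimal sketch (using the same fitdist call as in the question) looks like this:
library(fitdistrplus)
# Shift and stretch into the open interval (0, 1) so the beta
# log-likelihood is finite at every observation
x_scaled <- (x - min(x) + 0.001) / (max(x) - min(x) + 0.002)
fit <- fitdist(x_scaled, "beta")
fit$estimate  # shape1 and shape2 of the fitted beta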

How to weight observations in mxnet?

I am new to neural networks and the mxnet package in R. I want to do a logistic regression on my predictors since my observations are probabilities varying between 0 and 1. I'd like to weight my observations by a vector obsWeights I have, but I'm not sure where to implement the weights. There seems to be a weight= option in mx.symbol.FullyConnected but if I try weight=obsWeights I get the following error message
Error in mx.varg.symbol.FullyConnected(list(...)) :
Cannot find argument 'weight', Possible Arguments:
----------------
num_hidden : int, required
Number of hidden nodes of the output.
no_bias : boolean, optional, default=False
Whether to disable bias parameter.
How should I proceed to weight my observations? Here is my code at the moment.
# Prepare data
train.mm = model.matrix(obs ~ . , data = train_data)
train_label = train_data$obs
# Normalize
train.mm = apply(train.mm, 2, function(x) (x-min(x))/(max(x)-min(x)))
# Create MXDataIter compatible iterator
batch_size = 128
train.iter = mx.io.arrayiter(data=t(train.mm), label=train_label,
batch.size=batch_size, shuffle=T)
# Symbolic model definition
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=data, num.hidden=128, name='fc1')
act1 = mx.symbol.Activation(data=fc1, act.type='relu', name='act1')
final = mx.symbol.FullyConnected(data=act1, num.hidden=1, name='final')
logistic = mx.symbol.LogisticRegressionOutput(data=final, name='logistic')
# Run model
mxnet_train = mx.model.FeedForward.create(
symbol = logistic,
X = train.iter,
initializer = mx.init.Xavier(rnd_type = 'gaussian', factor_type = 'avg', magnitude = 2),
num.round = 25)
Assigning the fully connected weight argument is not what you want to do at any rate. That weight is a reference to the parameters of the layer, i.e., what the inputs are multiplied by to produce the output values. These are the parameter values you're trying to learn.
If you want to make some samples matter more than others, you'll need to adjust the loss function: for example, multiply the usual loss of each sample by its weight, so that samples with small weights contribute less to the overall average loss.
I do not believe the standard Mxnet loss functions have a spot for assigning weights (that is, LogisticRegressionOutput won't cover this). However, you can make your own cost function that does. This involves passing your final layer through a sigmoid activation function to first generate the usual logistic regression output value, then passing that into the loss function you define. You could use squared error, but for logistic regression you'll probably want to use the cross-entropy loss:
-(l * log(y) + (1 - l) * log(1 - y)),
where l is the label and y is the predicted value.
Ideally, you'd write a symbol with an efficient definition of the gradient (Mxnet has a cross-entropy function, but it's for softmax input, not a binary output; you could translate your output to two outputs with softmax as an alternative, but that seems less easy to work with in this case), but the easiest path would be to let Mxnet do its autodiff on it. Then you multiply that cross-entropy loss by the weights.
I haven't tested this code, but you'd ultimately have something like this (this is what you'd do in Python; it should be similar in R):
label = mx.sym.Variable('label')
out = mx.sym.Activation(data=final, act_type='sigmoid')
ce = -(label * mx.sym.log(out) + (1 - label) * mx.sym.log(1 - out))
weights = mx.sym.Variable('weights')
loss = mx.sym.MakeLoss(weights * ce, normalization='batch')
Then you want to input your weight vector into the weights Variable along with your normal input data and labels.
As an added tip, the output of an mxnet network with a custom loss via MakeLoss is the loss itself, not the prediction. You'll probably want both in practice, in which case it's useful to group the loss with a gradient-blocked version of the prediction so that you can get both. You'd do that like this:
pred_loss = mx.sym.Group([mx.sym.BlockGrad(out), loss])
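For completeness, here is a hedged R translation of the Python sketch above. It assumes the R bindings expose the same operators under mx.symbol.* names (mx.symbol.MakeLoss, mx.symbol.BlockGrad, mx.symbol.log) and that symbol arithmetic is overloaded as in Python; I have not tested it:
label   <- mx.symbol.Variable('label')
weights <- mx.symbol.Variable('weights')
out     <- mx.symbol.Activation(data = final, act.type = 'sigmoid')
# Negated binary cross entropy, weighted per observation
ce   <- (label * mx.symbol.log(out) + (1 - label) * mx.symbol.log(1 - out)) * (-1)
loss <- mx.symbol.MakeLoss(weights * ce, normalization = 'batch')
# Group a gradient-blocked prediction with the loss to retrieve both
pred_loss <- mx.symbol.Group(c(mx.symbol.BlockGrad(out), loss))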

Estimate parameters of Frechet distribution using mmedist or fitdist(with mme) error

I'm relatively new to R and I would appreciate it if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse Weibull) using mmedist (I also tried fitdist, which calls mmedist), but I get the following error:
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is below:
require(actuar)
library(fitdistrplus)
library(MASS)
#values
n=100
scale = 1
shape=3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried changing the code many times to see whether my mistake was in the syntax, but I always get the same error.
I'm aware of the example in the documentation. I've tried that as well, but with no luck. Please note that for the method to work, the order of the moment must be smaller than the shape parameter (i.e. shape).
The example is the following:
require(actuar)
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
ifelse(order == 1, mean(x), sum(x^order)/length(x))
#fit
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist -- I recommend that you copy out the code, and make your own function foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether a function literally named "memp" exists, as opposed to whether the argument that you actually passed exists as a function. Change it to:
if (!exists(as.character(expression(memp)), mode = "function"))
The second, as I have already noted, relates to the fact that the optim routine actually calls funobj, which calls DIFF2, which in turn (see line 112) calls the user-supplied memp function (minvweibull in your case) with two arguments: obs, which resolves to data, and order. Since minvweibull does not take the data as its first argument, this fails.
This is expected, as the help page tells you:
memp: A function implementing empirical moments, raw or centered, but has to be consistent with the distr argument. This function must have two arguments: as the first one the numeric vector of the data, and as the second the order of the moment returned by the function.
How can you fix this? Pass the function moment from the moments package. Here is the complete code (assuming that you have made the change above and created a new function called foo_mmedist):
library(actuar)    # provides rinvweibull
library(moments)   # provides moment
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimate the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
                      start = c(shape = 5, scale = 2), order = c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note, however, that this actually reduces to a crude way of doing an overdetermined method of moments, and I am not sure that it is theoretically appropriate.
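As a quick sanity check on the fit (a sketch, assuming the actuar and moments packages from the code above are loaded), you can compare the first two empirical moments of the sample with the theoretical raw moments of the fitted inverse Weibull:
library(actuar)
library(moments)
# Empirical raw moments of the sample (orders 1 and 2)
moment(data_fre, order = 1)
moment(data_fre, order = 2)
# Theoretical raw moments at the fitted parameters
minvweibull(1, shape = para_lm$estimate["shape"], scale = para_lm$estimate["scale"])
minvweibull(2, shape = para_lm$estimate["shape"], scale = para_lm$estimate["scale"])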

Behavior of optim() function in R

I'm doing maximum likelihood estimation using the R optim function.
The command I used is
optim(3, func, lower=1.0001, method="L-BFGS-B")$par
The function func has an infinite value if the parameter is 1, so I set the lower bound to 1.0001.
But sometimes an error occurs:
Error in optim(3, func, lower = 1.0001, method = "L-BFGS-B", sx = sx, :
L-BFGS-B needs finite values of 'fn'
What happens next is hard to understand.
If I run the same command again, it gives the result 1.0001, which is the lower limit.
It seems that the optim function 'learns' that 1 is not the proper answer.
How can I get the optim function to give the answer 1.0001 on my first run?
P.S.
I just found that this problem occurs only in the stand-alone R console. If I run the same code in RStudio, it does not occur. Very strange.
The method "L-BFGS-B" requires all computed values of the function to be finite.
It seems, for some reason, that optim is evaluating your function at the value of 1.0, giving you an inf, then throwing an error.
If you want a quick hack, try defining a new function that gives a very high value(or low if you're trying to maximize) for inputs of 1.
func2 <- function(x) {
  if (x == 1) {
    # Large finite penalty; use a large negative value instead if maximizing
    return(9999)
  } else {
    return(func(x))
  }
}
optim(3, func2, lower=1.0001, method="L-BFGS-B")$par
(Posted as answer rather than comment for now; will delete later if appropriate.)
For what it's worth, I can't get this example (with a singularity at 1) to fail, even using the default control parameters (e.g. ndeps=1e-3):
func <- function(x) 1/(x-1)*x^2
library(numDeriv)
grad(func,x=2) ## critical point at x=2
optim(par=1+1e-4,fn=func,method="L-BFGS-B",lower=1+1e-4)
Try a wide range of starting values:
svec <- 1+10^(seq(-4,2,by=0.5))
sapply(svec,optim,fn=func,method="L-BFGS-B",lower=1+1e-4)
These all work.
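One speculative explanation for the original failure (an assumption on my part, not verified against the asker's func): L-BFGS-B approximates gradients by finite differences with step size control$ndeps (default 1e-3), so evaluations near the lower bound can step across the singularity at 1. Shrinking the step keeps every evaluation inside the feasible region:
# Smaller finite-difference step so gradient evaluations near the
# lower bound 1.0001 cannot cross the singularity at x = 1
optim(3, func, lower = 1.0001, method = "L-BFGS-B",
      control = list(ndeps = 1e-5))$par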
