Calculating binomial distribution in R

I want to do some calculations on a binomial distribution Bin(100,0.55). In particular:
P[μₓ - σₓ ≤ X ≤ μₓ + σₓ]
I also want to calculate it using the normal approximation for the binomial:
f(x) = 1/√(2πσ²) exp[−(x−μ)²/(2σ²)]
and attempt to do so using the Chebyshev inequality (say for k=2)
P[|X-μ| ≥ kσ] ≤ 1/k²
I would like to know whether there is a relatively straightforward way to implement these calculations in R, that is, using built-in functions rather than coding them manually.

Please check the help documentation for:
dbinom
pbinom
qbinom
In general, for other distributions the same pattern holds: dDISTRIBUTION_NAME (density), pDISTRIBUTION_NAME (CDF), qDISTRIBUTION_NAME (quantiles), plus rDISTRIBUTION_NAME for random draws.
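For the specific numbers in the question, a minimal sketch using pbinom, pnorm, and the Chebyshev bound; rounding the interval endpoints to integers is my own choice, since X is integer-valued:
n <- 100; p <- 0.55
mu <- n * p                       # 55
sigma <- sqrt(n * p * (1 - p))    # about 4.97

# Exact P[mu - sigma <= X <= mu + sigma]: round the endpoints inward
lo <- ceiling(mu - sigma); hi <- floor(mu + sigma)
exact <- pbinom(hi, n, p) - pbinom(lo - 1, n, p)

# Normal approximation, with continuity correction
normal <- pnorm(hi + 0.5, mu, sigma) - pnorm(lo - 0.5, mu, sigma)

# Chebyshev for k = 2: P[|X - mu| >= 2*sigma] <= 1/4,
# so P[mu - 2*sigma <= X <= mu + 2*sigma] >= 0.75
k <- 2
chebyshev_lower <- 1 - 1 / k^2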

Related

How to partition variance among parameters in a Monte Carlo simulation?

I have some doubts about how best to extract information from a Monte Carlo simulation. I will simplify the problem here using some R pseudocode, but the question is more general.
Let's say I have a function of three parameters, each with a mean and SD. For this example I will use normal distributions, but for the general case assume one of them follows some other distribution.
f(x, y, z) = rnorm(1, mean_x, sd_x) * rnorm(1, mean_y, sd_y) * rnorm(1, mean_z, sd_z)
I am using a Monte Carlo simulation to quantify the uncertainty, which is quite straightforward. What I am interested in is what percentage of the total uncertainty corresponds to each parameter, computed numerically rather than analytically.
One way I have envisioned would be:
Define total uncertainty as the width of the 0.05–0.95 quantile interval of the full MC simulation; this would be 100% of the model uncertainty.
Run a new simulation, but with one parameter held fixed at its mean value, such as:
f(x, y, z) = mean_x * rnorm(1, mean_y, sd_y) * rnorm(1, mean_z, sd_z)
The difference between the total uncertainty and the 0.05–0.95 quantile width of this simulation would then be the amount of uncertainty attributable to that specific parameter.
Repeat for the other parameters.
I know this is a simplification, as it ignores interactions among parameters, but would it be correct? The actual problem is somewhat more complex than this, so other, analytical approaches are not really feasible. A sketch of the idea is below.
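For concreteness, here is a minimal sketch of the procedure described above; the means, SDs, and simulation size are hypothetical placeholders:
set.seed(1)
n_sim <- 1e5
mean_x <- 10; sd_x <- 2     # hypothetical values
mean_y <- 5;  sd_y <- 1
mean_z <- 2;  sd_z <- 0.5

q_width <- function(v) diff(quantile(v, c(0.05, 0.95)))

# Full simulation: 100% of the model uncertainty
full <- rnorm(n_sim, mean_x, sd_x) * rnorm(n_sim, mean_y, sd_y) * rnorm(n_sim, mean_z, sd_z)
total <- q_width(full)

# Fix x at its mean and re-simulate
fixed_x <- mean_x * rnorm(n_sim, mean_y, sd_y) * rnorm(n_sim, mean_z, sd_z)
share_x <- (total - q_width(fixed_x)) / total   # fraction attributed to x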

Convexity of ratio of two linear functions

I am working on optimization of an objective function which is a ratio of two linear functions, given as (mx + b)/(−mx + c). Can somebody comment on the convexity of this function and/or give me a reference?
You might consider consulting Stephen Boyd's convex optimization book. Section 3.4 (Example 3.32) covers exactly this. Your example is called a linear-fractional function, and it is indeed both quasiconvex and quasiconcave if you restrict the domain so that the denominator is either strictly positive or strictly negative. Quasiconvex optimization problems can be solved by a method like bisection, which involves solving a series of feasibility problems (a sketch is below).
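To illustrate, a minimal sketch of the bisection idea for this one-dimensional case; the values of m, b, c and the interval are hypothetical, and the feasibility check exploits the fact that the constraint is linear in x for fixed t:
# Minimize (m*x + b) / (-m*x + c) over x in [lo, hi], assuming -m*x + c > 0 there
m <- 1; b <- 2; c <- 10
lo <- 0; hi <- 5

feasible <- function(t) {
  # f(x) <= t  <=>  m*x + b - t*(-m*x + c) <= 0  (denominator positive);
  # the left-hand side is linear in x, so checking the endpoints suffices
  g <- function(x) m * x + b - t * (-m * x + c)
  min(g(lo), g(hi)) <= 0
}

l <- -1e6; u <- 1e6    # bracket for the optimal value
for (i in 1:80) {
  t <- (l + u) / 2
  if (feasible(t)) u <- t else l <- t
}
u   # converges to min f(x) on [lo, hi] (here 0.2, attained at x = 0)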
A quick first check is to take the derivative and look at where it vanishes: such critical points are potential local minima, though they could also be maxima or saddle points.
In this case, the derivative is
d/dx[(mx + b)/(−mx + c)] = m(b + c)/(c − mx)²
which is never zero for finite x (provided m(b + c) ≠ 0). The function is therefore monotone on each side of the pole at x = c/m and has no interior minimum.

Extracting Lagrange Multipliers from SVM output in R

I would like to extract the alpha Lagrange multipliers from the svm function in the e1071 R package, but I am not sure whether svm$coefs is producing these.
The alphas are defined as in Equation 9.23, p. 352, of An Introduction to Statistical Learning.
The documentation for svm says that
coefs: The corresponding coefficients times the training labels.
Could someone please explain this?
$coefs produces alpha_i * y_i. Since the alpha_i are by definition non-negative, you can simply take the absolute value of coefs to get the Lagrange multipliers, and recover y_i by taking the sign (the labels are only +1 or −1). This is a simplification often used in SVM packages: the multipliers are never actually used on their own, only their product with the label, so the two are stored as a single number for simplicity and efficiency, and when needed (as in this case) you can always reconstruct them.
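A minimal sketch of the reconstruction, assuming a two-class problem (the iris subset is just an arbitrary example):
library(e1071)

d <- subset(iris, Species != "setosa")   # keep two classes
d$Species <- factor(d$Species)
fit <- svm(Species ~ ., data = d, kernel = "linear")

alpha_y <- fit$coefs[, 1]   # alpha_i * y_i, one row per support vector
alpha   <- abs(alpha_y)     # the Lagrange multipliers, alpha_i >= 0
y_sv    <- sign(alpha_y)    # the labels y_i in {-1, +1} of the support vectors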

R function for Likelihood

I'm trying to analyze repairable-systems reliability using growth models.
I have already fitted a Crow-AMSAA model, but I wonder if there is any package or code for fitting a Generalized Renewal Process (Kijima Model I or II) in R and finding its parameters beta, lambda (or alpha), and q.
(Or some other model for the mean cumulative function, MCF.)
Equation 15 of this article gives an expression for the log-likelihood.
I tried to create the function like this:
likelihood.G1 <- function(theta, x) {
  # x: vector of failure times; theta: vector of parameters
  a <- theta[1]   # alpha
  b <- theta[2]   # beta
  q <- theta[3]   # q
  logl2 <- log(b / a)   # first part of the equation
  for (i in 1:length(x)) {
    logl2 <- logl2 + (b - 1) * log(x[i] / (a * (1 + q)^(i - 1))) -
      (x[i] / (a * (1 + q)^(i - 1)))^b
  }
  return(-logl2)   # negative of the log-likelihood
}
And then use some routine to minimize −log L:
theta <- c(0.5, 1.2, 0.8)   # starting parameters (alpha, beta, q)
nlm(likelihood.G1, theta, x = Data)
Or also:
optim(theta, likelihood.G1, method = "BFGS", x = Data)
However, there seems to be some mistake, since the parameters it returns make no sense.
Any ideas of what I'm doing wrong?
Thanks
Looking at equation (16) of the paper you reference and comparing it with your code, it looks like you are missing terms in the for loop: each data point should contribute three terms to the log-likelihood, but inside your loop you only have two (not counting the accumulation).
Specifically, your code includes neither the 4th term of equation (16) nor the 7th, and so on. That is at least one error in the code. A further consideration is that α and β are constrained to be greater than zero; I am not sure the solver you are using respects this constraint.
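On the last point, one way to impose the positivity constraints is to switch to a box-constrained optimizer; a minimal sketch (the bounds, including the assumption q > −1, are my own choices):
theta0 <- c(0.5, 1.2, 0.8)
fit <- optim(theta0, likelihood.G1, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6, -1 + 1e-6), x = Data)
fit$par   # constrained estimates of (alpha, beta, q)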

Quadrature to approximate a transformed beta distribution in R

I am using R to run a simulation in which I use a likelihood ratio test to compare two nested item response models. One version of the LRT uses the joint likelihood function L(θ,ρ) and the other uses the marginal likelihood function L(ρ). I want to integrate L(θ,ρ) over f(θ) to obtain the marginal likelihood L(ρ). I have two conditions: in one, f(θ) is standard normal (μ=0,σ=1), and my understanding is that I can just pick a number of abscissa points, say 20 or 30, and use Gauss-Hermite quadrature to approximate this density. But in the other condition, f(θ) is a linearly transformed beta distribution (a=1.25,b=10), where the linear transformation B'=11.14*(B-0.11) is such that B' also has (approximately) μ=0,σ=1.
I am confused enough about how to implement quadrature for a beta distribution, and the linear transformation confuses me even more. My question is threefold: (1) can I use some variation of quadrature to approximate f(θ) when θ follows this linearly transformed beta distribution, (2) how would I implement this in R, and (3) is this a ridiculous waste of time, such that there is an obviously much faster and better method for the task? (I tried writing my own numerical-approximation function but found that my implementation, being limited to the R language, was just too slow.)
Thanks!
First, I assume you can express your L(θ,ρ) and f(θ) in terms of actual code; otherwise you're kinda screwed. Given that assumption, you can use integrate to perform the necessary computations. Something like this should get you started; just plug in your expressions for L and f.
marglik <- function(rho) {
  integrand <- function(theta, rho) L(theta, rho) * f(theta)
  # set your lower/upper integration limits as appropriate
  integrate(integrand, lower = -5, upper = 5, rho = rho)
}
For this to work, your integrand has to be vectorized; i.e., given a vector input for theta, it must return a vector of outputs. If your code doesn't fit the bill, you can use Vectorize on the integrand function before passing it to integrate:
integrand <- Vectorize(integrand, "theta")
Edit: I'm not sure whether you're also asking how to define f(θ) for the transformed beta distribution; that seems rather elementary for someone working with joint and marginal likelihoods. But if you are: the density of B' = a·B + b, given the density f of B, is
f'(B') = f((B' − b)/a) / a
So in your case, f(theta) is dbeta(theta/11.14 + 0.11, 1.25, 10) / 11.14
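As a quick sanity check (using the Beta(1.25, 10) parameters and the transformation from the question), this density should integrate to 1 over the transformed support and have mean ≈ 0 and variance ≈ 1:
f <- function(theta) dbeta(theta / 11.14 + 0.11, 1.25, 10) / 11.14

lo <- 11.14 * (0 - 0.11)   # lower end of the transformed support
hi <- 11.14 * (1 - 0.11)   # upper end
integrate(f, lo, hi)                        # ~1
integrate(function(t) t * f(t), lo, hi)     # mean, ~0
integrate(function(t) t^2 * f(t), lo, hi)   # E[T^2], ~1, so variance ~1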
