After scrambling through the internet, related questions and trying for days without any success I hope you can potentially help me out with the inclusion of optim() with or for a polr() function in R.
What I am trying to do, is to just set some constraints (lower and/or upper bounds for the coefficients). If you'd have a general example of how this would work, I'd be more than delighted.
Let's consider the following fake and senseless data:
set.seed(3)
my.df <- data.frame(id = 1:1000, y = sample(c(1,2,3), 1000, replace = TRUE),
a = rnorm(5000, 1, 0.1), b = rnorm(100, 1.9, 0.5), c = rnorm (10, 0.8, 1.2))
and a polr function like this:
model <- polr(y ~ a + b + c, method = 'logistic')
I do get a coefficient for c, which is negative (albeit insignificant), but I know it's relation to y being positive. Thus, I want to constrain it's coefficient to being positive and I think I can do this with optim().
I want to include something along the lines:
optim(model, method = "L-BFGS-B", lower = c(a = -1, b = -1, c = 0))
which doesn't work.
I think there might be some option with including the optim() function into the polr function using the (to me) arcane ...but I just cannot figure out how. Any ideas?
Much appreciated, and believe it or not, I really tried hard.
Related
I'm looking to calculate the Minimum Detectable Effect for a potential Difference-in-Differences design where treatment and control are assigned at the cluster level and my outcome at the individual level is dichotomous. To do this I'm working in R and using the clusterPower package, specifically I'm using the cpa.did.binary function. In the help file for this function it notes that d is "The expected absolute difference." I'm interpreting this as being a Minimum Detectable Effect, is that correct? If this is the MDE, is this output the expected difference in logits?
Thanks to anyone that can help. Alternatively, if you have a better package or way of calculating MDE that is also welcome.
# Input
cpa.did.binary(power = .8,
nclusters = 10,
nsubjects = 100,
p = .5,
d = NA,
ICC = .04,
rho_c = 0,
rho_s = 0)
# Output
> d
>0.2086531
I have somewhat a complex physical model with five unknown parameters to fit, but no success so far.
I used nls2 first to get some estimates for the start values, but then nls, nlxb, and nlsLM all threw the famous "singular gradient error at initial parameter estimates" error.
For the start values for nls2, I extracted them from the literature, so I think that I have good starting values at least for nls2. The parameter estimates extracted from nls2 make quite sense physically as well; however, don't resolve the issue with the singular gradient matrix error.
Since it's a physical model, every coefficient has a physical meaning, and I prefer not to fix any of them.
I should also mention that all five unknown parameters in the model equation are positive and the shape parameter m can go up to 2.
Reading through many posts and trying different solution suggestions, I have come to conclusion that I have either over-parameterization or unidentifiable parameters problem.
My question is that should I stop trying to use nls with this specific model (with this many unknown parameters) or is there any way out?
I am quite new to topic, so any help, mathematically or code-wise, is greatly appreciated.
Here is my MWE:
# Data
x <- c(0, 1000, 2000, 2500, 2750, 3000, 3250, 3500, 3750, 4000, 5000)
y <- c(1.0, 0.99, 0.98, 0.95, 0.795, 0.59, 0.35, 0.295, 0.175, 0.14, 0.095)
# Start values for nls2
bounds <- data.frame(a = c(0.8, 1.5), b = c(1e+5, 1e+7), c = c(0.4, 1.4), n = c(0.1, 2), m = c(0.1, 2))
# Model equation function
mod <- function(x, a, b, c, n, m){
t <- b*85^n*exp(-c/0.0309)
(1 - exp(-(a/(t*x))^m))
}
# # Model equation
# mod <- y ~ (1 - exp(-(a/(b*85^n*exp(-c/0.0309)*x))^m))
# Model fit with nls2
fit2 <- nls2(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = bounds, algorithm = "brute-force")
# Model fit with nls
fit <- nls(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = coef(fit2))
The more I look at this the more confused I get, but I'm going to try again.
Looking again at your expression, we have the expression inside the exponential
-(a/(b*85^n*exp(-c/0.0309)*x))^m
We can rewrite this as
-( [a/(b*85^n*exp(-c/0.0309))] * 1/x )^m
(please check my algebra!)
If this is correct, then that whole bold-faced blob doesn't affect the functional form of x — it all collapses to a single constant in the equation. (In other words, {a,b,c,n} are all jointly unidentifiable.) Lumping that stuff into a single parameter phi :
1 - exp(-(phi/x)^m)
phi is a shape parameter (has the same units as x, should be roughly the same magnitude as a typical value of x): let's try a starting value of 2500 (the mean value of x)
m is a shape parameter; we can't go too badly wrong starting from m==1
Now nls works fine without any extra help:
n1 <- nls(y~1 - exp(-(phi/x)^m), start=list(phi=2500,m=1), data=data.frame(x,y))
and gets phi=2935, m=6.49.
Plot predictions:
plot(x,y, ylim=c(0,1))
xvec <- seq(0, 5000, length=101)
lines(xvec, predict(n1, newdata=data.frame(x=xvec)))
Another way to think about what this curve is doing: we can transform the equation to -log(1-y) = phi^m*(1/x)^m: that is, -log(1-y) should follow a power-law curve with respect to 1/x.
Here's what that looks like:
plot(1/x, -log(1-y))
## curve() uses "x" as the current x-axis variable, i.e.
## read "x" as "1/x" below.
with(as.list(coef(n1)), curve(phi^m*x^m, add=TRUE))
In this format, it appears to fit the central data well but fails for large values of 1/x (the x=0 point is missing here because it goes to infinity).
I'm trying to fit a truncated normal distribution to data using fitdistrplus::fitdistr and specifying upper and lower bounds. However, when comparing the MLE-fitted parameters to those of an MLE-fit without bounds, they seem to be the same.
library(fitdistrplus)
library(MASS)
dt <- rnorm(100, 1, 0.5)
cat("truncated:", fitdistr(dt, "normal", lower = 0, upper = 1.5, method = "mle")$estimate,
"original:", fitdist(dt, "norm", method = "mle")$estimate, sep = "\n")
truncated:
1.034495
0.4112629
original:
1.034495
0.4112629
I'm not a statistics genius, but I'm pretty sure that parameters should be different because truncating the distribution, both mean and sd will change (because the distribution is rescaled). Is this right?
Thanks for your advice
Cheers,
Simon
I've been trying to use the nls function to fit experimental data to a model that I have, expressed by a function of 3 parameters, let's say a, b and c. However, I would like to keep b and c fixed, since I know their true value, and fit only the parameter a:
nls(formula=pattern~myfunction(a, b, c), start=list(a=estimate_a), control=list(maxiter=50, tol=5e-8, warnOnly=T), algorithm="port", weights=sqrt(pattern), na.action=na.exclude, lower=0, upper=1)
But apparently this does not work... How can I tell R that b and c are fixed?
To fix a parameter (1) set it before running nls and (2) do not include it in start. Here is a self contained example showing the fixing of a to 0 :
a <- 0
nls(demand ~ a + b * Time, BOD, start = list(b = 1))
A quick solution:
my_new_function <- function(a) myfunction(a, b = b_true, c = c_true)
nls(formula = pattern ~ my_new_function(a), start = list(a = estimate_a),
control = list(maxiter = 50, tol = 5e-8, warnOnly = TRUE), algorithm = "port",
weights = sqrt(pattern), na.action = na.exclude, lower = 0, upper = 1)
The issue of fixed (or MASKED) parameters has been around a long time. Ron Duggleby of U. of Queensland introduced me to the term "masked" when I was on sabbatical there in 1987, and I have had masks in my own software for nonlinear optimization and nonlinear least squares since. In particular, the CRAN package "nlsr" or the developmental "nlsr2" (https://gitlab.com/nashjc/improvenls/-/tree/master/nlsr-rox) handle fixed parameters reliably.
Another approach is to use "nls()" with the "port" algorithm and set upper and lower bounds equal for the fixed parameters. I'm not sure if this is pushing the envelope, and have only tried a couple of examples. For those examples, "minpack.lm::nlsLM()" using the same equal bounds approach seems to give incorrect results sometimes.
John Nash
I am trying to reproduce a simple example of using kernel PCA. The objective is to separate out the points from two concentric circles.
Creating the data:
circle <- data.frame(radius = rep(c(0, 1), 500) + rnorm(1000, sd = 0.05),
phi = runif(1000, 0, 2 * pi),
group = rep(c("A", "B"), 500))
#
circle <- transform(circle,
x = radius * cos(phi),
y = radius * sin(phi),
z = rnorm(length(radius))) %>% select(group, x, y, z)
TFRAC = 0.75
#
train <- sample(1:1000, TFRAC * 1000)
circle.train <- circle[train,]
circle.test <- circle[-train,]
> head(circle.train)
group x y z
491 A -0.034216 -0.0312062 0.70780
389 A 0.052616 0.0059919 1.05942
178 B -0.987276 -0.3322542 0.75297
472 B -0.808646 0.3962935 -0.17829
473 A -0.032227 0.0027470 0.66955
346 B 0.894957 0.3381633 1.29191
I have split the data up into training and testing sets because I have the intention (once I get this working!) of testing the resulting model.
In principal kernel PCA should allow me to separate out the two classes. Other discussions of this example have used the Radial Basis Function (RBF) kernel, so I adopted this too. In R kernel PCA is implemented in the kernlab package.
library(kernlab)
circle.kpca <- kpca(~ ., data = circle.train[, -1], kernel = "rbfdot", kpar = list(sigma = 10), features = 1)
I requested only the first component and specified the RBF kernel. This is the result:
There has definitely been a major transformation of the data, but the transformed data is not what I was expecting (which would be a nice, clean separation of the two classes). I have tried fiddling with the value of the parameter sigma and, although the results do vary dramatically, I still didn't get what I was expecting. I assume that sigma is related to the parameter gamma mentioned here, possibly via the relationship given here (without the negative sign?).
I'm pretty sure that I am making a naive rookie error here and I would really appreciate any pointers which would get me onto the right track.
Thanks,
Andrew.
Try sigma = 20. I think you will get the answer you are looking for. The sigma in kernlab is actually what is usually referred to as gamma for rbf kernel so they are inversely related.