I've been trying to use the nls function to fit experimental data to a model that I have, expressed by a function of 3 parameters, let's say a, b and c. However, I would like to keep b and c fixed, since I know their true value, and fit only the parameter a:
nls(formula=pattern~myfunction(a, b, c), start=list(a=estimate_a), control=list(maxiter=50, tol=5e-8, warnOnly=T), algorithm="port", weights=sqrt(pattern), na.action=na.exclude, lower=0, upper=1)
But apparently this does not work... How can I tell R that b and c are fixed?
To fix a parameter (1) set it before running nls and (2) do not include it in start. Here is a self contained example showing the fixing of a to 0 :
a <- 0
nls(demand ~ a + b * Time, BOD, start = list(b = 1))
A quick solution:
my_new_function <- function(a) myfunction(a, b = b_true, c = c_true)
nls(formula = pattern ~ my_new_function(a), start = list(a = estimate_a),
control = list(maxiter = 50, tol = 5e-8, warnOnly = TRUE), algorithm = "port",
weights = sqrt(pattern), na.action = na.exclude, lower = 0, upper = 1)
The issue of fixed (or MASKED) parameters has been around a long time. Ron Duggleby of U. of Queensland introduced me to the term "masked" when I was on sabbatical there in 1987, and I have had masks in my own software for nonlinear optimization and nonlinear least squares since. In particular, the CRAN package "nlsr" or the developmental "nlsr2" (https://gitlab.com/nashjc/improvenls/-/tree/master/nlsr-rox) handle fixed parameters reliably.
Another approach is to use "nls()" with the "port" algorithm and set upper and lower bounds equal for the fixed parameters. I'm not sure if this is pushing the envelope, and have only tried a couple of examples. For those examples, "minpack.lm::nlsLM()" using the same equal bounds approach seems to give incorrect results sometimes.
John Nash
Related
So I'm trying to maximize the likelihood function for a gamma-poisson and I've programmed it into R as the following:
lik<- function(x,t,a,b){
for(i in 1:n){
like[i] =
log(gamma(a + x[i]))-log(gamma(a))
-log(gamma(1+x[i] + x[i]*log(t[i]/b)-(a+x[i])*log(1+t[i]/b)
}
return(sum(like))
}
where x and t are the data, and I have n data rows.
I need a and b to be solved for simultaneously. Does a built in function exist in R? Or do I need to hard code an algorithm to solve the system of equations? [I'd rather not] I know optimize() solves for 1 variable and so does fminbnd(). I'm trying to copy the behavior of FindMaximum() in mathematica. In a perfect world I'd like the code to work something like this:
optimize(f=lik, a>0, b>0, x=x, t=t, maximum=TRUE, iteration=5000)
$maximum
a 150
b 6
Thanks.
optim's first argument can be a vector of parameters. So you could try something like this:
lik <- function(p=c(1,1), x, t){
# In the body of the function replace a by p[1] and b by p[2]
}
optim(c(1,1), lik, method = c("L-BFGS-B"), x=x, t=t, control=list(fnscale=-1))
So the solution that ended up working out is:
attempt2d <- optim(
par = c(sumx/sumt, 1), fn = lik, data = data11,
method = "L-BFGS-B", control = list(fnscale = -1, trace=TRUE),
lower=0.1, upper = 170
)
However my parameters run out to 170, essentially meaning that my gamma parameters are Inf. Because gamma() hits infinity relatively quickly. And in mathematica the solutions are a=169 and b=16505, and R gets nowhere near that maxing out at 170. The known solutions are beyond 170 in some cases any solution for this anomaly?
Could someone explain how the Quality column in the xgboost R package is calculated in the xgb.model.dt.tree function?
In the documentation it says that Quality "is the gain related to the split in this specific node".
When you run the following code, given in the xgboost documentation for this function, Quality for node 0 of tree 0 is 4000.53, yet I calculate the Gain as 2002.848
data(agaricus.train, package='xgboost')
train <- agarics.train
X = train$data
y = train$label
bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
eta = 1, nthread = 2, nround = 2,objective = "binary:logistic")
xgb.model.dt.tree(agaricus.train$data#Dimnames[[2]], model = bst)
p = rep(0.5,nrow(X))
L = which(X[,'odor=none']==0)
R = which(X[,'odor=none']==1)
pL = p[L]
pR = p[R]
yL = y[L]
yR = y[R]
GL = sum(pL-yL)
GR = sum(pR-yR)
G = sum(p-y)
HL = sum(pL*(1-pL))
HR = sum(pR*(1-pR))
H = sum(p*(1-p))
gain = 0.5 * (GL^2/HL+GR^2/HR-G^2/H)
gain
I understand that Gain is given by the following formula:
Since we are using log loss, G is the sum of p-y and H is the sum of p(1-p) - gamma and lambda in this instance are both zero.
Can anyone identify where I am going wrong?
OK, I think I've worked it out. The value for reg_lambda is not 0 by default as given in the documentation, but is actually 1 (from param.h)
Also, it appears that the factor of a half is not applied when calculating the gain, so the Quality column is double what you would expect. Lastly, I also don't think gamma (also called min_split_loss) is applied to this calculation either (from update_hitmaker-inl.hpp)
Instead, gamma is used to determine whether to invoke pruning, but is not reflected in the gain calculation itself, as the documentation suggests.
If you apply these changes, you do indeed get 4000.53 as the Quality for node 0 of tree 0, as in the original question. I'll raise this as an issue to the xgboost guys, so the documentation can be changed accordingly.
After scrambling through the internet, related questions and trying for days without any success I hope you can potentially help me out with the inclusion of optim() with or for a polr() function in R.
What I am trying to do, is to just set some constraints (lower and/or upper bounds for the coefficients). If you'd have a general example of how this would work, I'd be more than delighted.
Let's consider the following fake and senseless data:
set.seed(3)
my.df <- data.frame(id = 1:1000, y = sample(c(1,2,3), 1000, replace = TRUE),
a = rnorm(5000, 1, 0.1), b = rnorm(100, 1.9, 0.5), c = rnorm (10, 0.8, 1.2))
and a polr function like this:
model <- polr(y ~ a + b + c, method = 'logistic')
I do get a coefficient for c, which is negative (albeit insignificant), but I know it's relation to y being positive. Thus, I want to constrain it's coefficient to being positive and I think I can do this with optim().
I want to include something along the lines:
optim(model, method = "L-BFGS-B", lower = c(a = -1, b = -1, c = 0))
which doesn't work.
I think there might be some option with including the optim() function into the polr function using the (to me) arcane ...but I just cannot figure out how. Any ideas?
Much appreciated, and believe it or not, I really tried hard.
I am trying to reproduce a simple example of using kernel PCA. The objective is to separate out the points from two concentric circles.
Creating the data:
circle <- data.frame(radius = rep(c(0, 1), 500) + rnorm(1000, sd = 0.05),
phi = runif(1000, 0, 2 * pi),
group = rep(c("A", "B"), 500))
#
circle <- transform(circle,
x = radius * cos(phi),
y = radius * sin(phi),
z = rnorm(length(radius))) %>% select(group, x, y, z)
TFRAC = 0.75
#
train <- sample(1:1000, TFRAC * 1000)
circle.train <- circle[train,]
circle.test <- circle[-train,]
> head(circle.train)
group x y z
491 A -0.034216 -0.0312062 0.70780
389 A 0.052616 0.0059919 1.05942
178 B -0.987276 -0.3322542 0.75297
472 B -0.808646 0.3962935 -0.17829
473 A -0.032227 0.0027470 0.66955
346 B 0.894957 0.3381633 1.29191
I have split the data up into training and testing sets because I have the intention (once I get this working!) of testing the resulting model.
In principal kernel PCA should allow me to separate out the two classes. Other discussions of this example have used the Radial Basis Function (RBF) kernel, so I adopted this too. In R kernel PCA is implemented in the kernlab package.
library(kernlab)
circle.kpca <- kpca(~ ., data = circle.train[, -1], kernel = "rbfdot", kpar = list(sigma = 10), features = 1)
I requested only the first component and specified the RBF kernel. This is the result:
There has definitely been a major transformation of the data, but the transformed data is not what I was expecting (which would be a nice, clean separation of the two classes). I have tried fiddling with the value of the parameter sigma and, although the results do vary dramatically, I still didn't get what I was expecting. I assume that sigma is related to the parameter gamma mentioned here, possibly via the relationship given here (without the negative sign?).
I'm pretty sure that I am making a naive rookie error here and I would really appreciate any pointers which would get me onto the right track.
Thanks,
Andrew.
Try sigma = 20. I think you will get the answer you are looking for. The sigma in kernlab is actually what is usually referred to as gamma for rbf kernel so they are inversely related.
In order to learn the support vector machine, we must determine various parameters.
For example, there are parameters such as cost and gamma.
I am trying to determine sigma and gamma parameters of SVM Using "GA" package and "kernlab" package of R.
I use accuracy as the evaluation function of the genetic algorithm.
I have created the following code, and I ran it.
library(GA)
library(kernlab)
data(spam)
index <- sample(1:dim(spam)[1])
spamtrain <- spam[index[1:floor(dim(spam)[1]/2)], ]
spamtest <- spam[index[((ceiling(dim(spam)[1]/2)) + 1):dim(spam)[1]], ]
f <- function(x)
{
x1 <- x[1]
x2 <- x[2]
filter <- ksvm(type~.,data=spamtrain,kernel="rbfdot",kpar=list(sigma=x1),C=x2,cross=3)
mailtype <- predict(filter,spamtest[,-58])
t <- table(mailtype,spamtest[,58])
return(t[1,1]+t[2,2])/(t[1,1]+t[1,2]+t[2,1]+t[2,2])
}
GA <- ga(type = "real-valued", fitness = f, min = c(-5.12, -5.12), max = c(5.12, 5.12), popSize = 50, maxiter = 2)
summary(GA)
plot(GA)
However, When I call the GA function,the following error is returned.
"No Support Vectors found. You may want to change your parameters"
I can not understand why the code is bad.
Using GA for SVM parameters is not a good idea - it should be sufficient to just do a regular grid search ( two for loops, one for C and one for gamma values).
In Rs library e1071 (which also provides SVMs) there is a methodtune.svm` which looks for best parameters using a grid search.
Example
data(iris)
obj <- tune.svm(Species~., data = iris, sampling = "fix",
gamma = 2^c(-8,-4,0,4), cost = 2^c(-8,-4,-2,0))
plot(obj, transform.x = log2, transform.y = log2)
plot(obj, type = "perspective", theta = 120, phi = 45)
Which also shows one important thing - you should look for a good C and gamma values in a geometric manner, so eg. 2^x for x in {-10,-8,-6,-6,-4,-2,0,2,4}.
GA is an algorithm for meta optimisation, where the parameters space is huge, and there is no easy correlation between parameters and the optimising function. It requires tuning of much more parameters then SVM (number of generations, size of the population, mutation probability, crossing probability, mutation operator, crossing operator ...) so it completely useless approach here.
And of course - as it was earlier stated in comments - C and Gamma have to be strictly positive.
For more details about using e1071 take a look at the CRAN document: http://cran.r-project.org/web/packages/e1071/e1071.pdf