How to switch from lm.fit to optim in R?

I am using a function to do a linear regression, and it works fine.
Here it is:
fun <- function(x1, x2, y) {
But now I have a non-linear equation, so I want to use optim instead of lm.fit in the function above.
The function I want to minimize with optim is:
f <- function(p){
sum((y - (p[1]*x1+p[2]*x2+p[3])^p[4]+p[5])^2)
}
p <- optim(rep(.5, 5), f)$par
Any idea how to use this in the first function in place of lm.fit?

If you are determined to use optim, you can try this approach:
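make.fun <- function(x1, x2, y, n.keep = 3) {
  # keep only rows where x1, x2, and y are all observed
  keep <- !(is.na(x1) | is.na(x2) | is.na(y))
  if (sum(keep) < n.keep) return(NULL)
  x1 <- x1[keep]
  x2 <- x2[keep]
  y <- y[keep]
  # objective for optim: sum of squares as a function of p only
  function(p) {
    sum((y - (p[1]*x1 + p[2]*x2 + p[3])^p[4] + p[5])^2)
  }
}
f <- make.fun(x1, x2, y)
p <- if (is.null(f)) rep(NA, 5) else optim(rep(.5, 5), f)$par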
Here, make.fun tests if you have enough complete rows and if yes, it returns a function to be minimized. That function will already have x1, x2, and y in its environment, so it will need only p as input, which is convenient for use with optim.
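As a rough end-to-end check, here is a minimal sketch on simulated data. The name make_sse, the simulated coefficients, and the injected NA values are assumptions made for illustration. The sketch drops incomplete rows before building the objective, so that NA values cannot leak into the sum.

```r
# Hypothetical factory (illustrative name): returns the objective for optim,
# or NULL when fewer than n_keep complete rows are available.
make_sse <- function(x1, x2, y, n_keep = 3) {
  keep <- complete.cases(x1, x2, y)
  if (sum(keep) < n_keep) return(NULL)
  x1 <- x1[keep]; x2 <- x2[keep]; y <- y[keep]
  # same expression as in the question; x1, x2, y are found in the enclosing environment
  function(p) sum((y - (p[1]*x1 + p[2]*x2 + p[3])^p[4] + p[5])^2)
}

# Simulated data (assumed values), with two missing responses
set.seed(1)
x1 <- runif(50, 1, 2)
x2 <- runif(50, 1, 2)
y <- (0.8*x1 + 0.4*x2 + 0.3)^1.5 + 0.2 + rnorm(50, sd = 0.05)
y[c(3, 10)] <- NA

f <- make_sse(x1, x2, y)
start <- rep(0.5, 5)
fit <- if (is.null(f)) NULL else optim(start, f)
```

fit$par then holds the five estimates. With a non-linear objective like this one, the result can depend on the starting values, so it is worth trying more than one start.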

Related

Create function for linear regression in R for n parameter using matrix method

I am trying to create a linear regression function in R for n parameters, but I don't know how to proceed. I have created a function for two variables:
new_lm <- function(y,x){
z=cbind(1,x)
k= solve(t(z)%*%z) %*% t(z) %*% y
return(k)
}
But in this case I am passing the values directly. Suppose I want to use it for n parameters, so that the same function works for n = 1, 2, etc.
Something like this might work:
new_lm <- function(y,...){
x <- do.call(cbind, list(...))
z <- cbind(1,x)
k <- solve(t(z)%*%z) %*% t(z) %*% y
return(k)
}
By the way, computing linear regressions this way is terrible in practice (although for small problems it will work fine); you should use QR or singular value decomposition, or some other more sophisticated bit of linear algebra ...
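A minimal sketch of the QR route mentioned above, using base R's qr.solve, which returns the least-squares solution when the system is overdetermined. The function name new_lm_qr and the simulated data are assumptions for illustration.

```r
# Least squares via QR decomposition instead of solve(t(z) %*% z)
new_lm_qr <- function(y, ...) {
  z <- cbind(1, ...)   # design matrix: intercept column plus any number of predictors
  qr.solve(z, y)       # least-squares fit for a tall (overdetermined) matrix
}

# Simulated data (assumed values) to compare against lm()
set.seed(42)
x1 <- rnorm(30)
x2 <- rnorm(30)
y <- 1 + 2*x1 - 0.5*x2 + rnorm(30, sd = 0.1)

b_qr <- new_lm_qr(y, x1, x2)
b_lm <- coef(lm(y ~ x1 + x2))
```

Both b_qr and b_lm should agree up to numerical tolerance, since lm also relies on a QR decomposition internally.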

Creating a Loss Function

I was trying to create the loss function below,
where tts is the total sum of squares, x is the values 1-100, and t is a given y hat. w0 and w1 are supposed to come from par = c(0, 1), but I'm having trouble getting the function right and I'm not sure why.
x
t
loss <- function(par){
th<-w0+w1*x
tts<-(t-th)^2
return(sum(tts))
}
results <- optim(par = c(0,1), fn = loss, method = 'BFGS')
results$par
The first argument to any function that you want to optimize with optim must be the vector of parameters that optim will search over. You named this vector par, but then you didn't use par anywhere in your function. In my example below, I'm going to call the vector of parameters params so as not to mix it up with the first argument to optim, and you'll see it gets used (i.e., the loss function uses params[1], etc.):
# define loss function
loss <- function(params, x, y) {
yhat <- params[1] + params[2]*x
tss <- (y - yhat)^2
return(sum(tss))
}
# generate fake data
n <- 100
x <- 1:n
w0_true <- 2
w1_true <- 3
y <- w0_true + w1_true*x + rnorm(n)
# find w0_hat and w1_hat with optim
optim(par=c(0,1), fn=loss, x=x, y=y)
# check with lm
summary(lm(y ~ x))
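Since the question used method = 'BFGS', here is a hedged sketch that also passes an analytic gradient through optim's gr argument. The name loss_grad and the simulated data are assumptions; optim forwards the extra x and y arguments to both fn and gr.

```r
loss <- function(params, x, y) {
  yhat <- params[1] + params[2]*x
  sum((y - yhat)^2)
}

# Gradient of the loss with respect to (intercept, slope)
loss_grad <- function(params, x, y) {
  r <- y - (params[1] + params[2]*x)
  c(-2*sum(r), -2*sum(r*x))
}

# Simulated data (assumed values)
set.seed(123)
n <- 100
x <- 1:n
y <- 2 + 3*x + rnorm(n)

fit <- optim(par = c(0, 1), fn = loss, gr = loss_grad,
             x = x, y = y, method = "BFGS")
rss_lm <- sum(resid(lm(y ~ x))^2)  # smallest achievable loss, for comparison
```

Comparing fit$value with rss_lm gives a quick sense of how close the optimizer got to the least-squares solution.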

I am a beginner in R and I'm trying to solve a system of equations, but when I run it I get an error [duplicate]

This question already has an answer here:
Simple for loop in R producing "replacement has length zero" in R
# my error : Error in F[1] <- n/(X[0]) - sum(log(1 + Y^exp(X[1] + X[2] * x))) : replacement has length zero
set.seed(16)
#Inverse Transformation on CDF
n=100
SimRRR.f <- function(100, lambda=1,tau)) {
x= rnorm(100,0,1)
tau= exp(-1-x)
u=runif(100)
y= (1/(u^(1/lambda)-1))^(1/tau)
y
}
Y<-((1/u)-1)^exp(-1-x)
# MLE for Simple Linear Regresion
# System of equations
library(rootSolve)
library(nleqslv)
model <- function(X){
F <- numeric(length(X))
F[1] <- n/(X[0])-sum(log(1+Y^exp(X[1]+X[2]*x)))
F[2] <- 2*n -(X[0]+1)*sum(exp(X[1]+X[2]*x))*Y^( exp(X[1]+X[2]*x))*log(Y)/(1+ Y^( exp(X[1]+X[2]*x)))
F[3] <- sum(x) + sum(x*log(Y))*exp(X[1]+X[2]*x) -(X[0]+1)*X[1]*sum(exp(X[1]+X[2]*x)*Y^(exp(X[1]+X[2]*x)*log(Y)))/(1+ Y^( exp(X[1]+X[2]*x)))
# Solution
F
}
startx <- c(0.5,3,1) # start the answer search here
answers<-as.data.frame(nleqslv(startx,model))
The problem is that you define x, u, tau and y inside the SimRRR.f function, but are trying to define Y in terms of them outside the function.
Using a function, you give it input, and you get back output. All the other variables defined in the course of the function doing its job go away at the end. As it stands, the line that defines Y should fail with an "object not found" error (unless you defined the above variables in the global environment as you were working on your function...)
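A small illustration of that scoping rule, with hypothetical names (sim_demo, u_local): the variable created inside the function is not visible once the function returns, so only the returned value can be used afterwards.

```r
sim_demo <- function(n) {
  u_local <- runif(n)   # exists only while sim_demo() is running
  (1/u_local) - 1       # the last expression is the return value
}

set.seed(16)
out <- sim_demo(5)
n_out <- length(out)             # 5: the returned vector is available
u_visible <- exists("u_local")   # FALSE: u_local was never defined outside sim_demo()
```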
Try the following functions, see if they do the job:
# I usually put all my library calls together at the beginning of the script.
library(rootSolve)
library(nleqslv)
n <- 100 # n has to exist before x is simulated
x <- rnorm(n, 0, 1) # see below for why this is pulled out.
SimRRR.f <- function(x, lambda = 1, tau) { # 100 can't be by itself in the argument list. everything in there needs to be attached to a variable.
  n <- length(x)
  tau <- exp(-1 - x) # note: this overwrites whatever tau was passed in
  u <- runif(n)
  y <- (1/u^(1/lambda) - 1)^(1/tau) # the base u^(-1/lambda) - 1 is positive for u in (0, 1)
  y
}
Y_sim <- SimRRR.f(x = x, lambda = 1, tau = 1) # pick the right tau, it's never defined here.
Your second function has more issues. Namely, it relies on x, which is not defined anywhere that can be found. Either you need x from the previous function, or you really meant X. I'm going to assume you do need the values of x, since X is only of length 3. This is why I pulled it out of the last function call - we need it now.
[Update]
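It's also been pointed out in the comments that the indexing here is wrong: R vectors are indexed from 1, so X[0] is a zero-length vector, which is what triggers the "replacement has length zero" error. I didn't catch that previously (the F elements themselves are indexed correctly). I think I've fixed the indexing issues too now: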
model <- function(X, Y, x){ # If you use x and Y in the function, define them here.
n <- length(x)
F <- numeric(length(X))
F[1] <- n/(X[1])-sum(log(1+Y^exp(X[2]+X[3]*x)))
F[2] <- 2*n -(X[1]+1)*sum(exp(X[2]+X[3]*x))*Y^( exp(X[2]+X[3]*x))*log(Y)/(1+ Y^( exp(X[2]+X[3]*x)))
F[3] <- sum(x) + sum(x*log(Y))*exp(X[2]+X[3]*x) -(X[1]+1)*X[2]*sum(exp(X[2]+X[3]*x)*Y^(exp(X[2]+X[3]*x)*log(Y)))/(1+ Y^( exp(X[2]+X[3]*x)))
# Solution
F
}
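To see where the original error message came from, this short sketch reproduces it in isolation. R vectors are indexed from 1, so X[0] has length zero, and assigning a zero-length value to F[1] fails. The values are the starting values from the question; the tryCatch wrapper is only there to capture the message instead of stopping the script.

```r
X <- c(0.5, 3, 1)
F <- numeric(length(X))

len0 <- length(X[0])    # X[0] is numeric(0), not the first element
first <- X[1]           # the first element is X[1]

# Capture the error instead of stopping the script
msg <- tryCatch({
  F[1] <- 100 / X[0]    # zero-length right-hand side
  "no error"
}, error = function(e) conditionMessage(e))
```

Here msg holds the text of the error raised by the assignment, which is the same "replacement has length zero" complaint as in the question.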
I'm not familiar with the nleqslv package, but unless there is a method defined to convert its result to a data frame, that conversion might not go so well. I'd make sure everything else is working before the conversion.
startx <- c(0.5,3,1) # start the answer search here
answers <- nleqslv(startx,model, Y = Y_sim, x = x)
answer_df <- as.data.frame(answers)

Use variable in GLM quasi specification

I'm fitting a GLM to some data, using a quasi-likelihood approach (family=quasi(...)).
I'd like to use a variable, p, in the variance specification, like so:
family = quasi(link=log, variance=mu^p)
This however doesn't work (it no longer recognises mu).
Is there any way to get R to just insert the value of p in the expression before it is evaluated, so I can use p instead of a number?
Here's an example that doesn't work:
set.seed(1)
x <- runif(100)
y <- x^2+2*x+sin(2*pi*x) + rnorm(100)
fitModel <- function(x,y, p) {
model <- glm(y~x, family=quasi(link=log, variance=mu^p))
return(model)
}
fitModel(x,y,2)
Thanks!
The family function does fancy parsing which means the paste0 solution suggested in the comments won't work without jumping through considerable hoops. Also, the following function fails if any of the y values are <= 0, so I changed the example a little bit (if you do have negative response values you'll have to think about what you want to do about this ...)
set.seed(1)
x <- seq(2,10,length=100)
y <- x^2+2*x+sin(2*pi*x) + rnorm(100)
What I did was to create a quasi family object, then modify its variance function on the fly.
pfamily <- quasi(link="log",variance="mu")
fitModel <- function(x,y, p) {
pfamily[["variance"]] <- function(mu) mu^p
model <- glm(y~x, family=pfamily)
model
}
fitModel(x,y,2)
fitModel(x,y,1)
For what it's worth, this variant should be able to do arbitrary values of p, so e.g. you can draw a curve over the variance power:
dfun <- function(p) {
deviance(fitModel(x,y,p))
}
pvec <- seq(0.1,3,by=0.1)
dvec <- sapply(pvec,dfun)
par(las=1,bty="l")
plot(pvec,dvec,type="b",xlab="variance power",ylab="deviance")
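The family-modification idea can also be packaged as a small factory that returns a family for a given p. This is only a sketch: power_quasi is a hypothetical name, the data are simulated the same way as in the example above, and only the variance function is replaced, so the other components of the family (such as dev.resids) remain those of variance = "mu".

```r
# Hypothetical helper: quasi family with log link and variance function mu^p
power_quasi <- function(p) {
  fam <- quasi(link = "log", variance = "mu")
  fam[["variance"]] <- function(mu) mu^p   # closure: p is found in power_quasi's environment
  fam
}

# Simulated data with strictly positive responses
set.seed(1)
x <- seq(2, 10, length.out = 100)
y <- x^2 + 2*x + sin(2*pi*x) + rnorm(100)

fam2 <- power_quasi(2)
v <- fam2$variance(3)              # evaluates 3^2
fit <- glm(y ~ x, family = fam2)
```

Building the family outside glm avoids the non-standard evaluation that quasi applies to its variance argument, which is why passing mu^p directly does not work.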

Function that returns a mixture of probability densities

In R, I would like to create a function which does the following:
given vectors of means, variances and weights
create a function which is a mixture of normal distributions
return this function
A mixture of normal densities is the sum
f(x) = c_1 * N(mu_1,var_1) + ... + c_n * N(mu_n,var_n).
What's an easy way of doing this in R?
edit
I figured out a very straightforward, perhaps naive way of doing this:
gmix <- function(means,vars,weights){
n <- length(means)
f<- function(t) {
res <- 0;
for(i in 1:n) {
res <- res + weights[i]*dnorm(t,mean=means[i],sd=sqrt(vars[i]))
}
return(res)
}
return(f)
}
I'm no programmer, so this might not be a smart way of doing this... If anyone knows a better implementation, I'd be glad to hear it.
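One possible alternative, offered as a sketch rather than a definitive answer: dnorm is vectorized over mean and sd, so the inner loop can be replaced by a weighted sum per evaluation point. The name gmix_vec, the input checks, and the example parameters are assumptions.

```r
gmix_vec <- function(means, vars, weights) {
  stopifnot(length(means) == length(vars),
            length(means) == length(weights),
            all(vars > 0))
  sds <- sqrt(vars)
  # returned function: mixture density evaluated at each element of t
  function(t) {
    vapply(t, function(ti) sum(weights * dnorm(ti, mean = means, sd = sds)),
           numeric(1))
  }
}

# Example mixture (assumed parameters); weights sum to 1
f <- gmix_vec(means = c(-1, 2), vars = c(1, 0.25), weights = c(0.3, 0.7))
vals <- f(c(-1, 0, 2))
area <- integrate(f, -Inf, Inf)$value   # close to 1 when the weights sum to 1
```

Because the returned function accepts a vector, it can be passed directly to curve, integrate, or optimize.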

Resources