Using lm, I would like to fit the model:
y = b0 + b1*x1 + b2*x2 + b1*b2*x1*x2
My question is:
How can I specify that the coefficient of the interaction should equal the product of the coefficients of the main effects?
I've seen that to fix a coefficient at a specific value you can use offset() and I(), but I don't know how to specify a relationship between coefficients.
Here is a simple simulated dataset:
n <- 50 # Sample size
x1 <- rnorm(n, 1:n, 0.5) # Independent variable 1
x2 <- rnorm(n, 1:n, 0.5) # Independent variable 2
b0 <- 1
b1 <- 0.5
b2 <- 0.2
y <- b0 + b1*x1 + b2*x2 + b1*b2*x1*x2 + rnorm(n,0,0.1)
To fit Model 1: y = b0 + b1*x1 + b2*x2 + b3*x1*x2, I would use:
summary(lm(y~ x1 + x2 + x1:x2))
But how do I fit Model 2: y = b0 + b1*x1 + b2*x2 + b1*b2*x1*x2?
One of the main differences between the two models is the number of parameters to estimate. In Model 1, we estimate 4 parameters: b0 (intercept), b1 (slope of var. 1), b2 (slope of var. 2), and b3 (slope of the interaction between vars. 1 & 2). In Model 2, we estimate 3 parameters: b0 (intercept), b1 (slope of var. 1 and part of the slope of the interaction between vars. 1 & 2), and b2 (slope of var. 2 and part of the slope of the interaction between vars. 1 & 2).
The reason why I want to do this is that when investigating whether there is a significant interaction between x1 & x2, model 2, y = b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, can be a better null model than y = b0 + b1*x1 + b2*x2.
Many thanks!
Marie
Because of the constraint that you impose on the coefficients, the model you specify is not a linear model, so lm cannot be used to fit it. You need to use non-linear regression instead, such as nls.
> summary(nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1)))
Formula: y ~ b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2
Parameters:
Estimate Std. Error t value Pr(>|t|)
b0 0.987203 0.049713 19.86 <2e-16 ***
b1 0.494438 0.007803 63.37 <2e-16 ***
b2 0.202396 0.003359 60.25 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1121 on 47 degrees of freedom
Number of iterations to convergence: 5
Achieved convergence tolerance: 2.545e-06
You can really see that the model is non-linear when you re-write it as
> summary(nls(y ~ b0+(1+b1*x1)*(1+b2*x2)-1, start=list(b0=0, b1=1, b2=1)))
Formula: y ~ b0 + (1 + b1 * x1) * (1 + b2 * x2) - 1
Parameters:
Estimate Std. Error t value Pr(>|t|)
b0 0.987203 0.049713 19.86 <2e-16 ***
b1 0.494438 0.007803 63.37 <2e-16 ***
b2 0.202396 0.003359 60.25 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1121 on 47 degrees of freedom
Number of iterations to convergence: 5
Achieved convergence tolerance: 2.25e-06
Brian provides a way to fit the constrained model you specify, but if you're interested in whether the unconstrained model fits better than your constrained model, you can use the delta method to test that hypothesis.
# Let's make some fake data where the constrained model is true
n <- 100
b0 <- 2
b1 <- .2
b2 <- -1.3
b3 <- b1 * b2
sigma <- 1
x1 <- rnorm(n)
# make x1 and x2 correlated for giggles
x2 <- x1 + rnorm(n)
# Generate data according to the model
y <- b0 + b1*x1 + b2*x2 + b3*x1*x2 + rnorm(n, 0, sigma)
# Fit full model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error
o <- lm(y ~ x1 + x2 + x1:x2)
# If we want to do a hypothesis test of Ho: b3 = b1*b2
# this is the same as Ho: b3 - b1*b2 = 0
library(msm)
# Get estimate of the difference specified in the null
est <- unname(coef(o)["x1:x2"] - coef(o)["x1"] * coef(o)["x2"])
# Use the delta method to get a standard error for
# this difference
standerr <- deltamethod(~ x4 - x3*x2, coef(o), vcov(o))
# Calculate a test statistic. We're relying on asymptotic
# arguments here so hopefully we have a decent sample size
z <- est/standerr
# Calculate p-value
pval <- 2 * pnorm(-abs(z))
pval
I explain what the delta method is used for and more on how to use it in R in this blog post.
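For reference, here is a small hand-rolled version of that standard error (a sketch reusing o and standerr from above): g(b) = b3 - b1*b2 has gradient (0, -b2, -b1, 1) with respect to (intercept, b1, b2, b3), and the delta method plugs that gradient into the coefficient covariance matrix.
b <- coef(o)
grad <- c(0, -b["x2"], -b["x1"], 1)   # gradient of b3 - b1*b2 at the estimates
se_manual <- drop(sqrt(t(grad) %*% vcov(o) %*% grad))
c(deltamethod = standerr, by_hand = se_manual)   # the two should agree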
Expanding on Brian's answer, you could alternatively do this by comparing the full model to the constrained model; however, you have to fit the full model with nls as well in order to easily compare the two.
o2 <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1))
o3 <- nls(y ~ b0 + b1*x1 + b2*x2 + b3*x1*x2, start = list(b0 = 0, b1 = 1, b2 = 1, b3 = 1))
anova(o2, o3)
There's no way to do what you're asking for in lm, and there's no reason for it to be able to do it. You run lm to get estimates of your coefficients. If you don't want to estimate a coefficient, then don't include the predictor in the model. You can use coef to extract the coefficients you want and multiply them out afterwards.
Note that leaving the interaction out is a different model and will produce a different b1 and b2. You could alternatively leave I(x1 * x2) in and not use the coefficient.
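For example, a minimal sketch of that idea with the simulated data from the question (fit and b are names introduced here, not from the original post):
fit <- lm(y ~ x1 + x2 + I(x1 * x2))   # interaction kept in the design, its coefficient ignored
b <- coef(fit)
unname(b["x1"] * b["x2"])             # product of the main-effect estimates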
As for why you want to do this: there's no good a priori justification that your constrained model actually fits better than the simple additive model. Adding free parameters necessarily improves the fit, but you haven't added a parameter; you've added a constraint that, in the real world, could make the model fit worse. In that case, would you still consider it a better "baseline" for comparison with the model including the interaction?
I'd like to estimate the effect of a treatment on two separate groups, so something of the form
Y = b0 + b1*M + b2*T + b3*T*M + error
T being the treatment and M the dummy separating the two groups.
The problem is that the treatment is correlated to other variables that affect Y. Luckily, there exists a variable Z that serves as an instrument for T. What I've been able to implement in R was to "manually" run 2SLS, following the stages
T = a0 + a1*Z + e
and
Y = b0 + b1*M + b2*T_hat + b3*M*T_hat + u
To provide a reproducible example, first a simulation
n <- 100
set.seed(271)
Z <- runif(n)
e <- rnorm(n, sd = 0.5)
M <- as.integer(runif(n)) # dummy
u <- rnorm(n)
# Treat = 1 + 2*Z + e
alpha_0 <- 1
alpha_1 <- 2
Treat <- alpha_0 + alpha_1*Z + e
# Y = 3 + M + 2*Treat + 3*Treat*M + e + u (omitted vars that determine Treat also affect Y)
beta_0 <- 3
beta_1 <- 1
beta_2 <- 2
beta_3 <- 3
Y <- beta_0 + beta_1*M + beta_2*Treat + beta_3 * M*Treat + e + u
The first stage regression
fs <- lm(Treat ~ Z)
stargazer::stargazer(fs, type = "text")
===============================================
Dependent variable:
---------------------------
Treat
-----------------------------------------------
Z 2.383***
(0.168)
Constant 0.835***
(0.096)
-----------------------------------------------
Observations 100
R2 0.671
Adjusted R2 0.668
Residual Std. Error 0.445 (df = 98)
F Statistic 200.053*** (df = 1; 98)
===============================================
And second stage
Treat_hat <- fitted(fs)
ss <- lm(Y ~ M + Treat_hat + M:Treat_hat)
stargazer::stargazer(ss, type = "text")
===============================================
Dependent variable:
---------------------------
Y
-----------------------------------------------
M 1.230
(1.717)
Treat_hat 2.243***
(0.570)
M:Treat_hat 2.636***
(0.808)
Constant 2.711**
(1.213)
-----------------------------------------------
Observations 100
R2 0.727
Adjusted R2 0.718
Residual Std. Error 2.539 (df = 96)
F Statistic 85.112*** (df = 3; 96)
===============================================
The problem now is that those Standard Errors aren't adjusted for the first stage, which looks like quite some work to do manually. As I'd do for any other IV regression, I'd prefer to just use AER::ivreg.
But I can't seem to get the same regression going there. Here are several attempts, none of which quite does the same thing:
AER::ivreg(Y ~ M + Treat + M:Treat | Z)
AER::ivreg(Y ~ M + Treat + M:Treat | M + Z)
Warning message:
In ivreg.fit(X, Y, Z, weights, offset, ...) :
more regressors than instruments
These make sense, I guess
AER::ivreg(Y ~ M + Treat + M:Treat | M + Z + M:Z)
Call:
AER::ivreg(formula = Y ~ M + Treat + M:Treat | M + Z + M:Z)
Coefficients:
(Intercept) M Treat M:Treat
2.641 1.450 2.229 2.687
Surprisingly close, but not quite.
I couldn't find a way to tell ivreg that Treat and M:Treat aren't really two separate endogenous variables, but really just the same endogenous variable moved around and interacted with an exogenous one.
In conclusion,
i) Is there some way to mess with ivreg and make this work?
ii) Is there some other function for 2SLS that can just manually accept 1st and 2nd stage formulas without this sort of restriction, and that adjusts standard errors?
iii) What's the simplest way to get the correct SEs if there are no other alternatives? I didn't come across any direct R code, just a bunch of matrix multiplication formulas (although I didn't dig too deep for this one).
Thank you
Essentially, if Z is a valid instrument for Treat, then M:Z should be a valid instrument for M:Treat, so to me this makes sense:
AER::ivreg(Y ~ M + Treat + M:Treat | M + Z + M:Z)
I actually managed to back out the correct param values for a modified simulation:
n <- 100
set.seed(271)
Z <- runif(n)
e <- rnorm(n, sd = 0.5)
M <- round(runif(n)) # note: I changed from as.integer() to round() in order to get some 1's in the regression
u <- rnorm(n)
# Treat = 1 + 2*Z + e
alpha_0 <- 1
alpha_1 <- 2
Treat <- alpha_0 + alpha_1*Z + e
beta_0 <- 3
beta_1 <- 1
beta_2 <- 2
beta_3 <- 3
Y <- beta_0 + beta_1*M + beta_2*Treat + beta_3 * M*Treat
Now:
my_ivreg <- AER::ivreg(Y ~ M + Treat + M:Treat | M + Z + M:Z)
> summary(my_ivreg)
Call:
AER::ivreg(formula = Y ~ M + Treat + M:Treat | M + Z + M:Z)
Residuals:
Min 1Q Median 3Q Max
-1.332e-14 -7.105e-15 -3.553e-15 -8.882e-16 3.553e-15
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.000e+00 2.728e-15 1.100e+15 <2e-16 ***
M 1.000e+00 3.810e-15 2.625e+14 <2e-16 ***
Treat 2.000e+00 1.255e-15 1.593e+15 <2e-16 ***
M:Treat 3.000e+00 1.792e-15 1.674e+15 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.633e-15 on 96 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 1
Wald test: 1.794e+31 on 3 and 96 DF, p-value: < 2.2e-16
Which is what we were looking for...
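To connect this back to the original question, the same instrument set should also work when the outcome has noise (a sketch reusing M, Treat, Z, e, u, and the betas from the simulation above; Y_noisy is a name introduced here). The appeal of going through ivreg is that its summary() standard errors already account for the first stage, unlike the manual two-step lm() approach.
Y_noisy <- beta_0 + beta_1*M + beta_2*Treat + beta_3*M*Treat + e + u
summary(AER::ivreg(Y_noisy ~ M + Treat + M:Treat | M + Z + M:Z))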
Consider the following dataset
Quantity <- c(25,39,45,57,70,85,89,100,110,124,137,150,177)
Sales <- c(1000,1250,2600,3000,3500,4500,5000,4700,4405,4000,3730,3400,3300)
df <- data.frame(Quantity,Sales)
df
Plotting the data, the distribution of observations is clearly non-linear, but it suggests a breakpoint around Quantity = 89 (I skip the plot here). Therefore, I built a joint piecewise linear model as follows:
df$Xbar <- ifelse(df$Quantity>89,1,0)
df$diff <- df$Quantity - 89
reg <- lm(Sales ~ Quantity + I(Xbar * (Quantity - 89)), data = df)
summary(reg)
or simply
df$X <- df$diff*df$Xbar
reg <- lm(Sales ~ Quantity + X, data = df)
summary(reg)
However, with this parametrization, the coefficient of X represents the change in slope relative to the preceding interval.
How can I parametrize the model so that the relevant coefficient instead represents the slope of the second interval?
I did some research but was unable to find the desired specification, apart from some automation in Stata (see the entry 'marginal' here: https://www.stata.com/manuals13/rmkspline.pdf).
Any help is much appreciated. Thank you!
Acknowledgement:
the workable example is retrieved from
https://towardsdatascience.com/unraveling-spline-regression-in-r-937626bc3d96
The key here is to use a logical variable is.right which is TRUE for the points to the right of 89 and FALSE otherwise.
From the output shown, 60.88 is the slope to the left of 89 and -19.97 is the slope to the right. The lines intersect at Quantity = 89, Sales = 4817.30.
is.right <- df$Quantity > 89
fm <- lm(Sales ~ diff : is.right, df)
fm
## Call:
## lm(formula = Sales ~ diff:is.right, data = df)
##
## Coefficients:
## (Intercept) diff:is.rightFALSE diff:is.rightTRUE
## 4817.30 60.88 -19.97
Alternatives
Alternatively, if you want to use Xbar from the question, do it this way; it gives the same coefficients as fm.
fm2 <- lm(Sales ~ diff : factor(Xbar), df)
or
fm3 <- lm(Sales ~ I(Xbar * diff) + I((1 - Xbar) * diff), df)
Double check with nls
We can double-check these using nls with the following formulation, which uses the fact that, if we extend both lines, the one that applies at any Quantity is the lower of the two.
st <- list(a = 0, b1 = 1, b2 = -1)
fm4 <- nls(Sales ~ a + pmin(b1 * (Quantity - 89), b2 * (Quantity - 89)), start = st)
fm4
## Nonlinear regression model
## model: Sales ~ a + pmin(b1 * (Quantity - 89), b2 * (Quantity - 89))
## data: parent.frame()
## a b1 b2
## 4817.30 60.88 -19.97
## residual sum-of-squares: 713120
##
## Number of iterations to convergence: 1
## Achieved convergence tolerance: 2.285e-09
This would also work:
fm5 <- nls(Sales ~ a + ifelse(Quantity > 89, b2, b1) * diff, df, start = st)
Plot
Here is a plot:
plot(Sales ~ Quantity, df)
lines(fitted(fm) ~ Quantity, df)
Model matrix
And here is the model matrix for the linear regression:
> model.matrix(fm)
(Intercept) diff:is.rightFALSE diff:is.rightTRUE
1 1 -64 0
2 1 -50 0
3 1 -44 0
4 1 -32 0
5 1 -19 0
6 1 -4 0
7 1 0 0
8 1 0 11
9 1 0 21
10 1 0 35
11 1 0 48
12 1 0 61
13 1 0 88
If you know the breakpoint, then you almost have the model; it should be:
fit=lm(Sales ~ Quantity + Xbar + Quantity:Xbar,data=df)
If you don't introduce a new intercept (Xbar), the second segment is forced to use the intercept already in the model, which will not work. We can plot it:
plot(df$Quantity,df$Sales)
newdata = data.frame(Quantity=seq(40,200,by=5))
newdata$Xbar= ifelse(newdata$Quantity>89,1,0)
lines(newdata$Quantity,predict(fit,newdata))
The coefficients are:
summary(fit)
Call:
lm(formula = Sales ~ Quantity * Xbar, data = df)
Residuals:
Min 1Q Median 3Q Max
-527.9 -132.2 -15.1 148.1 464.7
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -545.435 327.977 -1.663 0.131
Quantity 59.572 5.746 10.367 2.65e-06 ***
Xbar 7227.288 585.933 12.335 6.09e-07 ***
Quantity:Xbar -80.133 6.856 -11.688 9.64e-07 ***
And the coefficient for the second slope is 59.572 + (-80.133) = -20.561.
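If you want that second slope reported directly with its own standard error, a hedged sketch is to reparametrize the same model so that one column carries Quantity left of the breakpoint and another carries it to the right (QL, QR, and fit2 are names introduced here):
df$QL <- (1 - df$Xbar) * df$Quantity   # Quantity, active only left of the breakpoint
df$QR <- df$Xbar * df$Quantity         # Quantity, active only right of the breakpoint
fit2 <- lm(Sales ~ QL + QR + Xbar, data = df)
summary(fit2)   # the QR coefficient equals 59.572 + (-80.133) = -20.561, now with its own SE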
I am using the 'bife' package to fit a fixed-effects logit model in R. However, I cannot compute any goodness-of-fit measure of the model's overall fit from the output below. I would appreciate knowing how to measure goodness-of-fit given this limited information. I would prefer a chi-square test, but I cannot find a way to implement that either.
---------------------------------------------------------------
Fixed effects logit model
with analytical bias-correction
Estimated model:
Y ~ X1 +X2 + X3 + X4 + X5 | Z
Log-Likelihood= -9153.165
n= 20383, number of events= 5104
Demeaning converged after 6 iteration(s)
Offset converged after 3 iteration(s)
Corrected structural parameter(s):
Estimate Std. error t-value Pr(> t)
X1 -8.67E-02 2.80E-03 -31.001 < 2e-16 ***
X2 1.79E+00 8.49E-02 21.084 < 2e-16 ***
X3 -1.14E-01 1.91E-02 -5.982 2.24E-09 ***
X4 -2.41E-04 2.37E-05 -10.171 < 2e-16 ***
X5 1.24E-01 3.33E-03 37.37 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
AIC= 18730.33 , BIC= 20409.89
Average individual fixed effects= 1.6716
---------------------------------------------------------------
Let the DGP be
n <- 1000
x <- rnorm(n)
id <- rep(1:2, each = n / 2)
y <- 1 * (rnorm(n) > 0)
so that we will be under the null hypothesis. As it says in ?bife, when there is no bias-correction, everything is the same as with glm, except for the speed. So let's start with glm.
modGLM <- glm(y ~ 1 + x + factor(id), family = binomial())
modGLM0 <- glm(y ~ 1, family = binomial())
One way to perform the LR test is with
library(lmtest)
lrtest(modGLM0, modGLM)
# Likelihood ratio test
#
# Model 1: y ~ 1
# Model 2: y ~ 1 + x + factor(id)
# #Df LogLik Df Chisq Pr(>Chisq)
# 1 1 -692.70
# 2 3 -692.29 2 0.8063 0.6682
But we may also do it manually,
1 - pchisq(c((-2 * logLik(modGLM0)) - (-2 * logLik(modGLM))),
modGLM0$df.residual - modGLM$df.residual)
# [1] 0.6682207
Now let's proceed with bife.
library(bife)
modBife <- bife(y ~ x | id)
modBife0 <- bife(y ~ 1 | id)
Here modBife is the full specification and modBife0 is only with fixed effects. For convenience, let
logLik.bife <- function(object, ...) object$logl_info$loglik
for loglikelihood extraction. Then we may compare modBife0 with modBife as in
1 - pchisq((-2 * logLik(modBife0)) - (-2 * logLik(modBife)), length(modBife$par$beta))
# [1] 1
while modGLM0 and modBife can be compared by running
1 - pchisq(c((-2 * logLik(modGLM0)) - (-2 * logLik(modBife))),
length(modBife$par$beta) + length(unique(id)) - 1)
# [1] 0.6682207
which gives the same result as before, even though with bife we, by default, have bias correction.
Lastly, as a bonus, we may simulate data and see if the test works as it's supposed to: over 1000 iterations, both tests (since the two tests are the same) indeed reject about as often as they should under the null.
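A minimal sketch of one way to run such a check, reusing n and the glm-based LR test from above (the seed and the nominal 5% cutoff are choices made here):
set.seed(42)
pvals <- replicate(1000, {
  x  <- rnorm(n)
  id <- rep(1:2, each = n / 2)
  y  <- 1 * (rnorm(n) > 0)   # generated under the null: x and id have no effect
  m1 <- glm(y ~ 1 + x + factor(id), family = binomial())
  m0 <- glm(y ~ 1, family = binomial())
  1 - pchisq(c((-2 * logLik(m0)) - (-2 * logLik(m1))),
             m0$df.residual - m1$df.residual)
})
mean(pvals < 0.05)   # should be close to the nominal 0.05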
I am trying to fit a Marketing Mix Model and run into the following problem:
Warning message:
In KFS(model) :
Possible error in diffuse filtering: Negative variances in Pinf, check the model or try changing the tolerance parameter tol or P1/P1inf of the model.
Below is a reproducible sample code with more detail.
The goal is to develop a tool to optimize marketing mix
Below, x1(t), x2(t), x3(t) are investments into 3 marketing channels at time t
y(t) is the sales at time t
We want to use Kalman Filter approach:
y(t) = alpha + lambda * y(t-1) + beta1 * x1(t) + beta2 * x2(t) + beta3 * x3(t) +
beta12 * x1(t) * x2(t) + beta13 * x1(t) * x3(t) + beta23 * x2(t) * x3(t) + N(0, sigma)
In order to deduce:
i. the sales attributed to each channel x1(t), x2(t), x3(t)
ii. their synergies x1 * x2, x1 * x3, x2 * x3
iii. the carry-over sales yCO(t) = y(t-1)
We use KFAS package
https://cran.r-project.org/web/packages/KFAS/index.html
Below is a simple reproducible example where we:
1. Simulate x1(t), x2(t), x3(t)
2. Set input parameter values used for sales simulation
b1, b2, b12, ..., b23, lambda0, and sigma
3. Use simulated y(t), x1(t), x2(t), x3(t) to fit the model
4. Compare the estimated coefficients with the input values b1, b2, ...
library(KFAS)
library(dplyr)
sigma<-50
set.seed(1)
x1<-1000 + rnorm(n = 100,mean = 0,sd = 100) + rnorm(100, 0, sigma)
x2<-rep(0, 100)
x2[sort(which(1:100%%6==0))]<-500
x3<-300+100*sin(1:100%%12/12*pi) + rnorm(100, 0, sigma)
#Operationalize with SQRT
x1<-sqrt(x1)
x2<-sqrt(x2)
x3<-sqrt(x3)
#Set input parameters for simulation
lambda0<-0.5
b1 <- 3
b2 <- 4
b3 <- 5
b12 <- 0.3
b13 <- 0.2
b23 <- 0.1
y_s <-
b1 * x1 +
b2 * x2 +
b3 * x3 +
b12 * x1 * x2 +
b13 * x1 * x3 +
b23 * x2 * x3 +
rnorm(100, sd = sigma)
# function to account for carry over term
# y(t) = lambda * y(t-1) + y_s(t) , where
# y_s(t) = b1 * x1 + b2 * x2 + b3 * x3 + Synergy terms
getCarryOver<-function(t,
y_s,
lambda)
{
if (t==1) return(y_s[1])
else lambda*getCarryOver(t-1,y_s,lambda) + y_s[t]
}
# Add Carry Over term
y<-vector('numeric',100)
for (i in 1:100){
y[i]<-getCarryOver(i,y_s,lambda0)
}
yCO=dplyr::lag(y)
yCO[1]=0
if (!identical(y[-1],y_s[-1]+lambda0*yCO[-1]))
stop('identical(y,y_s+lambda0*yCO)')
model <- SSModel(y ~ SSMregression(~ x1 +
x2 +
x3 +
x1*x2 +
x1*x3 +
x2*x3 +
yCO
, Q = diag(NA,1)), H = NA)
fit <- fitSSM(model, inits = c(0,0,0,0,0,0,0,0), method = "BFGS")
model <- fit$model
model$Q
model$H
out <- KFS(model)
print(out)
This doesn't fully answer your question but this is too long for a comment...
You are not creating the same model that you formulated before the code. By defining Q = NA you are actually stating that the first coefficient (for x1) should be time-varying with unknown variance. And then in the fitSSM call you are giving too many initial values, so you don't notice the error (only Q and H need numerical estimation by fitSSM; the coefficients for x1 etc. are estimated directly by the Kalman filter). I admit there are probably a few checks missing here which would warn the user accordingly. The SSMregression function is only needed if you have time-varying regression coefficients or complex multivariate models; here you can just write SSModel(y ~ x1*x2 + x1*x3 + x2*x3 + yCO, H = NA) (the main effects are automatically included, as in lm).
I would also check the carryover term calculations, just to be sure that you actually generate your data correctly.
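For the carry-over check, one quick sketch (reusing y, y_s, and lambda0 from the question) is to note that y(t) = lambda*y(t-1) + y_s(t) is a recursive filter, so the series can be rebuilt without the recursive function and compared to y:
y_check <- as.numeric(stats::filter(y_s, filter = lambda0, method = "recursive"))
all.equal(y, y_check)   # should be TRUE if the data were generated as intended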
If you still get errors, it could be that you have really high multicollinearity, in which case you need to modify the prior for the first time step, i.e. set model$P1inf[] <- 0 (which removes the diffuse initialization) and set diag(model$P1) to something moderate, like 100 (the prior variance of the coefficients).
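In code, that adjustment would look roughly like this (a sketch based on the simplified SSModel call above; 100 is just the moderate prior variance suggested, not a magic number):
model <- SSModel(y ~ x1*x2 + x1*x3 + x2*x3 + yCO, H = NA)
model$P1inf[] <- 0        # remove the diffuse initialization
diag(model$P1) <- 100     # moderate prior variance for the regression coefficients
fit <- fitSSM(model, inits = 0, method = "BFGS")
out <- KFS(fit$model)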
Actually, if you use the formulation you suggest (all x's and yCO as simple explanatory variables), then you should get identical results with lm. And running your code with lm I get the same, apparently wrong, results:
> model <- SSModel(y~ x1*x2 + x1*x3 + x2*x3 + yCO, H=NA)
>
> fit <- fitSSM(model, inits = 0, method = "BFGS")
> out <- KFS(fit$model)
Warning message:
In KFS(fit$model) :
Possible error in diffuse filtering: Negative variances in Pinf, check the model or try changing the tolerance parameter tol or P1/P1inf of the model.
> out
Smoothed values of states and standard errors at time n = 100:
Estimate Std. Error
(Intercept) -1.171e+03 1.300e+03
x1 3.782e+01 4.102e+01
x2 -4.395e+00 1.235e+01
x3 7.287e+01 6.844e+01
yCO 5.244e-01 3.396e-02
x1:x2 5.215e-01 3.979e-01
x1:x3 -1.853e+00 2.167e+00
x2:x3 1.671e-01 3.471e-01
> summary(lm(y~ x1*x2 + x1*x3 + x2*x3 + yCO))
Call:
lm(formula = y ~ x1 * x2 + x1 * x3 + x2 * x3 + yCO)
Residuals:
Min 1Q Median 3Q Max
-137.297 -29.870 -2.214 35.178 87.578
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.171e+03 1.300e+03 -0.901 0.370
x1 3.782e+01 4.102e+01 0.922 0.359
x2 -4.395e+00 1.235e+01 -0.356 0.723
x3 7.287e+01 6.844e+01 1.065 0.290
yCO 5.244e-01 3.396e-02 15.445 <2e-16 ***
x1:x2 5.215e-01 3.979e-01 1.311 0.193
x1:x3 -1.853e+00 2.167e+00 -0.855 0.395
x2:x3 1.671e-01 3.471e-01 0.481 0.631
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 50.25 on 92 degrees of freedom
Multiple R-squared: 0.88, Adjusted R-squared: 0.8709
F-statistic: 96.43 on 7 and 92 DF, p-value: < 2.2e-16
So I think you have some issues with generating your example data, or in the model formulation.
I would like to know how to constrain certain parameters in lm() to have positive coefficients. There are a few packages or functions (e.g. display) that can make all coefficients, and the intercept, positive.
For instance, in this example, I would like to force only x1 and x2 to have positive coefficients.
x1=c(NA,rnorm(99)*10)
x2=c(NA,NA,rnorm(98)*10)
x3=rnorm(100)*10
y=sin(x1)+cos(x2)-x3+rnorm(100)
lm(y~x1+x2+x3)
Call:
lm(formula = y ~ x1 + x2 + x3)
Coefficients:
(Intercept) x1 x2 x3
-0.06278 0.02261 -0.02233 -0.99626
I have tried the function nnnpls() in the nnls package; it can control the coefficient signs easily. Unfortunately, I can't use it because of the NAs in my data, as this function doesn't allow NA.
I saw that the function glmc() can be used to apply constraints, but I couldn't get it working.
Could someone let me know what I should do?
You could use the colf package for this. It currently offers two non-linear least squares optimizers, namely nls and nlxb:
library(colf)
colf_nlxb(y ~ x1 + x2 + x3, data = DF, lower = c(-Inf, 0, 0, -Inf))
#nlmrt class object: x
#residual sumsquares = 169.53 on 98 observations
# after 3 Jacobian and 3 function evaluations
# name coeff SEs tstat pval gradient JSingval
#1 param_X.Intercept. -0.0066952 NA NA NA 3.8118 103.3941
#2 param_x1 0.0000000 NA NA NA 103.7644 88.7017
#3 param_x2 0.0000000 NA NA NA 0.0000 9.8032
#4 param_x3 -0.9487088 NA NA NA 330.7776 0.0000
colf_nls(y ~ x1 + x2 + x3, data = DF, lower = c(-Inf, 0, 0, -Inf))
#Nonlinear regression model
# model: y ~ param_X.Intercept. * X.Intercept. + param_x1 * x1 + param_x2 *
# x2 + param_x3 * x3
# data: model_ingredients$model_data
#param_X.Intercept. param_x1 param_x2 param_x3
# -0.0392 0.0000 0.0000 -0.9801
# residual sum-of-squares: 159
#
#Algorithm "port", convergence message: both X-convergence and relative convergence (5)
You can set the lower and/or upper bounds to specify the limits as you like for each one of the coefficients.
You can use package penalized:
set.seed(1)
x1=c(NA,rnorm(99)*10)
x2=c(NA,NA,rnorm(98)*10)
x3=rnorm(100)*10
y=sin(x1)+cos(x2)-x3+rnorm(100)
DF <- data.frame(x1,x2,x3,y)
lm(y~x1+x2+x3, data=DF)
#Call:
#lm(formula = y ~ x1 + x2 + x3, data = DF)
#
#Coefficients:
#(Intercept) x1 x2 x3
# -0.02438 -0.01735 -0.02030 -0.98203
This gives the same:
library(penalized)
mod1 <- penalized(y, ~ x1 + x2 + x3, ~1,
lambda1=0, lambda2=0, positive = FALSE, data=na.omit(DF))
coef(mod1)
#(Intercept) x1 x2 x3
#-0.02438357 -0.01734856 -0.02030120 -0.98202831
If you constrain the coefficients of x1 and x2 to be positive, they become zero (as expected):
mod2 <- penalized(y, ~ x1 + x2 + x3, ~1,
lambda1=0, lambda2=0, positive = c(T, T, F), data=na.omit(DF))
coef(mod2)
#(Intercept) x3
#-0.03922266 -0.98011223
With the ConsReg package (https://cran.r-project.org/web/packages/ConsReg/index.html) you can deal with this kind of problem.
You can set bound limits (lower and upper) and also restrictions between coefficients, such as beta1 > beta2, which in some cases can be very useful.