Declare the slope coefficient of a linear regression model as a variable - R

Creating linear regressions in R is straightforward because the interface is simple. However, I have had a lot of difficulty referring back to the slope of a newly created trend line.
I have the following:
#Reproducible data
v1<-c(1:20)
v2<-c(1:20)
v2<-v2^2
df1<-as.data.frame(cbind(v1,v2))
v3<-c(1:20)
v4<-c(1:20)
v4<-v4^3
df2<-as.data.frame(cbind(v3,v4))
#Model
lm1<-lm(v2~v1,df1)
lm2<-lm(v4~v3,df2)
How do I declare the slope coefficients of lm1 and lm2 as variables for later use? I can find plenty of material on interpreting the slope, which I already understand, but nothing on extracting it.
A step further: what if I create a linear model with more than one explanatory variable? How would I get the slope coefficients and declare them as variables?
#Reproducible data
v1<-c(1:20)
v2<-c(1:20)
v2<-v2^2
v5<-seq(0.01, 20, length.out = 20)
df1<-as.data.frame(cbind(v1,v2,v5))
v3<-c(1:20)
v4<-c(1:20)
v4<-v4^3
v6<-seq(0.01, 20, length.out = 20)
df2<-as.data.frame(cbind(v3,v4,v6))
#Model
lm1<-lm(v2~v1+v5,df1)
lm2<-lm(v4~v3+v6,df2)

You can find the coefficients from your regression using:
lm1$coefficients
lm2$coefficients
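To store a coefficient in its own variable, index the named vector that coef() returns (it is the same object as lm1$coefficients). A minimal sketch, reusing the question's df1 for the single-predictor case and the built-in mtcars data (an illustrative stand-in, not from the question) for the multi-predictor case:

```r
# Reproducible data from the question
v1 <- 1:20
v2 <- (1:20)^2
df1 <- data.frame(v1, v2)
lm1 <- lm(v2 ~ v1, df1)

# coef() returns a named vector, so index by the predictor's name
slope1     <- coef(lm1)["v1"]            # slope on v1
intercept1 <- coef(lm1)["(Intercept)"]

# With more than one explanatory variable, index by name the same way,
# or drop the intercept to get all slopes at once
lm_multi <- lm(mpg ~ wt + hp, data = mtcars)
slopes   <- coef(lm_multi)[-1]           # named vector with elements wt, hp
```

The stored values are ordinary numeric variables and can be used in any later computation.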

Related

Generalized Linear Model (GLM) in R

I have a response variable (A) which I transformed (logA) and a predictor (B) from data (X), both continuous. How do I check for linearity between the two variables using a Generalized Additive Model (GAM) in R? I use the following code
model <- gamlss(logA ~ pb(B) , data = X, trace = F)
but I am not sure about it. Can I add "family=Poisson" to the code when logA is continuous, as in a GLM? Any thoughts on this?
Thanks in advance
If your dependent variable is a count variable, you can use family=PO() without a log transformation; with family=PO() a log link is already applied to the response. See the help page for gamlss.family and also section 2.1 of the vignette on count regression.
So it will go like:
library(gamlss)
fit = gamlss(gear ~ pb(mpg),data=mtcars,family=PO())
You can see that the predictions are on the log scale, so you need to take the exponential:
with(mtcars,plot(mpg,gear))
points(mtcars$mpg,exp(predict(fit,what="mu")),col="blue",pch=20)

Linear regression with errors on x and y

I have two variables, x and y, each measured with an associated error at every point. I'm trying to fit a linear regression model in R that accounts for the error in both variables. I see that you can use weights in lm() to weight the regression, but as far as I can tell this only incorporates errors on one variable. Is there any way to fit a linear model that takes the errors on both variables into account?
Thanks to @Stéphane Laurent for the answer.
The package "deming" contains a function to do exactly this.
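A sketch of how that might look: the deming package's deming() function accepts per-point measurement-error standard deviations through its xstd and ystd arguments. The simulated data and error sizes below are illustrative, not from the question, and the call is guarded because deming is a CRAN package rather than part of base R:

```r
# Simulated data where both x and y are observed with known error
set.seed(1)
x_true <- 1:30
dat <- data.frame(
  x = x_true + rnorm(30, sd = 0.5),        # error on x
  y = 2 * x_true + 1 + rnorm(30, sd = 1)   # error on y
)

if (requireNamespace("deming", quietly = TRUE)) {
  # xstd/ystd give the standard deviation of the error at each point
  fit <- deming::deming(y ~ x, data = dat,
                        xstd = rep(0.5, 30), ystd = rep(1, 30))
  print(coef(fit))  # intercept and slope accounting for error in both variables
}
```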

How to rectify heteroscedasticity for multiple linear regression model

I'm fitting a multiple linear regression model with 6 predictors (3 continuous and 3 categorical). The residuals vs. fitted plot shows that there is heteroscedasticity, which is also confirmed by bptest().
summary of sales_lm
residuals vs. fitted plot
I also calculated the RMSE for my train and test data, as shown below:
sqrt(mean((sales_train_lm_pred - sales_train$SALES)^2))
[1] 3533.665
sqrt(mean((sales_test_lm_pred - sales_test$SALES)^2))
[1] 3556.036
I tried to fit a glm() model, but it still didn't fix the heteroscedasticity.
glm.test3<-glm(SALES~.,weights=1/sales_fitted$.resid^2,family=gaussian(link="identity"), data=sales_train)
The resid vs. fitted plot for glm.test3 looks weird:
glm.test3 plot
Could you please help me with what I should do next?
Thanks in advance!
That you observe heteroscedasticity in your data means that the error variance is not constant. You can try the following:
1) Apply the one-parameter Box-Cox transformation (of which the log transform is a special case) with a suitable lambda to one or more variables in the data set. The optimal lambda can be determined by looking at its log-likelihood function. Take a look at MASS::boxcox.
2) Play with your feature set (decrease, increase, add new variables).
3) Use the weighted linear regression method.
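For option 1, a minimal sketch of the MASS::boxcox workflow (MASS ships with R; the simulated data here are illustrative, built with multiplicative noise so that the log transform, lambda near 0, is the right answer):

```r
library(MASS)  # ships with R; provides boxcox()

# Illustrative data with multiplicative noise, so log(y) is homoscedastic
set.seed(42)
x <- runif(100, 1, 10)
y <- exp(0.3 * x + rnorm(100, sd = 0.2))

# Profile the Box-Cox log-likelihood over a grid of lambda values
bc <- boxcox(lm(y ~ x), lambda = seq(-2, 2, 0.05), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]   # lambda maximizing the log-likelihood

# Apply the chosen transformation and refit
y_t <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
fit_t <- lm(y_t ~ x)
```

After refitting, check the residuals vs. fitted plot of fit_t (and bptest() again) to confirm the variance has stabilized.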

Obtaining predicted (i.e. expected) values from the orm function (Ordinal Regression Model) from rms package in R

I've run a simple model using orm (i.e., reg <- orm(formula = y ~ x)) and I'm having trouble understanding how to get predicted values for Y. I've never worked with models that use multiple intercepts. I want to know, for each value of Y in my dataset, what the predicted value from the model would be. I tried predict(reg, type="mean") and this produced values that are close to the predicted values from an OLS regression, but I'm not sure if this is what I want. I really just want something analogous to OLS, where you can obtain E(Y) given a set of predictors. If possible, please provide code I can run to do this, with a very short explanation.

How do I plot predictions from new data fit with gee, lme, glmer, and gamm4 in R?

I have fit my discrete count data using a variety of functions for comparison. I fit a GEE model using geepack, a linear mixed effect model on the log(count) using lme (nlme), a GLMM using glmer (lme4), and a GAMM using gamm4 (gamm4) in R.
I am interested in comparing these models and would like to plot the expected (predicted) values for a new set of data (predictor variables). My goal is to compare the predicted effects for each model under particular conditions (x variables). Of particular interest is the comparison between marginal (GEE) and conditional estimates.
I think my main problem might be getting the new data in the correct form with the correct labels and attributes and such. I am still very much an R novice and struggle with this stuff (no course on this at my university unfortunately).
I currently have fitted models
gee1 lme1 lmer1 gamm1
and can extract their fixed effect coefficients and standard errors without a problem. I also don't have a problem converting them from the log scale or estimating confidence intervals accounting for the random effects.
I also have my new dataframe newdat which has 365 observations of 23 variables (average environmental data for each day of the year).
I am stuck on how to predict new count estimates from this. I played around with the model.matrix function but couldn't get it to work. For example, I tried:
mm = model.matrix(terms(glmm1), newdat)
# Error in model.frame.default(object, data, xlev = xlev) :
#   object is not a matrix
newdat$pcount = mm %*% fixef(glmm1)
Any suggestions or good references would be greatly appreciated. Can anyone help with the error above?
Getting predictions for lme() and lmer() is documented on http://glmm.wikidot.com/faq
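The FAQ's fixed-effects prediction recipe boils down to multiplying a model matrix built from the right-hand-side formula by the coefficient vector. A minimal base-R sketch of the same pattern, shown with lm() and the built-in cars data so it is self-contained; for a merMod fit from lmer() you would use fixef() in place of coef():

```r
# Fit and a small new-data frame with the same predictor name as the formula
fit <- lm(dist ~ speed, data = cars)
newdat <- data.frame(speed = c(10, 15, 20))

# Build the model matrix from the fixed-effects formula itself,
# not from terms(<some other object>)
mm <- model.matrix(~ speed, newdat)
newdat$pred <- drop(mm %*% coef(fit))

# Sanity check: matches predict() on the same new data
all.equal(newdat$pred, unname(predict(fit, newdat)))
```

The error in the question comes from passing an object name (glmm1) that doesn't match any fitted model; supplying the formula directly, as above, sidesteps that.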

Resources