Subset of predictors using coefplot() - r

I'd like to do a plot of coefficients using coefplot() that only takes into account a subset of the predictors that I'm using. For example, if you have the code
y1 <- rnorm(1000,50,23)
x1 <- rnorm(1000,50,2)
x2 <- rbinom(1000,1,prob=0.63)
x3 <- rpois(1000, 2)
fit1 <- lm(y1 ~ x1 + x2 + x3)
and then ran
coefplot(fit1)
it would give you a plot displaying the coefficients of the intercept, x1, x2 and x3. How can I modify this so I only get the coefficients for say, x1 and x2?

You can use the argument predictors and it will only plot the coefficients you need:
library(coefplot)
coefplot(fit1, predictors=c('x1','x2'))
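If you would rather not rely on the plotting package, the same subset can be pulled out by hand. This is only a base-R sketch, not what coefplot() does internally: the seed, the `keep` vector and the plotting choices are my own assumptions, and the intervals are ordinary 95% confint() intervals.

```r
# Simulate the data from the question (seed added for reproducibility)
set.seed(1)
y1 <- rnorm(1000, 50, 23)
x1 <- rnorm(1000, 50, 2)
x2 <- rbinom(1000, 1, prob = 0.63)
x3 <- rpois(1000, 2)
fit1 <- lm(y1 ~ x1 + x2 + x3)

# Keep only the coefficients of interest
keep <- c("x1", "x2")
est <- coef(fit1)[keep]
ci <- confint(fit1, parm = keep)  # 95% intervals, one row per kept term

# Dot-and-whisker plot: one row per coefficient
plot(est, seq_along(keep), xlim = range(c(0, ci)), yaxt = "n",
     xlab = "Estimate", ylab = "", pch = 19)
axis(2, at = seq_along(keep), labels = keep, las = 1)
segments(ci[, 1], seq_along(keep), ci[, 2], seq_along(keep))
abline(v = 0, lty = 2)  # reference line at zero
```

Changing `keep` changes which terms are drawn; the intercept is left out simply because it is not listed.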

Related

Regression Output from probitmfx

I am trying to produce a nice regression table of marginal effects and p-values from the probitmfx function, where the p-value is reported under the marginal effect for each covariate. A picture of what I'd like it to look like is here: Similar Output from Stata.
I tried the stargazer function, as suggested here but this does not seem to work if I don't have an OLS / probit.
data_T1 <- read_dta("xxx")
#specification (1)
T1_1 <- probitmfx(y ~ x1 + x2 + x3, data=data_T1)
#specification (2)
T1_2 <- probitmfx(y ~ x1 + x2 + x3 + x4 + x5, data=data_T1)
#this is what I tried but does not work
table1 <- stargazer(coef=list(T1_1$mfxest[,1], T1_2$mfxest[,1]),
p=list(T1_2$mfxest[,4],T1_2$mfxest[,4]), type="text")
Any suggestions how I can design such a table in R?
You can probably use the parameters package to produce a nicely formatted table:
Code:
library(mfx)
library(parameters)
# simulate some data
set.seed(12345)
n <- 1000
x <- rnorm(n)
# binary outcome
y <- ifelse(pnorm(1 + 0.5 * x + rnorm(n)) > 0.5, 1, 0)
data <- data.frame(y, x)
mod <- probitmfx(formula = y ~ x, data = data)
print_html(model_parameters(mod))
This produces an HTML table that can be used in R Markdown.

Constrained weighted linear regression in R

I am trying to set up a constrained weighted linear regression. That is, I have a dataset of i observations and three different x variables, and each observation has a weight. I want to perform a weighted multiple linear regression under the restrictions that the weighted mean of each x variable is zero and its weighted standard deviation is one.
Since I am new and have no reputation yet, I can't post images with LaTeX formulas, so I have to write them down this way.
First restriction $\sum_{i} w_{i} X_{i,k} = 0$ for k = 1,2,3.
Second one: $\sum_{i} w_{i} X_{i,k}^2 = 1$ for k = 1,2,3.
This is an example dataset:
y <- rnorm(10)
w <- rep(0.1, 10)
x1 <- rnorm(10)
x2 <- rnorm(10)
x3 <- rnorm(10)
data <- cbind(y, x1, x2, x3, w)
lm(y ~ x1 + x2 + x3, data = data, weights = data$w)
The weights do not have to be equal for each observation but have to add up to one.
I would like to include these restrictions into the regression. Is there a way to do that?
You could perhaps use the Generalised Linear Model:
glm(y ~ x1 + x2 + x3, weights = w, data=data)
Note that data needs to be a data.frame(...), not the matrix returned by cbind().
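Since both restrictions are conditions on the x variables rather than on the coefficients, one way to satisfy them is to rescale each predictor before fitting. The following is a minimal sketch under that reading of the question; the helper `w_standardize` and the data frame name `dat` are made up for illustration.

```r
set.seed(1)
n <- 10
w <- rep(1 / n, n)  # weights that sum to one (they need not be equal)
dat <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Hypothetical helper: subtract the weighted mean, divide by the weighted SD
w_standardize <- function(x, w) {
  xc <- x - sum(w * x)
  xc / sqrt(sum(w * xc^2))
}

xs <- c("x1", "x2", "x3")
dat[xs] <- lapply(dat[xs], w_standardize, w = w)

# Check the two restrictions on every predictor
wmean <- sapply(dat[xs], function(x) sum(w * x))    # should be ~0
wvar  <- sapply(dat[xs], function(x) sum(w * x^2))  # should be ~1

# Weighted regression on the rescaled predictors
fit <- lm(y ~ x1 + x2 + x3, data = dat, weights = w)
```

The fitted coefficients are then on the scale of the standardized predictors. If the restrictions were instead meant to apply to something other than the predictors, this sketch does not cover that case.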

R : Plotting Prediction Results for a multiple regression

I want to observe the effect of a treatment variable on my outcome Y. I ran a multiple regression: fit <- lm (Y ~ x1 + x2 + x3). x1 is the treatment variable and x2, x3 are the control variables. I used the predict function, holding x2 and x3 at their means, and plotted the predictions.
Now I would like to add a line to my plot, similar to abline for a simple regression, but I do not know how to do this.
I think I have to use lines(x, y), where y is the prediction and x is a sequence of values for my variable x1. But R tells me the lengths of y and x differ.
I think you are looking for termplot:
## simulate some data
set.seed(0)
x1 <- runif(100)
x2 <- runif(100)
x3 <- runif(100)
y <- cbind(1,x1,x2,x3) %*% runif(4) + rnorm(100, sd = 0.1)
## fit a model
fit <- lm(y ~ x1 + x2 + x3)
termplot(fit, se = TRUE, terms = "x1")
termplot uses predict.lm(, type = "terms") for term-wise prediction. If a model has an intercept (like the one above), predict.lm will centre each term (see "What does predict.glm(, type="terms") actually do?"). In this way, each term is predicted to be 0 at the mean of the covariate, and the standard error at the mean is 0 (hence the confidence interval intersects the line at the mean).
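If you want the line on the original scale of y instead, the "lengths differ" error usually goes away once the predictions are made on the same grid that is passed to lines(). A minimal sketch, assuming the simulated data above; the names `grid` and `pred` and the grid size of 50 are arbitrary choices of mine.

```r
# Simulate data as in the answer above
set.seed(0)
x1 <- runif(100)
x2 <- runif(100)
x3 <- runif(100)
y <- drop(cbind(1, x1, x2, x3) %*% runif(4)) + rnorm(100, sd = 0.1)
fit <- lm(y ~ x1 + x2 + x3)

# Grid over x1, with the controls held at their sample means
grid <- data.frame(x1 = seq(min(x1), max(x1), length.out = 50),
                   x2 = mean(x2),
                   x3 = mean(x3))

# One prediction (plus confidence limits) per grid row
pred <- predict(fit, newdata = grid, interval = "confidence")

plot(x1, y)
lines(grid$x1, pred[, "fit"])           # fitted line
lines(grid$x1, pred[, "lwr"], lty = 2)  # lower confidence limit
lines(grid$x1, pred[, "upr"], lty = 2)  # upper confidence limit
```

Because `pred` has exactly one row per row of `grid`, the x and y vectors given to lines() always have matching lengths.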

R: How to run regression (glm function) with some coefficients specified?

Suppose I have one dependent variable, and 4 independent variables. I suspect only 3 of the independent variables are significant, so I use the glm(y~ x1 + x2 + x3...) function. Then I get some coefficients for these variables. Now I want to run glm(y ~ x1 + x2 + x3 + x4), but I want to specify that the x1, x2, x3 coefficients remain the same. How could I accomplish this?
Thanks!
I don't think you can fit a model where some of the independent variables have fixed parameters. What you can do is create a new variable y2 that equals the predicted value of your first model with x1+x2+x3. Then, you can fit a second model y~y2+x4 to include it as an independent variable along with x4.
So basically, something like this:
m1 <- glm(y~x1+x2+x3...)
data$y2 <- predict(m1, newdata=data)
m2 <- glm(y~y2+x4...)
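For what it's worth, glm() formulas also accept an offset() term, which enters the linear predictor with its coefficient held at 1. That gives one way to keep the x1, x2, x3 coefficients at their first-stage values while estimating only x4. A minimal sketch: the simulated data, the binomial family and the column name `fixed_part` are all assumptions for illustration, since the question does not say which family is used.

```r
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.3 * dat$x2 +
                             0.2 * dat$x3 + 0.4 * dat$x4))

# Stage 1: estimate the coefficients of x1, x2, x3
m1 <- glm(y ~ x1 + x2 + x3, family = binomial, data = dat)
b <- unname(coef(m1)[c("x1", "x2", "x3")])

# Contribution of x1..x3 to the linear predictor, frozen at the stage-1 values
dat$fixed_part <- b[1] * dat$x1 + b[2] * dat$x2 + b[3] * dat$x3

# Stage 2: only the intercept and the x4 coefficient are estimated
m2 <- glm(y ~ x4 + offset(fixed_part), family = binomial, data = dat)
coef(m2)
```

Unlike y ~ y2 + x4, no extra coefficient is estimated for the frozen part, so the x1, x2, x3 effects stay exactly as in m1. The intercept is re-estimated here; to freeze it as well, add it to `fixed_part` and drop it from the formula with `- 1`.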

matrix of correlations

I'm new to using R and I am trying to create a matrix of correlations. I have three independent variables (x1,x2,x3) and one dependent variable (y).
I've been trying to use cor to make a matrix of the correlations, but so far I have been unable to find a formula for doing this.
x1=rnorm(20)
x2=rnorm(20)
x3=rnorm(20)
y=rnorm(20)
data=cbind(y,x1,x2,x3)
cor(data)
If I have understood correctly, you have a matrix of 3 columns (say x1 to x3) and many rows (as y values). You can proceed as follows:
foo = matrix(runif(30), ncol=3) # creating a matrix of 3 columns
cor(foo)
If you already have your values in 3 vectors x1 to x3, you can build foo like this: foo=data.frame(x1,x2,x3)
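To include y as well, cor() can be called on all four columns at once, and the row for y then gives its correlation with each predictor. A small sketch with simulated values; the seed and the names `dat` and `cm` are arbitrary.

```r
set.seed(1)
dat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))

# Full pairwise correlation matrix: 4 x 4, symmetric, ones on the diagonal
cm <- cor(dat)
round(cm, 2)

# Correlation of y with each independent variable
cm["y", c("x1", "x2", "x3")]
```

cor() computes Pearson correlations by default; the `method` argument switches to "spearman" or "kendall".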
Correct me if I'm wrong, but assuming this is related to a regression problem, this might be what you're looking for:
#Set the number of data points and build 3 independent variables
set.seed(0)
numdatpoi <- 7
x1 <- runif(numdatpoi)
x2 <- runif(numdatpoi)
x3 <- runif(numdatpoi)
#Build the dependent variable with some added noise
noisig <- 10
yact <- 2 + (3 * x1) + (5 * x2) + (10 * x3)
y <- yact + rnorm(n=numdatpoi, mean=0, sd=noisig)
#Fit a linear model
rmod <- lm(y ~ x1 + x2 + x3)
#Build the variance-covariance matrix. This matrix is typically what is wanted.
(vcv <- vcov(rmod))
#If needed, convert the variance-covariance matrix to a correlation matrix
(cm <- cov2cor(vcv))
From the above, here's the variance-covariance matrix:
            (Intercept)        x1        x2        x3
(Intercept)    466.5773   14.3368 -251.1715 -506.1587
x1              14.3368  452.9569 -170.5603 -307.7007
x2            -251.1715 -170.5603  387.2546  255.9756
x3            -506.1587 -307.7007  255.9756  873.6784
And, here's the associated correlation matrix:
            (Intercept)          x1         x2         x3
(Intercept)  1.00000000  0.03118617 -0.5908950 -0.7927735
x1           0.03118617  1.00000000 -0.4072406 -0.4891299
x2          -0.59089496 -0.40724064  1.0000000  0.4400728
x3          -0.79277352 -0.48912986  0.4400728  1.0000000
