Problems with Fixed effects panel data - r

I am trying to run a regression on panel data from the Michigan Consumers Survey. This is the first time I am using panel data in R, so I am not very familiar with the "plm" package that is needed. I am setting up my panel data for fixed effects on individuals (CASEID) and time (YYYY):
Michigan_panel <- pdata.frame(Michigan_survey, index = c("CASEID", "YYYY"))
Then I am using the following regression:
mod_1 <- plm(data = Michigan_panel, ICS ~ ICE + PX1Q2 + RATEX + ZLB + INCOME + AGE + EDUC + MARRY + SEX + AGE_sq, model = "within")
However R is showing me the following error:
> mod_1 <- plm(data = Michigan_panel, ICS ~ ICE + PX1Q2 + RATEX + ZLB + INCOME + AGE + EDUC + MARRY + SEX + AGE_sq, model = "within")
Error in plm.fit(data, model, effect, random.method, random.models, random.dfcor, :
empty model
Does anyone know what I am doing wrong?

Could you give a link to this specific survey? I found several datasets with this name.
I suspect (only suspect) that your data isn't panel data; please check the CASEID variable.
Changing the order of formula and data in plm won't solve your problem.
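One way to check the suspicion above is to count how many rows each CASEID contributes. The sketch below uses a small made-up data frame (`toy`, with invented values) in place of Michigan_survey, which is not available here. If every respondent appears only once, the within transformation leaves no variation to estimate from, which may well be what produces the "empty model" error.

```r
# Toy stand-in for Michigan_survey (hypothetical values, not the real survey)
toy <- data.frame(
  CASEID = c(1, 2, 3, 4, 5),
  YYYY   = c(2010, 2010, 2011, 2011, 2012),
  ICS    = c(80, 95, 70, 88, 91)
)

# Number of rows contributed by each respondent
obs_per_id <- table(toy$CASEID)

# Distribution of panel lengths (how many respondents have 1, 2, ... rows)
print(table(obs_per_id))

# Share of respondents observed in more than one period
share_repeated <- mean(obs_per_id > 1)
print(share_repeated)
```

Running the same three lines on the real data (with `Michigan_survey` in place of `toy`) would show whether there are enough repeated respondents for a fixed effects model.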

I think the error comes from how you wrote the model. Your current call is:
mod_1 <- plm(data = Michigan_panel, ICS ~ ICE + PX1Q2 + RATEX + ZLB + INCOME + AGE + EDUC + MARRY + SEX + AGE_sq, model = "within")
In my view, you have to specify the indexes in the plm call (through the index argument, not in the formula) and follow the argument order of the plm package. I would write your call as follows:
mod_1 <- plm(ICS ~ ICE + PX1Q2 + RATEX + ZLB + INCOME + AGE + EDUC + MARRY + SEX + AGE_sq,
data = Michigan_panel,
index= c("CASEID", "YYYY"),
model = "within")
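1. Different Approach
As far as I know, we can also write this model in a more compact form.
library(plm)
Michigan_panel <- pdata.frame(Michigan_survey, index = c("CASEID", "YYYY"))
attach(Michigan_panel)
y <- cbind(ICS)
X <- cbind(ICE, PX1Q2, RATEX, ZLB, INCOME, AGE, EDUC, MARRY, SEX, AGE_sq)
# the time index is called YYYY in this data set, not YEAR
model1 <- plm(y ~ X + factor(CASEID) + factor(YYYY), data = Michigan_panel, model = "within")
summary(model1)
detach(Michigan_panel)
Adding factor(CASEID) and factor(YYYY) adds dummy variables for each individual and each year to your model. Note that with model = "within" the individual effects are already removed by the within transformation, so factor(CASEID) is redundant there.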
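To illustrate the relationship between individual dummies and the within estimator, here is a minimal base R sketch on simulated data (the variables `id`, `x` and `y` are invented for the example and have nothing to do with the survey). It compares the slope from a regression with one dummy per individual against the slope from a regression on individually demeaned data.

```r
set.seed(1)

# Simulated panel: 4 individuals, 3 periods each, true slope of 2
toy <- data.frame(id = rep(1:4, each = 3), x = rnorm(12))
toy$y <- 2 * toy$x + rep(c(1, -1, 0.5, 3), each = 3) + rnorm(12, sd = 0.1)

# Dummy variable approach: one intercept shift per individual
b_lsdv <- coef(lm(y ~ x + factor(id), data = toy))["x"]

# Within approach: subtract each individual's mean, then regress without intercept
toy$x_dm <- toy$x - ave(toy$x, toy$id)
toy$y_dm <- toy$y - ave(toy$y, toy$id)
b_within <- coef(lm(y_dm ~ x_dm - 1, data = toy))["x_dm"]

# The two slope estimates agree up to numerical precision
print(all.equal(unname(b_lsdv), unname(b_within)))
```

The same demeaning also shows why a variable that never changes within an individual (or data with one row per individual) drops out: after subtracting the individual mean it is identically zero.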

Related

Create Well-Being Latent Variable in Lavaan from Depression/Anxiety Questionnaire

I'm building a structural equation model that incorporates 4 latent variables: physical lifestyle, social lifestyle, trauma score, and the DV (well-being).
We have a 7-question survey of well-being alone, but I think it would be more sound (less measurement error) to combine three surveys (well-being, depression, and anxiety) into a single latent dependent variable. I received a warning that the covariance matrix was not positive definite when using only the scaled scores from the surveys, so I decided to incorporate the individual survey questions instead. However, when I do this and then look at the modification indices, the output suggests that the residuals are not currently correlated. I thought correlated residuals were the default for any latent variable, so I am wondering whether I am specifying the well-being latent variable correctly (that is, whether it is just a matter of adding in all the questions that will ultimately make up this latent variable).
Below is the entire model. The latent variable "well-being" currently only has questions from the PHQ-9 depression survey and the General Anxiety survey (I will also be adding in the well-being survey). I've added the output for the modification indices below that.
I've included some data here: https://drive.google.com/file/d/1AX50DFNik30Qsyiyp6XnPMETNfVXK83r/view?usp=sharing
Thanks much!
fit.latent_wb <- '
  # factor loadings; measurement model portion
  pl =~ exercisescore + mindfulnessscore + promistscore
  sl =~ family_support + friendshipcount + friendshipnet +
        sense_of_community + sesscore + ethnicity
  trauma =~ neglectscore + abusescore + exposure + family_support + age
  wb =~ phq9_1 + phq9_2 + phq9_3 + phq9_4 + phq9_5 + phq9_6 +
        phq9_7 + phq9_8 + phq9_9 + gad7_1 + gad7_2 + gad7_3 + gad7_4 +
        gad7_5 + gad7_6 + gad7_7
  # regressions: structural model
  wb ~ age + gender + ethnicity + sesscore + resiliencescore +
       pl + emotionalsupportscore + trauma
  resiliencescore ~ age + sesscore + emotionalsupportscore + sl
  emotionalsupportscore ~ sl + gender
  friendshipnet ~~ age
  exercisescore ~~ sense_of_community
'
fit.latent_wb <- sem(fit.latent_wb, data = total, meanstructure = TRUE, std.lv = TRUE)
summary(fit.latent_wb, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE, estimates = FALSE)
Output for Mod Indices:

Issue with setting wd

Hey everyone, I'm having some trouble executing this code. Here is what my boss told me to run; I'm stuck on what seems like a basic issue. Please advise.
require(data.table)
require(MASS)
dat = fread("~/OneDrive - SUNY Upstate Medical University/bin/projects/rdoc/brett_project_igt/WGCNA_moduleEigengenes_SamplePhenotype.txt")
#working on setting the right directory
#C:\Users\brett\OneDrive\Desktop\Lab\brett_project_igt
dat = fread("~/Users/brett/OneDrive/Desktop/Lab/WGCNA_moduleEigengenes_SamplePhenotype.txt")
hist(dat$IGT_total_net)
hist(dat$IGT_total_lat)
# linear model
fit = lm(IGT_total_net ~ Age + Gender + SV1 + race + RIN + ME36, data = dat)
summary(fit)
# negative binomial model
fit = glm.nb(IGT_total_lat ~ Age + Gender + SV1 + race + RIN + ME36, data = dat)
summary(fit)
Here is my most recent error message. Earlier issues included not having the right working directory and problems with the fread function. Thanks in advance.
Error in fread("C:/Users/brett/AppData/Local/Packages/Microsoft.MicrosoftEdge_8wekyb3d8bbwe/TempState/Downloads") :
File 'C:/Users/brett/AppData/Local/Packages/Microsoft.MicrosoftEdge_8wekyb3d8bbwe/TempState/Downloads' is a directory. Not yet implemented.
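The error message says that the path passed to fread points to a folder rather than a file. Below is a minimal sketch of checking a path before reading it. It uses a temporary folder as a hypothetical stand-in for the project directory, writes a tiny made-up file there, and reads it with base R's read.delim instead of fread so that it runs without data.table; with data.table loaded, fread(data_file) would take the same full file path.

```r
# Hypothetical layout: a temp folder stands in for the real project directory
project_dir <- file.path(tempdir(), "brett_project_igt")
dir.create(project_dir, showWarnings = FALSE)

# Build the full path to the FILE, not just the folder that contains it
data_file <- file.path(project_dir, "WGCNA_moduleEigengenes_SamplePhenotype.txt")

# Write a tiny made-up table so the example has something to read
write.table(data.frame(IGT_total_net = c(10, -4, 22)), data_file,
            sep = "\t", row.names = FALSE, quote = FALSE)

# Sanity checks before reading: the folder is a directory, the file is not
is_folder <- dir.exists(project_dir)
is_plain_file <- file.exists(data_file) && !dir.exists(data_file)
print(c(folder = is_folder, file = is_plain_file))

# Read the tab-separated file by its full path
dat <- read.delim(data_file)
print(dat)
```

Building the path with file.path() and checking it with file.exists() before the read makes it easier to see whether the problem is the working directory or the file name itself.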

How to get p-values for random effects in glmer

I want to analyze when the claims of a protest are directed at the state, based on action and country level characteristics, using glmer. So, I would like to obtain p-values of both the fixed and random effects. My model looks like this:
targets <- glmer(state ~ ENV + HLH + HRI + LAB + SMO + Capital +
(1 + rile + parties + rep + rep2 + gdppc + election| Country),
data = df, family = binomial)
The output only gives me the Variance & Std.Dev. of the random effects, as well as the correlations among them, which makes sense for most multilevel analyses but not for my purposes. Is there any way I can get something like the estimates and the p-values for the random effects?
If this cannot be done with R, is there any other statistical software that would give such an output?
UPDATE: Following the suggestions here, I have moved this question to Cross Validated: https://stats.stackexchange.com/questions/381208/r-how-to-get-estimates-and-p-values-for-random-effects-in-glmer
library(lme4)
library(lattice)
xyplot(incidence/size ~ period|herd, cbpp, type=c('g','p','l'),
layout=c(3,5), index.cond = function(x,y)max(y))
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
summary(gm1)
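Continuing the cbpp example, the per-group estimates of the random effects (their conditional modes) can be extracted with ranef(). This is a sketch that assumes lme4 is installed. The conditional standard deviations it reports are not p-values, and whether a formal test of individual random effects is meaningful is a statistical question better suited to Cross Validated.

```r
library(lme4)

# Same model as in the cbpp example: random intercept per herd
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Conditional modes of the random effects, with conditional variances attached
re <- ranef(gm1, condVar = TRUE)

# Long format: one row per group and term, with estimate (condval)
# and conditional standard deviation (condsd)
re_df <- as.data.frame(re)
print(head(re_df))
```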

How do you exclude interaction term in r lm?

I'm working with a model from the Prestige dataset in the car package in R.
library(car)
library(carData)
data = na.omit(Prestige)
prestige = data$prestige
income = data$income
education = data$education
type = data$type
I'm trying to fit the model lm(prestige ~ income + education + type + income:type + education:type). For class I'm starting with the full model and working down to a smaller model, just backward selection. One of the least useful covariates according to p-value is the education:typeprof. How do I just delete that covariate from the model without taking out all the education:type interactions? In general how do you exclude interactions with factors? I saw an answer with the update function specifying which interaction to exclude but it didn't work in my case. Maybe I implemented it incorrectly.
fit4 = lm(prestige ~ income + education + type + income:type + education:type)
newfit = update(fit4, . ~ . - education:typeprof)
Unfortunately this didn't work for me.
There is a way to drop a single interaction term. Suppose you have the linear model
fullmodel = lm(prestige ~ income + education + type + income:type + education:type - 1)
You can call model.matrix on fullmodel, which gives you the X matrix for your linear model. From there you can pick the column you'd like to drop and refit the model on the remaining columns.
X = model.matrix(fullmodel)
# position of the single interaction column to remove
drop_col = which(colnames(X) == 'education:typeprof')
# remove that column (not the first column)
X1 = X[, -drop_col]
newfit = lm(prestige ~ X1 - 1)
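The update() attempt in the question most likely has no effect because formula terms exist at the level of whole variables: education:type is a term, while education:typeprof is only a column of the design matrix. The sketch below shows the column-dropping idea on the built-in iris data, used here as a stand-in so that it runs without the car package; the variable names are therefore different from the Prestige example.

```r
# Full model with a numeric-by-factor interaction
full <- lm(Sepal.Length ~ Petal.Length + Species + Petal.Length:Species,
           data = iris)

# Design matrix: the interaction expands to one column per non-reference level
X <- model.matrix(full)
print(colnames(X))

# Drop exactly one interaction column and keep everything else
drop_col <- which(colnames(X) == "Petal.Length:Speciesversicolor")
X1 <- X[, -drop_col]

# X1 still contains the intercept column, so suppress lm's own intercept
reduced <- lm(iris$Sepal.Length ~ X1 - 1)

# The reduced model has exactly one coefficient fewer than the full model
print(length(coef(full)) - length(coef(reduced)))
```

Keep in mind that dropping a single level of an interaction changes the interpretation of the remaining coefficients, so this is a mechanical illustration rather than a modelling recommendation.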

R: Logit Regression with Instrument Variable and Interaction Term

I have a severe problem with R. I did not figure out how to run a logit regression with an instrument variable.
The tricky thing is that I have 2 independent variables that work as an interaction term, but the instrument only works on one of the two independent variables. Further, I have a couple of Controls.
I tried a couple of things with the AER ivreg package, but I could not figure out what I have to type in the regression command.
I would be so grateful if somebody could help me.
I think this post is what you need:
http://www.r-bloggers.com/a-simple-instrumental-variables-problem/
The code from the post:
library(AER)
library(lmtest)
data("CollegeDistance")
cd.d<-CollegeDistance
simple.ed.1s<- lm(education ~ distance,data=cd.d)
cd.d$ed.pred<- predict(simple.ed.1s)
simple.ed.2s<- lm(wage ~ urban + gender + ethnicity + unemp + ed.pred , data=cd.d)
simple.comp<- encomptest(wage ~ urban + gender + ethnicity + unemp + ed.pred , wage ~ urban + gender + ethnicity + unemp + education , data=cd.d)
# R object names cannot start with a digit, so the result is stored as first.ftest
first.ftest<- encomptest(education ~ tuition + gender + ethnicity + urban , education ~ distance , data=cd.d)
library(arm)
coefplot(lm(wage ~ urban + gender + ethnicity + unemp + education,data=cd.d),vertical=FALSE,var.las=1,varnames=c("Education","Unemp","Hispanic","Af-am","Female","Urban","Education"))
coefplot(simple.ed.2s , vertical=FALSE,var.las=1,varnames=c("Education","Unemp","Hispanic","Af-am","Female","Urban","Education"))
