Fixed effect model - using plm and lm, different R2 values - r

I tried to fit a fixed effects model using plm and using lm + factor(countries).
I got the same coefficient estimates, but different values for R2, the residual standard error and the F-statistic. Why?
My second question: can I get the coefficients for the countries using plm (similar to lm + factor(countries))?
Here is my data set: https://www.dropbox.com/s/a8r0vl85rb1ak6e/panel_data.csv?dl=0
It contains some financial measures and GDP growth for a number of countries over several years. There are some NaNs (the panel is unbalanced).
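library(readxl)  # read_excel()
library(plm)     # pdata.frame(), plm()
proba <- read_excel("my_data.xlsx")
pdata <- pdata.frame(proba, index = c("id", "year"))
fixed <- plm(GDP_growth ~ gfdddi01 + gfdddi02 + gfdddi04 + gfdddi05, data = pdata, model = "within")
# note: this call uses the data object pdata_srednie_5_letnie, not pdata as above
fixed.dum <- lm(GDP_growth ~ gfdddi01 + gfdddi02 + gfdddi04 + gfdddi05 + factor(country) - 1, data = pdata_srednie_5_letnie)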
Thank you!
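One possible reading of the discrepancy, offered as a sketch and not as a diagnosis of the data above: the within estimator and the dummy-variable regression share the same slopes and residuals, but they measure fit against different baselines. A within fit reports R2 for the demeaned outcome, while lm with factor(...) - 1 counts the variation absorbed by the dummies as explained (and, without an intercept, uses the uncentered total sum of squares). The data frame d and the columns id, x and y below are simulated and purely illustrative; the example uses base R only and mimics the within transformation by hand instead of calling plm. For the second question, plm provides fixef(), which returns the estimated individual effects from a within model.

```r
set.seed(1)

# Hypothetical balanced panel: 5 units, 6 periods each
d <- data.frame(id = rep(1:5, each = 6), x = rnorm(30))
d$y <- 2 * d$x + rep(c(-3, -1, 0, 2, 4), each = 6) + rnorm(30)

# Dummy-variable (LSDV) regression, as in lm(... + factor(country) - 1)
lsdv <- lm(y ~ x + factor(id) - 1, data = d)

# Within transformation by hand: subtract unit means from y and x
d$y_w <- d$y - ave(d$y, d$id)
d$x_w <- d$x - ave(d$x, d$id)
within_fit <- lm(y_w ~ x_w - 1, data = d)

# Same slope in both fits
coef(lsdv)["x"]
coef(within_fit)["x_w"]

# Different R2: the LSDV value also credits the unit dummies
summary(lsdv)$r.squared
summary(within_fit)$r.squared
```

In this sketch the residual sum of squares is identical in both fits; only the total sum of squares in the denominator of R2 changes, which is why the LSDV value comes out larger.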

Related

Compare beta coefficients of the same regression

Is there a way to compare (standardized) beta coefficients within one sample and one regression without fitting two models and running an anova? Is there a simpler method, e.g. a single function?
For example, given this model, I would like to compare the beta coefficients of SE_gesamt and CE_gesamt (only these two variables):
library(lm.beta)
fit1 <- lm(Umint_gesamt ~ Alter + Geschlecht_Dummy + SE_gesamt + CE_gesamt + EmoP_gesamt + Emp_gesamt + IN_gesamt + DN_gesamt + SozID_gesamt, data=dataset)
summary(fit1)
lm.beta(fit1)
All the best,
Karen
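A minimal sketch of one way to do this within a single model, assuming a Wald-type test of the difference between two coefficients is acceptable: standardize all variables, fit once, and test b1 - b2 = 0 using the coefficient covariance matrix. The data frame dat and the variables x1, x2 and y are simulated stand-ins, not the variables from the question, and the test treats the sample standard deviations used for scaling as fixed. The car package's linearHypothesis() offers a comparable test as a single function call.

```r
set.seed(42)

# Hypothetical data: two predictors on different scales
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n, sd = 3))
dat$y <- 0.5 * dat$x1 + 0.2 * dat$x2 + rnorm(n)

# Standardize every column so the coefficients are standardized betas
z <- as.data.frame(scale(dat))
fit <- lm(y ~ x1 + x2, data = z)

b <- coef(fit)
V <- vcov(fit)

# Wald test of H0: beta_x1 - beta_x2 = 0
diff_b <- unname(b["x1"] - b["x2"])
se_diff <- sqrt(V["x1", "x1"] + V["x2", "x2"] - 2 * V["x1", "x2"])
t_stat <- diff_b / se_diff
p_val <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

c(difference = diff_b, se = se_diff, t = t_stat, p = p_val)
```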

Regression Kriging of binomial data in geoRglm R package

I am using the binom.krige() function of the R package geoRglm to obtain spatial predictions of a binary (0, 1) response variable with several continuous and discrete covariates.
Using glm() with a binomial logit link function, I found that the response variable depends significantly on several covariates.
I included the trend into binom.krige() using krige.glm.control() where I specified the two trend models as
> trend.d=trend.spatial(~ rivers + roads + annual_pre + annual_tem + elevation_ + host_densi + lulc + moist_dq + moist_in + moist_wq, data_points)
> trend.l=trend.spatial(~ rivers1 + roads + annual_pre + annual_tem + elevation_ + host_densi + lulc + moist_dq + moist_in + moist_wq, pred_grid)
The question that confuses me is this: when trend.d and trend.l go into krige.glm.control() and eventually into binom.krige(), does it actually fit a GLM with a binomial logit link, or just a linear model (the formulas above look like a linear model)?
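As background for the question above, here is a small base R illustration (it does not use geoRglm, and it makes no claim about what binom.krige() does internally): a one-sided formula such as ~ elevation + roads only describes a design matrix for the linear predictor. Whether the model is linear or logistic depends on the link applied to that linear predictor, not on the formula itself. The data frame pts and its columns are simulated and hypothetical.

```r
set.seed(7)

# Hypothetical point data with two covariates and a binary response
pts <- data.frame(elevation = runif(50, 0, 100), roads = rbinom(50, 1, 0.4))
pts$y <- rbinom(50, 1, plogis(-1 + 0.03 * pts$elevation))

# A one-sided formula yields a design matrix (intercept + covariates)
X <- model.matrix(~ elevation + roads, data = pts)

# The same right-hand side inside a logistic GLM
fit <- glm(y ~ elevation + roads, data = pts, family = binomial(link = "logit"))

# The formula is linear on the logit scale; probabilities need the inverse link
eta <- drop(X %*% coef(fit))
p <- plogis(eta)

all.equal(unname(p), unname(fitted(fit)))
```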

Exposure variable in logistic regression

I have a data frame containing characteristics of clients and contracts, and 0s and 1s showing whether a fall happened in the period between 2008 and 2017. I'm using a binomial model to regress the probability of a fall on these characteristics. I have 38000 different contracts.
So I'm using a binomial model like this (R code):
formule <- y ~ Niveau_gar_incapacite + Niv_indem_mens + Regrpt_franchise + Niveau_prime + Situation_familiale + Classe_age_chute + Grde_Region + Regrpt_strate + Taille_courtier + Commission + Retention + Anciennete + Regrpt_CSP + Regrpt_sinistres + Couplage
logit <- glm(Chute_commerciale~1, data=train, family=binomial(link="logit"))
selection_asc_AIC <- step(logit, direction="forward", trace=TRUE, k=2, scope=list(upper=formule))
After some tests for multicollinearity, I eliminated some variables and grouped some terms.
I get this result:
[image: results from GLM]
[image: results from GLM 2]
These results do not look correct given the null deviance and residual deviance.
I suspect that my exposure variable is the problem.
In fact, my contracts begin and end in different years.
So my exposure can be 5.32 or 1.36, and I have truncation and censoring.
How can I handle this exposure variable in a binomial logistic regression?
If I duplicate each row by the number of years of exposure, the observations are no longer independent.
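One commonly used way to bring unequal exposure into a binary model, sketched here under stated assumptions: if events arrive at a constant rate during an exposure time t, then P(event) = 1 - exp(-rate * t), and a complementary log-log link with offset(log(t)) models log(rate) directly. The data below are simulated, and the names x, exposure and y are hypothetical stand-ins for the variables in the question. The sketch does not address truncation or censoring.

```r
set.seed(123)

# Hypothetical contracts with unequal exposure (in years)
n <- 5000
x <- rnorm(n)
exposure <- runif(n, min = 0.5, max = 6)

# Constant event rate per year; probability of at least one event in the exposure window
rate <- exp(-2 + 0.5 * x)
y <- rbinom(n, size = 1, prob = 1 - exp(-rate * exposure))

# cloglog link + log-exposure offset: coefficients are on the log-rate scale
fit <- glm(y ~ x + offset(log(exposure)), family = binomial(link = "cloglog"))
coef(fit)
```

With this parameterization each contract stays a single row, so there is no need to duplicate rows per year of exposure.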

Longitudinal Multi-group latent growth curve model with time-variant and time-invariant predictors (lavaan)

First of all, I am relatively new to R and haven't used lavaan (or growth models) before, so please excuse my ignorance.
I am doing my thesis and analyzing the U.S. financial industry during the financial crisis of 2007. I therefore have individual banks and several variables for each bank across time (from 2007-2013), some are time-variant (such as ROA or capital adequacy) and some are time-invariant (such as size or age). Some variables are also time-variant but not multi-level since they apply to all firms (such as the average ROA of the U.S. financial industry).
First of all, can I use lavaan's growth curve model ("growth") in this instance? The example given in the tutorial is for either time-varying variables (c) that influence the outcome (DV) or time-invariant variables (x1 & x2) which influence the slope (s) and intercept (i). What about time-varying variables that influence the slope and intercept? I couldn't find an example for this syntax.
Also, how do I specify the "groups" (i.e. different banks) in my analysis? Is it actually possible to do a multi-level growth curve model in lavaan (or R for that matter)?
Last but not least, I could not find how to import a multilevel dataset into R. My dataset is basically a 3-dimensional matrix (different variables for different firms across time), so how do I input that via SPSS (or Notepad?)?
Any help is much appreciated, I am basically lost on how to implement this model and sincerely need some assistance...
Thank you all in advance for your time!
Harry
edit: Here is the syntax that I have come up with so far. Do you think it makes sense?
ETHthesismodel <- '
# intercept and slope with fixed coefficients
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
#regressions (independent variables that influence the slope & intercept)
i ~ high_constr_2007 + high_constr_2008 + ... + low_constr_2007 + low_constr_2008 + ... + ... diff_2013
s ~ high_constr_2007 + high_constr_2008 + ... + low_constr_2007 + low_constr_2008 + ... + ... diff_2013
# time-varying covariates (control variables)
t1 ~ size_2007 + cap_adeq_2007 + brand_2007 +... + acquisitions_2007
t2 ~ size_2008 + cap_adeq_2008 + brand_2008 + ... + acquisitions_2008
...
t7 ~ size_2013 + cap_adeq_2013 + brand_2013 + ... + acquisitions_2013
'
fit <- growth(ETHthesismodel, data = inputdata, group = "bank")
summary(fit)
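On the data-import part of the question, a minimal sketch assuming the data can first be stored in long format (one row per bank and year, e.g. read from a CSV file with read.csv()): the t1, t2, ... indicators in the syntax above are separate columns, which suggests a wide layout with one row per bank. Base R's reshape() can convert between the two. The data frame long and the columns bank, year, roa and size are made up for illustration.

```r
# Hypothetical long-format panel: one row per bank and year
long <- data.frame(
  bank = rep(c("A", "B", "C"), each = 3),
  year = rep(2007:2009, times = 3),
  roa  = c(1.1, 0.9, 0.7, 0.8, 0.6, 0.9, 1.3, 1.2, 1.0),  # time-varying
  size = rep(c(10, 25, 40), each = 3)                      # time-invariant
)

# Wide format: one row per bank, one roa column per year
wide <- reshape(long, idvar = "bank", timevar = "year",
                v.names = "roa", direction = "wide")

names(wide)
nrow(wide)
```

Time-invariant columns such as size are kept once per bank, while the time-varying column roa is spread into roa.2007, roa.2008 and roa.2009.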

specifying multiple random effects in R lmer (translating from HLM model)

I'm attempting to "translate" a model run in HLM7 software to R lmer syntax.
This is from the now-ubiquitous "Math achievement" dataset. The outcome is math achievement score, and in the dataset there are various student-level predictors (such as minority status, SES, and whether or not the student is female) and various school level predictors (such as Catholic vs. Public).
The only predictors in the model I want to fit are student-level predictors, which have all been group-mean centered to deal with dummy variables (aside: contrast codes are better). The students are nested in schools, so we should (I think) have random effects specified for all of the components of the model.
Here is the HLM model:
Level-1 Model
(note: all predictors at level one are group mean centered)
MATHACHij = β0j + β1j*(MINORITYij) + β2j*(FEMALEij) + β3j*(SESij) + rij
Level-2 Models
β0j = γ00 + u0j
β1j = γ10 + u1j
β2j = γ20 + u2j
β3j = γ30 + u3j
Mixed Model
MATHACHij = γ00 + γ10*MINORITYij + γ20*FEMALEij + γ30*SESij + u0j + u1j*MINORITYij + u2j*FEMALEij + u3j*SESij + rij
Translating it to lmer syntax, I try:
(note: _gmc means the variable has been group mean centered, the grouping factor is "school_id")
model1 <- lmer(mathach ~ minority_gmc + female_gmc + ses_gmc + (minority_gmc | school_id) + (female_gmc | school_id) + (ses_gmc | school_id), data = data, REML = FALSE)
When I run this model I get results that don't mesh with the HLM results. Am I specifying the random effects incorrectly?
Thanks!
When you specify your random effect structure, you can include all random effects in a single pair of parentheses. While this may not resolve the discrepancies with your HLM results, I believe the appropriate random effects syntax for your model is this:
lmer(mathach ~ minority_gmc + female_gmc + ses_gmc + (1 + minority_gmc + female_gmc + ses_gmc | school_id), data = data, REML = FALSE)
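To illustrate why the grouping of terms matters, here is a small sketch using the sleepstudy data that ships with lme4, not the math achievement data. In lme4, each (... | group) term gets its own covariance block, and a term like (x | group) includes a random intercept unless it is suppressed with 0 +. Writing the random effects as separate terms therefore changes which variances and covariances are estimated. The model names m_full and m_split are arbitrary.

```r
library(lme4)

# One term: random intercept, random slope, and their covariance
m_full <- lmer(Reaction ~ Days + (1 + Days | Subject),
               data = sleepstudy, REML = FALSE)

# Two terms: intercept and slope variances only, covariance fixed at zero
m_split <- lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject),
                data = sleepstudy, REML = FALSE)

VarCorr(m_full)
VarCorr(m_split)

# Number of covariance parameters: 3 for the single term, 2 for the split terms
length(getME(m_full, "theta"))
length(getME(m_split, "theta"))
```

By the same logic, the three separate terms in the question each carry their own random intercept for school_id and estimate no covariances among the three slopes, which differs from a single joint term.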
