The documentation for the plm function in the R package plm says that one of three effects can be chosen: individual, time, or twoways. Why does this option exist if I can just pick a model type, which already determines which effect is used? E.g., the 'within' model will only use individual, and random will always pick twoways. Moreover, a pooling model by definition has no effects (neither time nor individual), so choosing an effect in that case is meaningless. What is the purpose of this additional argument?
How do you come to this conclusion? The within model can be used with "individual", "time" or "twoways". You should see different results for your model coefficients, when choosing a different effect. Also, for example, when you use "time" or "twoways", you should be able to get the specific time effects via
summary(fixef(yourmodel,type = "level", effect="time")).
(My plm package version is 2.2-4.)
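As a minimal sketch of the point above, the following fits the same within model with each of the three effects. It uses the Grunfeld data set that ships with plm purely for illustration; the object names fe_ind, fe_time and fe_two are made up for this example.

```r
library(plm)

# Example panel shipped with plm: 10 firms observed over 20 years
data("Grunfeld", package = "plm")

# Same formula and model type, three different effect specifications
fe_ind  <- plm(inv ~ value + capital, data = Grunfeld,
               index = c("firm", "year"), model = "within", effect = "individual")
fe_time <- plm(inv ~ value + capital, data = Grunfeld,
               index = c("firm", "year"), model = "within", effect = "time")
fe_two  <- plm(inv ~ value + capital, data = Grunfeld,
               index = c("firm", "year"), model = "within", effect = "twoways")

# Side-by-side coefficients: the estimates change with the chosen effect
cbind(individual = coef(fe_ind), time = coef(fe_time), twoways = coef(fe_two))

# Time effects of the two-way model, extracted as in the answer above
summary(fixef(fe_two, type = "level", effect = "time"))
```

The coefficient table should show three different columns, which is the purpose of the effect argument: the model type selects the estimator, while effect selects which dimension(s) are swept out.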
I'm running a two-way fixed effects regression in R with "firmID" and "Year" as indexes in the model specification. I have provincial data and would like to cluster the standard errors at the provincial level. Is there a way to do this? It seems hard to cluster standard errors on any variable that is not part of the index. Using province as one of the indexes is not an option. So far the most promising way would be using vcovCR in the clubSandwich package; however, according to the documentation this will not work for a two-way fixed effects model. Something similar to this approach would be great.
x <- plm(y ~ x, data=data, model='within', index=c("firmid", "year"), effect="twoways")
then followed by
clubSandwich::vcovCR(x, type = 'CR2', cluster = data$Prov)
to get clustered standard errors at the provincial level. But as I mentioned earlier, the package documentation specifies that this will only work with individual fixed effects. An approach like this that works with a two-way fixed effects model would be amazing. Any insight is appreciated in any form.
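One possible workaround, sketched here under stated assumptions rather than as the clubSandwich method asked about: estimate the two-way fixed effects model as a dummy-variable regression with lm() and pass it to sandwich::vcovCL, which accepts any clustering variable in the data. The data below are simulated and the names d, prov, lsdv and vc_prov are hypothetical. Note that small-sample adjustments differ from CR2, so the standard errors will not match clubSandwich exactly.

```r
library(sandwich)
library(lmtest)

set.seed(1)
# Hypothetical panel: 40 firms nested in 8 provinces, observed over 6 years
d <- expand.grid(firmid = 1:40, year = 2001:2006)
d$prov <- (d$firmid - 1) %/% 5 + 1
d$x <- rnorm(nrow(d))
d$y <- 0.5 * d$x + rnorm(40)[d$firmid] + rnorm(6)[d$year - 2000] + rnorm(nrow(d))

# Two-way fixed effects written as firm and year dummies (LSDV);
# the slope on x equals the two-way within estimate
lsdv <- lm(y ~ x + factor(firmid) + factor(year), data = d)

# Cluster-robust covariance, clustered by province rather than by firm or year
vc_prov <- vcovCL(lsdv, cluster = ~ prov)

# Coefficient table for x only, using the province-clustered covariance
coeftest(lsdv, vcov. = vc_prov)["x", , drop = FALSE]
```

With many firms the dummy-variable regression becomes slow, so this is mainly practical for moderately sized panels.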
I am interested in using the new bvar package in R to predict a set of endogenous time series. However, because of the COVID pandemic, my time series have been through a structural break. What is the best way to account for this in the model? Some hypotheses:
1. Add an exogenous dummy variable (it seems the package doesn't have this feature)
2. Add an endogenous dummy variable with strong priors that zero out the coefficients of impact from other variables on it (i.e. an "artificial" exogenous variable)
3. Create two separate models (before vs after the structural break)
I have tried a mix of 2+3. I tested (i) a model with only recent data (after the structural break) and no dummies vs (ii) another with the full history and an additional endogenous (dummy) variable, but without the strong dummy prior (I couldn't understand how to configure it properly). Model (ii) performed far better on the test set.
I wrote an e-mail to the owner of the package, Nikolas Kuschnig (couldn't find his user in SO), to which he replied:
Structural breaks are always a pain to model. In general it's probably preferable to estimate two separate models, but given the short timespan and you getting usable results your idea with adding a dummy variable should also work. You can adjust priors from other variables by manually setting psi in bv_mn() (see the docs and the vignette for an explanation).
Depending on the variables you might also be fine not doing any of that, since COVID could just be seen as another shock (which is almost always quite the stretch, given the extent of it).
Note that if there is an actual structural break, the dummies won't suffice, since the coefficients would change (hence my preference for your option 3). To an extent you could model this with a Markov-switching VAR, but unfortunately I don't know of an accessible implementation for R.
Thank you, Nikolas
How does one get multiway clustered standard errors in R for plm objects, where the clustering is not at the level of the panel's time/group IDs?
The package plm provides support to calculate cluster-robust standard errors using the function plm::vcovHC. Unfortunately, this function only supports clustering at the group or time IDs of the panel. In certain cases one would want to cluster standard errors at a different level than the panel's unit of observation. An example is a regression with individual fixed effects where variables at a higher level of aggregation are used as independent variables.
A similar question was asked in 2014 and a bootstrap function was recommended. This question differs because (1) I would like to cluster on two variables, not just one and (2) I would prefer not to use the bootstrap.
The package multiwayvcov provides something very close to what I want, but unfortunately does not support plm objects.
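A hedged sketch of one route that avoids both plm objects and the bootstrap: refit the individual fixed effects model with lm() and dummy variables, then use sandwich::vcovCL, whose cluster argument accepts a formula with more than one variable. This swaps in an lm fit for the plm fit; the simulated data and the names d, region, sector, fe_lm and vc_two are hypothetical.

```r
library(sandwich)

set.seed(2)
# Hypothetical panel: 60 individuals over 5 years, each belonging to
# one of 10 regions and one of 5 sectors (neither is a panel index)
d <- expand.grid(id = 1:60, year = 2001:2005)
d$region <- (d$id - 1) %/% 6 + 1
d$sector <- (d$id - 1) %% 5 + 1
d$x <- rnorm(nrow(d))
d$y <- 1 + 0.3 * d$x + rnorm(60)[d$id] + rnorm(nrow(d))

# Individual fixed effects as dummies; slope on x matches the within estimate
fe_lm <- lm(y ~ x + factor(id), data = d)

# Two-way clustered covariance on region and sector
vc_two <- vcovCL(fe_lm, cluster = ~ region + sector)

# Estimated variance of the slope on x (take the square root for the standard error)
vc_two["x", "x"]
```

Multiway clustered covariance matrices are not guaranteed to be positive semi-definite, so it is worth checking the diagonal before taking square roots.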
I have a data frame ('math') like this (there are three different methods, although only one is shown):
[image: dataframe]
I am trying to create a multi-level growth model for MathScore, where VerbalScore is an independent, time invariant, random effect.
I believe the R code should be similar to this -
random <- plm(MathScore ~ VerbalScore + Method, data=math, index=c("id","Semester"),
model="random")
However, running this code results in the following error:
Error in plm.fit(object, data, model = "within", effect = effect) :
empty model
I believe it's an issue with the index, as the code will run if I use:
random <- plm(MathScore ~ VerbalScore + Method + Semester, data=math, index="id",
model="random")
I would be grateful for any advice on how to create a multi-level, random effect model as described.
This is likely a problem with your data:
It seems that the variables VerbalScore and Method do not vary within individuals. Thus, for the Swamy-Arora RE model (the default), the required within variance cannot be computed. The affected variables drop out of the model, which here means all RHS variables, and you get the (not very specific) error message "empty model".
You can check variation per individual with the command pvar().
If my assumption is true and you still want to estimate a random effects model, you will have to use a different random effects estimator that does not rely on the within variance, e.g. try the Wallace-Hussain estimator (random.method="walhus").
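The suggestion above can be sketched as follows. The data are simulated (the original data frame is not available), with VerbalScore constant within each id; math_sim, swar_msg and re_walhus are names made up for this illustration, and the Method variable is left out to keep the example short.

```r
library(plm)

set.seed(42)
# Hypothetical data: 30 students over 4 semesters; VerbalScore never changes within a student
math_sim <- expand.grid(id = 1:30, Semester = 1:4)
verbal  <- rnorm(30, mean = 50, sd = 10)
ability <- rnorm(30, sd = 5)
math_sim$VerbalScore <- verbal[math_sim$id]
math_sim$MathScore <- 20 + 0.6 * math_sim$VerbalScore +
  ability[math_sim$id] + rnorm(nrow(math_sim), sd = 3)

# Report which variables lack variation over time or across individuals
pvar(math_sim, index = c("id", "Semester"))

# Default Swamy-Arora estimator: capture the error message instead of stopping
swar_msg <- tryCatch({
  plm(MathScore ~ VerbalScore, data = math_sim,
      index = c("id", "Semester"), model = "random")
  "no error"
}, error = function(e) conditionMessage(e))
print(swar_msg)

# Wallace-Hussain estimator, which does not need the within variance
re_walhus <- plm(MathScore ~ VerbalScore, data = math_sim,
                 index = c("id", "Semester"), model = "random",
                 random.method = "walhus")
summary(re_walhus)
```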
I want to do a structural equation model for my analysis.
All three concepts are formative, not reflective, which means the measurement variables construct the latent structure.
I also want to control for demographic variables.
The analysis will be done in R, and I learnt lavaan in statistics class. I am just curious whether anyone can give more information on how to do what I want.
Thanks ahead.
In order to use formative constructs you need to use partial least squares SEM (PLS-SEM) and not covariance-based SEM (CB-SEM).
The underlying assumptions of the two techniques differ: PLS assumes that, because the constructs are formative, there is no error in measurement, while CB-SEM assumes reflective measurement by default and thus takes the error terms into account when analyzing the model.
The lavaan package is a CB-SEM one; you can try plspm for formative items.
Here is a guide on how to use the plspm package:
http://gastonsanchez.com/PLS_Path_Modeling_with_R.pdf