PLM package in R

This may be a trivial question, but I'm new to R and none of the tutorials I've seen address it. When using the plm package in R for my panel data, do I include the cross-sectional unit (the individuals' variable) in my regression formula? The tutorials I've seen don't speak to this directly, but they seem to leave that variable out. In practice, however, my results are far more realistic when it is left in.

The package assumes that the individual and time indexes are in the first two columns. If they are not, use the index argument.
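As a minimal sketch of that point, assuming the Grunfeld data that ships with plm as a stand-in for your panel (the column names firm, year, inv, value and capital belong to that dataset, not to yours): the individual and time variables go into index, not into the formula. The comparison with lm() is there only to illustrate that adding the unit as a factor to an ordinary regression reproduces the fixed-effects slopes, which may be why results change when the variable is "left in".

```r
library(plm)

data("Grunfeld", package = "plm")

# Move the index columns to the end, so they are no longer the first two
# columns and the index argument is actually needed.
g <- Grunfeld[, c("inv", "value", "capital", "firm", "year")]

# Fixed-effects ("within") model: firm and year identify the panel
# structure through index; they do not appear in the formula.
fit <- plm(inv ~ value + capital, data = g,
           index = c("firm", "year"), model = "within")

# Ordinary least squares with one dummy per firm (LSDV), for comparison.
lsdv <- lm(inv ~ value + capital + factor(firm), data = g)

coef(fit)
coef(lsdv)[c("value", "capital")]
```

If the unit identifier is entered in the formula as a plain numeric variable instead of a factor, it is treated as an ordinary regressor, which is usually not what is intended.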
Reference: plm paper (section 4.1: Data structure)

Related

Specification of a mixed model using glmmLasso package

I have a dataset containing repeated measures and quite a lot of variables per observation. Therefore, I need to find a way to select explanatory variables in a smart way. Regularized Regression methods sound good to me to address this problem.
Upon looking for a solution, I found out about the glmmLasso package quite recently. However, I have difficulties defining a model. I found a demo file online, but since I'm a beginner with R, I had a hard time understanding it.
(demo: https://rdrr.io/cran/glmmLasso/src/demo/glmmLasso-soccer.r)
Since I cannot share the original data, I would suggest you use the soccer dataset (the same dataset used in glmmLasso demo file). The variable team is repeated in observations and should be taken as a random effect.
# sample data
library(glmmLasso)
data("soccer")
I would appreciate it if you could explain the parameters lambda and family, and how to tune them.
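A rough sketch of how the two parameters are typically used, loosely following the linked demo and not a definitive recipe: family describes the response distribution and link as in glm() (here Poisson, on the assumption that points is treated as a count), and lambda is the strength of the lasso penalty on the fixed effects, with larger values shrinking more coefficients to zero. The lambda grid and the choice of predictors below are illustrative assumptions; lambda is picked by the smallest BIC.

```r
library(glmmLasso)
data("soccer")

# team is the grouping factor for the random intercept
soccer$team <- as.factor(soccer$team)

# Standardise the predictors so the penalty treats them on a common scale
preds <- c("transfer.spendings", "ave.unfair.score", "ball.possession",
           "tackles", "ave.attend", "sold.out")
soccer[preds] <- scale(soccer[preds])

form <- points ~ transfer.spendings + ave.unfair.score + ball.possession +
  tackles + ave.attend + sold.out

# Hypothetical grid of penalty values, from strong to no penalisation
lambdas <- seq(500, 0, by = -50)
bic <- rep(Inf, length(lambdas))

for (j in seq_along(lambdas)) {
  fit <- try(glmmLasso(form, rnd = list(team = ~1),
                       family = poisson(link = log),
                       data = soccer, lambda = lambdas[j]),
             silent = TRUE)
  # Keep the BIC only when the fit succeeded
  if (!inherits(fit, "try-error")) bic[j] <- fit$bic
}

# Refit at the lambda with the smallest BIC
best_lambda <- lambdas[which.min(bic)]
final <- glmmLasso(form, rnd = list(team = ~1),
                   family = poisson(link = log),
                   data = soccer, lambda = best_lambda)
summary(final)
```

Cross-validation over the same grid is an alternative to BIC when prediction accuracy matters more than a parsimonious model.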

Alternative example for capscale function in vegan package

I have been learning multivariate analyses in PRIMER, but now want to switch to R using the vegan package. I wish to use the capscale() function in vegan, but am not sure how my data should be formatted beforehand.
In the example in the vignette http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/capscale.html, both dataframes (varespec and varechem) have numeric values only, yet I have one dataframe of dependent (numeric) values, and another of independent (factor) values. So what I am asking for is an alternative worked example that I might be able to emulate. I can't find anything online.
The iris data set should provide sufficient toy data. Thank you
The vignette source you use is badly outdated: I have to remove it from the Web. The help page for ?capscale should contain more up-to-date documentation in your current vegan installation. For independent data with factors, you should be able to follow the model of any other constrained ordination help page (?rda), which tells you that with the formula interface you can have factors in the independent data -- and the formula interface is the only allowed interface in capscale.
You should switch from capscale to dbrda in vegan: capscale may be deprecated in the future.
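A minimal sketch with the iris data suggested in the question, assuming the four numeric measurement columns play the role of the dependent data and Species is the single factor constraint. The choice of Bray-Curtis distance is only an illustration.

```r
library(vegan)

data(iris)

# Dependent (numeric) data on the left-hand side, factor on the right.
# Factors in the constraining data are handled by the formula interface.
mod <- dbrda(iris[, 1:4] ~ Species, data = iris, distance = "bray")
mod

# The same model specification with capscale
mod_cap <- capscale(iris[, 1:4] ~ Species, data = iris, distance = "bray")

# Permutation test for the constraint
anova(mod)
```

With two data frames, as in the question, the numeric one goes on the left of the formula and the one holding the factors is passed as data.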

Determine number of factors in EFA (R) using Comparison Data

I am looking for ways to determine the optimal number of factors for R's factanal function. The most commonly used method (conduct a PCA and use a scree plot to determine the number of factors) is already known to me. I have found a method described here to be easier for non-technical folks like me. Unfortunately, the R script in which the method was implemented is no longer accessible. I was wondering if there is an R package that does the same?
The method was originally proposed in this study: Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure.
The R code is now moved here as per the author.
EFA.dimensions is also a nice and easy-to-use package for that.

How to use the `vcconv` command in lme4 for serial correlation?

I'm working with a large longitudinal dataset of firm-year observations. For some time now I have been using lme4 to implement crossed (non-nested) effects for year and firm-ID groups.
My goal is now to correct for the serial correlation in the firm-group dimension. Based on chl's and fabians' answers to this question (as well as Ben Bolker's comment on the latter), I've assumed this is impossible with lmer(), but is feasible with nlme::lme().
I have been able to implement crossed effects in nlme based on the discussion in Pinheiro & Bates (2000, sec. 4.2.2, pp. 163-6). In principle then, I believe I can use the correlation = corAR1() specification in lme() to control for autocorrelation.
My strong preference, however, would be to implement such a correlation specification in lmer() because:
1. lme4 is much, much (much) faster.
2. nlme requires crossed effects to be nested in some higher group -- without such a higher level grouping I'm forced to create an arbitrary dummy for groupedData to which all observations belong (e.g., here). This creates issues interpreting the relative levels of variation between the two crossed groups and the residual variance because some of the variation appears to be captured by the higher-level dummy group.
I got excited when I found the feature request #224 on GitHub, but alas it doesn't seem like there's much movement on the flexLambda front (please let me know if I'm wrong!).
lme4 v1.1-10
I've just noticed that the latest (Oct. 2015) version of lme4 contains a vcconv command that can
Convert between representation of (co-)variance structures (EXPERIMENTAL.)
Based on the source code, it seems that maybe the sdcor2cov option could allow one to specify a correlation structure such as AR(1).
So my questions are:
Is this interpretation of the vcconv function correct?
If so, does the user supply the correlation (e.g., AR(1)) parameters or are they determined internally in lmer()?
How does one implement this function properly?
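For reference, the nlme route mentioned above can be sketched as follows. This is only an illustration of the correlation argument on the Orthodont data that ships with nlme, with Subject standing in for the firm ID; it does not include the crossed year effect and says nothing about vcconv.

```r
library(nlme)

data(Orthodont, package = "nlme")

# Random intercept per subject, with AR(1) errors within each subject.
# The AR(1) parameter is estimated from the data, not supplied by the user.
fit <- lme(distance ~ age,
           random = ~ 1 | Subject,
           correlation = corAR1(form = ~ 1 | Subject),
           data = Orthodont)

# Estimated autocorrelation parameter (phi) on its natural scale
phi <- coef(fit$modelStruct$corStruct, unconstrained = FALSE)
phi
```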

Panel data with binary dependent variable in R

Is it possible to do regressions in R using a panel data set with a binary dependent variable? I am familiar with using glm for logit and probit and plm for panel data, but am not sure how to combine the two. Are there any existing code examples?
EDIT
It would also be helpful if I could figure out how to extract the matrix that plm() is using when it does a regression. For instance, you could use plm to do fixed effects, or you could create a matrix with the appropriate dummy variables and then run that through glm(). In a case like this, however, it is annoying to generate the dummies yourself and it would be easier to have plm do it for you.
The package "pglm" might be what you need.
http://cran.r-project.org/web/packages/pglm/pglm.pdf
This package offers some functions of glm-like models for panel data.
Maybe the package lme4 is what you are looking for.
It seems possible to run generalized regressions with unit-level effects using the command glmer, which fits generalized linear mixed-effects models.
But you should be aware that panel data with a binary dependent variable is different from the usual linear models.
This site may be helpful.
Best regards,
Manoel
model.frame(plmmodel)
will give you the data frame that is actually used by plm for fitting the model (i.e. after list-wise deletion if you have NAs, etc.)
I don't think that plm has implemented functions to estimate models with binary outcomes, but I may be wrong. Check out the reference manual at: http://cran.r-project.org/web/packages/plm/index.html
If I'm right, this would suggest that you can't "combine the two" without considerable work in extending the functions provided by plm.
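The dummy-variable route described in the question can be sketched in base R without building the dummies by hand: wrapping the unit identifier in factor() makes glm() generate them. The data below are simulated purely for illustration, and all variable names are hypothetical. Note that with few time periods per unit, dummy-variable logit estimates are known to be biased (the incidental parameters problem), so treat this as a sketch and not a recommendation.

```r
set.seed(1)

# Simulated panel: 30 units observed over 20 periods (illustrative only)
n_id <- 30
n_t <- 20
d <- data.frame(id = rep(seq_len(n_id), each = n_t),
                t  = rep(seq_len(n_t), times = n_id))

alpha <- rnorm(n_id, sd = 0.5)   # unit-specific intercepts
d$x <- rnorm(nrow(d))
d$y <- rbinom(nrow(d), size = 1, prob = plogis(alpha[d$id] + 0.8 * d$x))

# Logit with one dummy per unit, created automatically by factor(id)
fit <- glm(y ~ x + factor(id), family = binomial(link = "logit"), data = d)

# The design matrix actually used in the fit, including the dummies
X <- model.matrix(fit)
dim(X)

coef(fit)["x"]
```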
