Fixed-effects instrumental variable regression like xtivreg in Stata (FE IV regression) in R

Does anyone know of an R package that supports fixed-effects instrumental variable regression like xtivreg in Stata (FE IV regression)? Yes, I can just include dummy variables, but that becomes impossible when the number of groups increases.
Thanks!

"I can just include dummy variables but that just gets impossible when the number of groups increases"
By "impossible," do you mean "computationally impossible"? If so, check out the plm package, which was designed to handle cases that would otherwise be computationally infeasible, and which permits fixed-effects IV.
Start with the plm vignette. It will quickly make clear whether plm is what you're looking for.
Update 2018 December 03: the estimatr package will also do what you want. It's faster and easier to use than the plm package.
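To illustrate why dummies are not needed, here is a minimal base R sketch of the within (demeaning) transformation followed by a just-identified IV estimate. It is not how plm or estimatr are implemented; the simulated data, the variable names, and the helper demean() are all made up for illustration, and it computes a point estimate only (no standard errors).

```r
set.seed(123)
n_groups <- 200
t_per <- 5
id <- rep(seq_len(n_groups), each = t_per)
n <- length(id)

# Simulated panel (hypothetical data): true slope on x is 1.5
alpha <- rnorm(n_groups)[id]          # group fixed effect
u <- rnorm(n)                         # unobserved confounder
z <- rnorm(n) + alpha                 # instrument: related to alpha, unrelated to u
x <- 0.8 * z + u + alpha + rnorm(n)   # endogenous regressor
y <- 1.5 * x + alpha - u + rnorm(n)

# Within transformation: subtract group means instead of adding dummies
demean <- function(v, g) v - ave(v, g)
y_w <- demean(y, id)
x_w <- demean(x, id)
z_w <- demean(z, id)

# Just-identified IV estimator on the demeaned data: (z'x)^{-1} z'y
beta_iv <- sum(z_w * y_w) / sum(z_w * x_w)

# Fixed-effects OLS for comparison (biased here because x is endogenous)
beta_ols <- unname(coef(lm(y_w ~ 0 + x_w)))

c(iv = beta_iv, ols = beta_ols)
```

Because only group means are computed, the cost grows with the number of observations rather than with the number of groups.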

As you may know, for many fixed effects and random effects models (I mean FE and RE in the econometrics and education sense, since the definitions in statistics are different), you can create an equivalent SEM (structural equation modeling) model. Two R packages can be used for that purpose: 1) sem 2) lavaan
Another solution is to use SAS. In SAS, PROC GLM supports the ABSORB statement, which automatically takes care of the dummies by computing (x - xbar) within each group for every observation.
Hope it helps.

Try the ivreg command from the AER package.

Related

How to conduct unconditional quantile regression in R?

The only package I know that does unconditional quantile regression in R is uqr. Unfortunately, it's been removed from CRAN. Even though I can still use it, its functionality is limited (e.g., it does not conduct significance tests or allow comparing effects across quantiles). I'm wondering if anyone knows how to conduct UQR in R, with either functions they wrote or some other means.
There are many limitations in terms of tests and asymptotic theory regarding unconditional quantile regressions, especially if you are thinking of the version proposed in Firpo, Fortin, and Lemieux (2009), "Unconditional quantile regressions".
The application, however, is straightforward. You need only two elements:
1. The unconditional quantile (estimated with any of your favorite packages).
2. The density of the outcome at the quantile you got in (1).
After that, you apply the RIF function:
$$RIF(y; q(t)) = q(t) + \frac{t - \mathbf{1}(y \le q(t))}{f(q(t))}$$
Once you have this, you just use it in place of your dependent variable when you call lm(). And that is it.
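The recipe above can be sketched in base R as follows. This is only an illustration under assumptions: the data are simulated, tau plays the role of t in the formula, and the density is taken from the default kernel estimate of density(), which is one choice among many. The lm() standard errors ignore that the quantile and the density are themselves estimated, so treat them with caution (bootstrapping is a common workaround).

```r
set.seed(1)
n <- 500
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 2 * dat$x + rnorm(n)   # hypothetical outcome

tau <- 0.5                          # quantile level (t in the formula)

# 1. Unconditional quantile of the outcome
q_tau <- quantile(dat$y, probs = tau, names = FALSE)

# 2. Kernel density of the outcome, evaluated at that quantile
dens <- density(dat$y)
f_q <- approx(dens$x, dens$y, xout = q_tau)$y

# RIF(y; q) = q + (tau - 1(y <= q)) / f(q)
dat$rif <- q_tau + (tau - as.numeric(dat$y <= q_tau)) / f_q

# Regress the RIF on the covariates instead of y
fit <- lm(rif ~ x, data = dat)
coef(fit)
```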
HTH

R alternatives to JAGS/BUGS

Is there an R package I could use for Bayesian parameter estimation as an alternative to JAGS? I found an old question regarding JAGS/BUGS alternatives in R; however, the last post is already 9 years old. So maybe there are new and flexible Gibbs sampling packages available in R? I want to use it to get parameter estimates for novel hierarchical hidden Markov models with random effects, covariates, etc. I highly value the flexibility of JAGS and think that JAGS is simply great; however, I want to write R functions that facilitate model specification and am looking for a package that I can use for parameter estimation.
There are some alternatives:
Stan, with the rstan R package. Stan looks well optimized but cannot directly sample discrete parameters, so certain types of models (like binomial/Poisson mixture models) have to be rewritten with the discrete variables marginalized out.
nimble
If you want highly optimized sampling based on C++, you may want to check Rcpp-based solutions from Dirk Eddelbuettel.

Random Effects with count Models

I'm trying to fit a hurdle model with random effects in either R or Stata. I've looked at the glmmADMB package, but I am running into problems getting it to download in R, and I can't find any documentation on the package on CRAN. Is this package still available? Has anyone used it successfully to estimate a hurdle model with random effects?
Alternatively, is there a way to estimate this in Stata? Is there a way to estimate random effects with any type of count data in Stata?
Any advice would be greatly appreciated.
Jennifer
In Stata, xtnbreg and xtpoisson have the random effects estimator as the default option. You can always estimate the two parts separately by hand. See the count-data chapter of Cameron and Trivedi's Stata book for cross-sectional examples.
You also have the user-written hplogit and hnblogit for hurdle count models. These use a logit/probit for the first stage and a zero-truncated Poisson/negative binomial for the second stage. Also, a finite mixture model might be a nice approach (see the user-written fmm). There's also ztpnm. All these are cross-sectional models.
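For readers working in R rather than Stata, the "two parts by hand" idea can be sketched in base R: a logit for zero versus positive, then a zero-truncated Poisson fitted by maximum likelihood on the positive counts. This is a cross-sectional sketch with no random effects; the simulated data and the function negll() are assumptions made for illustration only.

```r
set.seed(42)
n <- 1000
x <- rnorm(n)

# Simulated hurdle data (hypothetical): a logit decides zero vs. positive,
# and positive counts follow a zero-truncated Poisson
is_pos <- rbinom(n, 1, plogis(0.5 + 1.0 * x))
lambda <- exp(0.3 + 0.5 * x)
u <- runif(n, min = dpois(0, lambda), max = 1)  # inverse-CDF draw above P(Y = 0)
y <- ifelse(is_pos == 1, qpois(u, lambda), 0)

# Part 1: logit for crossing the hurdle
part1 <- glm(as.integer(y > 0) ~ x, family = binomial)

# Part 2: zero-truncated Poisson on the positive counts only
pos <- y > 0
X <- cbind(1, x[pos])
yp <- y[pos]
negll <- function(b) {
  eta <- drop(X %*% b)
  lam <- exp(eta)
  # log pmf: y*eta - lam - log(y!) - log(1 - exp(-lam))
  -sum(yp * eta - lam - lgamma(yp + 1) - log1p(-exp(-lam)))
}
part2 <- optim(c(0, 0), negll, method = "BFGS")

coef(part1)
part2$par
```

The two parts share no parameters, which is why they can be estimated separately.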

Incorporating observation weights in the randomForest package

How can I use the R randomForest package with observation weights? I know that there is no such option in this package. I have 2 questions:
Are there any solutions to this problem using randomForest package? At this moment I'm drawing samples from data with weights as the probability so I can at least simulate it:
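m <- nrow(data)
# sample row indices (sample(data, ...) would sample columns of a data frame)
data_resampled <- data[sample(m, m, replace = TRUE, prob = weights), ]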
It works, but are there other (better) solutions?
Are there any alternatives to the randomForest package? I found the party package (cforest), but it's terrible in terms of memory management (or I cannot use it the way I use the randomForest package). I have around 200k observations and 30-40 variables.
EDIT:
Sorry for not clarifying the details. I'm using the randomForest package for a regression problem (not classification). It is a time series and every observation has its weight. Later on, this weight is used to determine the model performance across test observations. The y variable is continuous.
I was looking for the same random forest option as you, Pawel, and I found that the R package "ranger" supports it in the function "ranger" (through the parameter "case.weights").
The package was released in June 2016, so it is very young.
Best,
randomForest does have a "classwt" parameter that should allow you to account for differential sampling probabilities or even for differential costs. Admittedly, it is ignored with regression. Perhaps you should explain why you need to use weighting and what sort of y variable you are using.

Panel data with binary dependent variable in R

Is it possible to do regressions in R using a panel data set with a binary dependent variable? I am familiar with using glm for logit and probit and plm for panel data, but am not sure how to combine the two. Are there any existing code examples?
EDIT
It would also be helpful if I could figure out how to extract the matrix that plm() is using when it does a regression. For instance, you could use plm to do fixed effects, or you could create a matrix with the appropriate dummy variables and then run that through glm(). In a case like this, however, it is annoying to generate the dummies yourself and it would be easier to have plm do it for you.
The package "pglm" might be what you need.
http://cran.r-project.org/web/packages/pglm/pglm.pdf
This package offers functions for fitting glm-like models to panel data.
Maybe the package lme4 is what you are looking for.
It seems to be possible to fit generalized linear mixed models (with group-level random effects) using the command glmer.
But you should be aware that panel data with a binary dependent variable behaves differently from the usual linear models.
This site may be helpful.
Best regards,
Manoel
model.frame(plmmodel)
will give you the data frame that is actually used by plm for fitting the model (i.e. after list-wise deletion if you have NAs, etc.)
I don't think that plm has implemented functions to estimate models with binary outcomes, but I may be wrong. Check out the reference manual at: http://cran.r-project.org/web/packages/plm/index.html
If I'm right, this would suggest that you can't "combine the two" without considerable work in extending the functions provided by plm.
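Independently of plm, the dummy-variable route mentioned in the question does not require building the dummies by hand: declaring the group identifier as a factor lets glm() expand it, and model.matrix() returns the design matrix that was used. The sketch below uses simulated data with made-up names. Keep in mind that with many short groups, dummy-variable logit estimates can be biased (the incidental parameters problem), so this is a convenience illustration rather than a recommended estimator.

```r
set.seed(7)
n_id <- 30
t_per <- 20
id <- rep(seq_len(n_id), each = t_per)

# Simulated binary panel (hypothetical): group intercepts plus one regressor
x <- rnorm(n_id * t_per)
a <- rnorm(n_id, sd = 0.5)[id]
y <- rbinom(length(id), 1, plogis(a + 1 * x))
panel <- data.frame(id = factor(id), x = x, y = y)

# factor(id) is expanded into dummies automatically by the formula interface
fit <- glm(y ~ x + id, family = binomial(link = "logit"), data = panel)

# Design matrix actually used: intercept, x, and n_id - 1 group dummies
X <- model.matrix(fit)
dim(X)
coef(fit)[["x"]]
```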
