I have a very basic question regarding these two functions in R.
When I tried to do panel data analysis using the generalised method of moments (GMM), I realised that both gmm and pgmm are functions for this method. What is the difference between them? Should I use pgmm instead of gmm for panel data (I would like to do difference GMM estimation)?
Thank you in advance!
If you are doing panel analysis, you should use pgmm. It is part of the plm package for R, which is a comprehensive package for panel data econometrics.
I assume the gmm function you refer to is from the package of the same name. It is developed for cross-section GMM estimation, not panel data.
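For difference GMM specifically, pgmm's transformation = "d" option (the default) performs Arellano-Bond style first differencing. A minimal sketch, closely following the EmplUK example from the plm documentation (the formula and options are illustrative, not tailored to your data):

library(plm)
data("EmplUK", package = "plm")

# Two-step difference GMM: lagged levels of log(emp) from t-2 back serve as instruments
ab <- pgmm(log(emp) ~ lag(log(emp), 1:2) + lag(log(wage), 0:1) +
             log(capital) + lag(log(output), 0:1) | lag(log(emp), 2:99),
           data = EmplUK, effect = "twoways", model = "twosteps",
           transformation = "d")
summary(ab)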
I am a newbie in R and I need to know how to plot a tree selected from a random forest model created with the train() function in the caret package.
First and foremost, I used a training dataset to fit a random forest model with the train() function. The resulting random forest contains about 500 trees. Is there any way to plot a selected tree?
Thank you.
The CRAN package party offers a method called prettyTree.
Look here
As far as I know, the randomForest package does not have any built-in functionality to plot individual trees. You can extract a tree with the getTree() function, but nothing is provided to plot or visualise it. This question may be a duplicate: a quick search turns up approaches other people have used to extract trees from a random forest, found
here and here and here
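For the extraction step mentioned above, a minimal sketch with randomForest::getTree(); `fit` is assumed to be the object returned by caret::train() with method = "rf", whose underlying model lives in fit$finalModel:

library(randomForest)

rf <- fit$finalModel                              # the randomForest object inside the caret fit
tree_25 <- getTree(rf, k = 25, labelVar = TRUE)   # 25th tree, returned as a data frame
head(tree_25)                                     # split variables, split points, predictions

This only extracts the tree; plotting it still requires one of the approaches linked above.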
I want to build a bagged logistic regression model in R. My dataset is heavily imbalanced, with only 0.007% positive occurrences.
My idea for dealing with this was bagged logistic regression. I came across the hybridEnsemble package in R. Does anyone have an example of how this package can be used? I searched online but unfortunately did not find any examples.
Any help will be appreciated.
The way I would try to solve this is to use the h2o.stackedEnsemble() function in the h2o R package. You can automatically create more balanced classifiers by setting balance_classes = TRUE in each of the base learners. More information about how to use this function to create ensembles is in the Stacked Ensemble section of the H2O docs.
Also, using H2O will be much faster than anything written in native R.
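A rough sketch of that approach, assuming a running H2O cluster and a data frame df with a binary target column "y" (both names are placeholders):

library(h2o)
h2o.init()

train <- as.h2o(df)
y <- "y"
x <- setdiff(names(train), y)
train[[y]] <- as.factor(train[[y]])        # make sure H2O treats this as classification

# Base learners: same folds and kept CV predictions so they can be stacked,
# with balance_classes = TRUE to handle the class imbalance
glm_fit <- h2o.glm(x = x, y = y, training_frame = train, family = "binomial",
                   balance_classes = TRUE, nfolds = 5, fold_assignment = "Modulo",
                   keep_cross_validation_predictions = TRUE)
gbm_fit <- h2o.gbm(x = x, y = y, training_frame = train,
                   balance_classes = TRUE, nfolds = 5, fold_assignment = "Modulo",
                   keep_cross_validation_predictions = TRUE)

ens <- h2o.stackedEnsemble(x = x, y = y, training_frame = train,
                           base_models = list(glm_fit, gbm_fit))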
I am trying to fit a transfer function model in R so that I can apply the fitted model to a validation data set, because SPSS doesn't allow me to (or I don't know how to) compute point forecasts the way the Arima() function from the forecast package does. SPSS does let me apply the model, but it does not use the dependent variable's lagged values, which is why I am trying R.
Does anyone know how I could get that type of "updated" or validation forecast using the arimax() function? I am not looking for the following type of prediction:
predict(vixari011, n.ahead=12)
But rather these:
Arima(test$VIX, model = vixari)
From what I have been reading there is no prediction function for arimax(), so does anyone have ideas about how I could forecast in order to evaluate point-by-point performance? All I can think of is computing it manually in a spreadsheet...
I had the same problem. I know this post is old but this can help someone.
I used this and it worked just fine:
forecast(fitted(arimax_ts_model), h=11)
This is a follow-up to a previous question I asked a while back that was recently answered.
I have built several gbm models with dismo::gbm.step, which relies on the gbm fitting functions found in R package gbm, as well as cross validation tools from R package splines.
As part of my analysis, I would like to use some of the graphical tools available in R (e.g. perspective plots) to visualise pairwise interactions in the data. Both the gbm and the dismo packages have functions for detecting and modelling interactions in the data.
The implementation in dismo is explained in Elith et al. (2008) and returns a statistic which indicates departures of the model predictions from a linear combination of the predictors, while holding all other predictors at their means.
The implementation in gbm uses Friedman's H statistic (Friedman & Popescu, 2005), returns a different metric, and does NOT set the other variables at their means.
The interactions modelled and plotted with dismo::gbm.interactions are great and have been very informative. However, I would also like to use gbm::interact.gbm, partly for publication strength and also to compare the results from the two methods.
If I try to run gbm::interact.gbm on a gbm.object created with dismo, an error is returned:
"Error in is.factor(data[, x$var.names[j]]) :
argument "data" is missing, with no default"
I understand dismo::gbm.step adds extra data the authors thought would be useful to the gbm model.
I also understand that the answer to my question lies somewhere in the source code.
My question is...
Is it possible to modify a gbm object created in dismo so it can be used with gbm::interact.gbm? If so, would this be accomplished by...
a. Modifying the gbm object created in dismo::gbm.step?
b. Modifying the source code for gbm::interact.gbm?
c. Doing something else?
I will be going through the source code trying to solve this myself; if I come up with a solution before anyone answers, I will answer my own question.
The gbm::interact.gbm function requires data as an argument: interact.gbm <- function(x, data, i.var = 1, n.trees = x$n.trees).
The dismo gbm.object is essentially the same as the gbm gbm.object, but with extra information attached so I don't imagine changing the gbm.object would help.
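Since the dismo-created object does not carry the data argument that interact.gbm needs, supplying the original training data explicitly may be enough; a hedged sketch, with object and column names that are purely hypothetical:

library(gbm)
library(dismo)

# my_gbm was fitted with dismo::gbm.step on the data frame my_data
h_stat <- interact.gbm(x = my_gbm,
                       data = my_data,                       # pass the data dismo did not store
                       i.var = c("predictor1", "predictor2"),
                       n.trees = my_gbm$n.trees)             # or the tree count gbm.step selected
h_stat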
Is it possible to do regressions in R using a panel data set with a binary dependent variable? I am familiar with using glm for logit and probit and plm for panel data, but am not sure how to combine the two. Are there any existing code examples?
EDIT
It would also be helpful if I could figure out how to extract the matrix that plm() uses when it does a regression. For instance, you could use plm to do fixed effects, or you could create a matrix with the appropriate dummy variables and then run that through glm(). In a case like this, however, it is annoying to generate the dummies yourself, and it would be easier to have plm do it for you.
The package "pglm" might be what you need.
http://cran.r-project.org/web/packages/pglm/pglm.pdf
This package offers glm-like models for panel data.
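A minimal sketch of a random-effects probit with pglm; the data frame, index columns and variable names below are placeholders:

library(pglm)

fit <- pglm(y ~ x1 + x2,
            data   = panel_df,
            index  = c("id", "year"),
            family = binomial("probit"),   # binary outcome on panel data
            model  = "random")             # random-effects estimator
summary(fit)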
Maybe the package lme4 is what you are looking for.
It is possible to fit generalised linear mixed models (for example, with random effects for the panel units) using the glmer command.
But you should be aware that panel data with a binary dependent variable behave differently from the usual linear models.
This site may be helpful.
model.frame(plmmodel)
will give you the data frame that is actually used by plm for fitting the model (i.e. after list-wise deletion if you have NAs, etc.)
I don't think that plm has implemented functions to estimate models with binary outcomes, but I may be wrong. Check out the reference manual at: http://cran.r-project.org/web/packages/plm/index.html
If I'm right, this suggests that you can't "combine the two" without considerable work extending the functions provided by plm.
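If you do end up rolling it yourself, one rough way to combine the two, along the lines of the question's EDIT, is to pull the rows plm actually used and fit a dummy-variable logit with glm(); a sketch with hypothetical variable names:

library(plm)

# pm: a plm fit on panel_df with index = c("id", "year")
pm <- plm(y ~ x1 + x2, data = panel_df, index = c("id", "year"), model = "within")

mf    <- model.frame(pm)           # rows kept after listwise deletion
mf$id <- index(pm)[[1]]            # recover the individual index for the dummies

logit_fe <- glm(y ~ x1 + x2 + factor(id), data = mf, family = binomial())
summary(logit_fe)

Note that a logit with many unit dummies suffers from the incidental parameters problem, so treat this as a quick approximation rather than a proper fixed-effects logit.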