Including an offset variable (or constraining a coefficient to 1) using mlogit in R

I've been estimating some MNL models in R using mlogit. The package works very well, but it does not seem to allow offset variables to be included. I read the package documentation to see whether it allows constraining a coefficient during estimation. There seems to be one alternative using mlogit.optim; unfortunately, the documentation does not specify how it should be used.
So, my question is: does anyone know how to either (a) include an offset variable or (b) constrain a coefficient (coefficient A = 1) using mlogit?
Thanks in advance!
Best,
Dr. Wall

Related

How should you use scaled weights with the svydesign() function in the survey package in R?

I am using the survey package in R to analyse the "Understanding Society" social survey. The main user guide for the survey specifies (on page 45) that the weights have been scaled to have a mean of 1. When using the svydesign() function, I am passing the weight variable to the weights argument.
In the survey package documentation, under the surveysummary() function, it states:
Note that the design effect will be incorrect if the weights have been rescaled so that they are not reciprocals of sampling probabilities.
Will I therefore get incorrect estimates and/or standard errors when using functions such as svyglm() etc?
This came to my attention because, when using the psrsq() function to get the Pseudo R-Squared of a model, I received the following warning:
Weights appear to be scaled: rsquared may be wrong
Any help would be greatly appreciated! Thanks!
No, you don't need to worry
The warning is only about design-effect estimation (which most people don't need to do), and only about without-replacement design effects (DEFF rather than DEFT). If you just need estimates and standard errors, these are fine; there is no problem.
If you want to estimate the design effects, R needs to estimate the standard errors (which is fine) and also estimate what the standard errors would be under simple random sampling without replacement, with the same sample size. That second part is the problem: working out the variance under SRSWoR requires knowing the population size. If you have scaled the weights, R can no longer work out the population size.
If you do need design effects (e.g., to do a power calculation for another survey), you can still get the DEFT design effects that compare to simple random sampling with replacement. It's only if you want design effects compared to simple random sampling without replacement that you need to worry about the scaling of weights. Very few people are in that situation.
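For illustration, here is a minimal sketch (the names dat, psu, strata, wt_scaled, and income are hypothetical placeholders for your own data frame, design columns, and analysis variable) showing how to request with-replacement design effects, which remain valid with scaled weights:

library(survey)

# Design object using the scaled weights supplied with the survey
des <- svydesign(ids = ~psu, strata = ~strata, weights = ~wt_scaled, data = dat)

# deff = "replace" compares to SRS *with* replacement, so weight scaling is harmless
svymean(~income, des, deff = "replace")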
As a final note, surveysummary isn't a function; it's a help page.

binomial()$linkinv(fixef()) and binomial_pred_ci() functions: what exactly are these functions for in a generalized mixed-model analysis?

I was working on a dataset, trying to re-run an already-performed statistical analysis, when I came across the following function:
binomial()$linkinv(fixef(m))
after running the following model
summary(m <- glmer(T1.ACC ~ COND + (COND | ID), data = d9only, family = binomial))
My first question is: what exactly is this function for? Elsewhere in the script, its complement as well as a slightly modified version based on it also appear:
1) 1 - binomial()$linkinv(fixef(m))
2) d9only$fit = binomial()$linkinv(model.matrix(m) %*% fixef(m)) # the meaning of the %*% operator is also quite mysterious to me
Moreover, another function present is the following one:
binomial_pred_ci()
To be honest, I searched through the whole script and found neither a custom definition of this function nor the package it might be called from. Does anyone know where it may come from? Maybe the package 'runjags'? If so, any advice on how to install it?
Thanks for your answers
I agree with most of @Oliver's answer. I will add a few comments (since I had an answer partly composed already).
I would be very wary of the script you are following: some parts look wrong (I could obviously be mistaken since these bits are taken completely out of context ...)
binomial()$linkinv refers to the inverse link function for the model used. By default (which applies in this case since no optional link= argument has been specified), this is the inverse-logit or logistic function. A nearly equivalent function is available via plogis(), but using $linkinv could be better in some cases since it would generalize to binomial analyses done with other link functions [e.g. probit or cloglog].
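A quick sanity check in an R session shows the correspondence:

binomial()$linkinv(0.5)  # 0.6224593 = 1/(1 + exp(-0.5))
plogis(0.5)              # same value, from the base-R logistic CDF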
As @Oliver mentions, applying the inverse link function to the coefficients is at least weird; I would even say wrong. Researchers often exponentiate coefficients estimated on the logit/log-odds scale to obtain odds ratios, but applying the inverse link (usually the logistic function) to them is rarely correct.
binomial()$linkinv(model.matrix(m) %*% fixef(m)) is indeed computing the predicted estimates on the link scale and converting them back to the data (= probability) scale. You can get the same results more reliably (handling missing values, etc.) by using predict(m, type = "response", re.form = ~0) (this extends @Oliver's answer to a version that also applies the inverse-link function for you).
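As a minimal sketch of the equivalence (reusing the model from the question, so T1.ACC, COND, ID, and d9only are assumed to exist in your workspace):

library(lme4)

m <- glmer(T1.ACC ~ COND + (COND | ID), data = d9only, family = binomial)

# Manual fixed-effects-only predictions on the probability scale
eta <- model.matrix(m) %*% fixef(m)       # linear predictor, fixed effects only
p_manual <- drop(binomial()$linkinv(eta))

# The more robust built-in equivalent
p_builtin <- predict(m, type = "response", re.form = ~0)

all.equal(unname(p_manual), unname(p_builtin))  # TRUE, up to numerical tolerance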
I don't know what binomial_pred_ci is either, but I would suggest you look at predictInterval() from the merTools package ...
PS: none of these answers has much to do with runjags, which uses an entirely different model structure. Presumably glmer models are being fitted for comparison ...
help(binomial) describes the link function and inverse link function and their uses. binomial()$linkinv is the binomial inverse-link (sigmoid) function prob(y|eta) = 1 / (1 + exp(-eta)), where eta is the linear predictor. Using this with the coefficients (or fixed effects) is a bit odd, but it is not unusual as a way to get an idea of how large the effect of each coefficient is. I would not encourage it, however.
%*% is the matrix multiplication operator, while model.matrix(m) (for lme4) extracts the fixed-effect model matrix. So model.matrix(m) %*% fixef(m) is the linear predictor using only the fixed effects. It is the same as predict(m, re.form = ~ 0). This is often used when you want only the fixed-effect part of the model, either because you want to correct for between-group variation or because you are predicting new data.
binomial_pred_ci: no idea. My guess is that it is a function for computing prediction confidence intervals.

Adjusting priors in the package BayesFactor in R

I have some pilot data that I should be able to exploit to adjust the prior in a Bayes t-test on a newer dataset.
I've been performing Bayes t-tests using the default settings via the package BayesFactor in R. Can anyone shed some light on how exactly I can go about adjusting the prior for such a test?
Additionally, what do I need from the pilot data to make this happen? I suspect an effect size?
Here's an example of how to employ the Bayes t-test using the default settings:
ttestBF(x = df1$Value, y = df2$Value, paired = TRUE)
Thanks for your time.
For reference, see Rouder et al. (2009). The BayesFactor package in R uses a JZS prior. See the explanation in the documentation of the ttestBF function:
A noninformative Jeffreys prior is placed on the variance of the normal population, while a Cauchy prior is placed on the standardized effect size. The rscale argument controls the scale of the prior distribution, with rscale=1 yielding a standard Cauchy prior. See the references below for more details.
For the rscale argument, several named values are recognized: "medium", "wide", and "ultrawide". These correspond to r scale values of sqrt(2)/2, 1, and sqrt(2) respectively.
Then in the paper it is said:
For both JZS and scaled-information priors, as r is increased, the Bayes factor provides increased support for the null.
This basically means that if you expect really small effect sizes, you should lower the r parameter.
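A minimal sketch of what this looks like in code (df1$Value and df2$Value are the paired vectors from your example; 0.3 is a hypothetical prior scale chosen to match a small pilot effect size):

library(BayesFactor)

# Default prior scale: rscale = "medium", i.e. sqrt(2)/2
bf_default <- ttestBF(x = df1$Value, y = df2$Value, paired = TRUE)

# Narrower Cauchy prior, appropriate if pilot data suggest a small effect
bf_small <- ttestBF(x = df1$Value, y = df2$Value, paired = TRUE, rscale = 0.3)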
About your second question:
You should be able to use your pilot data as an estimate of the expected effect size and adjust the prior accordingly.
Be aware that you should not adjust your prior with regard to the observed data (i.e. the new data).
Furthermore, with regard to the BayesFactor package, I would assume that the default priors work pretty well with most data (at least if it comes from psychology). See the other references provided in the help.
I hope this helps a little :) . Unfortunately, I cannot tell you whether or how to calculate the best scale for your effect size, as there is also a trade-off between effect size and Bayes factor for really large sample sizes.

Fixed-effects, instrumental-variable regression like xtivreg in Stata (FE IV regression)

Does anyone know of an R package that supports fixed-effects, instrumental-variable regression like xtivreg in Stata (FE IV regression)? Yes, I can just include dummy variables but that just gets impossible when the number of groups increases.
Thanks!
I can just include dummy variables but that just gets impossible when the number of groups increases
By "impossible," do you mean "computationally impossible"? If so, check out the plm package, which was designed to handle cases that would otherwise be computationally infeasible, and which permits fixed-effects IV.
Start with the plm vignette. It will quickly make clear whether plm is what you're looking for.
Update 2018 December 03: the estimatr package will also do what you want. It's faster and easier to use than the plm package.
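A minimal sketch of both options (every name here — y, x1, x2, z1, id, year, and the data frame pdat — is a hypothetical placeholder; x1 is the endogenous regressor and z1 its instrument):

library(plm)

# Fixed-effects (within) IV: the part after | lists all exogenous
# regressors plus the instruments
fe_iv <- plm(y ~ x1 + x2 | x2 + z1,
             data = pdat, index = c("id", "year"), model = "within")
summary(fe_iv)

library(estimatr)

# The estimatr equivalent, with the fixed effects absorbed
fe_iv2 <- iv_robust(y ~ x1 + x2 | x2 + z1, data = pdat, fixed_effects = ~id)
summary(fe_iv2)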
As you may know, for many fixed-effects and random-effects models (I mean FE and RE in the econometrics and education sense, since the definitions in statistics are different), you can create an equivalent SEM (structural equation modeling) model. Two packages in R can be used for that purpose: 1) sem, 2) lavaan.
Another solution is to use SAS. In SAS, you can use PROC GLM, whose ABSORB statement automatically takes care of the dummies as well as computing (x - xbar) for each observation.
Hope it helps.
Try the ivreg command from the AER package.

Standard error of the ARIMA constant

I am trying to manually calculate the standard error of the constant in an ARIMA model, when one is included. I have referred to the Box and Jenkins (1994) text, specifically Section 7.2, but my understanding is that the methods mentioned there calculate the variance-covariance matrix for the ARIMA parameters only, not the constant. I tried searching on the Internet but couldn't find any theory. Software like Minitab, R, etc. calculates this, so I was wondering how. Can someone provide any pointer(s) on this topic?
Thanks.
arima() will fit a regression model with ARMA errors. The constant is treated as the coefficient of a regression variable consisting only of 1s. So you need the covariance matrix of the regression coefficients, which is usually calculated separately from the covariance matrix of the ARMA coefficients. Look at Section 8.3 of Hamilton's "Time Series Analysis".
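In R, the fitted object already carries this: a minimal sketch using the built-in lh series (chosen arbitrarily, just for illustration):

fit <- arima(lh, order = c(1, 0, 0))  # include.mean = TRUE by default

# var.coef covers all estimated coefficients, including the "intercept"
sqrt(diag(fit$var.coef))              # standard errors, constant included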
One of the nicest things about R is that you can access a lot of the source code of R itself from within the environment. If you simply type arima at the command prompt, you get the high-level source code for the arima() function. I got several pages of code when I tried it.
You do miss out on anything implemented internally within the R executable in native code, but often the high-level code tells you everything you want to know.
Perhaps a shift of perspective can solve this problem.
Rather than seeing the constant as something special, just consider the problem without a constant but with a regressor that is a vector of ones.
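A sketch of that perspective using arima()'s xreg argument (again with the built-in lh series, purely for illustration):

ones <- rep(1, length(lh))

# Suppress the automatic mean and supply the constant as an explicit regressor
fit1 <- arima(lh, order = c(1, 0, 0), include.mean = FALSE, xreg = ones)

sqrt(diag(fit1$var.coef))  # the xreg coefficient's SE is the constant's SE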
