The documentation for the multinom() function from the nnet package in R says that it "[f]its multinomial log-linear models via neural networks" and that "[t]he response should be a factor or a matrix with K columns, which will be interpreted as counts for each of K classes." Even when I go to add a tag for nnet on this question, the description says that it is software for fitting "multinomial log-linear models."
Granting that statistics has wildly inconsistent jargon that is rarely operationally defined by whoever is using it, the documentation for the function even mentions having a count response and so seems to indicate that this function is designed to model count data. Yet virtually every resource I've seen treats it exclusively as if it were fitting a multinomial logistic regression. In short, everyone interprets the results in terms of logged odds relative to the reference (as in logistic regression), not in terms of logged expected count (as in what is typically referred to as a log-linear model).
Can someone clarify what this function is actually doing and what the fitted coefficients actually mean?
As I understand it, nnet::multinom is fitting a multinomial logistic regression.
If you check the source code of the package (https://github.com/cran/nnet/blob/master/R/multinom.R and https://github.com/cran/nnet/blob/master/R/nnet.R), you will see that the multinom function does indeed accept counts (a common form of input for a multinomial regression model; see also the MGLM or mclogit packages, for example), and that it fits the multinomial regression model using a softmax transform to go from predictions on the additive log-ratio scale to predicted probabilities. The softmax transform is precisely the inverse link function of a multinomial regression model. The way the model predictions are obtained (cf. the question "predictions from nnet::multinom") is also exactly what you would expect for a multinomial regression model with an additive log-ratio parameterization, i.e. one that uses one outcome category as the baseline.
That is, the coefficients predict the logged odds relative to the reference (baseline) category (i.e. it is doing a logistic regression), not the logged expected counts (as a log-linear model would).
This is shown by the fact that model predictions are calculated as
fit <- nnet::multinom(...)
X <- model.matrix(fit) # covariate matrix / design matrix
betahat <- t(rbind(0, coef(fit))) # model coefficients, with an explicit zero row added for the reference category, then transposed
preds <- mclustAddons::softmax(X %*% betahat) # row-wise softmax of the linear predictors gives the predicted class probabilities
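For example, here is a self-contained sketch of that check on the iris data (iris and the hand-rolled softmax are just illustrative choices, not part of the original code); the manually computed probabilities should reproduce predict(fit, type = "probs"):
library(nnet)
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
X <- model.matrix(~ Sepal.Length + Sepal.Width, data = iris)   # design matrix
betahat <- t(rbind(0, coef(fit)))        # add zero coefficients for the baseline category (setosa), then transpose
eta <- X %*% betahat                     # linear predictors on the additive log-ratio scale
probs_manual <- exp(eta) / rowSums(exp(eta))            # row-wise softmax
max(abs(probs_manual - predict(fit, type = "probs")))   # should be essentially zero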
Furthermore, I verified that the vcov matrix returned by nnet::multinom matches the one obtained from the standard formula for the Fisher information matrix of a multinomial regression model; see the question "Faster way to calculate the Hessian / Fisher Information Matrix of a nnet::multinom multinomial regression in R using Rcpp & Kronecker products".
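As a sketch of that check (continuing with the iris example; the block ordering below assumes vcov()'s category-major parameter layout, which is what nnet::multinom appears to use):
library(nnet)
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
X <- model.matrix(~ Sepal.Length + Sepal.Width, data = iris)
P <- predict(fit, type = "probs")        # n x K matrix of fitted probabilities
K <- ncol(P); p <- ncol(X)
info <- matrix(0, (K - 1) * p, (K - 1) * p)   # Fisher information, block (k, l) = X' diag(p_k (1[k = l] - p_l)) X
for (k in 2:K) for (l in 2:K) {
  w <- P[, k] * ((k == l) - P[, l])
  rows <- ((k - 2) * p + 1):((k - 1) * p)
  cols <- ((l - 2) * p + 1):((l - 1) * p)
  info[rows, cols] <- t(X) %*% (w * X)
}
max(abs(solve(info) - vcov(fit)))        # should be very small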
Is it not the case that a multinomial regression model can always be reformulated as a Poisson log-linear model (i.e. as a Poisson GLM) using the Poisson trick (e.g. glmnet uses the Poisson trick to fit multinomial regression models as a Poisson GLM)?
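For categorical covariates at least this is easy to check directly; here is a toy sketch with made-up counts (the data and names are purely illustrative): the outcome terms of a saturated Poisson log-linear fit reproduce the baseline-category multinomial logit coefficients up to optimizer tolerance.
library(nnet)
d <- data.frame(
  outcome = factor(rep(c("A", "B", "C"), 2)),
  group   = factor(rep(c("g1", "g2"), each = 3)),
  count   = c(10, 20, 30, 25, 15, 40)
)
## Baseline-category multinomial logit on the counts (references: outcome A, group g1)
fit_multi <- multinom(outcome ~ group, weights = count, data = d, trace = FALSE)
coef(fit_multi)
## Same model as a Poisson GLM: nuisance main effect for each covariate pattern (group),
## plus outcome main effects and outcome:group interactions
fit_pois <- glm(count ~ group * outcome, family = poisson, data = d)
coef(fit_pois)
## outcomeB / outcomeC match the multinomial intercepts, and
## groupg2:outcomeB / groupg2:outcomeC match the multinomial groupg2 coefficients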
Is there a simple way to decide on a category based on the predicted probabilities from an ordinal logistic regression?
In the binary case, I have so far set the classification threshold based on the distribution of the underlying data set, but this is not possible in the ordinal case.
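A minimal sketch of the simplest rule, assigning each observation to its most probable category (illustrated with the standard housing example from MASS::polr; this is just one possible choice, not necessarily the criterion being asked about):
library(MASS)
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)   # example model from ?polr
probs <- predict(fit, type = "probs")            # one column of probabilities per ordered category
pred_class <- colnames(probs)[max.col(probs)]    # pick the most probable category per row
table(pred_class)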
I am trying to perform logistic regression on data that contains a binary outcome. However, I do not have access to the outcome data.
I've calculated probabilities of a "1" outcome for each subject by assigning "risk points" to certain values of each variable and adding them up for each subject, so that the probability of a "1" is (sum of the subject's risk points) / (total number of possible risk points). I then took the log of the odds to calculate the logit, so I have a logit value between -3 and 2 for each subject.
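For example (with made-up numbers, just to illustrate the calculation described above):
risk_points  <- c(3, 7, 12, 18)        # hypothetical sums of risk points for four subjects
total_points <- 20                     # hypothetical maximum possible number of risk points
p     <- risk_points / total_points    # estimated probability of a "1" for each subject
logit <- log(p / (1 - p))              # log of the odds, i.e. the logit
logit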
However, I would like to use logistic regression to evaluate which variables have the greatest effect on the outcome probabilities. Is there a way in R to perform logistic regression using only the predictor variables and the logit, without the binary outcome data? I have tried using glm and it does not work, because in order to do logistic regression you need binary outcome data.
Thank you!
I would like to perform model-based clustering using a mixture of ordinal logistic regressions (for the outcome, not as a concomitant model).
Does someone know whether this is implemented in R? For example, can I use ordinal regression instead of multinomial regression in the flexmix package?
Thanks a lot!
One can fit an ordinary generalized linear model in R using the glm function, which has its own summary method; the summary output reports a p-value for each variable. Based on those p-values, one can say which variables are statistically significant at a given confidence level.
My question is: is there equivalent functionality for the cv.glmnet function from the glmnet package? I know that after fitting I can retrieve a table of coefficients with coef(model, s = "lambda.min"), some of which are non-zero. So I assume (maybe wrongly) that those non-zero coefficients are statistically significant. Am I right? Is there any method that provides p-values or confidence intervals for those coefficients?
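For concreteness, a small sketch of the two workflows being compared, on simulated data (the data and variable names are purely illustrative):
library(glmnet)
set.seed(1)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))
## Ordinary GLM: summary() reports an estimate, standard error and p-value per variable
summary(glm(y ~ x, family = binomial))$coefficients
## Penalized fit: coef() at lambda.min returns shrunken coefficients, some exactly zero,
## but no standard errors or p-values
cvfit <- cv.glmnet(x, y, family = "binomial")
coef(cvfit, s = "lambda.min")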