Multinomial logit in R: mlogit versus nnet

I want to run a multinomial logit in R and have used two libraries, nnet and mlogit, which produce different results and report different types of statistics. My questions are:
What is the source of the discrepancy between the coefficients and standard errors reported by nnet and those reported by mlogit?
I would like to report my results to a LaTeX file using stargazer. When doing so, there is a problematic tradeoff:
If I use the results from mlogit, I get the statistics I want, such as the pseudo R-squared; however, the output is in long format (see example below).
If I use the results from nnet, the format is as expected, but it reports statistics I am not interested in, such as the AIC, while omitting, for example, the pseudo R-squared.
I would like to have the statistics reported by mlogit in the formatting of nnet when I use stargazer.
Here is a reproducible example, with three choice alternatives:
library(mlogit)
library(stargazer)
df = data.frame(c(0,1,1,2,0,1,0), c(1,6,7,4,2,2,1), c(683,276,756,487,776,100,982))
colnames(df) <- c('y', 'col1', 'col2')
mydata = df
mldata <- mlogit.data(mydata, choice="y", shape="wide")
mlogit.model1 <- mlogit(y ~ 1 | col1+col2, data=mldata)
stargazer(mlogit.model1)
The TeX output, when compiled, is in what I refer to as "long" format, which I deem undesirable:
Now, using nnet:
library(nnet)
mlogit.model2 = multinom(y ~ 1 + col1+col2, data=mydata)
stargazer(mlogit.model2)
Gives the tex output:
which is of the "wide" format that I desire. Note the different coefficients and standard errors.

To my knowledge, there are three R packages that allow the estimation of the multinomial logistic regression model: mlogit, nnet and globaltest (from Bioconductor). I do not consider here the mnlogit package, a faster and more efficient implementation of mlogit.
All the above packages use different algorithms that, for small samples, give different results. These differences vanish for moderate sample sizes (try with n <- 100).
Consider the following data generating process taken from James Keirstead's blog:
n <- 40
set.seed(4321)
df1 <- data.frame(x1=runif(n,0,100), x2=runif(n,0,100))
df1 <- transform(df1, y = 1 + ifelse(100 - x1 - x2 + rnorm(n, sd=10) < 0, 0,
                                     ifelse(100 - 2*x2 + rnorm(n, sd=10) < 0, 1, 2)))
str(df1)
'data.frame': 40 obs. of 3 variables:
$ x1: num 33.48 90.91 41.15 4.38 76.35 ...
$ x2: num 68.6 42.6 49.9 36.1 49.6 ...
$ y : num 1 1 3 3 1 1 1 1 3 3 ...
table(df1$y)
1 2 3
19 8 13
The model parameters estimated by the three packages are respectively:
library(mlogit)
df2 <- mlogit.data(df1, choice="y", shape="wide")
mlogit.mod <- mlogit(y ~ 1 | x1+x2, data=df2)
(mlogit.cf <- coef(mlogit.mod))
2:(intercept) 3:(intercept) 2:x1 3:x1 2:x2 3:x2
42.7874653 80.9453734 -0.5158189 -0.6412020 -0.3972774 -1.0666809
#######
library(nnet)
nnet.mod <- multinom(y ~ x1 + x2, df1)
(nnet.cf <- coef(nnet.mod))
(Intercept) x1 x2
2 41.51697 -0.5005992 -0.3854199
3 77.57715 -0.6144179 -1.0213375
#######
library(globaltest)
glbtest.mod <- globaltest::mlogit(y ~ x1+x2, data=df1)
(cf <- glbtest.mod@coefficients)
1 2 3
(Intercept) -41.2442934 1.5431814 39.7011119
x1 0.3856738 -0.1301452 -0.2555285
x2 0.4879862 0.0907088 -0.5786950
The mlogit command of globaltest fits the model without using a reference outcome category, hence the usual parameters can be calculated as follows:
(glbtest.cf <- rbind(cf[,2]-cf[,1],cf[,3]-cf[,1]))
(Intercept) x1 x2
[1,] 42.78747 -0.5158190 -0.3972774
[2,] 80.94541 -0.6412023 -1.0666813
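As a quick sanity check (a sketch reusing the objects created above), the three sets of estimates can be put on the same scale and compared side by side; rerunning everything with n <- 100 makes the remaining differences shrink further:
round(matrix(mlogit.cf, nrow = 2), 4)  # rows: outcomes 2 and 3; cols: intercept, x1, x2
round(nnet.cf, 4)
round(glbtest.cf, 4)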
Concerning the estimation of the parameters in the three packages, the method used in mlogit::mlogit is explained in detail here.
In nnet::multinom the model is a neural network with no hidden layers, no bias nodes and a softmax output layer; in our case there are 3 input units and 3 output units:
nnet:::summary.nnet(nnet.mod)
a 3-0-3 network with 12 weights
options were - skip-layer connections softmax modelling
b->o1 i1->o1 i2->o1 i3->o1
0.00 0.00 0.00 0.00
b->o2 i1->o2 i2->o2 i3->o2
0.00 41.52 -0.50 -0.39
b->o3 i1->o3 i2->o3 i3->o3
0.00 77.58 -0.61 -1.02
Maximum conditional likelihood is the method used in multinom for model fitting.
The parameters of multinomial logit models are estimated in globaltest::mlogit using maximum likelihood and working with an equivalent log-linear model and the Poisson likelihood. The method is described here.
For models estimated by multinom the McFadden's pseudo R-squared can be easily calculated as follows:
nnet.mod.loglik <- nnet:::logLik.multinom(nnet.mod)
nnet.mod0 <- multinom(y ~ 1, df1)
nnet.mod0.loglik <- nnet:::logLik.multinom(nnet.mod0)
(nnet.mod.mfr2 <- as.numeric(1 - nnet.mod.loglik/nnet.mod0.loglik))
[1] 0.8483931
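For comparison, mlogit reports McFadden's R-squared in its own summary; as a sketch, assuming the summary element is named mfR2 (as in current versions of the package):
summary(mlogit.mod)$mfR2  # assumption: the summary object stores McFadden's R2 under mfR2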
At this point, using stargazer, I generate a report for the model estimated by mlogit::mlogit which is as similar as possible to the report of multinom. The basic idea is to substitute the estimated coefficients and probabilities in the object created by multinom with the corresponding estimates of mlogit.
# Substitution of coefficients
nnet.mod2 <- nnet.mod
cf <- matrix(nnet.mod2$wts, nrow=4)
cf[2:nrow(cf), 2:ncol(cf)] <- t(matrix(mlogit.cf, nrow=2))
nnet.mod2$wts <- c(cf)
# Substitution of fitted probabilities
nnet.mod2$fitted.values <- mlogit.mod$probabilities
Here is the result:
library(stargazer)
stargazer(nnet.mod2, type="text")
==============================================
Dependent variable:
----------------------------
2 3
(1) (2)
----------------------------------------------
x1 -0.516** -0.641**
(0.212) (0.305)
x2 -0.397** -1.067**
(0.176) (0.519)
Constant 42.787** 80.945**
(18.282) (38.161)
----------------------------------------------
Akaike Inf. Crit. 24.623 24.623
==============================================
Note: *p<0.1; **p<0.05; ***p<0.01
Now I am working on the last issue: how to visualize loglik, pseudo R2 and other information in the above stargazer output.
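A possible approach (a sketch, relying on stargazer's documented add.lines argument, which appends arbitrary rows to the table) is to inject the statistics computed above:
stargazer(nnet.mod2, type = "text",
          add.lines = list(c("Log Likelihood", rep(round(as.numeric(nnet.mod.loglik), 3), 2)),
                           c("McFadden pseudo R2", rep(round(nnet.mod.mfr2, 3), 2))))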

If you are using stargazer you can use omit to remove unwanted rows or references. Here is a quick example; hopefully it will point you in the right direction.
N.B. My assumption is that you are using RStudio and R Markdown with knitr.
```{r, echo=FALSE}
library(mlogit)
df = data.frame(c(0,1,1,2,0,1,0), c(1,6,7,4,2,2,1), c(683,276,756,487,776,100,982))
colnames(df) <- c('y', 'col1', 'col2')
mydata = df
mldata <- mlogit.data(mydata, choice = "y", shape="wide")
mlogit.model1 <- mlogit(y ~ 1| col1+col2, data=mldata)
mlogit.col1 <- mlogit(y ~ 1 | col1, data = mldata)
mlogit.col2 <- mlogit(y ~ 1 | col2, data = mldata)
```
# MLOGIT
```{r echo = FALSE, message = TRUE, error = TRUE, warning = FALSE, results = 'asis'}
library(stargazer)
stargazer(mlogit.model1, type = "html")
stargazer(mlogit.col1,
          mlogit.col2,
          type = "html",
          omit = c("1:col1", "2:col1", "1:col2", "2:col2"))
```
Result:
Note that the second image omits 1:col1, 2:col1, 1:col2 and 2:col2.

Related

R modelsummary output: robust standard errors from coxph model

I'm running a series of coxph models in R and compiling the output into LaTeX tables using the modelsummary package. coxph provides both the SE and the robust SE as outputs, and the p-value is based on the robust SE. Here is a quick example to illustrate the output:
library(survival)
test_data <- list(time = c(4,3,1,1,2,2,3),
                  status = c(1,1,1,0,1,1,0),
                  x = c(0,2,1,1,1,0,0),
                  sex = c(0,0,0,0,1,1,1))
model <- coxph(Surv(time, status) ~ x + cluster(sex), test_data)
Call:
coxph(formula = Surv(time, status) ~ x, data = test_data, cluster = sex)
coef exp(coef) se(coef) robust se z p
x 0.460778 1.585306 0.562800 0.001052 437.9 <2e-16
Likelihood ratio test=0.66 on 1 df, p=0.4176
n= 7, number of events= 5
Next, I'm trying to create LaTeX tables from these models, displaying the robust SE.
modelsummary(model, output = "markdown", fmt = 3, estimate = "{estimate}{stars}", statistic = "std.error")
| | Model 1 |
|:---------------------|:----------:|
|x | 0.461*** |
| | (0.563) |
|Num.Obs. | 7 |
|R2 | 0.090 |
|AIC | 13.4 |
As we can see, only the non-adjusted SE is displayed. I could not find any suitable alternative to the statistic = "std.error" parameter, and something like vcov = "robust" does not work either.
How can I display any kind of robust standard errors using modelsummary for coxph models?
Thanks for reading and any help is appreciated.
The next version of modelsummary (now available on Github) will produce a more informative error message in those cases:
library(survival)
library(modelsummary)
test_data <- list(time=c(4,3,1,1,2,2,3)
, status=c(1,1,1,0,1,1,0)
, x=c(0,2,1,1,1,0,0)
, sex=c(0,0,0,0,1,1,1))
model <- coxph(Surv(time, status) ~ x + cluster(sex), test_data)
modelsummary(model, vcov = "robust")
# Error: Unable to extract a variance-covariance matrix for model object of class
# `coxph`. Different values of the `vcov` argument trigger calls to the `sandwich`
# or `clubSandwich` packages in order to extract the matrix (see
# `?insight::get_varcov`). Your model or the requested estimation type may not be
# supported by one or both of those packages, or you were missing one or more
# required arguments in `vcov_args` (like `cluster`).
As you can see, the problem is that modelsummary does not compute robust standard errors itself. Instead, it delegates this task to the sandwich or clubSandwich packages. Unfortunately, this coxph model does not appear to be supported by those packages:
sandwich::vcovHC(model)
#> Error in apply(abs(ef) < .Machine$double.eps, 1L, all): dim(X) must have a positive length
sandwich is the main package in the R ecosystem to compute robust standard errors. AFAICT, all the other table-making packages available (e.g., stargazer, texreg) also use sandwich, so you are unlikely to have success by looking at those. If you find another package which can compute robust standard errors for Cox models, please file a report on the modelsummary Github repository. I will investigate to see if it’s possible to add support then.
If the info you want is available in the summary object, you can add this information by following the instructions here:
https://vincentarelbundock.github.io/modelsummary/articles/modelsummary.html#adding-new-information-to-existing-models
tidy_custom.coxph <- function(x, ...) {
  s <- summary(x)$coefficients
  data.frame(
    term = row.names(s),
    robust.se = s[, "robust se", drop = FALSE])
}
modelsummary(model, statistic = "robust.se")
            Model 1
x           0.461
            (0.001)
Num.Obs.    7
AIC         13.4
BIC         13.4
RMSE        0.61
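The same vignette also documents a glance_custom mechanism for customizing the goodness-of-fit block; as a sketch (using the nevent component that coxph objects carry):
glance_custom.coxph <- function(x, ...) {
  data.frame("Num. events" = x$nevent, check.names = FALSE)
}
modelsummary(model, statistic = "robust.se")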

multinomial logistic regression in R: multinom in nnet package result different from mlogit in mlogit package?

Both R functions, multinom (package nnet) and mlogit (package mlogit), can be used for multinomial logistic regression. But why does this example return different p-values for the coefficients?
library(nnet)
library(mlogit)
# prepare data
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank)
mydata$gre[1:10] = rnorm(10,mean=80000)
#multinom:
test = multinom(admit ~ gre + gpa + rank, data = mydata)
z <- summary(test)$coefficients/summary(test)$standard.errors
# For simplicity, use z-test to approximate t test.
pv <- (1 - pnorm(abs(z)))*2
pv
# (Intercept) gre gpa rank2 rank3 rank4
# 0.00000000 0.04640089 0.00000000 0.00000000 0.00000000 0.00000000
#mlogit:
mldata = mlogit.data(mydata,choice = 'admit', shape = "wide")
mlogit.model1 <- mlogit(admit ~ 1 | gre + gpa + rank, data = mldata)
summary(mlogit.model1)
# Coefficients :
# Estimate Std. Error t-value Pr(>|t|)
# 1:(intercept) -3.5826e+00 1.1135e+00 -3.2175 0.0012930 **
# 1:gre 1.7353e-05 8.7528e-06 1.9825 0.0474225 *
# 1:gpa 1.0727e+00 3.1371e-01 3.4195 0.0006274 ***
# 1:rank2 -6.7122e-01 3.1574e-01 -2.1258 0.0335180 *
# 1:rank3 -1.4014e+00 3.4435e-01 -4.0697 4.707e-05 ***
# 1:rank4 -1.6066e+00 4.1749e-01 -3.8482 0.0001190 ***
Why are the p-values from multinom and mlogit so different? I guess it is because of the outliers I added using mydata$gre[1:10] = rnorm(10, mean=80000). If outliers are an inevitable issue (for example in genomics, metabolomics, etc.), which R function should I use?
As an alternative, you can use broom, which outputs a tidy format for multinom-class models.
library(broom)
tidy(test)
It'll return a data.frame with z-statistics and p-values.
Take a look at tidy documentation for further information.
P.S.: since I can't get the data from the link you posted, I can't replicate the results.
The difference here is the difference between the Wald z-test (what you calculated in pv) and the likelihood ratio test (what is returned by summary(mlogit.model1)). The Wald test is computationally simpler, but in general it has less desirable properties (e.g., its CIs are not scale-invariant). You can read more about the two procedures here.
To perform LR tests on your nnet model coefficients, you can load the car and lmtest packages and call Anova(test) (though you'll have to do a little more work for the single-df tests).
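A minimal sketch of that suggestion, reusing the test model fitted above:
library(car)
Anova(test)  # likelihood-ratio chi-square test for each term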

Prediction of 'mlm' linear model object from `lm()`

I have three datasets:
response - matrix of 5 (samples) x 10 (dependent variables)
predictors - matrix of 5 (samples) x 2 (independent variables)
test_set - matrix of 10 (samples) x 2 (independent variables, matching predictors)
response <- matrix(sample.int(15, size = 5*10, replace = TRUE), nrow = 5, ncol = 10)
colnames(response) <- c("1_DV","2_DV","3_DV","4_DV","5_DV","6_DV","7_DV","8_DV","9_DV","10_DV")
predictors <- matrix(sample.int(15, size = 5*2, replace = TRUE), nrow = 5, ncol = 2)
colnames(predictors) <- c("1_IV","2_IV")
test_set <- matrix(sample.int(15, size = 10*2, replace = TRUE), nrow = 10, ncol = 2)
colnames(test_set) <- c("1_IV","2_IV")
I'm doing a multivariate linear model using a training set defined as the combination of response and predictor sets, and I would like to use this model to make predictions for the test set:
training_dataframe <- data.frame(predictors, response)
fit <- lm(response ~ predictors, data = training_dataframe)
predictions <- predict(fit, data.frame(test_set))
However, the results for predictions are really odd:
predictions
First off, the matrix dimensions are 5 x 10: the number of samples in the response variable by the number of DVs.
I'm not very skilled with this type of analysis in R, but shouldn't I be getting a 10 x 10 matrix, so that I have predictions for each row in my test_set?
Any help with this issue would be greatly appreciated,
Martin
You are stepping into a poorly supported part of R. The model class you have is "mlm", i.e., "multiple linear models", which is not the standard "lm" class. You get it when you have several (independent) response variables for a common set of covariates / predictors. Although the lm() function can fit such a model, the predict method is poor for the "mlm" class. If you look at methods(predict), you will see a predict.mlm*. Normally for a linear model with class "lm", predict.lm is called when you call predict; but for an "mlm" class object, predict.mlm* is called.
predict.mlm* is too primitive. It does not allow se.fit, i.e., it cannot produce prediction errors, confidence / prediction intervals, etc., although this is possible in theory. It can only compute the prediction mean. If so, why do we want to use predict.mlm* at all?! The prediction mean can be obtained by a trivial matrix-matrix multiplication (in the standard "lm" class this is a matrix-vector multiplication), so we can do it on our own.
Consider this small, reproducible example.
set.seed(0)
## 2 response of 10 observations each
response <- matrix(rnorm(20), 10, 2)
## 3 covariates with 10 observations each
predictors <- matrix(rnorm(30), 10, 3)
fit <- lm(response ~ predictors)
class(fit)
# [1] "mlm" "lm"
beta <- coef(fit)
# [,1] [,2]
#(Intercept) 0.5773235 -0.4752326
#predictors1 -0.9942677 0.6759778
#predictors2 -1.3306272 0.8322564
#predictors3 -0.5533336 0.6218942
When you have a prediction data set:
# 2 new observations for 3 covariates
test_set <- matrix(rnorm(6), 2, 3)
we first need to prepend a column of ones for the intercept:
Xp <- cbind(1, test_set)
Then do this matrix multiplication
pred <- Xp %*% beta
# [,1] [,2]
#[1,] -2.905469 1.702384
#[2,] 1.871755 -1.236240
Perhaps you have noticed that I did not even use a data frame here. Yes, it is unnecessary, as you have everything in matrix form. For the R wizards, maybe using lm.fit or even qr.solve is more straightforward; see the sketch below.
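For instance, a minimal lm.fit sketch (lm.fit accepts a matrix response, using the response and predictors matrices defined above):
X <- cbind(1, predictors)                   # design matrix with intercept column
beta2 <- lm.fit(X, response)$coefficients   # 4 x 2 coefficient matrix
all.equal(unname(beta), unname(beta2))      # should be TRUE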
But as a complete answer, it is a must to demonstrate how to use predict.mlm* to get our desired result.
## still using previous matrices
training_dataframe <- data.frame(response = I(response), predictors = I(predictors))
fit <- lm(response ~ predictors, data = training_dataframe)
newdat <- data.frame(predictors = I(test_set))
pred <- predict(fit, newdat)
# [,1] [,2]
#[1,] -2.905469 1.702384
#[2,] 1.871755 -1.236240
Note the I() when I use data.frame(). This is a must when we want to obtain a data frame of matrices. You can compare the difference between:
str(data.frame(response = I(response), predictors = I(predictors)))
#'data.frame': 10 obs. of 2 variables:
# $ response : AsIs [1:10, 1:2] 1.262954.... -0.32623.... 1.329799.... 1.272429.... 0.414641.... ...
# $ predictors: AsIs [1:10, 1:3] -0.22426.... 0.377395.... 0.133336.... 0.804189.... -0.05710.... ...
str(data.frame(response = response, predictors = predictors))
#'data.frame': 10 obs. of 5 variables:
# $ response.1 : num 1.263 -0.326 1.33 1.272 0.415 ...
# $ response.2 : num 0.764 -0.799 -1.148 -0.289 -0.299 ...
# $ predictors.1: num -0.2243 0.3774 0.1333 0.8042 -0.0571 ...
# $ predictors.2: num -0.236 -0.543 -0.433 -0.649 0.727 ...
# $ predictors.3: num 1.758 0.561 -0.453 -0.832 -1.167 ...
Without I() to protect the matrix input, the data are messy. It is amazing that this does not cause a problem for lm, but predict.mlm will have a hard time obtaining the correct matrix for prediction if you don't use I().
Well, I would recommend using a "list" instead of a "data frame" in this case. The data argument in lm, as well as the newdata argument in predict, allows list input. A "list" is a more general structure than a data frame and can hold any data structure without difficulty. We can do:
## still using previous matrices
training_list <- list(response = response, predictors = predictors)
fit <- lm(response ~ predictors, data = training_list)
newdat <- list(predictors = test_set)
pred <- predict(fit, newdat)
# [,1] [,2]
#[1,] -2.905469 1.702384
#[2,] 1.871755 -1.236240
Perhaps in the very end, I should stress that it is always safe to use the formula interface, rather than the matrix interface. I will use the R built-in dataset trees as a reproducible example.
fit <- lm(cbind(Girth, Height) ~ Volume, data = trees)
## use the first two rows as prediction dataset
predict(fit, newdata = trees[1:2, ])
# Girth Height
#1 9.579568 71.39192
#2 9.579568 71.39192
Perhaps you still remember my saying that predict.mlm* is too primitive to support se.fit. This is the chance to test it.
predict(fit, newdata = trees[1:2, ], se.fit = TRUE)
#Error in predict.mlm(fit, newdata = trees[1:2, ], se.fit = TRUE) :
# the 'se.fit' argument is not yet implemented for "mlm" objects
Oops... What about confidence / prediction intervals (actually, without the ability to compute standard errors it is impossible to produce those intervals)? Well, predict.mlm* will just ignore them.
predict(fit, newdata = trees[1:2, ], interval = "confidence")
# Girth Height
#1 9.579568 71.39192
#2 9.579568 71.39192
This is quite different from the behaviour of predict.lm.
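For contrast, a single-response "lm" fit on the same data happily produces standard errors and intervals (a quick sketch):
fit1 <- lm(Girth ~ Volume, data = trees)
predict(fit1, newdata = trees[1:2, ], se.fit = TRUE)
predict(fit1, newdata = trees[1:2, ], interval = "confidence")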

Create a binary outcome with random forest

I have a dataset that looks like this:
TEAM1 TEAM2 EXPG1 EXPG2 Gewonnen
ADO Den Haag Groningen 1.5950 1.2672 1
I now try to predict the column Gewonnen based on EXPG1 and EXPG2. To that end I created a training and test set and fit the following model (all using caret):
library(caret)
modFit <- train(Gewonnen ~ EXPG1 + EXPG2, data=training, method="rf", prox=TRUE)
I can't make a confusion matrix now because my data has more references. That's because when I do:
pred <- predict(modFit, testing)
head(pred)
It says: 0.5324000 0.7237333 0.2811333 0.8231000 0.8299333 0.9792000
Because I want to make a confusion matrix, I would have to turn these into 0/1 values first, but I have the feeling that there should be an option to do this in the model as well.
Any thoughts on what I should change in this model to create 0/1 values? I couldn't find it in the documentation:
modFit <- train(Gewonnen~ EXPG1 + EXPG2, data=training, method="rf", prox=TRUE)
First of all, as Tim Biegeleisen says, you should convert your Gewonnen variable to a factor (in both training & test sets), if it is not already:
training$Gewonnen <- as.factor(training$Gewonnen)
testing$Gewonnen <- as.factor(testing$Gewonnen)
After that, the type option in the caret function predict determines what type of response you get for a binary classification problem, i.e. class labels or probabilities. Here is a reproducible example from the caret documentation using the Sonar dataset from the package mlbench:
library(caret)
library(mlbench)
data(Sonar)
str(Sonar$Class)
# Factor w/ 2 levels "M","R": 2 2 2 2 2 2 2 2 2 2 ...
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,]
testing <- Sonar[-inTraining,]
modFit <- train(Class ~ ., data=training, method="rf", prox=TRUE)
pred <- predict(modFit, testing, type="prob") # for class probabilities
head(pred)
# M R
# 5 0.442 0.558
# 10 0.276 0.724
# 11 0.096 0.904
# 12 0.360 0.640
# 20 0.654 0.346
# 21 0.522 0.478
pred2 <- predict(modFit, testing, type="raw") # for class labels
head(pred2)
# [1] R R R R M M
# Levels: M R
For the confusion matrix, you will need class labels (i.e. pred2 above):
confusionMatrix(pred2, testing$Class)
# Confusion Matrix and Statistics
# Reference
# Prediction M R
# M 25 6
# R 2 18
This answer is a bit speculative as you omitted some critical details about your data set and I have not worked extensively with the caret package. That being said, it appears that you are running random forests in regression mode, which means that you will end up with a continuous function. This means that predictions can have a response value of 0, 1, or anything in between 0 and 1. If your Gewonnen column only has values of 0 or 1, and you want predicted values to also behave this way, then you can try turning Gewonnen into a categorical variable. As this article discusses, this might tell random forests to run in classification mode instead of regression mode.
Gewonnen <- as.factor(Gewonnen)
This builds the random forest as you did before, and you should have the responses you want.

Hausman type test in R

I have been using the "plm" package of R to analyse panel data. One of the important tests in this package for choosing between the "fixed effects" and the "random effects" model is the Hausman test. A similar test is also available for Stata. The point here is that Stata requires the fixed effects model to be estimated first, followed by the random effects model. However, I didn't see any such restriction in the "plm" package. So I was wondering whether the "plm" package has the default "fixed effects" first and then "random effects" second. For your reference, I list below the steps in Stata and R that I followed for the analysis.
Stata steps (data=mydata, y=dependent variable, X1:X4: explanatory variables):
*step 1 : Estimate the FE model
xtreg y X1 X2 X3 X4 ,fe
*step 2: store the estimator
est store fixed
*step 3 : Estimate the RE model
xtreg y X1 X2 X3 X4,re
* step 4: store the estimator
est store random
*step 5: run Hausman test
hausman fixed random
#R steps (data=mydata, y=dependent variable, X1:X4: explanatory variables)
#step 1: estimate the FE model
library(plm)
fe <- plm(y~X1+X2+X3+X4, data=mydata, model="within")
summary(fe)
#step 2: estimate the RE model
re <- plm(y~X1+X2+X3+X4, data=mydata, model="random")
summary(re)
#step 3: run Hausman test
phtest(fe, re)
Update: Be sure to read the comments. Original answer below.
Trial-and-error way of finding this out:
> library(plm)
> data("Gasoline", package = "plm")
> form <- lgaspcar ~ lincomep + lrpmg + lcarpcap
> wi <- plm(form, data = Gasoline, model = "within")
> re <- plm(form, data = Gasoline, model = "random")
> phtest(wi, re)
Hausman Test
data: form
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
> phtest(re, wi)
Hausman Test
data: form
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
As you can see, the test yields the same result no matter which of the models you feed it as the first and which as the second argument.
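Consistent with this, phtest can also be given the formula directly and will fit both models itself, so the question of ordering never arises; a sketch using the formula method of phtest:
> phtest(form, data = Gasoline)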
