I want to get the average marginal effects (AME) of a multinomial logit model, together with standard errors. I've tried different methods for this, but so far none of them has gotten me to the goal.
Best attempt
My best attempt was to get the AMEs by hand using mlogit which I show below.
library(mlogit)
ml.d <- mlogit.data(df1, choice="Y", shape="wide") # shape data for `mlogit()`
ml.fit <- mlogit(Y ~ 1 | D + x1 + x2, reflevel="1", data=ml.d) # fit the model
# covariate names
c.names <- all.vars(ml.fit$call)[2:4]
# get marginal effects
ME.mnl <- sapply(c.names, function(x)
stats::effects(ml.fit, covariate=x, data=ml.d),
simplify=FALSE)
# get AMEs
(AME.mnl <- t(sapply(ME.mnl, colMeans)))
# 1 2 3 4 5
# D -0.03027080 -0.008806072 0.0015410569 0.017186531 0.02034928
# x1 -0.02913234 -0.015749598 0.0130577842 0.013240212 0.01858394
# x2 -0.02724650 -0.005482753 0.0008575982 0.005331181 0.02654047
I know these values are correct. However, I could not get the correct standard errors by simply taking the columns' standard deviations:
# standard errors - WRONG!
(AME.mnl.se <- t(sapply(ME.mnl, colSdColMeans)))
(Note: colSdColMeans() is a helper that computes column standard deviations; it was provided in a linked answer.)
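(The helper is not reproduced in the question; a minimal column-SD function along these lines would do the job. This particular implementation is my sketch, not necessarily the linked one.)
# possible column-SD helper (a sketch, not necessarily the linked implementation)
colSdColMeans <- function(x, na.rm = TRUE) {
  n <- if (na.rm) colSums(!is.na(x)) else nrow(x)
  colVar <- colMeans(x * x, na.rm = na.rm) - colMeans(x, na.rm = na.rm)^2
  sqrt(colVar * n / (n - 1))  # rescale to the unbiased (n - 1) estimator
}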
Accordingly this also led me to the wrong t-values:
# t values - WRONG!
AME.mnl / AME.mnl.se
# 1 2 3 4 5
# D -0.7110537 -0.1615635 0.04013228 0.4190057 0.8951484
# x1 -0.7170813 -0.2765212 0.33325968 0.3656893 0.8907836
# x2 -0.7084573 -0.1155825 0.02600653 0.1281190 0.8559794
Whereas I know the correct t-values for this case are these:
# D -9.26 -1.84 0.31 4.29 8.05
# x1 -6.66 -2.48 1.60 1.50 3.22
# x2 -2.95 -0.39 0.06 0.42 3.21
I learned that there should be a "delta method" for this, but I have only found some code for a very special case with interactions on Cross Validated.
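For reference (my own summary, not from the linked post), the delta method approximates the covariance of a smooth function $g(\hat\beta)$ of the estimated coefficients as

$$\widehat{\operatorname{Var}}\big(g(\hat\beta)\big) \;\approx\; J\,\widehat{\operatorname{Var}}(\hat\beta)\,J^\top, \qquad J = \left.\frac{\partial g(\beta)}{\partial \beta^\top}\right|_{\beta=\hat\beta},$$

so the AME standard errors would be the square roots of the diagonal of that matrix.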
Failed attempts
1.) The margins package doesn't seem to be able to handle "mlogit" objects:
library(margins)
summary(margins(ml.fit))
2.) There's another package that can fit multinomial logit models, nnet,
library(nnet)
ml.fit2 <- multinom(Y ~ D + x1 + x2, data=df1)
summary(ml.fit2)
but margins can't handle this correctly either:
> summary(margins(ml.fit2))
factor AME SE z p lower upper
D -0.0303 NA NA NA NA NA
x1 -0.0291 NA NA NA NA NA
x2 -0.0272 NA NA NA NA NA
3.) There's also a package, DAMisc, that claims to calculate "Average Effects for Multinomial Logistic Regression Models",
library(DAMisc)
mnlChange2(ml.fit2, varnames="D", data=df1)
but I couldn't get a drop of milk out of it, since the function yields just nothing (not even with the function's own example).
How can we get AMEs with standard errors / t-statistics for a multinomial logit model in R?
Data
df1 <- structure(list(Y = c(3, 4, 1, 2, 3, 4, 1, 5, 2, 3, 4, 2, 1, 4,
1, 5, 3, 3, 3, 5, 5, 4, 3, 5, 4, 2, 5, 4, 3, 2, 5, 3, 2, 5, 5,
4, 5, 1, 2, 4, 3, 1, 2, 3, 1, 1, 3, 2, 4, 2, 2, 4, 1, 5, 3, 1,
5, 2, 3, 4, 2, 4, 5, 2, 4, 1, 4, 2, 1, 5, 3, 2, 1, 4, 4, 1, 5,
1, 1, 1, 4, 5, 5, 3, 2, 3, 3, 2, 4, 4, 5, 3, 5, 1, 2, 5, 5, 1,
2, 3), D = c(12, 8, 6, 11, 5, 14, 0, 22, 15, 13, 18, 3, 5, 9,
10, 28, 9, 16, 17, 14, 26, 18, 18, 23, 23, 12, 28, 14, 10, 15,
26, 9, 2, 30, 18, 24, 27, 7, 6, 25, 13, 8, 4, 16, 1, 4, 5, 18,
21, 1, 2, 19, 4, 2, 16, 17, 23, 15, 13, 21, 24, 14, 27, 6, 20,
6, 19, 8, 7, 23, 11, 11, 1, 22, 21, 4, 27, 6, 2, 9, 18, 30, 26,
22, 10, 1, 4, 7, 26, 15, 26, 18, 30, 1, 11, 29, 25, 3, 19, 15
), x1 = c(13, 12, 4, 3, 16, 16, 15, 13, 1, 15, 10, 16, 1, 17,
7, 13, 12, 6, 8, 16, 16, 11, 7, 16, 5, 13, 12, 16, 17, 6, 16,
9, 14, 16, 15, 5, 7, 2, 8, 2, 9, 9, 15, 13, 9, 4, 16, 2, 11,
13, 11, 6, 4, 3, 7, 4, 12, 2, 16, 14, 3, 13, 10, 11, 10, 4, 11,
16, 8, 12, 14, 9, 4, 16, 16, 12, 9, 10, 6, 1, 3, 8, 7, 7, 5,
16, 17, 10, 4, 15, 10, 8, 3, 13, 9, 16, 12, 7, 4, 11), x2 = c(12,
19, 18, 19, 15, 12, 15, 16, 15, 11, 12, 16, 17, 14, 12, 17, 17,
16, 12, 20, 11, 11, 15, 14, 18, 10, 14, 13, 10, 14, 18, 18, 18,
17, 18, 14, 16, 19, 18, 16, 18, 14, 17, 10, 16, 12, 16, 15, 11,
18, 19, 15, 19, 11, 16, 10, 20, 14, 10, 12, 10, 15, 13, 15, 11,
20, 11, 12, 16, 16, 11, 15, 11, 11, 10, 10, 16, 11, 20, 17, 20,
17, 16, 11, 18, 19, 18, 14, 17, 11, 16, 11, 18, 14, 15, 16, 11,
14, 11, 13)), class = "data.frame", row.names = c(NA, -100L))
We can do something very similar to what is done in your linked answer. In particular, first we want a function that computes the AMEs at a given vector of coefficients. For that we can define
AME.fun <- function(betas) {
tmp <- ml.fit
tmp$coefficients <- betas
ME.mnl <- sapply(c.names, function(x)
effects(tmp, covariate = x, data = ml.d), simplify = FALSE)
c(sapply(ME.mnl, colMeans))
}
where the second half is yours, while in the first half I use a trick: I take the same ml.fit object and replace its coefficients. Next we find the Jacobian with
require(numDeriv)
grad <- jacobian(AME.fun, ml.fit$coef)
and apply the delta method: the square roots of the diagonal of grad %*% vcov(ml.fit) %*% t(grad) are what we want. Hence,
(AME.mnl.se <- matrix(sqrt(diag(grad %*% vcov(ml.fit) %*% t(grad))), nrow = 3, byrow = TRUE))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.003269320 0.004788536 0.004995723 0.004009762 0.002527462
# [2,] 0.004375795 0.006348496 0.008168883 0.008844684 0.005763966
# [3,] 0.009233616 0.014048212 0.014713090 0.012702188 0.008261734
AME.mnl / AME.mnl.se
# 1 2 3 4 5
# D -9.259050 -1.8389907 0.30847523 4.2861720 8.051269
# x1 -6.657611 -2.4808393 1.59847852 1.4969683 3.224159
# x2 -2.950794 -0.3902812 0.05828811 0.4197057 3.212458
which coincides with Stata's results.
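If you also want two-sided p-values to go with these t-values, one option (my addition, relying on asymptotic normality) is:
z <- AME.mnl / AME.mnl.se
2 * pnorm(-abs(z))  # asymptotic two-sided p-values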
If you use vce = "bootstrap" within the margins() function, it provides SEs with confidence intervals as well:
summary(margins(ml.fit2, vce = "bootstrap"))
The terminology for “marginal effects” is very inconsistent across
disciplines. Since you refer to the margins package, I assume that you
use the expression “Average Marginal Effects” in the same sense that the
margins developers used it, which is the result of this procedure:
1. Compute the slope of the outcome with respect to D for every row in the original dataset (unit-level marginal effects).
2. Take the average of the unit-level slopes (average marginal effect).
In models like nnet::multinom, the slopes will be different for every
level of the outcome variable. There will thus be one average marginal
effect per level, per regressor.
Using the marginaleffects package and the data you supplied, we get:
library(nnet)
library(marginaleffects)
mod <- nnet::multinom(Y ~ D + x1*x2, data=df1, trace = FALSE)
marginaleffects(mod) |> summary()
Group Term Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
1 1 D -0.027558 0.004183 -6.5878 4.4625e-11 -3.576e-02 -0.019359
2 1 x1 -0.026789 0.003916 -6.8411 7.8596e-12 -3.446e-02 -0.019114
3 1 x2 -0.026542 0.009812 -2.7051 0.00682871 -4.577e-02 -0.007311
4 2 D -0.012115 0.004702 -2.5766 0.00997729 -2.133e-02 -0.002899
5 2 x1 -0.018223 0.006017 -3.0287 0.00245619 -3.002e-02 -0.006430
6 2 x2 -0.007045 0.013101 -0.5377 0.59078427 -3.272e-02 0.018633
7 3 D 0.001536 0.005877 0.2614 0.79380433 -9.982e-03 0.013054
8 3 x1 0.012451 0.008775 1.4189 0.15592516 -4.748e-03 0.029650
9 3 x2 0.002193 0.015573 0.1408 0.88801728 -2.833e-02 0.032715
10 4 D 0.016300 0.004325 3.7689 0.00016399 7.823e-03 0.024776
11 4 x1 0.018111 0.008789 2.0606 0.03934167 8.845e-04 0.035338
12 4 x2 0.013543 0.013266 1.0208 0.30733424 -1.246e-02 0.039544
13 5 D 0.021837 0.003387 6.4479 1.1343e-10 1.520e-02 0.028475
14 5 x1 0.014449 0.005402 2.6749 0.00747469 3.862e-03 0.025037
15 5 x2 0.017851 0.009072 1.9677 0.04909878 7.048e-05 0.035631
Model type: multinom
Prediction type: probs
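For intuition, the same two-step procedure can be reproduced by hand from the fitted nnet model via finite differences. The helper name ame_by_hand and the step size eps below are my own choices, and this is only a rough sketch of what marginaleffects does more carefully:
# hand-rolled sketch of "unit-level slopes, then average" (helper name and eps are assumptions)
ame_by_hand <- function(model, data, variable, eps = 1e-5) {
  d_hi <- d_lo <- data
  d_hi[[variable]] <- d_hi[[variable]] + eps
  d_lo[[variable]] <- d_lo[[variable]] - eps
  # unit-level slope of each outcome probability (central finite difference)
  slopes <- (predict(model, newdata = d_hi, type = "probs") -
             predict(model, newdata = d_lo, type = "probs")) / (2 * eps)
  colMeans(slopes)  # one average marginal effect per outcome level
}
ame_by_hand(mod, df1, "D")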
Suppose I want to make a plot with the following data:
pairs <- c(1, 2, 2, 3, 2, 4, 2, 5, 2, 6, 2, 7, 2, 8, 2, 9, 2, 10, 2, 11, 4,
14, 4, 15, 6, 13, 6, 19, 6, 28, 6, 36, 7, 16, 7, 23, 7, 26, 7, 33,
7, 39, 7, 43, 8, 35, 8, 40, 9, 21, 9, 22, 9, 25, 9, 27, 9, 33, 9,
38, 10, 12, 10, 18, 10, 20, 10, 32, 10, 34, 10, 37, 10, 44, 10, 45,
10, 46, 11, 17, 11, 24, 11, 29, 11, 30, 11, 31, 11, 33, 11, 41, 11,
42, 11, 47, 14, 50, 14, 52, 14, 54, 14, 55, 14, 56, 14, 57, 14, 58,
14, 59, 14, 60, 14, 61, 15, 48, 15, 49, 15, 51, 15, 53, 15, 62, 15,
63)
library(igraph)
g <- graph(pairs)
plot(g, layout = layout.reingold.tilford)
I get a plot like the one below:
As you can see the spaces between some of the vertices are so small that these vertices overlap.
1. I wonder if there is a way to change the spacing between vertices.
2. In addition, is the spacing between vertices arbitrary? For example, vertices 3, 4, and 5 are very close to each other, while 5 and 6 are far apart.
EDIT:
For my 2nd question, I guess the spacing is dependent on the number of nodes below. E.g., 10 and 11 are farther from each other than 8 and 9 are because there are more children below 10 and 11 than there are below 8 and 9.
I bet there is a better solution, but I cannot find it. Here is my approach. Since a general width parameter seems to be missing, you have to adjust parameters manually to obtain the desired output.
My approach is primarily to resize some elements of the plot so they are the right size, and to adjust the margins to make the most of the available space. The most important parameter here is asp, which controls the aspect ratio of the plot (since in this case the plot is arguably better long than tall, an aspect ratio even below 0.5 is appropriate). Other tricks are to shrink the vertex size and the fonts. Here is the code:
plot( g, layout = layout.reingold.tilford,
edge.width = 1,
edge.arrow.width = 0.3,
vertex.size = 5,
edge.arrow.size = 0.5,
vertex.size2 = 3,
vertex.label.cex = 1,
asp = 0.35,
margin = -0.1)
That produces this plot:
Another approach would be to send the output to a graphics device such as PDF (or JPEG, etc.) and set rescale to FALSE. With the RStudio viewer this cut off a huge piece of the plot, but with other graphics devices it might (no guarantee) work well.
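A sketch of that idea (the file name and page dimensions are my own choices; with rescale = FALSE you usually also need to pass the layout's coordinate ranges as xlim/ylim so nothing is cut off):
lay <- layout.reingold.tilford(g)
pdf("tree.pdf", width = 14, height = 5)  # a wide page suits this shallow, broad tree
plot(g, layout = lay, rescale = FALSE,
     xlim = range(lay[, 1]), ylim = range(lay[, 2]),
     asp = 0,                            # let the device's shape set the aspect ratio
     vertex.size = 5, vertex.label.cex = 0.8)  # vertex sizes may need further tweaking with rescale = FALSE
dev.off()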
Anyway, for any doubt about how to use these parameters (which can be very tricky at times), type help(igraph.plotting).
For the second part of the question I am not sure; looking inside the function I cannot figure out a precise answer, but I guess that the spacing between elements on the same level is calculated from the child elements they have, so vertices whose subtrees contain children and sub-children require more space, which is presumably why 3, 4 and 5 can sit so close together.
I want to calculate a mean. Here is the code with sample data:
# sample data
Nr <- c(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
dph <- c(3.125000, 6.694737, 4.310680, 11.693735, 103.882353, 11.000000, 7.333333, 20.352941, 5.230769, NA, 4.615385, 47.555556, 2.941176, 18.956522, 44.320000, 28.500000, NA, 10.470588, 19.000000, 25.818182, 43.216783, 51.555556, 8.375000, 6.917647, 9.375000, 5.647059, 4.533333, 27.428571, 14.428571, NA, 1.600000, 5.764706, 4.705882, 55.272727, 2.117647, 30.888889, 41.222222, 23.444444, 2.428571, 6.200000, 17.076923, 21.280000, 40.829268, 14.500000, 6.250000, NA, 15.040000, 5.687204, 2.400000, NA, 26.375000, 18.064516, 4.000000, 6.139535, 8.470588, 128.666667, 2.235294, 34.181818, 116.000000, 6.000000, 5.777778, 10.666667, 15.428571, 54.823529, 81.315789, 42.333333)
dat <- data.frame(Nr = Nr, dph = dph)
# calculate mean directly
mean(dat$dph, na.rm = TRUE)
[1] 23.02403
# aggregate first, then calculate mean
mean(aggregate(dph ~ Nr, dat, mean, na.rm = T)$dph)
[1] 22.11743
# 23.02403 != 22.11743
Why do I get two different results?
Explanation for question:
I need to perform a Wilcoxon test, comparing a pre baseline with a post baseline. Pre is 3 measurements, post is 16. Because a Wilcoxon test needs two vectors of equal length, I calculate means for pre and post for each patient with aggregate, creating two vectors of equal length. Above data is pre.
Edit:
Patient no. 4 was removed from the data. But using Nr <- rep(1:22, 3) returns the same results.
I think this is because in the mean(dat$dph, na.rm = TRUE) version each removed NA reduces the number of observations by 1, whereas if you aggregate first, the NA in row 10 (ID 11) is removed, but since the other rows with ID 11 do not contain NAs (or at least one of them doesn't), the number of observations you use to calculate the mean after aggregation (the unique IDs) is not reduced by 1 for each NA. So the difference, in my opinion, comes from dividing the sum of dph, which should be the same in both calculations, by different numbers of observations.
You can verify this by changing the NA entries to 0 and then calculating the mean again with both versions; they will return the same result.
But note that this check only works here because you have the same number of observations for each ID (3 in this case). If the group sizes differed, you would again get different results.
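A quick check of this explanation (my own addition): the direct mean divides by the 61 non-NA rows, while the aggregated version averages the 22 per-patient means with equal weight.
sum(dat$dph, na.rm = TRUE) / sum(!is.na(dat$dph))  # 23.02403, sum divided by 61 non-NA rows
agg <- aggregate(dph ~ Nr, dat, mean, na.rm = TRUE)
mean(agg$dph)                                      # 22.11743, unweighted mean of 22 per-ID means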
How can I calculate the mean of the top 4 observations in my column?
c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
For instance, in the above I would have (50+60+50+60)/4 = 55. I only know how to use quantile(), but that does not work for this.
Any ideas?
Since you're interested in only the top 4 items, you can use a partial sort instead of a full sort. If your vector is huge, this might save quite some time:
x <- c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
idx <- seq(length(x)-3, length(x))
mean(sort(x, partial=idx)[idx])
# [1] 55
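If you want to check the speed claim yourself, a rough comparison might look like this (the vector size is arbitrary; actual timings depend on your machine):
y <- rnorm(1e7)  # a large vector so the difference becomes visible
idx <- seq(length(y) - 3, length(y))
system.time(mean(sort(y, partial = idx)[idx]))      # partial sort
system.time(mean(sort(y, decreasing = TRUE)[1:4]))  # full sort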
Try this:
vec <- c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
mean(sort(vec, decreasing=TRUE)[1:4])
gives
[1] 55
Maybe something like this:
v <- c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
mean(head(sort(v,decreasing=T),4))
First, you sort the vector so that the largest values come first. Then head() takes the first 4 values of that vector, and mean() averages them.
To be different! Also, please try to do some research on your own before posting.
x <- c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
mean(tail(sort(x), 4))
Just to show that you can use quantile in this exercise:
mean(quantile(x,1-(0:3)/length(x),type=1))
#[1] 55
However, the other answers are clearly more efficient.
You could use the order function. Order by -x to give the values in descending order, and just average the first 4:
x <- c(12, 13, 15, 1, 5, 9, 34, 50, 60, 50, 60, 4, 6, 8, 12)
mean(x[order(-x)][1:4])
[1] 55
I'm a student working on an epidemiology model in R, using maximum likelihood methods. I created my negative log likelihood function. It's sort of gross looking, but here it is:
NLLdiff = function(v1, CV1, v2, CV2, st1 = (czI01 - czV01), st2 = (czI02 - czV02), st01 = czI01, st02 = czI02, tt1 = czT01, tt2 = czT02) {
prob1 = (1 + v1 * CV1 * tt1)^(-1/CV1)
prob2 = ( 1 + v2 * CV2 * tt2)^(-1/CV2)
-(sum(dbinom(st1, st01, prob1, log = T)) + sum(dbinom(st2, st02, prob2, log = T)))
}
The reason the first line looks so awful is that most of the data it uses is supplied there as default arguments. czI01, for example, is already defined. I did this simply so that my later calls to the function don't all have to carry those awful vectors.
I then optimized over CV1, CV2, v1 and v2 using mle2 (from the bbmle package). That call is also a bit gross looking:
ml.cz.diff = mle2 (NLLdiff, start=list(v1 = vguess, CV1 = cguess, v2 = vguess, CV2 = cguess), method="L-BFGS-B", lower = 0.0001)
Now, everything works fine up until here. ml.cz.diff gives me values that I can turn into a plot that reasonably fits my data. I also have several different models and can get AICc values to compare them. However, when I try to get confidence intervals around v1, CV1, v2 and CV2 I have problems. Basically, I get a negative bound on CV1 (which is impossible, since it actually represents a squared quantity in the biological model), as well as some warnings.
Is there a better way to get confidence intervals? Or, really, a way to get confidence intervals that make sense here?
What I see happening is that, by coincidence, my Hessian matrix is singular for some values in the optimization space. But since I'm optimizing over 4 variables and don't have overly extensive programming knowledge, I can't come up with a good optimization method that doesn't rely on the Hessian. I have googled the problem; the suggestion was that my model is bad, but I'm reconstructing earlier work which suggests that my model really isn't awful (the plots I make using ml.cz.diff look like the plots from the original work). I have also read the relevant parts of the manual as well as Bolker's book Ecological Models and Data in R. I have also tried different optimization methods, which resulted in longer run times but the same errors. The "SANN" method didn't finish running within an hour, so I didn't wait around to see the result.
In a nutshell: my confidence intervals are bad. Is there a relatively straightforward way to fix them in R?
My vectors are:
czT01 = c(5, 5, 5, 5, 5, 5, 5, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50, 50)
czT02 = c(5, 5, 5, 5, 5, 10, 10, 10, 10, 10, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 75, 75, 75, 75, 75)
czI01 = c(25, 24, 22, 22, 26, 23, 25, 25, 25, 23, 25, 18, 21, 24, 22, 23, 25, 23, 25, 25, 25)
czI02 = c(13, 16, 5, 18, 16, 13, 17, 22, 13, 15, 15, 22, 12, 12, 13, 13, 11, 19, 21, 13, 21, 18, 16, 15, 11)
czV01 = c(1, 4, 5, 5, 2, 3, 4, 11, 8, 1, 11, 12, 10, 16, 5, 15, 18, 12, 23, 13, 22)
czV02 = c(0, 3, 1, 5, 1, 6, 3, 4, 7, 12, 2, 8, 8, 5, 3, 6, 4, 6, 11, 5, 11, 1, 13, 9, 7)
and I get my guesses by:
v = -log((c(czI01, czI02) - c(czV01, czV02))/c(czI01, czI02))/c(czT01, czT02)
vguess = mean(v)
cguess = var(v)/vguess^2
It's also possible that I'm doing something else completely wrong, but my results seem reasonable so I haven't caught it.
You could change the parameterization so that the constraints are always satisfied. Rewrite the likelihood as a function of ln(CV1) and ln(CV2); that way you can be sure that CV1 and CV2 remain strictly positive.
NLLdiff_2 = function(v1, lnCV1, v2, lnCV2, st1 = (czI01 - czV01), st2 = (czI02 - czV02), st01 = czI01, st02 = czI02, tt1 = czT01, tt2 = czT02) {
prob1 = (1 + v1 * exp(lnCV1) * tt1)^(-1/exp(lnCV1))
prob2 = ( 1 + v2 * exp(lnCV2) * tt2)^(-1/exp(lnCV2))
-(sum(dbinom(st1, st01, prob1, log = T)) + sum(dbinom(st2, st02, prob2, log = T)))
}
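A sketch of how the reparameterized function might then be fitted and the intervals mapped back to the CV scale. The starting values reuse vguess and cguess from the question; log-transforming them, the named lower bounds, and exponentiating the confidence limits are my assumptions:
library(bbmle)
ml.cz.diff2 <- mle2(NLLdiff_2,
                    start = list(v1 = vguess, lnCV1 = log(cguess),
                                 v2 = vguess, lnCV2 = log(cguess)),
                    method = "L-BFGS-B",
                    lower = c(v1 = 1e-4, lnCV1 = -Inf, v2 = 1e-4, lnCV2 = -Inf))
ci <- confint(ml.cz.diff2)      # profile confidence intervals on the working scale
exp(ci[c("lnCV1", "lnCV2"), ])  # back-transform to the CV scale; endpoints stay positive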