How can I display all my model predicted values using whisker plots? - r

I'm working with a linear mixed model with sex and diel (day/night) as my predictors and depth displacement as my response in R. Here is the model:
displacement_lmm_hour <- lmer(Displacement~sex*Light + (1|Hour), data = avg_depth_df_hour)
I want to create a whisker plot displaying the predicted value for each level of my predictors from the model. So, I tried using dwplot() from the dotwhisker package in R.
dwplot(displacement_lmm_hour, effects = "fixed")
This is what it came out with:
As you can see, it only shows the first 'set' (if you will) of predicted values, i.e., there are no male or daytime values shown. I realize this comes from the model itself, since the summary() table of the model only shows those terms as well. But how can I show the 'hidden' predicted values that also come from the model?
I also tried using plot_model, which allowed me to separate my predicted values, but I don't think the error bars are correct (which is why I tried the whisker plots instead).
plot_model(displacement_lmm_hour, type = "pred", terms = c("sex", "Light"), axis.title = c("Sex", "Displacement"))
Do you have an idea how to accomplish this using the dwplot function? Or another way to accomplish this in general?
Thanks!
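One way to get an estimate and interval for every sex-by-Light combination, rather than only the contrasts that summary() and dwplot() report, is to predict over a grid of the factor levels using the fixed effects alone. The following is a minimal sketch on simulated data: sim_df and its effect sizes are made up and only stand in for avg_depth_df_hour, and the intervals are approximate Wald intervals (estimate ± 1.96 SE) that ignore uncertainty in the random-effect variance.

```r
library(lme4)
library(ggplot2)

# Simulated stand-in for avg_depth_df_hour (hypothetical data, illustration only)
set.seed(1)
sim_df <- expand.grid(sex = c("F", "M"), Hour = 0:23, rep = 1:5)
sim_df$Light <- factor(ifelse(sim_df$Hour >= 6 & sim_df$Hour < 18, "Day", "Night"))
hour_eff <- rnorm(24, sd = 0.5)
sim_df$Displacement <- 1 + 0.5 * (sim_df$sex == "M") -
  0.8 * (sim_df$Light == "Night") +
  hour_eff[sim_df$Hour + 1] + rnorm(nrow(sim_df))

fit <- lmer(Displacement ~ sex * Light + (1 | Hour), data = sim_df)

# One row per combination of factor levels
grid <- expand.grid(sex = levels(sim_df$sex), Light = levels(sim_df$Light))

# Fixed-effects design matrix for the grid, then fit and standard error
X <- model.matrix(~ sex * Light, data = grid)
V <- as.matrix(vcov(fit))
grid$fit   <- as.numeric(X %*% fixef(fit))
grid$se    <- sqrt(diag(X %*% V %*% t(X)))
grid$lower <- grid$fit - 1.96 * grid$se
grid$upper <- grid$fit + 1.96 * grid$se

# Whisker plot showing all four predicted means
p <- ggplot(grid, aes(x = sex, y = fit, colour = Light)) +
  geom_pointrange(aes(ymin = lower, ymax = upper),
                  position = position_dodge(width = 0.3)) +
  labs(x = "Sex", y = "Predicted displacement")
print(p)
```

The grid data frame holds the numbers behind the plot, so the intervals can be inspected directly and compared with what plot_model draws.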

Related

How do you fix incorrect predicted values ("pred") being used in the plot_model function in R?

I am trying to fix a plot that I generated that has the incorrect predicted values along the y-axis. The predicted values should be "Score," my outcome variable. For some reason, the "id" variable is along the y-axis instead. Everything else is plotted correctly. I checked my model, and I can't see where the issue is coming from. I will post my regression model syntax and plot syntax below. The model is a multivariate regression with two outcomes. "scale" labels each of those two outcomes which are indicated by the Score variable. Both predictors are two-level categorical variables, and there is also an interaction between them.
If anyone has any ideas, I would greatly appreciate it!
multireg4 <- gls(Score ~ 0 + scale + scale:DrinkStat + scale:ACEHx + scale:DrinkStat:ACEHx,
data = R34_Long,
na.action = na.omit,
weights = varIdent(form = ~ 1 | scale),
correlation = corSymm(form = ~ as.numeric(scale) | id))
plot_model(multireg4, type = "pred", terms = c("DrinkStat", "ACEHx", "scale")) + theme_sjplot2()
here is an image of the plot
I've tried adding limits to the scale variable in the plot_model code, but the issue is that the wrong predicted values are being pulled. This seems like an automatic function, so I am not exactly sure where to edit the syntax (in the model regression or the plot) in order to get R to use the correct predicted values.

GAM residuals missing in plot

I am applying a GAM model to my data: cell abundance over time.
The model works just fine (although I am aware of a pattern in my residuals, that is a different issue and not relevant here).
It just fails to display the partial residuals in the final plot, although I set residuals = TRUE. Here is my output:
https://i.stack.imgur.com/C1MlY.png
I used the mgcv package.
Previously this code worked as I wanted, but on different data. Any ideas on why it is not working are welcome!
GAM_EA <- mgcv::gam(EUB_FISH ~ s(Day, by = Heatwave), data = HnH, method = "REML")
gam.check(GAM_EA) #Checking the model
mgcv::anova.gam(GAM_EA) #Retrieving the statistical results. See ?anova.gam
summary.gam(GAM_EA)
plot(GAM_EA, shift = coef(GAM_EA)[1], residuals = TRUE)
See the argument by.resid in ?plot.gam. The way partial residuals are used in plot.gam would be meaningless for factor by terms unless you were to subset the partial residuals and plot only the residuals for observations in the specific level of the by factor.
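The subsetting described above can be sketched by hand: take each smooth's contribution from predict(type = "terms"), add the working residuals, and keep only the rows belonging to that smooth's level of the by factor. This is a rough illustration on simulated data (sim and its values are made up, standing in for HnH), not a description of what plot.gam does internally.

```r
library(mgcv)

# Simulated stand-in for HnH (hypothetical data, illustration only)
set.seed(42)
sim <- data.frame(
  Day = rep(1:30, times = 2),
  Heatwave = factor(rep(c("control", "heatwave"), each = 30))
)
sim$EUB_FISH <- 10 +
  sin(sim$Day / 5) * ifelse(sim$Heatwave == "heatwave", 2, 1) +
  rnorm(nrow(sim), sd = 0.3)

fit <- gam(EUB_FISH ~ s(Day, by = Heatwave), data = sim, method = "REML")

terms_mat <- predict(fit, type = "terms")      # one column per smooth
work_res  <- residuals(fit, type = "working")  # working residuals

partial_by_level <- list()
op <- par(mfrow = c(1, length(fit$smooth)))
for (i in seq_along(fit$smooth)) {
  sm <- fit$smooth[[i]]
  in_level <- sim$Heatwave == sm$by.level      # rows for this factor level
  partial <- terms_mat[in_level, sm$label] + work_res[in_level]
  partial_by_level[[sm$by.level]] <- partial

  # Draw the smooth without residuals, then overlay the subsetted ones
  plot(fit, select = i, residuals = FALSE,
       ylim = range(partial), main = sm$by.level)
  points(sim$Day[in_level], partial, pch = 16, cex = 0.6)
}
par(op)
```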

Coefficient Plot in r for mixed model

I have fitted a three-level model of political trust using multiple waves of survey data: individuals nested in country-waves nested in countries. Now that I have my results, I want to present them in a coefficient plot. I produced the plot with the plot_model function from sjPlot, shown below. I am only interested in presenting the higher-level variables (growth, inflation, unemployment, corruption), as the individual-level variables are controls, but the plot includes all the predictors. I also want to edit the names of certain variables so it's clearer. How can I do this? I don't mind suggestions using ggplot or the base R coefplot function.
plot_model(fullmodel, transform = NULL, show.values = TRUE)
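One option that sidesteps plot_model entirely is to pull the fixed effects out of the model, keep only the terms of interest, relabel them, and draw the plot with ggplot2. The sketch below uses simulated data and a simpler two-level model; the variable names, labels, and effect sizes are hypothetical, and the whiskers are approximate Wald intervals (estimate ± 1.96 SE).

```r
library(lme4)
library(ggplot2)

# Simulated stand-in for the survey data (hypothetical, illustration only)
set.seed(7)
n_country <- 20
n_per <- 50
sim <- data.frame(country = factor(rep(seq_len(n_country), each = n_per)))
sim$growth     <- rep(rnorm(n_country), each = n_per)  # country-level
sim$corruption <- rep(rnorm(n_country), each = n_per)  # country-level
sim$age        <- rnorm(nrow(sim))                     # individual-level control
sim$trust <- 0.4 * sim$growth - 0.6 * sim$corruption + 0.1 * sim$age +
  rep(rnorm(n_country, sd = 0.5), each = n_per) + rnorm(nrow(sim))

fit <- lmer(trust ~ growth + corruption + age + (1 | country), data = sim)

# Named vector: names are model terms to keep, values are display labels
keep <- c(growth = "GDP growth", corruption = "Perceived corruption")

beta <- fixef(fit)[names(keep)]
se   <- sqrt(diag(as.matrix(vcov(fit))))[names(keep)]

coef_df <- data.frame(
  term     = factor(unname(keep), levels = rev(unname(keep))),
  estimate = unname(beta),
  lower    = unname(beta - 1.96 * se),
  upper    = unname(beta + 1.96 * se)
)

p <- ggplot(coef_df, aes(x = term, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange(aes(ymin = lower, ymax = upper)) +
  coord_flip() +
  labs(x = NULL, y = "Estimate")
print(p)
```

Because the control variable age is not listed in keep, it is still adjusted for in the model but left out of the plot.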

What are the differences between directly plotting the fit function and plotting the predicted values (they have the same shape but different ranges)?

I am trying to learn gam() in R for a logistic regression using a spline on a predictor. The two plotting methods in my code give the same shape but different ranges of the response on the logit scale; it seems like an intercept is missing in one. Both are supposed to be correct, so why do the ranges differ?
library(ISLR)
attach(Wage)
library(gam)
gam.lr = gam(I(wage >250) ~ s(age), family = binomial(link = "logit"), data = Wage)
agelims = range(age)
age.grid = seq(from = agelims[1], to = agelims[2])
pred=predict(gam.lr, newdata = list(age = age.grid), type = "link")
par(mfrow = c(2,1))
plot(gam.lr)
plot(age.grid, pred)
I expected that both methods would give the exact same plot. plot(gam.lr) plots the additive effect of each component, and since there is only one component here, it is supposed to give the predicted logit function. The predict method is also giving me estimates on the link scale. But the actual outputs are on different ranges. The minimum value of the first method is -4 while that of the second is less than -7.
The first plot is of the estimated smooth function s(age) only. Smooths are subject to identifiability constraints because, in the basis expansion used to parametrise the smooth, there is a function or combination of functions that is entirely confounded with the intercept. As such, you can't fit an unconstrained smooth and an intercept in the same model: you could subtract some value from the intercept and add it back to the smooth, and you would have the same fit but different coefficients. As you can add and subtract an infinity of values, you have an infinite supply of equivalent models, which isn't helpful.
Hence identifiability constraints are applied to the basis expansions, and the most useful one is to require that the smooth sums to zero over the observed values of the covariate. This centres the smooth at 0, with the intercept then representing the overall level of the response on the link scale.
So, the first plot is of the smooth, subject to this sum to zero constraint, so it straddles 0. The intercept in this model is:
> coef(gam.lr)[1]
(Intercept)
-4.7175
If you add this to values in this plot, you get the values in the second plot, which is the application of the full model to the data you supplied, intercept + f(age).
This is all also happening on the link scale, the log odds scale, hence all the negative values.
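The relationship intercept + f(age) can be checked numerically. The sketch below makes two assumptions that differ from the question: it uses mgcv rather than the gam package, and it uses simulated data (sim, with made-up values) rather than ISLR's Wage, so the numbers will not match those above. It extracts the centred smooth with predict(type = "terms"), adds the intercept, and compares the result with predict(type = "link").

```r
library(mgcv)

# Simulated binary outcome (hypothetical data, illustration only)
set.seed(123)
n <- 500
sim <- data.frame(age = runif(n, 18, 80))
sim$high_wage <- rbinom(n, 1, plogis(-3 + 2 * sin((sim$age - 18) / 20)))

fit <- gam(high_wage ~ s(age), family = binomial, data = sim, method = "REML")

age_grid <- data.frame(age = seq(18, 80, by = 1))

# Centred smooth only: this is what plot(fit) draws
smooth_only <- predict(fit, newdata = age_grid, type = "terms")[, "s(age)"]

# Full linear predictor on the link (log-odds) scale
link_pred <- as.numeric(predict(fit, newdata = age_grid, type = "link"))

# Adding the intercept back to the smooth should recover the link prediction
rebuilt <- unname(coef(fit)[1]) + as.numeric(smooth_only)
print(all.equal(rebuilt, link_pred))

# The smooth averages to (numerically) zero over the observed ages
smooth_at_data <- predict(fit, type = "terms")[, "s(age)"]
print(mean(smooth_at_data))
```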

visreg package for R: conditional plots

The visreg package in R can produce plots for various regression models. When creating conditional plots for each predictor, the other predictors are — by default — held at their median values, although this value can be changed by the user. In the documentation, an example is given (Fig. 5) that shows the effect of choosing values other than the median. The model's predictions change depending on the chosen value, as do the data that are plotted. My question is this: how are the data transformed between these plots? Are they simply adjusted according to the model?
The conditional plot shows partial residuals, not the original data as I had assumed. Consequently, the points that are plotted are determined according to the equation given in that URL.
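To see why the plotted points move when the conditioning value changes, the partial residuals can be computed by hand. The sketch below assumes the usual definition (the model's prediction for each observation with the other predictor fixed at the conditioning value, plus that observation's residual) and uses base R with the built-in airquality data rather than calling visreg itself; the helper partial_at is hypothetical.

```r
# Complete cases from the built-in airquality data
aq <- na.omit(airquality[, c("Ozone", "Wind", "Temp")])
fit <- lm(Ozone ~ Wind + Temp, data = aq)

# Hypothetical helper: partial residuals for Wind with Temp held at temp_value
partial_at <- function(temp_value) {
  cond <- transform(aq, Temp = temp_value)
  as.numeric(predict(fit, newdata = cond) + residuals(fit))
}

p_median <- partial_at(median(aq$Temp))  # conditioning on the median
p_cold   <- partial_at(50)               # conditioning on Temp = 50

# In this additive linear model every point shifts by the same amount
shift <- unname(coef(fit)["Temp"]) * (50 - median(aq$Temp))

plot(aq$Wind, p_median, pch = 16,
     ylim = range(c(p_median, p_cold)),
     xlab = "Wind", ylab = "Partial residual (Ozone)")
points(aq$Wind, p_cold, pch = 1, col = "blue")
legend("topright", legend = c("Temp = median", "Temp = 50"),
       pch = c(16, 1), col = c("black", "blue"))
```

In a model with interactions or nonlinear terms the shift would no longer be the same for every point, but the points would still be adjusted according to the fitted model.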
