As the dependent variable, I have a data frame with 0s and 1s (using a certain product or not). As independent variables, I have a set of data frames with categorical variables (living in a brick house, etc.). I plot the logistic regression using ggplot:
g <- ggplot(decision, aes(x = decision_point, y = use)) +
  geom_point(alpha = .1, size = 2, col = "red") +
  geom_smooth(method = "glm",
              method.args = list(family = "binomial"),
              aes(x = as.numeric(decision_point)),
              se = FALSE)
What happens is that it plots a straight line. It seems the categorical variable is converted to numeric codes (as I wrote) and the line is simply fitted through them.
But if I don't use as.numeric, no line shows at all.
What can I do? The line should be a curve. If the independent variables were incremental numeric values, like 0-100, plotting a curve would be easy. But they are categorical variables, like "Brick House", "Hut", "Others". Hence the problem. Thank you in advance.
Related
I am trying to fix a plot that I generated that has the incorrect predicted values along the y-axis. The predicted values should be "Score," my outcome variable. For some reason, the "id" variable is along the y-axis instead. Everything else is plotted correctly. I checked my model, and I can't see where the issue is coming from. I will post my regression model syntax and plot syntax below. The model is a multivariate regression with two outcomes. "scale" labels each of those two outcomes which are indicated by the Score variable. Both predictors are two-level categorical variables, and there is also an interaction between them.
If anyone has any ideas, I would greatly appreciate it!
multireg4 <- gls(Score ~ 0 + scale + scale:DrinkStat + scale:ACEHx + scale:DrinkStat:ACEHx,
                 data = R34_Long,
                 na.action = na.omit,
                 weights = varIdent(form = ~ 1 | scale),
                 correlation = corSymm(form = ~ as.numeric(scale) | id))
plot_model(multireg4, type = "pred", terms = c("DrinkStat", "ACEHx", "scale")) + theme_sjplot2()
here is an image of the plot
I've tried adding limits to the scale variable in the plot_model code, but the issue is that the wrong predicted values are being pulled. This seems like an automatic function, so I am not exactly sure where to edit the syntax (in the model regression or the plot) in order to get R to use the correct predicted values.
I have created a multiple linear regression model in R called modelP. I want to plot this in ggplot, but can't work out how. The model is:
modelP <- lm(House_Price ~ FactorA + FactorB + FactorC, data=df)
I have plotted individual linear regression lines for each factor, but I want to create a graph of the model combining all factors. I would like to do something like this:
ggplot(df)+
geom_smooth(aes(y = House_Price, x = FactorA + FactorB + FactorC))
Or
ggplot(modelP$model)+
geom_smooth(method="lm")
But neither approach seems like it will work. I would really appreciate any help.
I've plotted the response curves for each of my predictors against all of the predicted values to determine how each predictor influences my counts. However, I also want to plot the binary part of my zero-inflated model to see how its predictors help explain the probability of false zeroes. I am trying to get a plot similar to the one at the bottom of the page linked below; however, that example doesn't provide reproducible code.
https://fukamilab.github.io/BIO202/04-C-zero-data.html#sketch_fitted_and_predicted_values
I've included some code below where I have my zero-inflated model and the predictors used. I then use the predict function to predict the estimates for a much larger raster grid (new.data) and I want to see the response between those predicted values and the predictors I use across the entire raster grid.
mod1 = zeroinfl(Response~x1+x2|x1,link ="logit",data=data,
dist="negbin")
modpred=predict(mod1, new.data, se.fit=T, type = "response")
response1 <- ggplot(data, aes(x = x1, y = modpred)) + geom_point() +
  geom_smooth(data = data, aes(x = x1, y = modpred))
I'm working with a data set that involves a response variable (mass), a covariate (length), and a categorical independent variable (location) of a certain bird species. This data set looks like this:
In this case there are three locations (Orlando, Tampa, and another Floridian city) with ten observations each. I decided to run an ANCOVA using the lm function.
m1<- lm(mass.d ~ length.d + location)
I was able to plot this model with ggplot:
predm1<- predict(m1)
ggplot(YOUR_DATA, aes(length.d, mass.d, color = pop)) + geom_point() +
geom_line(aes(y= predm1))
Now I would like to add my 95 percent confidence intervals.
I have the confidence intervals but I have very little idea on how to input them into my ggplot graph because I essentially have three different slopes for 3 different locations. I've been playing around with geom_ribbon but have not been able to find an answer.
I'm trying to use R to do some modelling. I've started with the BodyWeight dataset, since I've seen some examples online, just to understand and get used to the commands.
I've arrived at my final model, with estimates, and I was wondering how to plot these estimates, but I haven't found anything online.
Is there a way to plot the values of the estimates with a line, and dots for the values of each observation?
Where can I find information about how to do this? Do I have to extract the values myself, or is it possible to plot the estimates of the model directly?
I'm only starting with R. Any help is welcome.
Thank you
There is no function that just plots the output of a model, since there are usually many different possible ways of plotting the output.
Take a look at the predict function for whatever model type you are using (for example, linear regressions using lm have a predict.lm function).
Then choose a plotting system (you will likely want different panels for different levels of diet, so use either ggplot2 or lattice). Then see if you can describe more clearly in words how you want the plot to look. Then update your question if you get stuck.
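As a rough illustration of the predict step, here is a minimal sketch using lm on the built-in mtcars data (an assumption chosen only for illustration; it is not the data from the question). It predicts over a grid of values, asks predict.lm for confidence limits, and draws one panel per factor level.

```r
library(ggplot2)

# Illustrative data only: built-in mtcars, with cyl treated as a factor
cars <- transform(mtcars, cyl = factor(cyl))
fit <- lm(mpg ~ wt + cyl, data = cars)

# Grid of predictor values to predict over (50 weights x 3 cylinder levels)
grid <- expand.grid(
  wt  = seq(min(cars$wt), max(cars$wt), length.out = 50),
  cyl = levels(cars$cyl)
)

# predict() dispatches to predict.lm(); interval = "confidence"
# returns columns fit, lwr and upr
grid <- cbind(grid, predict(fit, newdata = grid, interval = "confidence"))

# Observed points, fitted line and confidence band, one panel per level
p <- ggplot(cars, aes(x = wt)) +
  geom_point(aes(y = mpg)) +
  geom_ribbon(data = grid, aes(ymin = lwr, ymax = upr), alpha = 0.2) +
  geom_line(data = grid, aes(y = fit)) +
  facet_wrap(~ cyl)
p
```

The same pattern should carry over to other model types that have a predict method, although the arguments for intervals differ between them.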
Now we've identified which dataset you are using, here's a possible plot:
#Load the packages (lme and the BodyWeight data come from nlme)
library(nlme)
library(ggplot2)
#Run your model
model <- lme(weight ~ Time + Diet, BodyWeight, ~ 1 | Rat)
summary(model)
#Predict the values
#By default predict.lme gives rat-level predictions, so you have to
#specify which rat you are interested in, but we don't want that
#Manually predicting from the fixed effects instead
times <- seq.int(0, 65, 0.1)
mcf <- model$coefficients$fixed
predicted <-
  mcf["(Intercept)"] +
  rep.int(mcf["Time"] * times, nlevels(BodyWeight$Diet)) +
  rep(c(0, mcf["Diet2"], mcf["Diet3"]), each = length(times))
prediction_data <- data.frame(
  weight = predicted,
  Time = rep.int(times, nlevels(BodyWeight$Diet)),
  Diet = rep(levels(BodyWeight$Diet), each = length(times))
)
#Draw the plot (using ggplot2)
(p <- ggplot(BodyWeight, aes(Time, weight, colour = Diet)) +
  geom_point() +
  geom_line(data = prediction_data)
)
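As an alternative to the manual calculation, predict.lme can be asked for population-level predictions with level = 0, in which case newdata should not need a Rat column. A sketch under that assumption, reusing the same model and variable names as above:

```r
library(nlme)
library(ggplot2)

# Same model as above, with the arguments named explicitly
model <- lme(weight ~ Time + Diet, data = BodyWeight, random = ~ 1 | Rat)

# Grid of Time x Diet values; no Rat column because level = 0 is used
prediction_data <- expand.grid(
  Time = seq(0, 65, by = 0.1),
  Diet = levels(BodyWeight$Diet)
)

# level = 0 requests predictions from the fixed effects only
prediction_data$weight <- as.numeric(
  predict(model, newdata = prediction_data, level = 0)
)

# Observations as points, fixed-effect predictions as one line per diet
p <- ggplot(BodyWeight, aes(Time, weight, colour = Diet)) +
  geom_point() +
  geom_line(data = prediction_data)
p
```

This should give the same lines as the manual version, and it avoids hard-coding the coefficient names if the model formula changes.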