Sorting the x axes in R - r

I built a logistic regression model (called 'mylogit') using the glm function in R as follows:
mylogit <- glm(answer ~ as.factor(gender) + age, data = mydata, family = "binomial")
where age is numeric and gender is categorical (male and female).
I then proceeded to make predictions with the model built.
pred <- predict(mylogit, type = "response")
I can easily make a time series plot of the predictions by doing:
plot.ts(ts(pred))
to give a plot that looks like this:
[Figure: plot of time against predictions]
My question is this:
Is it possible to put the x axis in segments according to gender (male or female) which was specified in the glm? In other words, can I have predictions on the y axis and have gender (divided into male and female) on the x axis?
To get a sample of the data I want to plot, I combined the original data with the predictions:
bind = cbind(mydata, pred)
'bind' looks like this:
pred age gender
0.9461198 32 male
0.9463577 45 female
0.9461198 45 female
0.9461198 37 female
0.9477645 40 male
0.8304513 32 female

Check out #4 on this blog post, "4. How To Create Two Different X- or Y-axes".
My suggestion to you is that you look at some of the dedicated R plotting tools, like ggplot2.

I don't think you need to use ts and plot.ts because the data you have is not a time series, right? Just order the rows by gender before plotting, so that each gender occupies its own stretch of the x axis.
# Get data
str <- "pred,age,gender
0.9461198,32,male
0.9463577,45,female
0.9461198,45,female
0.9461198,37,female
0.9477645,40,male
0.8304513,32,female"
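# stringsAsFactors = TRUE keeps gender a factor; since R 4.0 strings are read as character, which col = cannot use
bind <- read.csv(textConnection(str), stringsAsFactors = TRUE)
# Base R: order by gender so each group is contiguous on the x axis, coloured by gender
bind <- bind[order(bind$gender),]
plot(bind$pred, col = bind$gender)
# ggplot2: gender on the x axis, predictions on the y axis, points jittered horizontally
library(ggplot2)
ggplot(bind, aes(x = gender, y = pred)) +
geom_point(position = position_jitter(width = .3))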
Or without creating bind you could do plot(pred[order(mydata$gender)]).
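If you prefer to stay in base R, a categorical x axis can also be had with boxplot() and stripchart(). The following is only a sketch: mydata here is simulated and purely hypothetical, standing in for the real data, and the coefficients used to simulate answer are arbitrary.

```r
# Hypothetical data standing in for mydata (simulated for illustration only)
set.seed(1)
n <- 60
mydata <- data.frame(
  gender = sample(c("male", "female"), n, replace = TRUE),
  age    = sample(30:50, n, replace = TRUE)
)
mydata$answer <- rbinom(n, 1, plogis(1 + 0.02 * mydata$age))

# Same model and predictions as in the question
mylogit <- glm(answer ~ as.factor(gender) + age, data = mydata, family = "binomial")
pred <- predict(mylogit, type = "response")

# One box per gender: categorical x axis, predicted probability on the y axis
boxplot(pred ~ mydata$gender, xlab = "Gender", ylab = "Predicted probability")
# Overlay the individual predictions, jittered so they do not overlap
stripchart(pred ~ mydata$gender, vertical = TRUE, method = "jitter",
           pch = 16, add = TRUE)
```

The boxes summarise the spread of predictions within each gender, while the jittered points show the individual values.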

Related

How to plot only certain values of continuous variables using `sjPlot::plot_model()`

I'm using the plot_model() function in R to visualize an interaction for an OLS regression.
The regression is:
model <- lm(dv ~ condition + control1 + control2 + variable1 + condition*variable1, data=data)
condition is a 4-level factor variable.
variable1 is a continuous variable.
control1 is a continuous variable.
control2 is a factor variable with 2-levels.
I'm trying to visualize the interaction of this regression, with variable1 on the x-axis, and condition as indicators.
However, I'm hoping to visualize it as if variable1 were a factor variable, i.e., only showing particular points (2, 4, and 6) on the plot, with confidence interval whiskers, instead of a straight line across all continuous values with confidence interval bands around the line. So something that looks like this:
Rather than this:
I've tried multiple things but am having trouble looking for the right solution. Any help will be appreciated! The code I have now that creates the line & band plot is:
plot_model(model, type="pred",terms=c("variable1","condition"), ci.lvl=0.95)
Specify the points you want in square brackets inside the terms argument:
library(sjPlot)
model <- lm(hwy ~ year + drv * displ, data = ggplot2::mpg)
plot_model(model, type = "pred", terms = c("drv", "displ [2, 3, 5]"))
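To see what such a plot is built from, the point estimates and whiskers can be approximated by hand with predict(). This is a rough sketch and not sjPlot's actual code path: it uses the built-in mtcars data as a stand-in, and holding hp at its mean is my own assumption about how to treat the remaining covariate.

```r
# Built-in mtcars stands in for the poster's data (an assumption for this sketch)
cars <- transform(mtcars, cyl = factor(cyl))
model <- lm(mpg ~ hp + cyl * wt, data = cars)

# Grid: every cyl level at the chosen wt values 2, 3 and 4
grid <- expand.grid(wt = c(2, 3, 4), cyl = levels(cars$cyl))
grid$hp <- mean(cars$hp)  # remaining covariate held at its mean (assumption)

# Point predictions with 95% confidence limits
ci <- predict(model, newdata = grid, interval = "confidence", level = 0.95)
grid <- cbind(grid, ci)   # adds the columns fit, lwr and upr
print(grid)
```

Each row of grid is one point with its whisker (lwr to upr), which is the information a points-and-error-bars plot displays.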

Fitted vs observed plot with line from glmer model

I have a data frame that is made up of individuals in different treatment groups, with a 1 if they survived and a 0 if they died, and a third column indicating which dish they were in. I ran a glmer model using the lme4 package with Dish_ID as my random effect. I have a piece of base-plot code which plots the mortality rate against treatment group using the line from my glmer model. How can I write the same observed vs fitted plot in ggplot? I have tried looking online but can't seem to find an answer that explains the process.
I want to get the line from my binomial model (manually not using geom_smooth) and then plot my observed points in red in ggplot2. Thanks for the help.
library(tidyverse)
library(ggplot2)
library(lme4)
mortality_data$Dish_ID <- as.factor(mortality_data$Dish_ID)
mortality_model <- glmer(Survived ~ Treatment + (1|Dish_ID), data = mortality_data, family = "binomial")
summary(mortality_model)
plot(mortality_data$Treatment, 1 - fitted(mortality_model), ylim = c(0,1))
plot(mortality_data$Treatment, 1 - fitted(mortality_model), ylim = c(0,1), type = "l", xlab = "Concentration of Cu2SO4", ylab = "Mortality rate")
tv <- unique(mortality_data$Treatment)
#observed in red
for (i in tv) {
points(i, y = 1 - mean(mortality_data$Survived[mortality_data$Treatment == i]), col = "red")
}
My data frame looks something like this, if it is of any use. There are 540 individuals, 90 for each treatment group.
Treatment Survived Dish_ID
0.05 1 Dish_1
0.04 0 Dish_3
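One possible way to approach this in ggplot2 is sketched below. It is not a definitive answer: the data are simulated and hypothetical (six treatment levels, three dishes per level), and the line is the population-level prediction (re.form = NA), which differs from the fitted() values in the base plot because those include the dish random effects. The model and plot only run if lme4 and ggplot2 are installed.

```r
# Hypothetical data mimicking the layout described above (simulated, not the poster's)
set.seed(42)
mortality_data <- data.frame(
  Treatment = rep(c(0, 0.01, 0.02, 0.03, 0.04, 0.05), each = 90),
  Dish_ID   = factor(paste0("Dish_", rep(1:18, each = 30)))
)
mortality_data$Survived <- rbinom(nrow(mortality_data), 1,
                                  plogis(2 - 60 * mortality_data$Treatment))

# Observed mortality per treatment group (replaces the for loop)
obs <- aggregate(Survived ~ Treatment, data = mortality_data, FUN = mean)
obs$mortality <- 1 - obs$Survived

if (requireNamespace("lme4", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  fit <- lme4::glmer(Survived ~ Treatment + (1 | Dish_ID),
                     data = mortality_data, family = "binomial")
  # Population-level line: re.form = NA leaves out the dish random effects
  line <- data.frame(Treatment = seq(0, 0.05, length.out = 100))
  line$mortality <- 1 - predict(fit, newdata = line, re.form = NA,
                                type = "response")

  p <- ggplot2::ggplot(line, ggplot2::aes(Treatment, mortality)) +
    ggplot2::geom_line() +                            # fitted line
    ggplot2::geom_point(data = obs, colour = "red") + # observed in red
    ggplot2::ylim(0, 1) +
    ggplot2::labs(x = "Concentration of Cu2SO4", y = "Mortality rate")
  print(p)
}
```

Computing the line on a separate data frame keeps the fitted curve and the observed points as two independent layers, which is the usual ggplot2 way of avoiding geom_smooth.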

Graphing model results of longitudinal data in R

I am looking to create a graph of longitudinal data by age and sex, similar to the graph in this image, from this paper: https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(20)30258-9/fulltext.
To graph model results in the past, I have used both ggplot2 and ggpredict. I prefer ggpredict because it graphs the results accounting for covariates, but I am OK with graphing in ggplot2 if it can't be done in ggpredict.
I am providing a minimal reproducible example below, with id, wave (2 waves, separated by 6 years), age, sex, tst (total sleep time), and bmi for a covariate.
id<-rep(1:50, 2)
wave<-c(rep(1, 50),rep(2, 50))
tst<-c(sample(7:9,50, replace = T),sample(4:7,50, replace = T))
mydf<-data.frame(id,wave,tst)
mydf$age[mydf$wave==1]<-sample(40:90,50, replace = T)
mydf$age[mydf$wave==2]<-mydf$age[mydf$wave==1]+6
mydf$bmi<-sample(20:30,50, replace = T)
mydf$sex<-sample(1:2,50, replace = T)
mydf$age.cat<-cut(mydf$age[mydf$wave==1], breaks = 3,labels = c(1,2,3))
##Overall model##
library(lme4)  # provides lmer()
(model <- lmer( tst ~ wave + age + sex + bmi +(1|id), data = mydf))
I tried to graph it with ggplot2 using the following syntax, however I'm not sure that the graph is exactly what I'm looking for. I would like to graph change in tst between waves 1 and 2, by age group and sex. TST would be on the y axis, age would be on the x axis, with separate lines for age group and sex, with standard errors. The lines will correspond to within-person change in TST between waves 1 and 2.
I think that the graph right now is showing the between subjects effects of age on tst, and not taking into account the fact that the data is nested within-person. Any help would be greatly appreciated.
ggplot(mydf,aes(x=age, y=tst, color=as.factor(sex), group=as.factor(age.cat), linetype=as.factor(age.cat)))+
geom_smooth(data=mydf[mydf$sex==1,], method = lm, formula = y~x)+
geom_smooth(data=mydf[mydf$sex==2,], method = lm, formula = y~x)+
geom_point() +
theme_bw()
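As a starting point for the within-person view, the change between waves can be computed per person and then summarised by age group and sex. This is a descriptive sketch only, not the model-based plot: the data are simulated and hypothetical (in the same spirit as mydf), baseline age is used for the grouping, and covariates such as bmi are ignored.

```r
# Hypothetical long-format data shaped like mydf (simulated for illustration)
set.seed(1)
n <- 50
base <- data.frame(id  = 1:n,
                   age = sample(40:90, n, replace = TRUE),  # baseline age
                   sex = sample(1:2, n, replace = TRUE))
base$age.cat <- cut(base$age, breaks = 3, labels = c(1, 2, 3))
long <- rbind(
  cbind(base, wave = 1, tst = sample(7:9, n, replace = TRUE)),
  cbind(base, wave = 2, tst = sample(4:7, n, replace = TRUE))
)

# One row per person: wave-2 minus wave-1 total sleep time
w1 <- long[long$wave == 1, c("id", "sex", "age.cat", "tst")]
w2 <- long[long$wave == 2, c("id", "tst")]
wide <- merge(w1, w2, by = "id", suffixes = c(".w1", ".w2"))
wide$change <- wide$tst.w2 - wide$tst.w1

# Mean change and its standard error per age group and sex
se <- function(x) sd(x) / sqrt(length(x))
summ <- aggregate(change ~ age.cat + sex, data = wide,
                  FUN = function(x) c(mean = mean(x), se = se(x)))
summ <- do.call(data.frame, summ)  # flatten the matrix column
print(summ)
```

The resulting columns change.mean and change.se could then be drawn with points and error bars, one colour per sex and one position per age group.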

How to plot survival relative to general population with age on the X-axis (left-truncated data)?

I am trying to compare the survival in my study cohort with the survival in the Dutch general population (matched for age and sex). I created a rate table of the Dutch population.
library(relsurv)
setwd("")
nldpop <- transrate.hmd("mltper_1x1.txt","fltper_1x1.txt")
Then, I wanted to create a plot of the survival of my cohort (observed) and the survival of the population (expected) with age on the X-axis. However, the 'survexp' function does not seem to support a (start, stop, event) format. It only works with the normal (futime, event) format, see below, but then I have follow-up time on the X-axis. Does anyone know how to get age on the X-axis instead of follow-up time?
# Observed and expected survival with time on X-axis
fit <- survfit(Surv(futime, event)~1)
efit <- survexp(futime ~ 1, rmap = list(year=(date_entry), age=(age_entry), sex=(sex)),
ratetable=nldpop)
plot(fit)
lines(efit)
You didn't provide your example data, so I used the survival::mgus data for this. Your problem may be due to incorrectly specifying variable names in the rmap option. See plot here
library(relsurv)
library(dplyr)  # provides %>% and mutate() used below
nldpop <- transrate.hmd("mltper_1x1.txt", "fltper_1x1.txt")
mgus2 <- mgus %>% mutate(date_year = dxyr + 1900)
fit <- survfit(Surv(futime, death) ~ 1, data = mgus2)
efit <- survexp(Surv(futime, death) ~ 1, data = mgus2,
ratetable = nldpop, rmap = list(age = age*365.25, year = date_year, sex = sex))
plot(fit)
lines(efit)

R: Formula with multiple Conditions and Categorized Surface Plot

I want to make 3D plots for linear regression models in R: I wish to display the surface of the regression plane of a linear model.
I have 2 continuous variables (say AGE, HEIGHT) and 2 factors (SEX, ALLERGIC). I want to display the predicted values of the LM w.r.t. the 2 continuous variables conditioned on the specified levels of each factor, e.g.
ILLNESS = AGE|{SEX==MALE + ALLERGIC==YES} + HEIGHT|{SEX==MALE + ALLERGIC==YES} +
AGE|{SEX==MALE + ALLERGIC==YES}*HEIGHT|{SEX==MALE + ALLERGIC==YES}
This is the outcome I have in mind:
First question: Is there a convenient function that makes this easy?
Second question: If not, how can I write formulas that condition on more than one factor level?
First, let's make some sample input data to have something to test with.
set.seed(15)
dd <- data.frame(
sex = sample(c("M","F"), 200, replace=T),
allergic = sample(c("YES","NO"), 200, replace=T),
age = runif(200, 18,65),
height = rnorm(200, 6, 2)
)
expit <- function(x) exp(x)/(exp(x)+1)
dd <- transform(dd,
illness=expit(-1+(sex=="M")*.8-0.025*age*ifelse(sex=="M",-1,1)+.16*height*ifelse(allergic=="YES",-1,1)+rnorm(200))>.5
)
Now we define the set of values we want to predict over
gg<-expand.grid(sex=c("M","F"), allergic=c("YES","NO"))
vv<-expand.grid(age=18:65, height=3:9)
and then we fit a model, and use the predict function to calculate the response for each point on the surface we wish to plot.
mm <- glm(illness~sex+allergic+age+height, dd, family=binomial)
pd<-do.call(rbind, Map(function(sex, allergic) {
nd <- cbind(vv, sex=sex, allergic=allergic)
cbind(nd, pred=predict(mm, nd, type="response"))
}, sex=gg$sex, allergic=gg$allergic))
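The same prediction grid can probably be built in one step by letting expand.grid() cross the continuous values with the factor levels. This is a sketch under assumptions: the data are re-simulated here with a simplified, hypothetical outcome model so that the block runs on its own, and the result is stored in nd rather than pd.

```r
# Hypothetical sample data, simulated only for this sketch
set.seed(15)
n <- 200
dd <- data.frame(
  sex      = sample(c("M", "F"), n, replace = TRUE),
  allergic = sample(c("YES", "NO"), n, replace = TRUE),
  age      = runif(n, 18, 65),
  height   = rnorm(n, 6, 2)
)
# plogis() is base R's inverse logit; the coefficients are arbitrary
dd$illness <- rbinom(n, 1, plogis(-1 + 0.8 * (dd$sex == "M") +
                                  0.02 * dd$age - 0.1 * dd$height))

mm <- glm(illness ~ sex + allergic + age + height, data = dd, family = binomial)

# One row per combination of the continuous grid and the factor levels
nd <- expand.grid(age = 18:65, height = 3:9,
                  sex = c("M", "F"), allergic = c("YES", "NO"),
                  stringsAsFactors = FALSE)
nd$pred <- predict(mm, newdata = nd, type = "response")
```

Because nd has the same columns as the Map/rbind result, it can be passed to the plotting call in the same way.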
Finally, we can use lattice to plot the data
library(lattice)
wireframe(pred~age+height|sex+allergic, pd, drape=TRUE)
which gives us one wireframe panel for each combination of sex and allergic.
