I fitted a linear mixed model using the nlme package to evaluate a psychological treatment, with treatment condition and measurement point as predictors, and ran post-hoc comparisons with the emmeans package. So far so good, everything worked out well and I am looking forward to finishing my thesis. There is only one problem left: I am really bad at plotting. I want to plot the estimated marginal means for the four measurement points for each group. The emmip function in emmeans does this, but I am not happy with the result. I used the following code to generate it:
emmip(HLM_IPANAT_pos, Gruppe~TP, CIs=TRUE) + theme_bw() + labs(x = "Zeit", y = "IPANAT-PA")
I don't like the way the confidence intervals are presented. I would prefer a line plot with "normal" error bars, like the one below, which is taken from Ireland et al. (2017). I tried to do it in Excel, but could not figure out how to add separate confidence intervals for each line. So I was wondering whether this is possible with ggplot2. However, I do not know how to get the values I obtained from emmeans into ggplot. As I said, I really have no idea about plotting. Does anyone know how to do it?
I think it is possible. Rather than using emmip to create the plot, you could use emmeans to get the values for ggplot2. With ggplot2 and the data, you might be able to better control the format of the plot. Since I do not have your data, I can only suggest a few steps.
First, after fitting the model HLM_IPANAT_pos, get the estimated marginal means with emmeans(). Second, convert that object into a data frame with broom::tidy(). Third, pass the tidied data frame to ggplot().
Using mtcars data as an example:
library(emmeans)
library(tidyverse)

# mtcars data: treat cyl as a factor
mtcars$cyl <- as.factor(mtcars$cyl)

# Model
mymodel <- lm(mpg ~ cyl * am, data = mtcars)

# Tidy the estimated marginal means (conf.int = TRUE keeps the conf.low /
# conf.high columns in recent broom versions), then plot with ggplot2
broom::tidy(emmeans(mymodel, ~ am | cyl), conf.int = TRUE) %>%
  mutate(cyl_x = as.numeric(as.character(cyl)) + 0.1 * am) %>%  # dodge the two am groups slightly
  ggplot(aes(x = cyl_x, y = estimate, color = as.factor(am))) +
  geom_point() +
  geom_line() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.1)
Created on 2019-12-29 by the reprex package (v0.3.0)
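Applied to your own model, a rough sketch might look like the one below. I am inferring the roles of Gruppe and TP from your emmip() call, so treat the variable handling (and the dodge widths) as assumptions to adapt:

library(emmeans)
library(tidyverse)

# Hedged sketch based on emmip(HLM_IPANAT_pos, Gruppe ~ TP, ...):
# estimated marginal means per group (Gruppe) at each measurement point (TP)
emm_df <- broom::tidy(emmeans(HLM_IPANAT_pos, ~ Gruppe | TP), conf.int = TRUE)

ggplot(emm_df, aes(x = TP, y = estimate, colour = Gruppe, group = Gruppe)) +
  geom_point(position = position_dodge(width = 0.2)) +
  geom_line(position = position_dodge(width = 0.2)) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high),
                width = 0.1, position = position_dodge(width = 0.2)) +
  theme_bw() +
  labs(x = "Zeit", y = "IPANAT-PA")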
I'm a beginner at R and I can't figure out how to add regression lines to my boxplot. My code (with data) is:
library(tidyverse)  # for %>%, mutate() and ggplot()

dat_full <- data.frame(
  Fuerza = c("19.6N", "19.6N", "58.8N", "58.8N", "98,0N", "98,0N", "274.4N", "274.4N"),
  Músculo = c("Bíceps", "Tríceps", "Bíceps", "Tríceps", "Bíceps", "Tríceps", "Bíceps", "Tríceps"),
  mV.s = c(3.5227565, -0.0897375, 7.2907255, 1.8571375, 16.327445, 8.042295, 31.15557, 12.69073),
  standdev = c(0.111590642, 0.187825239, 0.886093185, 0.16351915, 3.876932131, 2.637289091, 3.713413688, 1.262850285))

dat_full <- dat_full %>%
  mutate(Fuerza = factor(Fuerza, levels = c("19.6N", "58.8N", "98,0N", "274.4N")))
dat_full

ggplot(dat_full, aes(x = as.factor(Fuerza), y = mV.s)) +
  geom_boxplot(aes(lower = mV.s - standdev, upper = mV.s + standdev, middle = mV.s,
                   ymin = mV.s - 3*standdev, ymax = mV.s + 3*standdev), stat = "identity") +
  facet_wrap(~Músculo) +
  xlab("Fuerza (N)") +
  theme_grey(base_size = 22)
which shows this plot
What I need to do is add a regression line for the means (mV.s) of every condition (Fuerza) for the two groups. If it's possible, I would also like to show R² and the regression equation on the graph.
Thanks in advance.
You can add a line to a ggplot using the geom_smooth() or lm() functions. Given the line you need to create, it may be easier to just make the line using lm().
lm() takes a data argument and a formula relating the two (or more) variables you want to use in the regression. Here what you'd want to do is {name_of_regression} <- lm(data = dat_full, {dependent_var} ~ {independent_var}). I'm not sure what you want those variables to be, as Fuerza is currently populated with string values.
Also, it's been a little while since I've looked at R, so this is a somewhat verbose solution, but you can filter triceps and biceps into two datasets using the tidyverse package and then make your regressions from each dataset.
library(tidyverse)

# Note: the column and level names carry accents in your data (Músculo, "Bíceps")
biceps <- filter(dat_full, Músculo == "Bíceps")
biceps_reg <- lm(data = biceps, {biceps_dep} ~ {biceps_indep})
And repeat for triceps.
Then, make the ggplot you want and add a fitted line with geom_smooth():
ggplot({some_code}) +
{...} +
geom_smooth(method="lm", se = FALSE)
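As a concrete sketch of the whole approach, assuming you want mV.s regressed on the numeric force parsed out of the Fuerza labels (the gsub() calls and the Fuerza_N column are my own guesses, not part of your code), geom_smooth(method = "lm") then fits a separate line within each facet:

library(tidyverse)

# Hedged sketch: parse the numeric force out of the Fuerza labels
# ("98,0N" uses a decimal comma, hence the extra gsub()), then let
# geom_smooth() fit a least-squares line within each facet.
dat_num <- dat_full %>%
  mutate(Fuerza_N = as.numeric(gsub("N", "", gsub(",", ".", Fuerza))))

ggplot(dat_num, aes(x = Fuerza_N, y = mV.s)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~Músculo) +
  xlab("Fuerza (N)")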
I know that doesn't really solve your problem of wanting to put the charts together, but you can save each ggplot for biceps and triceps and then arrange them side by side with a package such as patchwork or gridExtra once you're done.
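For example, a minimal sketch (the object names p_biceps and p_triceps are hypothetical placeholders for your two saved plots):

# Hedged sketch: combine two saved ggplot objects side by side
library(patchwork)
p_biceps + p_triceps
# or: gridExtra::grid.arrange(p_biceps, p_triceps, ncol = 2)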
Also, here's an R tip: you can pull up the help page for any function by prefixing it with ?, for example:
?plot
?lm
Apologies for the verbosity; I wanted to provide a quick fix here, but others may have better advice. Also, please let me know what your independent and dependent variables would be for the regression (for a regression line you'll want a numeric predictor, so Fuerza as a character string won't work directly).
I am having difficulty plotting a log10 formula onto existing data points. I derived a logarithmic function from my data, where "Tout_F_6am" is the independent variable and "clo" is the dependent variable.
When I go to plot it, I get the error that the lengths of x and y differ. Can someone please help me figure out what's going wrong?
logKT=lm(log10(clo)~ Tout_F_6am,data=passive)
summary(logKT) #r2=0.12
coef(logKT)
plot(passive$Tout_F_6am,passive$clo) #plot data points
x=seq(53,84, length=6381)#match length of x variable
y=logKT
lines(x,y,type="l",lwd=2,col="red")
length(passive$Tout_F_6am) #6381
length(passive$clo) #6381
Additionally, can the formula curve(-0.0219 - 0.005*log10(x), add = TRUE, col = 2) be written as eq = (10^-0.022)*(10^-0.005*x)? Thanks!
The problem is that you are trying to plot the model object, not the predictions from the model. Try something like this:
Define the explanatory values you want to plot, in a data frame (or tibble). It doesn't have to be as many as there are data points.
library(dplyr)
explanatory_data <- tibble(
Tout_F_6am = seq(53, 84, 0.1)
)
Add a column of predicted values using predict(). This takes the model and your explanatory data. Because the model was fitted to log10(clo), predict() returns values on the log10 scale, so you have to back-transform them.
prediction_data <- explanatory_data %>%
  mutate(
    log10_clo = predict(logKT, explanatory_data),
    clo = 10 ^ log10_clo
  )
Finally, draw your plot.
plot(clo ~ Tout_F_6am, data = prediction_data, log="y", type = "l")
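If you want the curve drawn over your original scatterplot (which is what your lines() attempt was aiming for), a small sketch along the same lines:

# Hedged sketch: original data points plus the back-transformed fitted curve
plot(passive$Tout_F_6am, passive$clo)
lines(clo ~ Tout_F_6am, data = prediction_data, lwd = 2, col = "red")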
The plotting is actually easier using ggplot2. This should give you more or less what you want; with scale_y_log10(), the linear smooth is fitted to log10(clo), which matches your model.
library(ggplot2)
ggplot(passive, aes(Tout_F_6am, clo)) +
geom_point() +
geom_smooth(method = "lm") +
scale_y_log10()
I want to plot a regression model with a confidence interval in ggplot. In my model I want to use robust standard errors clustered on a variable. However, I can't find where to specify the variable for clustering the errors.
I have already tried the geom_smooth() function with the lm_robust method but can't find where to add the clustering variable.
p1 + geom_smooth(data = data, aes(y = y, x = x), method = 'lm_robust', se = TRUE)
I need the equivalent of the following line in ggplot for the plot:
lm_robust(y~ x, data = data, clusters = z)
It seems that non-standard evaluation causes problems here, but you can instead pass the whole data-frame column directly, like so:
library(ggplot2)
library(estimatr)

ggplot(mtcars, aes(hp, qsec)) +
  geom_smooth(method = 'lm_robust',
              # pass the clustering variable as a full column via method.args
              method.args = list(clusters = mtcars$cyl))
Note that this will not work when drawing multiple lines (e.g. using color) or with facets, because the full-length cluster vector no longer lines up with the subset of data used for each group or panel.
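Mapped onto the objects from your question, a minimal sketch would be (assuming data, x, y and z exist as in your lm_robust() call):

library(ggplot2)
library(estimatr)

# Hedged sketch using the names from the question (data, x, y, z)
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = 'lm_robust',
              method.args = list(clusters = data$z),  # cluster on z
              se = TRUE)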
Consider the following data frame example
library('ggplot2')
library('sm')
original<-c(1:100,1)
a<-sample(original,100)
b<-rep(1:4,25)
lala<-data.frame(a,b)
My aim is to produce density plots for values in lala$a, according to each group (1,2,3,4) defined in lala$b.
To do so in ggplot2, I can do the following:
plotDensityggplot <- ggplot() +
  geom_density(data = lala, aes(a, colour = factor(b))) +
  theme_classic()

print(plotDensityggplot)
producing this:
However, when I plot the same data with the 'sm' package, to make a formal comparison of the densities, using the following code:
sm.density.compare(lala$a,as.numeric(lala$b),model = "equal")
The density curves extend below zero on the x-axis, even though there is no value below zero in lala$a.
What's going on? Note that this affects the densities reported on the y-axis.
Is the p-value from the permutation test of equality obtained from sm.density.compare a reliable estimate? Thank you!
For what it's worth, you can (more or less) reproduce the sm output in ggplot by pre-computing densities with base R's density (I'm not familiar with sm but I imagine that sm.density calls base R's density at some point as well).
library(tidyverse)

lala %>%
  group_by(b) %>%
  # compute a kernel density per group and keep its x/y grid as a nested tibble
  summarise(tmp = list(map_dfc(c("x", "y"), ~ density(a)[.x]))) %>%
  unnest(tmp) %>%
  ggplot(aes(x, y, colour = as.factor(b))) +
  geom_line()
I'm not sure exactly how geom_density() (or stat_density()) chooses its kernel density estimation parameters by default, but it does accept bw, adjust and kernel arguments that are passed on to density(), so you can tune the estimate there as well.
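For example, a small sketch (bw = "nrd0" is just density()'s default rule, shown here to illustrate what can be passed):

library(ggplot2)

# Hedged sketch: bw/adjust/kernel are forwarded to stats::density(),
# so the ggplot2 estimate can be tuned much like the base R one
ggplot(lala, aes(a, colour = factor(b))) +
  geom_density(bw = "nrd0", adjust = 1) +
  theme_classic()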
I am an R beginner (first semester; we use this program for univariate statistics) and I am currently struggling with plotting the outcome of my glm(). I have read quite a few threads and help files on the internet, but I have two problems: 1) I don't understand the advice because it is too advanced, or 2) I understand the advice but when I replicate the code, it doesn't work.
I think I am close to the solution, but my curve doesn't come out the way it is supposed to. Can anyone tell me what I am doing wrong?
new.data<-data.frame(x=rnorm(50,0,1), y=c("yes", "no"))
mock_model<-glm(y~x, data=new.data, family=binomial)
x1<-seq(min(new.data$x), max(new.data$x), 0.01)
y1<-predict(mock_model, list(x=x1), type="response")
plot(new.data$x, new.data$y, xlab="numeric var", ylab="binary var")
points(x1, y1)
I am new to coding and this platform, so apologies in advance if the information I have provided is not sufficient.
Any advice would be greatly appreciated.
Here's an example using mtcars and the ggplot2 package. The syntax of ggplot2 works roughly like this: you begin a plot with the ggplot() command, within which you can (but don't have to) define aesthetics (the aes() option), which include the selection of axis variables but can also contain options to change the visuals, like colours, line widths, etc. If you define the axis variables within ggplot(), don't forget to put the data assignment (see the example below) outside of aes().
Afterwards, you add layers of geoms to plot specific things, like data points with geom_point(), lines with geom_line() or a lot of other fun things. When you want to use the variables and data assigned in the ggplot() command, just leave the geom empty (apart from any visual aes() options you want to use for that specific geom). However, you can define new data and variables for a geom, for example to use different data sources in the same plot.
data(mtcars)

# Logistic regression of transmission type (am) on fuel economy (mpg)
model_shift <- glm(am ~ mpg, data = mtcars, family = 'binomial')

# Predicted probabilities over a fine grid of mpg values
x <- seq(min(mtcars$mpg), max(mtcars$mpg), .1)
y <- predict(model_shift, list(mpg = x), type = 'response')
plot_data <- data.frame(mpg = x, am = y)

library(ggplot2)
ggplot(aes(x = mpg, y = am), data = plot_data) +
  geom_point()
Or with a line instead of points:
ggplot(aes(x = mpg, y = am), data = plot_data) +
geom_line()
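Tying this back to the point about different data sources per geom, here is a small sketch that layers the observed 0/1 values from mtcars under the predicted curve from plot_data, with each geom getting its own data:

# Hedged sketch: one geom per data source
ggplot(plot_data, aes(x = mpg, y = am)) +
  geom_line() +                  # predicted probabilities from the glm
  geom_point(data = mtcars)      # observed 0/1 transmission values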
To get a glimpse of the seemingly endless possibilities of ggplot2, have a look at these 'Top 50' ggplot2 visualizations. To learn the package-specific language, see this tutorial or check your university's library for Hadley Wickham's book ggplot2: Elegant Graphics for Data Analysis.