how to combine 2 DVs in one graph using ggplot - r

I am trying to combine 2 dependent variables (or 2 graphs) in one graph using the ggplot function. None of the suggestions I could find online were really helpful in my case.
Graph1 <- ggplot(mydata, aes(age, conf))
Graph1 + stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", aes(group = 1)) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2) +
labs(x = "Age Group", y = "Accuracy (%)") + ylim(0, 1)
Graph2 <- ggplot(mydata, aes(age, acc))
Graph2 + stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", aes(group = 1), linetype = "dashed") +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2) +
labs(x = "Age Group", y = "Accuracy (%)") + ylim(0, 1)
In addition to this, I will need to have the means and error bars not overlapping. Any advice would be greatly appreciated.

After further investigation I found the following suggestion, which seems to be a great solution. However, I cannot install tidyr because it is incompatible with my current R version. I have tried different ways of downloading the package, without success.
library(tidyr)
home.land.byyear <- gather(housing.byyear, value = "value", key = "type",
Home.Value, Land.Value)
ggplot(home.land.byyear, aes(x=Date, y=value, color=type)) + geom_line()
see http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html
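If tidyr cannot be installed, the same long-format reshaping can be done in base R. The sketch below is only an illustration under stated assumptions: the data frame is simulated to stand in for mydata (columns age, conf, acc), mean_se replaces mean_cl_boot so that Hmisc is not needed, and the fun argument assumes ggplot2 3.3.0 or later (older versions use fun.y). position_dodge() shifts the two series sideways so the means and error bars do not overlap.

```r
library(ggplot2)

# Simulated stand-in for `mydata` (assumption: age is a factor, conf/acc are in [0, 1])
set.seed(1)
mydata <- data.frame(
  age  = factor(rep(c("young", "middle", "old"), each = 20),
                levels = c("young", "middle", "old")),
  conf = runif(60),
  acc  = runif(60)
)

# Base-R replacement for tidyr::gather(): stack the two DV columns
long <- data.frame(
  age     = rep(mydata$age, times = 2),
  measure = rep(c("conf", "acc"), each = nrow(mydata)),
  value   = c(mydata$conf, mydata$acc)
)

# Shift the two measures sideways so points and error bars do not overlap
pd <- position_dodge(width = 0.3)

p <- ggplot(long, aes(age, value, group = measure)) +
  stat_summary(fun = mean, geom = "point", aes(shape = measure), position = pd) +
  stat_summary(fun = mean, geom = "line", aes(linetype = measure), position = pd) +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2, position = pd) +
  labs(x = "Age Group", y = "Mean value") +
  coord_cartesian(ylim = c(0, 1))

print(p)
```

coord_cartesian() is used instead of ylim() here because ylim() drops observations outside the limits before the means are computed, whereas coord_cartesian() only zooms the view.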

Related

How to calculate standard error instead of standard deviation in ggplot

I need some help figuring out how to estimate the standard error using the following R script:
library(ggplot2)
library(ggpubr)
library(Hmisc)
data("ToothGrowth")
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
head(ToothGrowth, 4)
theme_set(
theme_classic() +
theme(legend.position = "top")
)
# Initiate a ggplot
e <- ggplot(ToothGrowth, aes(x = dose, y = len))
# Add mean points +/- SD
# Use geom = "pointrange" or geom = "crossbar"
e + geom_violin(trim = FALSE) +
stat_summary(
fun.data = "mean_sdl", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
# Combine with box plot to add median and quartiles
# Change fill color by groups, remove legend
e + geom_violin(aes(fill = dose), trim = FALSE) +
geom_boxplot(width = 0.2)+
scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07"))+
theme(legend.position = "none")
Many thanks for the help
Kind regards
A couple of things. First, you need to reassign e when you add geom_violin and stat_summary. Otherwise, those changes are not carried forward when you add the boxplot in the next step. Second, when you add the boxplot last, it is drawn over the points and error bars from stat_summary, so it looks like they're disappearing. If you add the boxplot first and then stat_summary, the points and error bars will be placed on top of the boxplot. Here is an example:
library(ggplot2)
library(ggpubr)
library(Hmisc)
data("ToothGrowth")
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
theme_set(
theme_classic() +
theme(legend.position = "top")
)
# Initiate a ggplot
e <- ggplot(ToothGrowth, aes(x = dose, y = len))
# Add violin plot
e <- e + geom_violin(trim = FALSE)
# Combine with box plot to add median and quartiles
# Change fill color by groups, remove legend
e <- e + geom_violin(aes(fill = dose), trim = FALSE) +
geom_boxplot(width = 0.2)+
scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07"))+
theme(legend.position = "none")
# Add mean points +/- SE
# Use geom = "pointrange" or geom = "crossbar"
e +
stat_summary(
fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
You said in a comment that you couldn't see any changes when you tried mean_se and mean_cl_normal. Perhaps the above solution will have solved the problem, but you should see a difference. Here is an example just comparing mean_se and mean_sdl. You should notice the error bars are smaller with mean_se.
ggplot(ToothGrowth, aes(x = dose, y = len)) +
stat_summary(
fun.data = "mean_sdl", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
ggplot(ToothGrowth, aes(x = dose, y = len)) +
stat_summary(
fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
Here is a simplified solution if you don't want to reassign at each step:
ggplot(ToothGrowth, aes(x = dose, y = len)) +
geom_violin(aes(fill = dose), trim = FALSE) +
geom_boxplot(width = 0.2) +
stat_summary(fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black") +
scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
theme(legend.position = "none")
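To make the difference between the two summaries concrete, here is a small check of what mean_se returns: the group mean plus or minus sd / sqrt(n), whereas mean_sdl with mult = 1 uses plus or minus one sd. This is just a sketch for verification; the names len_by_dose, manual and from_ggplot are made up for the example.

```r
library(ggplot2)

data("ToothGrowth")
len_by_dose <- split(ToothGrowth$len, ToothGrowth$dose)

# Standard error of the mean computed by hand for each dose
manual <- t(sapply(len_by_dose, function(v) {
  se <- sd(v) / sqrt(length(v))
  c(y = mean(v), ymin = mean(v) - se, ymax = mean(v) + se)
}))

# The same summary as produced by ggplot2::mean_se (mult = 1 is its default)
from_ggplot <- do.call(rbind, lapply(len_by_dose, mean_se))

print(manual)
print(from_ggplot)
```

Both tables should agree, which confirms that the pointrange drawn with fun.data = "mean_se" spans one standard error on each side of the mean.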

Error_bar and exercise

Sorry for my bad English, for not pasting the code, and for asking basic questions; I am a beginner and not very familiar with R. Here are my instructions and the graph I have to reproduce:
I read the R documentation to solve this problem but was unable to figure out the solution.
So far I have produced a plot with the script below, but I have no clue how to add error bars. I tried geom_errorbar(aes(ymin = mean-se, ymax = mean+se)), but I have surely made a mistake.
rm(list=ls() )
library(dplyr)
library(ggplot2)
library(ggpubr)
Sparrows <- read.delim("C:/Users/detar/Downloads/Sparrows.txt")
View(Sparrows)
str(Sparrows)
jitter<-filter(Sparrows,day == 4)
x<-ggplot(jitter,
aes (x = rank_name,
y = logit.motility,)) + geom_point(colour = "cyan") +
xlab("Social Rank") +
ylab("Logit(Proportion of motile sperm)") +
labs(title =("Ejaculate quality covaries with social rank
of male House Sparrows")) +
theme(plot.title = element_text(hjust = 0.5)) +
scale_x_discrete(breaks=c("D","S1","S2","S3"), labels=c("Dominant", "Subordinate 1", "Subordinate 2", "Subordinate 3"))
x2<-x + theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 14))
Thanks for your help
Benjamin
So I added
table <- jitter %>%
group_by(rank_name) %>%
summarize(Mean = mean(logit.motility, na.rm=TRUE),
SEM = sd( logit.motility, na.rm=TRUE) / sqrt(15)
) %>% as.data.frame()
x2<-x + theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 14)) + geom_errorbar(data = summary_table,
aes(x =rank_name,
y =logit.motility,
ymin =Mean - SEM ,
ymax =Mean + SEM ,
colour = "black",
width = 1 ))
But a warning appeared:
Warning: Ignoring unknown aesthetics: y
So am I mistaken in those arguments?
Once again, thank you.
The solution to this question is identical to the solution here but all possible duplicates I have found are about bar plots, so here is one answer to a question about scatter plots.
First, the data, since the question has none.
df1 <- iris[4:5]
Now the graph. Either geom_errorbar or stat_summary with geom = "errorbar" could be used.
library(ggplot2)
ggplot(df1, aes(x = Species, y = Petal.Width)) +
geom_point(aes(colour = "lightblue")) +
scale_color_manual(values = "lightblue") +
stat_summary(geom = "point", fun.y = mean) +
stat_summary(geom = "errorbar", fun.data = mean_se,
position = "dodge", width = 0.2)
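For completeness, the geom_errorbar route with a precomputed summary table can be sketched as follows, again on the iris subset. The names summary_table, Mean and SEM are only illustrative. The points that matter: the summary layer gets inherit.aes = FALSE so it does not look for a y column it does not have, constants such as colour and width go outside aes(), and the group size comes from length() rather than a hard-coded number.

```r
library(ggplot2)

df1 <- iris[4:5]   # Petal.Width and Species

# One row per group: mean and standard error of the mean
groups <- split(df1$Petal.Width, df1$Species)
summary_table <- data.frame(
  Species = factor(names(groups), levels = levels(df1$Species)),
  Mean    = sapply(groups, mean),
  SEM     = sapply(groups, function(v) sd(v) / sqrt(length(v)))
)

p <- ggplot(df1, aes(x = Species, y = Petal.Width)) +
  geom_point(colour = "lightblue") +
  geom_errorbar(
    data = summary_table,
    aes(x = Species, ymin = Mean - SEM, ymax = Mean + SEM),
    inherit.aes = FALSE,             # do not inherit y = Petal.Width
    colour = "black", width = 0.2    # constants belong outside aes()
  )

print(p)
```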

Looping ggplot categorical variables

I am a noob, so hope this makes sense...
Question/problem statement
I need to create a number of plots, where the only difference in each of the plots is the group used - each group contains categorical variables. I have got this to work by manually typing out all of the code.
Instead of manually writing each of the groups into R, I want to develop a loop to automate this plotting process.
Current manual method
This works, but is tedious and I want to automate through a loop - just an example with 2 of my 9 groups.
The only thing that changes in each is the factor and titles
# GOR
ggplot(aes(y = dailyCV, x = factor(GOR)), data = mergedbed) +
geom_jitter(alpha=1/2, color="tomato", position=position_jitter(width=.2), size=1/10) +
stat_summary(fun.data = min.mean.sd.max, geom = "boxplot", alpha = 0.5) +
stat_summary(fun.y=mean, colour="black", geom="text",
vjust=0.5, hjust=1.5, size=3, aes( label=round(..y.., digits=1))) +
stat_summary(fun.data = give.n, geom = "text", vjust=1, hjust=-2, size=3) +
coord_flip() +
stat_summary(fun.y = mean, geom="point",colour="darkred", size=2) +
xlab("GOR")+
ylab("Co-efficient of variation (%)")+
ggtitle("GOR vs dailyCV")
# ACCOM_EHCS
ggplot(aes(y = dailyCV, x = factor(ACCOM_EHCS)), data = mergedbed) +
geom_jitter(alpha=1/2, color="tomato", position=position_jitter(width=.2), size=1/10) +
stat_summary(fun.data = min.mean.sd.max, geom = "boxplot", alpha = 0.5) +
stat_summary(fun.y=mean, colour="black", geom="text",
vjust=0.5, hjust=1.5, size=3, aes( label=round(..y.., digits=1))) +
stat_summary(fun.data = give.n, geom = "text", vjust=1, hjust=-2, size=3) +
coord_flip() +
stat_summary(fun.y = mean, geom="point",colour="darkred", size=2) +
xlab("ACCOM_EHCS")+
ylab("Co-efficient of variation (%)")+
ggtitle("ACCOM_EHCS vs dailyCV")
My attempt
My attempt here was to create a vector with each of the groups and then loop over it, but it doesn't work and I'm sure it's very wrong. This is my first attempt at writing a loop.
myvariables <- c("GOR","ACCOM_EHCS","DBL_GLAZ", "BUILDING_AGE", "HhdSize", "Inc_Group_7s", "Person_Under_5", "Person_Over_64", "thermal")
lapply(myvariables, function(cc){
p <- ggplot(aes(y = dailyCV, x = factor(aes_string(y = cc))), data = mergedbed) +
geom_jitter(alpha=1/2, color="tomato", position=position_jitter(width=.2), size=1/10) +
stat_summary(fun.data = min.mean.sd.max, geom = "boxplot", alpha = 0.5) +
stat_summary(fun.y=mean, colour="black", geom="text",
vjust=0.5, hjust=1.5, size=3, aes( label=round(..y.., digits=1))) +
stat_summary(fun.data = give.n, geom = "text", vjust=1, hjust=-2, size=3) +
coord_flip() +
stat_summary(fun.y = mean, geom="point",colour="darkred", size=2) +
xlab("???")+
ylab("Co-efficient of variation (%)")+
ggtitle("??? vs dailyCV")
p
})
Thank you in advance
Here is an example using the iris dataset and purrr:
library(tidyverse)
data(iris)
## create a grid with variable combinations
variables <- iris %>%
select(everything(), -Species) %>%
names() %>%
expand.grid(x = ., y = ., stringsAsFactors = FALSE)
##create plotting function
plot_data <- function(data, x, y){
ggplot(data, aes_string(x, y)) +
geom_point() +
ggtitle(paste(x, "vs", y))
}
map2(.x = variables$x,
.y = variables$y,
.f = ~ plot_data(iris, .x, .y))
This creates all variable combinations of plots and changes the title.
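The same idea can be applied more directly to the original problem, where only the grouping variable on the x axis changes. The following is a rough sketch rather than a drop-in replacement: it uses mtcars with cyl, gear and am as stand-ins for the grouping columns, leaves out the custom helpers min.mean.sd.max and give.n, and relies on the .data pronoun (available in ggplot2 3.0.0 and later) instead of aes_string.

```r
library(ggplot2)

# Stand-ins for the grouping columns (assumption: mtcars replaces mergedbed)
group_vars <- c("cyl", "gear", "am")

plots <- lapply(group_vars, function(cc) {
  ggplot(mtcars, aes(x = factor(.data[[cc]]), y = mpg)) +
    geom_jitter(alpha = 1/2, color = "tomato", width = 0.2, size = 1) +
    stat_summary(fun = mean, geom = "point", colour = "darkred", size = 2) +
    coord_flip() +
    xlab(cc) +                      # the variable name replaces the "???" placeholders
    ylab("mpg") +
    ggtitle(paste(cc, "vs mpg"))
})
names(plots) <- group_vars

print(plots[["cyl"]])
```

lapply returns a list of plot objects, so each plot can be printed or saved individually by name.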

Extend an annotation line across multiple facets of ggplot

When I facet a plot I often want to point out interesting comparisons between groups. For instance, in the plot produced by this code I'd like to point out that the second and third columns are nearly identical.
library(tidyverse)
ggplot(mtcars, aes(x = as.factor(am), y = mpg)) +
stat_summary(fun.y = "mean", geom = "col") +
stat_summary(fun.data = mean_se, geom = "errorbar", width = .1) +
facet_grid(~ vs)
Currently I can only make this annotation by exporting my plot to another app like Preview or Powerpoint and manually adding the lines and text across facets.
My efforts to add an annotation across facets result in annotations that do not leave their own facet. See below.
ggplot(mtcars, aes(x = as.factor(am), y = mpg)) +
stat_summary(fun.y = "mean", geom = "col") +
stat_summary(fun.data = mean_se, geom = "errorbar", width = .1) +
facet_grid(~ vs) +
annotate("errorbarh", xmin = 2, xmax = 3, y = 25, height = .5,
color = "red") +
annotate("text", x = 2.5, y = 27, label = "NS", color = "red")
Any advice about how to extend lines and annotations across facets would be greatly appreciated.
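One possible workaround, which sidesteps the problem rather than truly drawing across facet panels, is to drop the facets and place all four am/vs combinations on a single x axis using interaction(). An annotation can then span any two columns. This is only a sketch: the group column is a made-up helper, the facet strips are lost, and fun assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical helper column: one x position per am / vs combination.
# Levels come out in the same left-to-right order as the faceted plot.
cars <- mtcars
cars$group <- interaction(cars$am, cars$vs, sep = " / ")

p <- ggplot(cars, aes(x = group, y = mpg)) +
  stat_summary(fun = mean, geom = "col") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.1) +
  # Columns 2 and 3 are the ones being compared
  annotate("segment", x = 2, xend = 3, y = 25, yend = 25, color = "red") +
  annotate("text", x = 2.5, y = 27, label = "NS", color = "red") +
  labs(x = "am / vs")

print(p)
```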

to show mean value in ggplot box plot

I need to show the mean value in a ggplot box plot. The code below works for a point, but I need white dashed lines instead. Can anybody help?
x
Team Value
A 10
B 5
C 29
D 35
ggplot(aes(x = Team, y = Value), data = x) +
geom_boxplot(aes(fill = Team), alpha = .25, width = 0.5, position = position_dodge(width = .9)) +
stat_summary(fun.y = mean, colour = "red", geom = "point")
Here's my way of adding mean to boxplots:
ggplot(RQA, aes(x = Type, y = engagementPercent)) +
geom_boxplot(aes(fill = Type),alpha = .6,size = 1) +
scale_fill_brewer(palette = "Set2") +
stat_summary(fun.y = "mean", geom = "text", label="----", size= 10, color= "white") +
ggtitle("Participation distribution by type") +
theme(axis.title.y=element_blank()) + theme(axis.title.x=element_blank())
ggplot(df, aes(x = Type, y = scorepercent)) +
geom_boxplot(aes(fill = Type),alpha = .6,size = 1) +
scale_fill_brewer(palette = "Set2") +
stat_summary(fun.y = "mean", geom = "point", shape= 23, size= 3, fill= "white") +
ggtitle("score distribution by type") +
theme(axis.title.y=element_blank()) + theme(axis.title.x=element_blank())
I would caution against using text to do this and would use geom_line instead, as text is offset slightly and gives a misleading portrayal of the mean.
Hey user1471980, I think people are more inclined to help if you have a unique user name but then again you have a lot of points :)
this is a hack but does this help:
Value<-c(1,2,3,4,5,6)
Team<-c("a","a","a","b","b","b")
x<-data.frame(Team,Value) #note means for a=2, mean for b=5
ggplot(aes(x = Team , y = Value), data = x) + geom_boxplot (aes(fill=Team), alpha=.25, width=0.5, position = position_dodge(width = .9)) +
annotate(geom="text", x=1, y=2, label="----", colour="white", size=7, fontface="bold", angle=0) +
annotate(geom="text", x=2, y=5, label="----", colour="white", size=7, fontface="bold", angle=0)
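A way to get a line at the mean without text labels or hard-coded positions is to let stat_summary compute the mean and draw it as an error bar whose lower and upper ends coincide. This is a sketch of that commonly used trick, assuming ggplot2 3.3.0 or later for fun and after_stat(); the width is set to match the box width.

```r
library(ggplot2)

Value <- c(1, 2, 3, 4, 5, 6)
Team  <- c("a", "a", "a", "b", "b", "b")
x <- data.frame(Team, Value)   # means: a = 2, b = 5

p <- ggplot(x, aes(x = Team, y = Value)) +
  geom_boxplot(aes(fill = Team), alpha = .25, width = 0.5) +
  # ymin == ymax, so the "error bar" collapses to one horizontal line at the mean
  stat_summary(fun = mean, geom = "errorbar",
               aes(ymin = after_stat(y), ymax = after_stat(y)),
               width = 0.5, linetype = "dashed", colour = "white")

print(p)
```

Because the line position is computed from the data, it stays correct when the data or the number of groups changes. On a pale fill a white line can be hard to see, so a darker colour may be worth trying.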

Resources