legend <- c("score" = "black", "answer" = "red")
plot <- df_l %>% ggplot(aes(date, score, color = "score")) + geom_line() +
geom_vline(aes(xintercept = getDate(df_all %>% filter(name == List[5])), color = "answer"), linetype = "dashed", size = 1,) +
scale_color_manual(name = "Legend", values = legend) +
scale_x_date(labels = date_format("%m/%y"), breaks = date_breaks("months")) +
theme(axis.text.x = element_text(angle=45)) +
labs(title = "", x = "", y = "", colors = "Legend")
I get the result above and could not figure out how to resolve the problem that in the legend always both lines are mixed up. One legend should of course show the slim black line only and the other the dashed black line. Thanks in advance!
The issue you have is that geom_vline results in a legend item that is a vertical line and geom_line gives you a horizontal line item. One solution is to create the legend kind of manually by specifying the color= aesthetic in geom_line... but not in geom_vline. You can then create a kind of "dummy" geom with geom_blank that serves as a holding object for the aesthetics of color=. You can then specify the colors for both of those items via scale_color_manual. Here's an example:
set.seed(12345)
df <- data.frame(x=1:100,y=rnorm(100))
ggplot(df, aes(x,y)) + theme_bw() +
geom_line(aes(color='score')) +
geom_vline(aes(xintercept=4), linetype=2, color='red', show.legend = FALSE) +
geom_blank(aes(color='my line')) +
scale_color_manual(name='Legend', values=c('my line'='red','score'='black'))
That creates the one legend for color... but unfortunately "my line" is solid red, when it should be dashed. To fix that, you just apply the linetype= aesthetic in the same way.
ggplot(df, aes(x,y)) + theme_bw() +
geom_line(aes(color='score', linetype='score')) +
geom_vline(aes(xintercept=4), linetype=2, color='red', show.legend = FALSE) +
geom_blank(aes(color='my line', linetype='my line')) +
scale_linetype_manual(name='Legend', values=c('my line'=2,'score'=1)) +
scale_color_manual(name='Legend', values=c('my line'='red','score'='black'))
I have a ggplot bar and don't know how to change the scale of the x axis. At the moment it looks like on the image below. However I'd like to reorder the scale of the x axis so that 21% bar is higher than the 7% bar. How could I get the % to the axis? Thanks in advance!
df= data.frame("number" = c(7,21), "name" = c("x","y"))
df
ggplot(df, aes(x=name, y=number)) +
geom_bar(stat="identity", fill = "blue") + xlab("Title") + ylab("Title") +
ggtitle("Title")
Use the prop.table function to in y variable in the geom plot.
ggplot(df, aes(x=name, y=100*prop.table(number))) +
geom_bar(stat="identity", fill = "blue") +
xlab("Stichprobe") + ylab("Paketmenge absolut") +
ggtitle("Menge total")
If you want to have the character, % in the y axis, you can add scale_y_continuous to the plot as below:
library(scales)
ggplot(df, aes(x=name, y=prop.table(number))) +
geom_bar(stat="identity", fill = "blue") +
xlab("Stichprobe") + ylab("Paketmenge absolut") +
ggtitle("Menge total") +
scale_y_continuous(labels=percent)
The only way I am able to duplicate the original plot is, as #sconfluentus noted, for the 7% and 21% to be character strings. As an aside the data frame column names need not be quoted.
df= data.frame(number = c('7%','21%'), name = c("x","y"))
df
ggplot(df, aes(x=name, y=number)) +
geom_bar(stat="identity", fill = "blue") + xlab("Title") + ylab("Title") +
ggtitle("Title")
Changing the numbers to c(0.07, 0.21) and adding, as #Mohanasundaram noted, scale_y_continuous(labels = scales::percent) corrects the situation:
To be pedantic using breaks = c(0.07, 0.21) creates nearly an exact duplicate. See also here.3
Hope this is helpful.
library(ggplot2)
library(scales)
df= data.frame(number = c(0.07,0.21), name = c("KG","MS"))
df
ggplot(df, aes(x=name, y=number)) +
geom_bar(stat="identity", fill = "blue") + xlab("Title") + ylab("Title") +
ggtitle("Title") + scale_y_continuous(labels = scales::percent, breaks = c(.07, .21)))
I am simply trying to show the breaks on the x axis of a plot (5,4,3,2,1,.5) but it will not show the .5. When I tried to code below, it resulted in not showing any x marks whatsoever and I don't know why.
labels <- c(5,4,3,2,1,.5)
ggplot(kobe_vs_kawhi, aes(desc(Time_Left), FG_Percentage, color = Player)) +
geom_point() +
geom_smooth() +
scale_x_continuous(breaks = labels) +
scale_color_manual(values = c("red4", "gold2"))
Difficult to answer without data but if you want all of those x-axis values displayed and in reverse try:
ggplot(kobe_vs_kawhi, aes(Time_Left, FG_Percentage, color = Player)) +
geom_point() +
geom_smooth() +
scale_x_reverse(breaks = labels) +
expand_limits(x = c(0, 5) +
scale_color_manual(values = c("red4", "gold2"))
Note that there is no need to sort Time_Left or for labels to be in reverse order using this approach.
I am creating a grouped boxplot with a scatterplot overlay using ggplot2. I would like to group each scatterplot datapoint with the grouped boxplot that it corresponds to.
However, I'd also like the scatterplot points to be different symbols. I seem to be able to get my scatterplot points to group with my grouped boxplots OR get my scatterplot points to be different symbols... but not both simultaneously. Below is some example code to illustrate what's happening:
library(scales)
library(ggplot2)
# Generates Data frame to plot
Gene <- c(rep("GeneA",24),rep("GeneB",24),rep("GeneC",24),rep("GeneD",24),rep("GeneE",24))
Clone <- c(rep(c("D1","D2","D3","D4","D5","D6"),20))
variable <- c(rep(c(rep("Day10",6),rep("Day20",6),rep("Day30",6),rep("Day40",6)),5))
value <- c(rnorm(24, mean = 0.5, sd = 0.5),rnorm(24, mean = 10, sd = 8),rnorm(24, mean = 1000, sd = 900),
rnorm(24, mean = 25000, sd = 9000), rnorm(24, mean = 8000, sd = 3000))
value <- sqrt(value*value)
Tdata <- cbind(Gene, Clone, variable)
Tdata <- data.frame(Tdata)
Tdata <- cbind(Tdata,value)
# Creates the Plot of All Data
# The below code groups the data exactly how I'd like but the scatter plot points are all the same shape
# and I'd like them to each have different shapes.
ln_clr <- "black"
bk_clr <- "white"
point_shapes <- c(0,15,1,16,2,17)
blue_cols <- c("#EFF2FB","#81BEF7","#0174DF","#0000FF","#0404B4")
lp1 <- ggplot(Tdata, aes(x=variable, y=value, fill=Gene)) +
stat_boxplot(geom ='errorbar', position = position_dodge(width = .83), width = 0.25,
size = 0.7, coef = 4) +
geom_boxplot( coef=1, outlier.shape = NA, position = position_dodge(width = .83), lwd = 0.3,
alpha = 1, colour = ln_clr) +
geom_point(position = position_jitterdodge(dodge.width = 0.83), size = 1.8, alpha = 0.7,
pch=15)
lp1 + scale_fill_manual(values = blue_cols) + labs(y = "Fold Change") +
expand_limits(y=c(0.01,10^5)) +
scale_y_log10(expand = c(0, 0), breaks = c(0.01,1,100,10000,100000),
labels = trans_format("log10", math_format(10^.x)))
ggsave("Scatter Grouped-Wrong Symbols.png")
#*************************************************************************************************************************************
# The below code doesn't group the scatterplot data how I'd like but the points each have different shapes
lp2 <- ggplot(Tdata, aes(x=variable, y=value, fill=Gene)) +
stat_boxplot(geom ='errorbar', position = position_dodge(width = .83), width = 0.25,
size = 0.7, coef = 4) +
geom_boxplot( coef=1, outlier.shape = NA, position = position_dodge(width = .83), lwd = 0.3,
alpha = 1, colour = ln_clr) +
geom_point(position = position_jitterdodge(dodge.width = 0.83), size = 1.8, alpha = 0.7,
aes(shape=Clone))
lp2 + scale_fill_manual(values = blue_cols) + labs(y = "Fold Change") +
expand_limits(y=c(0.01,10^5)) +
scale_y_log10(expand = c(0, 0), breaks = c(0.01,1,100,10000,100000),
labels = trans_format("log10", math_format(10^.x)))
ggsave("Scatter Ungrouped-Right Symbols.png")
If anyone has any suggestions I'd really appreciate it.
Thank you
Nathan
To get the boxplots to appear, the shape aesthetic needs to be inside geom_point, rather than in the main call to ggplot. The reason for this is that when the shape aesthetic is in the main ggplot call, it applies to all the geoms, including geom_boxplot. However, applying a shape=Clone aesthetic causes geom_boxplot to create a separate boxplot for each level of Clone. Since there's only one row of data for each combination of variable and Clone, no boxplot is produced.
That the shape aesthetic affects geom_boxplot seems counterintuitive to me, but maybe there's a reason for it that I'm not aware of. In any case, moving the shape aesthetic into geom_point solves the problem by applying the shape aesthetic only to geom_point.
Then, to get the points to appear with the correct boxplot, we need to group by Gene. I also added theme_classic to make it easier to see the plot (although it's still very busy):
ggplot(Tdata, aes(x=variable, y=value, fill=Gene)) +
stat_boxplot(geom ='errorbar', width=0.25, size=0.7, coef=4, position=position_dodge(0.85)) +
geom_boxplot(coef=1, outlier.shape=NA, lwd=0.3, alpha=1, colour=ln_clr, position=position_dodge(0.85)) +
geom_point(position=position_jitterdodge(dodge.width=0.85), size=1.8, alpha=0.7,
aes(shape=Clone, group=Gene)) +
scale_fill_manual(values=blue_cols) + labs(y="Fold Change") +
expand_limits(y=c(0.01,10^5)) +
scale_y_log10(expand=c(0, 0), breaks=10^(-2:5),
labels=trans_format("log10", math_format(10^.x))) +
theme_classic()
I think the plot would be easier to understand if you use faceting for Gene and the x-axis for variable. Putting time on the x-axis seems more intuitive, while using facetting frees up the color aesthetic for the points. With six different clones, it's still difficult (for me at least) to differentiate the point markers, but this looks cleaner to me than the previous version.
library(dplyr)
ggplot(Tdata %>% mutate(Gene=gsub("Gene","Gene ", Gene)),
aes(x=gsub("Day","",variable), y=value)) +
stat_boxplot(geom='errorbar', width=0.25, size=0.7, coef=4) +
geom_boxplot(coef=1, outlier.shape=NA, lwd=0.3, alpha=1, colour=ln_clr, width=0.5) +
geom_point(aes(fill=Clone), position=position_jitter(0.2), size=1.5, alpha=0.7, shape=21) +
theme_classic() +
facet_grid(. ~ Gene) +
labs(y = "Fold Change", x="Day") +
expand_limits(y=c(0.01,10^5)) +
scale_y_log10(expand=c(0, 0), breaks=10^(-2:5),
labels=trans_format("log10", math_format(10^.x)))
If you really need to keep the points, maybe it would be better to separate the boxplots and points with some manual dodging:
set.seed(10)
ggplot(Tdata %>% mutate(Day=as.numeric(substr(variable,4,5)),
Gene = gsub("Gene","Gene ", Gene)),
aes(x=Day - 2, y=value, group=Day)) +
stat_boxplot(geom ='errorbar', width=0.5, size=0.5, coef=4) +
geom_boxplot(coef=1, outlier.shape=NA, lwd=0.3, alpha=1, width=4) +
geom_point(aes(x=Day + 2, fill=Clone), size=1.5, alpha=0.7, shape=21,
position=position_jitter(width=1, height=0)) +
theme_classic() +
facet_grid(. ~ Gene) +
labs(y="Fold Change", x="Day") +
expand_limits(y=c(0.01,10^5)) +
scale_y_log10(expand=c(0, 0), breaks=10^(-2:5),
labels=trans_format("log10", math_format(10^.x)))
One more thing: For future reference, you can simplify your data creation code:
Gene = rep(paste0("Gene",LETTERS[1:5]), each=24)
Clone = rep(paste0("D",1:6), 20)
variable = rep(rep(paste0("Day", seq(10,40,10)), each=6), 5)
value = rnorm(24*5, mean=rep(c(0.5,10,1000,25000,8000), each=24),
sd=rep(c(0.5,8,900,9000,3000), each=24))
Tdata = data.frame(Gene, Clone, variable, value)