ggplot2: Adding another legend to a plot (two times) - r

I have the following data set that is used for plotting a bubble plot of frequencies.
Freq are frequencies at time 1
Freq.1 are frequencies at time 2
id names variable value Freq Freq.1
1 1 item1 1 13 11
2 2 item2 1 9 96
3 3 item1 2 10 28
4 4 item2 2 15 8
5 5 item1 3 9 80
6 6 item2 3 9 10
7 7 item1 4 11 89
8 8 item2 4 14 8
9 9 item1 5 3 97
10 10 item2 5 25 82
I am using the following code for plotting, and I do like the plot. However I am having some troubles with the legend that I explain below:
theme_nogrid <- function (base_size = 12, base_family = "") {
theme_bw(base_size = base_size, base_family = base_family) %+replace%
theme(panel.grid = element_blank())
}
plot1<- ggplot(Data, aes(x = variable, y = value, size = Freq, color=Freq.1))+
geom_point( aes(size = Freq, stat = "identity", position = "identity"),
shape = 19, color="black", alpha=0.5) +
geom_point( aes(size = Freq.1, stat = "identity", position = "identity"),
shape = 19, color="red", alpha=0.5) +
scale_size_continuous(name= "Frequencies ", range = c(2,30))+
theme_nogrid()
1- I would like to have two legends: one for color, the other one for size, but I can't find the right arguments to do it (I have consulted the guide and theme documentation and I can't solve my problem with my own ideas).
2- After having the two legends, I would like to increase the size of the legend shape so that it looks bigger (not the text, not the background, just the shape, and without actually changing the plot).
Here is an example of what I have and what I would like (that's an example from my real data). As you can see, it is almost impossible to distinguish the color in the first image.
Sorry if it's a newbie question, but I can't really find an example of that.
Thanks,
Angulo

Try something like this
library(ggplot2)
library(tidyr)
d <- gather(Data, type, freq, Freq, Freq.1)
ggplot(d, aes(x = variable, y = value))+
geom_point(aes(size = freq, colour = type), shape = 19, alpha = 0.5) +
scale_size_continuous(name = "Frequencies ", range = c(2, 30)) +
scale_colour_manual(values = c("red", "blue")) +
theme_nogrid() +
guides(colour = guide_legend(override.aes = list(size = 10)))
The last line will make the circles in the "colour" legend larger.
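For readers who want to run this without the question's theme_nogrid() helper, here is a minimal self-contained sketch of the same idea. It rebuilds the posted table by hand (only the four columns that the plot uses), reshapes it with tidyr::pivot_longer() (the current replacement for gather(), available from tidyr 1.0.0), and asks for the two legends explicitly. The legend title "Time", the labels "Time 1"/"Time 2" and the black/red colours are illustrative choices, not part of the original answer.

```r
library(ggplot2)
library(tidyr)

# Data typed in from the table in the question (id/names columns omitted)
Data <- data.frame(
  variable = rep(c("item1", "item2"), times = 5),
  value    = rep(1:5, each = 2),
  Freq     = c(13, 9, 10, 15, 9, 9, 11, 14, 3, 25),
  Freq.1   = c(11, 96, 28, 8, 80, 10, 89, 8, 97, 82)
)

# Long format: one row per bubble, with a column saying which time it is
d <- pivot_longer(Data, cols = c(Freq, Freq.1),
                  names_to = "type", values_to = "freq")

p <- ggplot(d, aes(x = variable, y = value)) +
  geom_point(aes(size = freq, colour = type), shape = 19, alpha = 0.5) +
  scale_size_continuous(name = "Frequencies", range = c(2, 30)) +
  scale_colour_manual(name = "Time",
                      values = c(Freq = "black", Freq.1 = "red"),
                      labels = c("Time 1", "Time 2")) +
  # override.aes enlarges only the legend keys, not the plotted points
  guides(colour = guide_legend(order = 1, override.aes = list(size = 10)),
         size   = guide_legend(order = 2)) +
  theme_bw() +
  theme(panel.grid = element_blank())

print(p)
```

Because colour and size are mapped to different variables, ggplot2 draws them as two separate legends; the order arguments only fix which one comes first.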

Related

How can I use the ggplot function to visualise grouped data?

I have a data set which has the time taken for individuals to read a sentence (response_time) under the experimental factors of the condition of the sentence (normal or visually degraded) and the number of cups of coffee (caffeine) that an individual has drunk. I want to visualise the data using ggplot, but with the data grouped according to the condition of the sentence and the coffee drunk - e.g. the response times recorded for individuals reading a normal sentence and having drunk one cup of coffee.
This is what I have tried so far, but the graph comes up as one big blob (not separated by group) and has over 15 warnings!!
participant condition response_time caffeine
<dbl> <fct> <dbl> <fct>
1 1 Normal 984 1
2 2 Normal 1005 1
3 3 Normal 979 3
4 4 Normal 1040 2
5 5 Normal 1008 2
6 6 Normal 979 3
tidied_data_2 %>%
ggplot(aes(x = condition:caffeine, y = response_time, colour = condition:caffeine)) +
geom_violin() +
geom_jitter(width = .1, alpha = .25) +
guides(colour = FALSE) +
stat_summary(fun.data = "mean_cl_boot", colour = "black") +
theme_minimal() +
theme(text = element_text(size = 13)) +
labs(x = "Condition X Caffeine", y = "Response Time (ms)")
Any suggestions on how to better code what I want would be great.
As a wiki answer because too long for a comment.
Not sure what you intend with condition:caffeine - I've never seen that syntax in ggplot. Try aes(x = as.character(caffeine), y = ..., color = as.character(caffeine)) instead (or, because caffeine is already a factor in your case, just use aes(x = caffeine, y = ..., color = caffeine)).
If your idea is to separate by condition, you could just use aes(x = caffeine, y = ..., color = condition), as the groups are going to be separated by x anyway.
On another note - why not plot an actual scatter plot, making this a proper two-dimensional graph? Suggestion below.
library(ggplot2)
library(dplyr)
tidied_data_2 <- read.table(text = "participant condition response_time caffeine
1 1 Normal 984 1
2 2 Normal 1005 1
3 3 Normal 979 3
4 4 Normal 1040 2
5 5 Normal 1008 2
6 6 Normal 979 3", header = TRUE)
tidied_data_2 %>%
ggplot(aes(x = as.character(caffeine), y = response_time, colour = as.character(caffeine))) +
## geom_violin does not make sense with so few observations
# geom_violin() +
## I've removed alpha so you can see the dots better
geom_jitter(width = .1) +
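## guides(colour = FALSE) is deprecated in recent ggplot2 releases; "none" is the replacement
guides(colour = "none") +
## mean_cl_boot needs the Hmisc package to be installed
stat_summary(fun.data = "mean_cl_boot", colour = "black") +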
theme_minimal() +
theme(text = element_text(size = 13)) +
labs(x = "Condition X Caffeine", y = "Response Time (ms)")
what I would rather do
tidied_data_2 %>%
## in this example as.integer(as.character(x)) is unnecessary, but it is necessary for your data sample
ggplot(aes(x = as.integer(as.character(caffeine)), y = response_time)) +
geom_jitter(width = .1) +
theme_minimal()
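If the goal is still one distribution per condition-by-caffeine cell, as the question describes, a sketch along the following lines may help. In base R, `:` applied to two factors returns their interaction, which is presumably what the question's condition:caffeine relied on; interaction() does the same thing more explicitly. The data below are simulated purely for illustration (the question only shows six rows), and the "Degraded" level name is an assumption.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for tidied_data_2: 10 observations per cell (made-up values)
toy <- data.frame(
  condition     = factor(rep(c("Normal", "Degraded"), each = 30)),
  caffeine      = factor(rep(1:3, times = 20)),
  response_time = round(rnorm(60, mean = 1000, sd = 30))
)

# Explicit cell label, equivalent in spirit to condition:caffeine
toy$cell <- interaction(toy$condition, toy$caffeine, sep = " x ")

p <- ggplot(toy, aes(x = caffeine, y = response_time, colour = condition)) +
  # dodge the violins so the two conditions sit side by side within each caffeine level
  geom_violin(position = position_dodge(width = 0.8)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.1,
                                             dodge.width = 0.8),
             alpha = 0.4) +
  theme_minimal() +
  labs(x = "Cups of coffee", y = "Response Time (ms)")

print(p)
```

Mapping caffeine to x and condition to colour keeps both factors readable on the axes and in the legend; mapping x = cell instead would reproduce the single combined axis from the question.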

Secondary axis in R not registering

ggplot(df) +
geom_bar(aes(x=Date, y=DCMTotalCV, fill=CampaignName), stat='identity', position='stack') +
geom_line(aes(x=Date, y=DCMCPA, color=CampaignName, group=as.factor(CampaignName)), na.rm = FALSE,show.legend=NA)+
scale_y_continuous(sec.axis = sec_axis(~./1000, name = "DCMTotalCV"))+
theme_bw()+
labs(
x= "Date",
y= "CPA",
title = "Daily Performance"
)
Hey everyone - so I have 2 y-axes I want to plot. geom_line is registering fine on the main y-axis but geom_bar is not registering properly on the right. I tried scaling but it's still not registering or plotting on that second axis. It looks like it's still appearing on the main y-axis, so I'm wondering how to tell the plot to draw it on the second one? Sorry, I'm kind of a newbie. Thanks!
data <- data.frame(
day = as.Date("2020-01-01"),
conversions = seq(1,6)^2,
cpa = 100000 / seq(1,6)^2
)
head(data)
str(data)
#plot
ggplot(data, aes(x=day)) +
geom_bar( aes(y=conversions), stat='identity') +
geom_line( aes(y=cpa)) +
scale_y_continuous(sec.axis = sec_axis(~./1000))
ggplot2::sec_axis is intended only to put up the scale itself; it does nothing to try to scale the values (that you are pairing with that axis). Why? Primarily because it knows nothing about which y variable you are intending to pair with which y-axis. (Is there anywhere in sec_axis to tell it that it should be looking at a particular variable? Nope.)
As a demonstration, let's start with some random data and plot the line.
set.seed(42)
dat <- data.frame(x = rep(1:10), y1 = sample(10), y2 = sample(100, size = 10))
dat
# x y1 y2
# 1 1 1 47
# 2 2 5 24
# 3 3 10 71
# 4 4 8 89
# 5 5 2 37
# 6 6 4 20
# 7 7 6 26
# 8 8 9 3
# 9 9 7 41
# 10 10 3 97
ggplot(dat, aes(x, y1)) +
geom_line() +
scale_y_continuous(name = "Oops!")
Now you determine that you want to add the y2 variable in there, but because its values are on a completely different scale, you think to just add them (I'll use geom_text here) and then set a second axis.
ggplot(dat, aes(x, y1)) +
geom_line() +
geom_text(aes(y = y2, label = y2)) +
scale_y_continuous(name = "Oops!", sec.axis = sec_axis(~ . * 10, name = "Quux!"))
Two things wrong with this:
The primary (left) y-axis now scales from 0 to 100, scrunching the primary y values to the bottom of the plot; and
Related, the secondary (right) y-axis scales from 0 to 1000?!? This is because the only thing that the secondary axis "knows" is the values that go into the primary axis ... and the primary axis is scaling to fit all of the y* variables it is told to plot.
That last point is important: this is giving y values that scale from 0 to 100, so the axis will reflect that. You can do lims(y=c(0,10)), but realize you'll be truncating y2 values ... that's not the right approach.
Instead, you need to scale the second values to be within the same range of values as the primary axis variable y1. Though not required, I'll use scales::rescale for this.
dat$y2scaled <- scales::rescale(dat$y2, range(dat$y1))
dat
# x y1 y2 y2scaled
# 1 1 1 47 5.212766
# 2 2 5 24 3.010638
# 3 3 10 71 7.510638
# 4 4 8 89 9.234043
# 5 5 2 37 4.255319
# 6 6 4 20 2.627660
# 7 7 6 26 3.202128
# 8 8 9 3 1.000000
# 9 9 7 41 4.638298
# 10 10 3 97 10.000000
Notice how y2scaled is now proportionately within y1's range?
We'll use that to position each of the text objects (though we'll still show the y2 as the label here).
ggplot(dat, aes(x, y1)) +
geom_line() +
geom_text(aes(y = y2scaled, label = y2)) +
scale_y_continuous(name = "Oops!", sec.axis = sec_axis(~ . * 10, name = "Quux!"))
Are we strictly required to make sure that the points pairing with the secondary axis perfectly fill the range of values of the primary axis? No. We could easily have thought to keep the text labels only on the bottom half of the plot, so we'd have to scale appropriately.
dat$y2scaled2 <- scales::rescale(dat$y2, range(dat$y1) / c(1, 2))
dat
# x y1 y2 y2scaled y2scaled2
# 1 1 1 47 5.212766 2.872340
# 2 2 5 24 3.010638 1.893617
# 3 3 10 71 7.510638 3.893617
# 4 4 8 89 9.234043 4.659574
# 5 5 2 37 4.255319 2.446809
# 6 6 4 20 2.627660 1.723404
# 7 7 6 26 3.202128 1.978723
# 8 8 9 3 1.000000 1.000000
# 9 9 7 41 4.638298 2.617021
# 10 10 3 97 10.000000 5.000000
ggplot(dat, aes(x, y1)) +
geom_line() +
geom_text(aes(y = y2scaled2, label = y2)) +
scale_y_continuous(name = "Oops!", sec.axis = sec_axis(~ . * 20, name = "Quux!"))
Notice that not only did I change how the y-axis values were scaled (now ranging from 1 to 5 in y2scaled2), but I also had to change the transformation within sec_axis to be *20 instead of *10.
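One caveat: multiplying by 10 or 20 is only approximately the inverse of scales::rescale, because rescale shifts the values as well as stretching them (in the printed data the smallest y2, 3, lands at 1, which the `* 10` axis would label as 10). A minimal sketch of an exact version, assuming a plain linear map, is to compute the slope and intercept yourself and hand sec_axis the algebraic inverse. The names a and b are just illustrative helpers.

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(x = 1:10, y1 = sample(10), y2 = sample(100, size = 10))

# Linear map from y2 onto the range of y1: scaled = a + b * y2
r1 <- range(dat$y1)
r2 <- range(dat$y2)
b  <- diff(r1) / diff(r2)
a  <- r1[1] - b * r2[1]
dat$y2scaled <- a + b * dat$y2

p <- ggplot(dat, aes(x, y1)) +
  geom_line() +
  geom_text(aes(y = y2scaled, label = y2)) +
  # the secondary axis applies the exact inverse, so its ticks match the labels
  scale_y_continuous(name = "y1",
                     sec.axis = sec_axis(~ (. - a) / b, name = "y2"))

print(p)
```

With this version each text label sits at the height that the right-hand axis reads as its own value.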
Sometimes getting these transformations correct can be confusing, and it is easy to mess them up. However ... realize that it took many years to even get this functionality into ggplot2, mostly due to the lead developers' belief that even when plotted well, secondary axes can be confusing to the viewer and can lead to misleading takeaways. I find that they can be useful sometimes, and there are techniques one can use to encourage correct interpretation, but ... it's hard to get right because it's easy to get wrong.
As an example of one technique that helps distinguish which axis goes with which data, see this:
ggplot(dat, aes(x, y1)) +
geom_line(color = "blue") +
geom_text(aes(y = y2scaled2, label = y2), color = "red") +
scale_y_continuous(name = "Oops!", sec.axis = sec_axis(~ . * 20, name = "Quux!")) +
theme(
axis.ticks.y.left = element_line(color = "blue"),
axis.text.y.left = element_text(color = "blue"),
axis.title.y.left = element_text(color = "blue"),
axis.ticks.y.right = element_line(color = "red"),
axis.text.y.right = element_text(color = "red"),
axis.title.y.right = element_text(color = "red")
)
(One might consider colors from viridis for a more color-blind-friendly palette.)
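Applied to the reprex from the question, the same idea might look like the sketch below. Two assumptions are made here: the days are spread over six consecutive dates (the posted code recycles a single date), and the bars are scaled by a single factor k so that zero stays at zero on both axes.

```r
library(ggplot2)

data <- data.frame(
  day         = as.Date("2020-01-01") + 0:5,  # assumption: six consecutive days
  conversions = seq(1, 6)^2,
  cpa         = 100000 / seq(1, 6)^2
)

# k converts conversions into CPA units so both series share the primary axis
k <- max(data$cpa) / max(data$conversions)

p <- ggplot(data, aes(x = day)) +
  geom_col(aes(y = conversions * k), fill = "grey70") +
  geom_line(aes(y = cpa)) +
  # the secondary axis undoes the scaling, so it reads in conversions
  scale_y_continuous(name = "CPA",
                     sec.axis = sec_axis(~ . / k, name = "Conversions"))

print(p)
```

The bars are still drawn against the primary axis; the right-hand axis is only a relabelling, which is why the data have to be multiplied by k before plotting.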

How to customise the colors in stacked bar charts

Maybe a question someone already asked.
I have a data frame (dat) that looks like this:
Sample perc cl
a 30 0
b 22 0
s 2 0
z 19 0
a 12 1
b 45 1
s 70 1
z 1 1
a 60 2
b 67 2
s 50 2
z 18 2
I would like to generate a stacked barplot. To do this I used the following:
g <- ggplot(dat, aes(x = cl, y = perc, fill = Sample))
g + geom_bar(stat = "identity", position = "fill", show.legend = FALSE) +
scale_fill_manual(name = "Samples", values = c("a" = "blue", "b" = "blue", "s" = "gray", "z" = "red"))
Fortunately the colors are assigned correctly. My point is that the order of samples in the bar is from a to z from the top to the bottom of the bar but I would like a situation in which the gray is on the top without loss of continuity in the bar from the blue to the red. Maybe there's another way to color the bars and set the desired order.
The groups are plotted in the bars in the order of the factor levels. You can change the plotting order by changing the order of the factor levels in your call to aes with factor(var, levels(var)[order]) like this:
library(ggplot2)
ggplot(dat, aes(x = cl, y = perc,
fill = factor(Sample, levels(Sample)[c(3,1,2,4)]))) +
geom_bar(stat="identity", position = "fill", show.legend = FALSE) +
scale_fill_manual(name = "Samples",
values=c("a"="blue","b" = "blue","s" = "gray","z" = "red"))
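Note that levels(Sample) only works when Sample is already a factor; since R 4.0.0, data.frame() and read.table() leave strings as character by default, in which case levels() returns NULL. A minimal sketch that avoids the issue is to set the level order by name before plotting. The data frame below is typed in from the question, and geom_col() is used as shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Data typed in from the table in the question
dat <- data.frame(
  Sample = rep(c("a", "b", "s", "z"), times = 3),
  perc   = c(30, 22, 2, 19, 12, 45, 70, 1, 60, 67, 50, 18),
  cl     = rep(0:2, each = 4)
)

# The first level is stacked on top, so put "s" (gray) first
dat$Sample <- factor(dat$Sample, levels = c("s", "a", "b", "z"))

p <- ggplot(dat, aes(x = cl, y = perc, fill = Sample)) +
  geom_col(position = "fill", show.legend = FALSE) +
  scale_fill_manual(name = "Samples",
                    values = c(a = "blue", b = "blue", s = "gray", z = "red"))

print(p)
```

Naming the levels explicitly is a little more robust than indexing with c(3, 1, 2, 4), because it does not depend on the alphabetical position of each sample.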

Left-align titles and subtitles with y-axis labels in ggplot2 graph in Shiny application [duplicate]

I would like to left align the title in a plot like this
ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
ggtitle("Unemployment in USA between 1967 and 2007") +
xlab("") +
ylab("Unemployed [thousands]")
First attempt
ggplot(data = economics, aes(x = date, y = unemploy)) + geom_line() +
ggtitle("Unemployment in USA for some years") +
xlab("") +
ylab("Unemployed [thousands]") +
theme(plot.title = element_text(hjust = -0.45, vjust = 2.12))
Yay success! But wait... there's more... now I want to change the title to something else.
ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
ggtitle("Unemployment in USA between 1967 and 2007") +
xlab("") +
ylab("Unemployed [thousands]") +
theme(plot.title = element_text(hjust = -0.45, vjust=2.12))
So now I need to adjust hjust... :(
The question
How can I make the title left justified (a couple of pixels left of the y axis label or so) over and over again without messing with the hjust value? Or what is the relationship between hjust and the length of the string?
I have tried to annotate manually according to this question, but then I got only the title, and nothing else for some reason - and an error.
Thank you!
This question refers to this github tidyverse/ggplot2 solved issue: https://github.com/tidyverse/ggplot2/issues/3252
And it is implemented in ggplot2 (development version): https://github.com/tidyverse/ggplot2/blob/15263f7580d6b5100989f7c1da5d2f5255e480f9/NEWS.md
Themes have gained two new parameters, plot.title.position and plot.caption.position, that can be used to customize how plot title/subtitle and plot caption are positioned relative to the overall plot (@clauswilke, #3252).
To follow your example as a reprex:
# First install the development version from GitHub:
#install.packages("devtools") #If required
#devtools::install_github("tidyverse/ggplot2")
library(ggplot2)
packageVersion("ggplot2")
#> [1] '3.2.1.9000'
ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
labs(x=NULL,
y="Unemployed [thousands]",
title = "Unemployment in USA for some years",
subtitle = "A subtitle possibly",
caption = "NOTE: Maybe a caption too in italics.") +
theme(plot.caption = element_text(hjust = 0, face= "italic"), #Default is hjust=1
plot.title.position = "plot", #NEW parameter. Apply for subtitle too.
plot.caption.position = "plot") #NEW parameter
Created on 2019-09-04 by the reprex package (v0.3.0)
Until someone comes up with a better solution, one way would be something like
library(ggplot2)
library(grid)
library(gridExtra)
p <- ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
labs(x = NULL, y = "Unemployed [thousands]", title = NULL)
title.grob <- textGrob(
label = "Unemployment in USA for some years",
x = unit(0, "lines"),
y = unit(0, "lines"),
hjust = 0, vjust = 0,
gp = gpar(fontsize = 16))
p1 <- arrangeGrob(p, top = title.grob)
grid.draw(p1)
You can manually adjust the layout of the ggplot output. First, we set up the basic plot:
library(ggplot2)
p <- ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
labs(title = "Unemployment in USA between 1967 and 2007",
x = NULL, y = "Unemployed [thousands]")
We can now convert the ggplot object into a gtable object, and inspect the layout of the elements in the plot. Notice that the title is in the fourth column of the grid, the same column as the main panel.
g <- ggplotGrob(p)
g$layout
# t l b r z clip name
# 17 1 1 10 7 0 on background
# 1 5 3 5 3 5 off spacer
# 2 6 3 6 3 7 off axis-l
# 3 7 3 7 3 3 off spacer
# 4 5 4 5 4 6 off axis-t
# 5 6 4 6 4 1 on panel
# 6 7 4 7 4 9 off axis-b
# 7 5 5 5 5 4 off spacer
# 8 6 5 6 5 8 off axis-r
# 9 7 5 7 5 2 off spacer
# 10 4 4 4 4 10 off xlab-t
# 11 8 4 8 4 11 off xlab-b
# 12 6 2 6 2 12 off ylab-l
# 13 6 6 6 6 13 off ylab-r
# 14 3 4 3 4 14 off subtitle
# 15 2 4 2 4 15 off title
# 16 9 4 9 4 16 off caption
To align the title with the left edge of the plot, we can change the l value to 1.
g$layout$l[g$layout$name == "title"] <- 1
Draw the modified grid:
grid::grid.draw(g)
Since the release of ggplot2 3.3.0 you can also use plot.title.position = "plot" to align the title and subtitle with the left side of the full plot, and plot.caption.position = "plot" to do the same for the caption.
ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
ggtitle("Unemployment in USA between 1967 and 2007") +
xlab("") +
ylab("Unemployed [thousands]") +
theme(plot.title.position = "plot")
I wrote the ggdraw() layer in cowplot specifically so that I could make annotations easily anywhere on a plot. It sets up a coordinate system that covers the entire plot area, not just the plot panel, and runs from 0 to 1 in both the x and y direction. Using this approach, it is easy to place your title wherever you want.
library(cowplot)
theme_set(theme_gray()) # revert to ggplot2 default theme
p <- ggplot(data = economics, aes(x = date, y = unemploy)) +
geom_line() +
ggtitle("") + # make space for title on the plot
xlab("") +
ylab("Unemployed [thousands]")
ggdraw(p) + draw_text("Unemployment in USA between 1967 and 2007",
x = 0.01, y = 0.98, hjust = 0, vjust = 1,
size = 12) # default font size is 14,
# which is too big for theme_gray()
Another way to do this is to use theme(). Use the labs function to set all your labels: x = for the x axis, y = for the y axis, title = for the plot title, and fill = or colour = if you have a legend that needs a title. Inside theme(), hjust = 0 left-justifies the plot title relative to the panel (not the y-axis label). In ggplot2 2.2.0 and later the title is left-justified by default, so hjust = 0 only matters if a theme has changed it; use hjust = 0.5 if you want the title centered.
labs(x = 'Sex', y = 'Age Mean', title = 'Suicide 2003-2013 Age Mean by Sex') +
theme(plot.title = element_text(family = 'Helvetica',
color = '#666666',
face = 'bold',
size = 18,
hjust = 0))

connect points in ggplot2 with different line type

"a" is a data frame.
set.seed(2)
a<-data.frame(group= rep(c("A","B","C"),each=4),factor=rep(c(1,1,2,2),3),
model=rep(c("old","new"),6),mean=runif(12),sd=runif(12)/10)
>a
group factor model mean sd
1 A 1 old 0.1848823 0.076051331
2 A 1 new 0.7023740 0.018082010
3 A 2 old 0.5733263 0.040528218
4 A 2 new 0.1680519 0.085354845
5 B 1 old 0.9438393 0.097639849
6 B 1 new 0.9434750 0.022582546
7 B 2 old 0.1291590 0.044480923
8 B 2 new 0.8334488 0.007497942
9 C 1 old 0.4680185 0.066189876
10 C 1 new 0.5499837 0.038754954
11 C 2 old 0.5526741 0.083688918
12 C 2 new 0.2388948 0.015050144
I want to draw a "mean±sd" line graph with ggplot2.
My aim is to draw a picture (x-axis is "group", y-axis is "mean±sd", different "factor" should have different color, different "model" should have different connecting line type (old model is dashed line, new model is solid line))
I use following code:
library("ggplot2")
pd <- position_dodge(0.1) #The errorbars overlapped, so use position_dodge to move
#them horizontally
plot<-ggplot(a, aes(x=group, y=mean, colour=as.factor(factor), linetype=model)) +
geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.1, position=pd) +
geom_point(position=pd, size=3, shape=21, fill="white") + # 21 is filled circle
xlab("Groups") +
ylab("Power") +
geom_line(position=pd) +
scale_linetype_manual(values = c(new = "solid", old = "dashed"))
but "mean±sd" should all be solid lines, and connection between points should be added. Actually I want it to be like this:
Can you give me some advice, thank you!
I'd suggest adding a new grouping variable:
a$group2 <- paste(a$factor, a$model, sep="_")
Then remove linetype from ggplot() and modify geom_line():
ggplot(a, aes(x = group, y = mean, colour = as.factor(factor))) +
geom_errorbar(aes(ymin = mean-sd, ymax = mean+sd),
width = .1, position = pd) +
geom_point(position = pd, size = 3, shape = 21, fill = "white") +
xlab("Groups") +
ylab("Power") +
geom_line(aes(x = group, y = mean, colour = as.factor(factor),
group = group2, linetype = model)) +
scale_linetype_manual(values = c(new = "solid", old = "dashed"))
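The answer's code reuses pd from the question and leaves geom_line() undodged, so the lines may not meet the dodged points exactly. A self-contained sketch that keeps everything aligned is to put group = group2 in the top-level aes() and pass the same position_dodge() to every layer. The dodge width of 0.3 is an arbitrary choice for illustration.

```r
library(ggplot2)

set.seed(2)
a <- data.frame(group  = rep(c("A", "B", "C"), each = 4),
                factor = rep(c(1, 1, 2, 2), 3),
                model  = rep(c("old", "new"), 6),
                mean   = runif(12),
                sd     = runif(12) / 10)

# One group per factor/model combination, so each combination gets its own line
a$group2 <- paste(a$factor, a$model, sep = "_")

pd <- position_dodge(0.3)  # shared by all layers so bars, lines and points line up

p <- ggplot(a, aes(x = group, y = mean,
                   colour = as.factor(factor), group = group2)) +
  # linetype is not mapped here, so the error bars stay solid
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = 0.1, position = pd) +
  geom_line(aes(linetype = model), position = pd) +
  geom_point(position = pd, size = 3, shape = 21, fill = "white") +
  scale_linetype_manual(values = c(new = "solid", old = "dashed")) +
  labs(x = "Groups", y = "Power", colour = "factor")

print(p)
```

Mapping linetype only inside geom_line() is what keeps the mean±sd bars solid while the connecting lines differ by model.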
