So I have a faceted graph, and I want to be able to add lines to it that change by each facet.
Here's the code:
p <- ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
facet_grid(.~cyl)+
scale_color_manual(values = c('red','green','blue'))+
geom_vline(xintercept = mean(mtcars$wt))
p
So my question is: how do I get each facet to show the mean of its own subset of the data, rather than the overall mean?
I hope that makes sense and appreciate your time regardless of your answering capability.
You can do this within the ggplot call by using stat_summaryh from the ggstance package. In the code below, I've also changed scale_colour_manual to scale_fill_manual on the assumption that you were trying to set the fill colors of the histogram bars:
library(tidyverse)
library(ggstance)
ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
stat_summaryh(fun.x=mean, geom="vline", aes(xintercept=..x.., y=0),
colour="grey40") +
facet_grid(.~cyl)+
scale_fill_manual(values = c('red','green','blue')) +
theme_bw()
Another option is to calculate the desired means within geom_vline (this is an implementation of the summary approach that @Ben suggested). In the code below, the . is a "pronoun" that refers to the data frame (mtcars in this case) that was fed into ggplot, and the pipeline after it is applied to that data frame when the layer is built:
ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
geom_vline(data = . %>% group_by(cyl) %>% summarise(wt=mean(wt)),
aes(xintercept=wt), colour="grey40") +
facet_grid(.~cyl)+
scale_fill_manual(values = c('red','green','blue')) +
theme_bw()
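If you prefer not to depend on ggstance or dplyr, the same idea can be sketched by computing the per-facet means up front and handing them to geom_vline as its own data. This is only a minimal sketch; the name cyl_means is made up for illustration. Because the small data frame keeps the cyl column, facet_grid places each line in the matching panel.

```r
library(ggplot2)

# One row per facet: mean weight for each cylinder count (base R, no dplyr)
cyl_means <- aggregate(wt ~ cyl, data = mtcars, FUN = mean)

p <- ggplot(mtcars, aes(x = wt)) +
  geom_histogram(bins = 20, aes(fill = factor(cyl))) +
  # the layer's own data carries the faceting variable cyl
  geom_vline(data = cyl_means, aes(xintercept = wt), colour = "grey40") +
  facet_grid(. ~ cyl) +
  scale_fill_manual(values = c("red", "green", "blue")) +
  theme_bw()
p
```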
I'm trying to add a second legend for six vertical lines that I've added to a ggplot2 line plot, each representing one of two events, instead of labelling the lines with the annotate function. I've tried this code, but I'm not having much luck:
p<- data %>%
ggplot(aes(variable, value)) +
geom_line(aes(color = size, group = size)) +
geom_vline(xintercept = c(1,9,11), linetype="dashed") +
geom_vline(xintercept = c(5,14,17), linetype="dotted") +
scale_linetype_manual(name = 'Events',
values = c('Closures' = 3,
'Opening' = 3))
A good way to do this could be to use mapping= to create the second legend in the same plot for you. The key is to use a different aesthetic for the vertical lines vs. your other lines and points.
First, the example data and plot:
library(ggplot2)
library(dplyr)
library(tidyr)
set.seed(8675309)
df <- data.frame(x=rep(1:25, 4), y=rnorm(100, 10, 2), lineColor=rep(paste0("col",1:4), 25))
base_plot <-
df %>%
ggplot(aes(x=x, y=y)) +
geom_line(aes(color=lineColor)) +
scale_color_brewer(palette="Greens")
base_plot
Now to add the vertical lines as OP has done in their question:
base_plot +
geom_vline(xintercept = c(1,9,11), linetype="dotted",
color = "red", size=1) +
geom_vline(xintercept = c(5,14,17), linetype="dotted",
color = "blue", size=1)
To add the legend, we will need to add another aesthetic in mapping=. When you use aes(), ggplot2 expects that mapping to contain the same number of observations as the dataset specified in data=. In this case, df has 100 observations, but we only need 6 lines. The simplest way around this would be to create a separate small dataset that's used for the vertical lines. This dataframe only needs to contain two columns: one for xintercept and the other that can be mapped to the linetype:
verticals <- data.frame(
intercepts=c(1,9,11,5,14,17),
Events=rep(c("Closure", "Opening"), each=3)
)
You can then use that in code with one geom_vline() call added to our base plot:
second_plot <- base_plot +
geom_vline(
data=verticals,
mapping=aes(xintercept=intercepts, linetype=Events),
size=0.8, color='blue',
key_glyph="path" # this makes the legend key horizontal lines, not vertical
)
second_plot
While this answer does not use color to differentiate the vertical lines, it's the most in line with the Grammar of Graphics principles that ggplot2 is built upon. Color is already used to distinguish the series in your data, so the vertical lines should be separated using a different aesthetic. I used some of these GG principles in putting together this answer (sorry, the color palette is crap though, lol): a sequential color scale for the data lines, and a single different color for the vertical lines. I also draw the vertical lines at a different size than the lines from df to set them apart further, and use linetype as the aesthetic that discriminates among the vertical lines.
You can use this general idea and apply the aesthetics as you see best for your own data.
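For instance, if you want to choose which linetype each event gets (as the question attempted with scale_linetype_manual), a named vector of values can be supplied once linetype is mapped inside aes(). The following is a rough sketch: the series data frame is a hypothetical stand-in for your data, and the dashed/dotted assignment is just an assumption about what you want.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the plotted series
series <- data.frame(x = 1:25, y = rnorm(25, 10, 2))

# One row per vertical line, with the event it represents
verticals <- data.frame(
  intercepts = c(1, 9, 11, 5, 14, 17),
  Events = rep(c("Closure", "Opening"), each = 3)
)

p <- ggplot(series, aes(x, y)) +
  geom_line() +
  geom_vline(
    data = verticals,
    mapping = aes(xintercept = intercepts, linetype = Events),
    key_glyph = "path"
  ) +
  # names must match the values found in the Events column
  scale_linetype_manual(
    name = "Events",
    values = c(Closure = "dashed", Opening = "dotted")
  )
p
```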
What about geom_label?
library(tidyverse)
data <- tribble(
~x, ~label,
1, "first",
1.5, "second",
5, "third"
)
data %>%
ggplot(aes(x = x, y = 1)) +
geom_vline(aes(xintercept = x)) +
geom_label(aes(label = label))
Created on 2021-12-13 by the reprex package (v2.0.1)
I have this code:
testPlot= ggplot(residFrame) +
geom_point(aes(x=STATEFP, y=total_diff, colour='total'), colour='red', shape=1) +
geom_point(aes(x=STATEFP, y=desalination_diff, colour='desalination'), colour='blue', shape=1) +
geom_point(aes(x=STATEFP, y=surfacewater_diff), colour='green', shape=1) +
geom_point(aes(x=STATEFP, y=groundwater_diff), colour='yellow', shape=1) +
xlab('STATEFP') + ylab('Difference') + ggtitle('Difference for all states', subtitle='For each source')
testPlot
And now I want to add a legend to testPlot that describes what the colours in the plot represent. I have searched the web, but cannot find the answer to this particular problem, can someone help me out here?
Thanks!
You should get the data into long format and then plot it, instead of calling geom_point multiple times. You have not provided an example of your data, but you can try the following:
library(ggplot2)
library(dplyr) # provides the %>% pipe used below
residFrame %>%
tidyr::pivot_longer(cols = ends_with('diff')) %>%
ggplot() + aes(STATEFP, value, color = name) +
geom_point(shape = 1) +
xlab('STATEFP') + ylab('Difference') +
ggtitle('Difference for all states', subtitle='For each source')
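To see what the reshaping step does, here is a small sketch on made-up data. The data frame resid_demo and its values are purely hypothetical stand-ins for residFrame. pivot_longer stacks the *_diff columns into a name column (the source) and a value column, which is what lets a single geom_point call produce a colour legend.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for residFrame, with two of the *_diff columns
resid_demo <- data.frame(
  STATEFP = 1:3,
  total_diff = c(0.5, -0.2, 0.1),
  groundwater_diff = c(0.1, 0.3, -0.4)
)

# Wide to long: one row per (state, source) pair
long_demo <- pivot_longer(resid_demo, cols = ends_with("diff"))
long_demo

# Mapping colour to the name column is what creates the legend
p <- ggplot(long_demo, aes(STATEFP, value, colour = name)) +
  geom_point(shape = 1)
p
```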
I am using ggplot to create two overlapping density from two different data frames. I need to create a legend for each of the densities.
I have been trying to follow these two posts, but still cannot get it to work:
How to add legend to plot with data from multiple data frames
ggplot legends when plot is built from two data frames
Sample code of what I am trying to do:
df1 = data.frame(x=rnorm(1000,0))
df2 = data.frame(y=rnorm(2500,0.5))
ggplot() +
geom_density(data=df1, aes(x=x), color='darkblue', fill='lightblue', alpha=0.5) +
geom_density(data=df2, aes(x=y), color='darkred', fill='indianred1', alpha=0.5) +
scale_color_manual('Legend Title', limits=c('x', 'y'), values = c('darkblue','darkred')) +
guides(colour = guide_legend(override.aes = list(pch = c(21, 21), fill = c('darkblue','darkred')))) +
theme(legend.position = 'bottom')
Is it possible to manually create a legend?
Or do I need to restructure the data as per this post?
Adding legend to ggplot made from multiple data frames with controlled colors
I'm newish to R so hoping to avoid stacking the data into a single dataframe if I can avoid it as they are weighted densities so have to multiply by different weights as well.
A legend is only drawn for aesthetics that are mapped inside aes(). Colours set outside aes() are fixed constants, so scale_color_manual has nothing to describe. To get what you are looking for, move color into aes() and give each density a label; scale_color_manual then maps those labels to whatever colours you like via values=.
library(tidyverse)
# df1 and df2 as defined in the question
ggplot() +
geom_density(data=df1, aes(x=x, color='x'), fill='lightblue', alpha=0.5) +
geom_density(data=df2, aes(x=y, color='y'), fill='indianred1', alpha=0.5) +
# a single colour scale: names are the labels used in aes(), values are the line colours
scale_color_manual('Legend Title', values = c(x='darkblue', y='darkred')) +
# make the legend keys show the same fills as the densities
guides(colour = guide_legend(override.aes = list(fill = c('lightblue','indianred1')))) +
theme(legend.position = 'bottom')
Created on 2020-08-09 by the reprex package (v0.3.0)
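On the question of restructuring: stacking the two data frames is not required, but it is not much work either. The sketch below shows one way it could look. The column names value and source are made up for illustration, and the weighting mentioned in the question is left out.

```r
library(ggplot2)

set.seed(42)
df1 <- data.frame(x = rnorm(1000, 0))
df2 <- data.frame(y = rnorm(2500, 0.5))

# Stack both samples into one long data frame, tagging each row with its origin
stacked <- rbind(
  data.frame(value = df1$x, source = "x"),
  data.frame(value = df2$y, source = "y")
)

p <- ggplot(stacked, aes(value, colour = source, fill = source)) +
  geom_density(alpha = 0.5) +
  # giving both scales the same title and labels merges them into one legend
  scale_colour_manual("Legend Title", values = c(x = "darkblue", y = "darkred")) +
  scale_fill_manual("Legend Title", values = c(x = "lightblue", y = "indianred1")) +
  theme(legend.position = "bottom")
p
```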
I am trying to modify the axes in ggplot2 so that it is one decimal point and has a label for every integer. However, I want to do it without an upper limit so that it will automatically adjust to data of different counts.
The difference between my question and the question posed here (that I was flagged as being a duplicate of) is that I need to make this work automatically for many different data sets, not just for a single one. It must automatically choose the upper limit instead of creating a fixed y-axis with breaks=(0,2,4,...). This question has been answered extremely well by @DidzisElferts below.
Here is my work:
library(data.table)
library(scales)
library(ggplot2)
mtcars <- data.table(mtcars)
mtcars$Cylinders <- as.factor(mtcars$cyl)
mtcars$Gears <- as.factor(mtcars$gear)
setkey(mtcars, Cylinders, Gears)
mtcars <- mtcars[CJ(unique(Cylinders), unique(Gears)), .N, allow.cartesian = TRUE]
ggplot(mtcars, aes(x=Cylinders, y = N, fill = Gears)) +
geom_bar(position="dodge", stat="identity") +
ylab("Count") + theme(legend.position="top") +
scale_x_discrete(drop = FALSE)
As you can see, ggplot2 is plotting the axes with a decimal point and doing it every 2.5 automatically. I'd like to change that. Any way to do so?
integer_breaks <- function(x)
seq(floor(min(x)), ceiling(max(x)))
ggplot(mtcars, aes(x=Cylinders, y = N, fill = Gears)) +
geom_bar(position="dodge", stat="identity") +
ylab("Count") + theme(legend.position="top") +
scale_y_continuous(breaks=integer_breaks) +
scale_x_discrete(drop = FALSE)
Use scale_y_continuous(breaks=c(0,2,4,6,8,10)). So your plotting code will look like:
ggplot(mtcars, aes(x=Cylinders, y = N, fill = Gears)) +
geom_bar(position="dodge", stat="identity") +
ylab("Count") + theme(legend.position="top") +
scale_y_continuous(breaks=c(0,2,4,6,8,10)) +
scale_x_discrete(drop = FALSE)
EDIT: Alternatively, you can use scale_y_continuous(breaks=seq(round(max(mtcars$N),0))) to automatically adjust the breaks to the maximum value of the y-variable. If you want the breaks to be more than 1 apart, you can use, for example, seq(from=0,to=round(max(mtcars$N),0),by=2).
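The two ideas above can be combined: when breaks is given a function, ggplot2 calls it with the axis limits, so a small function factory can produce integer breaks with any step without referring to a particular data set. This is a sketch only, and the name integer_breaks_by is hypothetical.

```r
# Returns a breaks function that yields whole numbers spaced `by` apart
integer_breaks_by <- function(by = 1) {
  function(x) seq(floor(min(x)), ceiling(max(x)), by = by)
}

# Called directly with example limits to show what ggplot2 would receive
integer_breaks_by()(c(0, 3.2))   # 0 1 2 3 4
integer_breaks_by(2)(c(0, 7.5))  # 0 2 4 6 8
```

It would be used as scale_y_continuous(breaks = integer_breaks_by(2)) in the plotting code above.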
Does anyone know how to create a scatterplot in R that looks like the ones produced by GraphPad Prism:
I tried using boxplots, but they don't display the data the way I want. The column scatterplots that GraphPad can generate show the data better for me.
Any suggestions would be appreciated.
As @smillig mentioned, you can achieve this using ggplot2. The code below reproduces the plot that you are after pretty well, but be warned: it is quite tricky. First load the ggplot2 package and generate some data:
library(ggplot2)
dd = data.frame(values=runif(21), type = c("Control", "Treated", "Treated + A"))
Next change the default theme:
theme_set(theme_bw())
Now we build the plot.
Construct a base object - nothing is plotted:
g = ggplot(dd, aes(type, values))
Add on the points: adjust the default jitter and change glyph according to type:
g = g + geom_jitter(aes(pch=type), position=position_jitter(width=0.1))
Add on the "box": calculate where the box ends. In this case, I've chosen the average value. If you don't want the box, just omit this step.
g = g + stat_summary(fun = mean, # fun.y was renamed to fun in ggplot2 3.3.0
geom="bar", fill="white", colour="black")
Add on some error bars: calculate the upper/lower bounds and adjust the bar width:
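# 95% confidence interval for the mean: t quantile (n - 1 df) times the standard error sd/sqrt(n)
g = g + stat_summary(
fun.max=function(i) mean(i) + qt(0.975, length(i) - 1)*sd(i)/sqrt(length(i)),
fun.min=function(i) mean(i) - qt(0.975, length(i) - 1)*sd(i)/sqrt(length(i)),
geom="errorbar", width=0.2)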
Display the plot
g
In my R code above I used stat_summary to calculate the values needed on the fly. You could also create separate data frames and use geom_errorbar and geom_bar.
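A rough sketch of that alternative is shown below. It assumes the error bars should show a t-based 95% confidence interval for the mean; the names ci_half and summ are made up for illustration.

```r
library(ggplot2)

set.seed(1)
dd <- data.frame(values = runif(21),
                 type = rep(c("Control", "Treated", "Treated + A"), 7))

# Half-width of a t-based 95% confidence interval for the mean (an assumption)
ci_half <- function(v) qt(0.975, length(v) - 1) * sd(v) / sqrt(length(v))

# Summarise once, outside the plot
avgs <- tapply(dd$values, dd$type, mean)
halves <- tapply(dd$values, dd$type, ci_half)
summ <- data.frame(type = names(avgs),
                   avg = as.numeric(avgs),
                   half = as.numeric(halves))

g <- ggplot() +
  # bars and error bars come from the summary frame
  geom_col(data = summ, aes(x = type, y = avg),
           fill = "white", colour = "black") +
  geom_errorbar(data = summ,
                aes(x = type, ymin = avg - half, ymax = avg + half),
                width = 0.2) +
  # raw points come from the original data
  geom_jitter(data = dd, aes(x = type, y = values, shape = type),
              width = 0.1, height = 0) +
  theme_bw()
g
```

Keeping the summary in its own data frame makes it easy to inspect or reuse the numbers that the bars and error bars are drawn from.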
To use base R, have a look at my answer to this question.
If you don't mind using the ggplot2 package, there's an easy way to make similar graphics with geom_boxplot and geom_jitter. Using the mtcars example data:
library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot() + geom_jitter() + theme_bw()
which produces the following graphic:
The documentation can be seen here: http://had.co.nz/ggplot2/geom_boxplot.html
I recently faced the same problem and found my own solution, using ggplot2.
As an example, I created a subset of the chickwts dataset.
library(ggplot2)
library(dplyr)
data(chickwts)
Dataset <- chickwts %>%
filter(feed == "sunflower" | feed == "soybean")
Since it is not possible to change the dots to symbols in geom_dotplot(), I used geom_jitter() as follows:
Dataset %>%
ggplot(aes(feed, weight, fill = feed)) +
geom_jitter(aes(shape = feed, col = feed), size = 2.5, width = 0.1)+
stat_summary(fun = mean, geom = "crossbar", width = 0.7,
col = c("#9E0142","#3288BD")) +
scale_fill_manual(values = c("#9E0142","#3288BD")) +
scale_colour_manual(values = c("#9E0142","#3288BD")) +
theme_bw()
This is the final plot:
For more details, you can have a look at this post:
http://withheadintheclouds1.blogspot.com/2021/04/building-dot-plot-in-r-similar-to-those.html?m=1