create a manual legend for density plots in R (ggplot2) - r

I want to add a legend to my graph. All solutions I found online use scale_color_manual - but it's not working for me. Where is the legend?
Here is my code:
library(ggplot2)
ggplot() +
geom_density(aes(x = rnorm(100)), color = 'red') +
geom_density(aes(x = rnorm(100)), color = 'blue') +
xlab("Age") + ylab("Density") + ggtitle('Age Densities')
theme(legend.position = 'right') +
scale_color_manual(labels = c('first', 'second'), values = c('red', 'blue'))

If for some reason you absolutely need the two geoms to take on different data sources, move the color = XXX portion inside aes() for each, then define the colors manually using a named vector:
ggplot() +
geom_density(aes(x = rnorm(100), color = 'first')) +
geom_density(aes(x = rnorm(100), color = 'second')) +
xlab("Age") + ylab("Density") + ggtitle('Age Densities') +
theme(legend.position = 'right') +
scale_color_manual(values = c('first' = 'red', 'second' = 'blue'))

Your data are not formatted correctly and you are basically creating two separate plots on a common "canvas", please see the code below (creation of the df is the crucial part):
library(ggplot2)
df <- data.frame(
x = c(rnorm(100), runif(100)),
col = c(rep('blue', 100), rep('red', 100))
)
ggplot(df) +
geom_density(aes(x = x, color = col)) +
xlab("Age") + ylab("Density") + ggtitle('Age Densities') +
theme(legend.position = 'right') +
scale_color_manual(labels = c('first', 'second'), values = c('red', 'blue'))

Related

adding a label in geom_line in R

I have two very similar plots, which have two y-axis - a bar plot and a line plot:
code:
sec_plot <- ggplot(data, aes_string (x = year, group = 1)) +
geom_col(aes_string(y = frequency), fill = "orange", alpha = 0.5) +
geom_line(aes(y = severity))
However, there are no labels. I want to get a label for the barplot as well as a label for the line plot, something like:
How can I add the labels to the plot, if there is only pone single group? is there a way to specify this manually? Until know I have only found option where the labels can be added by specifying them in the aes
EXTENSION (added a posterior):
getSecPlot <- function(data, xvar, yvar, yvarsec, groupvar){
if ("agegroup" %in% xvar) xvar <- get("agegroup")
# data <- data[, startYear:= as.numeric(startYear)]
data <- data[!claims == 0][, ':=' (scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
param = max(get(yvar))/max(get(yvarsec)))]
param <- data[1, param] # important, otherwise not found in ggplot
sec_plot <- ggplot(data, aes_string (x = xvar, group = groupvar)) +
geom_col(aes_string(y = yvar, fill = groupvar, alpha = 0.5), position = "dodge") +
geom_line(aes(y = scaled, color = gender)) +
scale_y_continuous(sec.axis = sec_axis(~./(param), name = paste0("average ", yvarsec),labels = function(x) format(x, big.mark = " ", scientific = FALSE))) +
labs(y = paste0("total ", yvar)) +
scale_alpha(guide = 'none') +
theme_pubclean() +
theme(legend.title=element_blank(), legend.background = element_rect(fill = "white"))
}
plot.ExposureYearly <- getSecPlot(freqSevDataAge, xvar = "agegroup", yvar = "exposure", yvarsec = "frequency", groupvar = "gender")
plot.ExposureYearly
How can the same be done on a plot where both the line plot as well as the bar plot are separated by gender?
Here is a possible solution. The method I used was to move the color and fill inside the aes and then use scale_*_identity to create and format the legends.
Also, I needed to add a scaling factor for severity axis since ggplot does not handle the secondary axis well.
data<-data.frame(year= 2000:2005, frequency=3:8, severity=as.integer(runif(6, 4000, 8000)))
library(ggplot2)
library(scales)
sec_plot <- ggplot(data, aes(x = year)) +
geom_col(aes(y = frequency, fill = "orange"), alpha = 0.6) +
geom_line(aes(y = severity/1000, color = "black")) +
scale_fill_identity(guide = "legend", label="Claim frequency (Number of paid claims per 100 Insured exposure)", name=NULL) +
scale_color_identity(guide = "legend", label="Claim Severity (Average insurance payment per claim)", name=NULL) +
theme(legend.position = "bottom") +
scale_y_continuous(sec.axis =sec_axis( ~ . *1, labels = label_dollar(scale=1000), name="Severity") ) + #formats the 2nd axis
guides(fill = guide_legend(order = 1), color = guide_legend(order = 2)) #control which scale plots first
sec_plot

Why do two legends appear when manually editing in ggplot2?

I want to plot two lines, one solid and another one dotted, both with different colors. I'm having trouble dealing with the legends for this plot. Take this example:
library(ggplot2)
library(reshape2)
df = data.frame(time = 0:127,
mean_clustered = rnorm(128),
mean_true = rnorm(128)
)
test_data_long <- melt(df, id="time") # convert to long format
p = ggplot(data=test_data_long,
aes(x=time, y=value, colour=variable)) +
geom_line(aes(linetype=variable)) +
labs(title = "", x = "Muestras", y = "Amplitud", color = "Spike promedio\n") +
scale_color_manual(labels = c("Hallado", "Real"), values = c("blue", "red")) +
xlim(0, 127)
print(p)
Two legends appear, and on top of it, none of them is correct (the one with the right colors has wrong line styles, and the one with the right line styles has all other things wrong).
Why is this happening and how can I get the right legend to appear?
You need to ensure all the aesthetic mappings match between the different aesthetics you're using:
library(ggplot2)
library(reshape2)
data.frame(
time = 0:127,
mean_clustered = rnorm(128),
mean_true = rnorm(128)
) -> xdf
test_data_long <- melt(xdf, id = "time")
ggplot(
data = test_data_long,
aes(x = time, y = value, colour = variable)
) +
geom_line(aes(linetype = variable)) +
scale_color_manual(
name = "Spike promedio\n", labels = c("Hallado", "Real"), values = c("blue", "red")
) +
scale_linetype(
name = "Spike promedio\n", labels = c("Hallado", "Real")
) +
labs(
x = "Muestras", y = "Amplitud", title = ""
) +
xlim(0, 127)
Might I suggest also using theme parameters to adjust the legend title:
ggplot(data = test_data_long, aes(x = time, y = value, colour = variable)) +
geom_line(aes(linetype = variable)) +
scale_x_continuous(name = "Muestras", limits = c(0, 127)) +
scale_y_continuous(name = "Amplitud") +
scale_color_manual(name = "Spike promedio", labels = c("Hallado", "Real"), values = c("blue", "red")) +
scale_linetype(name = "Spike promedio", labels = c("Hallado", "Real")) +
labs(title = "") +
theme(legend.title = element_text(margin = margin(b=15)))

Aesthetics must be either length 1 or the same as the data (1): x, y, label

I'm working on some data on party polarization (something like this) and used geom_dumbbell from ggalt and ggplot2. I keep getting the same aes error and other solutions in the forum did not address this as effectively. This is my sample data.
df <- data_frame(policy=c("Not enough restrictions on gun ownership", "Climate change is an immediate threat", "Abortion should be illegal"),
Democrats=c(0.54, 0.82, 0.30),
Republicans=c(0.23, 0.38, 0.40),
diff=sprintf("+%d", as.integer((Democrats-Republicans)*100)))
I wanted to keep order of the plot, so converted policy to factor and wanted % to be shown only on the first line.
df <- arrange(df, desc(diff))
df$policy <- factor(df$policy, levels=rev(df$policy))
percent_first <- function(x) {
x <- sprintf("%d%%", round(x*100))
x[2:length(x)] <- sub("%$", "", x[2:length(x)])
x
}
Then I used ggplot that rendered something close to what I wanted.
gg2 <- ggplot()
gg2 <- gg + geom_segment(data = df, aes(y=country, yend=country, x=0, xend=1), color = "#b2b2b2", size = 0.15)
# making the dumbbell
gg2 <- gg + geom_dumbbell(data=df, aes(y=country, x=Democrats, xend=Republicans),
size=1.5, color = "#B2B2B2", point.size.l=3, point.size.r=3,
point.color.l = "#9FB059", point.color.r = "#EDAE52")
I then wanted the dumbbell to read Democrat and Republican on top to label the two points (like this). This is where I get the error.
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Democrats, y=country, label="Democrats"),
color="#9fb059", size=3, vjust=-2, fontface="bold", family="Calibri")
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Republicans, y=country, label="Republicans"),
color="#edae52", size=3, vjust=-2, fontface="bold", family="Calibri")
Any thoughts on what I might be doing wrong?
I think it would be easier to build your own "dumbbells" with geom_segment() and geom_point(). Working with your df and changing the variable refences "country" to "policy":
library(tidyverse)
# gather data into long form to make ggplot happy
df2 <- gather(df,"party", "value", Democrats:Republicans)
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
# our dumbell
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
# the text labels
geom_text(aes(label = party), vjust = -1.5) + # use vjust to shift text up to no overlap
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) + # named vector to map colors to values in df2
scale_x_continuous(limits = c(0,1), labels = scales::percent) # use library(scales) nice math instead of pasting
Produces this plot:
Which has some overlapping labels. I think you could avoid that if you use just the first letter of party like this:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(aes(label = gsub("^(\\D).*", "\\1", party)), vjust = -1.5) + # just the first letter instead
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red"),
guide = "none") +
scale_x_continuous(limits = c(0,1), labels = scales::percent)
Only label the top issue with names:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(data = filter(df2, policy == "Not enough restrictions on gun ownership"),
aes(label = party), vjust = -1.5) +
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) +
scale_x_continuous(limits = c(0,1), labels = scales::percent)

Add item to legend by theme options

I have a data frame d like this:
d <- data.frame("name" = c("pippo","pluto","paperino"),
"id" = c(1,2,3),"count" = c(10,20,30),
"pvalue"=c(0.01,0.02,0.05),
geneRatio=c(0.5,0.8,0.2),
type=c("KEGG","Reactome","Reactome"))
and I plot a dotplot using the library ggplot:
ggplot(data = d,aes(geneRatio,name,size=count,colour = pvalue)) +
geom_point()+
ggtitle("Significantly Pathways") +
xlab("Gene Ratio") +
ylab("Pathways")+
theme(axis.text.y = element_text(color=d$type))
This is the plot at the moment
I would like to add to legend the information of "type" contained in dataframe d.
I would like to have a new item in the legend with color red = Reactome and color black= KEGG
Not saying that this is a good idea, but you can add a nonsensical geom to force the adding of a guide:
d <- data.frame("name" = c("pippo","pluto","paperino"),
"id" = c(1,2,3),
"count" = c(10,20,30),
"value"=c(0.01,0.02,0.05),
geneRatio=c(0.5,0.8,0.2),
type=c("KEGG","Reactome","Reactome")
)
library(ggplot2)
ggplot(data = d, aes(geneRatio,name,colour = pvalue)) +
geom_point(aes(size=count))+
geom_polygon(aes(geneRatio,name,fill = type)) +
ggtitle("Significantly Pathways") +
xlab("Gene Ratio") +
ylab("Pathways") +
scale_fill_manual(values = c('Reactome'='red', 'KEGG'='black')) +
theme(axis.text.y = element_text(color=d$type))
geom_polygon may not work with your actual data, and you may not find a suitable 'nonsensical' geom. I agree with #zx8754, a facet would be clearer:
ggplot(data = d, aes(geneRatio,name,colour = pvalue)) +
geom_point(aes(size=count)) +
ggtitle("Significantly Pathways") +
xlab("Gene Ratio") +
ylab("Pathways") +
facet_grid(type ~ ., scales = 'free_y', switch = 'y')
You could accomplish this using annotate, but it is a bit manual.
ggplot(data = d, aes(geneRatio, name, size = count, colour = pvalue)) +
geom_point() +
ggtitle("Significantly Pathways") +
xlab("Gene Ratio") +
ylab("Pathways")+
theme(axis.text.y = element_text(color=d$type)) +
annotate("text", x = 0.25, y = 3.5, label = "Reactome", color = "red") +
annotate("text", x = 0.25, y = 3.4, label = "KEGG", color = "black")

Only display specific labels of a ggplot legend

I have some data points, and I want to single out some of the points in my visualization. I would do it like this:
df = data.frame(
x = 1:4, y = 1:4,
special = c('normal', 'normal', 'normal', 'special')
)
ggplot(df) +
geom_point(aes(x, y, color = special)) +
scale_color_manual(values = c('red', 'black')) +
labs(color = "") +
theme_bw()
My issue here is that the black points are very self explanatory and don't need a label. I want just the red "special" label to appear. Is there a way I can hide the "normal" label?
If you are open to having any color other than red:
ggplot(df) +
geom_point(aes(x, y, color = special)) + scale_size(guide = "none") +
scale_color_discrete(breaks="special") + labs(color = "") +
theme_bw()
EDIT :
cols <- c("normal" = "black","special" = "red")
gg <- ggplot(df) + geom_point(aes(x, y, color = special)) + labs(color = "") + theme_bw()
gg + scale_colour_manual(values = cols, limits = "special")

Resources