How to delete group of dot points in geom_point ggplot - r

I have a problem with my plot. I want to show points only for group A, not for every name. Here is an example:
name <- c("a","b","c","d")
df <- data.frame(id = rep(1:5,3),
value = c(seq(50,58,2),seq(60,68,2),seq(70,78,2)),
name = c(rep("A",5),rep("B",5),rep("C",5)),
type = rep(c("a","b","c","d","r"),3))
df$name <- factor(df$name, levels = c("C","B","A"),ordered = TRUE)
ggplot(df, aes(id, value, fill = name,color = type))+
geom_area( position = 'identity', linetype = 1, size = 1 ,colour="black") +
geom_point(size = 8)+
guides(fill = guide_legend(override.aes = list(colour = NULL, shape = NA)))

If I am reading the question correctly, it seems that you want dots for the blue area only. In that case, you could subset the data and use it for geom_point.
ggplot(df, aes(id, value, fill = name,color = type))+
geom_area( position = 'identity', linetype = 1, size = 1 ,colour="black") +
geom_point(data = subset(df, name == "A"), size = 8) +
guides(fill = guide_legend(override.aes = list(colour = NULL, shape = NA)))
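As a variation on the subsetting idea above, the data argument of a layer can also be a function that receives the plot data, so the filter is written only once. This is a minimal sketch rather than the original poster's code: it maps colour only in the point layer and drops the guides() call for brevity, and layer_data() is used just to check what the point layer received.

```r
library(ggplot2)

df <- data.frame(
  id    = rep(1:5, 3),
  value = c(seq(50, 58, 2), seq(60, 68, 2), seq(70, 78, 2)),
  name  = factor(rep(c("A", "B", "C"), each = 5), levels = c("C", "B", "A")),
  type  = rep(c("a", "b", "c", "d", "r"), 3)
)

p <- ggplot(df, aes(id, value, fill = name)) +
  geom_area(position = "identity", colour = "black") +
  # The function is called with the plot data; only group A reaches this layer
  geom_point(data = function(d) subset(d, name == "A"),
             aes(colour = type), size = 8)

# Number of rows the point layer (layer 2) actually draws
nrow(layer_data(p, 2))
```

The areas still use the full data frame, so all three groups stay in the fill legend.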

Related

Add a legend to geom_point overlaid on geom_boxplot

So I create a boxplot of data and then add a set point over that data. I want my legend to capture what the data type of the geom_points represents. Thanks!
ggplot(data = NULL) +
geom_boxplot(data = discuss_impact_by_county,
aes(x=reorder(State,discuss, FUN = median),y=discuss),
outlier.shape = NA) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "States") +
geom_point(data = by_state,
aes(x = State, y = discuss_happen_difference),
col = "red",
size = 3,
show.legend = TRUE)
If you want a legend you have to map on aesthetics. In your case, map something on the color aesthetic, i.e. move col = "red" into aes() and use scale_color_manual to set the color value and the legend label assigned to the color label "red".
As you have only one "category" of points, you can simply use scale_color_manual(values = "red", labels = "We are red points") to set the color and label. If you have multiple groups of points with different colors, it's best to use named vectors to assign the colors and legend labels to the right color labels, i.e. use scale_color_manual(values = c(red = "red"), labels = c(red = "We are red points")).
Using some random example data try this:
library(ggplot2)
library(dplyr)
set.seed(42)
discuss_impact_by_county <- data.frame(
State = sample(LETTERS[1:4], 100, replace = TRUE),
discuss = runif(100, 1, 5)
)
by_state <- discuss_impact_by_county %>%
group_by(State) %>%
summarise(discuss_happen_difference = mean(discuss))
#> `summarise()` ungrouping output (override with `.groups` argument)
ggplot(data = NULL) +
geom_boxplot(data = discuss_impact_by_county,
aes(x=reorder(State,discuss, FUN = median),y=discuss),
outlier.shape = NA) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "States") +
geom_point(data = by_state,
aes(x = State, y = discuss_happen_difference, col = "red_points"),
size = 3,
show.legend = TRUE) +
scale_color_manual(values = "red", labels = "We are red points")
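For the multi-category case mentioned above, here is a small sketch with two point layers. The color labels "mean" and "median", the legend texts and the helper data frames are made up for illustration and are not part of the question.

```r
library(ggplot2)

set.seed(42)
raw <- data.frame(
  State   = sample(LETTERS[1:4], 100, replace = TRUE),
  discuss = runif(100, 1, 5)
)
# Hypothetical per-state summaries, one data frame per point layer
means   <- aggregate(discuss ~ State, data = raw, FUN = mean)
medians <- aggregate(discuss ~ State, data = raw, FUN = median)

p <- ggplot(raw, aes(State, discuss)) +
  geom_boxplot(outlier.shape = NA) +
  # The strings inside aes() are color labels, not colors
  geom_point(data = means,   aes(colour = "mean"),   size = 3) +
  geom_point(data = medians, aes(colour = "median"), size = 3) +
  scale_colour_manual(
    name   = NULL,
    values = c(mean = "red", median = "blue"),
    labels = c(mean = "State mean", median = "State median")
  )
```

Because both vectors are named, the order in which the layers are added does not affect which color or label a category gets.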

How to apply position_dodge to geom_point and geom_text in the same plot?

I would like the geom_text labels inside the geom_point markers to follow the repositioning when position_dodge is applied. That is, I would like to go from the code below:
Q <- as_tibble(data.frame(series = rep(c("diax","diay"),3),
value = c(3.25,3.30,3.31,3.36,3.38,3.42),
year = c(2018,2018,2019,2019,2020,2020))) %>%
select(year, series, value)
ggplot(data = Q, mapping = aes(x = year, y = value, color = series, label = sprintf("%.2f",value))) +
geom_point(size = 13) +
geom_text(vjust = 0.4,color = "white", size = 4, fontface = "bold", show.legend = FALSE)
which produces the following chart:
to the following change:
ggplot(data = Q, mapping = aes(x = year, y = value, color = series, label = sprintf("%.2f",value))) +
geom_point(size = 13, position = position_dodge(width = 1)) +
geom_text(position = position_dodge(width = 1), vjust = 0.4,
color = "white", size = 4, fontface = "bold",
show.legend = FALSE)
which produces a chart in which the points are dodged but the labels stay at their original positions.
The curious thing is that exactly the same change works fine if I switch from geom_point to geom_bar:
ggplot(Q, aes(year, value, fill = factor(series), label = sprintf("%.2f",value))) +
geom_bar(stat = "identity", position = position_dodge(width = 1)) +
geom_text(color = "black", size = 4,fontface= "bold",
position = position_dodge(width = 1), vjust = 0.4, show.legend = FALSE)
This happens because the dodging is based on the group aesthetic, which in this case is set automatically to series because of the mapping to color. The issue is that the text layer has its own constant color ("white"), so the color mapping, and with it the implicit grouping, is dropped for that layer. Set the grouping manually, and all is good:
ggplot(Q, aes(x = year, y = value, color = series, label = sprintf("%.2f",value), group = series)) +
geom_point(size = 13, position = position_dodge(width = 1)) +
geom_text(position = position_dodge(width = 1), vjust = 0.4, color = "white", size = 4,
fontface = "bold", show.legend = FALSE)
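To see the effect of the grouping without looking at the picture, one can compare the x positions that each text layer ends up with. This is only a sketch for inspection: it uses a plain data.frame instead of the tibble from the question, and layer_data() to pull the computed positions out of each plot.

```r
library(ggplot2)

Q <- data.frame(
  series = rep(c("diax", "diay"), 3),
  value  = c(3.25, 3.30, 3.31, 3.36, 3.38, 3.42),
  year   = rep(c(2018, 2019, 2020), each = 2)
)

base  <- ggplot(Q, aes(year, value, colour = series,
                       label = sprintf("%.2f", value)))
dodge <- position_dodge(width = 1)

# Constant colour and no explicit group: the layer has a single group
no_group   <- base + geom_text(colour = "white", position = dodge)
# Same layer with the group set explicitly
with_group <- base + geom_text(aes(group = series), colour = "white",
                               position = dodge)

layer_data(no_group)$x    # expected to stay at the year values
layer_data(with_group)$x  # expected to be shifted left and right of each year
```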
One workaround is the following. Since geom_text() does not land on top of the dodged points directly, you can take a detour. I first created a temporary plot with geom_point(). Then I accessed the data frame that ggplot uses to draw it (via ggplot_build()), which contains the dodged x and y positions. From these I created a new data frame called temp that holds the positions and the labels, and used temp to draw the expected outcome. Make sure to use inherit.aes = FALSE in geom_text(), since it uses a different data frame.
library(dplyr)
library(ggplot2)
g <- ggplot(data = Q, aes(x = year, y = value, color = series)) +
geom_point(size = 13, position = position_dodge(width = 1))
temp <- as.data.frame(ggplot_build(g)$data) %>%
select(x, y) %>%
arrange(x) %>%
mutate(label = sprintf("%.2f",Q$value))
ggplot(data = Q, aes(x = year, y = value, color = series)) +
geom_point(size = 13, position = position_dodge(width = 1)) +
geom_text(data = temp, aes(x = x, y = y, label = label),
color = "white", inherit.aes = FALSE)

R ggplot: Apply label only to last N data points in plot

I have created a line chart (plot) in R with labels on each data point. Due to the large number of data points, the plot becomes very crowded with labels. I would like to apply the labels only to the last N (say 4) data points. I have tried subset and tail in the geom_label_repel function but could not figure them out or got an error message. My data set consists of 99 values, spread over 3 groups (KPI).
I have the following code in R:
library(ggplot2)
library(ggrepel)
data.trend <- read.csv(file=....)
plot.line <- ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
geom_line(aes(group = KPI), size = 1) +
geom_point(size = 2.5) +
# Labels defined here
geom_label_repel(
aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
box.padding = unit(0.35, "lines"),
point.padding = unit(0.4, "lines"),
segment.color = 'grey50',
show.legend = FALSE
)
In all fairness, I am quite new to R. Maybe I am missing something basic.
Thanks in advance.
The simplest approach is to set the data = parameter in geom_label_repel to only include the points you want labeled.
Here's a reproducible example:
set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
group = sample(1:2,25,T),
KPI = sample(1:2,25,T))
ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
geom_line(aes(group = KPI), size = 1) +
geom_point(size = 2.5) +
geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
data = tail(data.trend, 4),
box.padding = unit(0.35, "lines"),
point.padding = unit(0.4, "lines"),
segment.color = 'grey50',
show.legend = FALSE)
Unfortunately, this messes slightly with the repel algorithm, making the label placement suboptimal with respect to the other points which are not labelled (you can see in the above figure that some points get covered by labels).
So, a better approach is to use color and fill to simply make the unwanted labels invisible (by setting both color and fill to NA for labels you want to hide):
ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
geom_line(aes(group = KPI), size = 1) +
geom_point(size = 2.5) +
geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
box.padding = unit(0.35, "lines"),
point.padding = unit(0.4, "lines"),
show.legend = FALSE,
color = c(rep(NA,21), rep('grey50',4)),
fill = c(rep(NA,21), rep('lightblue',4)))
If you want to show just the last label, using group_by and filter may work:
data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version))
Full example:
suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)
set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
group = sample(1:2,25,T),
KPI = sample(1:2,25,T))
ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
geom_line(aes(group = KPI), size = 1) +
geom_point(size = 2.5) +
# Labels defined here
geom_label_repel(
data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version)),
aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
color = "black",
fill = "white")
Or if you want to show 4 random labels per KPI, data.trend %>% group_by(KPI) %>% sample_n(4):
suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)
set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
group = sample(1:2,25,T),
KPI = as.factor(sample(1:2,25,T)))
ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
geom_line(aes(group = KPI), size = 1) +
geom_point(size = 2.5) +
# Labels defined here
geom_label_repel(
data = data.trend %>% group_by(KPI) %>% sample_n(4),
aes(Version, Value, fill = KPI, label = sprintf('%0.1f%%', Value)),
color = "black", show.legend = FALSE
)
Created on 2021-08-27 by the reprex package (v2.0.1)
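If "last N" means the N highest Version values within each KPI, the group_by idea above can be extended with slice_max() (dplyr 1.0.0 or later). This is a sketch under that assumption. It uses geom_label from ggplot2 to keep dependencies small, and geom_label_repel should accept the same data argument. The names n_labels and last_n are made up for the example.

```r
library(ggplot2)
library(dplyr)

set.seed(1235)
data.trend <- data.frame(
  Version = rnorm(25),
  Value   = rnorm(25),
  KPI     = factor(sample(1:2, 25, replace = TRUE))
)

n_labels <- 4  # how many of the latest points to label per KPI

# Keep the n_labels rows with the largest Version in each KPI group
last_n <- data.trend %>%
  group_by(KPI) %>%
  slice_max(order_by = Version, n = n_labels, with_ties = FALSE) %>%
  ungroup()

p <- ggplot(data.trend, aes(Version, Value, colour = KPI, group = KPI)) +
  geom_line() +
  geom_point(size = 2.5) +
  # Only the label layer gets the reduced data
  geom_label(data = last_n, aes(label = sprintf("%0.1f%%", Value)),
             show.legend = FALSE)
```

Unlike tail(data.trend, 4), this does not depend on the row order of the data frame.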

Change the shape of legend key for geom_bar in ggplot2

I'm trying to change the shape of the legend key from a geom_bar graph. I've looked at multiple answers online but found they didn't work in this case. Let me explain the problem:
df1 = data.frame(person = c("person1", "person2", "person3"),
variable = "variable1",
value = c(0.5, 0.3, 0.2))
df2 = data.frame(person = c("person1", "person2", "person3"),
variable = "variable2",
value = c(-0.3, -0.1, -0.4))
I'm trying to make a stacked barplot where one side is negative. Using ggplot2 I get:
library(ggplot2)
ggplot() + geom_bar(data = df1, aes(x = person, y = value, fill = variable), stat = "identity") +
geom_bar(data = df2, aes(x = person, y = value, fill = variable), stat = "identity") +
scale_fill_manual(values = c("steelblue", "tomato"), breaks = c("variable1","variable2"),
labels = c("Variable 1", "Variable 2"))
It then looks like this:
Now on the right the legend shows squares by default. Is there a way to change this into a circle for instance?
Online I've found the way this usually works is by using
guides(fill = guide_legend(override.aes = list(shape = 1)))
Or similar variations. However this doesn't seem to work. If anybody can help that would be great, I've been stuck for quite a while now.
You could add a layer of geom_point with no data (just to create a legend) and hide the unwanted rectangular legend from the bars using show.legend = FALSE:
df3 = data.frame(person = as.numeric(c(NA, NA)),
variable = c("variable1", "variable2"),
value = as.numeric(c(NA, NA)))
ggplot() +
geom_bar(data = df1, aes(x = person, y = value, fill = variable), stat = "identity", show.legend = FALSE) +
geom_bar(data = df2, aes(x = person, y = value, fill = variable), stat = "identity", show.legend = FALSE) +
geom_point(data = df3, aes(x = person, y = value, color = variable), size=8) +
scale_fill_manual(values = c("steelblue", "tomato"), breaks = c("variable1","variable2")) +
scale_color_manual(values = c("steelblue", "tomato")) +
theme(legend.key = element_blank())
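A different route, not the one used in the answer above, is to change the key glyph of the bar layer itself. This sketch assumes ggplot2 3.2.0 or later, where layers accept a key_glyph argument, and it combines the two data frames into one (the name bars is made up). Shape 21 is a circle that takes a fill, so the legend should pick up the fill colors.

```r
library(ggplot2)

people <- c("person1", "person2", "person3")
bars <- rbind(
  data.frame(person = people, variable = "variable1",
             value = c(0.5, 0.3, 0.2)),
  data.frame(person = people, variable = "variable2",
             value = c(-0.3, -0.1, -0.4))
)

p <- ggplot(bars, aes(person, value, fill = variable)) +
  # Draw the legend key for this layer as a point instead of a rectangle
  geom_col(key_glyph = "point") +
  scale_fill_manual(
    values = c(variable1 = "steelblue", variable2 = "tomato"),
    labels = c(variable1 = "Variable 1", variable2 = "Variable 2")
  ) +
  # Shape 21 honours fill; size only affects the legend keys
  guides(fill = guide_legend(override.aes = list(shape = 21, size = 8))) +
  theme(legend.key = element_blank())
```

This avoids the dummy data frame of NA values, at the cost of requiring a newer ggplot2.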

Multiple line and bar chart in ggplot with geom_text and colours

I want to combine a line and a bar chart with three different variables using ggplot. The bar chart should have the values on top and should be filled with the same colours as the lines in the chart above. Here is my code
# First some data generation
df.x = data.frame(date = Sys.Date()-20:1,
value = c(1:20),
variable = "line",
object = "x")
df.y = data.frame(date = Sys.Date()-20:1,
value = c(1:20)*2,
variable = "line",
object = "y")
df.z = data.frame(date = Sys.Date()-20:1,
value = c(1:20)*3,
variable = "line",
object = "z")
df.y.bar = data.frame(date = Sys.Date()-10,
value = 30,
variable = "bar",
object = "y")
df.x.bar = data.frame(date = Sys.Date()-10,
value = 40,
variable = "bar",
object = "x")
df.z.bar = data.frame(date = Sys.Date()-10,
value = 50,
variable = "bar",
object = "z")
df = rbind(df.x, df.y,df.z, df.x.bar, df.y.bar,df.z.bar)
my.cols = c("blue", "green", "yellow")
# Pass everything to ggplot
ggplot(df, aes_string(x = "date", y = "value", fill="variable")) +
facet_grid(variable~., scales="free_y") +
geom_line(data = subset(df, variable == "line"), aes(colour = factor(object)), size = 1, show_guide = FALSE, stat="identity") +
geom_bar(data = subset(df, variable == "bar"), aes(colour = factor(object)), show_guide = TRUE, stat="identity", position = "dodge") +
geom_text(data = subset(df, variable == "bar"), aes(y=value+0.7 * sign(value), ymax=value, label=round(value, 2)), position = position_dodge(width = 0.9), size=3) +
scale_colour_manual(values = my.cols)
The resulting plot is not exactly what I wanted. I used position_dodge(width = 0.9) but the text over the bars does not move. Irritatingly, this does not happen when I choose only two variables (e.g. x and y) to be included in the data frame. The desired result should look like this:
Thanks a lot for your help!
You can fill the bars by choosing fill = factor(object) and adding scale_fill_manual(values = my.cols).
In order to have only one legend I think removing fill="variable" from aes_string did the trick in combination with scale_fill_manual.
To dodge the text, you need to add a group aesthetic and position = position_dodge(width = 1):
ggplot(df, aes_string(x = "date", y = "value")) +
facet_grid(variable~., scales="free_y") +
# show.legend replaces the deprecated show_guide; the unused ymax aesthetic is dropped
geom_line(data = subset(df, variable == "line"), aes(colour = factor(object)), size = 1, show.legend = FALSE, stat="identity") +
geom_bar(data = subset(df, variable == "bar"), aes(fill = factor(object)), show.legend = TRUE, stat="identity", position = "dodge") +
geom_text(data = subset(df, variable == "bar"), aes(x=date, y=(value+0.7) * sign(value), label=round(value, 2), group=object), position = position_dodge(width=1), size=3) +
scale_colour_manual(values = my.cols) +
scale_fill_manual(values = my.cols)
