Labeling 2 variables in scatter plot - r

library(ggplot2)
library(dplyr)
#install.packages("ggrepel")
library(ggrepel)
#Code
mpg %>%
mutate(Color=ifelse(class == '2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
geom_text_repel(aes(label = ifelse(Color == '2seater', '2seater', "")),
force_pull = 0, show.legend = FALSE) +
theme(legend.position = "none")
In the above code, how would I change it to add another label for 'compact'? I would like two labels: one for compact and one for 2seater.

You can do this with case_when() where you used ifelse(). [You could also nest one ifelse() inside another, but I find case_when() cleaner.]
But have a look. Aren't there too many labels? Why not just leave it color coded?
library(ggplot2)
library(dplyr)
library(ggrepel)
#Code
ggplot2::mpg %>%
mutate(Color = case_when(
class == "2seater" ~ "2seater",
class == "compact" ~ "compact",
TRUE ~ "other"
)) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
geom_text_repel(aes(
label =
case_when(
class == "2seater" ~ "2seater",
class == "compact" ~ "compact"
)
),
force_pull = 0, show.legend = FALSE
) +
scale_color_manual(values = c("red", "blue", "gray")) +
theme(legend.position = "none")
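The nested ifelse() alternative mentioned in this answer can be compared against case_when() directly. This is only a minimal sketch: the column names label_nested and label_case are made up for illustration, and an empty string is used for unlabelled rows (an assumption; the plot code above leaves them as NA instead).

```r
library(dplyr)
library(ggplot2)  # only needed here for the mpg data set

labelled <- mpg %>%
  mutate(
    # Nested ifelse(): the second ifelse() handles everything that is not a 2seater
    label_nested = ifelse(class == "2seater", "2seater",
                          ifelse(class == "compact", "compact", "")),
    # case_when(): one condition per line, first match wins
    label_case = case_when(
      class == "2seater" ~ "2seater",
      class == "compact" ~ "compact",
      TRUE ~ ""
    )
  )

# Both approaches should produce the same labels row by row
identical(labelled$label_nested, labelled$label_case)

# How many points would end up labelled in each group
table(labelled$label_case)
```

With more than two or three categories the nested form gets hard to read, which is the main argument for case_when().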

As an alternative to the case_when() method proposed by MarBlo, you can also subset the data while specifying the layer. The upside is that it is shorter to write; the downside is that the labels don't repel the unlabelled points. As MarBlo mentioned, there may be too many labels, and ggrepel warns about it.
library(ggplot2)
library(dplyr)
library(ggrepel)
mpg %>%
mutate(Color=ifelse(class == '2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
geom_text_repel(aes(label = class),
data = ~ subset(., class %in% c("2seater", "compact")),
force_pull = 0, show.legend = FALSE) +
theme(legend.position = "none")
#> Warning: ggrepel: 31 unlabeled data points (too many overlaps). Consider
#> increasing max.overlaps
Created on 2021-01-10 by the reprex package (v0.3.0)
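The warning itself suggests raising max.overlaps. A minimal sketch of that, assuming a ggrepel version that has the max.overlaps argument (the version that emits this warning does) and assuming a crowded plot is acceptable; the grey point colour is just an illustrative choice:

```r
library(ggplot2)
library(ggrepel)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(colour = "grey60") +
  geom_text_repel(
    aes(label = class),
    # Label only the two classes of interest
    data = ~ subset(., class %in% c("2seater", "compact")),
    # Assumption: draw every label, however many overlaps it has
    max.overlaps = Inf,
    force_pull = 0,
    show.legend = FALSE
  )

p
```

Whether the result is readable is a separate question; with this many compact cars the labels may still crowd each other.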

Related

How to have sum of values sum to 1 in geom_freqpoly()?

As an example, we can use geom_freqpoly() to examine how hp varies by cyl in the mtcars data.
library(tidyverse)
mtcars %>%
mutate(cyl = as.factor(cyl)) %>%
ggplot() +
aes(x=hp, color=cyl) +
geom_freqpoly(mapping = aes(y = after_stat(ncount)), bins=5)
Using after_stat(ncount), I can make each line be normalized between 0 and 1. However, is there a way to have it so that the sum of all the lines at any point is equal to 1? i.e., at any value of hp, the red, green, and blue lines add to one -- representing the estimated proportion of each cyl type at that value of hp.
This can be achieved with position = "fill", though it looks confusing with lines and is better represented as a filled geom using the same statistical transformation as geom_freqpoly():
library(tidyverse)
mtcars %>%
mutate(cyl = as.factor(cyl)) %>%
ggplot() +
aes(x = hp, fill = cyl) +
stat_bin(bins = 5, position = "fill", geom = "area")
Compare this to the same result using an unfilled geom_freqpoly
mtcars %>%
mutate(cyl = as.factor(cyl)) %>%
ggplot() +
aes(x = hp, color = cyl) +
geom_freqpoly(position = "fill", bins = 5)
I think this is harder to follow.
Another alternative to geom_freqpoly would be geom_density, which permits more visually appealing representations of similar information:
mtcars %>%
mutate(cyl = as.factor(cyl)) %>%
ggplot() +
aes(x = hp, fill = cyl) +
geom_density(position = "fill", alpha = 0.5, color = "white", lwd = 2) +
coord_cartesian(xlim = c(50, 200)) +
scale_fill_brewer(palette = "Set2") +
theme_minimal(base_size = 20) +
labs(y = "Relative density")
Created on 2022-09-05 with reprex v2.0.2
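To check numerically what position = "fill" expresses, the per-bin proportions can be computed by hand. This is a rough sketch that assumes equal-width bins from cut(), which will not match the exact bin edges chosen by stat_bin(); hp_bin, prop and totals are illustrative names.

```r
library(dplyr)

props <- mtcars %>%
  mutate(
    cyl = as.factor(cyl),
    hp_bin = cut(hp, breaks = 5)   # 5 equal-width bins over the range of hp
  ) %>%
  count(hp_bin, cyl) %>%           # number of cars per bin and cyl
  group_by(hp_bin) %>%
  mutate(prop = n / sum(n)) %>%    # share of each cyl within its bin
  ungroup()

# Within every non-empty bin the shares add up to 1
totals <- props %>%
  group_by(hp_bin) %>%
  summarise(total = sum(prop))

totals
```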

Independent y-axis per nested facet in ggplot

Is it possible to have a second "free" y-axis specifically for the nester "Short Leaves"? I do not want an independent y-axis for all 3 panels, just 2: one for each nester. How can I do that?
library(tidyverse)
library(ggh4x)
df <- as_tibble(iris) %>%
select(3, 5) %>%
mutate(Nester = if_else(Species == "setosa", "Short Leaves", "Long Leaves"),
Nester = factor(Nester))
df %>%
pivot_longer(!c(Species, Nester), names_to = "Measure", values_to = "Value") %>%
ggplot(aes(Measure, Value)) +
geom_boxplot() +
facet_nested(~ Nester + Species)
EDIT:
So far, I have only found these two options, both of which free the y-axis for all 3 panels:
facet_nested(~ Nester + Species, scales = "free_y", independent = "y")
and
facet_nested_wrap(~ Nester + Species, scales = "free_y", nrow = 1)
which do not give the desired result.
The facet functions themselves offer little control over which panel uses which y-scale. However, facetted_pos_scales() gives you exactly that control.
Let's suppose we have this plot from your example.
library(tidyverse)
library(ggh4x)
df <- as_tibble(iris) %>%
select(3, 5) %>%
mutate(Nester = if_else(Species == "setosa", "Short Leaves", "Long Leaves"),
Nester = factor(Nester))
p <- df %>%
pivot_longer(!c(Species, Nester), names_to = "Measure", values_to = "Value") %>%
ggplot(aes(Measure, Value)) +
geom_boxplot() +
facet_nested(~ Nester + Species, scales = "free_y", independent = "y")
We can 'fix' a scale by giving it a constant limit, which we'd have to pre-calculate. You can then apply that scale to selected panels.
ylim <- range(df$Petal.Length[df$Nester == "Long Leaves"])
p + facetted_pos_scales(
y = list(Nester == "Long Leaves" ~ scale_y_continuous(limits = ylim))
)
If you also wish to omit the axis in between the panels with fixed scales, you'd need to set the scales separately for each panel. In the middle panel, you'd have to set guide = "none" to hide the axis.
p + facetted_pos_scales(
y = list(
Species == "versicolor" ~ scale_y_continuous(limits = ylim),
Species == "virginica" ~ scale_y_continuous(limits = ylim, guide = "none")
)
)
Created on 2022-08-19 by the reprex package (v2.0.0)
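The pre-calculation step can also be done for both nesters at once. A small sketch using dplyr; the names limits, lower, upper and ylim_long are made up here (ylim_long avoids masking ggplot2's ylim() function, which the code above shadows with a variable):

```r
library(dplyr)

# Range of Petal.Length within each nester
limits <- iris %>%
  mutate(Nester = if_else(Species == "setosa", "Short Leaves", "Long Leaves")) %>%
  group_by(Nester) %>%
  summarise(lower = min(Petal.Length), upper = max(Petal.Length))

limits

# Extract the limits for one nester as a plain numeric vector of length 2
ylim_long <- unlist(
  limits[limits$Nester == "Long Leaves", c("lower", "upper")],
  use.names = FALSE
)
ylim_long
```

The resulting vector can be passed as limits to scale_y_continuous() in the same way as the range() result above.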

Labeling in ggplot on top of point

library(ggplot2)
library(dplyr)
#Code
mpg %>%
mutate(Color=ifelse(class=='2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point()
In the above code, I would like to label the 2seater points directly on top of the blue dots instead of using a separate legend. How should I modify my code?
Not sure if I understand you correctly, but does this answer your question?
library(ggplot2)
library(dplyr)
#install.packages("ggrepel")
library(ggrepel)
#Code
mpg %>%
mutate(Color=ifelse(class == '2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
geom_text_repel(aes(label = ifelse(Color == '2seater', '2seater', "")),
ylim = 35, force_pull = 0, show.legend = FALSE)
Or perhaps this?
library(ggplot2)
library(dplyr)
#install.packages("ggrepel")
library(ggrepel)
#Code
mpg %>%
mutate(Color=ifelse(class == '2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
geom_text_repel(aes(label = ifelse(Color == '2seater', '2seater', "")),
force_pull = 0, show.legend = FALSE) +
theme(legend.position = "none")
Or some combination of the two?
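If the repelling behaviour is not needed, a plain geom_text() layer restricted to the 2seater rows is another possibility. Note that this swaps ggrepel for ggplot2's own geom_text() and is only a sketch; the vjust value is a guess that places the text just above each point, and nearby labels may overlap.

```r
library(ggplot2)
library(dplyr)

p <- mpg %>%
  mutate(Color = ifelse(class == "2seater", "2seater", "Other")) %>%
  ggplot(aes(displ, hwy, colour = Color)) +
  geom_point() +
  geom_text(
    aes(label = class),
    # Only the 2seater rows get a text label
    data = ~ subset(., class == "2seater"),
    vjust = -1,            # assumption: shift the text above the point
    show.legend = FALSE
  ) +
  theme(legend.position = "none")

p
```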

GGplot Color Outline

data(mtcars)
library(ggplot2)
ggplot(mtcars, aes(x = reorder(row.names(mtcars), mpg), y = mpg, fill = factor(cyl))) +
geom_bar(stat = "identity")
This plots the bars with solid fills. But what if I want to use the same fill colors as outlines for some observations and as solid fills for others? For example, if 'am' equals 1 the bar has a solid fill, but if 'am' equals 0 it is only outlined.
One option to remove the fill based on a logical condition is to change those values to NA.
library(tidyverse)
d <- head(mtcars) %>%
rownames_to_column() %>%
# make a new variable for fill
# note: don't use ifelse on a factor!
mutate(cyl_fill = ifelse(am == 0, NA, cyl),
# now make them factors
# (you can do this inside ggplot, but that is messy)
cyl = factor(cyl),
cyl_fill = factor(cyl_fill, levels = levels(cyl)))
# plot
p <- ggplot(d) +
aes(x = rowname,
y = mpg,
color = cyl,
fill = cyl_fill
) +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90))
# change the fill color of NA values
p + scale_fill_discrete(drop=FALSE, na.value="white")
If you want NA fill values to be empty and omitted from the legend:
# omit the fill color of NA values
# note: drop=FALSE is still needed to keep the fill and (outline) color values the same
p + scale_fill_discrete(drop=FALSE, na.translate = F)
You can change the color of the outline in the same way (e.g. cyl_color = ifelse(am != 0, NA, cyl)), but if you want to specify a color like white or black, it will (and should) appear in the legend. You can try to hack your way around these wise defaults by plotting non-aesthetic layers behind your main layers, but it usually gets ugly:
head(mtcars) %>%
rownames_to_column() %>%
mutate(cyl_fill = ifelse(am == 0, NA, cyl),
cyl_color = ifelse(am != 0, NA, cyl),
cyl = factor(cyl),
cyl_fill = factor(cyl_fill, levels = levels(cyl)),
cyl_color = factor(cyl_color, levels = levels(cyl))) %>%
ggplot() +
aes(x = rowname,
y = mpg,
color = cyl_color,
fill = cyl_fill
) +
geom_bar(stat = "identity", color = "black") + # NON-AES LAYER FIRST
geom_bar(stat = "identity") + # Covers up the black except where omitted
theme(axis.text.x = element_text(angle = 90))+
scale_fill_discrete(drop=FALSE, na.translate = F) +
scale_color_discrete(drop=FALSE, na.translate = F)
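The comment in the code above warns against using ifelse() on a factor. A short base R illustration of why, using the same head(mtcars) rows; the variable names are illustrative only:

```r
d <- head(mtcars)
cyl_factor <- factor(d$cyl)

# ifelse() drops the factor class and returns the underlying integer codes
codes <- ifelse(d$am == 0, NA, cyl_factor)

# Applied to the original numeric column, the actual cylinder counts survive
values <- ifelse(d$am == 0, NA, d$cyl)

codes
values
```

This is why the answer builds cyl_fill from the numeric cyl column first and converts to a factor only afterwards.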
You could assign the desired colors to each level of the fill and color variables. For example:
library(tidyverse)
mtcars %>%
rownames_to_column() %>%
arrange(mpg) %>%
mutate(rowname=factor(rowname, levels=rowname)) %>%
ggplot(aes(x = rowname, y = mpg, fill = factor(am), colour=factor(cyl))) +
geom_col(size=1) +
scale_fill_manual(values=c("0"="white", "1"="red")) +
scale_color_manual(values=c("4"="blue", "6"="orange", "8"="white")) +
theme_classic() +
theme(axis.text.x=element_text(angle=-90, vjust=0.5, hjust=0))
Maybe we can do:
library(dplyr)
library(ggplot2)
mtcars %>%
mutate(new = case_when(am == 1 ~ factor(cyl)),
new1 = case_when(am !=1 ~ factor(cyl))) %>%
ggplot(aes(x = reorder(row.names(mtcars), mpg), y = mpg,
fill = new, color = new1)) +
geom_bar(stat = 'identity') +
scale_fill_discrete(na.value= NA) + # similar to Devin Judge-Lord post
theme_classic() +
theme(axis.text.x=element_text(angle=-90, vjust=0.5, hjust=0))

Move labels from geom_label_repel into ggplot margin

In the plot below I'd like to move the label "V-Engine" into the plot margin. Adjusting the nudge_x argument is moving the "S-Engine" label but not the "V-Engine" label.
library(ggplot2)
library(ggrepel)
library(dplyr)
ds <-
mtcars %>%
mutate(vs = factor(vs, labels = c("V-Engine", "S-Engine"))) %>%
# Create labels for the rightmost data points
group_by(vs) %>%
mutate(
label =
case_when(
wt == max(wt) ~ as.character(vs),
TRUE ~ NA_character_
)
) %>%
ungroup()
ds %>%
ggplot(aes(x = wt, y = mpg, color = vs)) +
geom_smooth(se=FALSE) +
geom_label_repel(aes(label = label), nudge_x = 1, na.rm = TRUE) +
guides(color = "none") +
theme_minimal() +
theme(plot.margin = unit(c(1,3,1,1), "cm"))
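You can set the xlim argument inside geom_label_repel() and turn off clipping with coord_cartesian(clip = 'off') so that labels can be drawn outside the panel: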
library(dplyr)
library(ggplot2)
library(ggrepel)
ds %>%
ggplot(aes(x = wt, y = mpg, color = vs)) +
geom_smooth(se=FALSE) +
geom_label_repel(aes(label = label),
nudge_x = 1,
# direction = 'x',
xlim = c(0, 6.5),
na.rm = TRUE) +
guides(color = "none") +
theme_minimal() +
theme(plot.margin = unit(c(1,3,1,1), "cm")) +
coord_cartesian(clip = 'off')
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Created on 2018-11-16 by the reprex package (v0.2.1.9000)

Resources