Ok so I have a data set with 2 variables X and Y, and an ID variable. I've created a full plot using this code:
ggplot(data = X_Y) +
geom_point(mapping = aes(x = X, y = Y))+
geom_text_repel(mapping = aes(x = X, y = Y, label = ID))+
xlim(0,100)+
ylim(0,100)
This produces a plot like this:
I now wish to create a number of separate plots only showing a single data point at a time with their label.
Now I can use just geom_label without repel and nudge the label to get this:
While this plot is ok, I was wondering if there was any way to keep the lines connecting labels to points like how ggrepel does...
EDIT
From the first suggestion, when I try to use repel with only one case selected, I get the following plot:
ggplot(data = X_Y) +
geom_point(aes(x = X[4], y = Y[4]))+
geom_label_repel(aes(x = X[4], y = Y[4]),
label = "You are here",
min.segment.length = unit(0, 'lines'),
nudge_y = 6)+
labs(x = "X",y = "Y",title = "mytitle")+
scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
RESOLVED
Figured it out! I need to specify my data in ggplot() to only be the X and Y variables and limit to the row of interest.
Like this:
ggplot(data = X_Y[4, c(3, 4)]) +
geom_point(aes(x = X, y = Y))+
geom_label_repel(aes(x = X, y = Y),
label = "You are here",
min.segment.length = unit(0, 'lines'),
nudge_y = 6)+
labs(x = "X",y = "Y",title = "mytitle")+
scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
You can of course still use geom_label_repel, even with a single point. To make sure a segment is drawn, adjust the min.segment.length argument. It sets the minimum distance between the point and the label at which a segment is drawn; setting it to unit(0, 'lines') ensures every segment is drawn:
library(ggplot2)
library(ggrepel)
ggplot(data.frame(x = 2, y = 3)) +
geom_point(aes(x, y)) +
geom_label_repel(aes(x, y),
label = 'You are here',
min.segment.length = unit(0, 'lines'),
nudge_y = .2) +
scale_x_continuous(limits = c(0, 3)) +
scale_y_continuous(limits = c(0, 4))
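For the original goal of one plot per data point, the same call can be wrapped in a loop. This is only a minimal sketch: the X_Y values below are made up for illustration, and the column names ID, X and Y are assumed from the question.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical stand-in for the X_Y data set (ID, X and Y columns assumed)
X_Y <- data.frame(ID = c("a", "b", "c"),
                  X  = c(20, 50, 80),
                  Y  = c(30, 60, 40))

# Build one plot per row; each plot receives a one-row data frame,
# so the aesthetics can refer to the columns directly
plots <- lapply(seq_len(nrow(X_Y)), function(i) {
  ggplot(X_Y[i, ], aes(x = X, y = Y, label = ID)) +
    geom_point() +
    geom_label_repel(min.segment.length = unit(0, "lines"),
                     nudge_y = 6) +
    scale_x_continuous(limits = c(0, 100)) +
    scale_y_continuous(limits = c(0, 100))
})

# plots[[2]] is the plot for the second row, and so on
```

Passing a one-row data frame to ggplot() avoids the indexing inside aes() that caused trouble in the question.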
I have an NMDS ordination that I've plotted using ggplot2. I've added environmental vectors on top (from the envfit() function in vegan) using geom_segment() and added corresponding labels to the same coordinates as the segments using geom_text() (code below):
ggplot() +
geom_point(data = nmds.sites.plot, aes(x = NMDS1, y = NMDS2, col = greening), size = 2) +
labs(title = "Study Area",
col = "Sites") +
geom_polygon(data = hull.data, aes(x = NMDS1, y = NMDS2, fill = grp, group = grp), alpha = 0.2) +
scale_fill_discrete(name = "Ellipses",
labels = c("High", "Moderate", "Control")) +
xlim(c(-1, 1)) +
guides(shape = guide_legend(order = 1),
colour = guide_legend(order = 2)) +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.25, "cm")),
colour = "black", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = NMDS1, y = NMDS2, label = rownames(env.arrows))) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14))
However, since the labels are justified to centre, part of the label sometimes overlaps with the end of the arrow. I want to have the text START at the end of the arrow. In some other cases, if the arrow is pointing up, it pushes into the middle of the text. Essentially, I want to be able to see both the arrow head AND the text.
I have tried using geom_text_repel() from the ggrepel package, but the placement seems random: it also repels from other points or text in the plot, or just does nothing at all.
[EDIT]
Below are the coordinates of the NMDS vectors (this is the env.arrows object from the example code above):
NMDS1 NMDS2
Variable1 -0.46609087 0.27567532
Variable2 -0.21524887 -0.10128795
Variable3 0.59093184 0.03423775
Variable4 -0.00136418 0.46550043
Variable5 -0.30900813 -0.19659929
Variable6 0.53510347 -0.36387227
Variable7 0.66376246 -0.05220685
In the code below, we create a radial shift function to move the labels away from the arrows. The shift includes a constant amount plus an additional shift that varies with the absolute value of the cosine of the label's angle to the x-axis. This is because labels with theta near 0 or 180 degrees have a larger length of overlap with the arrows, and therefore need to be moved farther, than labels with theta near 90 or 270 degrees.
You may need to tweak the code a bit to get the labels exactly where you want them. Also, you'll likely need to add an additional adjustment if the variable names can have different widths.
One additional note: I've turned the variable names into a data column. You should do this with your data as well and then map that data column to the label argument of aes. Using rownames(env.arrows) for the labels reaches outside the ggplot function environment to the external data frame env.arrows and breaks the mapping to the data frame you've provided in the data argument to geom_text (although it likely won't cause a problem in this particular case).
library(tidyverse)
library(patchwork)
# data
env.arrows = read.table(text=" var NMDS1 NMDS2
Variable1 -0.46609087 0.27567532
Variable2 -0.21524887 -0.10128795
Variable3 0.59093184 0.03423775
Variable4 -0.00136418 0.46550043
Variable5 -0.30900813 -0.19659929
Variable6 0.53510347 -0.36387227
Variable7 0.66376246 -0.05220685", header=TRUE)
# Radial shift function
rshift = function(r, theta, a=0.03, b=0.07) {
r + a + b*abs(cos(theta))
}
# Calculate shift
env.arrows = env.arrows %>%
mutate(r = sqrt(NMDS1^2 + NMDS2^2),
theta = atan2(NMDS2,NMDS1),
rnew = rshift(r, theta),
xnew = rnew*cos(theta),
ynew = rnew*sin(theta))
p = ggplot() +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.25, "cm")),
colour = "black", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = NMDS1, y = NMDS2, label = var)) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14))
pnew = ggplot() +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.2, "cm")),
colour = "grey60", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = xnew, y = ynew, label = var), size=3.5) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14)) +
scale_x_continuous(expand=expansion(c(0.12,0.12))) +
scale_y_continuous(expand=expansion(c(0.07,0.07)))
p / pnew
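To see how the shift depends on the angle, here is a small base R check using two of the posted vectors. It reuses the rshift definition from above; the choice of Variable3 (nearly horizontal) and Variable4 (nearly vertical) is just for illustration.

```r
# Same radial shift function as in the answer above
rshift <- function(r, theta, a = 0.03, b = 0.07) {
  r + a + b * abs(cos(theta))
}

# Two of the posted vectors: one nearly horizontal, one nearly vertical
vecs <- data.frame(var   = c("Variable3", "Variable4"),
                   NMDS1 = c(0.59093184, -0.00136418),
                   NMDS2 = c(0.03423775, 0.46550043))

vecs$r     <- sqrt(vecs$NMDS1^2 + vecs$NMDS2^2)
vecs$theta <- atan2(vecs$NMDS2, vecs$NMDS1)

# Extra distance between the arrow tip and the label position
vecs$shift <- rshift(vecs$r, vecs$theta) - vecs$r

print(vecs[, c("var", "shift")])
# Variable3 is moved by roughly 0.10, Variable4 by roughly 0.03
```

The horizontal label gets close to the full a + b, while the vertical one gets little more than a, which matches the reasoning about overlap length.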
I'm looking for a way to move every second x-axis tick downwards and have the tick line go down with it.
I can change the general margin and tick length for all ticks with:
#MWE
library(ggplot2)
ggplot(cars, aes(dist, speed))+
geom_point()+
theme(
axis.ticks.length.x = unit(15, "pt")
)
But, I would like the x-axis ticks 0, 50, and 100 (i.e., every second tick) to be without the added top margin.
A generalized answer is preferred as my x-axis is categorical and not numerical (and contains 430 ticks, so nothing I can set by hand).
Any ideas?
Edit:
Output should be:
Edit2:
A more intricate example would be:
#MWE
ggplot(diamonds, aes(cut, price, fill = clarity, group = clarity))+
geom_col(position = 'dodge')+
theme(
axis.ticks.length.x = unit(15, "pt")
)
Edit -- added categorical approach at bottom.
Here's a hack. Hope there's a better way!
ticks <- data.frame(
x = 25*0:5,
y = rep(c(-0.2, -2), 3)
)
ggplot(cars, aes(dist, speed))+
geom_point()+
geom_rect(fill = "white", xmin = -Inf, xmax = Inf,
ymin = 0, ymax = -5) +
geom_segment(data = ticks,
aes(x = x, xend = x,
y = 0, yend = y)) +
geom_text(data = ticks,
aes(x = x, y = y, label = x), vjust = 1.5) +
theme(axis.ticks.x = element_blank()) +
scale_x_continuous(breaks = 25*0:5, labels = NULL, name = "") +
coord_cartesian(clip = "off")
Here's a similar approach used with a categorical x.
cats <- sort(as.character(unique(diamonds$cut)))
ticks <- data.frame(x = cats)
ticks$y = ifelse(seq_along(cats) %% 2, -500, -2000)
ggplot(diamonds, aes(cut, price, fill = clarity, group = clarity))+
geom_col(position = 'dodge') +
annotate("rect", fill = "white",
xmin = 0.4, xmax = length(cats) + 0.6,
ymin = 0, ymax = -3000) +
geom_segment(data = ticks, inherit.aes = FALSE,
aes(x = x, xend = x,
y = 0, yend = y)) +
geom_text(data = ticks, inherit.aes = FALSE,
aes(x = x, y = y, label = x), vjust = 1.5) +
scale_x_discrete(labels = NULL, name = "cut") +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) + # expansion() supersedes the deprecated expand_scale()
theme(axis.ticks.x = element_blank()) +
coord_cartesian(clip = "off")
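With hundreds of categories, the ticks data frame can be generated rather than typed. A rough sketch, where alternating_ticks is a hypothetical helper name and the category names are made up:

```r
# Hypothetical helper: alternate short and long tick ends over all levels
alternating_ticks <- function(f, short, long) {
  lev <- levels(factor(f))  # same order a discrete scale uses by default
  data.frame(
    x = factor(lev, levels = lev),
    y = ifelse(seq_along(lev) %% 2 == 1, short, long)
  )
}

# Example with 430 made-up category names
cats  <- sprintf("cat%03d", 1:430)
ticks <- alternating_ticks(cats, short = -500, long = -2000)

head(ticks, 4)
```

Taking the order from the factor levels keeps the short/long alternation aligned with the axis even when the levels are not alphabetical.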
Using ggplot2 in R, I'm trying to add a red line that indicates the mean of a chain. I would like to show the mean value close to the line so that the reader does not have to estimate it from the axis.
I tried to use a negative coordinate for x, but it did not work: the value ends up hidden behind the axis.
ggplot(data = chain.fmBC) +
geom_line(aes(1:25000, chain.fmBC$V2)) +
labs(y = "", x = "") +
labs(caption= "Bayes C") +
geom_hline(yintercept = mean(chain.fmBC$V2), colour = "RED") +
geom_text(label = round(mean(chain.fmBC$V2), 2),
x = 0, y = min(chain.fmBC$V2), colour = "RED")
this is a picture of my graph:
How could I put the value shown in red (the mean) to the left of the y-axis, between 0 and 5000, as if it were a y-axis label?
You can set your y-axis ticks manually so that they include the mean value. This gives you a nicely positioned annotation. If what you really want is a colored axis label, unfortunately this does not solve that.
Example:
ggplot(mtcars, aes(disp)) +
geom_histogram() +
geom_hline(yintercept = 0.5, color = "red") +
scale_y_continuous(breaks = c(0,0.5,1,2,3,4)) +
theme(axis.text.y = element_text())
Which will give you this:
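Applied to the chain from the question, the breaks could be computed instead of typed by hand. A minimal sketch, with simulated values standing in for chain.fmBC$V2 (the real data is not shown in the question):

```r
set.seed(42)
# Made-up stand-in for chain.fmBC$V2
chain <- rnorm(25000, mean = 0.35, sd = 0.05)

chain_mean <- round(mean(chain), 2)

# Regular "pretty" breaks plus the mean, sorted and de-duplicated
y_breaks <- sort(unique(c(pretty(chain), chain_mean)))

# These could then be passed to the y scale, for example:
# scale_y_continuous(breaks = y_breaks)
```

If the mean falls very close to one of the regular breaks, the two labels may overlap, so one of them might need to be dropped.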
Following the suggestions worked, so I would like to share my solution.
I got good help here.
library(ggplot2)
library(grid) # provides textGrob(), gpar() and grid.draw()
cadeia.bayesc <- ggplot(data = chain.fmBC) + geom_line(aes(1:25000, chain.fmBC$V2)) +
theme(plot.margin = unit(c(0.5,0.5,0.5,1), "lines")) + # Make room for the grob
labs(y = "", x = "") + labs(caption= "Bayes C")
cadeia.bayesc <- cadeia.bayesc + geom_hline(yintercept = mean(chain.fmBC$V2), colour = "RED") # insert the line
cadeia.bayesc <- cadeia.bayesc + annotation_custom( # grid::textGrob() configures the label
grob = textGrob(label = round(mean(chain.fmBC$V2),2), hjust = 0, gp = gpar(cex = .7, col ="RED")),
xmin = -6000, xmax = -100, ymin = mean(chain.fmBC$V2), ymax = mean(chain.fmBC$V2))
# Code to override clipping
cadeia.bayesc.plot <- ggplot_gtable(ggplot_build(cadeia.bayesc))
cadeia.bayesc.plot$layout$clip[cadeia.bayesc.plot$layout$name == "panel"] <- "off"
grid.draw(cadeia.bayesc.plot)
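In ggplot2 3.0.0 and later, coord_cartesian(clip = "off") can stand in for the gtable manipulation. The following is only a sketch under assumptions: the chain is simulated, and the hjust and margin values are guesses that will likely need tuning so the label does not collide with the tick labels.

```r
library(ggplot2)

set.seed(1)
# Made-up stand-in for the chain in the question
chain <- data.frame(iter = 1:1000, V2 = rnorm(1000, mean = 5))
m <- mean(chain$V2)

p <- ggplot(chain, aes(iter, V2)) +
  geom_line() +
  geom_hline(yintercept = m, colour = "red") +
  # x = -Inf anchors the text at the left panel edge;
  # hjust > 1 pushes it further left, outside the panel
  annotate("text", x = -Inf, y = m, label = round(m, 2),
           colour = "red", hjust = 1.2, size = 3) +
  coord_cartesian(clip = "off") +                    # allow drawing outside the panel
  theme(plot.margin = margin(5.5, 5.5, 5.5, 30)) +   # extra room on the left (in pt)
  labs(x = "", y = "", caption = "Bayes C")

p
```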
result (https://i.imgur.com/ggbuNuK.jpg)
I am trying to establish R as a data visualisation tool in my company. A typical graph type used in my department is the waterfall chart (https://en.wikipedia.org/wiki/Waterfall_chart).
In R, there are some packages and hints for ggplot to generate a waterfall chart (https://learnr.wordpress.com/2010/05/10/ggplot2-waterfall-charts/), which I have already used.
Unfortunately, a common feature of the waterfall charts we use is annotations with arrows that indicate the percentage change between steps.
See an example below:
Or here in this video (https://www.youtube.com/watch?v=WMHf7uFR6Rk)
The software used to produce this kind of plot is think-cell (https://www.think-cell.com/), which is an add-on for Excel and PowerPoint.
My problem is that I don't know where to start. My first thoughts go in this direction:
Use geom_segment to generate the arrows and boxes
Use ggplot's annotate function to place the text at the arrows or in the boxes
Calculate the positions automatically based on the data provided to the waterfall chart.
Do you have any additional thoughts or ideas on how to implement such graphs in ggplot?
Best Regards Markus
Here's an example of the approach I would take.
Step 1. Pick which elements should be added, and add them one at a time.
Let's say we're starting with this simple chart:
df <- data.frame(x = c(2007, 2008, 2009),
y = c(100, 120, 140))
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5)
First of all, we need some extra vertical space:
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expansion(add = c(10, 50))) # Add 50 y padding
Now, I incrementally add layers until it looks like I want:
# Semi-manual proof of concept
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expansion(add = c(10, 50))) + # Add 50 y padding
# Line with arrow, from above the first bar to above the third
geom_segment(aes(x = df$x[1], y = df$y[1] + 50,
xend = df$x[3], yend = df$y[3] + 50),
arrow = arrow(length = unit(0.02, "npc"), type = "closed")) +
# Background box
geom_tile(aes(x = mean(c(df$x[1], df$x[3])),
y = mean(c(df$y[1], df$y[3])) + 50, width = 1, height = 40),
fill = "white", color = "black", size = 0.5) +
# Text
geom_text(aes(x = mean(c(df$x[1], df$x[3])),
y = mean(c(df$y[1], df$y[3])) + 50,
label = paste0("CAGR\n",
df$x[1], "-", df$x[3], "\n",
scales::percent((df$y[3] / df$y[1]) ^ (1/(df$x[3]-df$x[1])) - 1))))
Step 2. Make it into a function
Now I move the CAGR-related layers into a function, replacing most of the constants with function parameters.
add_CAGR <- function(df, first_val_pos, second_val_pos,
y_offset, box_width = 1, box_height) {
list(
# Line with arrow
geom_segment(aes(x = df$x[first_val_pos],
xend = df$x[second_val_pos],
y = df$y[first_val_pos] + y_offset,
yend = df$y[second_val_pos] + y_offset),
arrow = arrow(length = unit(0.02, "npc"), type = "closed")),
# Background box
geom_tile(aes(x = mean(c(df$x[first_val_pos], df$x[second_val_pos])),
y = mean(c(df$y[first_val_pos], df$y[second_val_pos])) + y_offset,
width = box_width, height = box_height),
fill = "white", color = "black", size = 0.5),
# Text
geom_text(aes(x = mean(c(df$x[first_val_pos], df$x[second_val_pos])),
y = mean(c(df$y[first_val_pos], df$y[second_val_pos])) + y_offset,
label = paste0("CAGR\n",
df$x[first_val_pos], "-", df$x[second_val_pos], "\n",
scales::percent((df$y[second_val_pos] / df$y[first_val_pos]) ^
(1/(df$x[second_val_pos]-df$x[first_val_pos])) - 1))),
lineheight = 0.8)
)
}
Step 3: Use in plot
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expansion(add = c(0, 50))) + # Add 50 y padding
add_CAGR(df, first_val_pos = 1, second_val_pos = 3,
y_offset = 50,
box_width = 0.7, box_height = 40)
Or the same thing just between the first two bars:
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expansion(add = c(0, 50))) + # Add 50 y padding
add_CAGR(df, first_val_pos = 1, second_val_pos = 2,
y_offset = 50,
box_width = 0.7, box_height = 40)
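The label text relies on the compound annual growth rate, (end / start)^(1 / years) - 1. As a quick check of the numbers that should appear in the boxes, here is the same arithmetic in base R; cagr is a hypothetical helper name, not part of the answer's code.

```r
# Compound annual growth rate between a start and an end value
cagr <- function(start, end, years) {
  (end / start)^(1 / years) - 1
}

df <- data.frame(x = c(2007, 2008, 2009),
                 y = c(100, 120, 140))

cagr(df$y[1], df$y[3], df$x[3] - df$x[1])  # 2007-2009: about 0.183 (18.3%)
cagr(df$y[1], df$y[2], df$x[2] - df$x[1])  # 2007-2008: 0.2 (20%)
```

The start value has to be the bar at first_val_pos; dividing by a fixed first bar would only be correct when the arrow starts there.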
I'm plotting some graphs to introduce the concept of a mathematical function to high school students. Right now, I'd like to give them an example of what is NOT a function by plotting a horizontal parabola:
x <- seq(from = -3, to = 3, by = 0.001)
y <- -x^2 + 5
grafico <- ggplot()+
geom_hline(yintercept = 0)+
geom_vline(xintercept = 0)+
geom_line(mapping = aes(x = x, y = y),color="darkred",size=1)+
theme_light()+
xlab("")+
ylab("")+
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1))+
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1))+
coord_flip(ylim = c(-1.5,5.5), xlim = c(-3,3),expand = FALSE)
print(grafico)
Which outputs the following image:
This is quite close to what I want, but I would like both axes' scales to match, to keep things simple for the students. For this, I tried using coord_equal, but unfortunately it seems to cancel coord_flip's effects:
x <- seq(from = -3, to = 3, by = 0.001)
y <- -x^2 + 5
grafico <- ggplot()+
geom_hline(yintercept = 0)+
geom_vline(xintercept = 0)+
geom_line(mapping = aes(x = x, y = y),color="darkred",size=1)+
theme_light()+
xlab("")+
ylab("")+
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1))+
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1))+
coord_flip(ylim = c(-1.5,5.5), xlim = c(-3,3),expand = FALSE)+
coord_equal()
print(grafico)
My question is: Is there a simple way to include coord_flip functionality into coord_equal?
For example, I know I can get coord_cartesian functionality by using the parameters ylim and xlim.
Based on your use case, it doesn't look like you really need to flip the coordinates: you can just reverse the order of inputs for x & y, and use geom_path() instead of geom_line() to force the plot to follow the order in your inputs.
The ggplot help file states:
geom_path() connects the observations in the order in which they
appear in the data. geom_line() connects them in order of the
variable on the x axis.
ggplot() +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0) +
geom_path(mapping = aes(x = y, y = x), color="darkred", size = 1) + # switch x & y here
theme_light() +
xlab("") +
ylab("") +
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1)) +
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1)) +
coord_equal(xlim = c(-1.5, 5.5), ylim = c(-3, 3), expand = FALSE) # switch x & y here
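To see why geom_line() would not work with the swapped inputs, it helps to look at what sorting by the horizontal variable does to the rows. A small base R illustration, using a coarser grid than the question (step 0.5, chosen only to keep the output short):

```r
t <- seq(from = -3, to = 3, by = 0.5)

# Swapped roles, as in the geom_path() call: horizontal = -t^2 + 5, vertical = t
curve_df <- data.frame(x = -t^2 + 5, y = t)

# geom_path() uses the rows as they are; geom_line() connects them in order of x
sorted_df <- curve_df[order(curve_df$x), ]

# After sorting, consecutive rows jump between the lower and upper branch
head(sorted_df$y)
# -3.0  3.0 -2.5  2.5 -2.0  2.0
```

Connecting the points in that sorted order would draw a zigzag between the two branches instead of a parabola.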