Inserting a custom label on the y axis in ggplot2 - r

Using ggplot2 in R, I'm trying to draw a red line that marks the mean of a chain. I would like to print the mean value next to the line so that readers do not have to estimate it from the axis.
I tried using a negative x coordinate, but it did not work: the value ends up hidden behind the axis.
ggplot(data = chain.fmBC) +
geom_line(aes(1:25000, chain.fmBC$V2)) +
labs(y = "", x = "") +
labs(caption= "Bayes C") +
geom_hline(yintercept = mean(chain.fmBC$V2), colour = "RED") +
geom_text(label = round(mean(chain.fmBC$V2), 2),
x = 0, y = min(chain.fmBC$V2), colour = "RED")
this is a picture of my graph:
How could I put the value that is in red (the mean) to the left of the y-axis of the graph, between 0 and 5000, as if it were a label of the y-axis?

You can set the y-axis ticks manually so that they include the mean value. This gives you a nicely positioned annotation. If the real goal is a colored axis label, unfortunately this does not solve that.
Example:
ggplot(mtcars, aes(disp)) +
geom_histogram() +
geom_hline(yintercept = 0.5, color = "red") +
scale_y_continuous(breaks = c(0,0.5,1,2,3,4)) +
theme(axis.text.y = element_text())
Which will give you this:
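Applied to a chain like the one in the question, the same idea could be sketched as follows. The data frame `chain`, its columns, and the variable names are hypothetical stand-ins (the original chain.fmBC is not available), so treat this as an illustration rather than the asker's exact setup.

```r
library(ggplot2)

# Hypothetical stand-in for chain.fmBC: 1,000 draws stored in column V2
set.seed(42)
chain <- data.frame(iter = 1:1000, V2 = rnorm(1000, mean = 3, sd = 0.5))

chain_mean <- mean(chain$V2)

# Start from "pretty" breaks over the data range, then add the rounded mean
y_breaks <- sort(unique(c(pretty(chain$V2), round(chain_mean, 2))))

p <- ggplot(chain, aes(iter, V2)) +
  geom_line() +
  geom_hline(yintercept = chain_mean, colour = "red") +
  scale_y_continuous(breaks = y_breaks) +
  labs(x = NULL, y = NULL, caption = "Bayes C")
p
```

If the mean happens to fall very close to one of the regular breaks, the two tick labels may overlap, in which case dropping the neighbouring break is probably the simplest fix.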

I was successful following the suggestions, and I would like to share the result.
I got good help here.
library(ggplot2)
library(grid) # provides textGrob(), gpar() and grid.draw()
cadeia.bayesc <- ggplot(data = chain.fmBC) + geom_line(aes(1:25000, chain.fmBC$V2)) +
theme(plot.margin = unit(c(0.5,0.5,0.5,1), "lines")) + # Make room for the grob
labs(y = "", x = "") + labs(caption= "Bayes C")
cadeia.bayesc <- cadeia.bayesc + geom_hline(yintercept = mean(chain.fmBC$V2), colour = "RED") # insert the line
cadeia.bayesc <- cadeia.bayesc + annotation_custom( # grid::textGrob configures the label
grob = textGrob(label = round(mean(chain.fmBC$V2),2), hjust = 0, gp = gpar(cex = .7, col ="RED")),
xmin = -6000, xmax = -100, ymin = mean(chain.fmBC$V2), ymax = mean(chain.fmBC$V2))
# Code to override clipping
cadeia.bayesc.plot <- ggplot_gtable(ggplot_build(cadeia.bayesc))
cadeia.bayesc.plot$layout$clip[cadeia.bayesc.plot$layout$name == "panel"] <- "off"
grid.draw(cadeia.bayesc.plot)
result (https://i.imgur.com/ggbuNuK.jpg)
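As a possible alternative to editing the gtable, recent ggplot2 versions (3.0.0 and later, as far as I know) let you switch clipping off in the coordinate system. The sketch below is not the poster's code: it uses hypothetical data, and the hjust and margin values are guesses that would need tuning for a real plot.

```r
library(ggplot2)

# Hypothetical stand-in for chain.fmBC
set.seed(42)
chain <- data.frame(iter = 1:1000, V2 = rnorm(1000, mean = 3, sd = 0.5))
chain_mean <- mean(chain$V2)

p <- ggplot(chain, aes(iter, V2)) +
  geom_line() +
  geom_hline(yintercept = chain_mean, colour = "red") +
  # x = -Inf pins the text to the left panel edge; hjust > 1 pushes it outside
  annotate("text", x = -Inf, y = chain_mean,
           label = as.character(round(chain_mean, 2)),
           hjust = 1.2, colour = "red", size = 3) +
  coord_cartesian(clip = "off") +                  # allow drawing outside the panel
  theme(plot.margin = margin(5.5, 5.5, 5.5, 20))   # extra room on the left, in points
p
```

The red label can collide with the regular tick labels if the mean is close to one of them, so this works best when the mean sits between two ticks.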

Related

ggplot trying to make a Cleveland plot but I cannot get a legend

library(ggplot2)
library(ggthemes)
data <- read.csv('/Users/zbhay/Documents/r-data.csv', header = 1)
zb <- ggplot(data) +
geom_segment( aes(x=x, xend=x, y=value1, yend=value2), color="black")+
geom_point( aes(x=x, y=value1), color=rgb(0.2,0.7,0.1,1), size=4 )+
geom_point( aes(x=x, y=value2), color=rgb(0.7,0.2,0.1,1), size=4 )+
coord_flip() +
theme_solarized() +
scale_y_continuous(breaks = seq(0, 10000, by = 500)
)
zb + labs(title = "Title",
subtitle = "subtitle") +
xlab("Business Functions") +
ylab("# of hours")
legend("left", c("Starting", "Ending"),
box.col = "darkgreen"
)
Hello, here is the code. The CSV file is structured as follows; column A = names, column b = starting number, column c = final number.
I am trying to set up a legend that calls out the final number vs starting number. I have tried and tried but cannot seem to be able to crack it. If anyone knows a fix, I would appreciate it if you could let me know.
As a general rule in ggplot2, you have to map a value to an aesthetic if you want a legend. Instead of setting the point colors as arguments outside aes(), map a value to the color aesthetic. In my code below, I map the constant value (or category) "start" to the color aesthetic inside aes() for the first geom_point, and "end" for the second. You can then use scale_color_manual to assign your desired colors and labels to these categories. Finally, the color of the legend box can be set via the theme option legend.background. The legend keys themselves have a background color too, which I set to NA via legend.key.
Using some fake random example data:
library(ggplot2)
library(ggthemes)
set.seed(123)
data <- data.frame(x = letters[1:5], value1 = runif(5, 0, 10000), value2 = runif(5, 0, 10000))
ggplot(data) +
geom_segment(aes(x = x, xend = x, y = value1, yend = value2), color = "black") +
geom_point(aes(x = x, y = value1, color = "start"), size = 4) +
geom_point(aes(x = x, y = value2, color = "end"), size = 4) +
coord_flip() +
theme_solarized() +
scale_y_continuous(breaks = seq(0, 10000, by = 500)) +
scale_color_manual(values = c(start = rgb(0.2, 0.7, 0.1, 1), end = rgb(0.7, 0.2, 0.1, 1)), labels = c(start = "Starting", end = "Ending")) +
labs(title = "Title", subtitle = "subtitle", x = "Business Functions", y = "# of hours", color = NULL) +
theme(
legend.key = element_rect(fill = NA),
legend.background = element_rect(fill = "darkgreen")
)
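A variation on the same principle is to reshape the data to long format, so that the start/end distinction becomes a real column that can be mapped to color. This is only a sketch under assumptions: it relies on tidyr's pivot_longer(), and the column names `stage` and `hours` are made up for the illustration.

```r
library(ggplot2)
library(tidyr)

set.seed(123)
data <- data.frame(x = letters[1:5],
                   value1 = runif(5, 0, 10000),
                   value2 = runif(5, 0, 10000))

# One row per point: `stage` records which original column the value came from
long <- pivot_longer(data, cols = c(value1, value2),
                     names_to = "stage", values_to = "hours")

p <- ggplot(data) +
  geom_segment(aes(x = x, xend = x, y = value1, yend = value2), color = "black") +
  # A single point layer; the legend comes from mapping `stage` to color
  geom_point(data = long, aes(x = x, y = hours, color = stage), size = 4) +
  coord_flip() +
  scale_color_manual(values = c(value1 = rgb(0.2, 0.7, 0.1), value2 = rgb(0.7, 0.2, 0.1)),
                     labels = c(value1 = "Starting", value2 = "Ending"),
                     name = NULL)
p
```

With this layout, adding a third measurement column would only require extending `cols`, `values`, and `labels`, not adding another geom_point layer.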

How do I shift the geom_text labels to AFTER a geom_segment arrow in ggplot2?

I have an NMDS ordination that I've plotted using ggplot2. I've added environmental vectors on top (from the envfit() function in vegan) using geom_segment() and added corresponding labels to the same coordinates as the segments using geom_text() (code below):
ggplot() +
geom_point(data = nmds.sites.plot, aes(x = NMDS1, y = NMDS2, col = greening), size = 2) +
labs(title = "Study Area",
col = "Sites") +
geom_polygon(data = hull.data, aes(x = NMDS1, y = NMDS2, fill = grp, group = grp), alpha = 0.2) +
scale_fill_discrete(name = "Ellipses",
labels = c("High", "Moderate", "Control")) +
xlim(c(-1, 1)) +
guides(shape = guide_legend(order = 1),
colour = guide_legend(order = 2)) +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.25, "cm")),
colour = "black", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = NMDS1, y = NMDS2, label = rownames(env.arrows))) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14))
However, since the labels are justified to centre, part of the label sometimes overlaps with the end of the arrow. I want to have the text START at the end of the arrow. In some other cases, if the arrow is pointing up, it pushes into the middle of the text. Essentially, I want to be able to see both the arrow head AND the text.
I have tried using geom_text_repel() from the ggrepel package, but the placement seems random: it also repels from other points or text in the plot, or does nothing at all.
[EDIT]
Below are the coordinates of the NMDS vectors (this is the env.arrows object from the example code above):
NMDS1 NMDS2
Variable1 -0.46609087 0.27567532
Variable2 -0.21524887 -0.10128795
Variable3 0.59093184 0.03423775
Variable4 -0.00136418 0.46550043
Variable5 -0.30900813 -0.19659929
Variable6 0.53510347 -0.36387227
Variable7 0.66376246 -0.05220685
In the code below, we create a radial shift function to move the labels away from the arrows. The shift includes a constant amount plus an additional shift that varies with the absolute value of the cosine of the label's angle to the x-axis. This is because labels with theta near 0 or 180 degrees have a larger length of overlap with the arrows, and therefore need to be moved farther, than labels with theta near 90 or 270 degrees.
You may need to tweak the code a bit to get the labels exactly where you want them. Also, you'll likely need to add an additional adjustment if the variable names can have different widths.
One additional note: I've turned the variable names into a data column. You should do this with your data as well and then map that data column to the label argument of aes. Using rownames(env.arrows) for the labels reaches outside the ggplot function environment to the external data frame env.arrows and breaks the mapping to the data frame you've provided in the data argument to geom_text (although it likely won't cause a problem in this particular case).
library(tidyverse)
library(patchwork)
# data
env.arrows = read.table(text=" var NMDS1 NMDS2
Variable1 -0.46609087 0.27567532
Variable2 -0.21524887 -0.10128795
Variable3 0.59093184 0.03423775
Variable4 -0.00136418 0.46550043
Variable5 -0.30900813 -0.19659929
Variable6 0.53510347 -0.36387227
Variable7 0.66376246 -0.05220685", header=TRUE)
# Radial shift function
rshift = function(r, theta, a=0.03, b=0.07) {
r + a + b*abs(cos(theta))
}
# Calculate shift
env.arrows = env.arrows %>%
mutate(r = sqrt(NMDS1^2 + NMDS2^2),
theta = atan2(NMDS2,NMDS1),
rnew = rshift(r, theta),
xnew = rnew*cos(theta),
ynew = rnew*sin(theta))
p = ggplot() +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.25, "cm")),
colour = "black", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = NMDS1, y = NMDS2, label = var)) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14))
pnew = ggplot() +
geom_segment(data = env.arrows,
aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
arrow = arrow(length = unit(0.2, "cm")),
colour = "grey60", inherit.aes = FALSE) +
geom_text(data = env.arrows, aes(x = xnew, y = ynew, label = var), size=3.5) +
coord_fixed() +
theme_bw() +
theme(text = element_text(size = 14)) +
scale_x_continuous(expand=expansion(c(0.12,0.12))) +
scale_y_continuous(expand=expansion(c(0.07,0.07)))
p / pnew
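Another option, which I have not seen in the answer above and offer only as a sketch, is to leave the label coordinates at the arrow tips and vary the text justification with the arrow angle, so the text grows away from the origin. The columns `hjust` and `vjust` and the object `p_just` are names introduced here for illustration.

```r
library(ggplot2)

env.arrows <- read.table(text = "var NMDS1 NMDS2
Variable1 -0.46609087 0.27567532
Variable2 -0.21524887 -0.10128795
Variable3 0.59093184 0.03423775
Variable4 -0.00136418 0.46550043
Variable5 -0.30900813 -0.19659929
Variable6 0.53510347 -0.36387227
Variable7 0.66376246 -0.05220685", header = TRUE)

# Angle of each arrow relative to the positive x-axis
theta <- atan2(env.arrows$NMDS2, env.arrows$NMDS1)

# Arrow pointing right -> hjust near 0 (text starts at the tip);
# arrow pointing left -> hjust near 1 (text ends at the tip).
# Same logic vertically for vjust.
env.arrows$hjust <- (1 - cos(theta)) / 2
env.arrows$vjust <- (1 - sin(theta)) / 2

p_just <- ggplot(env.arrows) +
  geom_segment(aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
               arrow = arrow(length = unit(0.2, "cm")), colour = "grey60") +
  geom_text(aes(x = NMDS1, y = NMDS2, label = var, hjust = hjust, vjust = vjust),
            size = 3.5) +
  coord_fixed() +
  scale_x_continuous(expand = expansion(mult = 0.2)) +  # room for labels at the sides
  theme_bw()
p_just
```

The label edge then touches the arrow tip, so in practice this would probably be combined with a small radial offset like the one computed by rshift().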

ground geom_text to x axis (e.g. y =0)

I have a graph made in ggplot that looks like this:
I wish to have the numeric labels at each of the bars to be grounded/glued to the x axis where y <= 0.
This is the code to generate the graph as such:
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=numofpics, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels = as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
I've tried vjust and experimented with position_nudge for the geom_text element, but every solution I can find shifts each label relative to its current position. As a result, everything I try ends in a situation like this one:
How can I make ggplot ground the text at the bottom of the x axis where y <= 0, ideally with the option to also set an angle = 45?
Link to dataframe = https://drive.google.com/file/d/1b-5AfBECap3TZjlpLhl1m3v74Lept2em/view?usp=sharing
As I said in the comments, just set the y-coordinate of the text to 0 or below, and specify the angle : geom_text(aes(x=row, y=-100, label=bbch), angle=45)
I'm behind a proxy server that blocks connections to google drive so I can't access your data. I'm not able to test this, but I would introduce a new label field in my dataset that sets y to be 0 if y<0:
library(dplyr)
df <- df %>%
mutate(labelField = if_else(numofpics < 0, 0, numofpics))
I would then use this label field in my geom_text call:
geom_text(aes(x=row, y=labelField, label=bbch), angle = 45)
Hope that helps.
You can simply define the y-value in geom_text (e.g. -50)
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=-50, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels =
as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
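Since the linked data is not available here, the following self-contained sketch uses a small made-up data frame with the same column names. It also swaps the fixed y value for y = -Inf, which ggplot2 places at the bottom edge of the panel, so the labels stay grounded even if the y range changes. That substitution is my assumption about what "grounded" should mean, not something taken from the answers above.

```r
library(ggplot2)

# Hypothetical stand-in for the question's df
df <- data.frame(row = 1:6,
                 numofpics = c(120, 340, 80, 410, 260, 150),
                 bbch = c("11", "12", "21", "31", "51", "65"))

p <- ggplot(df, aes(x = row)) +
  geom_col(aes(y = numofpics)) +
  # y = -Inf anchors every label at the bottom edge of the panel;
  # with angle = 45 and hjust = 0 the text rises up and to the right from there
  geom_text(aes(y = -Inf, label = bbch), angle = 45, hjust = 0, vjust = 0) +
  labs(x = NULL, y = "Number of Pictures")
p
```

A fixed value such as y = 0 or y = -50 works just as well when the scale is known in advance; -Inf mainly saves retuning the offset when the data changes.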

How to include "think-cell"-like percentage changes in a waterfall-chart generated in ggplot2

I am trying to establish R as a data visualisation tool in my company. A typical graph type used in my department is the waterfall chart (https://en.wikipedia.org/wiki/Waterfall_chart).
In R, there are some packages and hints for ggplot for generating a waterfall chart (https://learnr.wordpress.com/2010/05/10/ggplot2-waterfall-charts/), which I have already used.
Unfortunately, a common feature of the waterfall charts we use is annotations with arrows that indicate the percentage changes between steps.
See an example below:
Or here in this video (https://www.youtube.com/watch?v=WMHf7uFR6Rk)
The software used to produce such kind of plots is think cell (https://www.think-cell.com/), which is an add-on to Excel and Powerpoint.
The problem I have is that I don't know where to start. My first thoughts go in this direction:
Use geom_segment to generate the arrows and boxes
Use ggplot's annotate function to place the text at the arrows or in the boxes
Calculate the positions automatically based on the data provided to the waterfall chart.
Do you have additional thoughts or ideas on how to implement such graphs in ggplot?
Best Regards Markus
Here's an example of the approach I would take.
Step 1. Pick which elements should be added, and add them one at a time.
Let's say we're starting with this simple chart:
df <- data.frame(x = c(2007, 2008, 2009),
y = c(100, 120, 140))
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5)
First of all, we need some extra vertical space:
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expand_scale(add = c(10, 50))) # Add 50 y padding
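One note for newer installations: expand_scale() was deprecated in ggplot2 3.3.0 in favour of expansion(), which takes the same add and mult arguments. A sketch of the same step with the newer function, assuming ggplot2 3.3.0 or later:

```r
library(ggplot2)

df <- data.frame(x = c(2007, 2008, 2009),
                 y = c(100, 120, 140))

p <- ggplot(df, aes(x, y, label = y)) +
  geom_col() +
  geom_text(vjust = -0.5) +
  # 10 units of padding below the data range and 50 above it
  scale_y_continuous(expand = expansion(add = c(10, 50)))
p
```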
Now, I incrementally add layers until it looks like I want:
# Semi-manual proof of concept
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expand_scale(add = c(10, 50))) + # Add 50 y padding
# Line with arrow
geom_segment(aes(x = df$x[1], y = df$y[1] + 50,
xend = df$x[3], yend = df$y[3] + 50),
arrow = arrow(length = unit(0.02, "npc"), type = "closed")) +
# Background box
geom_tile(aes(x = mean(c(df$x[1], df$x[3])),
y = mean(c(df$y[1], df$y[3])) + 50, width = 1, height = 40),
fill = "white", color = "black", size = 0.5) +
# Text
geom_text(aes(x = mean(c(df$x[1], df$x[3])),
y = mean(c(df$y[1], df$y[3])) + 50,
label = paste0("CAGR\n",
df$x[1], "-", df$x[3], "\n",
scales::percent((df$y[3] / df$y[1]) ^ (1/(df$x[3]-df$x[1])) - 1))))
Step 2. Make it into a function
Now I move the CAGR-related layers into a function, replacing most of the constants with function parameters.
add_CAGR <- function(df, first_val_pos, second_val_pos,
y_offset, box_width = 1, box_height) {
list(
# Line with arrow
geom_segment(aes(x = df$x[first_val_pos],
xend = df$x[second_val_pos],
y = df$y[first_val_pos] + y_offset,
yend = df$y[second_val_pos] + y_offset),
arrow = arrow(length = unit(0.02, "npc"), type = "closed")),
# Background box
geom_tile(aes(x = mean(c(df$x[first_val_pos], df$x[second_val_pos])),
y = mean(c(df$y[first_val_pos], df$y[second_val_pos])) + y_offset,
width = box_width, height = box_height),
fill = "white", color = "black", size = 0.5),
# Text
geom_text(aes(x = mean(c(df$x[first_val_pos], df$x[second_val_pos])),
y = mean(c(df$y[first_val_pos], df$y[second_val_pos])) + y_offset,
label = paste0("CAGR\n",
df$x[first_val_pos], "-", df$x[second_val_pos], "\n",
scales::percent((df$y[second_val_pos] / df$y[first_val_pos]) ^
(1/(df$x[second_val_pos]-df$x[first_val_pos])) - 1))),
lineheight = 0.8)
)
}
Step 3: Use in plot
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expand_scale(add = c(0, 50))) + # Add 50 y padding
add_CAGR(df, first_val_pos = 1, second_val_pos = 3,
y_offset = 50,
box_width = 0.7, box_height = 40)
Or the same thing just between the first two bars:
ggplot(df, aes(x, y, label = y)) +
geom_col() +
geom_text(vjust = -0.5) +
scale_y_continuous(expand = expand_scale(add = c(0, 50))) + # Add 50 y padding
add_CAGR(df, first_val_pos = 1, second_val_pos = 2,
y_offset = 50,
box_width = 0.7, box_height = 40)
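The percentage printed in the box is the compound annual growth rate, (end / start)^(1 / periods) - 1. To check the numbers independently of the plot, the arithmetic can be pulled into a small helper. The function name cagr() is introduced here for illustration and is not part of the answer's code.

```r
# Compound annual growth rate between a start value and an end value
cagr <- function(start, end, periods) {
  (end / start)^(1 / periods) - 1
}

df <- data.frame(x = c(2007, 2008, 2009),
                 y = c(100, 120, 140))

# 2007 to 2009: (140 / 100)^(1/2) - 1, roughly 18.3%
cagr(df$y[1], df$y[3], df$x[3] - df$x[1])

# 2007 to 2008: a single period, so simply 120 / 100 - 1, i.e. 20%
cagr(df$y[1], df$y[2], df$x[2] - df$x[1])
```

Note that the start value must be the first of the two bars being compared; always dividing by df$y[1] would give the wrong rate for any comparison that does not begin at the first bar.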

Using ggrepel with single plot point/adding line between label and point

Ok so I have a data set with 2 variables X and Y, and an ID variable. I've created a full plot using this code:
ggplot(data = X_Y) +
geom_point(mapping = aes(x = X, y = Y))+
geom_text_repel(mapping = aes(x = X, y = Y, label = ID))+
xlim(0,100)+
ylim(0,100)
This produces a plot like this:
I now wish to create a number of separate plots only showing a single data point at a time with their label.
Now I can use just geom_label without repel and nudge the label to get this:
While this plot is ok, I was wondering if there was any way to keep the lines connecting labels to points like how ggrepel does...
EDIT
From the first suggestion, when I try to use repel with only one case selected I get the following plot:
ggplot(data = X_Y) +
geom_point(aes(x = X[4], y = Y[4]))+
geom_label_repel(aes(x = X[4], y = Y[4]),
label = "You are here",
min.segment.length = unit(0, 'lines'),
nudge_y = 6)+
labs(x = "X",y = "Y",title = "mytitle")+
scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
RESOLVED
Figured it out! I need to specify my data in ggplot() to only be the X and Y variables and limit to the row of interest.
Like this:
ggplot(data = X_Y[4, c(3, 4)]) +
geom_point(aes(x = X, y = Y))+
geom_label_repel(aes(x = X, y = Y),
label = "You are here",
min.segment.length = unit(0, 'lines'),
nudge_y = 6)+
labs(x = "X",y = "Y",title = "mytitle")+
scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
You can of course still use geom_label_repel, even with a single point. To make sure a segment is drawn, adjust the min.segment.length argument. It sets the minimum distance between point and label at which a segment is drawn; setting it to unit(0, 'lines') ensures every segment is drawn:
library(ggplot2)
library(ggrepel)
ggplot(data.frame(x = 2, y = 3)) +
geom_point(aes(x, y)) +
geom_label_repel(aes(x, y),
label = 'You are here',
min.segment.length = unit(0, 'lines'),
nudge_y = .2) +
scale_x_continuous(limits = c(0, 3)) +
scale_y_continuous(limits = c(0, 4))
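To produce one such plot per data point, as the question asks, the single-point plot can be wrapped in a function and applied to each row. This is a sketch with made-up data: X_Y here is a hypothetical stand-in with columns ID, X and Y, and plot_one() is a helper name introduced for the example.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical stand-in for the asker's X_Y data
X_Y <- data.frame(ID = c("a", "b", "c"),
                  X = c(20, 55, 80),
                  Y = c(35, 70, 15))

# Draw a single labelled point on fixed 0-100 axes
plot_one <- function(point) {
  ggplot(point, aes(X, Y)) +
    geom_point() +
    geom_label_repel(aes(label = ID),
                     min.segment.length = 0,  # always draw the connecting segment
                     nudge_y = 6) +
    scale_x_continuous(limits = c(0, 100)) +
    scale_y_continuous(limits = c(0, 100))
}

# One plot per row, named by ID
plots <- lapply(seq_len(nrow(X_Y)), function(i) plot_one(X_Y[i, , drop = FALSE]))
names(plots) <- X_Y$ID

plots[["b"]]
```

Passing a one-row data frame to ggplot(), rather than indexing inside aes(), is what keeps the point and its label in sync, which matches the fix described in the question's RESOLVED note.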
