Add id labels to points above the limit line in ggplot - r

I have a dataframe like this
id <- c(5738180,51845,167774,517814,1344920)
amount <- c(3.76765976,0.85195407,1.96821355,0.01464609,0.57378284)
outlier <- c("TRUE","FALSE","FALSE","FALSE","FALSE")
df.sample <- data.frame(id,amount,outlier)
I am trying to plot the points and add an id label to any point that is above the limit. In this case that is only the point with id = 5738180.
I am plotting like this
library(tidyverse)
library(ggplot2)
library(ggrepel) # help avoid overlapping text labels
library(gridExtra) # adds custom table inside ggplot
library(scales) # breaks and labels for axes and legends
library(ggthemes) # Adding desired ggplot themes
df.sample %>%
ggplot(aes(x = as.numeric(row.names(df.sample)),
y = amount, label = as.character(id))) +
geom_point(alpha = 0.6, position = position_jitter(w = 0.05, h = 0.0),
aes(colour = (amount < 3)), size = 2) +
geom_hline(aes(yintercept = 3, linetype = "Limit"),
color = 'black', size = 1) +
geom_text(aes(y = 3,x = amount[4],
label = paste("Limit = ", round(3, 3)),
hjust = 0, vjust = 1.5)) +
geom_text_repel(data = subset(df.sample, outlier == 'TRUE'),
nudge_y = 0.75,
size = 4,
box.padding = 1.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50",
direction = "x") +
geom_label_repel(data = subset(df.sample, outlier == 'TRUE'),
nudge_y = 0.75,
size = 4,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50",
direction = "x")
labs(title = "Outlier Detection",
y = "amount",
x = "") +
theme_few() +
theme(legend.position = "none",
axis.text = element_text(size = 10, face = "bold"),
axis.title = element_text(size = 10, face = "bold"),
plot.title = element_text(colour = "blue", hjust = 0.5,
size = 15, face = "bold"),
strip.text = element_text(size = 10, face = "bold")) +
scale_colour_manual(values = c("TRUE" = "green","FALSE" = "red"))
I am running into an error "Error: Aesthetics must be either length 1 or the same as the data (1): x"
Can someone point me in the right direction?

The issue is with geom_text_repel() and geom_label_repel(). You pass them a subset of the data that has only 1 row, but the x aesthetic they inherit, as.numeric(row.names(df.sample)), is still computed from the original data, which has 5 rows, hence the error. To fix this, subset the data outside of the ggplot() call and give the label layer its own aesthetics, with inherit.aes switched off. You are also missing a + after geom_label_repel(). The result below also changes nudge_y to nudge_x and removes the geom_text_repel() layer.
outliers <- subset(df.sample, outlier == TRUE)
ggplot(data = df.sample,
aes(x = as.numeric(row.names(df.sample)),
y = amount,
label = as.character(id))) +
geom_point(alpha = 0.6,
position = position_jitter(w = 0.05, h = 0.0),
aes(colour = (amount < 3)),
size = 2) +
geom_hline(aes(yintercept = 3,
linetype = "Limit"),
color = 'black',
size = 1) +
geom_text(aes(y = 3,x = amount[4],
label = paste("Limit = ",
round(3, 3)),
hjust = 0,
vjust = 1.5)) +
geom_label_repel(data = outliers,
aes(x = as.numeric(rownames(outliers)),
y = amount,
label = as.character(id)),
nudge_x = 0.75,
size = 4,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50",
direction = "x",
inherit.aes = FALSE) +
labs(title = "Outlier Detection",
y = "amount",
x = "") +
theme_few() +
theme(legend.position = "none",
axis.text = element_text(size = 10, face = "bold"),
axis.title = element_text(size = 10, face = "bold"),
plot.title = element_text(colour = "blue", hjust = 0.5,
size = 15, face = "bold"),
strip.text = element_text(size = 10, face = "bold")) +
scale_colour_manual(values = c("TRUE" = "green","FALSE" = "red"))
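A simpler variant of the same fix, as a minimal sketch: if the row index is stored as a real column, every layer (including one that uses a subset) computes x from its own data, so the lengths always match. The column name row and the variable limit are assumptions made up for illustration, and plain geom_text() stands in for geom_label_repel() only to keep the example free of extra packages.

```r
library(ggplot2)

df.sample <- data.frame(
  id = c(5738180, 51845, 167774, 517814, 1344920),
  amount = c(3.76765976, 0.85195407, 1.96821355, 0.01464609, 0.57378284)
)
df.sample$row <- seq_len(nrow(df.sample))  # hypothetical helper column
limit <- 3                                  # hypothetical name for the threshold

p <- ggplot(df.sample, aes(x = row, y = amount)) +
  geom_point(aes(colour = amount < limit), size = 2) +
  geom_hline(yintercept = limit) +
  # the subset carries its own `row` column, so x has the right length
  geom_text(data = subset(df.sample, amount > limit),
            aes(label = as.character(id)), vjust = -1)
```

Swapping geom_text() for ggrepel::geom_label_repel() in the last layer should work the same way, since the layer no longer depends on the full data frame.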

Related

Pie chart and Bar chart aligned on same plot

After seeing this question on how to recreate this graph from the economist in ggplot2, I decided to attempt this myself from scratch (since no code or data was provided), as I found this quite interesting.
Here is what I have managed to do so far:
I was able to do this with relative ease. However, I am struggling with adding the pie charts. Because ggplot2 draws pie charts by switching to polar coordinates, I can't have bars and pies in the same panel. So I discovered geom_arc_bar() from ggforce, which does allow pies in a Cartesian coordinate system. However, the issue is with coord_fixed(). I can get the pies to align, but I cannot get the circular shape without coord_fixed(). With coord_fixed(), I can't get the graph to match the height of the Economist graph. Without coord_fixed() I can, but the pies are ovals rather than circles. See below:
With coord_fixed():
Without coord_fixed():
The other option that I have tried is to make a series of pie charts separately and then combine the plots. However, I struggled to get the plots aligned with gridExtra and other alternatives, so I combined them in Paint. Obviously this works, but it is not programmatic. I need a solution that is 100% based in R.
My solution with pasting separate images from R in paint:
Anybody with a solution to this problem? I think it is an interesting question to answer and I have provided a starting point. I am open to any suggestions, also feel free to suggest an entirely different approach, as I acknowledge that mine is not the best. Thanks!
CODE:
# packages
library(data.table)
library(dplyr)
library(forcats)
library(ggplot2)
library(ggforce)
library(ggnewscale)
library(ggtext)
library(showtext)
library(stringr)
# data
global <- fread("Sector,ROE,Share,Status
Technology,14.2,10,Local
Technology,19,90,Multinational
Other consumer,16.5,77,Multinational
Other consumer,20.5,23,Local
Industrial,13,70,Multinational
Industrial,18,30,Local
Cyclical consumer,12,77,Multinational
Cyclical consumer,21,23,Local
Utilities,6,88,Local
Utilities,11,12,Multinational
All sectors,10,50,Local
All sectors,10.2,50,Multinational
Financial,6,27,Multinational
Financial,10.5,73,Local
Diversified,4.9,21,Local
Diversified,5,79,Multinational
Basic materials,4,82,Multinational
Basic materials,9,18,Local
Media & communications,3,76,Multinational
Media & communications,14,24,Local
Energy,-1,40,Local
Energy,1,60,Multinational
")
equity <- global %>%
group_by(Sector) %>%
mutate(xend = ifelse(min(ROE) > 0, 0, min(ROE)))
equity$Sector <- factor(equity$Sector, levels= rev(c("Technology", "Other consumer",
"Industrial", "Cyclical consumer",
"Utilities", "All sectors", "Financial",
"Diversified", "Basic materials",
"Media & communications", "Energy")))
equity$Status <- factor(equity$Status, levels = c("Multinational", "Local"))
# fonts
font_add_google("Montserrat", "Montserrat")
font_add_google("Roboto", "Roboto")
# scaling text for high res image
img_scale <- 5.5
# graph
showtext_auto() # for montserrat font to show
economist <- ggplot(equity)+
geom_vline(aes(xintercept = -2.5, color = "+-"), show.legend = FALSE)+
geom_vline(aes(xintercept = 2.5, color = "+-"), show.legend = FALSE)+
geom_segment(aes(x = ROE, xend = xend, y = Sector, yend = Sector, color = "line"),
show.legend = FALSE, size = 2)+
geom_tile(aes(x = ROE, y = Sector, width = 1, height = 0.5, fill = Status),
size = 0.5)+
geom_vline(aes(xintercept = 0, color = "x-axis"), show.legend = FALSE)+
scale_fill_manual("", values = c("Local" = "#ea5f47", "Multinational" = "#0a5268"))+
scale_color_manual(values = c("x-axis" = "red", "+-" = "#cddee6", "line" = "#a8adb3"))+
scale_x_continuous(position = "top", limits = c(-5, 25),
breaks = c(-5, -2.5, 0, 2.5, 5,10,15,20,25),
labels = c(5, "-", 0, "+", 5,10,15,20,25),
minor_breaks = c(-2.5, 2.5)
)+
scale_y_discrete(labels = function(x) str_replace_all(x, "& c" , "&\nc"))+
#width = 40))+
labs(x = "", y = "", caption = c("Sources: Bloomberg;",
"The Economist",
"<span style='font-size:80px;
color:#292929;'><sup>*</sup></span>Top 500 global companies"))+
ggtitle("The price of being global",
subtitle = "Return on equity<span style='font-size:80px;color:#292929;'>*</span>, latest 12 months, %")+
theme(legend.position = "top",
legend.direction = "vertical",
legend.justification = -1.25,
legend.key.size = unit(0.18, "cm"),
legend.key.height = unit(0.1, "cm"),
legend.background = element_rect("#cddee6"),
legend.text = element_text("Montserrat", size = 9 * img_scale),
plot.background = element_rect("#cddee6"),
plot.margin = margin(t = 10, r = 10, b = 20, l = 10),
panel.background = element_rect("#cddee6"),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.minor.x = element_blank(),
axis.ticks = element_blank(),
axis.text = element_text(family = "Montserrat", size = 9 * img_scale,
colour = "black"),
axis.text.y = element_text(hjust = 0, lineheight = 0.15,
face = c(rep("plain",5), "bold.italic", rep("plain",5))
),
#axis.text.x = element_text(family = "Montserrat", size = 9*img_scale,)
plot.title = element_text(family = "Montserrat", size = 12 * img_scale,
face = "bold",
hjust = -34.12),
text = element_text(family = "Montserrat"),
plot.subtitle = element_markdown(family = "Montserrat", size = 9 * img_scale,
hjust = 7.5),
plot.caption = element_markdown(size = 9*img_scale,
face = c("plain", "italic", "plain"),
hjust = c(-1.35, -1.85, -2.05),
vjust = c(0,0.75,0)))
# only way to get google fonts on plot (R device does not show them)
png("bar.png", height = 480*8, width = 250*8, res = 72*8) # increased resolution (dpi)
economist
dev.off()
# piechart
pies <- equity %>%
mutate(Sector = fct_rev(Sector)) %>%
ggplot(aes(x = "", y = Share, fill = Status, width = 0.15)) +
geom_bar(stat = "identity", position = position_fill(), show.legend = FALSE, size = 0.1) +
# geom_text(aes(label = Cnt), position = position_fill(vjust = 0.5)) +
coord_polar(theta = "y", direction = -1) +
facet_wrap(~ Sector, dir = "v", ncol = 1) +
scale_fill_manual("", values = c("Local" = "#93b7c7", "Multinational" = "#08526b"))+
#theme_void()+
theme(panel.spacing = unit(-0.35, "lines"),
plot.background = element_rect("#cddee6"),
panel.background = element_rect("transparent"),
strip.text = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.position='none',
axis.ticks = element_blank(),
axis.text = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
# guides(fill=guide_legend(nrow=2, byrow=TRUE))
png("pie_chart.png", height = 350*8, width = 51*8, res = 72*8)
pies
dev.off()
# geom_arc_bar (ggforce) with coord_fixed - cannot match height but pies are circular
eco_circle_pies <- ggplot(equity)+
geom_vline(aes(xintercept = -2.5, color = "+-"), show.legend = FALSE)+
geom_vline(aes(xintercept = 2.5, color = "+-"), show.legend = FALSE)+
geom_segment(aes(x = ROE, xend = xend, y = Sector, yend = Sector, color = "line"),
show.legend = FALSE, size = 1)+
scale_fill_manual("", values = c("Local" = "#ea5f47", "Multinational" = "#0a5268"))+
geom_tile(aes(x = ROE, y = Sector, width = 1, height = 0.5, fill = Status),
size = 0.5, show.legend = TRUE)+
geom_vline(aes(xintercept = 0, color = "x-axis"), show.legend = FALSE)+
new_scale_fill()+
geom_arc_bar(aes(x0 = 27, y0 = as.numeric(equity$Sector), r0 = 0, r = 0.45,
amount = Share,
fill = Status),
stat = 'pie',
color = "transparent",
show.legend = FALSE)+
coord_fixed()+
scale_fill_manual("", values = c("Local" = "#93b7c7", "Multinational" = "#08526b"))+
scale_color_manual(values = c("x-axis" = "red", "+-" = "#cddee6", "line" = "#a8adb3"))+
scale_x_continuous(position = "top", limits = c(-5, 30),
breaks = c(-5, -2.5, 0, 2.5, 5,10,15,20,25),
labels = c(5, "-", 0, "+", 5,10,15,20,25),
minor_breaks = c(-2.5, 2.5)
)+
scale_y_discrete(labels = function(x) str_replace_all(x, "& c" , "&\nc"))+
# below is to get * superscript
labs(x = "", y = "", caption = c("Sources: Bloomberg;",
"<span style='font-style:italic;font-color:#292929'>The Economist</span>",
"<span style='font-size:80px;
color:#292929;'><sup>*</sup></span>Top 500 global companies"))+ # this is to get
ggtitle("The price of being global",
subtitle = "Return on equity<span style='font-size:80px;color:#292929;'>*</span>, latest 12 months, %")+
guides(color = FALSE)+
theme(legend.position = "top",
legend.direction = "vertical",
# legend.justification = -0.9,
legend.key.size = unit(0.18, "cm"),
legend.key.height = unit(0.1, "cm"),
legend.background = element_rect("#cddee6"),
legend.text = element_text("Montserrat", size = 9 * img_scale),
plot.background = element_rect("#cddee6"),
# plot.margin = margin(t = -80, r = 10, b = -20, l = 10),
panel.background = element_rect("#cddee6"),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.minor.x = element_blank(),
axis.ticks = element_blank(),
axis.text = element_text(family = "Montserrat", size = 9 * img_scale,
colour = "black"),
axis.text.y = element_text(hjust = 0, lineheight = 0.15),
#axis.text.x = element_text(family = "Montserrat", size = 9*img_scale,)
plot.title = element_text(family = "Montserrat", size = 12 * img_scale,
hjust = -2.12),
plot.subtitle = element_markdown(family = "Montserrat", size = 9 * img_scale,
hjust = -5.75),
plot.caption = element_markdown(size = 9*img_scale,
face = c("plain", "italic", "plain"),
#hjust = c(-.9, -1.22, -1.95),
#vjust = c(0,0.75,0)))
))
png("eco_circle_pies.png", height = 220*8, width = 420*8, res = 72*8)
eco_circle_pies
dev.off()
# geom_arc_bar (ggforce) without coord_fixed - matches height, but pies are oval
eco_oval_pie <- ggplot(equity)+
geom_vline(aes(xintercept = -2.5, color = "+-"), show.legend = FALSE)+
geom_vline(aes(xintercept = 2.5, color = "+-"), show.legend = FALSE)+
geom_segment(aes(x = ROE, xend = xend, y = Sector, yend = Sector, color = "line"),
show.legend = FALSE, size = 1)+
scale_fill_manual("", values = c("Local" = "#ea5f47", "Multinational" = "#0a5268"))+
geom_tile(aes(x = ROE, y = Sector, width = 1, height = 0.5, fill = Status),
size = 0.5, show.legend = TRUE)+
geom_vline(aes(xintercept = 0, color = "x-axis"), show.legend = FALSE)+
new_scale_fill()+
geom_arc_bar(aes(x0 = 27, y0 = as.numeric(equity$Sector), r0 = 0, r = 0.45,
amount = Share,
fill = Status),
stat = 'pie',
color = "transparent",
show.legend = FALSE)+
# coord_fixed()+
scale_fill_manual("", values = c("Local" = "#93b7c7", "Multinational" = "#08526b"))+
scale_color_manual(values = c("x-axis" = "red", "+-" = "#cddee6", "line" = "#a8adb3"))+
scale_x_continuous(position = "top", limits = c(-5, 30),
breaks = c(-5, -2.5, 0, 2.5, 5,10,15,20,25),
labels = c(5, "-", 0, "+", 5,10,15,20,25),
minor_breaks = c(-2.5, 2.5)
)+
scale_y_discrete(labels = function(x) str_replace_all(x, "& c" , "&\nc"))+
#width = 40))+
labs(x = "", y = "", caption = c("Sources: Bloomberg;",
"<span style='font-style:italic;font-color:#292929'>The Economist</span>",
"<span style='font-size:80px;
color:#292929;'><sup>*</sup></span>Top 500 global companies"))+
ggtitle("The price of being global",
subtitle = "Return on equity<span style='font-size:80px;color:#292929;'>*</span>, latest 12 months, %")+
guides(color = FALSE)+
theme(legend.position = "top",
legend.direction = "vertical",
legend.justification = -1.1,
legend.key.size = unit(0.18, "cm"),
legend.key.height = unit(0.1, "cm"),
legend.background = element_rect("#cddee6"),
legend.text = element_text("Montserrat", size = 9 * img_scale),
plot.background = element_rect("#cddee6"),
# plot.margin = margin(t = -80, r = 10, b = -20, l = 10),
panel.background = element_rect("#cddee6"),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.minor.x = element_blank(),
axis.ticks = element_blank(),
axis.text = element_text(family = "Montserrat", size = 9 * img_scale,
colour = "black"),
axis.text.y = element_text(hjust = 0, lineheight = 0.15),
text = element_text(family = "Montserrat"),
plot.title = element_text(family = "Montserrat", size = 12 * img_scale,
face = "bold",
hjust = -7.05),
plot.subtitle = element_markdown(family = "Montserrat", size = 9 * img_scale,
hjust = 53.75),
plot.caption = element_markdown(size = 9*img_scale,
face = c("plain", "italic", "plain"),
hjust = c(-1.15, -1.58, -1.95),
vjust = c(0.5,1.15,0.5)))
png("eco_oval_pies.png", height = 480*8, width = 250*8, res = 72*8)
eco_oval_pie
dev.off()
Indeed an interesting problem. In my opinion the easiest way to get your desired result is to create two separate plots and to glue them together using the wonderful patchwork package:
Note: To focus on the main issue and to make the code more minimal I dropped all or most of your theme adjustments, ggtext styling, custom fonts, ... . Instead I relied on ggthemes::theme_economist to get close to the economist look.
# packages
library(data.table)
library(dplyr)
library(stringr)
library(forcats)
library(ggplot2)
library(patchwork)
library(ggthemes)
bars <-ggplot(equity)+
geom_vline(aes(xintercept = -2.5, color = "+-"), show.legend = FALSE)+
geom_vline(aes(xintercept = 2.5, color = "+-"), show.legend = FALSE)+
geom_segment(aes(x = ROE, xend = xend, y = Sector, yend = Sector, color = "line"),
show.legend = FALSE, size = 2)+
geom_tile(aes(x = ROE, y = Sector, width = 1, height = 0.5, fill = Status),
size = 0.5)+
geom_vline(aes(xintercept = 0, color = "x-axis"), show.legend = FALSE)+
scale_fill_manual("", values = c("Local" = "#ea5f47", "Multinational" = "#0a5268"))+
scale_color_manual(values = c("x-axis" = "red", "+-" = "#cddee6", "line" = "#a8adb3"))+
scale_x_continuous(position = "top", limits = c(-5, 25),
breaks = c(-5, -2.5, 0, 2.5, 5,10,15,20,25),
labels = c(5, "-", 0, "+", 5,10,15,20,25),
minor_breaks = c(-2.5, 2.5)
)+
scale_y_discrete(labels = function(x) str_replace_all(x, "& c" , "&\nc"))+
labs(x = "", y = "") +
ggthemes::theme_economist() +
theme(legend.position = "top", legend.justification = "left")
pies <- equity %>%
mutate(Sector = fct_rev(Sector)) %>%
ggplot(aes(x = "", y = Share, fill = Status, width = 0.15)) +
geom_bar(stat = "identity", position = position_fill(), show.legend = FALSE, size = 0.1) +
coord_polar(theta = "y", direction = -1) +
facet_wrap(~ Sector, dir = "v", ncol = 1) +
scale_fill_manual("", values = c("Local" = "#93b7c7", "Multinational" = "#08526b")) +
labs(x = NULL, y = NULL) +
ggthemes::theme_economist() +
theme(strip.text = element_blank(), panel.spacing.y = unit(0, "pt"),
axis.text = element_blank(), axis.ticks = element_blank(), axis.line = element_blank(),
panel.grid.major = element_blank())
bars + pies +
plot_layout(widths= c(5, 1)) +
plot_annotation(caption = c("Sources: Bloomberg;",
"The Economist", "Top 500 global companies"),
title = "The price of being global",
subtitle = "Return on equity, latest 12 months, %",
theme = theme_economist())
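To isolate the part that does the alignment, here is a minimal sketch of the same patchwork idea on toy data. The data frame toy and the plot names left, right and combined are made up for illustration; only `+` on two ggplot objects and plot_layout(widths = ) come from the answer above.

```r
library(ggplot2)
library(patchwork)

# toy data standing in for the sector / ROE table
toy <- data.frame(sector = c("A", "B", "C"), roe = c(12, 7, 3))

# both plots share the same discrete y values, so their rows line up
left <- ggplot(toy, aes(x = roe, y = sector)) +
  geom_col()
right <- ggplot(toy, aes(x = 1, y = sector)) +
  geom_point(size = 5) +
  theme_void()

# `+` places the plots side by side; widths gives the left panel 5x the space
combined <- left + right + plot_layout(widths = c(5, 1))
# print(combined) draws the result
```

Because each panel is a separate plot, the bar panel keeps Cartesian coordinates while the other panel is free to use coord_polar(), which is why this avoids the coord_fixed() trade-off.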
Here's a version using base graphics:
global <- read.csv(strip.white = TRUE, text = "Sector,ROE,Share,Status
Technology,14.2,10,Local
Technology,19,90,Multinational
Other consumer,16.5,77,Multinational
Other consumer,20.5,23,Local
Industrial,13,70,Multinational
Industrial,18,30,Local
Cyclical consumer,12,77,Multinational
Cyclical consumer,21,23,Local
Utilities,6,88,Local
Utilities,11,12,Multinational
All sectors,10,50,Local
All sectors,10.2,50,Multinational
Financial,6,27,Multinational
Financial,10.5,73,Local
Diversified,4.9,21,Local
Diversified,5,79,Multinational
Basic materials,4,82,Multinational
Basic materials,9,18,Local
Media & communications,3,76,Multinational
Media & communications,14,24,Local
Energy,-1,40,Local
Energy,1,60,Multinational")
global <- within(global, {
Sector <- factor(Sector, unique(Sector))
Status <- factor(Status, unique(Status))
})
global <- global[order(global$Sector, global$Status), ]
f <- function(x, y, z, col, lbl, xat) {
all <- grepl('All', lbl)
par(mar = c(0, 0, 0, 0))
pie(rev(z), labels = '', clockwise = TRUE, border = NA, col = rev(col))
par(mar = c(0, 10, 0, 0))
plot.new()
plot.window(range(xat), c(-1, 1))
abline(v = xat, col = 'white', lwd = 3)
abline(v = 0, col = 'tomato3', lwd = 3)
segments(min(c(x, 0)), 0, max(x), 0, ifelse(all, 'grey50', 'grey75'), lwd = 7, lend = 1)
text(grconvertX(0.05, 'ndc'), 0, paste(strwrap(lbl, 15), collapse = '\n'),
xpd = NA, adj = 0, cex = 2, font = 1 + all * 3)
for (ii in 1:2)
segments(x[ii], -y / 2, x[ii], y / 2, col = col[ii], lwd = 7, lend = 1)
}
pdf('~/desktop/fig.pdf', height = 10, width = 7)
layout(
matrix(rev(sequence(nlevels(global$Sector) * 2)), ncol = 2, byrow = TRUE),
widths = c(5, 1)
)
cols <- c(Local = '#ea5f47', Multinational = '#08526b')
op <- par(bg = '#cddee6', oma = c(5, 6, 15, 0))
sp <- rev(split(global, global$Sector))
for (x in sp)
f(x$ROE, 1, x$Share, cols, x$Sector[1], -1:5 * 5)
axis(3, lwd = 0, cex.axis = 2)
cols <- rev(cols)
legend(
grconvertX(0.05, 'ndc'), grconvertY(0.91, 'ndc'), paste(names(cols), 'firms'),
border = NA, fill = cols, bty = 'n', xpd = NA, cex = 2
)
text(
grconvertX(0.05, 'ndc'), grconvertY(c(0.96, 0.925), 'ndc'),
c('The price of being global', 'Return on equity*, latest 12 months, %'),
font = c(2, 1), adj = 0, cex = c(3, 2), xpd = NA
)
text(
grconvertX(0.05, 'ndc'), grconvertY(0.03, 'ndc'),
'Sources: Bloomberg;\nThe Economist', xpd = NA, adj = 0, cex = 1.5
)
text(
grconvertX(0.95, 'ndc'), grconvertY(0.03, 'ndc'),
'*Top 500 global companies', xpd = NA, adj = 1, cex = 1.5
)
box('outer')
par(op)
dev.off()

How to modify the x axis levels without affecting the number of points of the plot?

I need to display the x-axis labels neatly without changing the number of points plotted in the final output. Currently the x-axis labels are so closely spaced that the plot does not look good when I show it in PowerPoint.
library("readxl")
my_data <-read_excel("central_high.xlsx") # Input file
str(my_data)
my_data = as.data.frame(my_data)
str(my_data)
my_data$var1 = NULL
f20 = as.data.frame(table(my_data$Year20))
f20$Var1 = as.Date(f20$Var1, "%Y-%m-%d")
f20$Var1 = format(f20$Var1, format="%m-%d")
f20$Cumulative_F20 = cumsum(f20$Freq) # cumulative calculation
f20
newcol_20 = c( my_data$Year19,
my_data$Year18, my_data$Year17,
my_data$Year16, my_data$Year15,
my_data$Year14, my_data$Year13,
my_data$Year12, my_data$Year11,
my_data$Year10, my_data$Year9,
my_data$Year8, my_data$Year7,
my_data$Year6, my_data$Year5,
my_data$Year4, my_data$Year3,
my_data$Year2, my_data$Year1)
str(newcol_20)
newdata_20 = data.frame(newcol_20)
str(newdata_20)
newdata_20$newcol_20 = as.Date(as.character(newdata_20$newcol_20), "%Y-%m-%d")
newdata_20$newcol_20 = format(newdata_20$newcol_20, format="%m-%d")
str(newdata_20)
newtable_20 = table(newdata_20$newcol_20)
newtable_20
newdf_20 = as.data.frame(newtable_20)
#newdf_20$Cumulative_20 = cumsum(newdf_20$Freq)/19 # cumulative calculation
newdf_20$Freq = newdf_20$Freq/19
newdf_20
newcol_05 = c( my_data$Year19,
my_data$Year18, my_data$Year17,
my_data$Year16)
str(newcol_05)
newdata_05 = data.frame(newcol_05)
str(newdata_05)
newdata_05$newcol_05 = as.Date(as.character(newdata_05$newcol_05), "%Y-%m-%d")
newdata_05$newcol_05 = format(newdata_05$newcol_05, format="%m-%d")
str(newdata_05)
newtable_05 = table(newdata_05$newcol_05)
newtable_05
newdf_05 = as.data.frame(newtable_05)
newdf_05$Cumulative_05 = cumsum(newdf_05$Freq)/4 # cumulative calculation
newdf_05$Freq = newdf_05$Freq/4
newdf_05
library(ggplot2)
library(ggpubr)
ggplot() +
geom_line(data = newdf_20, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#111111"), size = 1.6) +
geom_line(data = newdf_05, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#999999"), size = 1.6) +
geom_line(data = f20, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#CC79A7"), size = 1.6) +
geom_vline(xintercept = "03-25", color="gray", size=1)+
geom_vline(xintercept = "04-21", color="gray", size=1)+
labs(y = "Cumulative_Frequency", colour= "#000000", size = 16 )+
font("ylab", size = 15, color = "black", face = "bold.italic")+
font("legend.text",size = 10, face = "bold")+
font("legend.title",size = 15, face = "bold")+
theme(axis.line.x = element_line(size = 0.5, colour = "black"), # theme modification
axis.line.y = element_line(size = 0.5, colour = "black"),
#axis.text.x=element_blank(),axis.ticks.x=element_blank(),
panel.background = element_blank(),
legend.position = 'none',
axis.text.x = element_text(colour = "#000000", size = 7,
angle = 90, face ="bold" ),
axis.text.y = element_text(colour = "#000000", size = 12,
angle = 90, face ="bold" ))
Please suggest how to modify the code. I have also added the output I am currently getting; it needs only a small change to make the x-axis labels readable.
One option would be to dodge the x-axis labels:
library(ggplot2)
library(ggpubr)
ggplot() +
geom_line(data = newdf_20, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#111111"), size = 1.6) +
geom_line(data = newdf_05, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#999999"), size = 1.6) +
geom_line(data = f20, aes(x=Var1, y=cumsum(Freq), group = 1, color = "#CC79A7"), size = 1.6) +
geom_vline(xintercept = "03-25", color="gray", size=1)+
geom_vline(xintercept = "04-21", color="gray", size=1)+
scale_x_discrete(guide = guide_axis(n.dodge=2))+
labs(y = "Cumulative_Frequency", colour= "#000000", size = 16 )+
font("ylab", size = 15, color = "black", face = "bold.italic")+
font("legend.text",size = 10, face = "bold")+
font("legend.title",size = 15, face = "bold")+
theme(axis.line.x = element_line(size = 0.5, colour = "black"), # theme modification
axis.line.y = element_line(size = 0.5, colour = "black"),
#axis.text.x=element_blank(),axis.ticks.x=element_blank(),
panel.background = element_blank(),
legend.position = 'none',
axis.text.x = element_text(colour = "#000000", size = 7,
angle = 90, face ="bold" ),
axis.text.y = element_text(colour = "#000000", size = 12,
angle = 90, face ="bold" ))
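Another option, as a rough sketch: keep every point but only label some of the levels by passing a subset of them to the breaks argument of scale_x_discrete(). The data below is made-up toy data that reuses the Var1 and Freq column names from the question, and showing every 7th date is an arbitrary choice.

```r
library(ggplot2)

# toy data: one observation per day, with "mm-dd" strings as in the question
days <- format(seq(as.Date("2020-03-01"), as.Date("2020-04-30"), by = "day"),
               "%m-%d")
toy <- data.frame(Var1 = days, Freq = 1)

# label only every 7th level; all levels are still plotted
every_7th <- days[seq(1, length(days), by = 7)]

p <- ggplot(toy, aes(x = Var1, y = cumsum(Freq), group = 1)) +
  geom_line() +
  scale_x_discrete(breaks = every_7th)
```

The breaks argument only controls which ticks and labels are drawn, so the line still passes through every date.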

Store variable in a for loop with concatenated column

I'm doing an rpois simulation and I want to create a function that automates changing the variables that determine lambda.
My function should be able to change the lambda value. For example, here I want to be able to change three variables, n1 (175), n2 (11) and n3 (14),
so that the number of simulated Poisson distributions depends on the number of rows of the concatenated input data frame, as in my example.
library(tidyverse)
library(ggrepel)
library(broom)
set.seed(12358)
pois_1 <- tidy(summary(as.factor(rpois(n = 1000000, lambda = (1/336981)*175*11*14)))/1000000)
pois_2 <- tidy(summary(as.factor(rpois(n = 1000000, lambda = (1/336981)*500*11*14)))/1000000)
pois_3 <- tidy(summary(as.factor(rpois(n = 1000000, lambda = (1/336981)*900*11*14)))/1000000)
df_info <- data.frame(pois_1[1:5, ], pois_2[1:5, 2], pois_3[1:5, 2])
names(df_info) <- c("occurence", "175",
"500", "900")
df_info %>%
gather(fl, proba, "175":"900") -> df_info
ggplot(data = df_info, aes(x = fl,
y = proba,
group = occurence)) +
geom_point(size = 2) +
geom_label_repel(aes(label = ifelse(proba > 0.02, as.character(round(proba, 2)), "")),
box.padding = 0.35,
point.padding = 0.5,
segment.color = 'grey50') +
geom_line(aes(linetype = occurence, color = occurence), size = 1) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title.x = element_text(hjust = 0.5, face = "bold"),
axis.title.y = element_text(hjust = 0.5, face = "bold"))
I wanted to wrap this in a function with a for loop, but it turns out to be quite complicated:
Edit: it works with a single set of values, but I want it to handle more than one value of n1.
vizFun <- function(n1, n2, n3){
df_info <- cbind(n1, n2, n3)
names(df_info) <- c("n1", "n2", "n3")
if (nrow(df_info) == 1){
for (i in seq_along(nrow(df_info))){
lambda <- (1/336981)*df_info[i,"n1"]*df_info[i, "n2"]*df_info[i, "n3"]
pois <- tidy(summary(as.factor(rpois(n = 1000000, lambda = lambda)))/1000000)
df_info <- data.frame(pois[, ])
names(df_info) <- c("occurence", "175")
df_info %>%
gather(fl, proba, "175") -> df_info
}
ggplot(data = df_info, aes(x = fl,
y = proba,
group = occurence)) +
geom_point(size = 2) +
geom_label_repel(aes(label = ifelse(proba > 0.02, as.character(round(proba, 2)), "")),
box.padding = 0.35,
point.padding = 0.5,
segment.color = 'grey50') +
geom_line(aes(linetype = occurence, color = occurence), size = 1) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title.x = element_text(hjust = 0.5, face = "bold"),
axis.title.y = element_text(hjust = 0.5, face = "bold"))
}
}
vizFun(500, 11, 14)
Try this:
yourfunction <- function(x = c(), seed = 12358) {
set.seed(seed)
require(tidyverse)
require(ggrepel)
require(broom)
listdata <- list()
for (i in seq_along(x)) {
# keep the first five rows of each simulated frequency table
listdata[[i]] <- tidy(summary(as.factor(rpois(n = 1000000, lambda = (1/336981)*x[i]*11*14)))/1000000)[1:5,] }
# repeat the occurrence labels 0 to 4 once per value of x
df_info <- cbind.data.frame(occurence = as.character(rep(0:(5-1), times = length(x))), fl = as.character(rep(x, each = 5)), proba = dplyr::bind_rows(listdata)[,2])
names(df_info) <- c("occurence", "fl", "proba")
ggplot(data = df_info, aes(x = fl,
y = proba,
group = occurence)) +
geom_point(size = 2) +
geom_label_repel(aes(label = ifelse(proba > 0.02, as.character(round(proba, 2)), "")),
box.padding = 0.35,
point.padding = 0.5,
segment.color = 'grey50') +
geom_line(aes(linetype = occurence, color = occurence), size = 1) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title.x = element_text(hjust = 0.5, face = "bold"),
axis.title.y = element_text(hjust = 0.5, face = "bold"))
}
yourfunction(x=c(175,500,700,900))
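If the goal is the probabilities rather than the simulation itself, dpois() gives them exactly, and expand.grid() handles any number of n1 values. Note that this swaps the simulation for the closed-form Poisson probability, so it is a different technique from the answer above. The function name poisson_table and the max_count argument are made up for this sketch; the column names occurence, fl and proba follow the question.

```r
# exact Poisson probabilities for counts 0..max_count, one block per n1 value
poisson_table <- function(n1, n2 = 11, n3 = 14, max_count = 4) {
  # one row per (count, n1) combination; the first column varies fastest
  grid <- expand.grid(occurence = 0:max_count, n1 = n1)
  lambda <- (1 / 336981) * grid$n1 * n2 * n3
  grid$proba <- dpois(grid$occurence, lambda)
  # same column types as the plotting code in the question expects
  data.frame(occurence = as.character(grid$occurence),
             fl = as.character(grid$n1),
             proba = grid$proba)
}

tab <- poisson_table(c(175, 500, 900))
```

The resulting data frame has the same three columns as df_info, so it could be passed to the same ggplot() call.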

Using ggplot2; How to get position_dodge() to work with this example?

Background
I took the data from a Stephen Few Example and wanted to add labels to each of the bars to pull the legend from the side of the graphic.
The code in the "Hack Graphic" section got me there because I couldn't get the position_dodge() to work with the text labels.
Load Data
library(tidyverse)
library(forcats)
### Build data from his table
candidates <- tibble::tibble(`Rating Areas` = c("Experience",
"Communication", "Friendliness", "Subject matter knowledge", "Presentation",
"Education"), `Karen Fortou` = c(4,3.5, 4, 4, 3, 3.5), `Mike Rafun` = c(4.5,
2, 2, 5, 1.5, 4.5), `Jack Nymbul` = c(2.5, 5, 4.5, 2.5, 2.75, 2)) %>%
gather("Candidates", "Score", -`Rating Areas`)
# The totals for each candidate
totals <- candidates %>% group_by(Candidates) %>% summarise(Score =
sum(Score))
Hack Graphic
Notice how I used manually created x-axis values (x = c(seq(.6,1.35, by = .15), seq(1.6,2.35, by = .15), seq(2.6,3.35, by = .15))) to place the labels instead of using position = position_dodge() as described in this post.
candidates %>%
ggplot(aes(x = fct_reorder(Candidates, Score), y = Score)) +
geom_col(data = totals, alpha = .45) +
geom_col(aes(fill = `Rating Areas`), position = position_dodge(.9), color = "black",
show.legend = FALSE) +
geom_text(label = rep(c("Experience", "Communication", "Friendliness",
"Subject matter knowledge", "Presentation", "Education"),3),
x = c(seq(.6,1.35, by = .15), seq(1.6,2.35, by = .15),
seq(2.6,3.35, by = .15)), y = 5.1, angle = 90, color = "black",
hjust = "left", size = 4, fontface = "bold") +
scale_fill_brewer(type = "qual") +
scale_y_continuous(breaks = seq(0, 25, by = 2)) +
theme_bw() +
labs(x = "\nCandidates", y = "Rating Score") +
theme(axis.text.x = element_text(size = 14, color = "black"), legend.text = element_text(size = 14),
legend.title = element_text(size = 15), axis.title = element_text(size = 15))
Graphic Code that doesn't work
When I follow the example from the previous Stack answer using geom_text(aes(label = `Rating Areas`), position = position_dodge(width = 0.9), angle = 90, color = "black", hjust = "left", size = 4, fontface = "bold") it does not spread the labels out over each bar.
I must be missing something obvious. Please help with how to get position_dodge() to work with this example?
candidates %>%
ggplot(aes(x = fct_reorder(Candidates, Score), y = Score)) +
geom_col(data = totals, alpha = .45) +
geom_col(aes(fill = `Rating Areas`), position = position_dodge(.9), color = "black", show.legend = FALSE) +
geom_text(aes(label = `Rating Areas`), position = position_dodge(width = 0.9), angle = 90, color = "black", hjust = "left", size = 4, fontface = "bold") +
scale_fill_brewer(type = "qual") +
scale_y_continuous(breaks = seq(0, 25, by = 2)) +
theme_bw() +
labs(x = "\nCandidates", y = "Rating Score") +
theme(axis.text.x = element_text(size = 14, color = "black"), legend.text = element_text(size = 14), legend.title = element_text(size = 15), axis.title = element_text(size = 15))
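I think you need to have the same mapping for both geom_col and geom_text: position_dodge() can only spread the labels if the text layer is grouped by the same variable as the bars. You can add fill = `Rating Areas` to the aesthetics of geom_text. You will get a warning though, because geom_text does not use the fill aesthetic.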
candidates %>%
ggplot(aes(x = fct_reorder(Candidates, Score), y = Score)) +
geom_col(data = totals, alpha = .45) +
geom_col(aes(fill = `Rating Areas`), position = position_dodge(.9), color = "black", show.legend = FALSE) +
geom_text(aes(fill = `Rating Areas`, label = `Rating Areas`), position = position_dodge(width = 0.9), angle = 90, color = "black", hjust = "left", size = 4, fontface = "bold") +
scale_fill_brewer(type = "qual") +
scale_y_continuous(breaks = seq(0, 25, by = 2)) +
theme_bw() +
labs(x = "\nCandidates", y = "Rating Score") +
theme(axis.text.x = element_text(size = 14, color = "black"), legend.text = element_text(size = 14), legend.title = element_text(size = 15), axis.title = element_text(size = 15))
Edit: Here's a way to do it without the warning:
candidates %>%
ggplot(aes(x = fct_reorder(Candidates, Score), y = Score, fill = `Rating Areas`)) +
geom_col(data = totals, aes(x = fct_reorder(Candidates, Score), y = Score), alpha = .45, inherit.aes = FALSE) +
geom_col(position = position_dodge(.9), color = "black", show.legend = FALSE) +
geom_text(aes(label = `Rating Areas`), position = position_dodge(width = 0.9), angle = 90, color = "black", hjust = "left", size = 4, fontface = "bold") +
scale_fill_brewer(type = "qual") +
scale_y_continuous(breaks = seq(0, 25, by = 2)) +
theme_bw() +
labs(x = "\nCandidates", y = "Rating Score") +
theme(axis.text.x = element_text(size = 14, color = "black"), legend.text = element_text(size = 14), legend.title = element_text(size = 15), axis.title = element_text(size = 15))
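Another way to get the same grouping without the warning is to map group instead of fill in the text layer. The sketch below is a minimal illustration, not the original answer's method; the `toy` data frame and its `area` column are hypothetical stand-ins, since the `candidates` data is not shown here.

```r
library(ggplot2)

# Hypothetical toy data standing in for the `candidates` data frame.
toy <- data.frame(
  Candidates = rep(c("A", "B", "C"), each = 2),
  area       = rep(c("Experience", "Education"), times = 3),
  Score      = c(3, 4, 2, 5, 4, 1)
)

p <- ggplot(toy, aes(x = Candidates, y = Score)) +
  geom_col(aes(fill = area), position = position_dodge(0.9), color = "black") +
  # `group` gives position_dodge() the same grouping the bars get from `fill`,
  # so each label is shifted sideways onto its own bar.
  geom_text(aes(label = area, group = area),
            position = position_dodge(width = 0.9),
            angle = 90, hjust = "left")

p
```

Because group is an aesthetic that geom_text understands, ggplot2 has nothing to warn about, and the dodge width of 0.9 matches the one used for the bars.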

Exact Positioning of multiple plots in ggplot2 with grid.arrange

I'm trying to create a multi-panel plot with the same x-axis but different y-axes, because I have values for two groups with different ranges. Because I want to control the axis limits (the y-axes should run from 2.000.000 to 4.000.000 and from 250.000 to 500.000, respectively), facet_grid with scales = "free" doesn't work for me.
So what I've tried is to create two plots (named "plot.treat" and "plot.control") and combine them with grid.arrange and arrangeGrob. My problem is that I don't know how to control the exact position of the two plots so that both y-axes sit on one vertical line. In the example below, the second plot's y-axis needs to be positioned a bit more to the right.
Here is the code:
# Load Packages
library(ggplot2)
library(grid)
library(gridExtra)
# Create Data
data.treat <- data.frame(seq(2005.5, 2015.5, 1), rep("SIFI", 11),
c(2230773, 2287162, 2326435, 2553602, 2829325, 3372657, 3512437,
3533884, 3519026, 3566553, 3527153))
colnames(data.treat) <- c("Jahr", "treatment",
"Aggregierte Depositen (in Tausend US$)")
data.control <- data.frame(seq(2005.5, 2015.5, 1), rep("Nicht-SIFI", 11),
c(324582, 345245, 364592, 360006, 363677, 384674, 369007,
343893, 333370, 318409, 313853))
colnames(data.control) <- c("Jahr", "treatment",
"Aggregierte Depositen (in Tausend US$)")
# Create Plot for data.treat
plot.treat <- ggplot() +
geom_line(data = data.treat,
aes(x = `Jahr`,
y = `Aggregierte Depositen (in Tausend US$)`),
size = 1,
linetype = "dashed") +
geom_point(data = data.treat,
aes(x = `Jahr`,
y = `Aggregierte Depositen (in Tausend US$)`),
fill = "white",
size = 2,
shape = 24) +
scale_x_continuous(breaks = seq(2005, 2015.5, 1),
minor_breaks = seq(2005, 2015.5, 0.5),
limits = c(2005, 2015.8),
expand = c(0.01, 0.01)) +
scale_y_continuous(breaks = seq(2000000, 4000000, 500000),
minor_breaks = seq(2000000, 4000000, 250000),
labels = c("2.000.000", "2.500.000", "3.000.000",
"3.500.000", "4.000.000"),
limits = c(2000000, 4000000),
expand = c(0, 0.01)) +
theme(text = element_text(family = "Times"),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.line.x = element_line(color="black", size = 0.6),
axis.line.y = element_line(color="black", size = 0.6),
legend.position = "none") +
geom_segment(aes(x = c(2008.7068),
y = c(2000000),
xend = c(2008.7068),
yend = c(3750000)),
linetype = "dotted") +
annotate(geom = "text", x = 2008.7068, y = 3875000, label = "Lehman\nBrothers + TARP",
colour = "black", size = 3, family = "Times") +
geom_segment(aes(x = c(2010.5507),
y = c(2000000),
xend = c(2010.5507),
yend = c(3750000)),
linetype = "dotted") +
annotate(geom = "text", x = 2010.5507, y = 3875000, label = "Dodd-Frank-\nAct",
colour = "black", size = 3, family = "Times") +
geom_rect(aes(xmin = 2007.6027, xmax = 2009.5, ymin = -Inf, ymax = Inf),
fill="dark grey", alpha = 0.2)
# Create Plot for data.control
plot.control <- ggplot() +
geom_line(data = data.control,
aes(x = `Jahr`,
y = `Aggregierte Depositen (in Tausend US$)`),
size = 1,
linetype = "solid") +
geom_point(data = data.control,
aes(x = `Jahr`,
y = `Aggregierte Depositen (in Tausend US$)`),
fill = "white",
size = 2,
shape = 21) +
scale_x_continuous(breaks = seq(2005, 2015.5, 1), # x-axis
minor_breaks = seq(2005, 2015.5, 0.5),
limits = c(2005, 2015.8),
expand = c(0.01, 0.01)) +
scale_y_continuous(breaks = seq(250000, 500000, 50000),
minor_breaks = seq(250000, 500000, 25000),
labels = c("250.000", "300.000", "350.000", "400.000",
"450.000", "500.000"),
limits = c(250000, 500000),
expand = c(0, 0.01)) +
theme(text = element_text(family = "Times"),
axis.title.x = element_blank(), # axis
axis.title.y = element_blank(), # axis
axis.line.x = element_line(color="black", size = 0.6),
axis.line.y = element_line(color="black", size = 0.6),
legend.position = "none") +
geom_segment(aes(x = c(2008.7068),
y = c(250000),
xend = c(2008.7068),
yend = c(468750)),
linetype = "dotted") +
annotate(geom = "text", x = 2008.7068, y = 484375, label = "Lehman\nBrothers + TARP",
colour = "black", size = 3, family = "Times") +
geom_segment(aes(x = c(2010.5507),
y = c(250000),
xend = c(2010.5507),
yend = c(468750)),
linetype = "dotted") +
annotate(geom = "text", x = 2010.5507, y = 484375, label = "Dodd-Frank-\nAct",
colour = "black", size = 3, family = "Times") +
geom_rect(aes(xmin = 2007.6027, xmax = 2009.5, ymin = -Inf, ymax = Inf),
fill="dark grey", alpha = 0.2)
# Combine both Plots with grid.arrange
grid.arrange(arrangeGrob(plot.treat, plot.control,
ncol = 1,
left = textGrob("Aggregierte Depositen (in Tausend US$)",
rot = 90,
vjust = 1,
gp = gpar(fontfamily = "Times",
fontsize = 12,
col = "black",
fontface = "bold")),
bottom = textGrob("Jahr",
vjust = 0.1,
hjust = 0.2,
gp = gpar(fontfamily = "Times",
fontsize = 12,
col = "black",
fontface = "bold"))))
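Install cowplot:
install.packages("cowplot")
but do not attach it with library(cowplot): versions of cowplot before 1.0 change the default ggplot2 theme when attached, which will mess up your theme work.
Then call plot_grid() through the cowplot:: prefix, with align = "v" so the two panels line up vertically: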
grid.arrange(
arrangeGrob(cowplot::plot_grid(plot.treat, plot.control, align = "v", ncol=1),
ncol = 1,
left = textGrob("Aggregierte Depositen (in Tausend US$)",
rot = 90,
vjust = 1,
gp = gpar(fontfamily = "Times",
fontsize = 12,
col = "black",
fontface = "bold")),
bottom = textGrob("Jahr",
vjust = 0.1,
hjust = 0.2,
gp = gpar(fontfamily = "Times",
fontsize = 12,
col = "black",
fontface = "bold"))))
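If you would rather not install another package, a different technique (not the one used above) is to convert both plots to grobs and give them identical column widths before arranging them. This is only a sketch under assumptions: `p.big` and `p.small` are hypothetical toy plots standing in for plot.treat and plot.control, and it relies on both grobs having the same column layout, which holds when they are built by the same ggplot2 version with the same components.

```r
library(ggplot2)
library(grid)
library(gridExtra)

# Hypothetical toy plots whose y-axis labels differ in width.
years   <- 2006:2015
p.big   <- ggplot(data.frame(x = years, y = seq(2e6, 4e6, length.out = 10)),
                  aes(x, y)) + geom_line()
p.small <- ggplot(data.frame(x = years, y = seq(250, 500, length.out = 10)),
                  aes(x, y)) + geom_line()

# Convert the plots to gtables so their layout widths can be edited.
g.big   <- ggplotGrob(p.big)
g.small <- ggplotGrob(p.small)

# Take the column-wise maximum width and apply it to both gtables,
# so the axis-label columns (and hence the panels) line up.
max.widths     <- unit.pmax(g.big$widths, g.small$widths)
g.big$widths   <- max.widths
g.small$widths <- max.widths

combined <- arrangeGrob(g.big, g.small, ncol = 1)
# Draw with: grid.newpage(); grid.draw(combined)
```

The same `combined` object can be passed as the first argument of the outer arrangeGrob() call in place of cowplot::plot_grid(...), so the left and bottom text labels are kept.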

Resources