Why doesn't my geom_vline object show up on my plot? - r

I have looked at a number of solutions to others' problems relating to the same issues, but nothing thus far has worked for me.
I want a vertical line to appear on "2018-07-23" and this code is the closest I have gotten (in that it doesn't produce an error):
ggplot(grouped) +
geom_line(aes(x = date, y = sitewide_opens, group = 1),
linetype = "dashed",
colour = "forestgreen",
alpha = 0.5) +
geom_line(aes(x = date, y = homepage_opens, group = 1),
colour = "blue") +
geom_vline(aes(xintercept = as.Date(grouped$date[8])),
linetype = 4, colour = "black")
The format of grouped$date is character, which is why I convert it to date. Note that I get the same (non-)result with as.POSIXct too.
Where am I going wrong?
My data frame:
grouped <- structure(list(date = c("2018-07-16", "2018-07-17", "2018-07-18",
"2018-07-19", "2018-07-20", "2018-07-21", "2018-07-22", "2018-07-23",
"2018-07-24", "2018-07-25", "2018-07-26", "2018-07-27", "2018-07-28",
"2018-07-29", "2018-07-30", "2018-07-31"), homepage_opens = c(5L,
0L, 0L, 3L, 1L, 2L, 0L, 1L, 0L, 2L, 5L, 0L, 0L, 0L, 0L, 0L),
sitewide_opens = c(39L, 34L, 19L, 62L, 46L, 44L, 16L, 51L,
25L, 66L, 75L, 0L, 0L, 0L, 0L, 0L), chats_started = c(10L,
16L, 9L, 8L, 13L, 13L, 5L, 13L, 4L, 8L, 11L, 0L, 0L, 0L,
0L, 0L), chats_completed = c(7L, 13L, 8L, 4L, 5L, 9L, 6L,
13L, 2L, 7L, 5L, 0L, 0L, 0L, 0L, 0L)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -16L))
My graph:

I reproduced your plot with your example data, converting date with as.Date() in every layer, and it worked fine:
ggplot(grouped) +
geom_line(aes(x = as.Date(date), y = sitewide_opens, group = 1),
linetype = "dashed",
colour = "forestgreen",
alpha = 0.5) +
geom_line(aes(x = as.Date(date), y = homepage_opens, group = 1),
colour = "blue") +
geom_vline(xintercept = as.Date(grouped$date[8]),
linetype = 4, colour = "black")
Out comes the plot:
https://i.imgur.com/Km8UvaX.jpg

How about changing xintercept = as.Date(grouped$date[8]) to xintercept = 8? Because date is a character column, the x axis is discrete, and "2018-07-23" sits at position 8.
ggplot(grouped) +
geom_line(aes(x = date, y = sitewide_opens, group = 1),
linetype = "dashed",
colour = "forestgreen",
alpha = 0.5) +
geom_line(aes(x = date, y = homepage_opens, group = 1),
colour = "blue") +
geom_vline(aes(xintercept = 8),
linetype = 4, colour = "black")
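A short check of why the original call drew nothing and why xintercept = 8 works: with a character date column, ggplot2 treats the x axis as discrete and places the values at positions 1, 2, 3, ..., while a Date is stored as a day count since 1970-01-01, far outside that range. The sketch below uses only base R; the shortened dates vector is a stand-in for grouped$date.

```r
# Stand-in for grouped$date (character, as in the question)
dates <- c("2018-07-16", "2018-07-17", "2018-07-18", "2018-07-19",
           "2018-07-20", "2018-07-21", "2018-07-22", "2018-07-23")

# A Date is a day count since 1970-01-01, so as an x position it lands
# far to the right of a discrete axis whose positions are 1, 2, 3, ...
date_as_number <- as.numeric(as.Date(dates[8]))

# On a discrete axis the line belongs at the position of the value
# among the sorted unique values
position_on_axis <- match("2018-07-23", sort(unique(dates)))

print(date_as_number)
print(position_on_axis)
```

Converting the column once up front, e.g. grouped$date <- as.Date(grouped$date), should give a continuous date axis, after which geom_vline(xintercept = as.Date("2018-07-23")) lines up without any per-layer conversion.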

Change a sparse plot into a different

I have a sparse plot due to data input
Data input
dframe <- structure(list(value = c(1L, 2L, 3L, 4L, 5L, 8L, 6L, 7L,
10L, 9L, 14L, 15L, 20L, 22L, 24L), level= c(1009L, 103L, 43L,
7L, 5L, 4L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-15L))
And the plot:
library(ggplot2)
p <- ggplot(data=dframe, mapping = aes(x=value, y=level)) +
geom_col(color = '#032838', fill = 'steelblue', size = 1) +
geom_text(aes(label = level), vjust = -0.4, size = 4, position = position_dodge(0.9))
Is there any alternative plot which will not be so sparse after frequency of 30 in x axis?
Here is a suggestion: you could zoom in on the part of the plot where the data are sparse. An example with ggforce:
library(ggforce)
#transform your data to be plotted by geom_histogram (or geom_density)
df <- data.frame(value=rep(dframe$value,dframe$level))
ggplot() +
geom_histogram(aes(x=value),dplyr::mutate(df, z = F),bins = 25,color = '#032838', fill = 'steelblue') +
geom_histogram(aes(x=value),dplyr::mutate(df, z = T),bins =50,color = '#032838', fill = 'steelblue') +
facet_zoom(xlim = c(5, 25),ylim=c(0,10), horizontal = F,zoom.data = z,zoom.size=0.5)+
theme(zoom.y = element_blank(), validate = FALSE)
which gives you:
You can play with the bins argument to find the best fit for your data.
N.B. I removed the geom_text part since you did not provide the Users variable.
Why not just take the logarithm of your level data? That would be the standard thing to do in such a situation. Consider:
p <- ggplot(data=dframe, mapping = aes(x=value, y=log(level))) +
geom_col(color = '#032838', fill = 'steelblue', size = 1)
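One caveat with a plain log(): every level of 1 maps to 0, so those bars end up with no height. A possible variation, which is not part of the answer above, is log1p(), which computes log(1 + x). A minimal base R check using the level column from the question:

```r
# level column from the question's dframe
level <- c(1009L, 103L, 43L, 7L, 5L, 4L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L)

plain_log   <- log(level)    # log(1) is exactly 0
shifted_log <- log1p(level)  # log(1 + level), strictly positive here

# How many bars would have zero height under each transform
n_zero_plain   <- sum(plain_log == 0)
n_zero_shifted <- sum(shifted_log == 0)

print(n_zero_plain)
print(n_zero_shifted)
```

In the plot this would amount to aes(x = value, y = log1p(level)); the axis then shows log(1 + level), so it needs a matching axis label.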

What's the right way to add text to geom_histogram in ggplot?

I've plotted a histogram with wage on the x-axis and a y-axis that shows the percentage of individuals in the data set that have this particular wage. Now I want each bar to display how many observations it contains. E.g. in the sample_data I've provided, how many wages are in the 10% bars and how many in the 20% bars?
Here's a small sample of my data:
sample_data<- structure(list(wage = c(81L, 77L, 63L, 84L, 110L, 151L, 59L,
109L, 159L, 71L), school = c(15L, 12L, 10L, 15L, 16L, 18L, 11L,
12L, 10L, 11L), expr = c(17L, 10L, 18L, 16L, 13L, 15L, 19L, 20L,
21L, 20L), public = c(0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L),
female = c(1L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L), industry = c(63L,
93L, 71L, 34L, 83L, 38L, 82L, 50L, 71L, 37L)), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")
Here's my R script
library(ggplot2)
library(dplyr)
ggplot(data = sample_data) +
geom_histogram(aes(x = wage, y = stat(count) / sum(count)), binwidth = 4, color = "black") +
scale_x_continuous(breaks = seq(0, 300, by = 20)) +
scale_y_continuous(labels = scales::percent_format())
I'm happy with this basically, but whatever I try -- I can't get text on top of my columns. Here is one example of many using stat_count that doesn't work:
ggplot(data = sample_data) +
geom_histogram(aes(x = wage, y = stat(count) / sum(count)), binwidth = 4, color = "black") +
scale_x_continuous(breaks = seq(0, 300, by = 20)) +
scale_y_continuous(labels = scales::percent_format()) +
stat_count(aes(y = ..count.., label =..count..), geom = "text")
I've also tried using geom_text to no avail.
EDIT: ANSWER!
Many thanks to those who replied.
I ended up using teunbrand's solution with a small modification where I changed after_stat(density) to after_stat(count) / sum(count).
Here's the 'final' code:
ggplot(sample_data) +
geom_histogram(
aes(x = wage,
y = after_stat(count) / sum(count)),
binwidth = 4, colour = "black"
) +
stat_bin(
aes(x = wage,
y = after_stat(count) / sum(count),
label = after_stat(ifelse(count == 0, "", count))),
binwidth = 4, geom = "text", vjust = -1) +
scale_x_continuous(breaks = seq(0, 300, by = 20)) +
scale_y_continuous(labels = scales::percent_format())
Different layers typically don't share stateful information, so you could use the same stat as the histogram (stat_bin()) to display the labels. Then, you can use after_stat() to use the computed variables of the stat part of the layer to make labels.
library(ggplot2)
sample_data<- structure(list(
wage = c(81L, 77L, 63L, 84L, 110L, 151L, 59L, 109L, 159L, 71L),
school = c(15L, 12L, 10L, 15L, 16L, 18L, 11L, 12L, 10L, 11L),
expr = c(17L, 10L, 18L, 16L, 13L, 15L, 19L, 20L, 21L, 20L),
public = c(0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L),
female = c(1L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L),
industry = c(63L, 93L, 71L, 34L, 83L, 38L, 82L, 50L, 71L, 37L)),
row.names = c("1","2", "3", "4", "5", "6", "7", "8", "9", "10"),
class = "data.frame")
ggplot(sample_data) +
geom_histogram(
aes(x = wage,
y = after_stat(density)),
binwidth = 4, colour = "black"
) +
stat_bin(
aes(x = wage,
y = after_stat(density),
label = after_stat(ifelse(count == 0, "", count))),
binwidth = 4, geom = "text", vjust = -1
)
Created on 2021-03-28 by the reprex package (v1.0.0)
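The difference between after_stat(density) and after_stat(count) / sum(count) is only a scaling by the bin width: density is count / (n * binwidth), so multiplying it by the bin width gives the proportion. A base R sketch with hist(); the breaks here are an assumption chosen for illustration and are not necessarily the bin boundaries ggplot2 picks.

```r
# wage column from the question's sample_data
wage <- c(81L, 77L, 63L, 84L, 110L, 151L, 59L, 109L, 159L, 71L)
binwidth <- 4

# Assumed breaks covering the data; ggplot2 chooses its own boundaries
breaks <- seq(56, 160, by = binwidth)
h <- hist(wage, breaks = breaks, plot = FALSE)

# Proportion of observations per bin (what count / sum(count) gives)
proportion <- h$counts / sum(h$counts)

# Density rescaled by the bin width gives the same numbers
density_times_width <- h$density * binwidth

print(all.equal(proportion, density_times_width))
```

This is why the percent-formatted y axis only reads as "share of observations" with count / sum(count); with density the bar heights sum to 1 / binwidth instead of 1.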
Personally I find the existing answers on this topic somewhat frustrating, and I would have expected a much simpler solution to exist somewhere. I am not a fan of the 0's showing up in my histograms either, and positioning labels with stat_bin gets frustrating at times. Having had to do this a couple of times, I usually fall back on some manual calculations and use geom_rect in combination with geom_text/geom_label. Maybe some day I'll sit down and actually create the (I believe) 3 functions needed for a proper geom_*. Until then the basic idea is:
1. Create the histogram data using hist.
2. Convert it to a data.frame with the aesthetics needed for geom_rect (our "geom_hist" substitute) and geom_text.
3. Plot manually with this data in the necessary layers.
#' Compute data for creating a manual histogram with ggplot including labels
#'
#' @param bardata output from \code{hist(data, plot = FALSE)}
#' @param probs if TRUE (the default), label bars with raw counts; if FALSE, label them with proportions
#'
#' @return a \code{data.frame} with columns xmin, ymin, xmax, ymax, mids and label
create_gg_hist_df <- function(bardata, probs = TRUE){
nb <- length(bardata$breaks)
xmax <- bardata$breaks[-1L]
xmin <- bardata$breaks[-nb]
mids <- bardata$mids
ymin <- integer(nb - 1)
# hist() stores the bin counts in $counts
ymax <- bardata$counts / sum(bardata$counts)
label <- if(!probs) ymax else bardata$counts
data.frame(xmin = xmin,
ymin = ymin,
xmax = xmax,
ymax = ymax,
mids = mids,
label = label)
}
ggbardata <- create_gg_hist_df(hist(sample_data$wage,
# breaks based on ggplot2 when "width" is supplied
breaks = ggplot2:::bin_breaks_width(range(sample_data$wage),
width = 4)$breaks,
plot = FALSE))
ggbardata %>%
# Remove "0" columns ( I don't want them. That is my preference )
filter(ymax > 0) %>%
ggplot(aes(xmin = xmin, xmax = xmax,
ymin = ymin, ymax = ymax,
label = label)) +
# Add histogram
geom_rect(color = 'black') +
# Add text
geom_text(aes(x = mids, y = ymax), nudge_y = 0.005) +
scale_y_continuous(labels = scales::percent_format()) +
labs(x = 'wage', y = 'frequency')

How to create segmented graphs in ggplot2 with legend?

I have data as follows:
I would like to create a segmented plot (like a pre- and post- plot), including a vertical line at t = 10 to indicate the change. t is the elapsed time, x is 0 for pre-implementation and 1 for post-implementation, and the count_visit_triage\\d columns hold the count data that I would like to plot on the y-axis.
This is my R code. I have pieced together multiple geom_smooth layers in the same figure, each colour representing values from triage1, triage2, etc. Because of this, I couldn't get a legend. My question is: how can we simplify this code so that a legend is included in the figure?
ggplot(df, aes(x = t, y = count_visit_triage1)) +
geom_smooth(data = subset(df, x == 0), aes(x = t, y = count_visit_triage1), colour = "blue", se = F) +
geom_smooth(data = subset(df, x == 1), aes(x = t, y = count_visit_triage1), colour = "blue", se = F) +
geom_smooth(data = subset(df, x == 0), aes(x = t, y = count_visit_triage2), colour = "orange", se = F) +
geom_smooth(data = subset(df, x == 1), aes(x = t, y = count_visit_triage2), colour = "orange", se = F) +
geom_smooth(data = subset(df, x == 0), aes(x = t, y = count_visit_triage3), colour = "green", se = F) +
geom_smooth(data = subset(df, x == 1), aes(x = t, y = count_visit_triage3), colour = "green", se = F) +
geom_smooth(data = subset(df, x == 0), aes(x = t, y = count_visit_triage4), colour = "red", se = F) +
geom_smooth(data = subset(df, x == 1), aes(x = t, y = count_visit_triage4), colour = "red", se = F) +
geom_vline(xintercept = 10, linetype = "dashed") +
theme_bw()
Data:
df <- structure(list(t = 1:20, x = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), count_visit_triage1 = c(42L,
55L, 61L, 52L, 58L, 38L, 47L, 46L, 66L, 44L, 24L, 17L, 40L, 25L,
18L, 23L, 34L, 35L, 22L, 23L), count_visit_triage2 = c(175L,
241L, 196L, 213L, 189L, 163L, 181L, 166L, 229L, 224L, 153L, 139L,
125L, 145L, 134L, 115L, 152L, 153L, 136L, 154L), count_visit_triage3 = c(120L,
114L, 106L, 88L, 108L, 103L, 103L, 93L, 80L, 81L, 88L, 94L, 94L,
77L, 91L, 100L, 93L, 70L, 79L, 77L), count_visit_triage4 = c(3L,
0L, 0L, 1L, 2L, 2L, 0L, 4L, 4L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L,
0L, 1L, 2L)), row.names = c(NA, -20L), class = c("tbl_df", "tbl",
"data.frame"))
Reshape the data then specify the col and group aesthetics.
library(tidyverse)
df %>%
pivot_longer(starts_with("count_")) %>%
ggplot(aes(t, value, col = name, group = paste(x, name))) +
geom_smooth(se = FALSE) +
geom_vline(xintercept = 10, linetype = "dashed") +
theme_bw()
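To see what group = paste(x, name) does: each series is split into a pre and a post segment, so geom_smooth() fits one curve per segment, while col = name keeps one legend entry per series. A base R sketch on a cut-down data frame (two series and four rows, values borrowed from the question; stack() is used here as a stand-in for pivot_longer()):

```r
# Cut-down version of df: two pre rows and two post rows, two series
toy <- data.frame(
  t = 1:4,
  x = c(0L, 0L, 1L, 1L),
  count_visit_triage1 = c(42L, 55L, 24L, 17L),
  count_visit_triage2 = c(175L, 241L, 153L, 139L)
)

# Base R stand-in for pivot_longer(): stack the value columns and
# repeat the id columns once per stacked column
value_cols <- c("count_visit_triage1", "count_visit_triage2")
long <- cbind(
  toy[rep(seq_len(nrow(toy)), times = length(value_cols)), c("t", "x")],
  stack(toy[value_cols])
)
names(long)[3:4] <- c("value", "name")

# One group per (period, series) pair; one colour per series
groups  <- unique(paste(long$x, long$name))
colours <- unique(as.character(long$name))

print(groups)
print(colours)
```

With the full data this gives eight groups (four series, each split at t = 10) and four colours, which is what produces the broken lines and the four-entry legend.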
You can try this:
library(tidyverse)
df %>%
pivot_longer(cols = -c(t,x),
names_to = "visit",
values_to = "count") %>%
ggplot() +
geom_line(aes(x = t,
y = count,
color = visit,
group = interaction(x,visit))) +
geom_vline(xintercept = 10, linetype = "dashed") +
scale_color_manual(name = "legend",
values = 1:4,
labels = c("Visit Triage 1",
"Visit Triage 2",
"Visit Triage 3",
"Visit Triage 4")) +
theme_bw()

ggplot2: nudge geom_step() upwards a little bit for every group with discrete y-axis

I have objects moving through different places over time, the plots look like this (but with many more paths):
ggplot(data = df, aes(
y = place,
x = value,
color = order,
group = order
)) +
geom_step(alpha = 0.5) +
theme(legend.position = "bottom") +
guides(color = guide_legend(ncol = 1)) +
geom_point(alpha = 0.5) +
facet_wrap( ~ order)
I'd like to combine the facets into one plot:
ggplot(data = df, aes(
y = place,
x = value,
color = order,
group = order
)) +
geom_step(alpha = 0.5) +
theme(legend.position = "bottom") +
guides(color = guide_legend(ncol = 1)) +
geom_point(alpha = 0.5)
The problem I have with this is the overlapping. I would like to nudge every different colour of geom_step() up by a few pixels (maybe a linewidth), so that overlapping lines appear thicker. I have tried the approach from "R - ggplot dodging geom_lines", but changing the x- and y-coordinates messes up the plot.
ggplot(data = df, aes(
y = value,
x = place,
color = order,
group = order
)) +
geom_step(alpha = 0.5, direction = "vh", position = position_dodge(width = 0.5)) +
theme(legend.position = "bottom") +
guides(color = guide_legend(ncol = 1)) +
coord_flip()
I hope I was clear about my desired output. I'm grateful for any hints!
The data:
df <- structure(list(place = structure(c(1L, 7L, 8L, 2L, 8L, 4L, 8L,
11L, 9L, 10L, 9L, 7L, 6L, 7L, 1L, 7L, 8L, 3L, 8L, 5L, 9L, 11L,
9L, 10L, 8L, 7L, 6L, 7L), .Label = c("A", "B", "C", "D", "E",
"F", "G", "H", "I", "J", "K"), class = "factor"), order = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), value = c(0, 38.0069999694824, 254.986999988556,
266.786999940872, 358.447000026703, 368.375, 613.148000001907,
626.457999944687, 778.240999937057, 790.655999898911, 844.833999872208,
914.274999856949, 925.282999992371, 952.84299993515, 0, 38.3450000286102,
80.5469999313354, 93.7960000038147, 188.280999898911, 199.918999910355,
380.635999917984, 385.131000041962, 447.441999912262, 455.503999948502,
528.233000040054, 677.162999868393, 690.805000066757, 713.063999891281
)), row.names = c(NA, -28L), class = "data.frame")
Okay, so after some more googling I stumbled upon the ggstance package, which includes a vertical version of position_dodge that does exactly what I need:
library(ggstance)
ggplot(data = df, aes(
y = place,
x = value,
color = order,
group = order
)) +
geom_step(position = position_dodge2v(height = 0.2, preserve = "single")) +
theme(legend.position = "bottom") +
guides(color = guide_legend(ncol = 1))
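If adding a dependency is not an option, a similar effect can probably be sketched by hand: convert the discrete place to its integer position, add a small per-group offset, and put the labels back on a continuous axis. The helper name offset_by_group and the 0.1 spacing are assumptions for illustration; they are not part of ggstance or ggplot2.

```r
# Hypothetical helper: numeric y position for each row, shifted per group
offset_by_group <- function(position, group, spacing = 0.1) {
  pos <- as.integer(position)        # factor level -> 1, 2, 3, ...
  g <- as.integer(factor(group))     # group -> 1, 2, ...
  n_groups <- max(g)
  # Centre the offsets around zero so groups straddle the original position
  pos + (g - (n_groups + 1) / 2) * spacing
}

# Small made-up example: two paths visiting the same three places
place   <- factor(c("A", "G", "H", "A", "G", "H"), levels = c("A", "G", "H"))
path_id <- factor(c("a", "a", "a", "b", "b", "b"))

y_nudged <- offset_by_group(place, path_id)
print(y_nudged)
```

In the plot this would be used as aes(y = offset_by_group(place, order)) together with scale_y_continuous(breaks = seq_along(levels(df$place)), labels = levels(df$place)) so that the axis still shows the place names.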

ggplot2 - alternative to geom_ribbon for non continuous x values with facet

I need to plot a ribbon around a hline in a graph with barplots divided in facets. The x axis is non continuous and even though I have tried different solutions like making x numeric for geom_ribbon, I can't find a solution.
toplot=structure(list(size = c(10L, 10L, 10L, 10L, 30L, 30L, 30L, 30L,
50L, 50L, 50L, 50L, 100L, 100L, 100L, 100L), density = structure(c(2L,
3L, 4L, 5L, 2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L), .Label = c("control",
"low", "medium", "high", "extreme"), class = "factor"), mean = c(0.649495617453177,
0.595030456501759, 0.671853292620394, 0.772710452129729, 0.208287258947775,
0.113070097194118, 0.138593272196695, 0.106836463449531, 0.142217123599047,
0.291860533054406, 0.187033701620647, 0.12045308442074, 0, 0.0000389132497170763,
0.00251973356226341, 0), sd = c(0.0472308191904496, 0.0716594048000388,
0.0857233139528986, 0.0534307204561747, 0.0481240616513752, 0.0390094013972726,
0.0412224562146842, 0.0278742510208481, 0.0233346723409426, 0.0559831409664118,
0.0494588911471589, 0.0270924698136921, 0, 0.000218839700404029,
0.00550243848896909, 0), period = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), class = "factor", .Label = "final")), .Names = c("size",
"density", "mean", "sd", "period"), row.names = c(2L, 3L, 4L,
5L, 7L, 8L, 9L, 10L, 12L, 13L, 14L, 15L, 17L, 18L, 19L, 20L), class = "data.frame")
contr=structure(list(size = c(10L, 30L, 50L, 100L), density = structure(c(1L,
1L, 1L, 1L), .Label = c("control", "low", "medium", "high", "extreme"
), class = "factor"), mean = c(0.640615964924125, 0.231731093831607,
0.122309113981835, 0.0053438272624331), sd = c(0.04503167947312,
0.0406874041671366, 0.0173288744394121, 0.00181433175554796),
period = c("final", "final", "final", "final")), .Names = c("size",
"density", "mean", "sd", "period"), row.names = c(1L, 6L, 11L,
16L), class = "data.frame")
and the code that I have
p <- ggplot(data=toplot,aes(x=period,y=mean,fill=density)) +
geom_bar(stat='identity',position = 'dodge') +
facet_grid(~size) +
geom_hline(data = contr, aes(yintercept = mean,linetype = "control"),size=1.2) +
scale_linetype_manual(name = "",values=2)
I would like to draw a ribbon around the horizontal control line but it's not working. This doesn't draw anything and changes the fill.
p + geom_ribbon(data=contr, aes(ymin = mean - sd, ymax = mean + sd),fill='grey')
and this also messes up the facets
p + geom_ribbon(data=contr, aes(x=1:4, ymin = mean - sd, ymax = mean + sd),fill='grey')
I have also tried to use group=size to match the facet command but nothing happens.
Either I am using the wrong geom or I am missing how to structure the data. I tried to use this http://mjskay.github.io/tidybayes/reference/geom_lineribbon.html but it doesn't exist in ggplot2
Objects like geom_ribbon expect a series of x and y values, so that points can be connected via lines. The main problem here is that your x-axis has only 1 value ('final'), so there's nothing to connect. You can get around the problem with geom_rect, which only needs values for the upper-right and lower-left corners. We simply use -Inf and Inf for the xmin and xmax values, so that the rectangle spans the full width of each facet:
p <- ggplot(data=toplot,aes(x=period,y=mean,fill=density)) +
geom_bar(stat='identity',position = 'dodge') +
facet_grid(~size) +
geom_rect(data = contr, aes(ymin = mean - sd, ymax = mean + sd), xmin = -Inf, xmax = Inf, alpha = 0.25, fill = 'black') +
geom_hline(data = contr, aes(yintercept = mean,linetype = "control"),size=1.2) +
scale_linetype_manual(name = "",values=2)
The geom_rect() approach is nice. You could do something similar with geom_crossbar():
p <- ggplot(data=toplot,aes(x=period,y=mean,fill=density)) +
geom_bar(stat='identity',position = 'dodge') +
facet_grid(~size) +
geom_crossbar(data = contr,
aes(ymin = (mean - 2*sd),
ymax=(mean + 2*sd), linetype = "control"),
size=.2, alpha=.5, width=1, fill='darkgrey') +
scale_linetype_manual(name = "",values=2)
p + theme_minimal()
Something like this. Modify the size=7 value to change the thickness of the line; and alpha=0.2 to edit transparency.
p <- ggplot(data=toplot,aes(x=period,y=mean,fill=density)) +
geom_bar(stat='identity',position = 'dodge') +
facet_grid(~size) +
geom_hline(data = contr, aes(yintercept = mean),size=7,alpha=0.2) +
geom_hline(data = contr, aes(yintercept = mean,linetype = "control"),size=1.2) +
scale_linetype_manual(name = "",values=2)
