How to manually add a tick mark in ggplot2? [duplicate] - r

I am quite new to the ggplot2 world in R, so I am trying to get familiar with the technicalities of plotting with ggplot2. In particular, I have a problem, which can be replicated by the following MWE:
ggplot(data.frame(x = c(1:10), y = c(1:10)), aes(x = c(1:10), y = c(1:10))) +
geom_line() +
geom_hline(aes(yintercept = 6), lty = 2)
This generates a simple graph of a diagonal line with a horizontal dashed line that cuts the y-axis at 6.
Is there a way to add a tick mark at 6 on the y-axis? In other words, suppose I showed this plot to someone without the code. I want that person to see that the horizontal dashed line cuts the y-axis at 6, which is not obvious since 6 is not currently labelled.
Any intuitive suggestions will be greatly appreciated :)

You can use scale_y_continuous and give it all the points where you'd want a tick in the breaks argument.
library(ggplot2)
ggplot(data.frame(x = c(1:10), y = c(1:10)), aes(x = x, y = y)) +
geom_line() +
geom_hline(aes(yintercept = 6), lty = 2) +
scale_y_continuous(breaks = c(2, 4, 6, 8, 10))
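If you would rather not hard-code the whole vector, the breaks argument also accepts a function of the axis limits. The following is a minimal sketch, not part of the original answer: with_extra_breaks is a hypothetical helper name, and it assumes that scales::extended_breaks() matches the default break algorithm of a continuous scale.

```r
library(ggplot2)

df <- data.frame(x = 1:10, y = 1:10)

# Hypothetical helper: returns a breaks function that keeps the
# automatically computed breaks and adds the values in `extra`
with_extra_breaks <- function(extra) {
  function(limits) {
    sort(unique(c(scales::extended_breaks()(limits), extra)))
  }
}

p <- ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  geom_hline(yintercept = 6, linetype = 2) +
  scale_y_continuous(breaks = with_extra_breaks(6))
p
```

Because the breaks are recomputed from the limits, the extra tick at 6 should survive a change in the data range.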
In this specific example there is a lot of "empty space", so you might consider adding the number within the plot using geom_text or geom_label, if that is what people should take away from your plot:
library(ggplot2)
ggplot(data.frame(x = c(1:10), y = c(1:10)), aes(x = x, y = y)) +
geom_line() +
geom_hline(aes(yintercept = 6), lty = 2) +
#scale_y_continuous(breaks = c(2, 4, 6, 8, 10)) +
geom_label(aes(x=2, y=6, label = "line at y = 6.0"))
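One detail worth knowing: when constants are placed inside aes(), geom_label draws one label per row of the data, so ten labels end up stacked on top of each other here. A possible variation (a sketch, not the original answer's code) is annotate(), which draws the label once:

```r
library(ggplot2)

df <- data.frame(x = 1:10, y = 1:10)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  geom_hline(yintercept = 6, linetype = 2) +
  # annotate() takes fixed positions and creates a single label
  annotate("label", x = 2, y = 6, label = "line at y = 6.0")
p
```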

Your data frame
df <- data.frame(x = c(1:10), y = c(1:10))
#Plotting
p <- ggplot(df, aes(x, y)) +
geom_point()
p
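Custom functions (borrowed from elsewhere), one for adding an x break and one for adding a y break.
Each one draws a line at the given value and adds that value to the existing axis breaks: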
add_x_break <- function(plot, xval) {
p2 <- ggplot_build(plot)
breaks <- p2$layout$panel_params[[1]]$x$breaks
breaks <- breaks[!is.na(breaks)]
plot +
geom_vline(xintercept = xval) +
scale_x_continuous(breaks = sort(c(xval, breaks)))
}
add_y_break <- function(plot, yval) {
p2 <- ggplot_build(plot)
breaks <- p2$layout$panel_params[[1]]$y$breaks
breaks <- breaks[!is.na(breaks)]
plot +
geom_hline(yintercept = yval) +
scale_y_continuous(breaks = sort(c(yval, breaks)))
}
Define your break; in your case, a y break:
p <- add_y_break(p, 6)
p
Hope this helps
A bit of improvement with the ticks
p <- ggplot(df, aes(x, y)) +
geom_point()+
scale_x_continuous(breaks = round(seq(min(df$x), max(df$x), by = 0.5),1)) +
scale_y_continuous(breaks = round(seq(min(df$y), max(df$y), by = 0.5),1))
p
p <- add_y_break(p, 6)
p

Related

Parse superscript in discrete axis values on geom_bar

I'm trying to add a superscript to some x-axis values in order to connect to a footnote that'll be at the bottom of the page. The easy workaround would just be an asterisk instead of ^a but that won't work for my purposes.
I did a lot of searching and while there are plenty of posts about superscripts in axis labels, I couldn't find any about superscripts in axis values. Most of them appeared to center around adding a gg + labs(x = expression("blah^a")).
I did find this post about parsing superscripts inside a geom_text() but it appears the same doesn't work for a geom_bar().
Here's some test data:
library(ggplot2)
dat <- data.frame(x = c("alpha", "bravo^a"),
y = c(10, 8))
ggplot(data = dat) +
geom_bar(aes(x = x,
y = y),
stat = "identity")
You just need to parse the text inside scale_x_discrete
Edit: add geom_text example
library(ggplot2)
dat <- data.frame(x = c("alpha", "bravo^a"),
y = c(10, 8))
### need to convert x to factor if R >= 4.0
dat$x <- factor(dat$x)
ggplot(data = dat) +
geom_bar(aes(x = x,
y = y),
stat = "identity") +
scale_x_discrete(labels = parse(text = levels(dat$x))) +
geom_text(aes(x = x, y = y,
label = x),
parse = TRUE,
nudge_y = 1,
size = 5) +
theme_minimal(base_size = 14)
Created on 2018-08-27 by the reprex package (v0.2.0.9000).
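As a variation on the same idea, labels can also be given as a function of the axis breaks, which avoids keeping factor levels and labels in sync by hand. This is a minimal sketch: parse_labels is a hypothetical helper name, and geom_col() is used as shorthand for geom_bar(stat = "identity"). Recent versions of the scales package are believed to ship a similar ready-made helper, scales::label_parse(), but treat that as an assumption to check against your installed version.

```r
library(ggplot2)

dat <- data.frame(x = c("alpha", "bravo^a"),
                  y = c(10, 8))

# Hypothetical helper: turn each axis break into a plotmath expression
parse_labels <- function(breaks) parse(text = as.character(breaks))

p <- ggplot(dat, aes(x = x, y = y)) +
  geom_col() +
  scale_x_discrete(labels = parse_labels)
p
```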

Directlabels package-- labels do not fit in plot area

I want to explore the directlabels package with ggplot. I am trying to plot labels at the endpoint of a simple line chart; however, the labels are clipped by the plot panel. (I intend to plot about 10 financial time series in one plot and I thought directlabels would be the best solution.)
I would imagine there may be another solution using annotate or some other geoms. But I would like to solve the problem using directlabels. Please see code and image below. Thanks.
library(ggplot2)
library(directlabels)
library(tidyr)
#generate data frame with random data, for illustration and plot:
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
# Reshape to long format: one row per month and asset
tidy_data <- gather(data, key = "asset", value = "value", -month)
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw()
On data visualization principles, I would like to avoid extending the x-axis to make the labels fit--this would mean having data space with no data. Rather, I would like the labels to extend toward the white space beyond the chart box/panel (if that makes sense).
In my opinion, direct labels is the way to go. Indeed, I would position labels at the beginning and at the end of the lines, creating space for the labels using expand(). Also note that with the labels, there is no need for the legend.
This is similar to answers here and here.
library(ggplot2)
library(directlabels)
library(grid)
library(tidyr)
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
# Reshape to long format: one row per month and asset
tidy_data <- gather(data, key = "asset", value = "value", -month)
ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0.15, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x - .3), "first.bumpup")) +
theme_bw()
If you prefer to push the labels into the plot margin, direct labels will do that. But because the labels are positioned outside the plot panel, clipping needs to be turned off.
p1 <- ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
theme_bw() +
theme(plot.margin = unit(c(1,4,1,1), "lines"))
# Code to turn off clipping
gt1 <- ggplotGrob(p1)
gt1$layout$clip[gt1$layout$name == "panel"] <- "off"
grid.draw(gt1)
This effect can also be achieved using geom_text (and probably also annotate), that is, without the need for direct labels.
p2 = ggplot(tidy_data, aes(x = month, y = value, group = asset, colour = asset)) +
geom_line() +
geom_text(data = subset(tidy_data, month == 100),
aes(label = asset, colour = asset, x = Inf, y = value), hjust = -.2) +
scale_x_continuous(expand = c(0, 0)) +
scale_colour_discrete(guide = 'none') +
theme_bw() +
theme(plot.margin = unit(c(1,3,1,1), "lines"))
# Code to turn off clipping
gt2 <- ggplotGrob(p2)
gt2$layout$clip[gt2$layout$name == "panel"] <- "off"
grid.draw(gt2)
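In ggplot2 3.0.0 and later, clipping can also be switched off inside the plot specification with coord_cartesian(clip = "off"), so the grob does not need to be edited. The sketch below is an illustration under assumptions, not the original answer's code: the long-format data frame is built directly (named long_data here) so that the example runs without tidyr or directlabels.

```r
library(ggplot2)

set.seed(1)
n <- 100
# Illustrative data in long format: one row per month and asset
long_data <- data.frame(
  month = rep(seq_len(n), times = 2),
  asset = rep(c("stocks", "bonds"), each = n),
  value = c(cumsum(rnorm(n, mean = 6, sd = 15)),
            cumsum(rnorm(n, mean = 2, sd = 4)))
)

p3 <- ggplot(long_data, aes(x = month, y = value, colour = asset, group = asset)) +
  geom_line() +
  # Label only the last observation of each series
  geom_text(data = subset(long_data, month == max(month)),
            aes(label = asset), hjust = -0.1) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_colour_discrete(guide = "none") +
  # Let the labels spill into the right-hand plot margin
  coord_cartesian(clip = "off") +
  theme_bw() +
  theme(plot.margin = grid::unit(c(1, 3, 1, 1), "lines"))
p3
```

The widened right margin is still needed. Turning clipping off only permits drawing outside the panel and does not create the space for it.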
Since you didn't provide a reproducible example, it's hard to say what the best solution is. However, I would suggest manually adjusting the x-scale: use a "buffer" to increase the plot area.
# Extend the upper x limit by a buffer so the labels fit inside the panel
buffer <- 15  # in x-axis units; an arbitrary value, adjust as needed
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw() +
xlim(min(tidy_data$month), max(tidy_data$month) + buffer)
Using scale_x_discrete() or scale_x_continuous() would likely also work well here if you want to use the direct labels package. Alternatively, annotate or a simple geom_text would also work well.

ggplot2: Stat_function misbehaviour with log scales

I am trying to plot a point histogram (a histogram that shows the values with a point instead of bars) that is log-scaled. The result should look like this:
MWE:
Let's simulate some data:
set.seed(123)
d <- data.frame(x = rnorm(1000))
To get the point histogram I need to calculate the histogram data (hdata) first
hdata <- hist(d$x, plot = FALSE)
tmp <- data.frame(mids = hdata$mids,
density = hdata$density,
counts = hdata$counts)
which we can plot like this
p <- ggplot(tmp, aes(x = mids, y = density)) + geom_point() +
stat_function(fun = dnorm, col = "red")
p
to get this graph:
In theory we should be able to apply the log scales (and set the y-limits to be above 0) and we should have a similar picture to the target graph.
However, if I apply it I get the following graph:
p + scale_y_log10(limits = c(0.001, 10))
The stat_function clearly shows non-scaled values instead of producing a figure closer to the solid line in the first picture.
Any ideas?
Bonus
Are there any ways to graph the histogram with dots without using the hist(..., plot = FALSE) function?
EDIT Workaround
One possible solution is to calculate the dnorm-data outside of ggplot and then insert it as a line. For example
tmp2 <- data.frame(mids = seq(from = min(tmp$mids), to = max(tmp$mids),
by = (max(tmp$mids) - min(tmp$mids))/10000))
tmp2$dnorm <- dnorm(tmp2$mids)
# Plot it
ggplot() +
geom_point(data = tmp, aes(x = mids, y = density)) +
geom_line(data = tmp2, aes(x = mids, y = dnorm), col = "red") +
scale_y_log10()
This returns a graph like the following. This is basically the graph, but it doesn't resolve the stat_function issue.
library(ggplot2)
set.seed(123)
d <- data.frame(x = rnorm(1000))
ggplot(d, aes(x)) +
stat_bin(geom = "point",
aes(y = after_stat(density)),  # after_stat() replaces the older ..density.. notation (ggplot2 >= 3.3.0)
#same breaks as function hist's default:
breaks = pretty(range(d$x), n = nclass.Sturges(d$x), min.n = 1),
position = "identity") +
stat_function(fun = dnorm, col = "red") +
scale_y_log10(limits = c(0.001, 10))
Another possible solution that I found while revisiting this issue is to apply the log10 to the stat_function-call.
library(ggplot2)
set.seed(123)
d <- data.frame(x = rnorm(1000))
hdata <- hist(d$x, plot = FALSE)
tmp <- data.frame(mids = hdata$mids,
density = hdata$density,
counts = hdata$counts)
ggplot(tmp, aes(x = mids, y = density)) + geom_point() +
stat_function(fun = function(x) log10(dnorm(x)), col = "red") +
scale_y_log10()
Created on 2018-07-25 by the reprex package (v0.2.0).

Plot with multiple breaks of different sizes

I would like to create a plot with multiple breaks of different sized intervals on the y axis. The closest post I could find is "Show customised X-axis ticks in ggplot2", but it doesn't fully solve my problem.
# dummy data
require(ggplot2)
require(reshape2)
a<-rnorm(mean=15,sd=1.5, n=100)
b<-rnorm(mean=1500,sd=150, n=100)
df<-data.frame(a=a,b=b)
df$x <- factor(seq(100), ordered = T)
df.m <- melt(df)
ggplot(data = df.m, aes(x = x, y=value, colour=variable, group=variable)) +
geom_line() + scale_y_continuous(breaks = c(seq(from = 0, to = 20, by = 1),
seq(from = 1100, to = max(df.m$value), by = 100))) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
The problem is how to get the first set of breaks to be proportional to the second (thus visible).
Any pointer would be very much appreciated, thanks!
You can try something like this:
# Rearrange the factors in the data.frame
df.m$variable <- factor(df.m$variable, levels = c("b", "a"))
ggplot(data = df.m, aes(x = x, y=value, colour=variable, group=variable)) +
geom_line() + facet_grid(variable~., scales = "free")
Hope this helps
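If both series have to share a single panel, a different technique from the faceting above is to rescale one series and describe it with a secondary axis. The following is only a sketch under assumptions: the factor of 100 between the two series is picked by eye for this dummy data, and the object names are illustrative.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = 1:100,
                 a = rnorm(100, mean = 15, sd = 1.5),
                 b = rnorm(100, mean = 1500, sd = 150))

p_sec <- ggplot(df, aes(x = x)) +
  geom_line(aes(y = a, colour = "a")) +
  # Draw b on the scale of a by dividing by the assumed factor of 100
  geom_line(aes(y = b / 100, colour = "b")) +
  scale_y_continuous(
    name = "a",
    breaks = seq(0, 20, by = 1),
    # The secondary axis undoes the rescaling so its ticks read in units of b
    sec.axis = sec_axis(~ . * 100, name = "b",
                        breaks = seq(1100, 1900, by = 100))
  ) +
  labs(colour = "variable")
p_sec
```

With this approach the two sets of breaks are proportional by construction, at the cost of the reader having to match each line to its axis.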

Adding multiple text annotations to a faceted ggplot geom_histogram

I have the following data.frame:
hist.df <- data.frame(y = c(rnorm(30,1,1), rnorm(15), rnorm(30,0,1)),
gt = c(rep("ht", 30), rep("hm", 15), rep("hm", 30)),
group = c(rep("sc", 30), rep("am", 15), rep("sc",30)))
from which I produce the following faceted histogram ggplot:
main.plot <- ggplot(data = hist.df, aes(x = y)) +
geom_histogram(alpha=0.5, position="identity", binwidth = 2.5,
aes(fill = factor(gt))) +
facet_wrap(~group) +
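# Named values keep each colour tied to its gt level; the numeric limits
# were dropped because a discrete fill scale cannot take continuous limits
scale_fill_manual(values = c(ht = "darkgreen", hm = "darkmagenta"),
breaks = c("ht", "hm"),
name = "gt")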
In addition, I have this data.frame:
text.df = data.frame(ci.lo = c(0.001,0.005,-10.1),
ci.hi = c(1.85,2.25,9.1),
group = c("am","sc","sc"),
factor = c("nu","nu","alpha"))
Which defines the text annotations I want to add to the faceted histograms, so that the final figure will be:
So text.df$ci.lo and text.df$ci.hi are confidence intervals on the corresponding text.df$factor and they correspond to the faceted histograms through text.df$group
Note that not every histogram has all text.df$factor's.
Ideally, the ylim's of the faceted histograms will leave enough space for the text to be added above the histograms so that they appear only on the background.
Any idea how to achieve this?
Wrapping my comment into an answer:
text.df$ci <- paste0(text.df$factor, ' = [', text.df$ci.lo, ', ', text.df$ci.hi, ']')
new_labels <- aggregate(text.df$ci, by = list(text.df$group),
FUN = function(x) paste(x, collapse = '\n'))$x
hist.df$group <- factor(hist.df$group)
hist.df$group <- factor(hist.df$group,
labels = paste0(levels(hist.df$group), '\n', new_labels))
main.plot <- ggplot(data = hist.df, aes(x = y)) +
geom_histogram(alpha=0.5, position="identity", binwidth = 2.5,
aes(fill = factor(gt))) +
facet_wrap(~group) +
# Named values keep each colour and legend entry tied to its gt level
scale_fill_manual(values = c(ht = "darkgreen", hm = "darkmagenta"),
breaks = c("ht", "hm"),
name = "gt")
main.plot + theme(strip.text = element_text(size=20))
If you wish to stick to the original idea, this question has an answer that will help.
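A minimal sketch of that original idea, with the text drawn inside each panel, could look like the following. It is an illustration under assumptions rather than the linked answer's code: label.df is a hypothetical name, and the 35% headroom above the bars is an arbitrary choice. The key point is that the label data frame carries the same group column as the faceting variable, so each label lands in its own facet.

```r
library(ggplot2)

set.seed(42)
hist.df <- data.frame(y = c(rnorm(30, 1, 1), rnorm(15), rnorm(30, 0, 1)),
                      gt = c(rep("ht", 30), rep("hm", 15), rep("hm", 30)),
                      group = c(rep("sc", 30), rep("am", 15), rep("sc", 30)))
text.df <- data.frame(ci.lo = c(0.001, 0.005, -10.1),
                      ci.hi = c(1.85, 2.25, 9.1),
                      group = c("am", "sc", "sc"),
                      factor = c("nu", "nu", "alpha"))

text.df$ci <- paste0(text.df$factor, " = [", text.df$ci.lo, ", ", text.df$ci.hi, "]")
# One row per facet: collapse the intervals that share a group
label.df <- aggregate(ci ~ group, data = text.df, FUN = paste, collapse = "\n")

p_text <- ggplot(hist.df, aes(x = y)) +
  geom_histogram(aes(fill = gt), alpha = 0.5, position = "identity",
                 binwidth = 2.5) +
  # -Inf / Inf pin the text to the top-left corner of each panel
  geom_text(data = label.df, aes(x = -Inf, y = Inf, label = ci),
            hjust = -0.05, vjust = 1.2, inherit.aes = FALSE) +
  facet_wrap(~group) +
  # Extra headroom so the text sits above the bars
  scale_y_continuous(expand = expansion(mult = c(0, 0.35)))
p_text
```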
