Equal spacing with multiple atop - r

I'm trying to create a legend in a ggplot2 graph with multiple lines and a parameter and value on each line. Since I have symbols as variables, this needs to be done with expression. To create new lines, I have used multiple atop commands, but this leads to uneven spacing in the final line. Please see my following example:
library(ggplot2)
N = 25
a = -5
b = 2
sigma = 1
x = runif(N, 0, 10)
y = a + x * b + rnorm(N, sd = sigma)
df = data.frame(x, y)
ggplot(df, aes(x, y)) +
geom_point() +
geom_label(aes(x = 1, y = max(y) - 2),
label = paste0("atop(atop(",
"textstyle(a == ", a, "),",
"textstyle(b == ", b, ")),",
"textstyle(sigma == ", sigma, "))"
), parse = TRUE
)
ggsave("plotmath_atop.png", width = 6, height = 4, scale = 1)
This gives the following plot:
As you can see, the spacing between the lines b=2 and \sigma=1 is noticeably larger than the spacing between the lines a=-5 and b=2.
Is there a way of using expression with multiple line breaks while still having even spacing between each line?

You could use gridExtra::tableGrob:
library(gridExtra)
library(grid)
table_label <- function(label, params=list()) {
params <- modifyList(list(hjust=0, x=0), params)
mytheme <- ttheme_minimal(padding=unit(c(1, 1), "mm"),
core = list(fg_params = params), parse=TRUE)
disect <- strsplit(label, "\\n")[[1]]
m <- as.matrix(disect)
tg <- tableGrob(m, theme=mytheme)
bg <- roundrectGrob(width = sum(tg$widths) + unit(3, "mm"), height = sum(tg$heights) + unit(3, "mm"))
grobTree(bg, tg)
}
txt <- 'a == -5\n
b == 2\n
sigma == 1'
library(ggplot2)
qplot(1:10,1:10) +
annotation_custom(table_label(txt), xmin=0, xmax=5, ymin=7.5)
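To see why splitting the label on newlines fits with parse=TRUE, here is a minimal base-R sketch. The txt string below is a single-line variant written for illustration, not the exact input used above. Each piece is valid plotmath on its own, so each can be parsed separately and placed in its own table row.

```r
# Illustrative label: three plotmath expressions separated by "\n"
txt <- "a == -5\nb == 2\nsigma == 1"

# Split into one string per line, in the same spirit as table_label()
pieces <- strsplit(txt, "\n", fixed = TRUE)[[1]]

# Each piece parses to a single R call that plotmath can draw
exprs <- lapply(pieces, function(s) parse(text = s)[[1]])

print(pieces)
print(exprs)
```

Because every row holds exactly one expression, the row heights are set by the table layout rather than by nested atop() calls.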

A simple solution is to avoid expressions altogether: print the sigma letter with the Unicode escape \u03c3 and use \n for line breaks.
library(ggplot2)
N = 25
a = -5
b = 2
sigma = 1
x = runif(N, 0, 10)
y = a + x * b + rnorm(N, sd = sigma)
df = data.frame(x, y)
lab <- paste0("a = ", a, "\n",
"b = ", b, "\n",
"\u03c3 = ", sigma)
ggplot(df, aes(x, y)) +
geom_point() +
geom_label(aes(x = 1, y = max(y) - 2), label = lab, parse = FALSE)
ggsave("plot_multiline_label.png", width = 6, height = 4, scale = 1)

Related

Axis with label = comma without showing decimals for large numbers

I have an x-axis on a log scale, and I would like to display the labels without scientific notation (i.e. 1,000 rather than 1e3). I have always done this with label = scales::comma, but my dataset now also has very small values (0.001, for instance). When I add + scale_x_log10(label = comma), 1e-3 is shown as 0.001 (as it should be), but 1e3 is shown as 1,000.000. I would like to drop the three decimal places so that I just get 1,000. Using label = comma_format(accuracy = 1), as suggested elsewhere, makes values like 0.001 display as 0, so it's not a valid option.
Does anyone have any idea?
Here is a reproducible example of the problem:
library(ggplot2)
X <- 10^seq(-3, 3, length.out = 50)
Y <- 100 * X/(X + 1)
Demo_data <- data.frame(X, Y)
ggplot(Demo_data, aes(x = X, y = Y)) + geom_line(size = 1.5) +
scale_x_log10(breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = scales::comma)
This solution does not work:
ggplot(Demo_data, aes(x = X, y = Y)) + geom_line(size = 1.5) +
scale_x_log10(breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = scales::comma_format(accuracy = 1))
One option would be to use ifelse to set the accuracy conditionally: the default for values below 1, and accuracy = 1 for values of 1 and above:
X <- 10^seq(-3, 3, length.out = 50)
Y <- 100 * X / (X + 1)
Demo_data <- data.frame(X, Y)
library(ggplot2)
library(scales)
ggplot(Demo_data, aes(x = X, y = Y)) +
geom_line(size = 1.5) +
scale_x_log10(
breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = ~ ifelse(.x < 1, scales::comma(.x), scales::comma(.x, accuracy = 1))
)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
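Another way to think about the problem is to format each break on its own, so that no single accuracy has to suit both 0.001 and 1,000. The sketch below uses only base R's format(); label_plain is a hypothetical helper name introduced here, not part of scales or ggplot2.

```r
# Hypothetical helper: format every value individually, without scientific
# notation, with a thousands separator and without trailing zeros
label_plain <- function(x) {
  vapply(
    x,
    function(v) format(v, big.mark = ",", scientific = FALSE,
                       drop0trailing = TRUE, trim = TRUE),
    character(1)
  )
}

breaks <- c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3)
labels <- label_plain(breaks)
print(labels)
```

Under these assumptions it could be passed to the scale as scale_x_log10(breaks = breaks, labels = label_plain).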

Setting color levels in contourplots in ggplot R

I am plotting contour plots using ggplot in a loop. I have a few concerns:
The color levels differ between iterations. How do I keep them consistent across iterations?
The number and range of levels also change between iterations. How do I keep them constant?
The color scale takes up much more vertical space than the actual figure. How do I adjust that?
How do I manually set the color levels of the contours?
I have attached a sample below. Could someone please edit the same code and add comments?
library(tidyverse)
library(gridExtra)
library(grid)
# data generation
x <- seq(-10, 10, 0.2)
y <- seq(-10, 10, 0.2)
tbl <- crossing(x, y)
for (i in seq(1, 2)) # to create two sample plots
{
# initialize list to store subplots
p <- list()
for (j in seq(1, 3)) # to create 3 subplots
{
# for randomness
a <- runif(1)
b <- runif(1)
# add z
tbl <- tbl %>%
mutate(z = a*(x - a)^2 + b*(y - b)^2)
# plot contours
p[[j]] <- ggplot(data = tbl,
aes(x = x,
y = y,
z = z)) +
geom_contour_filled(alpha = 0.8) +
theme_bw() +
theme(legend.position = "right") +
theme(aspect.ratio = 1) +
ggtitle("Sample")
}
p <- grid.arrange(p[[1]], p[[2]], p[[3]],
ncol = 3)
ggsave(paste0("iteration - ", i, ".png"),
p,
width = 8,
height = 3)
}
The actual plots are subplots of another plot, so I cannot increase their size. Therefore, width and height cannot be increased in ggsave.
Thanks
You can set breaks in geom_contour_filled. To give the legend more room, you can double the size of your PNGs while halving their resolution; their pixel dimensions stay the same.
for (i in seq(1, 2)) # to create two sample plots
{
p <- list()
for (j in seq(1, 3)) # to create 3 subplots
{
# for randomness
a <- runif(1)
b <- runif(1)
tbl <- tbl %>%
mutate(z = a*(x - a)^2 + b*(y - b)^2)
p[[j]] <- ggplot(data = tbl,
aes(x = x,
y = y,
z = z)) +
geom_contour_filled(alpha = 0.8, breaks = 0:9 * 20) +
scale_fill_viridis_d(drop = FALSE) +
theme_bw() +
theme(legend.position = "right") +
theme(aspect.ratio = 1) +
ggtitle("Sample")
}
p <- grid.arrange(p[[1]], p[[2]], p[[3]],
ncol = 3)
ggsave(paste0("iteration - ", i, ".png"),
p,
width = 16,
height = 6,
dpi = 150)
}
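The reason fixed breaks keep the legend stable can be illustrated with base R's cut(). This is only an analogy under my own assumptions (the two random vectors stand in for the z values of two iterations), and it mirrors rather than reproduces the banding that geom_contour_filled performs internally.

```r
# Fixed breaks shared by every iteration: 0, 20, ..., 180
breaks <- 0:9 * 20

set.seed(42)
z1 <- runif(50, 0, 60)    # stand-in for an iteration with a narrow z range
z2 <- runif(50, 0, 170)   # stand-in for an iteration with a wide z range

# Binning against the same breaks yields the same set of levels,
# even when some bands contain no data
bands1 <- cut(z1, breaks = breaks, include.lowest = TRUE)
bands2 <- cut(z2, breaks = breaks, include.lowest = TRUE)

print(identical(levels(bands1), levels(bands2)))
print(table(bands1))   # empty bands are listed with a count of zero
```

Keeping the empty bands is what drop = FALSE asks of the fill scale, so each plot shows the same legend entries in the same colors.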

saving ggplot in a list gives me the same graph

I am trying to plot 12 different plots on a 3 by 4 grid, but it only plots the last one 12 times. Can anyone help me? I am so fed up with it. Thanks
library(ggplot2)
library(gridExtra)
pmax=0.85
K_min = 0.0017
T = seq(100,1200,by=100) ## ISIs
lambda =1/T
p=list()
for(i in (1:length(lambda))){
p[[i]]<-ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
stat_function(fun = function (x) (lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1),colour = "dodgerblue3")+
scale_x_continuous(name = "Probability") +
scale_y_continuous(name = "Frequency") + theme_bw()
main <- grid.arrange(grobs=p,ncol=4)
}
This code produces the correct picture but I need to use ggplot since my other figures are in ggplot.
par( mfrow = c( 3, 4 ) )
for (i in (1:length(lambda))){
f <- function (x) ((lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1) )
curve(f,from=0, to=1, col = "violet",lwd=2,sub = paste0("ISI = ",round(1/lambda[i],3), ""),ylab="PDF",xlab="R")
}
Correct plot using curve:
ggplot objects created in a loop are evaluated lazily: the function passed to stat_function only looks up lambda[i] when the plot is actually drawn. Since every plot in the list refers to the same i, they are all drawn with whatever value i holds at that point, which is the last one (12). Here are two possible workarounds:
Workaround 1. Convert each ggplot object into a grob within the loop, & save that to the list:
for(i in (1:length(lambda))){
# code for generating each plot is unchanged
g <- ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
stat_function(fun = function (x) (lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1),colour = "dodgerblue3")+
scale_x_continuous(name = "Probability") +
scale_y_continuous(name = "Frequency") + theme_bw()
p[[i]] <- ggplotGrob(g)
}
main <- grid.arrange(grobs=p, ncol=4)
Workaround 2. Put all the data in a data frame, & create a single ggplot with a facet for each ISI:
library(dplyr)
pmax = 0.85
K_min = 0.0017
ISI = seq(100, 1200, by = 100) # I changed this; using `T` as a name clashes with T from TRUE/FALSE
lambda = 1/ISI
df <- data.frame(
x = rep(seq(0, 1, length.out = 101), length(ISI)),
ISI = rep(ISI, each = 101),
l = rep(lambda, each = 101)
) %>%
mutate(y = (l * pmax / K_min) * (1-x) ^ ((l / K_min) - 1) *
(1 - (1 - pmax) * x)^-((l / K_min) + 1))
ggplot(df,
aes(x = x, y = y, group = 1)) +
geom_line(colour = "dodgerblue3") +
facet_wrap(~ISI, nrow = 3, scales = "free_y") +
labs(x = "Probability", y = "Frequency") +
theme_bw()
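The explanation above, that plots built in a loop all end up using the last value of i, comes down to how R functions look up variables. The base-R sketch below shows the same effect without ggplot2; make_fun is a hypothetical helper introduced for illustration.

```r
# Functions created in a for loop all refer to the same variable `i`
late <- list()
for (i in 1:3) {
  late[[i]] <- function() i   # `i` is looked up when the function is called
}
late_values <- vapply(late, function(f) f(), integer(1))

# Giving each function its own copy of the index avoids the problem
make_fun <- function(k) {
  force(k)        # evaluate the argument now rather than later
  function() k
}
fixed <- lapply(1:3, make_fun)
fixed_values <- vapply(fixed, function(f) f(), integer(1))

print(late_values)
print(fixed_values)
```

Both workarounds follow the same idea: either the result is computed while i still has the right value (ggplotGrob inside the loop), or the dependence on i is removed altogether (a single data frame with facets).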

Manually assigning colors with scale_fill_manual only works for certain hexagon sizes

I am trying to create a scatterplot that is summarized by hexagon bins of counts. I would like the user to be able to define the count breaks for the color scale. I have this working, using scale_fill_manual(). Oddly, however, it only works sometimes. In the MWE below, using the given seed value, if xbins=10, there are issues resulting in a plot as follows:
However, if xbins=20 or 40, for example, the plot doesn't seem to have problems:
My MWE is as follows:
library(ggplot2)
library(hexbin)
library(RColorBrewer)
set.seed(1)
xbins <- 20
x <- abs(rnorm(10000))
y <- abs(rnorm(10000))
minVal <- min(x, y)
maxVal <- max(x, y)
maxRange <- c(minVal, maxVal)
buffer <- (maxRange[2] - maxRange[1]) / (xbins / 2)
h <- hexbin(x = x, y = y, xbins = xbins, shape = 1, IDs = TRUE,
xbnds = maxRange, ybnds = maxRange)
hexdf <- data.frame (hcell2xy(h), hexID = h@cell, counts = h@count)
my_breaks <- c(2, 4, 6, 8, 20, 1000)
clrs <- brewer.pal(length(my_breaks) + 3, "Blues")
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts, breaks = c(0, my_breaks, Inf),
labels = rev(clrs))
ggplot(hexdf, aes(x = x, y = y, hexID = hexID, fill = countColor)) +
scale_fill_manual(values = levels(hexdf$countColor)) +
geom_hex(stat = "identity") +
geom_abline(intercept = 0, color = "red", size = 0.25) +
coord_fixed(xlim = c(-0.5, (maxRange[2] + buffer)),
ylim = c(-0.5, (maxRange[2] + buffer))) +
theme(aspect.ratio=1)
My goal is to tweak this code so that the plot does not have problems (where suddenly certain hexagons are different sizes and shapes than the rest) regardless of the value assigned to xbins. However, I am puzzled what may be causing this problem for certain xbins values. Any advice would be greatly appreciated.
EDIT:
I am updating the example code after taking into account comments by @bdemarest and @Axeman. I followed the most popular answer in the link @Axeman recommends, and believe it is more useful when you are working with scale_fill_continuous() on an integer vector. Here, I am working with scale_fill_manual() on a factor vector. As a result, I am still unable to get this to work. Thank you.
library(ggplot2)
library(hexbin)
library(RColorBrewer)
library(reshape2) # provides melt(), used below
set.seed(1)
xbins <- 10
x <- abs(rnorm(10000))
y <- abs(rnorm(10000))
minVal <- min(x, y)
maxVal <- max(x, y)
maxRange <- c(minVal, maxVal)
buffer <- (maxRange[2] - maxRange[1]) / (xbins / 2)
bindata = data.frame(x=x,y=y,factor=as.factor(1))
h <- hexbin(bindata, xbins = xbins, IDs = TRUE, xbnds = maxRange, ybnds = maxRange)
counts <- hexTapply (h, bindata$factor, table)
counts <- t (simplify2array (counts))
counts <- melt (counts)
colnames (counts) <- c ("factor", "ID", "counts")
counts$factor =as.factor(counts$factor)
hexdf <- data.frame (hcell2xy (h), ID = h@cell)
hexdf <- merge (counts, hexdf)
my_breaks <- c(2, 4, 6, 8, 20, 1000)
clrs <- brewer.pal(length(my_breaks) + 3, "Blues")
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts, breaks = c(0, my_breaks, Inf), labels = rev(clrs))
ggplot(hexdf, aes(x = x, y = y, fill = countColor)) +
scale_fill_manual(values = levels(hexdf$countColor)) +
geom_hex(stat = "identity") +
geom_abline(intercept = 0, color = "red", size = 0.25) +
coord_cartesian(xlim = c(-0.5, maxRange[2]+buffer), ylim = c(-0.5, maxRange[2]+ buffer)) + theme(aspect.ratio=1)
You can set the colors directly in the geom instead of going through a fill scale:
ggplot(hexdf, aes(x = x, y = y)) +
geom_hex(stat = "identity",fill =hexdf$countColor)

Extendable piecewise function for any number of knots/breaks

I have the following collection of slopes, breaks, and intercepts:
slopes <- c(4, 2, 8, 4)
breaks <- c(0.0150, 0.030, 0.035)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
They define the following lines:
# y = slopes[1] * x + intercepts[1]
# y = slopes[2] * x + intercepts[2]
# y = slopes[3] * x + intercepts[3]
# y = slopes[4] * x + intercepts[4]
Graphing the lines yields:
tibble(x = seq(0.0025, 0.06, 0.0025), y = x) %>%
ggplot(aes(x, y)) +
geom_point(alpha = 0) +
geom_abline(intercept = intercepts[1], slope = slopes[1], color = "red") +
geom_abline(intercept = intercepts[2], slope = slopes[2], color = "orange") +
geom_abline(intercept = intercepts[3], slope = slopes[3], color = "yellow") +
geom_abline(intercept = intercepts[4], slope = slopes[4], color = "green2") +
scale_y_continuous(limits = c(0, 1))
I'd like to create a piecewise function based on the lines and breaks/knots, like so (follow: red -> orange -> yellow -> green):
I could wrap a function over a couple of if/else statements to get what I want. But I'd like for the solution to be extendable for any number of breaks/knots (instead of just 3, in this example).
How might I accomplish this?
This should be fairly extensible:
piecewise <- function(x, slopes, intercepts, breaks) {
i = 1 + findInterval(x, breaks)
y = slopes[i] * x + intercepts[i]
return(y)
}
Note that I put the breaks argument last, since that seemed most natural to me.
It automatically implements the piecewise defined function for any number of pieces.
Example:
slopes <- c(4, 2, 8, 4)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
breaks <- c(0.0150, 0.030, 0.035)
df <- tibble(x = seq(0.0025, 0.06, 0.0025)) %>%
mutate(y = piecewise(x, slopes, intercepts, breaks))
df %>%
ggplot(aes(x, y)) +
geom_line()
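To make the indexing step concrete, here is a small worked example using the values from the question. The four x values are chosen by me, one per segment; findInterval returns how many breaks lie at or below each x, so adding 1 gives the segment number.

```r
piecewise <- function(x, slopes, intercepts, breaks) {
  i <- 1 + findInterval(x, breaks)   # segment index for each x
  slopes[i] * x + intercepts[i]
}

slopes <- c(4, 2, 8, 4)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
breaks <- c(0.0150, 0.030, 0.035)

# One illustrative x value inside each of the four segments
x <- c(0.010, 0.020, 0.032, 0.050)
segment <- 1 + findInterval(x, breaks)
y <- piecewise(x, slopes, intercepts, breaks)

print(data.frame(x, segment, y))
```

Adding a knot only requires one more entry in breaks and one more slope and intercept; the function itself does not change.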

Resources