R: Change number of ticks in ggplot2 with ggthemes::extended_range_breaks()

Assume the following MWE:
library(ggplot2)
library(ggthemes)
set.seed(100)
df <- data.frame(x = rnorm(50), y = rnorm(50))
ggplot(df, aes(x, y)) +
  geom_point() +
  theme_tufte() +
  geom_rangeframe() +
  scale_x_continuous(breaks = extended_range_breaks()(df$x),
                     labels = scales::number_format(accuracy = 0.1)) +
  scale_y_continuous(breaks = extended_range_breaks()(df$y),
                     labels = scales::number_format(accuracy = 0.1))
From my point of view, there are not enough ticks on the y-axis: the gap between 1.0 and 2.9 is too large, and I would like another tick at 2.0. Does anyone have an idea how to do that when working with extended_range_breaks, or do I have to switch to setting the ticks manually?
I tried scale_y_continuous(n.breaks = 7) and scale_y_continuous(breaks = extended_range_breaks(n = 7)(df$y)), but neither has any effect.

Not a perfect answer, but at least a workaround for now: change the weights applied to the four optimization components. The weights default to w = c(0.25, 0.2, 0.5, 0.05) for simplicity, coverage, density, and legibility. If we set coverage to 2, for example (for the y-axis), the graph changes to the following:
The actual goal of simply adding (-2, 2) was not yet achievable with this tweak.
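If the weights cannot be tuned to give the wanted ticks, the fallback mentioned in the question (setting the breaks manually) can still keep the range-frame look. The following is only a minimal sketch in base R: the helper name range_plus_integer_breaks is hypothetical and not part of ggthemes. It keeps the data minimum and maximum as the outer ticks and adds every whole number in between.

```r
# Hypothetical helper (not part of ggthemes or ggplot2):
# returns the data extremes plus all whole numbers lying between them.
range_plus_integer_breaks <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  # Guard against ranges that contain no whole number at all
  inner <- if (ceiling(lo) <= floor(hi)) {
    seq(ceiling(lo), floor(hi), by = 1)
  } else {
    numeric(0)
  }
  sort(unique(c(lo, inner, hi)))
}

# Example with made-up values
example_breaks <- range_plus_integer_breaks(c(-2.3, 0.4, 2.9))
example_breaks
```

In the MWE this could be used as scale_y_continuous(breaks = range_plus_integer_breaks(df$y), labels = scales::number_format(accuracy = 0.1)). Note that a whole number sitting very close to the minimum or maximum may produce overlapping labels, so the result should be checked visually.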

Related

How can I keep ggplot2 annotations aligned and in fixed position on plot?

I'm plotting 2 densities and trying to add a few annotations that align horizontally while text is rotated 90 degrees but I can't seem to get them to line up when the annotations are of different character lengths.
library(ggplot2)
n <- 10000
mu_a <- .089
mu_b <- .099
s_a <- .0092
s_b <- .004
df <- data.frame(
  variant = factor(c(rep("A", n), rep("B", n))),
  p = c(rnorm(n = n, mean = mu_a, sd = s_a), rnorm(n = n, mean = mu_b, sd = s_b)))
ggplot(df, aes(x = p, fill = variant)) +
  geom_density() +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(expand = expansion(mult = c(0, .1))) +
  annotate("text",
           x = c(mu_a, mu_b),
           y = Inf,
           vjust = "center",
           hjust = 6,
           label = c("5char", "06char"),
           angle = 90
  )
Created on 2021-05-12 by the reprex package (v0.3.0)
Plot image at https://i.imgur.com/CxW1pjP.png
I've tried changing the y axis to a 0:1 scale with scale_y_continuous(y = ..scaled..) and then setting the annotation y values to fixed positions like y = 0.2, but then the densities aren't sized appropriately. I have tried all manner of combinations of hjust and vjust. I thought that these were supposed to work like percentages of the plot, so that vjust = 0.2 means 20% up the plot, but it's not working like that for me. I was not expecting hjust and vjust to swap when the text is rotated 90 degrees, but that seems to be what happened.
I'm not sure I'm 100% following, but try this:
ggplot(df, aes(x = p, ..scaled.., fill = variant)) +
  geom_density(alpha = 0.8, adjust = 0.2) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(expand = expansion(mult = c(0, .1))) +
  annotate("text",
           x = c(mu_a, mu_b),
           y = 0.5,
           hjust = 0.5,
           label = c("5char", "20charxxxxxxxxxxxxxx"),
           angle = 90
  )
The main callouts:
Set y to just be 0.5. It's a density plot, so it can be scaled so that the max y is always 1 using ..scaled.. in the mapping (see geom_density y-axis goes above 1)
Removed the vjust
Changed the hjust to be 0.5. While values >1 are accepted, it's easiest to just think of 0 as left-justified, 1 as right-justified, and 0.5 as center-justified. The reason it's hjust rather than vjust is that the justification is from the perspective of the text, not the orientation of the plot (this makes some sense: consider an angle that is anything other than 0 or 90).
I threw an alpha value into the geom_density() function so that the full curve would show for both (which wasn't part of the question at all, but I couldn't help myself).
This should return the following:
Plot using the code above
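To see the point about hjust following the text rather than the plot, a small stand-alone sketch may help. The data frame and its column names are made up for illustration. Three rotated labels share the same y anchor (the dashed line) and differ only in hjust, so with angle = 90 the labels start at, are centred on, or end at the line.

```r
library(ggplot2)

# Illustration data: one label per hjust value, all anchored at y = 0.5
hjust_demo <- data.frame(
  x = 1:3,
  y = 0.5,
  hjust = c(0, 0.5, 1),
  label = c("hjust = 0", "hjust = 0.5", "hjust = 1")
)

hjust_plot <- ggplot(hjust_demo, aes(x, y, label = label, hjust = hjust)) +
  geom_hline(yintercept = 0.5, linetype = "dashed") +  # the shared anchor
  geom_text(angle = 90) +                              # text runs bottom to top
  xlim(0.5, 3.5) +
  ylim(0, 1)

hjust_plot
```

Because hjust is measured along the direction in which the text runs, it moves the rotated labels up and down the plot, which is why it can look as if hjust and vjust have swapped.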
The best answers I've found here are:
Use annotation_custom() with the grid package and the textGrob() function. This does allow positioning of annotations by % of the x and y axes. The problem is that you can't mix methods, such as setting x to a point on the scale and y to a % of the scale, which is what I'm trying to do.
Calculate the upper end of the range of values in the plot. You can get this from a ggplot object with ggplot_build(.)$layout$panel_scales_y[[1]]$range$range[[2]], or from a density function with d <- density(.) and then d$y[which.max(d$y)]. Once you have the upper end of the range, you can continue to build the plot by using a proportion of that upper end for the y placement.
Setting the y scale to ..scaled.. does indeed set the y scale to max 1, however, when plotting multiple densities, it sets both to their own scale rather than scaling accurately to each other. So wide and narrow densities will have the same height.
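The second approach (reading the upper end of the y range from the built plot) could look roughly like the sketch below. It reuses the variable names from the question; the seed, the object names base_plot, y_top and annotated, and the choice of 0.5 as the proportion are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n <- 10000
mu_a <- .089
mu_b <- .099
s_a <- .0092
s_b <- .004
df <- data.frame(
  variant = factor(rep(c("A", "B"), each = n)),
  p = c(rnorm(n, mean = mu_a, sd = s_a), rnorm(n, mean = mu_b, sd = s_b))
)

# Build the plot without annotations first
base_plot <- ggplot(df, aes(x = p, fill = variant)) +
  geom_density(alpha = 0.8) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(expand = expansion(mult = c(0, .1)))

# Upper end of the trained y range, read from the built plot
y_top <- ggplot_build(base_plot)$layout$panel_scales_y[[1]]$range$range[[2]]

# Place both labels at the same proportion of that upper end
annotated <- base_plot +
  annotate("text",
           x = c(mu_a, mu_b),
           y = 0.5 * y_top,
           hjust = 0.5,
           label = c("5char", "20charxxxxxxxxxxxxxx"),
           angle = 90)

annotated
```

Unlike mapping y to ..scaled.., this keeps the two densities on a common scale, so the narrow density stays taller than the wide one while the labels still sit at a fixed fraction of the panel height.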

How do I adjust formatting of ggplot axis labels when using log-scales?

I have been having trouble taming the accuracy settings for a ggplot chart with a log scale. Using label_number doesn't quite give me the format I would like. The chart below shows a stylised example. Here, I would like to tweak the labels with one fewer decimal place, so they are 0.1, 1.0, 10.0, 100.0 and 1 000.0.
ggplot(data = data.frame(x = 1:5, y = 10^(1:5 - 2)), aes(x = x, y = y)) +
  geom_point() +
  scale_y_log10(labels = scales::label_number())
I thought that the accuracy parameter would help me here, but it doesn't seem to behave nicely with a log scale and produces odd results. Here's an example; notice the strange appearance of the 2s at the end of the labels.
ggplot(data = data.frame(x = 1:5, y = 10^(1:5 - 2)), aes(x = x, y = y)) +
  geom_point() +
  scale_y_log10(labels = scales::label_number(accuracy = 6))
Well, that looks pretty close.
Have you tried an accuracy of 0.1? (I took the idea from an example in the help file)
This works for me:
ggplot(data = data.frame(x = 1:5, y = 10^(1:5 - 2)), aes(x = x, y = y)) +
  geom_point() +
  scale_y_log10(labels = scales::label_number(accuracy = 0.1))
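The reason accuracy = 6 produced those trailing 2s is that accuracy is the number to round to, not a count of digits, so every break is rounded to the nearest multiple of 6. The sketch below applies the labellers directly to the break values, outside of any plot; the object names are made up for illustration.

```r
library(scales)

y_breaks <- 10^(1:5 - 2)  # 0.1, 1, 10, 100, 1000

# Rounding to the nearest multiple of 6, written out in base R
rounded_to_six <- round(y_breaks / 6) * 6
rounded_to_six  # 0 0 12 102 1002

# The same labellers as used in the plots above
labels_acc_6   <- label_number(accuracy = 6)(y_breaks)
labels_acc_0_1 <- label_number(accuracy = 0.1)(y_breaks)

labels_acc_6
labels_acc_0_1
```

With accuracy = 0.1 the values are rounded to the nearest tenth and shown with one decimal place, which is the format asked for in the question.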

Use facet_zoom in a stat_density plot

I'd like to use facet_zoom, but for some reason the zoomed area comes out empty.
The two data sets I use are just numeric vectors of 1,000,000 numbers generated from a modified polynomial distribution. In the zoomed area there is a small spike that I'd like to show.
library(ggplot2)
library(ggforce)  # provides facet_zoom()
prova <- readRDS("probcond1.rds")
prova1 <- readRDS("probpoly.rds")
dfGamma <- data.frame(prova)
ggplot(dfGamma, aes(x = prova)) + stat_density(aes(y = ..count..), color = "black", fill = "blue", alpha = 0.3)
g <- ggplot(dfGamma, aes(x = prova)) +
  stat_density(aes(y = ..count..), color = "black", fill = "blue", alpha = 0.3) +
  scale_x_continuous(breaks = c(0,1,2,3,4,5,10,30,100,300,1000,4000,5000), trans = "log1p", expand = c(0,0)) +
  theme_bw()
g + expand_limits(x = c(1, 6000)) + facet_zoom(xlim = c(4000, 5000))
I'm really new to R, sorry for my ignorance.
Your axis is on a log1p scale, so your xlim should be wrapped inside log1p to do a zoom. You can do as follows:
g+expand_limits(x = c(1, 6000)) +facet_zoom(xlim = c(log1p(4000),log1p(5000)))
Here is a sample using the mtcars dataset.
library(ggplot2)
library(ggforce)
g <- ggplot(mtcars, aes(x = hp)) +
  stat_density(aes(y = ..count..), color = "black", fill = "blue", alpha = 0.3) +
  scale_x_continuous(breaks = c(0,1,2,3,4,5,10,30,100,300), trans = "log1p", expand = c(0,0)) +
  theme_bw()
If you use facet_zoom(xlim = c(100,300)) as follows, the zoom panel comes out empty, because the untransformed values 100 and 300 don't exist on g's log1p-transformed x-axis:
g+expand_limits(x = c(1, 300)) +facet_zoom(xlim = c(100,300))
Output-1 (flat value zoom)
If you transform the xlim using log1p, you can zoom on the corresponding values of the x-axis of plot g. You can do that as follows:
g+expand_limits(x = c(1, 300)) +facet_zoom(xlim = c(log1p(100),log1p(300)))
Output-2 (log1p zoom)
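To make the difference between the two calls concrete, the zoom window can be converted by hand. This is plain base R and only illustrates the arithmetic; the variable names are made up.

```r
zoom_data <- c(100, 300)          # zoom window in data units
zoom_axis <- log1p(zoom_data)     # same window in log1p-transformed units
zoom_axis                         # approximately 4.62 and 5.71

# expm1() is the inverse of log1p(), so this recovers 100 and 300
zoom_back <- expm1(zoom_axis)
zoom_back
```

On the transformed axis the whole plot only spans single-digit values, so an untransformed window of 100 to 300 lies far outside it, which is consistent with the empty zoom panel in the first output.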
If you want to zoom on both axes independently, you can do as follows:
g+expand_limits(x = c(1, 300)) +facet_zoom(xlim = c(log1p(100),log1p(300)), ylim = c(5,10), split = TRUE)
Output
As you can see, I zoomed the y-axis between 5 and 10. Setting split = TRUE makes the zoom independent for each axis, so you get separate views of the zoomed axes; if you just want one view, leave split at its default value, FALSE. The ggforce manual (Package ‘ggforce’) has a lot more information that you might want to consult.
Hope that helps.

Add new geom as new row in ggplot2, preventing layering of plots

I am pretty sure that this is easy to do but I can't seem to find a proper way to query this question into google or stack, so here we are:
I have a plot made in ggplot2 which makes use of geom_jitter(), efficiently creating one row for each element in a factor and plotting its values.
I would like to add a complementary geom_violin() to the plot, but just adding the extra geom_ function to the plot code returns two layers: the jitter and the violin, one on top of the other (as usually expected).
EDIT:
This is what the plot looks like:
How can I have the violin as a separate row, without generating a second plot?
Side quest: how can I have the jitter and the violin geoms interleaved? (i.e. element A jitter row followed by element A violin row, and then element B jitter row followed by element B violin row)
This is the minimum required code to make it (without all the theme() embellishments):
P1 <- ggplot(data = TEST_STACK_SUB, aes(x = E, y = C, col = A)) +
  theme(... , aspect.ratio = 0.3) +
  geom_point(position = position_jitter(w = 0.30, h = 0), alpha = 0.2, size = 0.5) +
  geom_violin(data = TEST_STACK_SUB, mapping = aes(x = E, y = C), position = "dodge") +
  scale_x_discrete() +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, 0.1),
                     labels = c(seq(0, 1, 0.1))) +
  scale_color_gradient2(breaks = seq(0, 100, 20),
                        limits = c(0, 100),
                        low = "green3",
                        high = "darkorchid4",
                        midpoint = 50,
                        name = "") +
  coord_flip()
options(repr.plot.width=8, repr.plot.height=2)
plot(P1)
Here is a subset of the data to generate it (for you to try):
data
How about manipulating your factor as a continuous variable and nudging the entries across the aes() calls like so:
library(dplyr)
library(ggplot2)
set.seed(42)
tibble(x = rep(c(1, 3), each = 10),
       y = c(rnorm(10, 2), rnorm(10))) -> plot_data
ggplot(plot_data) +
  geom_jitter(aes(x = x - 0.5, y = y), width = 0.25) +
  geom_violin(aes(x = x + 0.5, y = y, group = x), width = 0.5) +
  coord_flip() +
  labs(x = "x") +
  scale_x_continuous(breaks = c(1, 3),
                     labels = paste("Level", 1:2),
                     trans = scales::reverse_trans())

How to adjust figure settings in plotmatrix?

Can I adjust the point size, alpha, font, and axis ticks in a plotmatrix?
Here is an example:
library(ggplot2)
plotmatrix(iris)
How can I:
make the points twice as big
set alpha = 0.5
have no more than 5 ticks on each axis
set font to 1/2 size?
I have fiddled with the mapping = aes() argument to plotmatrix as well as opts() and adding layers such as + geom_point(alpha = 0.5, size = 14), but none of these seem to do anything. I have hacked a bit of a fix to the size by writing to a large pdf (pdf(file = "foo.pdf", height = 10, width = 10)), but this provides only a limited amount of control.
Pretty much all of the ggplot2 scatterplot matrix options are still fairly new and can be a bit experimental.
But the facilities in GGally do allow you to construct this kind of plot manually:
library(ggplot2)
library(GGally)  # provides ggpairs() and putPlot()
custom_iris <- ggpairs(iris, upper = "blank", lower = "blank",
                       title = "Custom Example")
p1 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(size = 1, alpha = 0.3)
p2 <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()
custom_iris <- putPlot(custom_iris, p1, 2, 1)
custom_iris <- putPlot(custom_iris, p2, 3, 2)
custom_iris
I did that simply by directly following the last example in ?ggpairs.
