Transforming ggplot2 axes to log10 using scales::trans_breaks() can sometimes (if the range is small enough) produce un-pretty breaks, at non-integer powers of ten.
Is there a general purpose way of setting these breaks to occur only at 10^x, where x are all integers, and, ideally, consecutive (e.g. 10^1, 10^2, 10^3)?
Here's an example of what I mean.
library(ggplot2)
# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
x = rep(1:3, 3),
y = 10^seq(from = -4, to = 1, length.out = 9))
p <- ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ fct, scales = "free_y") # faceted to try and emphasise that it's general purpose, rather than specific to a particular axis range
The unwanted result -- y-axis breaks are at non-integer powers of ten (e.g. 10^2.8)
p + scale_y_log10(
breaks = scales::trans_breaks("log10", function(x) 10^x),
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
I can achieve the desired result for this particular example by adjusting the n argument to scales::trans_breaks(), as below. But this is not a general purpose solution, of the kind that could be applied without needing to adjust anything on a case-by-case basis.
p + scale_y_log10(
breaks = scales::trans_breaks("log10", function(x) 10^x, n = 1),
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
Should add that I'm not wed to using scales::trans_breaks(), it's just that I've found it's the function that gets me closest to what I'm after.
Any help would be much appreciated, thank you!
Here is an approach that at the core has the following function.
breaks = function(x) {
brks <- extended_breaks(Q = c(1, 5))(log10(x))
10^(brks[brks %% 1 == 0])
}
It gives extended_breaks() a narrow set of 'nice numbers' and then filters out non-integers.
This gives us the following for your example case:
library(ggplot2)
library(scales)
# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
x = rep(1:3, 3),
y = 10^seq(from = -4, to = 1, length.out = 9))
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ fct, scales = "free_y") +
scale_y_continuous(
trans = "log10",
breaks = function(x) {
brks <- extended_breaks(Q = c(1, 5))(log10(x))
10^(brks[brks %% 1 == 0])
},
labels = math_format(format = log10)
)
Created on 2021-01-19 by the reprex package (v0.3.0)
I haven't tested this on many other ranges that might be difficult, but it should generalise better than setting the number of desired breaks to 1. Difficult ranges might be those that lie strictly between two consecutive powers of 10, for example 11-99 or 101-999.
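To probe such ranges without drawing anything, here is a minimal sketch in base R. The helper name int_pow10_breaks() is hypothetical (it is not part of scales or ggplot2), and it swaps the extended_breaks() filter for a plain floor/ceiling of the log10 range, so treat it as an alternative to experiment with rather than the method above. Whether ggplot2 actually draws candidate breaks that fall outside the panel limits is not covered here.

```r
# Hypothetical helper: every integer power of ten spanning the limits.
# Assumes strictly positive, finite limits.
int_pow10_breaks <- function(limits) {
  rng <- range(log10(limits), na.rm = TRUE)
  10^seq(floor(rng[1]), ceiling(rng[2]), by = 1)
}

int_pow10_breaks(c(11, 99))     # candidates on either side of the range
int_pow10_breaks(c(2e-4, 5))    # consecutive powers covering the range
```

A function like this can be passed as the breaks argument of a log10 scale in the same way as the anonymous function above.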
I want to draw this type of line/segment in R.
The ggforce::geom_link2() function can interpolate aesthetics between two points on a line.
library(ggplot2)
library(ggforce)
df <- data.frame(
x = c(1, 2), y = c(1, 2),
width = c(1, 2)
)
ggplot(df, aes(x, y)) +
geom_link2(aes(size = width),
lineend = "round")
Created on 2021-08-27 by the reprex package (v1.0.0)
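To make "interpolating an aesthetic between two points" concrete, here is a small base R sketch using approx(). It only illustrates the idea of linear interpolation along the segment; it is an assumption for illustration and not a description of how ggforce implements geom_link2() internally.

```r
# Two end points of the segment, matching the data frame above
x     <- c(1, 2)
y     <- c(1, 2)
width <- c(1, 2)

# Interpolate position and width at n evenly spaced points
n  <- 5
xi <- seq(x[1], x[2], length.out = n)
interp <- data.frame(
  x     = xi,
  y     = approx(x, y, xout = xi)$y,
  width = approx(x, width, xout = xi)$y
)
interp
```

Drawing many short pieces with these intermediate widths is what gives the appearance of a line that thickens smoothly.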
The answer is, I think, "Yes, of course, because ggplot2 is amazing and you can do anything" and, at the same time, "unfortunately it's going to be at least a little bit painful".
Here is my best approximation of your image using only points and lines:
library(ggplot2)
x <- seq(1, 10, length = 200)
y <- - 2 * x
dat <- data.frame(x, y)
ggplot(dat, aes(x, y, size = x ** 2)) +
geom_line(show.legend = FALSE) +
geom_point(aes(size = x ** 2 * 0.7),
data = dat[c(1, 200),],
show.legend = FALSE) +
theme_void()
And the result looks pretty decent, in my opinion.
The greatest advantage of this technique is that it allows you to color your line, and get beautiful graphs like this!
This is the plot I have:
I used this code (including sample data):
# dummy data
df_test <- data.frame(long = rep(447030:447050, 21),
lat = rep(5379630:5379650, each=21),
z = rnorm(21*21))
# plot
ggplot(df_test) +
geom_tile(aes(x=long, y = lat, fill = z)) +
scale_fill_stepsn(
limits = c(-3, 3), breaks = seq(-3, 3, 1), # labels = seq(-3, 3, 1),
colors = c("#ff6f69", "grey90", "#00aedb"))
I would like the legend to show the maximum and minimum value (-3, +3). But when I uncomment the label-code labels = seq(-3, 3, 1), I get an error:
Error: Breaks and labels are different lengths
Is this a bug or am I misusing the function? (aka: Is it a bug or a feature?) Either way: Do you guys know any workaround / solution for this issue? Maybe something with override.aes() (I am not really good with that function)?
R version: 4.1.0 | ggplot2 version: 3.3.5
(Maybe related: Breaks and labels of different lengths scale_size_binned)
Edit: If I install ggplot2 version 3.3.3, the last box in the legend is bigger somehow (which I don't like either).
This is just a workaround for what I think might be a bug, but you might tweak the breaks a little bit to add/subtract a very small value:
library(ggplot2)
# dummy data
df_test <- data.frame(long = rep(447030:447050, 21),
lat = rep(5379630:5379650, each=21),
z = rnorm(21*21))
# plot
smallvalue <- 10 * .Machine$double.eps
ggplot(df_test) +
geom_tile(aes(x=long, y = lat, fill = z)) +
scale_fill_stepsn(
limits = c(-3, 3),
breaks = c(-3 + smallvalue, -2:2, 3 - smallvalue),
labels = seq(-3, 3, 1),
colors = c("#ff6f69", "grey90", "#00aedb")
)
Created on 2021-08-06 by the reprex package (v1.0.0)
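The nudged breaks are numerically distinct from the limits even though they look the same when printed. A quick check in base R, assuming the default of 7 significant digits for formatting:

```r
smallvalue <- 10 * .Machine$double.eps

lower <- -3 + smallvalue
upper <-  3 - smallvalue

# Both nudged breaks lie strictly inside the limits c(-3, 3)
lower > -3
upper < 3

# With default formatting they still read as the limits themselves
format(c(lower, upper), digits = 7)
```

This is also why the explicit labels = seq(-3, 3, 1) is supplied in the workaround: it keeps the legend text independent of how the nudged values happen to be formatted.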
EDIT:
Alternatively, you can set the inner breaks and use a function for the labels argument.
library(ggplot2)
# dummy data
df_test <- data.frame(long = rep(447030:447050, 21),
lat = rep(5379630:5379650, each=21),
z = rnorm(21*21))
# plot
ggplot(df_test) +
geom_tile(aes(x=long, y = lat, fill = z)) +
scale_fill_stepsn(
limits = c(-3, 3),
breaks = -2:2,
labels = function(x) {x}, # Just display the breaks
show.limits = TRUE,
colors = c("#ff6f69", "grey90", "#00aedb")
)
Created on 2021-08-06 by the reprex package (v1.0.0)
I would like to create a graph that has superscripts on the axis instead of displaying unformatted numbers using ggplot2. I know that there are a lot of answers which change the axis label, but not the axis text. I am not trying to change the label of the graph, but the text on the axis.
Example:
x<-c('2^-5','2^-3','2^-1','2^1','2^2','2^3','2^5','2^7','2^9','2^11','2^13')
y<-c('2^-5','2^-3','2^-1','2^1','2^2','2^3','2^5','2^7','2^9','2^11','2^13')
df<-data.frame(x,y)
p<-ggplot()+
geom_point(data=df,aes(x=x,y=y),size=4)
p
So I would like the x-axis to display the same numbers, but rendered as superscripts without the caret.
EDIT:
A mostly base R approach (dplyr is loaded only for the pipe and mutate_all()):
library(dplyr)
df %>%
mutate_all(as.character)->new_df
res<-unlist(Map(function(x) eval(parse(text=x)),new_df$x))#replace with y for y
to_use<-unlist(lapply(res,as.expression))
split_text<-strsplit(gsub("\\^"," ",names(to_use))," ")
join_1<-as.numeric(sapply(split_text,"[[",1)) #tidyr::separate might help, less robust for numeric(I think)
join_2<-as.numeric(sapply(split_text,"[[",2))
to_use_1<-sapply(seq_along(join_1),function(x) parse(text=paste(join_1[x],"^",
join_2[x])))
The above can be reduced to fewer steps; I have posted the stepwise approach I took. The result is shown for x only; the same can be done for y:
new_df %>%
ggplot()+
geom_point(aes(x=x,y=y),size=4)+
scale_x_discrete(breaks=df$x,labels=to_use_1)#replace with y and scale_y_discrete for y
Plot:
Original and erroneous answer:
I have deviated from standard tidyverse practice by using $; you could try replacing it with ., although it hardly matters here since the focus is on the labels:
library(dplyr)
df %>%
mutate(new_x=gsub("\\^"," ",x),
new_y=gsub("\\^"," ",y))->new_df
new_df %>%
ggplot()+
geom_point(aes(x=x,y=y),size=4)+
scale_x_discrete(breaks=x,labels=new_df$new_x)+
scale_y_discrete(breaks=y,labels=new_df$new_y)
This can be done with functions scale_x_log2 and scale_y_log2 that can be found in GitHub package jrnoldmisc.
First, install the package.
devtools::install_github("jrnold/rubbish")
Then, coerce the variables to numeric. I will work with a copy of the original data frame.
df1 <- df
df1[] <- lapply(df1, function(x){
x <- as.character(x)
sapply(x, function(.x)eval(parse(text = .x)))
})
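If you would rather not eval() arbitrary strings, here is a sketch that avoids it. It assumes every entry has exactly the form base^exponent, and the helper name pow_string_to_num() is made up for illustration.

```r
# Hypothetical helper: convert strings such as "2^-5" to numbers
# by splitting at the caret instead of evaluating the string as code.
pow_string_to_num <- function(s) {
  parts <- strsplit(as.character(s), "^", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1])^as.numeric(p[2]),
         numeric(1))
}

pow_string_to_num(c("2^-5", "2^1", "2^3"))
```

It could be dropped into the lapply() call above in place of the inner sapply(), under the same assumption about the input format.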
Now, graph it.
library(jrnoldmisc)
library(ggplot2)
library(MASS)
library(scales)
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^x, n = 10)) +
scale_y_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^x, n = 10))
a + annotation_logticks(base = 2)
Edit.
Following the discussion in the comments, here are the two other ways that were seen to give different axis labels.
Axis labels every tick mark. Set limits = c(1.01, NA) and function argument n = 11, an odd number.
Axis labels on odd number exponents. Keep limits = c(0.01, NA), change to function(x) 2^(x - 1), n = 11.
Just the instructions, no plots.
The first.
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(1.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x), n = 11)) +
scale_y_log2(limits = c(1.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x), n = 11))
a + annotation_logticks(base = 2)
And the second.
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x - 1), n = 11)) +
scale_y_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x - 1), n = 11))
a + annotation_logticks(base = 2)
You can provide a function to the labels argument of the scale_x_*** and scale_y_*** functions to generate labels with superscripts (or other formatting). See examples below.
library(jrnoldmisc)
library(ggplot2)
df<-data.frame(x=2^seq(-5,5,2),
y=2^seq(-5,5,2))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_log2(breaks=2^seq(-5,5,2),
labels=function(x) parse(text=paste("2^",round(log2(x),2))))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_continuous(breaks=c(2^-5, 2^seq(1,5,2)),
labels=function(x) parse(text=paste("2^",round(log2(x),2))))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_log10(breaks=10^seq(-1,1,1),
labels=function(x) parse(text=paste("10^",round(log10(x),2))))
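To see what such a labels function returns before wiring it into a scale, you can call it directly on a few break values. A small check in base R, with the function stored under the hypothetical name label_pow2:

```r
# Same labelling function as in the log2 examples above
label_pow2 <- function(x) parse(text = paste("2^", round(log2(x), 2)))

labs <- label_pow2(c(0.5, 2, 8))
labs                 # one plotmath expression per break
is.expression(labs)  # ggplot2 renders expressions as formatted math
```

Because the result is an expression vector rather than a character vector, the exponents are typeset as superscripts on the axis.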
See example:
I hope I don't need to manually assign the coordinates of the text labels. If this is too complicated to achieve in ggplot2, what are the alternatives in R? Or maybe even not in R?
As @Axeman says, ggrepel is a decent option. Unfortunately it only avoids overlap with other labels, not with the lines, so the solution isn't quite perfect.
library(ggplot2)
# install.packages("ggrepel")  # run once if ggrepel is not installed
library(ggrepel)
set.seed(50)
d <- data.frame(y = c(rnorm(50), rnorm(50, 5), rnorm(50, 10)),
x = rep(seq(50), times = 3),
group = rep(LETTERS[seq(3)], each = 50))
ggplot(d, aes(x, y, group = group, label = group)) +
geom_line() +
geom_text_repel(data = d[d$x == sample(d$x, 1), ], size = 10)
I'm trying to make a scatter plot in R with ggplot2, where the middle of the y-axis is collapsed or removed, because there is no data there. I did it in photoshop below, but is there a way to create a similar plot with ggplot?
This is the data with a continuous scale:
But I'm trying to make something like this:
Here is the code:
ggplot(data=distance_data) +
geom_point(
aes(
x = mdistance,
y = maxZ,
shape = factor(subj),
color = factor(side),
size = (cSA)
)
) +
scale_size_continuous(range = c(4, 10)) +
theme(
axis.text.x = element_text(colour = "black", size = 15),
axis.text.y = element_text(colour = "black", size = 15),
axis.title.x = element_text(colour = "black", size= 20, vjust = 0),
axis.title.y = element_text(colour = "black", size= 20),
legend.position = "none"
) +
ylab("Z-score") +
xlab("Distance")
You could do this by defining a coordinate transformation. A standard example is logarithmic coordinates, which can be achieved in ggplot by using scale_y_log10().
But you can also define custom transformation functions by supplying the trans argument to scale_y_continuous() (and similarly for scale_x_continuous()). To this end, you use the function trans_new() from the scales package. It takes as arguments the transformation function and its inverse.
I discuss first a special solution for the OP's example and then also show how this can be generalised.
OP's example
The OP wants to shrink the interval between -2 and 2. The following defines a function (and its inverse) that shrinks this interval by a factor 4:
library(scales)
trans <- function(x) {
ifelse(x > 2, x - 1.5, ifelse(x < -2, x + 1.5, x/4))
}
inv <- function(x) {
ifelse(x > 0.5, x + 1.5, ifelse(x < -0.5, x - 1.5, x*4))
}
my_trans <- trans_new("my_trans", trans, inv)
This defines the transformation. To see it in action, I define some sample data:
x_val <- 0:250
y_val <- c(-6:-2, 2:6)
set.seed(1234)
data <- data.frame(x = sample(x_val, 30, replace = TRUE),
y = sample(y_val, 30, replace = TRUE))
I first plot it without transformation:
p <- ggplot(data, aes(x, y)) + geom_point()
p + scale_y_continuous(breaks = seq(-6, 6, by = 2))
Now I use scale_y_continuous() with the transformation:
p + scale_y_continuous(trans = my_trans,
breaks = seq(-6, 6, by = 2))
If you want another transformation, you have to change the definition of trans() and inv() and run trans_new() again. You have to make sure that inv() is indeed the inverse of trans(). I checked this as follows:
x <- runif(100, -100, 100)
identical(x, trans(inv(x)))
## [1] TRUE
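Note that identical() demands bit-for-bit equality, which can fail for other transformations because of floating-point rounding. A tolerance-based variant of the check, sketched in base R with the same trans() and inv() repeated so the block runs on its own (the seed is arbitrary):

```r
# Transformation and inverse as defined above
trans <- function(x) {
  ifelse(x > 2, x - 1.5, ifelse(x < -2, x + 1.5, x / 4))
}
inv <- function(x) {
  ifelse(x > 0.5, x + 1.5, ifelse(x < -0.5, x - 1.5, x * 4))
}

# Where the region boundaries and a few outside points end up
trans(c(-6, -2, 0, 2, 6))

# Round trip in both directions, compared up to numerical tolerance
set.seed(1)
x <- runif(100, -100, 100)
isTRUE(all.equal(x, trans(inv(x))))
isTRUE(all.equal(x, inv(trans(x))))
```

Checking both directions also catches the case where the two functions disagree about where the squished region ends.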
General solution
The function below defines a transformation where you can choose the lower and upper end of the region to be squished, as well as the factor to be used. It directly returns the trans object that can be used inside scale_y_continuous:
library(scales)
squish_trans <- function(from, to, factor) {
trans <- function(x) {
if (any(is.na(x))) return(x)
# get indices for the relevant regions
isq <- x > from & x < to
ito <- x >= to
# apply transformation
x[isq] <- from + (x[isq] - from)/factor
x[ito] <- from + (to - from)/factor + (x[ito] - to)
return(x)
}
inv <- function(x) {
if (any(is.na(x))) return(x)
# get indices for the relevant regions
isq <- x > from & x < from + (to - from)/factor
ito <- x >= from + (to - from)/factor
# apply transformation
x[isq] <- from + (x[isq] - from) * factor
x[ito] <- to + (x[ito] - (from + (to - from)/factor))
return(x)
}
# return the transformation
return(trans_new("squished", trans, inv))
}
The first line in trans() and inv() handles the case when the transformation is called with x = c(NA, NA). (It seems that this did not happen with the version of ggplot2 I used when I originally wrote this answer. Unfortunately, I don't know in which version this started.)
This function can now be used to conveniently redo the plot from the first section:
p + scale_y_continuous(trans = squish_trans(-2, 2, 4),
breaks = seq(-6, 6, by = 2))
The following example shows that you can squish the scale at an arbitrary position and that this also works for other geoms than points:
df <- data.frame(class = LETTERS[1:4],
val = c(1, 2, 101, 102))
ggplot(df, aes(x = class, y = val)) + geom_bar(stat = "identity") +
scale_y_continuous(trans = squish_trans(3, 100, 50),
breaks = c(0, 1, 2, 3, 50, 100, 101, 102))
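To see how far the bars are compressed in this example, the forward half of the transformation can be evaluated on its own. The sketch below restates it as a standalone base R helper under the hypothetical name squish_forward(); it uses the same region rules as squish_trans() but does not depend on scales.

```r
# Hypothetical standalone version of the forward mapping in squish_trans()
squish_forward <- function(x, from, to, factor) {
  ifelse(x <= from, x,                               # below the region: unchanged
    ifelse(x < to, from + (x - from) / factor,       # inside: compressed
           from + (to - from) / factor + (x - to)))  # above: shifted down
}

# Bar heights from the example, plus one value inside the squished region
squish_forward(c(1, 2, 50, 101, 102), from = 3, to = 100, factor = 50)
```

On the transformed axis the tall bars sit only a few units above the short ones, which is exactly the visual compression the plot shows.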
Let me close by stressing what others have already mentioned in the comments: this kind of plot can be misleading and should be used with care!