I am trying to (naively?) set the limits of a color scale, but doing so also overrides the color scale itself. What am I doing wrong, and how can this be done?
Simple example:
p <- volcano %>% reshape2::melt(varnames=c("x", "y")) %>% as_tibble() %>%
ggplot(aes(x,y, fill=value)) + geom_tile() +
scale_fill_gradientn(colours=hcl.colors(15, palette = "Purple-Green"))
p
p + lims(fill=c(50,200))
changes the whole color scale:
NB: In the real world example I want to center the color scale symmetrically around 0 with a diverging color scale and I do not want to use scale_fill_gradient2
Thanks in advance for any help!
The reason why lims doesn't work here is that it adds a whole new scale object to the plot, which overrides the one you have already specified. If you look at the code for lims, it works by sending each of its arguments individually to the generic function ggplot2:::limits. In your case, this invokes ggplot2:::limits.numeric, which creates a new scale object via ggplot2:::make_scale. This function ends up just calling scale_fill_continuous.
As for why you can use lims after specifying an x or y scale without overwriting the existing one, the answer is: you can't, it does override the existing scale, and in fact warns you that it is doing so. Suppose we specify an x axis scale with lots of breaks in your example:
library(tidyverse)
p <- volcano %>%
reshape2::melt(varnames = c("x", "y")) %>%
as_tibble() %>%
ggplot(aes(x,y, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hcl.colors(15, palette = "Purple-Green"),
limits = c(50, 200)) +
scale_x_continuous(breaks = 0:44 * 2)
p
Now look what happens if we add x axis lims to our scale:
p + lims(x = c(0, 90))
#> Scale for 'x' is already present. Adding another scale for 'x', which will
#> replace the existing scale.
We lost all our breaks, and got a warning that our x scale was being overwritten.
So the bottom line is that passing numbers to lims just adds a vanilla continuous scale to whichever aesthetic you specify. Doing + lims(fill = c(0, 10)) gives exactly the same result as + scale_fill_continuous(limits = c(0, 10)). The answer, as you have found yourself, is to specify the limits argument directly in the scale you wish to add.
Created on 2022-08-21 with reprex v2.0.2
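For the case mentioned in the question (a diverging palette centred on 0), the limits argument can be computed from the data. The following is only a sketch: the centred volcano data and the names df, m and sym_limits are assumptions made up for illustration, not part of the original question.

```r
library(ggplot2)

# Hypothetical data: volcano heights minus their mean, so values straddle 0.
df <- data.frame(
  x = as.vector(row(volcano)),
  y = as.vector(col(volcano)),
  value = as.vector(volcano) - mean(volcano)
)

# Symmetric limits: the largest absolute value sets both ends of the scale.
m <- max(abs(df$value))
sym_limits <- c(-m, m)

p <- ggplot(df, aes(x, y, fill = value)) +
  geom_tile() +
  scale_fill_gradientn(
    colours = hcl.colors(15, palette = "Purple-Green"),
    limits = sym_limits  # limits set in the scale itself, not via lims()
  )
p
```

Because the limits are symmetric, the midpoint of the palette lines up with 0 without needing scale_fill_gradient2.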
Ok, now I found the solution myself....
Did not know that there is a limits keyword directly:
volcano %>% reshape2::melt(varnames=c("x", "y")) %>% as_tibble() %>%
ggplot(aes(x,y, fill=value)) + geom_tile() +
scale_fill_gradientn(colours=hcl.colors(15, palette = "Purple-Green"), limits=c(50,200))
I still don't fully get why lims overrides the whole scale. And does this mean one cannot change the limits afterwards (as one can do for x and y scales)?
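One way to keep the limits adjustable later is to wrap the scale in a small function and re-add it with new limits. This is a sketch under assumptions: fill_scale is a hypothetical helper name, and re-adding the scale replaces the earlier one, so ggplot2 prints its "already present" message.

```r
library(ggplot2)

# Hypothetical helper: the same colour scale every time, with adjustable limits.
fill_scale <- function(limits = NULL) {
  scale_fill_gradientn(
    colours = hcl.colors(15, palette = "Purple-Green"),
    limits = limits
  )
}

df <- data.frame(
  x = as.vector(row(volcano)),
  y = as.vector(col(volcano)),
  value = as.vector(volcano)
)

p <- ggplot(df, aes(x, y, fill = value)) +
  geom_tile() +
  fill_scale()

# Later: replace the fill scale with one that has the same colours but new limits.
p_limited <- p + fill_scale(limits = c(50, 200))
p_limited
```

The colours stay the same because the replacement scale is built by the same function; only the limits differ.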
Related
I am trying to change both the y-axis scale and the number of decimal places. I am using ylim() (to change the y scale) and scale_y_continuous(labels = scales::number_format(accuracy = 0.01)) (from the scales package, to change the decimal places), but they won't work together. I am using ggplot to plot my data.
If you use limits in scale_y_continuous, it will work.
Also, you may want to use label_number instead of number_format, because number_format is superseded by label_number. As the documentation says:
These functions are kept for backward compatibility; you should switch to label_number()/label_comma() for new code.
library(ggplot2)
ggplot(mtcars, aes(x = mpg, y = drat)) +
geom_point() +
scale_y_continuous(
labels = scales::label_number(accuracy = 0.1),
limits = c(3, 4)
)
(Using mtcars built-in data for demo)
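The reason the two calls do not combine is that ylim() itself adds a y scale, so a later scale_y_continuous() replaces it. A rough way to see this is to compare the panel's y range for both variants. This is a sketch: range_of is a hypothetical helper, and it assumes ggplot_build() exposes y.range under layout$panel_params, which may differ between ggplot2 versions.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point()

# ylim() first, then a second y scale: the second scale replaces the first,
# so the limits from ylim() are lost (ggplot2 prints a message about this).
replaced <- base +
  ylim(3, 4) +
  scale_y_continuous(labels = scales::label_number(accuracy = 0.01))

# Limits and labels in one scale: both settings survive.
combined <- base +
  scale_y_continuous(
    labels = scales::label_number(accuracy = 0.01),
    limits = c(3, 4)
  )

# Hypothetical helper: y range of the first panel (internal structure, may change).
range_of <- function(p) ggplot_build(p)$layout$panel_params[[1]]$y.range

range_of(replaced)  # spans the full range of drat
range_of(combined)  # close to c(3, 4), plus the default expansion
```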
Is there any possible way to make the points on a boxplot show and not have them overlap each other if they aren't unique?
Currently:
I want it to look like this (with the colours and other features):
I tried beeswarm and I'm getting the error:
Warning in f(...) : The default behavior of beeswarm has changed in version 0.6.0. In versions <0.6.0, this plot would have been dodged on the y-axis. In versions >=0.6.0, grouponX=FALSE must be explicitly set to group on y-axis. Please set grouponX=TRUE/FALSE to avoid this warning and ensure proper axis choice.
even though I have geom_beeswarm(grouponY=TRUE)
You could do something like this ...
library(tidyverse)
tibble(x = "label", y = rnorm(100, 10, 10)) |>
ggplot(aes(x, y)) +
geom_jitter(width = 0.1) +
geom_boxplot(outlier.shape = NA)
Created on 2022-04-24 by the reprex package (v2.0.1)
Slight modifications to Carl's answer could be to:
Move the geom_jitter layer below the geom_boxplot layer, so that the points show through the box; or
Make the box more transparent to allow the points to be visible
tibble(x = "label", y = rnorm(100, 10, 10)) %>%
ggplot(aes(x, y)) +
geom_boxplot(outlier.shape = NA, alpha = 0.5) + # alpha value is illustrative; lower means more transparent
geom_jitter(width = 0.1)
Alternatively, have you considered using a violin plot? It can more effectively show the density of the distribution, as the width of the plot is proportional to the proportion of data points around that level (for the y axis).
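A minimal sketch of the violin alternative, using the same kind of simulated data as above. The fill colour, the alpha value and the seed are assumptions for illustration only.

```r
library(ggplot2)

set.seed(42)
df <- data.frame(x = "label", y = rnorm(100, mean = 10, sd = 10))

p <- ggplot(df, aes(x, y)) +
  geom_violin(fill = "grey90") +
  # height = 0 keeps the y values exact; only the x position is jittered
  geom_jitter(width = 0.1, height = 0, alpha = 0.6)
p
```

Drawing the points after the violin keeps every observation visible on top of the density outline.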
Let's say I want to make a histogram
So I use the following code
v100 <- runif(100)
v100
library(ggplot2)
private_plot <- ggplot() +
  aes(v100) +
  geom_histogram(binwidth = 0.1, boundary = 0) +
  scale_x_continuous(breaks = seq(0, 1, 0.1), limits = c(0, 1))
private_plot
How do I separate my columns so that the whole thing is more pleasing to the eye?
I tried this but it somehow doesn't work:
Adding space between bars in ggplot2
Thanks
You could set the line color of the histogram bars with the col parameter, and the filling color with the fill parameter. This is not really adding space between the bars, but it makes them visually distinct.
library(ggplot2)
set.seed(9876)
v100<-c(runif(100))
# use col = "grey" to set the line color
ggplot() +
aes(v100) +
geom_histogram(binwidth = 0.1, fill="black", col="grey") +
scale_x_continuous(breaks = seq(0,1,0.1), lim = c(0,1))
Yielding this graph:
Please let me know whether this is what you want.
If you want to increase the space, e.g. to indicate that the values are discrete, one thing to do is to plot your histogram as a bar plot. In that case, you have to summarize the data yourself and use geom_col() instead of geom_histogram(). If you want to increase the space further, you can use the width parameter.
library(tidyverse)
lambda <- 1:6
pois_bar <-
map(lambda, ~rpois(1e5, .x)) %>%
set_names(lambda) %>%
as_tibble() %>%
gather(lambda, value, convert = TRUE) %>%
count(lambda, value)
pois_bar %>%
ggplot() +
aes(x = value, y = n) +
geom_col(width = .5) +
facet_wrap(~lambda, scales = "free", labeller = "label_both")
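For the continuous v100 data from the question, the same idea could be sketched as follows. This is not the answerer's code: the bin edges, the binned data frame and the bar width of 0.08 are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(9876)
v100 <- runif(100)

# Bin the data manually into ten bins of width 0.1.
bin_edges <- seq(0, 1, by = 0.1)
bins <- cut(v100, breaks = bin_edges, include.lowest = TRUE)

binned <- data.frame(
  mid = head(bin_edges, -1) + 0.05,  # bin midpoints
  n   = as.vector(table(bins))       # counts per bin
)

# Bars narrower than the bin width (0.08 < 0.1) leave a visible gap.
p <- ggplot(binned, aes(x = mid, y = n)) +
  geom_col(width = 0.08) +
  scale_x_continuous(breaks = bin_edges, limits = c(0, 1))
p
```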
Just use color and fill options to distinguish between the body and border of bins:
library(ggplot2)
set.seed(1234)
df <- data.frame(sex=factor(rep(c("F", "M"), each=200)),
weight=round(c(rnorm(200, mean=55, sd=5), rnorm(200, mean=65, sd=5))))
ggplot(df, aes(x=weight)) +
geom_histogram(color="black", fill="white")
In cases where you are creating a "histogram" over a range of integers, you could use:
ggplot(data) + geom_bar(aes(x = value, y = after_stat(count))) # after_stat(count) replaces the older ..count.. notation
I just came across this issue. My solution was to add vertical lines at the points separating my bins. I use "theme_classic" and have a white background. I set my bins to break at 10, 20, 30, etc. So I just added 9 vertical lines with:
geom_vline(xintercept=10, linetype="solid", color = "white", size=2)+
geom_vline(xintercept=20, linetype="solid", color = "white", size=2)+
etc
A silly hack, but it works.
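The repeated geom_vline() calls can probably be collapsed into one, since xintercept accepts a vector. This is a sketch with made-up data; it assumes ggplot2 3.4.0 or later, where line thickness is set with linewidth (older versions use size, as in the snippet above).

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: 200 values between 0 and 100.
df <- data.frame(value = runif(200, min = 0, max = 100))

# The nine interior bin boundaries: 10, 20, ..., 90.
bin_edges <- seq(10, 90, by = 10)

p <- ggplot(df, aes(value)) +
  geom_histogram(breaks = seq(0, 100, by = 10)) +
  # One call draws all separators; white matches the theme_classic() background.
  geom_vline(xintercept = bin_edges, colour = "white", linewidth = 2) +
  theme_classic()
p
```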
I have created a function for creating a barchart using ggplot.
In my figure I want to overlay the plot with white horizontal bars at the position of the tick marks like in the plot below
p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_bar(stat = 'identity')
# By inspection I found the y-tick positions to be c(50,100,150)
p + geom_hline(aes(yintercept = seq(50,150,50)), colour = 'white')
However, I would like to be able to change the data, so I can't use static positions for the lines as in the example. For example, I might change Sepal.Width to Sepal.Length in the example above.
Can you tell me how to:
1. get the tick positions from my ggplot; or
2. get the function that ggplot uses for tick positions so that I can use this to position my lines.
so I can do something like
tickpositions <- ggplot_tickpostion_fun(iris$Sepal.Width)
p + scale_y_continuous(breaks = tickpositions) +
geom_hline(aes(yintercept = tickpositions), colour = 'white')
A possible solution for (1) is to use ggplot_build to grab the content of the plot object. ggplot_build results in "[...] a panel object, which contain all information about [...] breaks".
ggplot_build(p)$layout$panel_ranges[[1]]$y.major_source
# [1] 0 50 100 150
See edit for pre-ggplot2 2.2.0 alternative.
Check out ggplot2::ggplot_build - it can show you lots of details about the plot object. You have to give it a plot object as input. I usually like to str() the result of ggplot_build to see all the values it contains.
For example, I see that there is a panel --> ranges --> y.major_source vector that seems to be what you're looking for. So to complete your example:
p <- ggplot() +
geom_bar(data = iris, aes(x = Species, y = Sepal.Width), stat = 'identity')
pb <- ggplot_build(p)
str(pb)
y.ticks <- pb$panel$ranges[[1]]$y.major_source
p + geom_hline(aes(yintercept = y.ticks), colour = 'white')
Note that I moved the data argument from the main ggplot function to inside geom_bar, so that geom_hline would not try to use the same dataset and throw errors when the number of rows in iris is not a multiple of the number of lines we're drawing. Another option would be to pass a data = data.frame() argument to geom_hline; I cannot comment on which one is the more correct solution, or whether there's a nicer solution altogether. But the gist of my code still holds :)
For ggplot2 3.1.0 this worked for me:
ggplot_build(p)$layout$panel_params[[1]]$y.major_source
#[1] 0 50 100 150
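The location of the breaks inside the built plot has moved between releases, so a version-tolerant lookup may be safer. This is a sketch under assumptions: it relies on internal structure that is not a documented API, and it assumes that newer releases (3.3.0 onwards, as far as I can tell) keep the breaks under panel_params[[1]]$y$breaks.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_bar(stat = "identity")

params <- ggplot_build(p)$layout$panel_params[[1]]

# Older releases expose y.major_source; newer ones (assumed) store the
# breaks on the y view scale instead.
y_ticks <- if (!is.null(params$y.major_source)) {
  params$y.major_source
} else {
  params$y$breaks
}
y_ticks <- y_ticks[!is.na(y_ticks)]  # breaks outside the axis range can be NA

# Passing yintercept outside aes() avoids any clash with the rows of iris.
p + geom_hline(yintercept = y_ticks, colour = "white")
```

Setting yintercept as a plain parameter rather than inside aes() also sidesteps the data-length problem, because the layer then builds its own data from the vector.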
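Sure you can. Read the help file for the seq() function.
seq(from = min(x), to = max(x), length.out = 5)
Here x stands for the values shown on the y axis, so you can do something like this:
p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_bar(stat = 'identity')
# Assumption: the bars are stacked, so the axis runs from 0 to the largest per-species total
totals <- tapply(iris$Sepal.Width, iris$Species, sum)
p + geom_hline(yintercept = seq(from = 0, to = max(totals), length.out = 5), colour = 'white')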
I have a plot created in ggplot2 that uses scale_fill_gradientn. I'd like to add text at the minimum and maximum of the scale legend. For example, at the legend minimum display "Minimum" and at the legend maximum display "Maximum". There are posts using discrete fills and adding labels with numbers instead of text (e.g. here), but I am unsure how to use the labels feature with scale_fill_gradientn to only insert text at the min and max. At present I keep getting errors:
Error in scale_labels.continuous(scale, breaks) :
Breaks and labels are different lengths
Is this text label possible within ggplot2 for this type of scale / fill?
# The example code here produces a plot for illustrative purposes only.
# create data frame, from ggplot2 documentation
df <- expand.grid(x = 0:5, y = 0:5)
df$z <- runif(nrow(df))
#plot
ggplot(df, aes(x, y, fill = z)) + geom_raster() +
scale_fill_gradientn(colours=topo.colors(7),na.value = "transparent")
For scale_fill_gradientn() you should provide both arguments, breaks= and labels=, with the same length. With the limits= argument you extend the colorbar to the minimum and maximum values you need.
ggplot(df, aes(x, y, fill = z)) + geom_raster() +
scale_fill_gradientn(colours=topo.colors(7),na.value = "transparent",
breaks=c(0,0.5,1),labels=c("Minimum",0.5,"Maximum"),
limits=c(0,1))
User Didzis Elfert's answer slightly lacks "automatism" in my opinion (but it of course points to the core of the problem, +1 :).
Here is an option to programmatically define the minimum and maximum of your data.
Advantages:
You will not need to hard code values any more (which is error prone)
You will not need to hard code the limits (which is also error-prone)
Passing a named vector: You don't need the labels argument (manually mapping labels to values is also error-prone).
As a side effect you will avoid the "non-matching labels/breaks" problem
library(ggplot2)
foo <- expand.grid(x = 0:5, y = 0:5)
foo$z <- runif(nrow(foo))
myfuns <- list(Minimum = min, Mean = mean, Maximum = max)
ls_val <- unlist(lapply(myfuns, function(f) f(foo$z)))
# you only need to set the breaks argument!
ggplot(foo, aes(x, y, fill = z)) +
geom_raster() +
scale_fill_gradientn(
colours = topo.colors(7),
breaks = ls_val
)
# You can obviously also replace the middle value with something else
ls_val[2] <- 0.5
names(ls_val)[2] <- 0.5
ggplot(foo, aes(x, y, fill = z)) +
geom_raster() +
scale_fill_gradientn(
colours = topo.colors(7),
breaks = ls_val
)