How to handle yaxs in geom_bar in ggplot2? [duplicate] - r

Is it possible to only set the lower bound of a limit for continuous scale? I want to make all my plots 0 based without needing to specify the upper limit bound.
e.g.
+ scale_y_continuous(minlim=0)

You can use expand_limits
ggplot(mtcars, aes(wt, mpg)) + geom_point() + expand_limits(y=0)
Here is a comparison of the two plots, without expand_limits and with expand_limits (figures not reproduced here).
As of version 1.0.0 of ggplot2, you can specify only one limit and have the other be as it would be normally determined by setting that second limit to NA. This approach will allow for both expansion and truncation of the axis range.
ggplot(mtcars, aes(wt, mpg)) + geom_point() +
scale_y_continuous(limits = c(0, NA))
Specifying it via ylim(c(0, NA)) gives an identical figure.
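For the stated goal of making every plot zero-based, the one-sided limit can be wrapped in a small helper. This is only a sketch: the name zero_based_y() is hypothetical and not part of ggplot2; it simply forwards to scale_y_continuous(limits = c(0, NA)).

```r
library(ggplot2)

# Hypothetical convenience wrapper (not part of ggplot2): fixes the lower
# y limit at 0 and leaves the upper limit to be computed from the data.
zero_based_y <- function(...) {
  scale_y_continuous(limits = c(0, NA), ...)
}

p_zero <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  zero_based_y()
```

Keep in mind that a scale limit also discards observations below 0 by default, which is the truncation mentioned above; expand_limits(y = 0) never removes data.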

How about using aes(ymin=0), as in:
ggplot(mtcars, aes(wt, mpg)) + geom_point() + aes(ymin=0)

You can also try the following, which puts the minimum of the y-axis at zero and removes the extra gap between the x-axis and that minimum.
scale_y_continuous(limits = c(0, NA), expand = c(0,0))
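Note that expand = c(0, 0) removes the padding at both ends, so the largest value touches the top of the panel. A possible variant, assuming ggplot2 3.3.0 or later where the expansion() helper is available, removes the padding only at the bottom and keeps 5% headroom at the top. The bar chart on mtcars is just an illustration.

```r
library(ggplot2)

# No padding below zero, 5% multiplicative padding above the maximum.
bottom_flush <- expansion(mult = c(0, 0.05))

p_flush <- ggplot(mtcars, aes(factor(cyl))) +
  geom_bar() +
  scale_y_continuous(limits = c(0, NA), expand = bottom_flush)
```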

I don't think you can do this directly. But as a work-around, you can mimic the way that ggplot2 determines the upper limit:
scale_y_continuous(limits=c(0, max(mydata$y) * 1.1))
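The factor 1.1 is only a rough stand-in. As far as I know, the default for continuous scales is to pad each end by 5% of the range of the limits, so the displayed range can be computed by hand. The sketch below does that arithmetic for mtcars with a lower y limit of 0; the variable names are illustrative.

```r
library(ggplot2)

y_limits <- c(0, max(mtcars$mpg))   # desired limits before padding
pad      <- 0.05 * diff(y_limits)   # 5% of the limit range on each side
y_padded <- y_limits + c(-pad, pad) # range that ends up being displayed

# expand = FALSE stops coord_cartesian() from padding a second time.
# It affects both axes, so the x range is padded by hand as well.
x_limits <- range(mtcars$wt)
x_padded <- x_limits + c(-1, 1) * 0.05 * diff(x_limits)

p_manual <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  coord_cartesian(xlim = x_padded, ylim = y_padded, expand = FALSE)
```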

Related

Automatically setting the limits of an axis when the other is manually defined in ggplot2

Consider the following example:
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
coord_cartesian(c(15, 20))
It sets the limits of the x-axis, but the y limits remain the same as in the original plot, leaving a huge empty area.
Is it possible to automatically adjust y limits in this case? Similar to what
ggplot(mtcars[mtcars$mpg>15&mtcars$mpg<20,], aes(mpg, wt)) +
geom_point()
would produce.
Such automation would make it unnecessary to calculate the y limits manually (which is not even trivial unless expand=0, as one has to take into account how the y limits are expanded compared to what is provided).
Why don't you just set the y limits too?
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
coord_cartesian(xlim = c(15, 20), ylim = c(2.5,4.5))
Of course, you can calculate the limits beforehand with some function, but I'm not sure that makes sense: to calculate the limits in a region, you have to tell that function what the limits of the region are, which is the same amount of manual effort as putting those limits into the ggplot call directly.
Such a function could look like this:
find_ylimits <- function(data,xlim,overhead = 1){
filter <- xlim[1] <= data[[1]] & data[[1]] <= xlim[2]
c(min(data[[2]][filter])*overhead,
max(data[[2]][filter])*overhead)
}
And then you could make the plot as follows:
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
coord_cartesian(xlim = c(15, 20), ylim = find_ylimits(mtcars[,c("mpg","wt")],c(15,20)))
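A variant of this helper, sketched below, takes the column names explicitly and pads by a fraction of the y range instead of multiplying both ends by the same factor. The name find_ylimits2 and its pad argument are illustrative and do not come from any package.

```r
library(ggplot2)

# Hypothetical helper: y range of the rows whose x value lies inside xlim,
# widened on both sides by `pad` times that range.
find_ylimits2 <- function(data, x, y, xlim, pad = 0) {
  keep <- data[[x]] >= xlim[1] & data[[x]] <= xlim[2]
  rng  <- range(data[[y]][keep], na.rm = TRUE)
  rng + c(-1, 1) * pad * diff(rng)
}

xlim_zoom <- c(15, 20)
ylim_zoom <- find_ylimits2(mtcars, "mpg", "wt", xlim_zoom, pad = 0.05)

p_zoom <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  coord_cartesian(xlim = xlim_zoom, ylim = ylim_zoom)
```

With pad = 0 the helper returns the plain range of the filtered y values, and coord_cartesian() then applies its own default expansion on top.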

Scale geom_density to match geom_bar with percentage on y

Since I was confused about the math last time I tried asking this, here's another try. I want to combine a histogram with a smoothed distribution fit. And I want the y axis to be in percent.
I can't find a good way to get this result. Last time, I managed to find a way to scale the geom_bar to the same scale as geom_density, but that's the opposite of what I wanted.
My current code produces this output:
ggplot2::ggplot(iris, aes(Sepal.Length)) +
geom_bar(stat="bin", aes(y=..density..)) +
geom_density()
The density and bar y values match up, but the scaling is nonsensical. I want percentages on the y axis, not, well, the density.
Some new attempts. We begin with a bar plot modified to show percentages instead of counts:
gg = ggplot2::ggplot(iris, aes(Sepal.Length)) +
geom_bar(aes(y = ..count../sum(..count..))) +
scale_y_continuous(name = "%", labels=scales::percent)
Then we try to add a geom_density to that and somehow get it to scale properly:
gg + geom_density()
gg + geom_density(aes(y=..count..))
gg + geom_density(aes(y=..scaled..))
gg + geom_density(aes(y=..density..))
Same as the first.
gg + geom_density(aes(y = ..count../sum(..count..)))
gg + geom_density(aes(y = ..count../n))
Seems to be off by about factor 10...
gg + geom_density(aes(y = ..count../n/10))
same as:
gg + geom_density(aes(y = ..density../10))
But ad hoc inserting numbers seems like a bad idea.
One useful trick is to inspect the calculated values of the plot. These are not normally saved in the object if one saves it. However, one can use:
gg_data = ggplot_build(gg + geom_density())
gg_data$data[[2]] %>% View
Since we know the density fit around x=6 should be about .04 (4%), we can look around for ggplot2-calculated values that get us there, and the only thing I see is density/10.
How do I get geom_density fit to scale to the same y axis as the modified geom_bar?
Bonus question: why is the grouping of the bars different? The current function does not leave spaces between bars.
Here is an easy solution:
library(scales) # ! important
library(ggplot2)
ggplot(iris, aes(Sepal.Length)) +
stat_bin(aes(y=..density..), breaks = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), by = .1), color="white") +
geom_line(stat="density", size = 1) +
scale_y_continuous(labels = percent, name = "percent") +
theme_classic()
Try this
ggplot2::ggplot(iris, aes(x=Sepal.Length)) +
geom_histogram(stat="bin", binwidth = .1, aes(y=..density..)) +
geom_density()+
scale_y_continuous(breaks = c(0, .1, .2,.3,.4,.5,.6),
labels =c ("0", "1%", "2%", "3%", "4%", "5%", "6%") ) +
ylab("Percent of Irises") +
xlab("Sepal Length in Bins of .1 cm")
I think your first example is what you want; you just need to change the labels so that they read as percentages, so do that rather than rescaling anything.
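The factor of about 10 has a simple explanation: a density is a proportion per unit of x, so the proportion in a bin is roughly the density times the bin width, and the bins here are about 0.1 wide. Multiplying the density by the bin width therefore puts the curve on the same scale as the proportion bars. The sketch below assumes ggplot2 3.3.0 or later for after_stat(); the bin width bw is a value chosen for illustration.

```r
library(ggplot2)

bw <- 0.2  # bin width, chosen for illustration

p_pct <- ggplot(iris, aes(Sepal.Length)) +
  # bar height = share of observations in each bin
  geom_histogram(aes(y = after_stat(count / sum(count))),
                 binwidth = bw, colour = "white") +
  # density * bin width = approximate share per bin
  geom_density(aes(y = after_stat(density * bw))) +
  scale_y_continuous(name = "Percent", labels = scales::percent)

# The same identity checked with base R's hist():
h <- hist(iris$Sepal.Length, breaks = seq(4, 8, by = bw), plot = FALSE)
share_from_counts  <- h$counts / sum(h$counts)
share_from_density <- h$density * bw
```

Both share vectors agree bin by bin, which is why dividing the density by 10 appeared to work when the bins were 0.1 wide.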

ggplot, ggplotly, scale_y_continuous, ylim and percentage

I would like to plot a graph where the y axis is in percentage:
p = ggplot(test, aes(x=creation_date, y=value, color=type)) +
geom_line(aes(group=type)) +
scale_colour_manual(values=c("breach"="red","within_promise"="green","before_promise"="blue")) +
geom_vline(xintercept=c(as.numeric(as.Date('2016-05-14'))),linetype="dotted") +
scale_y_continuous(labels=percent)
ggplotly()
Now I would like to set the upper limit of the y axis to 100%:
p = ggplot(test, aes(x=creation_date, y=value, color=type)) +
geom_line(aes(group=type)) +
scale_colour_manual(values=c("breach"="red","within_promise"="green","before_promise"="blue")) +
geom_vline(xintercept=c(as.numeric(as.Date('2016-05-14'))),linetype="dotted") +
scale_y_continuous(labels=percent) +
ylim(0, 1)
ggplotly()
But the result is the same as the previous plot; the y axis limits are unchanged.
It works when I don't put the y axis to be in percent:
p = ggplot(test, aes(x=creation_date, y=value, color=type)) +
geom_line(aes(group=type)) +
scale_colour_manual(values=c("breach"="red","within_promise"="green","before_promise"="blue")) +
geom_vline(xintercept=c(as.numeric(as.Date('2016-05-14'))),linetype="dotted") +
ylim(0, 1)
ggplotly()
Moreover, with ggplotly, when the y axis is set to percent and I hover over a point of the graph, the value is not shown in percent.
I'm aware it's been a while since you asked, but you could use limits inside scale_y_continuous(), like this:
scale_y_continuous(labels = scales::percent, limits=c(0,1))
Minor suggested edit to the response above:
It seems that you have to specify the limits within the scale_y_continuous call prior to setting the values as percentages:
scale_y_continuous(limits=c(0,1), labels = scales::percent)
As you have not given the dataset, I am making my best guess.
You need to pass the limits option to scale_y_continuous. As you can see, ylim does not override the settings made by scale_y_continuous. Use a single function to configure the y-axis: either ylim or scale_y_continuous.
I had a similar issue and neither solution worked for me. It's clear that we can't combine scale_y_continuous with ylim. Setting the limits parameter within scale_y_continuous caused some errors. However, as suggested in the docs, we can use coord_cartesian() in combination with scale_y_continuous. The final code would be something like this:
...+
coord_cartesian(ylim=c(0.50, 0.75)) +
scale_y_continuous(labels = scales::percent)
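Since the original data set was not posted, the following sketch uses a small made-up data frame (share_data, an assumption) to put the two options side by side: limits inside scale_y_continuous(), which by default discards values outside the range, and coord_cartesian(), which only zooms.

```r
library(ggplot2)

# Made-up data standing in for the unpublished `test` data frame.
share_data <- data.frame(
  creation_date = as.Date("2016-05-10") + 0:5,
  value         = c(0.52, 0.58, 0.61, 0.66, 0.70, 0.74)
)

base_plot <- ggplot(share_data, aes(creation_date, value)) + geom_line()

# Option 1: one scale call carries both the labels and the limits.
p_scale <- base_plot +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1))

# Option 2: keep the percent labels on the scale and zoom with the coord.
p_coord <- base_plot +
  scale_y_continuous(labels = scales::percent) +
  coord_cartesian(ylim = c(0.50, 0.75))
```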

In ggplot2, how can I limit the range of geom_hline?

Taking a simple plot from ggplot2 manual
p <- ggplot(mtcars, aes(x = wt, y=mpg)) + geom_point()
p + geom_hline(yintercept=20)
I get a horizontal line at value 20, as advertised.
Is there a way to limit the range of this line on the x axis to, let's say, the range 2 to 4?
You can use geom_segment() instead of geom_hline() and provide x= and xend= values you need.
p+geom_segment(aes(x=2,xend=4,y=20,yend=20))
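Because the values inside aes() are constants, that layer is evaluated once per row of mtcars and draws the same segment repeatedly. An alternative sketch, using annotate() to draw a single segment with the same coordinates:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# One horizontal segment at mpg = 20, spanning wt from 2 to 4.
p_line <- p + annotate("segment", x = 2, xend = 4, y = 20, yend = 20)
```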

ggplot2: Issues with changing binwidth of stacked histogram

Having issues changing the binwidth of stacked histogram created with ggplot2.
It does not error out but seems to be ignoring the binwidth setting.
ggplot(trade.a, aes(x=variable1,y=value ,fill=category)) +
geom_bar(stat = "identity", binwidth=c(0,300),position ='fill') +
xlim(0, 300) +
xlab("Variable1") +
ylab("Count") +
ggtitle("Category") +
scale_y_continuous(labels = percent_format()) +
theme_grey(base_size = 20)
Any ideas?
Using stat="identity" inside geom_bar means that the data in trade.a has already been binned and counted (which is also implied by specifying a y aesthetic that points into the trade.a data). binwidth is an argument to stat_bin, which does the aggregation for you; it was the default stat for geom_bar when this was written, whereas in current ggplot2 it is the stat behind geom_histogram and geom_bar defaults to stat_count. (Additionally, binwidth takes only a single value; the breaks argument can take a vector of breakpoints.) Thus, to change the bin width for the trade.a data, you need to go back to the step where you did the binning, or start with unbinned data and let stat_bin do the binning with the binwidth specified.
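To illustrate the second option, the sketch below starts from unbinned observations and lets the histogram do the binning. The data frame trade_raw is made up for the example; only the column names variable1 and category echo the question.

```r
library(ggplot2)

set.seed(1)
# Made-up, unbinned observations (one row per observation).
trade_raw <- data.frame(
  variable1 = runif(200, min = 0, max = 300),
  category  = sample(c("A", "B", "C"), 200, replace = TRUE)
)

p_hist <- ggplot(trade_raw, aes(x = variable1, fill = category)) +
  # binwidth takes a single number; boundary aligns the bin edges with 0
  geom_histogram(binwidth = 50, boundary = 0, position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Variable1", y = "Share of observations")
```

Changing binwidth here does change the bars, because the counting happens inside the layer rather than beforehand.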
