How to break axis in R/ggplot2? [duplicate] - r

I'm generating plots for some data, but the number of ticks is too small; I need more precision when reading values off the axis.
Is there some way to increase the number of axis ticks in ggplot2?
I know I can tell ggplot to use a vector as axis ticks, but what I want is to increase the number of ticks for all data. In other words, I want the number of ticks to be calculated from the data.
ggplot probably does this internally with some algorithm, but I couldn't find how it works or how to change it to do what I want.

You can override ggplot2's default scales by modifying scale_x_continuous and/or scale_y_continuous. For example:
library(ggplot2)
dat <- data.frame(x = rnorm(100), y = rnorm(100))
ggplot(dat, aes(x,y)) +
geom_point()
This gives you a scatter plot with the default tick marks.
Overriding the scales puts a tick every 0.5 units instead:
ggplot(dat, aes(x,y)) +
geom_point() +
scale_x_continuous(breaks = round(seq(min(dat$x), max(dat$x), by = 0.5),1)) +
scale_y_continuous(breaks = round(seq(min(dat$y), max(dat$y), by = 0.5),1))
If you simply want to "zoom" in on a specific part of a plot, look at xlim() and ylim(); note that they drop observations outside the limits, whereas coord_cartesian() zooms without dropping data. Good insight can also be found here to understand the other arguments as well.

Based on Daniel Krizian's comment, you can also use the pretty_breaks function from the scales package, which is installed together with ggplot2 (hence the scales:: prefix):
ggplot(dat, aes(x,y)) + geom_point() +
scale_x_continuous(breaks = scales::pretty_breaks(n = 10)) +
scale_y_continuous(breaks = scales::pretty_breaks(n = 10))
All you have to do is set n to the approximate number of ticks you want.
A slightly less useful solution (since you have to specify the data variable again) is the built-in pretty function:
ggplot(dat, aes(x,y)) + geom_point() +
scale_x_continuous(breaks = pretty(dat$x, n = 10)) +
scale_y_continuous(breaks = pretty(dat$y, n = 10))
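Note that n is a target rather than an exact count: pretty() prefers "round" step sizes and may return more or fewer values than requested. A minimal base R sketch, using hypothetical axis limits that are not part of the original answers:

```r
# Hypothetical axis limits, chosen only for illustration
limits <- c(-2.7, 3.1)

# pretty() treats n as a hint and picks a "nice" step size
breaks_5  <- pretty(limits, n = 5)
breaks_10 <- pretty(limits, n = 10)

print(breaks_5)
print(breaks_10)

# The returned breaks always cover the requested range
range(breaks_10)
```

Asking for a larger n generally yields a finer step, which is why the same call adapts to whatever data range it is given.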

You can supply a function as the breaks argument of a scale, and ggplot will use that function to calculate the tick locations from the axis limits.
library(ggplot2)
dat <- data.frame(x = rnorm(100), y = rnorm(100))
number_ticks <- function(n) {function(limits) pretty(limits, n)}
ggplot(dat, aes(x,y)) +
geom_point() +
scale_x_continuous(breaks=number_ticks(10)) +
scale_y_continuous(breaks=number_ticks(10))
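The same closure pattern can produce ticks at a fixed spacing instead of a target count. This is only a sketch; breaks_every is a hypothetical helper name, not a ggplot2 or scales function:

```r
library(ggplot2)

# Hypothetical helper: returns a function that places a tick every `step`
# units, snapped to multiples of `step` that enclose the limits
breaks_every <- function(step) {
  function(limits) {
    seq(floor(limits[1] / step) * step,
        ceiling(limits[2] / step) * step,
        by = step)
  }
}

set.seed(1)
dat <- data.frame(x = rnorm(100), y = rnorm(100))

p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  scale_x_continuous(breaks = breaks_every(0.5)) +
  scale_y_continuous(breaks = breaks_every(0.25))
print(p)
```

Because the function receives the limits at plot time, the ticks follow the data without naming dat$x or dat$y again.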

Starting from v3.3.0, ggplot2 has an n.breaks option that automatically generates breaks for scale_x_continuous and scale_y_continuous:
library(ggplot2)
plt <- ggplot(mtcars, aes(x = mpg, y = disp)) +
geom_point()
plt +
scale_x_continuous(n.breaks = 5)
plt +
scale_x_continuous(n.breaks = 10) +
scale_y_continuous(n.breaks = 10)
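To check which breaks n.breaks actually produced, you can ask the built plot for its scales. A small sketch, assuming ggplot2 3.3.0 or later; the number of breaks returned may differ slightly from the number requested, since the algorithm favours nice labels:

```r
library(ggplot2)

plt <- ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point() +
  scale_x_continuous(n.breaks = 10)

# layer_scales() builds the plot and returns its x and y scales
x_breaks <- layer_scales(plt)$x$get_breaks()

# Breaks that fall outside the axis limits may come back as NA
x_breaks <- x_breaks[!is.na(x_breaks)]
print(x_breaks)
```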

Additionally, the following works for binned or discrete-scaled x-axis data (i.e., rounding is not necessary):
ggplot(dat, aes(x,y)) +
geom_point() +
scale_x_continuous(breaks = seq(min(dat$x), max(dat$x), by = 0.05))

A reply to this question and How set labels on the X and Y axises by equal intervals in R ggplot?
library(ggplot2)
library(magrittr)  # provides the %>% pipe
mtcars %>%
ggplot(aes(mpg, disp)) +
geom_point() +
geom_smooth() +
scale_y_continuous(limits = c(0, 500),
breaks = seq(0,500,50)) +
scale_x_continuous(limits = c(0,40),
breaks = seq(0,40,5))

Related

box plot in R with additional point

I have a dataframe of multiple columns (let's say n) with different range and a vector of length n. I want different x-axis for each variable to be shown below each box plot. I tried facet_grid and facet_wrap but it gives common x axis.
This is what I have tried:
library(ggplot2)
library(tidyr)  # provides gather()
d <- data.frame(matrix(rnorm(10000), ncol = 20))
point_var <- rnorm(20)
plot.data <- gather(d, variable, value)
plot.data$test_data <- rep(point_var, each = nrow(d))
ggplot(plot.data, aes(x=variable, y=value)) +
geom_boxplot() +
geom_point(aes(x=factor(variable), y = test_data), color = "red") +
coord_flip() +
xlab("Variables") +
theme(legend.position="none")
If you can live with having the text of the x axis above the plot and the order of the graphs being a bit messed up, this could work:
library(grid)
p = ggplot(plot.data, aes(x = 0, y=value)) +
geom_boxplot() +
geom_point(aes(x = 0, y = test_data), color = "red") +
facet_wrap(~variable, scales = "free_y", switch = "y") +
xlab("Variables") +
theme_bw() + theme(legend.position = "none", axis.text.x = element_blank())
print(p, vp=viewport(angle=270, width = unit(.75, "npc"), height = unit(.75, "npc")))
I'm actually just creating the graph without flipping the coordinates, so that scales = 'free_y' works, switching the position of the strip labels, and then rotating the graph.
If you don't like the text above graph (which is understandable), I would consider creating a list of single plots and then putting them together with grid.arrange.
HTH,
Lorenzo
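The grid.arrange alternative mentioned in the answer above could look roughly like this. It is a sketch only: it assumes the gridExtra package is installed and uses a smaller data set (4 variables) than the question to keep the output readable:

```r
library(ggplot2)
library(gridExtra)  # assumed to be installed; provides grid.arrange()

set.seed(1)
d <- data.frame(matrix(rnorm(400), ncol = 4))
point_var <- rnorm(4)

# One flipped box plot per column, each with its own axis and red test point
plots <- lapply(seq_along(d), function(i) {
  ggplot(data.frame(value = d[[i]]), aes(x = "", y = value)) +
    geom_boxplot() +
    geom_point(data = data.frame(value = point_var[i]), colour = "red") +
    coord_flip() +
    labs(x = names(d)[i], y = NULL)
})

# Stack the individual plots in a single column
grid.arrange(grobs = plots, ncol = 1)
```

Since every plot is built separately, each one gets its own value axis below the box, at the cost of repeating the axis for every variable.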

How can I flip and then zoom in on a boxplot?

Consider the following code:
library(ggplot2)
ggplot(diamonds, aes("", price)) + geom_boxplot() + coord_flip()
After flipping the box plot, how can I zoom in to c(0,7000) on price (which is the new x-axis)?
I feel like it has something to do with coord_cartesian(ylim=c(0, 7000)), but this doesn't seem to work in conjunction with coord_flip().
Here is my solution:
ggplot(diamonds, aes("", price)) +
geom_boxplot() +
coord_flip(ylim=c(0, 7000))
Just pass ylim as an argument to coord_flip().
You can use scale_y_continuous():
library(ggplot2)
ggplot(diamonds, aes("", price)) +
geom_boxplot() +
coord_flip() +
scale_y_continuous(limits = c(0, 7000))
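Remember that coord_flip() just rotates the plot, so you call scale_ on the y axis, which is what price is mapped to. I usually like to call it last for that reason: to help limit confusion over which axis is which! Be aware that scale limits remove the values outside the range before the box plot statistics are computed, so the box itself changes; this is not a pure zoom.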
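A rough way to see how much the scale-limit approach alters the box, compared with zooming through coord_flip(ylim = ...), is to compare the statistics with and without the out-of-range prices. This sketch uses only base R functions on the diamonds data that ships with ggplot2:

```r
library(ggplot2)  # only needed here for the diamonds data set

# What the zoomed plot is based on: all prices
full_median <- median(diamonds$price)

# What scale_y_continuous(limits = c(0, 7000)) keeps: prices inside the limits
kept <- diamonds$price[diamonds$price >= 0 & diamonds$price <= 7000]
trimmed_median <- median(kept)

cat("rows dropped by the scale limits:", nrow(diamonds) - length(kept), "\n")
cat("median, all data:    ", full_median, "\n")
cat("median, trimmed data:", trimmed_median, "\n")
```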
I think you need to manually compute the boxplot statistics and plot these.
# Compute summary statistics with max (y100) set to cutoff (7000)
df <- data.frame(x = 1,
y0 = min(diamonds$price),
y25 = quantile(diamonds$price, 0.25),
y50 = median(diamonds$price),
y75 = quantile(diamonds$price, 0.75),
y100 = 7000
)
ggplot(df, aes(x)) +
geom_boxplot(aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
stat = "identity") +
coord_flip()
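If you would rather not type the five statistics by hand, base R's boxplot.stats() returns them in one call. This is a sketch under the assumption that Tukey-style hinges and whiskers are acceptable; they can differ slightly from the quantiles ggplot2 computes itself:

```r
library(ggplot2)

# stats = lower whisker, lower hinge, median, upper hinge, upper whisker
s <- boxplot.stats(diamonds$price)$stats

df <- data.frame(x = "price",
                 y0 = s[1], y25 = s[2], y50 = s[3], y75 = s[4], y100 = s[5])

# Draw the precomputed box, then zoom without recomputing anything
p <- ggplot(df, aes(x)) +
  geom_boxplot(aes(ymin = y0, lower = y25, middle = y50,
                   upper = y75, ymax = y100),
               stat = "identity") +
  coord_flip(ylim = c(0, 7000))
print(p)
```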

Histogram with ggplot2: change xticks, percentage of y

I have a vector x
x = sample (1:3000, 20000, replace = T)
I tried to plot histogram of x
ggplot() + aes(x)+ geom_histogram() + scale_x_log10() + geom_bar(aes(y = (..count..)/sum(..count..))) + scale_y_continuous(labels = scales::percent)
I have two problems:
Why does the y axis show 200%, 400%? That is impossible for a histogram.
How can I customize the x-tick values? I want to display x-ticks as 0, 1, 2 ... 10, 20, 30 ... 10, 100.
Thanks a lot
I cleaned your code a bit. You included one geom too many (both geom_histogram() and geom_bar()), which is what causes the weird y-axis. You can control the ticks with the breaks argument of the scale functions.
Try this:
df <- data.frame(x = sample(1:3000, 20000, replace = TRUE))
ggplot(df, aes(x = x)) +
geom_histogram(aes(y = (..count..)/sum(..count..))) +
scale_y_continuous(labels = scales::percent) +
scale_x_log10(breaks=c(1,100,1000))
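To confirm that the bars now add up to 100%, you can pull the computed bar heights out of the plot. The sketch below assumes ggplot2 3.3.0 or later and uses after_stat(), the newer spelling of the ..count.. notation; the seed, the bin count and the extra break at 10 are arbitrary choices for illustration:

```r
library(ggplot2)

set.seed(42)
df <- data.frame(x = sample(1:3000, 20000, replace = TRUE))

p <- ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), bins = 30) +
  scale_y_continuous(labels = scales::percent) +
  scale_x_log10(breaks = c(1, 10, 100, 1000))

# layer_data() returns the data the layer actually draws
bar_heights <- layer_data(p)$y
sum(bar_heights)
```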

geom_histogram and data labels [duplicate]

The code below works well and labels the bar plot correctly. However, if I try geom_text for a histogram I fail, since geom_text requires a y component and a histogram's y component is not part of the original data.
Labeling an "ordinary" bar plot (geom_bar(stat = "identity")) works well:
ggplot(csub, aes(x = Year, y = Anomaly10y, fill = pos)) +
geom_bar(stat = "identity", position = "identity") +
geom_text(aes(label = Anomaly10y,vjust=1.5))
My problem: how do I get the correct y and label (indicated by ?) for geom_text, to put labels on top of the histogram bars?
ggplot(csub,aes(x = Anomaly10y)) +
geom_histogram() +
geom_text(aes(label = ?, vjust = 1.5))
geom_text requires x, y and labels. However, y and labels are not in the original data, but generated by the geom_histogram function. How can I extract the necessary data to position labels on a histogram?
geom_histogram() is just a fancy wrapper around stat_bin, so you can call that yourself with the bars and text that you like. Here's an example:
#sample data
set.seed(15)
csub<-data.frame(Anomaly10y = rpois(50,5))
And then we plot it with
ggplot(csub,aes(x=Anomaly10y)) +
stat_bin(binwidth=1) + ylim(c(0, 12)) +
stat_bin(binwidth=1, geom="text", aes(label=..count..), vjust=-1.5)
to get a histogram with each bin's count printed above its bar.
To make it more aesthetically appealing, here is another solution:
set.seed(15)
csub <- data.frame(Anomaly10y = rpois(50, 5))
Now plot it:
library(ggplot2)
library(magrittr)  # provides the %>% pipe
csub %>%
ggplot(aes(Anomaly10y)) +
geom_histogram(binwidth=1) +
stat_bin(binwidth=1, geom='text', color='white', aes(label=..count..),
position=position_stack(vjust = 0.5))
The resulting plot shows each count in white, centered inside its bar.
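If you need the bin positions and counts as plain data, as the question asks, one option is to extract them from the built plot and feed them to geom_text. A minimal sketch, assuming layer_data() from ggplot2 and the same sample data as above:

```r
library(ggplot2)

set.seed(15)
csub <- data.frame(Anomaly10y = rpois(50, 5))

p <- ggplot(csub, aes(x = Anomaly10y)) +
  geom_histogram(binwidth = 1)

# Bin centres (x) and counts as computed by stat_bin
bins <- layer_data(p)[, c("x", "count")]
print(bins)

# Use the extracted values as an ordinary data frame for the labels
p + geom_text(data = bins, aes(x = x, y = count, label = count), vjust = -0.5)
```

This keeps the labels in a regular data frame, which is convenient if you want to filter them (for example, to hide zero counts) before plotting.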
