I am making a plot in ggplot2 where on the y axis I have the indices of groups and on the x axis some information. For readability I would like to make the labels bigger but then they start overlapping. Therefore I would like to put the labels into two columns as shown in the figure so they can be bigger. Is there a way to do this in ggplot? I tried vjust and hjust but they only seem to accept 1 argument applying to all labels.
Current labels:
Objective labeling:
Well, there is no obvious parameter responsible for that, at least AFAIK.
However, for your specific goal my first thought was to add some spaces to numeric labels.
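library(ggplot2)

# Pad every second label with a trailing space so that alternate labels
# are shifted left; add more spaces to widen the offset
avoid_overlap <- function(x) {
  ind <- seq_along(x) %% 2 == 0
  x[ind] <- paste0(x[ind], " ")
  x
}

ggplot(mtcars, aes(cyl, mpg)) + geom_point() +
  scale_y_continuous(breaks = 10:35, labels = avoid_overlap(10:35)) +
  theme(axis.text.y = element_text(size = 32))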
Play with grid lines (minor/major) via theme if the grid is too dense.
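If you are on ggplot2 3.3.0 or later, one alternative worth trying is the n.dodge argument of guide_axis(), which staggers axis labels over several columns without editing the label text. This is only a sketch: the break range is carried over from the example above, the text size is an arbitrary choice, and dropping the minor grid lines is just one way to thin the grid.

```r
library(ggplot2)

# Assumes ggplot2 >= 3.3.0, where guide_axis() was introduced
p <- ggplot(mtcars, aes(cyl, mpg)) +
  geom_point() +
  scale_y_continuous(
    breaks = 10:35,
    guide = guide_axis(n.dodge = 2)      # alternate labels between two columns
  ) +
  theme(
    axis.text.y = element_text(size = 20),  # arbitrary size for illustration
    panel.grid.minor = element_blank()      # thin out the grid
  )
p
```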
I'm currently working on a sales dataset, and I have been trying to make a barchart, showcasing the units sold by each country, with the colouring of the charts being the number of inhabitants in each country.
so far I have used the following code to get the chart
ChartProductsbyCountry <- ggplot(Dataframe10, aes(x=Country, y=SumAct_U)) + geom_col(aes(fill=Inhabitants_2019)) + theme(axis.text.x = element_text(angle = 90, size = 10))
library(scales)
ChartProductsbyCountry + scale_y_continuous(labels = comma)
This gives me the following chart
Barchart
I'm already very satisfied with it, however I would like to change some things but don't know how:
on the right side, showing the "colouring labels/legend", it does not show the actual numbers but rather 2e+07, 4e+07, etc... how can I change it to showing Numbers instead? And as for the y-axis, how should I change my code to have ticks going from 0 to 1.250.000,00 at every 125.000 (Units) (so starting with 0, then 125.000, then 250.000, then 375.000, ...)?
OP, as mentioned by @stefan in the comment, you can use the breaks= argument to control the spacing of the ticks on the axis. To format the colorbar legend items, you can use scale_fill_continuous() in the same manner you did for the y axis. Here's an example:
library(ggplot2)
library(scales)
set.seed(8675309)
df <- data.frame(x=LETTERS, y=sample(0:20, replace=TRUE, size=26) * 1000000)
p <-
ggplot(df, aes(x,y, fill=y)) + geom_col() +
scale_y_continuous(labels = comma)
p
To set the number formatting in the colorbar, you can use labels = comma within scale_fill_continuous(). To change the ticks on the y axis, you can access that via the breaks= argument. Essentially, send a vector of values to breaks= to have the labels marked where you want. In this case, I'll set it up to make a tick mark every 1 million, and then set the colorbar to use comma format:
ggplot(df, aes(x,y, fill=y)) + geom_col() +
scale_y_continuous(labels = comma, breaks=seq(0, max(df$y), by=1000000)) +
scale_fill_continuous(labels=comma)
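For the specific numbers in the question (a tick every 125,000 from 0 to 1,250,000, written with "." as the thousands separator), something along these lines should work. It is a sketch that assumes a version of scales providing label_number() (1.1.0 or later); the sales data frame is a made-up stand-in that only reuses the question's column names.

```r
library(ggplot2)
library(scales)

# Hypothetical stand-in for the question's data
set.seed(1)
sales <- data.frame(
  Country = LETTERS[1:6],
  SumAct_U = round(runif(6, 1e5, 1.25e6)),
  Inhabitants_2019 = round(runif(6, 1e6, 8e7))
)

# European-style labels: "." as thousands separator, "," as decimal mark
eu <- label_number(accuracy = 1, big.mark = ".", decimal.mark = ",")

p <- ggplot(sales, aes(Country, SumAct_U, fill = Inhabitants_2019)) +
  geom_col() +
  scale_y_continuous(
    breaks = seq(0, 1250000, by = 125000),  # one tick every 125,000
    limits = c(0, 1250000),                 # make sure the full range is shown
    labels = eu
  ) +
  scale_fill_continuous(labels = eu)        # plain numbers in the colorbar
p
```

Setting limits as well as breaks matters here: breaks outside the axis range are silently dropped, so without limits the top ticks only appear if the data reach them.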
I'm trying to force the grid of a scatter plot to be composed of squares, with x and y values that have different ranges.
I tried to force a square shape of the whole plot (aspect.ratio=1), but this does not solve the problem of different ranges. Then I tried to change limits of values of my axes.
1)Here is what I tried first:
p + theme(aspect.ratio = 1) +
coord_fixed(ratio=1, xlim = c(-0.050,0.050),ylim = c(-0.03,0.03))
2) I changed the ratio by using the range of the values for each axis:
p + coord_fixed(ratio=0.06/0.10, xlim = c(-0.050,0.050), ylim = c(-0.03,0.03))
3)Then I changed the limits of y to match those of x:
p + theme(aspect.ratio = 1) +
coord_fixed(ratio=1, xlim = c(-0.050,0.050),ylim = c(-0.05,0.05))
1) The grid in the background is composed of rectangles.
2) I expected this to change the position of the tick marks automatically to give me a grid composed of squares. Still rectangles.
3) It obviously worked because I matched the ranges of x and y, but there was a lot of empty space in the graph.
Is there something else I should try?
Thanks in advance.
If you want the plot to be square and you want the grid to be square you can do this by rescaling the y variable to be on the same scale as the x variable (or vice versa) for plotting, and then inverting the rescaling to generate the correct axis value labels for the rescaled axis.
Here's an example using the mtcars data frame, and we'll use the rescale function from the scales package.
First let's create a plot of mpg vs. hp but with the hp values rescaled to be on the same scale as mpg:
library(tidyverse)
library(scales)
theme_set(theme_bw())
p = mtcars %>%
mutate(hp.scaled = rescale(hp, to=range(mpg))) %>%
ggplot(aes(mpg, hp.scaled)) +
geom_point() +
coord_fixed() +
labs(x="mpg", y="hp")
Now we can invert the rescaling to generate the correct value labels for hp. We do that below by supplying the inverting function to the labels argument of scale_y_continuous:
p + scale_y_continuous(labels=function(x) rescale(x, from=range(mtcars$mpg), to=range(mtcars$hp)))
But note that rescaling back to the original hp scale results in non-pretty breaks. We can fix that by generating pretty breaks on the hp scale, rescaling those to the mpg scale to get the locations where we want the tick marks and then inverting that to get the label values. However, in that case we won't get a square grid if we want to keep the overall plot panel square:
p + scale_y_continuous(breaks = rescale(pretty_breaks(n=5)(mtcars$hp),
from=range(mtcars$hp),
to=range(mtcars$mpg)),
labels = function(x) rescale(x, from=range(mtcars$mpg), to=range(mtcars$hp)))
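The rescale-and-invert step used in this answer can be checked numerically outside of any plot. The sketch below uses only scales::rescale() and the mtcars columns from the example; the variable names are mine.

```r
library(scales)

hp  <- mtcars$hp
mpg <- mtcars$mpg

# Forward: map hp onto the mpg range (these are the values that get plotted)
hp_scaled <- rescale(hp, to = range(mpg))

# Inverse: map positions on the mpg scale back to hp (these become the labels).
# `from` has to be given explicitly; if it is omitted, rescale() uses the range
# of whatever vector it receives, which for axis breaks is not range(mpg).
hp_back <- rescale(hp_scaled, from = range(mpg), to = range(hp))

all.equal(hp_back, hp)
```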
I'm not sure what code you are using, it is missing in block 1 and 3. But using the mtcars data set the following works:
library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
coord_fixed(ratio = 1) +
scale_x_continuous(breaks = seq(10, 35, 1)) +
scale_y_continuous(breaks = seq(1, 6, 1))
The last two lines make it clear that 1 point on the x-axis is equal to 1 point on the y-axis.
In the documentation you will further find the following advice:
ensures that the ranges of axes are equal to the specified ratio by
adjusting the plot aspect ratio
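As a sketch of how this applies to the ranges in the question (the data frame d below is made up), keeping coord_fixed(ratio = 1) and placing breaks at the same spacing on both axes should give square grid cells without stretching the y limits to match x. The panel itself is then no longer square, which is the trade-off.

```r
library(ggplot2)

# Made-up data covering the ranges mentioned in the question
set.seed(1)
d <- data.frame(
  x = runif(50, -0.05, 0.05),
  y = runif(50, -0.03, 0.03)
)

p <- ggplot(d, aes(x, y)) +
  geom_point() +
  # one unit on x is drawn as long as one unit on y
  coord_fixed(ratio = 1, xlim = c(-0.05, 0.05), ylim = c(-0.03, 0.03)) +
  # identical break spacing on both axes, hence square grid cells
  scale_x_continuous(breaks = seq(-0.05, 0.05, by = 0.01)) +
  scale_y_continuous(breaks = seq(-0.03, 0.03, by = 0.01))
p
```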
I have a set of code that produces multiple plots using facet_wrap:
ggplot(summ,aes(x=depth,y=expr,colour=bank,group=bank)) +
geom_errorbar(aes(ymin=expr-se,ymax=expr+se),lwd=0.4,width=0.3,position=pd) +
geom_line(aes(group=bank,linetype=bank),position=pd) +
geom_point(aes(group=bank,pch=bank),position=pd,size=2.5) +
scale_colour_manual(values=c("coral","cyan3", "blue")) +
facet_wrap(~gene,scales="free_y") +
theme_bw()
With the reference datasets, this code produces figures like this:
I am trying to accomplish two goals here:
Keep the auto scaling of the y axis, but make sure only 1 decimal place is displayed across all the plots. I have tried creating a new column of the rounded expr values, but it causes the error bars to not line up properly.
I would like to wrap the titles. I have tried changing the font size as in Change plot title sizes in a facet_wrap multiplot, but some of the gene names are too long and will end up being too small to read if I cram them on a single line. Is there a way to wrap the text, using code within the facet_wrap statement?
This probably cannot serve as a definitive answer, but here are some pointers regarding your questions:
Formatting the y-axis scale labels.
First, let's try the direct solution using format function. Here we format all y-axis scale labels to have 1 decimal value, after rounding it with round.
library(ggplot2)

# Returns a label function that rounds to 1 decimal and passes ... on to format()
formatter <- function(...){
  function(x) format(round(x, 1), ...)
}
mtcars2 <- mtcars
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() + facet_wrap(~cyl, scales = "free_y")
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 1))
The issue is, sometimes this approach is not practical. Take the leftmost plot from your figure, for example. Using the same formatting, all y-axis scale labels would be rounded up to -0.3, which is not preferable.
The other solution is to modify the breaks for each plot into a set of rounded values. But again, taking the leftmost plot of your figure as an example, it'll end up with just one label point, -0.3
Yet another solution is to format the labels in scientific notation. For simplicity, you can modify the formatter function as follows:
formatter <- function(...){
  function(x) format(x, ..., scientific = TRUE, digits = 2)
}
Now you can have a uniform format for all of plots' y-axis. My suggestion, though, is to set the label with 2 decimal places after rounding.
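To see why one decimal place can be too coarse, it helps to call the label function directly on some break values. This is a small base-R sketch; the break values are invented to mimic a panel whose y range is very narrow.

```r
# Label function factory: round to `digits` decimals, then format with ...
formatter <- function(digits, ...) {
  function(x) format(round(x, digits), ...)
}

one_decimal  <- formatter(1, nsmall = 1)
two_decimals <- formatter(2, nsmall = 2)

wide_breaks   <- c(10, 12.5, 15)          # breaks more than 0.1 apart
narrow_breaks <- c(-0.32, -0.30, -0.28)   # breaks less than 0.1 apart

one_decimal(wide_breaks)     # distinct labels
one_decimal(narrow_breaks)   # all three labels collapse to the same text
two_decimals(narrow_breaks)  # labels are distinct again
```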
Wrap facet titles
This can be done using labeller argument in facet_wrap.
# Modify cyl into factors
mtcars2$cyl <- c("Four Cylinder", "Six Cylinder", "Eight Cylinder")[match(mtcars2$cyl, c(4,6,8))]
# Redraw the graph
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() +
facet_wrap(~cyl, scales = "free_y", labeller = labeller(cyl = label_wrap_gen(width = 10)))
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 2))
It must be noted that the wrap function detects space to separate labels into lines. So, in your case, you might need to modify your variables.
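If the facet names contain no spaces (for example identifiers joined by underscores), one option is a small custom labeller instead of changing the data. The sketch below is an assumption about what such names might look like: wrap_label() is a hypothetical helper, and the gene names and data are invented.

```r
library(ggplot2)

# Hypothetical helper: turn "_" and "." into spaces, then wrap at `width`
wrap_label <- function(x, width = 10) {
  x <- gsub("[_.]", " ", as.character(x))
  vapply(
    x,
    function(s) paste(strwrap(s, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
}

# Invented data with long, space-free facet names
set.seed(1)
d <- data.frame(
  gene  = rep(c("heat_shock_protein_70", "carbonic_anhydrase"), each = 5),
  depth = rep(1:5, 2),
  expr  = rnorm(10)
)

p <- ggplot(d, aes(depth, expr)) +
  geom_point() +
  facet_wrap(~gene, scales = "free_y", labeller = as_labeller(wrap_label))
p
```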
This only solves the first part of the question. You can create a function to format your axis labels and use scale_y_continuous to apply it.
library(ggplot2)
library(reshape2)

df <- data.frame(x=rnorm(11), y1=seq(2, 3, 0.1) + 10, y2=rnorm(11))
df <- melt(df, 'x')

# Before
ggplot(df, aes(x=x, y=value)) + geom_point() +
  facet_wrap(~ variable, scales="free")

# Label function: round to one decimal and always print that decimal
f <- function(x){
  format(round(x, 1), nsmall=1)
}

# After
ggplot(df, aes(x=x, y=value)) + geom_point() +
  facet_wrap(~ variable, scales="free") +
  scale_y_continuous(labels=f)
scale_*_continuous(..., labels = function(x) sprintf("%0.0f", x)) worked in my case.
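The same sprintf() idea gives exactly one decimal place, as asked in the question, by changing the format string to "%0.1f". A minimal sketch on mtcars, with one_dp as a name chosen here for illustration:

```r
library(ggplot2)

# Label function: always print exactly one decimal place
one_dp <- function(x) sprintf("%0.1f", x)

p <- ggplot(mtcars, aes(mpg, qsec)) +
  geom_point() +
  facet_wrap(~cyl, scales = "free_y") +   # y axes are still scaled per panel
  scale_y_continuous(labels = one_dp)
p
```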
I am plotting a forest plot in ggplot2 and am having issues with the ordering of the labels in the legend matching the order of the labels in the data set. Here is my code below.
data code
d<-data.frame(x=c("Co-K(W) N=720", "IH-K(W) N=67", "IF-K(W) N=198", "CO-K(B)N=78", "IH-K(B) N=13", "CO=A(W) N=874","D-Sco Ad(W) N=346","DR-Ad (W) N=892","CE_A(W) N=274","CO-Ad(B) N=66","D-So Ad(B) N=215","DR-Ad(B) N=123","CE-Ad(B) N=79"),
y = rnorm(13, 0, 0.1))
d <- transform(d, ylo = y-1/13, yhi=y+1/13)
d$x <- factor(d$x, levels=rev(d$x)) # reverse ordering
forest plot code
credplot.gg <- function(d){
  # d is a data frame with 4 columns
  # d$x gives variable names
  # d$y gives center point
  # d$ylo gives lower limits
  # d$yhi gives upper limits
  require(ggplot2)
  p <- ggplot(d, aes(x=x, y=y, ymin=ylo, ymax=yhi, group=x, colour=x)) +
    geom_pointrange(size=1) +
    theme_bw() +
    scale_color_discrete(name="Sample") +
    coord_flip() +
    theme(legend.key=element_rect(fill='cornsilk2')) +
    guides(colour = guide_legend(override.aes = list(size=0.5))) +
    # reference line at y = 0 (drawn vertically because of coord_flip)
    geom_hline(yintercept=0, colour = 'red', lty=2) +
    xlab('Cohort') + ylab('CI') + ggtitle('Forest Plot')
  return(p)
}
credplot.gg(d)
This is what I get. As you can see, the labels on the y axis match the order in the data. However, the legend is not in the same order. I'm not sure how to correct this. This is my first time creating a plot in ggplot2. Any feedback is appreciated. Thanks in advance.
Nice plot, especially for a first ggplot! I've not tested it, but I think all you need is to add reverse=TRUE inside your colour's guide_legend() (found this in the Cookbook for R).
If I were to make one more comment, I'd say that ordering your vertical factor by numeric value often makes comparisons easier when alphabetical order isn't particularly meaningful. (Though maybe your alpha order is meaningful.)
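Both suggestions can be sketched on a reduced, made-up version of the data (three cohorts with placeholder names instead of thirteen). reverse = TRUE is an argument of guide_legend(), and base R's reorder() sorts factor levels by a numeric value.

```r
library(ggplot2)

# Small made-up data set in the same shape as the question's
set.seed(42)
d <- data.frame(x = c("Cohort A", "Cohort B", "Cohort C"), y = rnorm(3, 0, 0.1))
d <- transform(d, ylo = y - 0.1, yhi = y + 0.1)
d$x <- factor(d$x, levels = rev(d$x))   # reversed levels, as in the question

# 1) Reverse the legend so that it matches the flipped axis
p <- ggplot(d, aes(x = x, y = y, ymin = ylo, ymax = yhi, colour = x)) +
  geom_pointrange() +
  coord_flip() +
  guides(colour = guide_legend(reverse = TRUE,
                               override.aes = list(size = 0.5)))
p

# 2) Alternatively, order the factor by the plotted value
d$x_by_value <- reorder(d$x, d$y)
levels(d$x_by_value)
```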
I am trying to put multiple ggplot2 time series plots on a page using the gridExtra package's arrange() function. Unfortunately, I am finding that the x-axis labels get pushed together; it appears that the plot is putting the same number of x-axis labels as a full-page chart, even though my charts only take up 1/4 of a page. Is there a better way to do this? I would prefer not to have to manually set any points, since I will be dealing with a large number of charts that span different date ranges and have different frequencies.
Here is some example code that replicates the problem:
dfm <- data.frame(index=seq(from=as.Date("2000-01-01"), length.out=100, by="year"),
x1=rnorm(100),
x2=rnorm(100))
mydata <- melt(dfm, id="index")
pdf("test.pdf")
plot1 <- ggplot(mydata, aes(index, value, color=variable))+geom_line()
plot2 <- ggplot(mydata, aes(index, value, color=variable))+geom_line()
plot3 <- ggplot(mydata, aes(index, value, color=variable))+geom_line()
plot4 <- ggplot(mydata, aes(index, value, color=variable))+geom_line()
arrange(plot1, plot2, plot3, plot4, ncol=2, nrow=2)
dev.off()
either rotate the axis labels
+ opts(axis.text.x=theme_text(angle=45, hjust=1))
Note that opts is deprecated in current versions of ggplot2. This functionality has been moved to theme():
+ theme(axis.text.x = element_text(angle = 45, hjust = 1))
or dilute the x-axis
+scale_x_datetime(major = "10 years")
to automatically shift the labels, I think the arrange() function needs to be fiddled with (though I'm not sure how).
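In current ggplot2 (2.0.0 and later, as far as I know) the major argument no longer exists; the spacing is set with date_breaks instead. A sketch of both suggestions in present-day syntax, using a cut-down version of the question's data:

```r
library(ggplot2)

set.seed(1)
dfm <- data.frame(
  index = seq(from = as.Date("2000-01-01"), length.out = 100, by = "year"),
  x1 = rnorm(100)
)

# index is a Date, so scale_x_date() applies (scale_x_datetime() is for POSIXct)
p <- ggplot(dfm, aes(index, x1)) +
  geom_line() +
  scale_x_date(date_breaks = "10 years", date_labels = "%Y") +  # dilute the axis
  theme(axis.text.x = element_text(angle = 45, hjust = 1))      # rotate labels
p
```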
I wrote this function to return the proper major axis breaks given that you want some set number of major breaks.
year.range.major <- function(df, column = "index", n = 5){
range <- diff(range(df[,column]))
range.num <- as.numeric(range)
major = max(pretty((range.num/365)/n))
return(paste(major,"years"))
}
So, instead of always fixing the breaks at 10 years, it'll produce a fixed number of breaks at nice intervals.
+scale_x_date(major = year.range.major(mydata))
or
+scale_x_date(major = year.range.major(mydata, n=3))
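With current versions of ggplot2 and scales, a similar "roughly n breaks at nice intervals" effect can be had without a custom helper by passing scales::pretty_breaks() as the breaks function. The sketch below makes a few assumptions: make_plot() is a hypothetical wrapper, the data are cut down from the question's example, and gridExtra::grid.arrange() stands in for the arrange() call in the question.

```r
library(ggplot2)
library(scales)

set.seed(1)
dfm <- data.frame(
  index = seq(from = as.Date("2000-01-01"), length.out = 100, by = "year"),
  x1 = rnorm(100)
)

# Hypothetical wrapper: about n date breaks, whatever range the data span
make_plot <- function(df, n = 4) {
  ggplot(df, aes(index, x1)) +
    geom_line() +
    scale_x_date(breaks = pretty_breaks(n = n), date_labels = "%Y")
}

# Four panels covering different date ranges
plots <- list(
  make_plot(dfm),
  make_plot(dfm[1:50, ]),
  make_plot(dfm[1:25, ]),
  make_plot(dfm[51:100, ])
)

# Arrange on one page if gridExtra is installed
if (requireNamespace("gridExtra", quietly = TRUE)) {
  gridExtra::grid.arrange(grobs = plots, ncol = 2)
}
```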