Proper line labelling within canvas in ggplot2 - r

Problem description
I'm writing a function that outputs a line plot ggplot2 object. I would like to have an argument that can control whether to add labels at the end of each line. A visual example is found here. The difficulty lies in the variable length of line labels. Ideally, the function will be smart enough to figure out the proper extra space to expand for the labels on the right.
In base R, there is a function graphics::strwidth that computes how many inches are needed to draw a given string. I was wondering if there is a way to go one step further, i.e. to map that physical width onto the data scale of the x-axis. A dummy example is provided below for better explanation.
A dummy example
library(directlabels)
library(reshape2)
library(ggplot2)
dts <- cbind(`my long group 1` = mdeaths, `my long group 2` = fdeaths, time = 1:length(mdeaths))
ddf <- melt(as.data.frame(dts), id = "time")
names(ddf) <- c("time", "group", "deaths")
plot_wo_label <- ggplot(ddf, aes(x = time, y = deaths, group = group)) + geom_line()
plot_with_label <- plot_wo_label + geom_dl(aes(label = group), method = list(dl.combine('last.points')))
Plot with line label
As we can see above, the long line labels ('my long group 1' and 'my long group 2') get truncated due to the margin space. An ad hoc solution is to use xlim to expand the right edge of x-axis by trial and error. But that certainly is not an option in my case.
I know there are posted solutions that turn off clipping (like here). However, some of the lines may end early, and a label placed at the canvas edge, far away from the end of its line, would be hard to associate with the corresponding line.
So if there is a way to figure out how much space a string of arbitrary length will occupy on the x-axis (in the dummy example, the "duration" of label "my long group 1" in the "time" axis), that would be very helpful. But this is just one possible direction in my mind, other solutions are welcome and greatly appreciated!
Thanks!

The difficulty is that the absolute sizes of the text labels stay the same when you change the rendered output size of the plot. As a result, when you make the rendered size of the plot larger, the text labels span a smaller fraction of the plot area, and vice versa.
There's probably a way to generate the plot, dig into the grob structure of the plot to get the label width in plot coordinates, and then use scale_x_continuous to adjust the plot's x limits to include all of the label text. Unfortunately, I'm not sure how to do that, but hopefully someone else will come along who does.
For now, here's a demonstration of the issue. (I've switched to geom_text to place the labels, as I don't think directlabels is necessary here.):
library(tidyverse)
ggplot(ddf, aes(x = time, y = deaths, group = group)) +
geom_line() +
geom_text(data=ddf %>% group_by(group) %>% filter(time==max(time)),
aes(label=group), hjust=0, position=position_nudge(x=0.5)) +
scale_x_continuous(limits=c(0,1.15*max(ddf$time)))
Here's a screenshot of two saved versions of the plot, one version saved as a 700x350 pixel png file and the other saved as a 500x250 pixel png file. You can see that the absolute font sizes are the same, even though the sizes of the plots are different.
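The dependence on output size can be made explicit with a little arithmetic. The sketch below is only an illustration under stated assumptions: extra_x_for_label is a hypothetical helper (not part of ggplot2), the label width in inches is taken as given, and the panel width in inches has to be known or estimated for the device you save to. If the label must occupy a fraction f = label width / panel width of the final axis, the data span has to grow by span * f / (1 - f).

```r
# Hypothetical helper (not a ggplot2 function): how many data units to add
# on the right of the x-axis so that a label of a known physical width fits.
# Assumes the panel width in inches is known for the target output size.
extra_x_for_label <- function(label_width_in, panel_width_in, x_range) {
  stopifnot(label_width_in < panel_width_in)
  frac <- label_width_in / panel_width_in   # share of the panel the label needs
  data_span <- diff(range(x_range))         # current span in data units
  data_span * frac / (1 - frac)             # extra data units to append
}

# Example with made-up sizes: a 1 inch label, a 5 inch wide panel, x from 0 to 72
extra <- extra_x_for_label(1, 5, c(0, 72))
new_limits <- c(0, 72 + extra)
```

The label width itself could come from graphics::strwidth(label, units = "inches"), but that measures text in the base graphics font settings, which need not match the font size geom_text uses, so treat the result as an estimate and add a small safety margin.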

Related

Adjust bin width with geom_col()

I'm having trouble with the bin width of my column charts.
I'm trying to show the highest/lowest suicide rate by country on the same page using shiny.
But the country names are overlapping one another, as you can see below:
How can I adjust this?
There are quite a few possibilities for resizing and adjusting things so that your long axis labels "fit" on a column plot (or any other ggplot for that matter). I'll go through some options here.
First of all... a sample dataset, since we did not get a suitable reprex from your question.
df <- data.frame(
x=c('Text1', 'Text2', 'Long Text Here', 'Really Really Long Text Label Here', 'Text5', 'Text6', 'Text7'),
y=c(sample(1:20, 7, replace=TRUE)))
df$x <- factor(df$x, levels=df$x) # making sure ggplot doesn't alphabetically sort!
library(ggplot2)
p <- ggplot(df, aes(x,y)) +
geom_col(aes(fill=x), show.legend = FALSE)
p
There's your overlapping labels. Now for some options:
Option #1: Resize the plot
One very simple way of "solving" the problem is to realize that R handles graphics... kind of funny. The look of a particular plot depends on the resolution and aspect ratio of the graphics device. Not only that, but text does not scale the same way as the other plot elements. This means that you can fix the problem by forcing a different aspect ratio.
# this was used to create the above plot
ggsave('original.png', width=8, height=5)
# changing the aspect ratio produces the plot below on my graphics device
ggsave('resized.png', width=12, height=5)
Option #2: Change the Text Size
The other option is to make your text size for the axis labels smaller. The result is really similar to just resizing the plot.
p + theme(axis.text.x=element_text(size=6))
Option #3: Angle the Text
One really good option is to angle your text using the theme() element again. Note that when you do this you want to change the default alignment of your labels. Set hjust=1 so that the text is "right aligned". If you are setting your angle to 90°, you will also want to set your vjust=0.5 to make the text aligned with the tick mark vertically. Here I'll show you a 45° angled text option:
p + theme(axis.text.x=element_text(angle=45, hjust=1))
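For completeness, here is a small sketch of the 90° case mentioned above, where vjust=0.5 is added so each label is centred on its tick mark. The data frame is rebuilt with fixed y values (an assumption, just so the snippet runs on its own).

```r
library(ggplot2)

# Stand-in data so the snippet is self-contained (y values are arbitrary)
df <- data.frame(
  x = c('Text1', 'Text2', 'Long Text Here',
        'Really Really Long Text Label Here', 'Text5', 'Text6', 'Text7'),
  y = c(5, 12, 7, 18, 3, 9, 14))
df$x <- factor(df$x, levels = df$x)  # keep the original order

p90 <- ggplot(df, aes(x, y)) +
  geom_col(aes(fill = x), show.legend = FALSE) +
  # vertical labels: right-aligned (hjust) and centred on the tick (vjust)
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
p90
```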
Option #4: Wrap Text Labels
One of my go-to favorite options here is to wrap the text label. There are a few ways to do this, but I prefer using wrap_format() from the scales package and a scale_* function. Note, the number given to wrap_format(X) indicates that wrapping should happen after X number of characters in the label.
library(scales)
p + scale_x_discrete(labels=wrap_format(22))
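To see what the wrapping does to a single label before putting it on a plot, you can call the function that wrap_format() returns directly. This is just a quick check, using the longest label from the sample data.

```r
library(scales)

wrap22 <- wrap_format(22)   # returns a function that wraps character vectors
long_label <- "Really Really Long Text Label Here"
wrapped <- wrap22(long_label)

# cat() shows where the line break was inserted
cat(wrapped, "\n")
```

The break is inserted between words, so no word is split; a label shorter than the width is returned unchanged.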
Option #5: Combine all Above
The best way to fix your problem is to use a combination of all techniques above to get the chart to look the way you believe looks most satisfying. This will depend on how many columns you have in your shiny plot and how you generate that plot (user input or always the same, etc). So it's up to you here.
p + scale_x_discrete(name=NULL, labels=wrap_format(22)) +
scale_y_continuous(expand=expansion(mult=c(0,0.15))) +
theme_classic() + #important to put this before overwriting individual theme elements!
theme(
axis.text.x=element_text(angle=40, hjust=1, size=15),
axis.text.y=element_text(size=15),
panel.grid.major.y=element_line(color='gray75', linetype=2))
You can also flip your graph so the bars run horizontally, which makes long labels easier to read. Add this to your plot to flip it:
coord_flip()

How to create colorbars in ggplot similar to those created by Lattice

I want colourbars created with ggplot to be similar to what spplot function (from lattice package) creates. Something like the attached image with each finite number of colours being assigned to rectangular blocks, instead of creating a continuous spectrum of colours. I need to be able to define the outline colour of the colourbar and also the format of the ticks.
I put this simple example together. How can I change this into something similar to this attached image? For example, I want the legend to start from -3 and end at 3 with 10 blocks of colours. I already tried 'nbin' in the function 'guides'. But I need the labels to be put at the 'edges' of the colour blocks instead of at the middle of them (i.e. centre of the bins).
PS: Sometimes ggplot also draws labels beyond the ends of the colourbar!
library(ggplot2)
dat <- data.frame(x = rnorm(100), y = rnorm(100), col=rnorm(100))
ggplot(dat, aes(x,y,color=col)) +
geom_point() +
scale_color_gradient2(limits=c(-3,3), midpoint=0) +
guides(color=guide_colourbar(nbin=10, raster=FALSE))
I think what you ask for is not possible using the latest (public) version of ggplot2.
Ugly method, do at your own discretion
However, if you install the development version (this led to some version conflicts with other packages on my machine and I guess some things are not fully working yet) using
devtools::install_github("tidyverse/ggplot2")
library(ggplot2)
You will get some more options to modify guides such as ticks.colour, frame.colour or frame.linewidth which lets you customize the colorbar according to your requirements:
set.seed(6)
dat <- data.frame(x = rnorm(100), y = rnorm(100), z=rnorm(100))
ggplot(dat, aes(x,y,color=z)) + geom_point() +
scale_color_gradientn(colours=c("blue","gray80","red"), limits=c(-3,3),
breaks=c(-3/9*8,-3/9*4,0,3/9*4,3/9*8), labels=c(-2.4,-1.2,0,1.2,2.4), na.value = "green",
guide=guide_colorbar(nbin=10, raster=FALSE, barwidth=20, frame.colour="black",
frame.linewidth=1, ticks.colour="black", direction="horizontal")) +
theme(legend.position = "bottom")
Use colours = c() to specify a vector of colors
Use breaks together with labels to manually assign labels at the correct positions along the colorbar. EDIT: We can easily compute the required position along the colorbar by dividing 3 (the length of one half along the colorbar) by 9 (there are 9 half-boxes from the middle of the bar to the centre of the first box) and multiplying that by the number of half-boxes where we want the label to appear.
Values outside of limits will be colored according to na.value
You could additionally specify name = "Your Variable Name" to replace the z next to the colorbar
I see no way to put -3 / 3 at the very ends of the color bar, other than manually placing a text element at the correct position in the plot (which I would strongly advise against).
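A note for readers on more recent releases: as far as I know, ggplot2 3.3.0 and later ship binned colour scales (scale_colour_steps2() with guide_coloursteps()), which draw discrete blocks with the labels at the block edges. The sketch below assumes such a version is installed; the colours and the 0.6 block width (10 blocks between -3 and 3) are my choices, and I have not tried to reproduce the frame and tick styling here.

```r
library(ggplot2)

set.seed(6)
dat <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

# 9 interior breaks plus the two limits give 10 colour blocks of width 0.6
p_steps <- ggplot(dat, aes(x, y, colour = z)) +
  geom_point() +
  scale_colour_steps2(
    low = "blue", mid = "gray80", high = "red", midpoint = 0,
    limits = c(-3, 3),
    breaks = seq(-2.4, 2.4, by = 0.6),
    # show.limits = TRUE asks the guide to label the ends (-3 and 3) as well
    guide = guide_coloursteps(show.limits = TRUE, even.steps = TRUE)
  ) +
  theme(legend.position = "bottom")
p_steps
```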

How to set the height of grid rows in line graphs with ggplot (R)?

I'm trying to plot a line graph using the ggplot2 library in R. The plot looks fine, but I need to reduce the spacing (height) between the horizontal grid rows, because the lines end up very far apart.
This is my R script:
library(ggplot2)
library(reshape2)
data <- read.csv('/Users/keepo/Desktop/G.Con/Int18/input-int18.csv')
chart_data <- melt(data, id='NRO')
names(chart_data) <- c('NRO', 'leyenda', 'DTF')
ggplot() +
geom_line(data = chart_data, aes(x = NRO, y = DTF, color = leyenda), size = 1)+
xlab("iteraciones") +
ylab("valores")
and this is my actual graphs:
...the first line is very far from the second. How can I reduce the height?
Regards.
The lines are far apart because the values of the variable plotted on the y-axis are far apart. If you need them closer together, you fundamentally have 3 options:
change the scale (e.g. convert the plot to a log scale), although this can make it harder for people to interpret the numbers. This can also change the behavior of each line, not just change the space between the lines. I'm guessing this isn't what you will want, ultimately.
normalize the data. If the actual value of the variable on the y-axis isn't important, just standardize the data (separately for each value of leyenda).
As stated above, you can graph each line separately. The main drawback here is that you need 3 graphs where 1 might do.
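Here is a rough sketch of the last two options. The data frame is a made-up stand-in for the long-format chart_data in the question (same column names, invented values), so treat the numbers as placeholders.

```r
library(ggplot2)

# Hypothetical stand-in for the melted data from the question
chart_data <- data.frame(
  NRO     = rep(1:10, times = 3),
  leyenda = rep(c("a", "b", "c"), each = 10),
  DTF     = c(800 + 1:10, 2500 + 2 * (1:10), 3100 + (1:10)^2)
)

# Normalize: z-score DTF separately within each value of leyenda
chart_data$DTF_z <- ave(chart_data$DTF, chart_data$leyenda,
                        FUN = function(v) (v - mean(v)) / sd(v))

p_norm <- ggplot(chart_data, aes(NRO, DTF_z, colour = leyenda)) +
  geom_line()

# Separate graphs: one panel per series, each with its own y-axis range
p_facet <- ggplot(chart_data, aes(NRO, DTF)) +
  geom_line() +
  facet_wrap(~ leyenda, ncol = 1, scales = "free_y")
```

After standardizing, every series has mean 0 and standard deviation 1, so the lines overlap in one panel, but the y-axis no longer shows the original units.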
Not recommended:
I know that some graphs use a "squiggle" to change scales or skip space. Generally, this is considered poor practice (and I doubt it's an option in ggplot2) because it masks the true separation between the data points. If you really do want a gap, I would look at this post: axis.break and ggplot2 or gap.plot? plot may be too complexe
In a nutshell, the answer here depends on what your numbers mean. What is the story you are trying to tell? Is the important feature of your plots the change between them (in which case, normalizing might be your best option), or the actual numbers themselves (in which case, the space is relevant).
you could use an axis transformation that maps your data to the screen in a non-linear fashion,
fun_trans <- function(x){
d <- data.frame(x=c(800, 2500, 3100), y=c(800,1950, 3100))
model1 <- lm(y~poly(x,2), data=d)
model2 <- lm(x~poly(y,2), data=d)
scales::trans_new("fun",
function(x) as.vector(predict(model1,data.frame(x=x))),
function(x) as.vector(predict(model2,data.frame(y=x))))
}
last_plot() + scale_y_continuous(trans = "fun")

Set categorical axis labels with scales "free" ggplot2

I am trying to set the labels on a categorical axis within a faceted plot using the ggplot2 package (1.0.1) in R (3.1.1) with scales="free". If I plot without manually setting the axis tick labels they appear correctly (first plot), but when I try to set the labels (second plot) only the first n labels are used on both facets (not in sequence as with the original labels).
Here is a reproducible code snippet exemplifying the problem:
foo <- data.frame(yVal=factor(letters[1:8]), xVal=factor(rep(1:4,2)), fillVal=rnorm(8), facetVar=rep(1:2,each=4))
## axis labels are correct
p <- ggplot(foo) + geom_tile(aes(x=xVal, y=yVal, fill=fillVal)) + facet_grid(facetVar ~ ., scales='free')
print(p)
## axis labels are not set correctly
p <- p + scale_y_discrete(labels=c('a','a','b','b','c','d','d','d'))
print(p)
I note that I cannot set the labels correctly within the data.frame as they are not unique. I am also aware that I can accomplish this with grid.arrange, but that requires "manually" aligning the plots if the labels have different lengths, etc. Additionally, I would like to have the facet labels included in the plot, which is not an available option with the grid.arrange solution. I haven't tried viewports yet. Maybe that is the solution, but I was hoping for more of the faceted look to this plot, and viewports seem to be more similar to grid.arrange.
It seems to me as though this is a bug, but I am open to an explanation as to how this might be a "feature". I also hope that there might be a simple solution to this problem that I have not thought of yet!
The easiest method would be to create another column in your data set with the right conversion. This would also be easier to audit and manipulate. If you insist on changing manually:
You cannot simply set the labels directly, as it recycles (I think) the label vector for each facet. Instead, you need to set up a conversion using corresponding breaks and labels:
p <- p + scale_y_discrete(labels = c('1','2','3','4','5','6','7','8'), breaks=c('a','b','c','d','e','f','g','h'))
print(p)
Any y axis value of a will now be replaced with 1, b with 2 and so on. You can play around with the label values to see what I mean. Just make sure that every factor value you have is also represented in the breaks argument.
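A variation on the same idea, sketched here under the assumption that you want the non-unique labels from the question: pass a function as labels and let it look each break up in a named vector. The name lookup is mine, and set.seed is only there to make the fill values repeatable.

```r
library(ggplot2)

set.seed(1)
foo <- data.frame(yVal = factor(letters[1:8]), xVal = factor(rep(1:4, 2)),
                  fillVal = rnorm(8), facetVar = rep(1:2, each = 4))

# Named vector: factor level -> label to display (labels may repeat)
lookup <- c(a = "a", b = "a", c = "b", d = "b",
            e = "c", f = "d", g = "d", h = "d")

p_lab <- ggplot(foo) +
  geom_tile(aes(x = xVal, y = yVal, fill = fillVal)) +
  facet_grid(facetVar ~ ., scales = "free") +
  # the function receives the breaks of each panel and translates them by name
  scale_y_discrete(labels = function(b) unname(lookup[b]))
print(p_lab)
```

Because the lookup is done by name rather than by position, it does not matter which subset of levels ends up in which facet.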
I think I may actually have a solution to this. My problem was that my labels were not correct because as someone above has said - it seems like the label vector is recycled through. This line of code gave me incorrect labels.
ggplot(dat, aes(x, y))+geom_col()+facet_grid(d ~ t, switch = "y", scales = "free_x")+ylab(NULL)+ylim(0,10)+geom_text(aes(label = x))
However when the geom_text was moved prior to the facet_grid, the below code gave me correct labels.
ggplot(dat, aes(x, y))+geom_col()+geom_text(aes(label = x))+facet_grid(d ~ t, switch = "y", scales = "free_x")+ylab(NULL)+ylim(0,10)
There's a good chance I may have misunderstood the problem above, but I certainly solved my problem so hopefully this is helpful to someone!

How to plot matrix with background color varying according to entry?

I wanted to ask for any general idea about plotting this kind of plot in R which can compare for example the overlaps of different methods listed on the horizontal and vertical side of the plot? Any sample code or something
Many thanks
A ggplot2-example:
# data generation
df <- matrix(runif(25), nrow = 5)
# bring data to long format
require(reshape2)
dfm <- melt(df)
# plot
require(ggplot2)
ggplot(dfm, aes(x = Var1, y = Var2)) +
geom_tile(aes(fill = value)) +
geom_text(aes(label = round(value, 2)))
The corrplot package and corrplot function in that package will create plots similar to what you show above, that may do what you want or give you a starting point.
If you want more control then you could plot the colors using the image function, then use the text function to add the numbers. You can either make the margins large enough to place the text in them (see the axis function for the common way to add text labels in the margin), or leave enough space internally (maybe use rasterImage instead of image) and use text to do the labelling. Look at the xpd argument to par if you want to add the lines, and at the grconvertX and grconvertY functions to help with the coordinates of the line segments.
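A minimal base-graphics sketch of the image-plus-text approach, with invented method names (M1 to M5) and random values standing in for real overlaps:

```r
# Made-up 5 x 5 matrix; the dimnames are placeholder method names
set.seed(42)
m <- matrix(runif(25), nrow = 5,
            dimnames = list(paste0("M", 1:5), paste0("M", 1:5)))
n <- nrow(m)

# image() draws m[i, j] at x = i, y = j, so rows run along the x-axis
image(1:n, 1:n, m, axes = FALSE, xlab = "", ylab = "",
      col = heat.colors(12))
axis(1, at = 1:n, labels = rownames(m), tick = FALSE)
axis(2, at = 1:n, labels = colnames(m), tick = FALSE, las = 1)

# Write each rounded value in the centre of its cell
cells <- expand.grid(row = 1:n, col = 1:n)
text(cells$row, cells$col,
     labels = round(m[cbind(cells$row, cells$col)], 2))
```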

Resources