Saving dataframe to pdf adjust width - r

I found that grid.table can be used to plot a dataframe to a pdf file. I want to save a dataframe to a landscape A4 page, but the table is not scaled so that it fits nicely within the borders of the pdf.
Code
library(gridExtra)
set.seed(1)
strings <- c('Wow this is nice', 'I need some coffee', 'Insert something here', 'No ideas left')
table <- as.data.frame(matrix(ncol = 9, nrow = 30, data = sample(strings,30, replace = TRUE)))
pdf("test.pdf", paper = 'a4r')
grid.table(table)
dev.off()
Output
Not the whole table is shown in the pdf:
Question
How can I make sure that the dataframe is scaled to fit within the landscape A4? I don't explicitly need gridExtra or the default pdf save, I can use any other package if these are easier to fix this.
EDIT
I came across this other question; apparently one can figure out the required height and width of the tableGrob:
tg = gridExtra::tableGrob(table)
h = grid::convertHeight(sum(tg$heights), "mm", TRUE)
w = grid::convertWidth(sum(tg$widths), "mm", TRUE)
ggplot2::ggsave("test.pdf", tg, width=w, height=h, units = 'mm')
Here h = 172.2 and w = 444.3. The width exceeds that of a landscape A4 page, which is 297 x 210 mm. So I know this causes the problem, but I still can't figure out how to scale down the table to fit it on an A4.

I figured out I could add the scale parameter to ggsave. I wrote a simple function to get the optimal scale:
optimal.scale <- function(w,h, wanted.w, wanted.h) max(c(w/wanted.w, h/wanted.h))
I added 0.1 to the scale to add a margin to the plot such that the text is not directly on the edge of the paper. Then I passed the resulting scale to ggsave
tg = gridExtra::tableGrob(table)
h = grid::convertHeight(sum(tg$heights), "mm", TRUE)
w = grid::convertWidth(sum(tg$widths), "mm", TRUE)
scale = optimal.scale(w, h, 297, 210) + 0.1 # A4 = 297 x 210 mm in landscape
ggplot2::ggsave("test.pdf", tg, width = 297, height = 210, units = 'mm', scale = scale)
Now my table fits on the A4:
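The scale computation above can be checked with plain arithmetic. The following is a minimal sketch in base R, using the table size reported in the question (444.3 x 172.2 mm) and assuming a landscape A4 page of 297 x 210 mm. The names a4.w, a4.h, page.w and page.h are illustrative and not part of the original code. As far as I understand ggsave, scale multiplies the requested width and height, so the resulting PDF page is larger than A4 but keeps the A4 aspect ratio and can be shrunk to A4 when printing.

```r
# Scale factor needed so that a w x h table fits a wanted.w x wanted.h page
optimal.scale <- function(w, h, wanted.w, wanted.h) max(c(w / wanted.w, h / wanted.h))

# Table size in mm as reported in the question
w <- 444.3
h <- 172.2

# Landscape A4 in mm (assumed target page)
a4.w <- 297
a4.h <- 210

s <- optimal.scale(w, h, a4.w, a4.h) + 0.1  # 0.1 adds a margin

# Page size that ggsave(..., width = a4.w, height = a4.h, scale = s) would request
page.w <- a4.w * s
page.h <- a4.h * s

# The table fits on that page, and the page keeps the A4 aspect ratio
fits <- w <= page.w && h <= page.h
print(c(scale = s, page.w = page.w, page.h = page.h))
```

Because the width is the limiting dimension here, the height term never decides the scale; for a tall, narrow table it would be the other way round.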

Related

Adding legend to venn diagram

I am using library VennDiagram to plot venn diagrams. But this function has no option for adding a legend, and set names are displayed on or close to the sets themselves.
library(VennDiagram)
x <- list(c(1,2,3,4,5),c(4,5,6,7,8,9,10))
venn.diagram(x,filename="test.png",fill=c("#80b1d3","#b3de69"),
category.names=c("A","B"),height=500,width=500,res=150)
And with many sets, overplotting names is an issue and I would like to have a legend instead. The function is built on grid graphics and I have no idea how grid plotting works. But, I am attempting to add a legend anyway.
Looking into the venn.diagram function, I find that the final plotted object is grob.list, which is a gList object, and that it is plotted using grid.draw().
png(filename = filename, height = height, width = width,
units = units, res = resolution)
grid.draw(grob.list)
dev.off()
I figured out that I could create a legend by modifying the venn.diagram function with the code below.
cols <- c("#80b1d3","#b3de69")
lg <- legendGrob(labels=category.names, pch=rep(19,length(category.names)),
gp=gpar(col=cols, fill="gray"),byrow=TRUE)
Draw the object lg
png(filename = filename, height = height, width = width,
units = units, res = resolution)
grid.draw(lg)
dev.off()
to get a legend
How do I put the venn diagram (gList) and the legend (gTree,grob) together in a usable way? I am hoping to get something like base plot style:
or the ggplot style
If you are allowed to use other packages than VennDiagram, I suggest the following code using the eulerr package:
library(eulerr)
vd <- euler(c(A = 5, B = 3, "A&B" = 2))
plot(vd, counts = TRUE,lwd = 2,
fill=c("#80b1d3","#b3de69"),
opacity = .7,
key = list( space= "right", columns=1))
with key you define the legend location and appearance.
If you want to continue using the VennDiagram package and learn a bit of grid on the way:
Prepare diagram and legend
library(VennDiagram)
x <- list(c(1,2,3,4,5),c(4,5,6,7,8,9,10))
diag <- venn.diagram(x,NULL,fill=c("#80b1d3","#b3de69"),
category.names=c("A","B"),height=500,width=500,res=150)
cols <- c("#80b1d3","#b3de69")
lg <- legendGrob(labels=c("A","B"), pch=rep(19,length(c("A","B"))),
gp=gpar(col=cols, fill="gray"),
byrow=TRUE)
Transform the diagram to a gTree
(I'd love to find a better way if anyone knows one)
library(gridExtra)
g <- gTree(children = gList(diag))
Plot the two gTrees side by side
gridExtra::grid.arrange(g, lg, ncol = 2, widths = c(4,1))
Or one above the other
grid.arrange(g, lg, nrow = 2, heights = c(4,1))
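For readers who want to see what the side-by-side arrangement amounts to in plain grid, here is a minimal sketch that uses only the grid package. It makes some assumptions: two overlapping circles stand in for the gList returned by venn.diagram, and the output path and the names lay and combined are illustrative. It is not how grid.arrange is implemented, just one way to get a similar 4:1 layout.

```r
library(grid)

# Stand-in for the venn.diagram() output: a gList of two overlapping circles
cols <- c("#80b1d3", "#b3de69")
diag <- gList(
  circleGrob(x = 0.4, y = 0.5, r = 0.3, gp = gpar(fill = cols[1], alpha = 0.7)),
  circleGrob(x = 0.6, y = 0.5, r = 0.3, gp = gpar(fill = cols[2], alpha = 0.7))
)
lg <- legendGrob(labels = c("A", "B"), pch = rep(19, 2), gp = gpar(col = cols))

# A 1 x 2 layout: 4 parts for the diagram, 1 part for the legend
lay <- grid.layout(nrow = 1, ncol = 2, widths = unit(c(4, 1), "null"))

# Wrap each piece in a gTree placed in its own layout cell
combined <- gTree(
  children = gList(
    gTree(children = diag, vp = viewport(layout.pos.row = 1, layout.pos.col = 1)),
    gTree(children = gList(lg), vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
  ),
  vp = viewport(layout = lay)
)

# Draw to a temporary PDF (hypothetical location)
out <- file.path(tempdir(), "venn_legend.pdf")
pdf(out, width = 6, height = 4)
grid.draw(combined)
invisible(dev.off())
```

Changing the widths vector, or switching to a 2 x 1 layout with heights, gives the same variations as the grid.arrange calls above.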
I have found a solution as well, but the venn diagram region does not have a square aspect ratio, and the legend is not spaced ideally.
library(gridGraphics)
png("test.png",height=600,width=600)
grab_grob <- function(){grid.echo();grid.grab()}
grid.draw(diag)
g <- grab_grob()
grid.arrange(g,lg,ncol=2,widths=grid::unit(c(0.7,0.3),"npc"))
dev.off()

How can I prevent resizing of fonts, plot objects etc. in R?

I want to have multiple plots in the same image, and I want to have a different number of plots depending on image. To be precise, I first create a 1x2 matrix of plots, and then a 3x2 matrix of plots. I want to use the same basic settings for these two images - the same font sizes especially, since this is for a paper and the font size has to be at least 6 pt for a plot.
In order to achieve this, I wrote the following code for R:
filename = "test.png"
font.pt = 6 # font size in pts (1/72 inches)
total.w = 3 # total width in inches
plot.ar = 4/3 # aspect ratio for single plot
mat.col = 2 # number of columns
mat.row = 1 # number of rows
dpi = 300
plot.mar = c(3, 3, 1, 2) + 0.1
plot.mgp = c(2, 1, 0)
plot.w = total.w / mat.col - 0.2 * plot.mar[2] - 0.2 * plot.mar[4]
plot.h = plot.w / plot.ar
total.h = (plot.h + 0.2 * plot.mar[1] + 0.2 * plot.mar[3]) * mat.row
png(filename, width = total.w, height = total.h, res = dpi * 12 / font.pt, units = "in")
par(mfrow = c(mat.row, mat.col), mai = 0.2 * plot.mar, mgp = plot.mgp)
plot(1, 1, axes = TRUE, type = 'p', pch = 20, xlab = "Y Test", ylab = "X Test")
dev.off()
As you can see, I set a total width of 3 inches and then calculate the total height for my image, so that the aspect ratio of the plots is correct. The font size only changes the resolution by a factor.
Anyway, the problem is now that the font size changes significantly when I go from mat.row = 1 to mat.row = 3. Other things change as well, for example the labelling of the axes and the margins, even though I specifically set those before in inches. Have a look:
When 3 rows are set (cropped image):
When only 1 row is set (cropped image):
How can I prevent this? As far as I can see, I did everything I could. This took me quite a while, so I'd like to get this to work instead of switching to ggplot and learning everything from scratch again. It's also small enough that I really hope I'm just missing something very obvious.
In ?par we can find:
In a layout with exactly two rows and columns the base value of "cex"
is reduced by a factor of 0.83: if there are three or more of either
rows or columns, the reduction factor is 0.66.
Therefore, when you change mfrow from c(1, 2) to c(3, 2), the base cex value changes from 1 to 0.66. cex affects font size and text line height.
So, you can manually specify cex value for your plots.
par(mfrow = c(mat.row, mat.col), mai = 0.2 * plot.mar, mgp = plot.mgp, cex = 1)
Hope, it is what you need.
Plot for mat.row = 1 (cropped):
And plot for mat.row = 3 (cropped):
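The cex rule quoted from ?par can be checked directly by querying par("cex") after each mfrow setting. This is a small sketch in base R; it opens a null PDF device so nothing is written to disk, and the variable names are illustrative.

```r
# Open a null PDF device so par() can be queried without writing a file
pdf(NULL)

par(mfrow = c(1, 2))
cex.1x2 <- par("cex")  # 1 row, 2 columns: no reduction

par(mfrow = c(2, 2))
cex.2x2 <- par("cex")  # exactly two rows and two columns: reduced to 0.83

par(mfrow = c(3, 2))
cex.3x2 <- par("cex")  # three or more rows: reduced to 0.66

# Setting cex after mfrow in the same call restores the base size
par(mfrow = c(3, 2), cex = 1)
cex.fixed <- par("cex")

invisible(dev.off())
print(c(cex.1x2, cex.2x2, cex.3x2, cex.fixed))
```

The order of arguments matters in the last call: cex has to come after mfrow, otherwise mfrow resets it again.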

Printing out a dataframe in R: grid.table outputs cropped tables, doesn't respond to fontsize

I am trying to automate a series of analyses which are intended to save a number of plots for later inspection. One of the plots will be accompanied by a table of values. I'd like to have them in the same pdf so that the users don't have to jump between files.
I have checked numerous questions on SO regarding outputting data frames to pdf, here are a couple of reasons why existing answers aren't satisfactory in my case:
Not familiar with knitr/Sweave
Batch generation of figures mean that I cannot do it manually via RStudio Viewer
grid.table based solutions do not generate the entire table.
Which brings me to my problem. Say I have a table of 48 rows by 5 columns. If I try to plot it with grid.table(geno), the result is a cropped table showing some 20-30 rows from the middle. If I use grid.table(geno, gp = gpar(fontsize=8)) to decrease the font size, I get the following error message:
Error in gtable_table(d, name = "core", fg_fun = theme$core$fg_fun, bg_fun = theme$core$bg_fun, :
unused argument (gp = list(fontsize = 8))
Essentially I would like to be able to use it in this way:
library(grid)
library(gridExtra)
pdf(file="gtype.pdf", title = "Genotype data")
plotGenotype(geno, text_size = 10) # outputs a custom plot
grid.newpage()
grid.table(geno) # grid.table(geno, gp = gpar(fontsize=8))
dev.off()
The problem here is that I either get a cropped table or nothing at all, on the second page. I noticed that many people add height=11, width=8.5 to the pdf() call. I am not sure if/why that would make a difference but setting paper="a4" or height/width according to A4 does not make any difference in my case.
Q1: Is it not possible to get grid.table to resize based on content and not paper?
Q2: Is there some other way to get a data frame printed to a pdf without having to go through LaTeX based solutions?
(I am currently running R 3.3.1 and gridExtra 2.2.1)
Q1: Is it not possible to get grid.table to resize based on content and not paper?
It is possible, but generally not desirable. A table is meant to be read, and if text and spacings were determined by the page rather than the content, it would often yield unreadable results. Thus the usual advice: manually tweak the font size and padding, or split the table.
It is by no means a technical limitation: feel free to set the cell size to fit the page:
grid.newpage()
pushViewport(viewport(width=unit(0.8,"npc"), height=unit(0.8,"npc")))
g <- g2 <- tableGrob(iris[1:4, 1:3], cols = NULL, rows=NULL)
g2$heights <- unit(rep(1/nrow(g2), nrow(g2)), "npc")
grid.arrange(rectGrob(), rectGrob(), nrow=1, newpage = FALSE)
grid.arrange(g, g2, nrow=1, newpage = FALSE)
But with too much content for the page, it is unclear which result is better:
grid.newpage()
pushViewport(viewport(width=unit(0.8,"npc"), height=unit(0.8,"npc")))
g <- g2 <- tableGrob(iris[1:20, 1:3], cols = NULL, rows=NULL)
g3 <- tableGrob(iris[1:20, 1:3], cols = NULL, rows=NULL, theme=ttheme_default(base_size=7))
g2$heights <- g3$heights <- unit(rep(1/nrow(g2), nrow(g2)), "npc")
grid.arrange(rectGrob(), rectGrob(), rectGrob(), nrow=1, newpage = FALSE)
grid.arrange(g, g2, g3, nrow=1, newpage = FALSE)
If the page size can be changed, it is usually the best option. One can query the table size before drawing, convert it to inches, and pass it to the device.
g1 <- tableGrob(iris[1:4, 1:5])
g2 <- tableGrob(iris[1:20, 1:5])
maxheight <- convertHeight(sum(g2$heights), "in", TRUE)
pdf("fit.pdf", height=maxheight)
grid.draw(g1)
grid.newpage()
grid.draw(g2)
dev.off()
However, as far as I know all pages in a given pdf will have to have the same size (there might be ways around it, but tricky).
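The earlier advice to tweak the font size or split the table could be sketched as follows. This is only an illustration under stated assumptions: iris stands in for the geno data frame, rows.per.page is a guessed value that would need tuning, and the output path is hypothetical. The splitting is base R; the drawing step uses grid.table with ttheme_default(base_size = ...) and is skipped if gridExtra is not installed.

```r
# rows.per.page is an assumed value; tune it to the font size and page height
rows.per.page <- 20
geno <- iris  # stand-in for the data frame from the question

# Assign each row to a page and split the data frame accordingly
page.id <- ceiling(seq_len(nrow(geno)) / rows.per.page)
chunks <- split(geno, page.id)

# Draw one chunk per page with a smaller base font, if gridExtra is installed
if (requireNamespace("gridExtra", quietly = TRUE)) {
  out <- file.path(tempdir(), "paged_table.pdf")
  pdf(out, width = 8.27, height = 11.69)  # portrait A4 in inches
  for (i in seq_along(chunks)) {
    if (i > 1) grid::grid.newpage()
    gridExtra::grid.table(chunks[[i]],
                          theme = gridExtra::ttheme_default(base_size = 8))
  }
  invisible(dev.off())
}
```

This keeps every page the same size, which sidesteps the limitation that all pages of one pdf device share their dimensions.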

Output Stem and Leaf Plot to Image

I'm trying to output a Stem and Leaf plot in R as an image. I'm not sure if there's a nice library which can accomplish this but below is some of the code I've tried.
jpeg(filename="stem.jpeg",width=480,height=480, units="px",pointsize=12)
plot.new()
tmp <- capture.output(stem(men, scale = 1, width = 40))
text( 0,1, paste(tmp, collapse='\n'), adj=c(0,1), family='mono' )
dev.off()
The above code saved the output, but it looks very blurry and the plot gets cut off pretty badly. When adding a histogram to an image, R seems to do a good job of scaling everything to fit the size of the image.
jpeg(filename="stem.jpeg",width=480,height=480,
units="px",pointsize=12)
stem(men, scale = 1, width = 40)
dev.off()
This created the image but had no content within it.
Any ideas? Thanks!
That's because stem and leaf plots produce text not images. You can save the text as follows using the sink command: http://stat.ethz.ch/R-manual/R-devel/library/base/html/sink.html
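sink(file = "stem.txt")  # redirect console output to a text file
stem(men, scale = 1, width = 40)
sink()  # restore output to the console
# unlink("stem.txt")  # only run this to delete the file once it is no longer needed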
To export a stemplot as graphics, you can use a vector graphics format, such as .eps, .pdf, or .emf. For example, a Windows metafile (win.metafile is only available on Windows):
win.metafile("stem.wmf", pointsize = 10)
plot.new()
tmp <- capture.output(stem(mtcars$mpg))
text(0,1,paste(tmp,collapse='\n'),family='mono',adj=c(0,1))
dev.off()
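The same idea works with the pdf device, which is available on all platforms. The following is a minimal sketch in base R; the output path, device size and zero margins are assumptions chosen for illustration, not part of the original answer.

```r
# Capture the text that stem() prints to the console
tmp <- capture.output(stem(mtcars$mpg))

out <- file.path(tempdir(), "stem.pdf")  # hypothetical output location
pdf(out, width = 6, height = 4, pointsize = 10)
par(mar = c(0, 0, 0, 0))  # no margins, so the text is less likely to be cut off
plot.new()
# Draw the captured lines in a monospaced font, anchored at the top-left corner
text(0, 1, paste(tmp, collapse = "\n"), family = "mono", adj = c(0, 1))
invisible(dev.off())
```

Because the output is vector text, it stays sharp at any zoom level, unlike the blurry jpeg from the question.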

Plotting artefact with points over raster

I noticed some weird behavior when resizing the plot window. Consider
library(sp)
library(rgeos)
library(raster)
rst.test <- raster(nrows=300, ncols=300, xmn=-150, xmx=150, ymn=-150, ymx=150, crs="NA")
sap.krog300 <- SpatialPoints(coordinates(matrix(c(0,0), ncol = 2)))
sap.krog300 <- gBuffer(spgeom = sap.krog300, width = 100, quadsegs = 20)
shrunk <- gBuffer(spgeom = sap.krog300, width = -30)
shrunk <- rasterize(x = shrunk, y = rst.test)
shrunk.coords <- xyFromCell(object = rst.test, cell = which(shrunk[] == 1))
plot(shrunk)
points(shrunk.coords, pch = "+")
If you resize the window, plotted points get different extent compared to the underlying raster. If you resize the window and plot shrunk and shrunk.coords again, the plot turns out fine. Can anyone explain this?
If you plot directly with the RasterLayer method for plot the resize problem does not occur.
## gives an error, but still plots
raster:::.imageplot(shrunk)
points(shrunk.coords, pch = ".")
So it must be something in the original plot call before the .imageplot method is called.
showMethods("plot", classes = "RasterLayer", includeDefs = TRUE)
It does occur if we call raster:::.plotraster directly, and this is the function that calls raster:::.imageplot:
raster:::.plotraster(shrunk, col = rev(terrain.colors(255)), maxpixels = 5e+05)
points(shrunk.coords, pch = ".")
The problem is actually in the axis labels, not the image itself. For example, this plots faithfully on resize:
raster:::.imageplot(shrunk)
abline(h = c(-80, 80), v = c(-80, 80))
But do it like this, and the lines are no longer at [-80, 80] after resize:
plot(shrunk)
abline(h = c(-80, 80), v = c(-80, 80))
So it is actually the points plotted after the raster that are showing incorrectly: the plot method keeps the aspect ratio fixed, so widening the plot doesn't "stretch" out the raster circle to an ellipse. But, it does something to the points that are added afterwards so the calls to par() must not be handled correctly (probably in raster:::.imageplot).
Another way of seeing the problem is to show that axis() does not know the space being used by the plot, which is the same problem you see when overplotting:
plot(shrunk)
axis(1, pos = 1)
When you resize the x-axis length the two axes are no longer synchronized.
Because you have a raster, try replacing plot() with image(). I had the same problem but this solved it for me.
