I am generating a heatmap in R with lots of rows.
TL;DR how do I get the real size of a plot in R?
library(pheatmap)
df=data.frame(one=1:100,two=101:200,three=201:300)
names=1:100
names=paste0("Cell",names)
rownames(df)=(names)
pheatmap(df,scale="row")
The default image fits in the window, but we can't read the row names.
pheatmap(df,scale="row",cellheight = 10)
Changing the cell height lets us read the row names, but now the image doesn't fit in the window!
In this example I am using pheatmap, but I also run into this with other plotting packages.
While I've grown to expect frustrating behavior like this from R, and could find an appropriately sized image by trial and error, this seems like something I should be able to get from the program.
Is there a way to get the dimensions of the plot automatically so that I can create a correctly sized PDF or PNG for it?
The function pheatmap uses grid graphics to draw its plots, and specifies the size of its elements in "bigpts" where 72 "bigpts" == 1 inch. If you have lots of rows and specify a reasonable row height, this will exceed the plotting window.
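As a rough check of that arithmetic, the sketch below converts a 10 "bigpts" cell height to inches with grid and totals it over 100 rows. It assumes an off-screen device opened with pdf(NULL), and it counts only the cells themselves, not titles, legends, or labels.

```r
library(grid)

pdf(NULL)  # off-screen device so grid has a context for unit conversion

# One cell of height 10 bigpts, expressed in inches (72 bigpts == 1 inch).
cell_height_in <- convertHeight(unit(10, "bigpts"), "in", valueOnly = TRUE)

# Total height of the cell area for 100 rows.
n_rows <- 100
total_height_in <- n_rows * cell_height_in

dev.off()

total_height_in  # roughly 13.9 inches, taller than a default 7-inch window
```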
Because it is specified as a gtree, we can actually access the height and width of the components and use them to set the dimensions of our png or pdf.
This function will harvest the total height and width in inches of a plot, returning them in a named list:
get_plot_dims <- function(heat_map)
{
  # Convert every row height and column width to inches, then total them.
  plot_height <- sum(grid::convertHeight(heat_map$gtable$heights, "in", valueOnly = TRUE))
  plot_width <- sum(grid::convertWidth(heat_map$gtable$widths, "in", valueOnly = TRUE))
  return(list(height = plot_height, width = plot_width))
}
We can use this to specify the dimensions of our plotting device:
my_plot <- pheatmap(df,scale="row", cellheight = 10)
plot_dims <- get_plot_dims(my_plot)
png("plot.png", height = plot_dims$height, width = plot_dims$width, units = "in", res = 72)
my_plot
dev.off()
This gives the desired plot.
Note that this is not a general solution for R plots, but specific to pheatmap objects.
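The summing step can be tried without pheatmap installed. The sketch below uses a hypothetical stand-in object (fake_heatmap) that only mimics the $gtable$heights and $gtable$widths fields. It is not a real pheatmap result, and get_dims_in is an illustrative name for a vectorised variant of the helper above.

```r
library(grid)

# Illustrative variant of the helper: total heights and widths in inches.
get_dims_in <- function(x) {
  list(
    height = sum(convertHeight(x$gtable$heights, "in", valueOnly = TRUE)),
    width  = sum(convertWidth(x$gtable$widths, "in", valueOnly = TRUE))
  )
}

# Hypothetical stand-in with absolute units only (values are made up).
fake_heatmap <- list(gtable = list(
  heights = unit(c(36, 1000, 44), "bigpts"),
  widths  = unit(c(72, 216, 72), "bigpts")
))

pdf(NULL)  # off-screen device for the unit conversion
dims <- get_dims_in(fake_heatmap)
dev.off()

dims$height  # 1080 bigpts / 72
dims$width   # 360 bigpts / 72
```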
Related
I am trying to plot a boxplot (just a single box, i.e. one data column), but the frame enclosing the boxplot is very big compared to the actual box in the middle. I would like to reduce the size of the frame without reducing the size of the plotted box, so that the frame is sized relative to the box it encloses.
For example, in this image you can see the box is in the middle, but there is a lot of space to its left and right. Is there a parameter (or set of parameters) to make the frame size relative to the size of the plotted box? I tried many parameters as arguments to par(), but without success.
This code illustrates the problem:
x = data.frame(a = 1:15)
boxplot(x, boxlwd = 2, outwex = 0.5, boxwex = 0.2)
I am very new to using the power of R to create graphical output.
I use the forest()-function in the metafor-package to create Forest plots of my meta-analyses. I generate several plots using a loop and then save them via png().
for (i in 1:(ncol(df)-2)) {
dat <- escalc(measure="COR", ri=ri, ni=ni, data=df) # Calculates effect size
res_re <- rma.uni(yi, vi, data=dat, method="DL", slab=paste(author)) # Output of meta-analysis
png(filename=path, width=8.27, height=11.69, units ="in", res = 210)
forest(res_re, showweight = T, addfit= T, cex = .9)
text(-1.6, 18, "Author(s) (Year)", pos=4)
text( 1.6, 18, "Correlation [95% CI]", pos=2)
dev.off()
}
This works great if every plot has the same size. However, each iteration of the loop puts a different number of studies into the forest plot. As a result, the text elements are not in the right place, and the forest plots with many studies look a bit strange. I have two questions:
1. How can I align "Author(s) (Year)" and "Correlation [95% CI]" automatically with the changing size of the forest plot, so that the headings sit above the upper line of the forest table?
2. How can I scale the forest plot so that the width and the size of the text elements are the same for all plots, and each additional study just adds a new line (changing the height)?
Each forest plot should look like this:
Here is what you will have to do to get this to work:
1. I would fix xlim across plots, so that there is a fixed position for the "Author(s) (Year)" and "Correlation [95% CI]" headings. After you have generated a forest plot, take a look at par()$usr[1:2]. Use these values as a starting point to adjust xlim so that it is appropriate for all your plots. Then use those two values for the two calls to text().
2. There are k rows in each plot. The headings should go two rows above that. So, use text(<first xlim value>, res_re$k+2, "Author(s) (Year)", pos=4) and text(<second xlim value>, res_re$k+2, "Correlation [95% CI]", pos=2).
3. Set cex in text() to the same value you specified in your call to forest().
4. The last part is tricky. You have fixed cex, so the size of the text elements should be the same across plots. But if there are more studies, the k rows get crammed into less space, so they become less separated. If I understand you correctly, you want to keep the spacing between rows equal across plots by adjusting the actual height of the plot. Essentially, this requires making height in the call to png() a function of k. For each extra study, an additional amount needs to be added to height so that the row spacing stays constant, so something along the lines of height=<some factor> + res_re$k * <some factor>. But the increase in height as a function of k may also be non-linear. Getting this right would take a lot of trial and error. There may be a clever way of determining this programmatically (digging into ?par and maybe ?strheight).
To make it easier for others to chime in, the last part of your question comes down to this: how do I have to adjust the height value of a plotting device so that the absolute spacing between the rows in plot(1:10) and plot(1:20) stays equal? This is an interesting question in itself, so I am going to post it as a separate question.
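One possible way to approach that reduced question is sketched below. It is not taken from the linked question; it assumes fixed margins in inches, yaxs = "i", and a device height that grows linearly with the number of rows. The function name row_spacing_in and its default values are illustrative, and a PDF device is used only because it takes its size in inches directly.

```r
# Draw n equally spaced rows on an off-screen PDF device whose height grows
# linearly with n, and return the resulting plot-region inches per row.
row_spacing_in <- function(n, row_in = 0.2, mai = c(0.8, 0.8, 0.4, 0.2)) {
  height_in <- n * row_in + mai[1] + mai[3]  # plot region + bottom/top margins
  pdf(NULL, width = 6, height = height_in)   # file = NULL: nothing is written
  on.exit(dev.off())
  par(mai = mai)                             # fix the margins in inches
  plot(seq_len(n), seq_len(n), ylim = c(0.5, n + 0.5), yaxs = "i")
  par("pin")[2] / diff(par("usr")[3:4])      # inches per unit on the y-axis
}

spacing_10 <- row_spacing_in(10)
spacing_20 <- row_spacing_in(20)
c(spacing_10, spacing_20)  # both should be close to row_in
```

Because the margins are fixed in inches and the y-axis is not padded, every extra row adds exactly row_in inches to the device height under these assumptions.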
Regarding point 4: in Wolfgang's question (Constant Absolute Spacing of Row in R Plots) you will find how to make the plot height depend on the number of rows in it.
For forest() it works a little differently, since this function internally modifies the par("mar") values.
However, if you set the margins to zero, you only need to pass yaxs="i" to forest(), so that the y-axis covers exactly the range of the data and nothing else. The device then needs to be configured to have the height (length(ma$yi)+4.5)*fact*res, with fact as inches/line (see below) and res as pixels/inch (resolution).
The 4.5 depends on whether you have left addfit=T and intercept=T in your meta-analysis model (in that case forest() internally sets ylim <- c(-1.5, k + 3)). Otherwise you'd have to use 2.5 (then it would be ylim <- c(0.5, k + 3)).
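The zero-margin formula can be written as a small helper. This is only a sketch: forest_height_px is a hypothetical name, fact = 0.2 is assumed because that is what par("mai")[1]/par("mar")[1] gives with R's default margins, and res = 210 is taken from the question.

```r
# Hypothetical helper: device height in pixels for a forest plot with k
# studies, assuming zero margins and yaxs = "i" as described above.
# extra_rows = 4.5 matches ylim = c(-1.5, k + 3); use 2.5 for c(0.5, k + 3).
forest_height_px <- function(k, fact = 0.2, res = 210, extra_rows = 4.5) {
  (k + extra_rows) * fact * res
}

k_values <- c(5, 10, 20)
heights <- sapply(k_values, forest_height_px)

# Pixels added per extra study: constant, equal to fact * res.
px_per_study <- diff(heights) / diff(k_values)
px_per_study
```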
If you would rather keep margins, you would do the following (I edited this part after noticing a mistake):
res <- 'your desired resolution' # pixels per inch
fact <- par("mai")[1]/par("mar")[1] # calculate inches per line
### this following part is copied from inside the forest()-function.
# forest() modifies the margin internally in the same way.
par.mar <- par("mar")
par.mar.adj <- par.mar - c(0, 3, 1, 1)
par.mar.adj[par.mar.adj < 0] <- 0
###
ylim <- c(-1.5, length(ma$yi)+3) # see above
ylim.abs <- abs(ylim[1])+abs(ylim[2])-length(ma$yi) # calculate absolute distance of ylim-argument
pixel.bottom <- (par.mar.adj[1])*fact*res # calculate pixels to add to bottom and top based on the margin that is internally used by forest().
pixel.top <- (par.mar.adj[3])*fact*res
png(filename='path',
width='something meaningful',
height=((length(ma$yi)+ylim.abs)*fact*res) + pixel.bottom + pixel.top,
res=res)
par(mar=par.mar) # make sure that inside the new device the margins you want to define are actually used.
forest(res_re, showweight = T, addfit= T, cex = .9, yaxs="i")
...
dev.off()
I started using R's map() function to plot maps. I noticed that when I resize the plot window, the image does not scale to fill the window. How can I get the map image to automatically resize bigger or smaller, depending on how big I drag my window?
I am using R version 3.0.2 on MacOS.
For example, here is a map where I've dragged the plot window smaller and bigger. Notice that the map image's size does not change.
library(maps)
map("state")
On the other hand, the usual plot() command does resize the graphic to fit the window.
plot(1:100, 201:300)
It takes a bit of work, but by converting the maps object to a SpatialPolygonsDataFrame, and then spplot()'ing that, you can get a dynamically resizing map.
FWIW, I suspect this works better because spplot() is based on grid (via lattice), and the grid graphical system supports much more sophisticated ways of handling dimensions within plot objects than does R's base graphical system.
library(maps)
library(maptools) ## For map2SpatialPolygons()
## Convert data from a "maps" object to a "SpatialPolygonsDataFrame" object
mp <- map("state", fill = TRUE, plot = FALSE)
SP <- map2SpatialPolygons(mp, IDs = mp$names,
proj4string = CRS("+proj=longlat +datum=WGS84"))
DATA <- data.frame(seq_len(length(SP)), row.names = names(SP))
SPDF <- SpatialPolygonsDataFrame(SP, data = DATA)
## Plot it
spplot(SPDF, col.regions = "transparent", colorkey = FALSE,
par.settings = list(axis.line = list(col = "transparent")))
Here are a couple of screenshots to show that it works:
I note that Josh has provided an acceptable solution to the problem, but it might be useful to understand why map() has the behaviour you describe. Essentially it comes down to map() setting the size of the plotting region based on the current size & aspect ratio of the device (the figure region more specifically) at draw time.
As such, one solution, without converting to another format, as Josh nicely demonstrates, is just to redraw the map after you've rescaled the device to the desired size. You could avoid some guesswork by doing a couple of computations based on the aspect ratio of par("usr") and then set the device to a width that is compatible with that aspect ratio.
Probably more hassle than @Josh's solution, but it does explain the behaviour. A more detailed description of the issue is given below.
The reason that the drawn map doesn't "fill" the device (up to the specified margins) is that the code in map() sets the size of the plotting region to have a particular aspect ratio based on the size of the device etc. The resulting plotting region is then sized such that it fits within the device but preserves the correct aspect ratio, so it may not entirely fill it.
The key section of code is this:
else {
par(mar = mar)
p <- par("fin") - as.vector(matrix(c(0, 1, 1,
0, 0, 1, 1, 0), nrow = 2) %*% par("mai"))
par(pin = p)
p <- par("pin")
p <- d * min(p/d)
par(pin = p)
d <- d * myborder + ((p/min(p/d) - d)/2)/aspect
usr <- c(xrange, yrange) + rep(c(-1, 1), 2) *
rep(d, c(2, 2))
par(usr = usr)
}
with d defined slightly earlier as:
d <- c(diff(xrange), diff(yrange)) * (1 + 2 * myborder) *
aspect
(for the example you give). The second line of the else branch gets the current size of the figure region in inches. The figure region is the size on the device of the region containing the margins and the plot region, but not any outer margin. In effect, if there is no outer margin active, this code grabs the size of the device (and makes an adjustment). This result is then used to set the size of the plotting region, which gets updated.
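To make that adjustment concrete, the sketch below repeats the matrix step from the excerpt with example numbers. The values of fin and mai are assumptions (those of a default 7 x 7 inch device with default margins), not values read from a live device.

```r
# Assumed example values: figure size and margins of a default 7 x 7 device.
fin <- c(7, 7)                                                # width, height
mai <- c(bottom = 1.02, left = 0.82, top = 0.82, right = 0.42)

# Same matrix as in map(): row 1 picks left + right, row 2 bottom + top.
m <- matrix(c(0, 1, 1, 0, 0, 1, 1, 0), nrow = 2)
margins <- as.vector(m %*% mai)  # total horizontal and vertical margins

p <- fin - margins               # space left for the plotting region
p
```

With these inputs p comes out as 5.76 by 5.16 inches, which is the figure region minus the margins on each axis.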
The intention seems to be to take the size of the current figure region, and use that to update the region into which the map is drawn. The size of that plotting region is in that sense controlled via the aspect ratio of the device; if you start with a wide but short window, then the computed plotting region will not need to use all the available width (if it did the aspect ratio would be wrong) and hence the plotting region is set to a size smaller than the available space.
As to why this doesn't update when you resize the window, well that is because at draw-time the size of the plotting region is set absolutely in inches. If you resize the device, the size of the plotting region remains the same and hence the map gets cropped if you shrink the device sufficiently, or uses less and less of the device space if you enlarge the device.
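The shrink-to-fit step, p <- d * min(p/d), can be tried in isolation. In this sketch fit_region is an illustrative name, and the sizes are made-up numbers for a map whose extent is twice as wide as it is tall.

```r
# Largest region with the same proportions as d that fits in the available
# space (both given as c(width, height)); mirrors p <- d * min(p / d).
fit_region <- function(avail_in, d) d * min(avail_in / d)

d <- c(2, 1)                        # map extent: twice as wide as tall

wide  <- fit_region(c(10, 3), d)    # wide, short window
tall  <- fit_region(c(4, 3), d)     # narrower window

wide  # 6 x 3: only 6 of the 10 available inches of width are used
tall  # 4 x 2: only 2 of the 3 available inches of height are used
```

Since map() stores the result with par(pin = p), the region stays at that size in inches until the map is drawn again.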
The R package wordcloud has a very useful function called wordlayout. It takes the initial positions of words and their respective sizes and rearranges them so that they do not overlap. I would like to use the results of this function to do a geom_text plot in ggplot.
I came up with the following example but soon realized that there seems to be a big difference between cex (wordlayout) and size (geom_text), since words drawn with the graphics package appear way larger.
Here is my sample code. Plot 1 is the original wordcloud plot, which has no overlaps:
library(wordcloud)
library(tm)
library(ggplot2)
samplesize=100
textdf <- data.frame(label=sample(stopwords("en"),samplesize,replace=TRUE),x=sample(c(1:1000),samplesize,replace=TRUE),y=sample(c(1:1000),samplesize,replace=TRUE),size=sample(c(1:5),samplesize,replace=TRUE))
#plot1
plot.new()
pdf(file="plot1.pdf")
textplot(textdf$x,textdf$y,textdf$label,textdf$size)
dev.off()
#plot2
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot2.pdf")
#plot3
new_pos <- wordlayout(x=textdf$x,y=textdf$y,words=textdf$label,cex=textdf$size)
textdf$x <- new_pos[,1]
textdf$y <- new_pos[,2]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot3.pdf")
#plot4
textdf$x <- new_pos[,1]+0.5*new_pos[,3]#this is the way the wordcloud package rearranges the positions. I took this out of the textplot function
textdf$y <- new_pos[,2]+0.5*new_pos[,4]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot4.pdf")
Is there a way to overcome this cex/size difference and reuse wordlayout for ggplots?
cex stands for character expansion and is the factor by which text is magnified relative to the default, specified by cin (set on my installation to 0.15 in by 0.2 in): see ?par for more details.
@hadley explains that ggplot2 sizes are measured in mm. Therefore cex=1 would correspond to size=3.81 or size=5.08, depending on whether it is being scaled by the width or the height. Of course, font selection may cause differences.
In addition, to use absolute sizes, you need to put the size specification outside aes(); otherwise ggplot treats it as a variable to map and chooses the scale itself, e.g.:
ggplot(textdf,aes(x,y))+geom_text(aes(label = label),size = textdf$size*3.81)
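The two factors quoted above are just the cin values converted from inches to millimetres. The sketch below hard-codes cin as 0.15 by 0.2 inches, the values reported in this answer; on another installation par("cin") may differ.

```r
# cin as reported above: default character width and height in inches.
cin <- c(width = 0.15, height = 0.2)
mm_per_inch <- 25.4

# ggplot2 text sizes are in mm, so cex = 1 corresponds to:
size_per_cex <- cin * mm_per_inch
size_per_cex

# Converting a vector of cex values using the width-based factor.
cex_values <- c(1, 2, 5)
cex_values * size_per_cex[["width"]]
```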
Sadly I think you're going to find the short answer is no! I think the package handles the text vector mapping differently from ggplot2, so you can tinker with size and font face/family, etc. but will struggle to replicate exactly what the package is doing.
I tried a few things:
1) Try to plot the grobs from textdata using annotation_custom
require(plyr)
require(grid)
# FIRST TRY PLOT INDIVIDUAL TEXT GROBS
qplot(0:1000,0:1000,geom="blank") +
alply(textdf,1,function(x){
annotation_custom(textGrob(label=x$label,0,0,c("center","center"),gp=gpar(cex=x$size)),x$x,x$x,x$y,x$y)
})
2) Run the wordlayout() function which should readjust the text, but difficult to see for what font (similarly doesn't work)
# THEN USE wordlayout() TO GET CO-ORDS
plot.new()
w <- wordlayout(textdf$x,textdf$y,words=textdf$label,cex=textdf$size,xlim=c(min(textdf$x),max(textdf$x)),ylim=c(min(textdf$y),max(textdf$y)))
plotdata<-cbind(data.frame(rownames(w)),w)
colnames(plotdata)=c("word","x","y","w","h")
# PLOT WORDCLOUD DATA
qplot(0:1000,0:1000,geom="blank") +
alply(plotdata,1,function(x){
annotation_custom(textGrob(label=x$word,0,0,c("center","center"),gp=gpar(cex=x$h*40)),x$x,x$x,x$y,x$y)
})
Here's a cheat if you just want to overplot other ggplot functions on top of it (although the co-ords don't seem to match up exactly between the data and the plot). It basically images the wordcloud, removes the margins, and under-plots it at the same scale:
# make a png file of just the panel
plot.new()
png(filename="bgplot.png")
par(mar=c(0.01,0.01,0.01,0.01))
textplot(textdf$x,textdf$y,textdf$label,textdf$size,xaxt="n",yaxt="n",xlab="",ylab="",asp=1)
dev.off()
# library to get PNG file
require(png)
# then plot it behind the panel
qplot(0:1000,0:1000,geom="blank") +
annotation_custom(rasterGrob(readPNG("bgplot.png"),0,0,1,1,just=c("left","bottom")),0,1000,0,1000) +
coord_fixed(1,c(0,1000),c(0,1000))
I would like to use R to make a barplot of ~100,000 numerical entries. The plot will be dense, which is what I want. So far I am using the following code:
sample_var <- c(2,5,3,2,3,2,6,10,20,...) #Filled with 100,000 entries
barplot(sample_var)
The resulting plot is just what I want, but it is a square, whereas I want a long rectangle. Is there a way to set the dimensions of the barplot? I would like to specify an aspect ratio of 10:1 for length x height, or a specific pixel setting of 1000px x 10px. I tried using xlim in the barplot function call, but get an "invalid xlim" warning.
Any help is appreciated!
Set the width and height when outputting to a file:
png(filename="figures.png", width=800, height=200, bg="white")
sample_var <- c(2,5,3,2,3,2,6,10,20)
barplot(sample_var)
dev.off()
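For the 10:1 case from the question, the pixel height can be derived from the width. The sketch below is one way to do it: dims_for_aspect is a hypothetical helper, the data are random stand-ins for the real 100,000 entries, and the margins are set to zero on the assumption that a very short device would otherwise leave no room for the plot.

```r
# Hypothetical helper: pixel dimensions for a given width and aspect ratio.
dims_for_aspect <- function(width_px, aspect) {
  c(width = width_px, height = round(width_px / aspect))
}

dims <- dims_for_aspect(1000, 10)  # 1000 x 100 pixels, i.e. 10:1

# Stand-in data; the real vector from the question is not reproduced here.
set.seed(1)
sample_var <- sample(1:20, 1000, replace = TRUE)

out_file <- tempfile(fileext = ".png")
png(out_file, width = dims[["width"]], height = dims[["height"]])
par(mar = c(0, 0, 0, 0))           # no margins, so the bars fill the image
barplot(sample_var, border = NA, space = 0, axes = FALSE)
dev.off()
```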