ggplot with ggplot2: pdf very slow to display - r

I am producing a pdf plot with this kind of command:
ggplot(df, aes(sample = x))+
stat_qq(geom="point",distribution=qexp)+
geom_abline(intercept = 0, slope = 1,linetype='dashed',col='red')
ggsave(file="xxx.pdf")
Then I want to integrate the pdf into a tex file and produce a final pdf document.
But, the ggplot is very slow to display and makes the pdf crash very often.
When I use geom='line' it doesn't happen so I guess it comes from the number of circle points.
Do you have any idea on how to solve this? I really prefer the geom='point' option.

PDFs are vector based - so every single point on your chart has to be loaded individually. This produces a 'load-up' sort of effect on your PDF. My solution would be to save as a high DPI png/gif instead:
ggsave(file="xxx.png", dpi=400) # default is 300, which is probably sufficient
pdflatex (or whichever TeX engine you use) will find the file 'xxx' as long as you have not forced an extension in your R to TeX conversion, since the include statement usually does not mention an extension. Make sure the pdf is deleted from your charts folder so that it doesn't get picked up in preference to the png.
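To see the difference for yourself, here is a minimal sketch that writes the same QQ plot to both formats and reports the file sizes. The data frame is simulated (an assumption, since the original df isn't shown) and the files go to a temporary folder.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rexp(20000))  # hypothetical stand-in for the real data

p <- ggplot(df, aes(sample = x)) +
  stat_qq(geom = "point", distribution = qexp) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", colour = "red")

out_dir  <- tempdir()
pdf_file <- file.path(out_dir, "xxx.pdf")
png_file <- file.path(out_dir, "xxx.png")

# Vector output: one drawing instruction per point
ggsave(pdf_file, plot = p, width = 6, height = 4)
# Raster output: a fixed pixel grid, independent of the number of points
ggsave(png_file, plot = p, width = 6, height = 4, dpi = 300)

# Compare the sizes in bytes
file.size(c(pdf_file, png_file))
```

The png size depends on the pixel dimensions (width x height x dpi), not on the number of points, so it is also much quicker for a viewer to display.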

Related

How to hide figures in knitr, but create them as png?

I am currently doing some statistical analysis in R and use knitr to generate results and an overview document.
There are some additional plots, which I want to be done and saved as a .png (with specified file name and location), but not included in the generated .html file (too many of them, and they are not at the end).
Using dev.copy(png, ...) works fine for generating the plots, but the figures appear in the .html. If I specify fig.keep='none' the .png files are created, but blank.
Is there some way to do what I want?
This is from the knitr website:
fig.show: ('asis'; character) how to show/arrange the plots; four possible values are
asis: show plots exactly in places where they were generated (as if the code were run in an R terminal)
hold: hold all plots and output them at the very end of a code chunk
animate: wrap all plots into an animation if there are multiple plots in a chunk
hide: generate plot files but hide them in the output document
fig.show = 'hide' worked for me.
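As a rough sketch, the option can also be set from R code in a setup chunk. The folder prefix below is a made-up example, and setting the options this way affects every chunk evaluated afterwards.

```r
library(knitr)

# Applies to all chunks evaluated after this call
opts_chunk$set(
  fig.show = "hide",         # write the plot files but keep them out of the document
  dev      = "png",          # graphics device used for the files
  fig.path = "extra-plots/"  # hypothetical folder prefix for the generated files
)

# Check what is currently set
opts_chunk$get("fig.show")
```

To hide only some figures, put fig.show='hide' in the header of those chunks instead. knitr builds the file names from fig.path plus the chunk label, so labelling the chunks gives predictable names.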

Can I reduce pdf file size in knitR/ggplot2 when using a large dataset without using external tools?

I have a number of large-ish files which I am reading into R in an rmarkdown document, cleaning up, and plotting with ggplot2.
Most files are about 3Mb in size with around 80,000 lines of data, but some are 12Mb in size, with 318,406 lines of data (Time, Extension, Force).
Time,Extension,Load
(sec),(mm),(N)
"0.00000","0.00000","-4.95665"
"0.00200","0.00000","-4.95677"
"0.00400","0.00000","-4.95691"
"0.10400","-0.00040","-4.95423"
It takes a while to churn through the data and create the pdf file (that's OK), but the PDF file is now nearly 6Mb in size with about 16 graphs in there (in fact 3 graphs which are facet plots using ggplot2).
I understand that the pdf includes a line segment for every data point in my dataset, so as I increase the number of graphs the amount of data in the file increases. However, I don't foresee a requirement to drill down into the pdf document to see that level of detail, and I will have problems emailing it around as it approaches 10Mb.
If I convert pdf to ps using pdf2ps and then go back to pdf with ps2pdf, I get a file about 1/3 of the size of the original pdf, and the quality looks great.
Therefore is there a method from within R/knitR/ggplot2 to reduce the number of points plotted in the pdf images without using an external tool to compress the pdf file ? (or to somehow optimise the pdf generated ?)
Cheers
Pete
You can try changing the graphic device from pdf to png by adding
knitr::opts_chunk$set(dev = 'png')
to your setup chunk.
Or you can add this to your output header
output:
  pdf_document:
    dev: png
Try different devices (png, jpeg). Maybe this will change the size.
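The question also asks about plotting fewer points. One possible approach, which is not part of the answer above, is to thin the data before handing it to ggplot2. The sketch below uses simulated data and a hypothetical helper, thin(), that keeps evenly spaced rows; whether this is acceptable depends on how much detail (e.g. sharp peaks) the plots need to preserve.

```r
set.seed(42)
n   <- 318406
# Simulated stand-in for one of the large files
dat <- data.frame(
  Time = seq(0, by = 0.002, length.out = n),
  Load = cumsum(rnorm(n))
)

# Hypothetical helper: keep at most max_rows evenly spaced rows
thin <- function(d, max_rows = 5000) {
  if (nrow(d) <= max_rows) return(d)
  keep <- round(seq(1, nrow(d), length.out = max_rows))
  d[keep, , drop = FALSE]
}

small <- thin(dat)
nrow(small)  # 5000 rows go to the plot instead of 318406
```

Plotting small instead of dat keeps the pdf device, so lines and text stay sharp, while the number of line segments per panel drops to a few thousand.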

How to save graphs/plots from quartz generated by bio3d (R package)

I'm new to R and bio3d and have been unable to find any answers to my problem. I'm trying to find a way to save graphs/plots generated by bio3d (an R package). So far I have to manually click "Save as" when the graph appears. I have tried many variations of R code to save the graph, which result either in no saved file or in a small file that cannot be opened. Can anyone give me some pointers please?
In an R script you may try with the following lines:
pdf('nameoftheplot.pdf', width=..., height=...)
Then you can write the R-code that generates your plot, and at the end you should add this last line:
dev.off()
Select all the lines and run them with Ctrl+R (Windows) or Cmd+Enter (OS X). The output pdf file with the plot should be located in your current working directory. Hope this works.
Edit: if you want a .png file as an output you have to replace the first line with:
png('nameoftheplot.png', width=..., height=..., res=...)
Edit2: Example:
library(ggplot2)
pdf("firstplot.pdf", width=6, height=3)
print(qplot(carat, data = diamonds, geom = "density"))  # print() so the plot is also drawn when the script is source()d
dev.off()
If you have already created your plot (i.e. it is displayed in the active quartz or x11 window) you can use dev.copy2pdf() and its relatives, e.g.
plot(c(1:10))
dev.copy2pdf(file="example.pdf")
If you want to do this without plotting to the quartz/x11 window then issue a call to png() or pdf() etc before your plot() call and then follow it with a dev.off() call, e.g.
pdf(file="example2.pdf")
plot(c(1:10))
dev.off()
The 'plot.new figure margins too large' error can occur when your plot window is too small to take all the graphic output you are trying to produce. Often making your window bigger will solve this.
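The open-device, plot, dev.off() pattern above can be wrapped in a small helper so the device is always closed, even if the plotting code fails. This is only a sketch: save_png() is a hypothetical name, and the sizes are arbitrary defaults.

```r
# Hypothetical helper: draw whatever plot_fun() draws into a png file
save_png <- function(file, plot_fun, width = 6, height = 4, res = 150) {
  png(file, width = width, height = height, units = "in", res = res)
  on.exit(dev.off())  # close the device even if plot_fun() throws an error
  plot_fun()
  invisible(file)
}

out <- save_png(file.path(tempdir(), "example.png"), function() plot(1:10))
file.exists(out)
```

A file that is tiny and cannot be opened is often a sign that dev.off() was never called, which is what on.exit() guards against here.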

R loop through complete script, find all plots generated and save them

I have the following problem using R and not found a solution so far:
I have a script where I run several operation and generate some plots. At the end, I would like to have a nice piece of code that automatically saves all the plots generated into the current working directory. So far, I am using:
trellis.device(device="png", filename="Plot_A.png")
print(Plot_A)
dev.off()
Which is working fine for just one specific plot. Now I am looking for some kind of for loop that takes all the plots and saves them with the name of the plot as a png file
In grid based plotting packages (lattice and ggplot), you can store the plot in an object and call print on them to trigger actual rendering of the plot. What you could do is not render the image on the spot, but append any plots to a list. Then, at the end, you can loop over the plots and output them.
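library(lattice)
plot_list <- list()
# Store each plot under the name it should be saved as (mtcars is just example data)
plot_list[["Plot_A"]] <- xyplot(mpg ~ wt, data = mtcars)
# Assigning by name keeps the trellis object intact; append() would split it into its components
for (name in names(plot_list)) {
  png(paste0(name, ".png"))
  print(plot_list[[name]])
  dev.off()
}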
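The same idea works for ggplot2 objects, where ggsave() takes care of opening and closing the device. A minimal sketch, assuming the plots are kept in a named list; the plot names and the mtcars data are placeholders.

```r
library(ggplot2)

# Named list: the names become the file names
plots <- list(
  Plot_A = ggplot(mtcars, aes(wt, mpg)) + geom_point(),
  Plot_B = ggplot(mtcars, aes(hp)) + geom_histogram(bins = 10)
)

out_dir <- tempdir()  # swap in getwd() to save into the working directory
files   <- file.path(out_dir, paste0(names(plots), ".png"))

for (i in seq_along(plots)) {
  ggsave(files[i], plot = plots[[i]], width = 6, height = 4, dpi = 150)
}

basename(files)
```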
Not exactly an answer but an alternative workflow.
If you are saving your plots in order to use them somewhere else, for example to include them in a Word document or in a presentation, you could just put your code in an RMarkdown document and knit it to generate an html or doc document with all the output generated by the code, including plots. With RStudio all that can be done with a few clicks.
It may even be easier to take all plots from the Word document than from a folder of png files.

Redirecting R graphs to MS Word

I wonder how to redirect R graphs to MS Word, the way sink() redirects R output to a file (but not the graphs). I tried R2wd but sometimes it doesn't work properly. Any comments and help will be highly appreciated. Thanks
To answer your direct question, the best way to get the results of R scripts and plots into Word is probably via some form of Sweave. Look up odfWeave to send R output to a LibreOffice file that can then be converted to Word, or even opened directly in Word if you have the right plugin.
To create plots that are editable (i.e. you can alter the look of plots, move the legend etc.) I would recommend saving the plot in svg format (scalable vector graphics), which you can then edit using the excellent free vector graphics app Inkscape.
For instance, if I create my ggplot2 graph as an object
library(ggplot2)
dataframe<-data.frame(fac=factor(c(1:4)),data1=rnorm(400,100,sd=15))
dataframe$data2<-dataframe$data1*c(0.25,0.5,0.75,1)
testplot<-qplot(x=fac, y=data2,data=dataframe, colour=fac, geom=c("boxplot", "jitter"))
I can then use the Cairo package, which allows creation of svg files, and edit the result in Inkscape.
library(Cairo)
Cairo(600,600,file="testplot.svg",type="svg",bg="transparent",pointsize=8, units="px",dpi=400)
testplot
dev.off()
Cairo(1200,1200,file="testplot12200.png",type="png",bg="transparent",pointsize=12, units="px",dpi=200)
testplot
dev.off()
For more info read this previous question that has more good answers Create Editable plots from R
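If installing Cairo is a problem, base R also ships an svg() device in grDevices on builds with cairo support. This is a different device from the Cairo package call above, shown only as a sketch with a base-graphics plot and the built-in mtcars data.

```r
svg_file <- file.path(tempdir(), "testplot.svg")

svg(svg_file, width = 6, height = 6)  # size in inches
boxplot(mpg ~ cyl, data = mtcars, xlab = "cyl", ylab = "mpg")
dev.off()

file.exists(svg_file)
```

As far as I know this device draws text as glyph outlines rather than editable text, so it may be less convenient for relabelling in Inkscape.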
Also, you can follow this advice from Hadley, and save the actual ggplot2 object, then load it later and modify it
save(testplot, file = "test-plot.rdata")
# Time passes and you start a new R session
load("test-plot.rdata")
testplot + theme(legend.position = "none")  # opts() has been removed from ggplot2; theme() replaces it
testplot + geom_point()
To get sink()-like behavior with MS Word, look at the wdtxtStart function in the TeachingDemos package. This uses R2wd internally, so you will see similar functionality; it just sends everything you do to the Word document.
Graphs are not sent automatically since you may be adding to them, but once you know you are finished with the graph you can use wdtxtPlot to send the current graph to the Word document.
If you know what you want to do ahead of time then Sweave or something similar is probably the better approach (as has already been mentioned). The group that created RExcel is also working on SWord, which does Sweave-like things within MS Word.
