I am generating tree images with the seqtreedisplay() function from the R package TraMineR, but the default resolution is 72 dpi. I need to create a 300 dpi image. Is it possible to do this within the seqtreedisplay() function call, using something like a "res" argument?
Thanks for your help.
You can control the resolution of the output file produced by the seqtreedisplay function by passing a device.args argument (which will be treated as an element of the ... list).
The device.args argument should be a list of arguments that will be passed to the underlying device (jpeg when image.format="jpg", and png otherwise).
To get a 300 dpi resolution, you need to set res=300, but also to increase the width and height accordingly.
I illustrate with the mvad data:
data(mvad)
## Defining a state sequence object
mvad.seq <- seqdef(mvad[, 17:86])
## Growing a seqtree using Hamming distances:
seqt <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
                data=mvad, R=1000, pval=0.05, seqdist.arg=list(method="HAM"))
## Generating the plot as a 300 dpi image in mytree.jpg
seqtreedisplay(seqt, filename = "mytree.jpg", type="d", border=NA, image.format = "jpg",
device.args=list(width=480*300/72, height=480*300/72, res=300))
Below is my previous answer, which does not work because seqtreedisplay internally first generates the texts and plots in bitmap format before saving them in the requested image.format.
A solution would be to select a vector format (e.g. pdf or eps) for the output of seqtreedisplay and then to convert this vector file into a raster format with the desired resolution.
Assuming you have installed ImageMagick (and Ghostscript, on which ImageMagick relies to convert to/from pdf or eps), you could use the convert.g function of TraMineRextras for this conversion. I illustrate below using the mvad data:
## Drawing the tree as a pdf file and converting into jpeg
seqtreedisplay(seqt, filename = "mytree.pdf", type="d", border=NA, image.format = "pdf")
path <- getwd() ## retrieve the path
convert.g(path = path, fileroot = "mytree", from = "pdf", to = "jpeg",
options = "-units PixelsPerInch -density 300x300")
The resulting jpeg file will be in a jpeg subdirectory of the current folder.
Related
I have just started trying to use pdftools to extract images from pdfs. However, I have found that not all layers are reproduced. For example, in the code below the lines are reproduced in the png but not the points. Obviously in this example I could just save the png directly, but I'm using it to highlight the problem I am having with other data where I don't have the source code/data that created the pdf.
Warning: the code below creates files in the C:\temp directory.
library(tidyverse)
library(pdftools)
set.seed(5)
df <- data.frame(Date = rep(as.Date(1:50, origin = "1990-01-01"),2), value = c(1:50,1:50)+c(rnorm(50),rnorm(50,sd=5)), var = rep(c("a","b"),each = 50))
plt1 <- ggplot(df, aes(x = Date, y = value, colour = var))+
geom_line()+
geom_point()
ggsave(plt1, filename = "C:/temp/testplot.pdf", width = 5, height = 4)
This creates a pdf with points and lines, as expected.
However, when I convert it I do not get the points, only the lines:
pdf_convert("C:/temp/testplot.pdf", format = "png", filenames = "C:/temp/testpng.png")
#> Converting page 1 to C:/temp/testpng.png...
#> PDF error: No display font for 'ArialUnicode'
#> done!
#> [1] "C:/temp/testpng.png"
Created on 2019-11-19 by the reprex package (v0.3.0)
I have also tried using pdftools::pdf_render_page, and image_read_pdf and image_convert from the magick package, with the same results. However, I understand that the magick functions actually use pdftools, so the problem must lie there.
Suggested work-around:
Open the pdf file in Adobe Acrobat.
Select "File" -> "Print" -> "Microsoft Print to PDF" -> "Advanced" -> check in front of "Print As Image" -> "OK" -> "Print"
Then, perform the "pdf_convert" on the new .pdf copy you just created.
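If you script that last step, you can also raise the rendering resolution at conversion time via the dpi argument of pdf_convert (a minimal sketch; the file name of the printed-as-image copy is a hypothetical example):
library(pdftools)
# Convert the "Print As Image" copy at 300 dpi instead of the default 72
pdf_convert("C:/temp/testplot_rasterized.pdf", format = "png",
            filenames = "C:/temp/testpng.png", dpi = 300)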
I am creating an html document by creating various objects with ggplotly() and htmltools functions like h3() and html(). Then I submit them as a list to htmltools::save_html() to create an html file.
I would like to add ggplot charts directly as images, rather than attaching all the plotly bells and whistles. In the end, I will create a self-contained html file (no dependencies), and the plotly stuff would make that file excessively large.
Is there some function that converts a ggplot object into some html-type object? Or do I have to save the ggplot as a .png file, then read the .png file into some object that I add to the list in the save_html() function?
My R code looks something like this:
library("tidyverse")
library("plotly")
library("htmltools")
HTMLOut <- "c:/Users/MrMagoo/My.html"
df <- data.frame(x=1:25, y=c(1:25*1:25))
g7 <- ggplot(df,aes(x=x, y=y)) + geom_point()
p7 <- ggplotly(g7) # I would like to use something other than ggplotly here. Just capturing the ggplot as an image would be fine.
# create other objects to add to the html file
t7 <- h2(id="graph7", "Title for graph #7")
d7 <- p("description of graph 7")
save_html(list(t7, p7, d7), HTMLOut)
# of course, the real code has many more objects in that list – more graphs, text, tables, etc.
I would like to replace the plotly object (p7) with something that just presents g7 in a way that would not cause an error in the save_html function.
I had hoped to find a function that could directly Base64 encode a ggplot object, but it seems that I first need to output the 'ggplot' object as a .png file (or SVG, per Teng L, below), then base64-encode it. I was hoping there was a more direct way, but I may end up doing that, as in https://stackoverflow.com/a/33410766/3799203 , ending it with
g7img <- "<img src=\"data:image/png;base64,(base64encode string)\">"
g7img <- htmltools::HTML(g7img)
If you want to save the plot as a dynamic plotly graph, you could use htmlwidgets::saveWidget. This will produce a stand-alone html file.
Here is a minimal example:
library(tidyverse)
library(plotly)
library(htmlwidgets)
df <- data.frame(x = 1:25, y = c(1:25 * 1:25))
gg <- ggplot(df,aes(x = x, y = y)) + geom_point()
# Save ggplotly as widget in file test.html
saveWidget(ggplotly(gg), file = "test.html")
I ended up generating a temporary image file, then base64 encoding it, within a function I called encodeGraphic() (borrowing code from LukeA's post):
library(ggplot2)
library(RCurl)
library(htmltools)
encodeGraphic <- function(g) {
png(tf1 <- tempfile(fileext = ".png")) # Get an unused filename in the session's temporary directory, and open that file for .png structured output.
print(g) # Output a graphic to the file
dev.off() # Close the file.
txt <- RCurl::base64Encode(readBin(tf1, "raw", file.info(tf1)[1, "size"]), "txt") # Convert the graphic image to a base 64 encoded string.
myImage <- htmltools::HTML(sprintf('<img src="data:image/png;base64,%s">', txt)) # Save the image as a markdown-friendly html object.
return(myImage)
}
HTMLOut <- "~/TEST.html" # Say where to save the html file.
g <- ggplot(mtcars, aes(x=gear,y=mpg,group=factor(am),color=factor(am))) + geom_line() # Create some ggplot graph object
hg <- encodeGraphic(g) # run the function that base64 encodes the graph
forHTML <- list(h1("My header"), p("Lead-in text about the graph"), hg)
save_html(forHTML, HTMLOut) # output it to the html file.
I think what you want may be close to one of the following:
It seems you are creating an HTML report but haven't checked out R Markdown. It comes with Base64 encoding built in: when you create an R Markdown report, pandoc automatically converts any plots into an HTML element within the document, so the report is self-contained.
SVG plots. This is less likely to be what you want, but SVG plots are markup-language based and easily portable. Specify the .svg extension when you use ggsave() and you should get an SVG image; a minimal sketch follows below. Note that SVG is an as-is representation of the plot, so it can be huge in file size if you have thousands of shapes and lines.
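A minimal sketch of the SVG route, assuming the svglite package is installed (ggsave() picks the device from the file extension); the file and object names are just examples:
library(ggplot2)
library(htmltools)
df <- data.frame(x = 1:25, y = (1:25)^2)
g7 <- ggplot(df, aes(x, y)) + geom_point()
ggsave("g7.svg", g7, width = 5, height = 4)  # writes the plot as SVG markup
# Inline the SVG text so the resulting html file stays self-contained
g7svg <- HTML(paste(readLines("g7.svg"), collapse = "\n"))
save_html(list(h2("Title for graph #7"), g7svg), "My.html")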
This is an extension of Maurits Evers' post. In this answer I show how to combine multiple plotly plots in the same html file in an organized fashion:
library("plotly")
library("htmltools")
# a small helper function to avoid repetition
my_func <- function(..., title){
## Description:
## Puts one or more ggplot objects, converted with ggplotly(), under an html heading
##
## Arguments:
## ...: ggplot objects, passed individually
## title: a character string giving the heading text
# get the ... in list format
lst <- list(...)
# create the heading
tmp_title <- htmltools::h1(title)
# convert each ggplot to ggplotly and put them under the same div html tag
tmp_plot <- lapply(lst, ggplotly) |>
htmltools::div()
# return the final object as list
return(list(tmp_title, tmp_plot))
}
# a toy data
df <- data.frame(x = 1:25, y = c(1:25 * 1:25))
# the ggplot object using the toy data
gg <- ggplot(df,aes(x = x, y = y)) + geom_point()
# put everything in order
final_list <- list(my_func(gg, gg, gg, title = "The first heading"),
                   my_func(gg, gg, title = "The second heading"))
# write to disk as a unified HTML file
htmltools::save_html(html = final_list,
                     file = "index.html")
Disclaimer: I specifically did this to avoid using the widgetframe R package and to stay completely on par with the documentation of plotly-r. You can read that link if you are comfortable with adding an extra dependency and an extra abstraction layer. I prefer to use packages only if necessary. :)
I'd like to have an R plot as a matrix of pixels. I can simply save and read it again as a bitmap. But is there a faster way to do this directly without saving the plot first?
library(readbitmap)
tmp = tempfile()
bmp(tmp)
plot(1)
dev.off()
result = read.bitmap(tmp)
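The round trip still goes through a file, but it can be wrapped in a helper that cleans up after itself (a minimal sketch of the same save-and-read approach; plot_to_matrix and its arguments are hypothetical names):
library(readbitmap)
plot_to_matrix <- function(draw, width = 480, height = 480) {
  tmp <- tempfile(fileext = ".bmp")
  on.exit(unlink(tmp))
  bmp(tmp, width = width, height = height)  # open a bitmap device
  draw()                                    # draw the plot into it
  dev.off()                                 # flush the device to disk
  read.bitmap(tmp)                          # read the pixels back
}
m <- plot_to_matrix(function() plot(1))
dim(m)  # height x width x colour channels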
I would like to monitor basic quality measures of the figures produced in R on individual pages, such as the byte size of each page.
At the moment I can only do quality assurance on average page sizes; see the section about that below.
I think there must be something built in for this task that is better than average measures.
The code below produces 4 pages in Rplots.pdf, for which I would like to get the byte size of each page; any other per-page statistics of the output are also welcome.
You can do basic memory monitoring of R objects, but I would like the measures to correspond to the pages of the output PDF.
# https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/plot.html
require(stats) # for lowess, rpois, rnorm
plot(cars)
lines(lowess(cars))
plot(sin, -pi, 2*pi) # see ?plot.function
## Discrete Distribution Plot:
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
main = "rpois(100, lambda = 5)")
## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")
## TODO: summarise here the byte size of each figure (1-4)
# Output: Rplots.pdf with 4 pages; I want to know the size of each page in bytes
I am currently doing the basic quality assurance on the command line, but would like to move some of it to R in order to spot bugs faster.
Expected output: byte size of each page, for instance like the 4th column of ls -l.
To get the byte size of the average individual page in an output document
Limitations
Homogeneity of the data across pages is required: this method only works if the pages are all from the same sample.
Otherwise it is troublesome, because the result is only an average and does not describe the individual pages.
Other possible weaknesses
PDF elements and metadata: this considers the PDF file as a whole instead of focusing on the graphic objects themselves. That limits the use of the absolute value, because the file size also contains headers and other metadata that are not about the graphic objects.
Code
filename <- "main.pdf"
filesize <- file.size(filename)
# http://unix.stackexchange.com/q/331175/16920
pages <- Rpoppler::PDF_info(filename)$Pages
# print page size (= filesize / pages)
pagesize <- filesize / pages
## data of the example file (str() output of filesize, pages and pagesize)
#> num 7350960
#> int 62
#> num 118564
Input: any 62-page document
Output: average individual page size (118564 bytes)
Testing and's answer
Output (note that you cannot easily change the input to an arbitrary existing PDF file):
files size_bytes
[1,] "./test_page_size_pdf/page01.pdf" "4,123,942"
[2,] "./test_page_size_pdf/page02.pdf" " 4,971"
[3,] "./test_page_size_pdf/page03.pdf" " 4,672"
[4,] "./test_page_size_pdf/page04.pdf" " 5,370"
Input: any 64-page document
Expected output: 67 (= 64 + 3) pages analysed, not 4
R: 3.3.2
OS: Debian 8.5
Download and install the pdftk utility if it is not already on your system, and then try one of the following alternatives from within R.
1) This bursts the document into one file per page and returns a data frame with the per-page file sizes in bytes and other information.
myfile <- "Rplots.pdf"
system(paste("pdftk", myfile, "burst"))
file.info(Sys.glob("pg_*.pdf"))
It will also generate a file doc_data.txt with some miscellaneous information that may or may not be of interest.
1a) This alternative will not generate any files. It simply returns the byte sizes of the pages as a numeric vector.
myfile <- "Rplots.pdf"
pages <- as.numeric(read.dcf(pipe(paste("pdftk", myfile, "dump_data")))[, "NumberOfPages"])
cmds <- sprintf("pdftk %s cat %d output - | wc -c", myfile, seq_len(pages))
unname(sapply(cmds, function(cmd) scan(pipe(cmd), quiet = TRUE)))
The above should work if pdftk and wc are on your path. Note that on Windows you can find wc in the Rtools distribution; it is typically at "C:\\Rtools\\bin\\wc" once Rtools is installed.
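For example, the pipeline in (1a) can reference wc explicitly on Windows (a sketch; the path is an assumption, adjust it to your Rtools installation):
wc <- shQuote("C:\\Rtools\\bin\\wc")  # hypothetical Rtools location
cmds <- sprintf("pdftk %s cat %d output - | %s -c", myfile, seq_len(pages), wc)
unname(sapply(cmds, function(cmd) scan(pipe(cmd), quiet = TRUE)))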
2) This alternative is similar to (1) but uses the animation package:
library(animation)
ani.options(pdftk = "/path/to/pdftk")
pdftk("Rplots.pdf", "burst", "pg_%04d.pdf", "")
file.info(Sys.glob("pg_*.pdf"))
To measure the size of each page in a pdf-file I suggest this:
test_size <- TRUE
pdf_name <- "masterpiece"
if(test_size){
dir.create("test_page_size_pdf")
pdf_address <- paste0("./test_page_size_pdf/page%02d.pdf")
} else { pdf_address <- paste0("./", pdf_name, ".pdf")}
pdf(pdf_address, width=10, height=6, onefile=!test_size)
par(mar=c(1,1,1,1), oma=c(1,1,1,1))
plot(rnorm(10^6, 100, 5), type="l")
plot(sin, -pi, 2*pi)
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
main = "rpois(100, lambda = 5)")
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")
dev.off()
if(test_size){
files <- paste0("./test_page_size_pdf/", list.files("./test_page_size_pdf/"))
size_bytes <- format(file.size(files), big.mark = ",")
file.remove(files)
unlink("test_page_size_pdf", recursive = TRUE)  # file.remove() cannot reliably remove a directory
cbind(files, size_bytes)
}
The size of a pdf page in R depends on three things: the content of the plot(), the options used in the pdf() function, and the plotting options, which are defined here in par().
All this is difficult to estimate. You also mention that you would like something similar to the shell command ls, which runs on files as well. So in this solution I create a temporary folder with dir.create(), in which every page of the pdf is saved as a separate file. This is implemented with the onefile option of pdf(). When the plotting is finished, every pdf page file as well as the temporary folder is deleted, and you can see the result in the console.
If you are finished with the testing and want the result in a single file, you just have to set the variable in the first line of this script to test_size <- FALSE. By the way, I have some doubt that the size of a page is a proxy for the quality of an image. PDF is a vector format, so the size corresponds to the number of plotted elements: see the size of the first page in my example, where I plot one million points.
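To see this effect, compare two one-page files that differ only in the number of plotted points (a minimal sketch with hypothetical file names):
pdf("few_points.pdf");  plot(rnorm(10));  dev.off()
pdf("many_points.pdf"); plot(rnorm(1e5)); dev.off()
file.size(c("few_points.pdf", "many_points.pdf"))  # the second file is far larger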
I have read that when tikz receives a raster image, the image is stored as a png; tikz then produces the rest of the graph around it and includes the raster image back in the final tex file.
Now I have the following:
pic <- TRUE
if(pic)
{
tikz(file=paste(plotpath,"Rohdaten_S1_S2_D21.tex",sep=""),width=width,height=height,engine="pdftex")
#png(filename=paste(plotpath,"Rohdaten_S1_S2_D6.png",sep=""),width=width,height=height,res=res,units="in")
par(mfrow=c(2,1),mar=c(1.1,3,2,0),mgp=c(1.5,0.5,0),ps=f.size,cex=1,xaxt="n")
}
if(!pic) par(mfrow=c(2,1),mar=c(1,4,3,0))
for(i in 1:2)
{
x <- sensors[[i]]$time
y <- sensors[[i]]$depth
z <- sensors[[i]]$velo
image(x,y,z)
# plot.image(x,y,z
# ,xlim=c(max(x)-400,max(x)),zlim=2*c(-1,1)
# ,xlab="",ylab="$d/\\mathrm{m}$",zlab="$v/(\\mathrm{mm/s})$"
# ,z.adj=c(0,0),ndz=5,z.cex=1
# )
abline(v=(1:10)/0.026+par("usr")[1],lty=2)
if(!pic) abline(h=(1:floor(max(y/0.02)))*0.02)
mtext(text=paste("Sensor",i),side=3,line=0.1,adj=0)
par(mar=c(3,3,0.1,0),xaxt="s")
}
title(xlab="t/s")
if(pic) dev.off()
Even the simple image() function will produce a .tex file around 100 MB in size.
No png is produced; everything ends up in the .tex file?!
What am I doing wrong? Is there a switch to be set TRUE? What do I have to do to keep the raster image separate from the nice-looking text?
Thank you for your help.
The solution is quite simple, but not obvious.
The image() function in R produces vector graphics in the first instance. There is a switch, image(..., useRaster = TRUE), with which one can force the image() function to produce raster graphics.
The image() function expects a regular grid (quadratic pixels); otherwise an error occurs.
How to get a regular grid?
Suppose you have an image with coordinates x[], y[] and the scalar matrix z[,]. Then the resampled regular grid can be calculated as:
## xlim, ylim and dim.max come from the author's setup (axis ranges, target raster dimensions)
x.new <- seq(min(xlim), max(xlim), length.out = dim.max[1])
y.new <- seq(min(ylim), max(ylim), length.out = dim.max[2])
## resample the columns of z onto the regular x grid, then the rows onto the regular y grid
z <- apply(z, 2, function(y, x, xout) approx(x, y, xout = xout + min(diff(x))/2, method = "constant", rule = 2)$y, x, x.new)
z <- t(apply(z, 1, function(y, x, xout) approx(x, y, xout = xout + min(diff(x))/2, method = "constant", rule = 2)$y, y, y.new))
tikz(file = 'a.tex', width = 2, height = 2)
image(x.new, y.new, z, useRaster = TRUE)
dev.off()
The important things are the method = "constant" and rule = 2 arguments in the approx() function. These enable the "shifting" onto the regular grid.
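A small illustration of what that approx() call does (a sketch with made-up values):
x    <- c(0, 1, 3, 6)      # irregular grid
y    <- c(10, 20, 30, 40)  # values observed at x
xout <- 0:6                # regular grid
# method = "constant" yields a step function; rule = 2 extends the
# boundary values instead of returning NA outside the range of x
approx(x, y, xout = xout, method = "constant", rule = 2)$y
#> [1] 10 20 20 30 30 30 40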
Applying all this, tikz() will split the picture into an a.tex file and an a_ras1.png file.
I hope this will help somebody programming R and using tikzDevice to produce pictures for tex documents.