fastest way to create xlsx with plots R - r

I have list of data and list of plots which I want two write to xlsx file (each element to separate sheet). Example data:
require(ggplot2)
require(data.table)
n <- 10
N <- 100
dtList <- lapply(1:n, function(x) data.table(sample(1e6, N), 1:N))
names(dtList) <- 1:n
plots <- lapply(dtList, function(x) ggplot(x, aes(y = V1, x = V2)) + geom_line())
Currently I use openxlsx, but it is quite slow for multiple plots:
require(openxlsx)
wb <- createWorkbook()
modifyBaseFont(wb, fontSize = 10)
writeXlsx <- function(x, sName) {
addWorksheet(wb, sName, gridLines = FALSE)
writeData(wb, sName, x = x, xy = c(1, 1))
print(plots[[sName]])
insertPlot(wb, sName, width = 19, height = 9, dpi = 200, units = "cm",
startRow = 2, startCol = 5)
}
system.time(
sapply(seq_along(dtList), function(x) {
writeXlsx(dtList[[x]], names(dtList)[[x]])
})
) # ~ 17.00 sek
openXL(wb)
How could I increase speed of this? Is there a better package to accomplish this?

One options is to use simpler graphics. For example, changing plots to base graphics, like:
plots <- lapply(dtList, function(x) plot(x$V2, x$V1, type = 'l'))
reduces the xlsx creation time to ~0.72 seconds vs ~7.78 seconds (original code works now faster than before), that is around 10 times faster.
When ggplot graphics are needed, I modified insertPlot function to accept this type of object and save it to file without needing to print in R session (using ggsave):
insertggPlot <- function(wb, sheet, width = 6, height = 4, xy = NULL,
startRow = 1, startCol = 1, fileType = "png",
units = "in", dpi = 300, PLOT) {
od <- getOption("OutDec")
options(OutDec = ".")
on.exit(expr = options(OutDec = od), add = TRUE)
if (!"Workbook" %in% class(wb)) stop("First argument must be a Workbook.")
if (!is.null(xy)) {
startCol <- xy[[1]]
startRow <- xy[[2]]
}
fileType <- tolower(fileType)
units <- tolower(units)
if (fileType == "jpg") fileType = "jpeg"
if (!fileType %in% c("png", "jpeg", "tiff", "bmp"))
stop("Invalid file type.\nfileType must be one of: png, jpeg, tiff, bmp")
if (!units %in% c("cm", "in", "px"))
stop("Invalid units.\nunits must be one of: cm, in, px")
fileName <- tempfile(pattern = "figureImage",
fileext = paste0(".", fileType))
ggsave(plot = PLOT, filename = fileName, width = width, height = height,
units = units, dpi = dpi)
insertImage(wb = wb, sheet = sheet, file = fileName, width = width,
height = height, startRow = startRow, startCol = startCol,
units = units, dpi = dpi)
}
Using this, reduces time to ~2 sek.

Related

How can I add a baseplot with no fixed values to a document

I want to add a baseplot to a word document.
In the documentation for the officer package there's an example that uses the plot_instr function:
anyplot <- plot_instr(code = {
barplot(1:5, col = 2:6)
})
doc <- read_docx()
doc <- body_add(doc, anyplot, width = 5, height = 4)
print(doc, target = tempfile(fileext = ".docx"))
I want to add a plot to a word document inside a function so I need variable input for the plot function like this:
x=1:5
cols=2:6
anyplot <- plot_instr(code = {
barplot(x,col=cols)
})
doc <- read_docx()
doc <- body_add(doc, anyplot, width = 5, height = 4)
print(doc, target = tempfile(fileext = ".docx"))
But the code above doesn't work and I can't find any other examples of plot_instr usage.
I think I found a solution!
When I set a breakpoint before the barplot(...) call, I could see the code when body_add is called with a plot_instr wrapper function. The code creates a temporary png file. I copied this code and adapted it:
x=1:5
cols=2:6
doc <- read_docx()
file <- tempfile(fileext = ".png")
options(bitmapType = "cairo")
png(filename = file, width = 5, height = 5, units = "in", res = 300)
tryCatch({
barplot(x,col=cols)
}, finally = {
dev.off()
})
on.exit(unlink(file))
value <- external_img(src = file, width = 5, height = 5)
body_add(doc, value)
print(doc, target = tempfile(fileext = ".docx"))
The code is generating the following error
Error in barplot.default(x, col = cols) :
'height' must be a vector or a matrix
The issue here is that function body_add has an argument x and you are defining an x before the call. Changing the name to something else solves the issue:
z <- 1:5
cols=2:6
anyplot <- plot_instr(code = {
barplot(z, col = cols)
})
doc <- read_docx()
doc <- body_add(doc, anyplot, width = 5, height = 4)
print(doc, target = "temp.docx")

saving and naming files in R automatically based on input filename

I have generated several Utilisation Distributions (UD) with AdehabitatHR and stored them as Geotiffs. I am now using the same UDs with the Lattice package to generate some maps and saving them to a high-res tiff image with LZW compression. Problem is that I have literally hundreds of maps to make, save and name. Is there a way automatically do this once i have loaded all the necessary files from a directory? Each one of my UDs has the following structure of the filename "UD_resolution_species_area_year_season. tif" and in the final name I give to my map I would like to keep the same structure (or entire filename) but add the prefix "blablabla_" e.g. "blablabla_UD_resolution_species_area_year_season.tiff". The image also include a main name, a capital letter, which should also change.
At the moment I am using the following:
rlist = list.files(getwd(), pattern = "tif$", full.names = FALSE)
for (i in rlist) {
assign(unlist(strsplit(i, "[.]"))[1], raster(i))
}
shplist = list.files(getwd(), pattern = "shp$", full.names = FALSE)
for (i in shplist) {
assign(unlist(strsplit(i, "[.]"))[1], readOGR(i))
}
UD <- 'UD_resolution_species_area_year_season'
ext <- extent(UD) + 0.3 # set the extent for the plot
aa <-
quantile(UD,
probs = c(0.25, 0.75),
type = 8,
names = TRUE)
my.at <- c(aa[1], aa[2])
my.at <- round(my.at, 3)
maxval <- maxValue(UD)
tiff(
"C:/myworkingdirectory/maps/blablabla_UD_resolution_species_area_year_season.tiff",
res = 600,
compression = "lzw",
width = 15,
height = 15,
units = "cm"
)
levelplot(
UD,
xlab = "",
ylab = "",
xlim = c(ext[1], ext[2]),
ylim = c(ext[3], ext[4]),
margin = FALSE,
contour = FALSE,
col.regions = viridis(1000),
colorkey = list(at = seq(0, maxval)),
main = "A",
maxpixels = 2e5
) + latticeExtra::layer(sp.polygons(Land, fill = "grey50", col = NA)) + contourplot(
`UD`,
at = my.at[1],
labels = FALSE,
margin = FALSE,
lty = 2,
col = "orange",
pretty = TRUE
) + contourplot(
UD,
at = my.at[2],
labels = FALSE,
margin = FALSE,
lty = 2,
col = "red",
pretty = TRUE,
)
dev.off()
It is a common beginners mistake to use assign. Do not use it, it creates the type of trouble you are now facing. In stead, you can make lists and/or use a loop.
Also what you are asking is basic R stuff, but you are complicating the question with adding lots of irrelevant detail about setting the extent, and levelplot. It is better to learn about doing these basic things by removing the clutter and focus on a simple case first. That is also how you should write questions for this forum.
In essence you have a bunch of files you want to process. Below I show how you can loop over a vector of the names and then loop and do what you need to do in that loop.
library(raster)
rastfiles <- list.files(pattern = "tif$", full.names=TRUE)
outputfiles <- file.path("output/path", paste0("prefix_", basename(rastfiles)))
for (i in 1:length(rastfiles))
r <- raster(rastfiles[i])
png(outputfiles[i])
plot(r)
dev.off()
}
You can also first read all the files into a list
rastfiles <- list.files(pattern = "tif$", full.names=TRUE)
rlist <- lapply(rastfiles, raster)
names(rlist) <- gsub(".tif$", "", basename(rastfiles))
rastfiles <- list.files(pattern = "shp$", full.names=TRUE)
slist <- lapply(shpfiles, readOGR)
names(slist) <- gsub(".shp$", "", basename(shpfiles))
And perhaps create a vector of output filenames
outputtif <- file.path("output/dir", basename(rastfiles))
And then loop over the items in the list, or the output filenames

Function input used as string

saving_ggplot <- function(name = 'default', plotname = last_plot()) {
image_name = paste(name, ".png", sep="")
ggsave(image_name, plot = plotname,
scale = 1,
dpi = 300, limitsize = TRUE)
}
This is my function which saves a ggplot. However, I for the life of me cannot figure out how to take the name argument as a string.
for example if someone runes saving_ggplot(FILENAME, PLOTNAME)
it will just say no object FILENAME. In python I can just capture it and use it as str(), but using as.character or toString in R still doesn't work.
Error:
saving_ggplot(weightvsageTEST, weightvsageplot)
Error in paste(name, ".png", sep = "") :
object 'weightvsageTEST' not found
Successful call using ggsave:
ggsave('weightvsage.png', plot = last_plot(),
scale = 1,
dpi = 300, limitsize = TRUE)
You can use substitute():
saving_ggplot <- function(name, plotname) {
image_name = paste0(substitute(name), ".png") # paste0 removes need for sep arg
ggsave(image_name, plot = plotname,
scale = 1,
dpi = 300, limitsize = TRUE)
}
saving_ggplot(foo, p) # saves foo.png
Alternately, if you want to stay within tidyverse quasiquotation syntax, use enexpr() instead:
enexpr(name) # instead of substitute(name)
Data:
N <- 100
df <- data.frame(x=rnorm(n=N), y=rnorm(n=N))
p <- ggplot(df, aes(x,y)) + geom_smooth()

Saving multiply pdf plots r

I have made a loop for making multiply plots, however i have no way of saving them, my code looks like this:
#----------------------------------------------------------------------------------------#
# RING data: Mikkel
#----------------------------------------------------------------------------------------#
# Set working directory
setwd()
#### Read data & Converting factors ####
dat <- read.table("Complete RING.txt", header =TRUE)
str(dat)
dat$Vial <- as.factor(dat$Vial)
dat$Line <- as.factor(dat$Line)
dat$Fly <- as.factor(dat$Fly)
dat$Temp <- as.factor(dat$Temp)
str(dat)
datSUM <- summaryBy(X0.5_sec+X1_sec+X1.5_sec+X2_sec+X2.5_sec+X3_sec~Vial_nr+Concentration+Sex+Line+Vial+Temp,data=dat, FUN=sum)
fl<-levels(datSUM$Line)
colors = c("#e41a1c", "#377eb8", "#4daf4a", "#984ea3")
meltet <- melt(datSUM, id=c("Concentration","Sex","Line","Vial", "Temp", "Vial_nr"))
levels(meltet$variable) <- c('0,5 sec', '1 sec', '1,5 sec', '2 sec', '2,5 sec', '3 sec')
meltet20 <- subset(meltet, Line=="20")
meltet20$variable <- as.factor(meltet20$variable)
AllConcentrations <- levels(meltet20$Concentration)
for (i in AllConcentrations) {
meltet.i <- meltet20[meltet20$Concentration ==i,]
quartz()
print(dotplot(value~variable|Temp, group=Sex, data = meltet.i ,xlab="Time", ylab="Total height pr vial [mm above buttom]", main=paste('Line 20 concentration ', meltet.i$Concentration[1]),
key = list(points = list(col = colors[1:2], pch = c(1, 2)),
text = list(c("Female", "Male")),
space = "top"), col = colors, pch =c(1, 2))) }
I have tried with the quartz.save function, but that just overwrites the files. Im using a mac if that makes any difference.
When I want to save multiple plots in a loop I tend to do something like...
for(i in AllConcentrations){
meltet.i <- meltet20[meltet20$Concentration ==i,]
pdf(paste("my_filename", i, ".pdf", sep = ""))
dotplot(value~variable|Temp, group=Sex, data = meltet.i ,xlab="Time", ylab="Total height pr vial [mm above buttom]", main=paste('Line 20 concentration ', meltet.i$Concentration[1]),
key = list(points = list(col = colors[1:2], pch = c(1, 2)),
text = list(c("Female", "Male")),
space = "top"), col = colors, pch =c(1, 2))
dev.off()
}
This will create a pdf file for every level in AllConcentrations and save it in your working directory. It will paste together my_filename, the number of the iteration i, and then .pdf together to make each file unique. Of course, you will want to adjust height and width in the pdf function.

Problems saving several pdf files in R

I'm trying to save several xyplots created with a "for" loop in R and I'm not able to get complete pdf files (all files have the same size and I can't open them) if I execute the following loop:
for (i in 1:length(gases.names)) {
# Set ylim;
r_y <- round(range(ratio.cal[,i][ratio.cal[,i]<999], na.rm = T), digits = 1);
r_y <- c(r_y[1]-0.1, r_y[2]+0.1);
outputfile <- paste (path, "/cal_ratio_",gases.names[i], ".pdf", sep="");
dev.new();
xyplot(ratio.cal[,i] ~ data.GC.all$data.time, groups = data.vial, panel =
panel.superpose, xlab = "Date", ylab = gases.names[i], xaxt="n", ylim = r_y);
savePlot(filename = outputfile, type = 'pdf', device = dev.cur());
dev.off();
}
(a previous version was using trellis.device() instead of dev.new() + savePlot())
Do you know why I can't get good pdf files? If I do it "manually" it works... Any idea?
use pdf directly
for (i in seq_along(gases.names)) {
# Set ylim
r_y <- round(range(ratio.cal[,i][ratio.cal[,i]<999], na.rm = T), digits = 1)
r_y <- c(r_y[1]-0.1, r_y[2]+0.1)
outputfile <- paste (path, "/cal_ratio_",gases.names[i], ".pdf", sep="")
pdf(file = outputfile, width = 7, height = 7)
print(xyplot(ratio.cal[,i] ~ data.GC.all$data.time, groups = data.vial,
panel = panel.superpose, xlab = "Date", ylab = gases.names[i],
xaxt="n", ylim = r_y))
dev.off()
}

Resources