Attach date to PDF generated with Sweave - r

I generate a daily report with Sweave. I would like to append the current date, in the format YYYYMMDD, to the PDF's file name. I am using the following code to generate the file:
rnwfile <- system.file("Sweave", "Margin.Rnw", package = "utils")
Sweave(rnwfile)
tools::texi2pdf("Margin.tex")
Margin.Rnw is my master copy of the report I want to generate (mixing LaTeX with R code). The output I get is the file Margin.pdf. I would like instead to have a file named *Margin_YYYYMMDD.pdf*.
I would appreciate any advice.

See the output argument to ?RweaveLatex.
This is untested but should (?) work:
rnwfile <- system.file("Sweave", "Margin.Rnw", package = "utils")
outfn <- paste0("Margin_",format(Sys.time(),"%Y%m%d"),".tex")
Sweave(rnwfile,output=outfn)
tools::texi2pdf(outfn)
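The file-name construction used above can be checked on its own before running Sweave. This is only a minimal sketch: the helper name stamped_name and its arguments are hypothetical and not part of Sweave or tools. In the question, Margin.tex produced Margin.pdf, so a .tex file named this way should give a PDF with the same date-stamped base name.

```r
# Hypothetical helper: build "<stem>_YYYYMMDD.<ext>" for a given date
stamped_name <- function(stem, ext, when = Sys.Date()) {
  paste0(stem, "_", format(when, "%Y%m%d"), ".", ext)
}

# A fixed date makes the result easy to verify by eye
stamped_name("Margin", "tex", as.Date("2024-03-05"))

# With the default argument, today's date is used
stamped_name("Margin", "tex")
```

The result of stamped_name("Margin", "tex") could then be passed as the output argument to Sweave() and on to tools::texi2pdf(), as in the answer.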

Related

Saving pptx as pdf in R

I have created PowerPoint files using the officer package and I would also like to save them as PDF from R (I don't want to manually open each file and save it as PDF). Is this possible?
You can save the edited PowerPoint object using the code posted here: create pdf in addition to word docx using officer.
You will first need to install pdftools and LibreOffice.
library(pdftools)
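# Convert a .docx/.pptx file to PDF with headless LibreOffice (macOS path shown)
office_shot <- function(file, wd = getwd()) {
  # shQuote() protects paths that contain spaces
  cmd_ <- sprintf(
    "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to pdf --outdir %s %s",
    shQuote(wd), shQuote(file))
  system(cmd_)
  # Return the name of the PDF written into wd
  pdf_file <- gsub("\\.(docx|pptx)$", ".pdf", basename(file))
  pdf_file
}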
office_shot(file = "your_presentation.pptx")
Note that the author of the officer package is the one who referred someone to this response.
Note that the answer from Corey Pembleton uses the macOS path to LibreOffice (which I personally didn't notice at first). The Windows path would be something like "C:/Program Files/LibreOffice/program/soffice.exe".
Since Corey's initial answer, an example using docxtractr::convert_to_pdf can now be found here.
The package and function are the ones John M mentioned in a comment on Corey's initial answer.
An easy solution to this question is to use the convert_to_pdf function from the docxtractr package. Note: this solution requires downloading LibreOffice from here. I used the following steps.
First, I set the path to LibreOffice's soffice.exe.
library(docxtractr)
set_libreoffice_path("C:/Program Files/LibreOffice/program/soffice.exe")
Second, I set the path of the PowerPoint document I want to convert to pdf.
pptx_path <- "G:/My Drive/Courses/Aysem/Certifications/September17_Part2.pptx"
Third, I convert it using the convert_to_pdf function.
pdf <- convert_to_pdf(pptx_path, pdf_file = tempfile(fileext = ".pdf"))
Be careful here. The converted pdf file is saved in a local temporary folder. Here is mine to give you an idea. Just go and copy it from the temporary folder.
"C:\\Users\\MEHMET~1\\AppData\\Local\\Temp\\RtmpqAaudc\\file3eec51d77d18.pdf"
EDIT: A quick way to control where the converted PDF is saved: replace the third step with the following line of code. You can set whatever path you want, so you don't need to look for the weird local temp folder.
pdf <- convert_to_pdf(pptx_path, pdf_file = sub("[.]pptx$", ".pdf", pptx_path))
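The output path can also be derived without a hand-written regular expression. The sketch below uses tools::file_path_sans_ext() from base R; the helper name pdf_path_for and the example path are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: swap a file's extension for ".pdf", keeping its folder
pdf_path_for <- function(path) {
  paste0(tools::file_path_sans_ext(path), ".pdf")
}

pdf_path_for("C:/slides/September17_Part2.pptx")
```

The result could be passed as the pdf_file argument of convert_to_pdf(), so the PDF lands next to the original presentation.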

R language Amelia specify prefix of output files

This R statement uses the Amelia package to create output data files containing imputed data:
ds.im <- amelia(ds, m=5, p2s=2)
The names of the 5 output files are: output1.csv to output5.csv
In the Amelia package, is there a way to specify the prefix of the output files to something more meaningful? For example, boat_impute1.csv to boat_impute5.csv
I could not locate such a command in the amelia documentation (https://cran.r-project.org/web/packages/Amelia/vignettes/amelia.pdf)
Thanks.
My question was not a good one.
The output files are not written by the amelia command. Rather, they are written by the write.amelia command. What I called the prefix is set with the file.stem argument, such as:
write.amelia(ds.im, file.stem = "boat_impute", format = "csv")
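Going by the naming in the question (output1.csv to output5.csv), the imputation number and the extension appear to be appended to file.stem. The sketch below lists the file names to expect, using base R only. The variables stem and m are illustrative, and the naming pattern is an assumption to check against your installed version of Amelia.

```r
stem <- "boat_impute"  # value passed as file.stem
m <- 5                 # number of imputations requested in amelia()

# Assumed pattern: <file.stem><imputation number>.csv
expected_files <- paste0(stem, seq_len(m), ".csv")
expected_files

# After write.amelia() has run, this reports which of the files exist
file.exists(expected_files)
```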

How to scrape a downloaded PDF file with R

I’ve recently gotten into scraping (and programming in general) for my internship, and I came across PDF scraping. Every time I try to read a scanned pdf with R, I can never get it to work. I’ve tried using the file.choose() function to no avail. Do I need to change my directory, or how can I get the pdf from my files into R?
The code looks something like this:
> library(pdftools)
> text=pdf_text("C:/Users/myname/Documents/renewalscan.pdf")
> text
[1] ""
Also, using pdftables leads me here:
> library(pdftables)
> convert_pdf("C:/Users/myname/Documents/renewalscan.pdf","my.csv")
Error in get_content(input_file, format, api_key) :
Bad Request (HTTP 400).
You should use the packages pdftools and pdftables.
If you are trying to read the text inside the PDF, use the pdf_text() function. Its argument is the path to the PDF (on your computer or on the web). For example:
tt = pdf_text("C:/Users/Smith/Documents/my_file.pdf")
It would be nice if you were more specific and also gave us a reproducible example.
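One thing worth checking with a scanned PDF: pdf_text() returns only text that is embedded in the file, so a scan that is purely an image typically yields empty strings, like the "" in the question. The helper below is a minimal sketch with a hypothetical name, has_text_layer; it only inspects a character vector such as the one pdf_text() returns. If it reports FALSE, OCR is probably needed; pdftools provides pdf_ocr_text() for that, which relies on the tesseract package being installed.

```r
# Hypothetical helper: TRUE if any page contains non-whitespace text
has_text_layer <- function(pages) {
  any(nzchar(trimws(pages)))
}

# The result shown in the question: a single empty page
has_text_layer("")

# A document where at least one page carries embedded text
has_text_layer(c("", "Renewal notice\n"))
```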
To use the PDFTables R package, you need to run the following command:
convert_pdf('test/index.pdf', output_file = NULL, format = "xlsx-single", message = TRUE, api_key = "insert_API_key")
If you are looking to get tabular data, you might try tabulizer. Here is a full code tutorial: https://www.business-science.io/code-tools/2019/09/23/tabulizer-pdf-scraping.html
Basically, you can use this code from the tutorial:
library(tabulizer)
extract_tables(
  file = "2019-09-23-tabulizer/endangered_species.pdf",
  method = "decide",
  output = "data.frame")

LIWC2015 import in r

I use LIWC2015 as a student.
I would like to use it with R.
I found the package LIWCalike, which makes it possible to use the LIWC dictionary.
I have installed the dictionary on my computer.
However, I can't find which file I should include in my path in order to use it. There is the executable version and a jar file, and I extracted the dictionaries, but they are only available in PDF format.
What file should I use from the LIWC2015 dictionary in order to use it in R?
This example code is from the package, but I don't have a .cat file:
liwc2007dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2007.cat",
format = "wordstat")
tail(liwc2007dict, 1)
You need to change the file and the format argument in the R code.
Try this:
liwc2015dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2015_English_Flat.dic",
format = "LIWC")
It's documented here.

Export PMML to a text file?

Simple question, I have stored PMML code of an R object using pmmlcode <- pmml(my.object), and I would like some way to save it directly to a text file. The usual write.table method isn't working because the data is not a table.
You can simply use saveXML, as in the example below:
library(randomForest)
library(pmml)
data(airquality)
ozone.out <- randomForest(Ozone ~ Wind+Temp+Month, data=na.omit(airquality), ntree=200)
saveXML(pmml(ozone.out, data=airquality), "airquality_rf.pmml")
Try toString.XMLNode from the XML package and then write to a file with writeLines. You'll need to provide example data for a more complete answer.
I am using the iris data just to generate a dummy PMML file, and the sink command to write the PMML output to a .pmml file:
R > library(pmml)
R > lml <- lm(iris$Sepal.Length~iris$Sepal.Width)
R > sink("myPmml.pmml")
R > cat("<?xml version=\"1.0\"?>\n")
R > pmml(lml)
R > sink()
The output myPmml.pmml should be saved in your current working directory (check it with getwd(); it may be set with setwd() in your .Rprofile). On Windows the default is usually "My Documents". Of course, this will work even if you put .txt instead of .pmml in the sink() command, something like:
sink("mypmml.txt")
Edit: Added cat command to put xml tags on top, Thanks to J.Dimeo
I have no test code to create this, but after solving my earlier problem with the availability of the pmml package on the UCLA CRAN mirror, this produces acceptable output for human readability, although not in a format that will be interpretable by a PMML-aware application:
cat(paste(unlist(pmmlcode),"\n"), file="yourfile.txt")
Neither of these worked:
If it's just a character vector:
cat(pmmlcode, file="yourfile.txt")
Or if it's a list:
lapply(pmmlcode, cat, file="yourfile.txt", append=TRUE)
