RMarkdown dynamic hyperlinks - r

I have lately been busy writing an RMarkdown document that renders a PDF report with information about genetic variants. The readers for whom the reports are intended asked whether I could embed a link to the UCSC Genome Browser so they can view each variant in context. I'm looping through the variants in a table and printing the annotations like this:
for (i in seq_len(nrow(table))) { cat("Region:") }
I know that if I do this:
for (i in seq_len(nrow(table))) { cat("[Region:](www.somewebsite.com)") }
I can get the word "Region" to become a clickable hyperlink in the final PDF. My problem is that the link should be dynamic (i.e. the link should take the genomic coordinates for that particular variant). I've been looking for hours but can't find a way to do that. Is that possible in R/RMarkdown or should I change my strategy?
Thanks!
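One way to make the link dynamic is to build the Markdown string from the table columns with sprintf() inside the loop. The following is a minimal sketch, not a definitive solution: the variants data frame and its column names (chrom, start, end), the ucsc_link() helper, and the hg38 assembly are all assumptions made for illustration. The URL follows the UCSC Genome Browser's hgTracks pattern with db and position query parameters. The chunk that runs the loop needs the option results = "asis" so that the output of cat() is treated as Markdown.

```r
# Hypothetical variant table; the column names are assumptions.
variants <- data.frame(
  chrom = c("chr1", "chr7"),
  start = c(1000000L, 55019017L),
  end   = c(1000100L, 55019117L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a Markdown link to the UCSC Genome Browser
# for one region (vectorised over its arguments).
ucsc_link <- function(chrom, start, end, db = "hg38") {
  region <- sprintf("%s:%d-%d", chrom, as.integer(start), as.integer(end))
  sprintf(
    "[Region: %s](https://genome.ucsc.edu/cgi-bin/hgTracks?db=%s&position=%s)",
    region, db, region
  )
}

# Inside a chunk declared with results = "asis":
for (i in seq_len(nrow(variants))) {
  cat(ucsc_link(variants$chrom[i], variants$start[i], variants$end[i]), "\n\n")
}
```

Each iteration prints one link followed by a blank line, so every variant ends up in its own paragraph in the rendered PDF.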

Related

Automatically knit two versions of an Rmarkdown PDF file, one showing and one hiding all code

I am working on a PDF report that includes many analyses, figures and tables, and that needs to be frequently updated. This is read both by people who are interested only in the output (no code), but also by people who want to see the code.
I know I can hide all chunks globally putting knitr::opts_chunk$set(echo=FALSE) at the top of the doc. However, I was wondering whether there is a way to automatically knit two versions of a report, one showing all code chunks, and one hiding them?
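One possible approach is a parameterised report rendered twice with rmarkdown::render(). The sketch below assumes that the Rmd file declares a show_code parameter in its YAML header (params: show_code: true) and that its setup chunk calls knitr::opts_chunk$set(echo = params$show_code). The parameter name, the helper names output_name() and render_both(), and the output file suffixes are hypothetical.

```r
# Hypothetical helper: derive the output file name for one version.
output_name <- function(input, show_code) {
  stem <- tools::file_path_sans_ext(basename(input))
  suffix <- if (show_code) "_with_code" else "_no_code"
  paste0(stem, suffix, ".pdf")
}

# Render the same Rmd twice, once showing and once hiding the code.
# Assumes the Rmd declares `params: show_code` and sets
# knitr::opts_chunk$set(echo = params$show_code) in its setup chunk.
render_both <- function(input) {
  for (show in c(TRUE, FALSE)) {
    rmarkdown::render(
      input,
      params      = list(show_code = show),
      output_file = output_name(input, show),
      envir       = new.env()  # fresh environment for each render
    )
  }
  invisible(NULL)
}

# Usage (requires the rmarkdown package and a LaTeX installation):
# render_both("report.Rmd")
```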

Using Google Sheet Formula IMPORTXML to extract hyperlinks from table on web page, and flag when an image is in seperate column

I'm doing some analysis which requires me to save table data and (hyperlinked) links to lots of PDF's from a webpage (https://www.asx.com.au/asx/v2/statistics/prevBusDayAnns.do).
I've been playing around with the =IMPORTHTML and =IMPORTXML formulas in Google Sheets and have managed to extract the table data using =IMPORTHTML(A1,"table",1), but I'm struggling to extract the "Price sens." column which contains images or the hyperlinks attached to the "Headline" items. I'm having no luck with IMPORTXML so far, and can't seem to find any solutions online.
The formula for IMPORTXML you're looking for is:
=IMPORTXML("https://www.asx.com.au/asx/v2/statistics/prevBusDayAnns.do","//*[@id='content']/div/announcement_data/table/tbody/tr")
You need to provide an XPath expression, which you can get by clicking on an element in the browser dev tools and selecting copy > XPath.
Unfortunately, while this does produce output, it's just the same as for IMPORTHTML. The price sensitivity column is always empty, too.
The reason is that the content of the price sensitivity column is not text but an image, as you can see in your screenshots.
So it looks like you need more powerful HTML parsing tools than Google Sheets provides. It would be easy to look for img tags if you parsed the website using Python and Beautiful Soup, for instance, so you may want to go down this route.
Here's what I got using IMPORTXML, the same as you (screenshot not included).
The problem is that the price sensitivity cell contains an img element, not text (screenshot not included).
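The HTML-parsing route suggested above can be sketched as follows, here in R with the xml2 package rather than Python. This is only an illustration under stated assumptions: the HTML fragment is a made-up stand-in for the announcements table, so the real page's markup, column order, and file paths may differ. The idea is to test each row for an img element in the price sensitivity cell and to read the href attribute of the headline link.

```r
library(xml2)

# Hypothetical HTML fragment; the real page's markup may differ.
html <- '
<table>
  <tr><td>ABC</td><td><img src="/img/price-sensitive.png"></td>
      <td><a href="/docs/abc.pdf">Quarterly report</a></td></tr>
  <tr><td>XYZ</td><td></td>
      <td><a href="/docs/xyz.pdf">Change of address</a></td></tr>
</table>'

doc  <- read_html(html)
rows <- xml_find_all(doc, "//tr")

# xml_find_first() returns one result per row, with a missing node
# where nothing matches, so the columns stay aligned.
announcements <- data.frame(
  code            = xml_text(xml_find_first(rows, "./td[1]")),
  price_sensitive = !is.na(xml_attr(xml_find_first(rows, "./td[2]/img"), "src")),
  headline        = xml_text(xml_find_first(rows, "./td[3]/a")),
  link            = xml_attr(xml_find_first(rows, "./td[3]/a"), "href"),
  stringsAsFactors = FALSE
)

print(announcements)
```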

How to open a Word document, choose 'yes' in a pop-up window and save it via code?

I am generating reports in Word using the R officer package. I want my reports to contain a table of contents, a list of figures and a list of tables, but when I add them, Word shows the warning "This document contains fields that may refer to other files. Do you want to update the fields in this document?". So in every report I have to click 'Yes' and then save the changes so that the window does not show up again.
My question is if it's possible to do these three steps: opening the ready report, clicking 'Yes' when the window shows up and saving the changes in some sort of code?
I've been searching through the Internet, but haven't found anything that would help me.
Microsoft website provides some articles that seem related to my problem, such as this one: https://learn.microsoft.com/en-us/visualstudio/vsto/how-to-programmatically-open-existing-documents?view=vs-2019, but I don't really understand if they answer my question.
I would appreciate any suggestions on the software I should use to do this and/or sample code.

How to write code and view results at the same time?

I am trying to create an R Markdown document for monthly reports. The issue I have is writing code while trying to visualize what the result will look like in the final document. I can generate and view maps in RStudio, but this doesn't show me how they are positioned relative to text and other features within the output document (PDF, HTML, etc.).
Is there software that exists that allows you to write code in one window and view the product in another as code is developed? At the moment I am knitting the code and viewing the results in a trial and error process. It would be nice to see what the results look like in the document without having to rerun my file each time I change a piece of code and open the output document.
Kind regards,
Simon
If you click the green arrow in the top right of your block of code, it will just run that block of code and won't create a full output file. It will also tell you where an error is if any.

MS Word track changes and RMarkDown

I try to write all data analysis reports using R Markdown, because I can have a reproducible document that I can share in several output formats (Pdf, html and MS Word).
However, most of my colleagues use MS Word and they have no idea about R, Markdown, etc.
One advantage of using R Markdown is that I can generate my report in MS Word and directly share it with my colleagues.
The disadvantage is that collaboration becomes cumbersome for me, because I receive feedback on MS Word as well (typically using track changes) and I have to manually introduce those changes back into the .rmd file.
So, my question is: how can I simplify the process (i.e. make it as automatic as possible) of getting the changes in the MS Word document into the .Rmd?
Are there any tools out there that can help me out?
P.S. Getting my colleagues to become R-literate is not an option :(
I haven't yet tried what I'm proposing, but here is how I plan to handle this, since I have exactly the same need. First, there are two distinct scenarios:
I am the lead author, or I am responsible for the statistical analysis: I will require all collaborators to learn and use markdown (not R Markdown, just generic markdown) and I'll instruct them not to touch any R code. I believe markdown is easy enough that anyone who is competent enough to collaborate on an article with data analysis is more than competent to learn markdown. For teaching them, the key features for people familiar with working with Microsoft Word and track changes are the following:
Basic markdown references: I would give them the core R Markdown references, which are the Pandoc Markdown documentation and the R Markdown cheat sheet.
Track changes: Collaborators would simply edit the markdown in plain text and submit their edited version. To view and reconcile differences, I would simply use a diff tool; I would find a good online one to teach my collaborators how to diff changes.
Comments between authors: I would select one of the options for markdown comments and teach my collaborators to use that when needed. The modified HTML comment (<!--- Pandoc-enhanced HTML comment -->) is the one I would probably use.
Reference management: I use Zotero, so I would use Better BibTeX for Zotero to handle references. The nice thing about this is that although I would have to handle the references myself, collaborators can directly add references to the Zotero group library. In fact, using citation keys, it should be simple for collaborators to learn how to insert references themselves into the markdown text.
I am NOT the lead author and I am NOT responsible for the statistical analysis: I would use whatever workflow the lead author uses (e.g. if the lead author uses Word with tracked changes, I'll use the same things).
I want to note that the only part that seems harder than with Microsoft Word's normal features is replacing track changes with diff. I'm not aware of a tool that makes incorporating diff files as easy as Word's reconciliation of changes, but if such a tool exists, the process should be more seamless.
I believe we would need to work on several packages in order to make true collaboration possible between users of Word and RMarkdown. I would be happy to collaborate with anyone interested in making this happen.
Adding a CriticMarkup plugin for RStudio. https://github.com/CriticMarkup/CriticMarkup-toolkit/
Having an R package that can scrape Word documents along with tracked changes. The officer package can already read Word documents, but not the tracked changes. It would also be extremely useful if this package could add simple RMarkdown formatting to the scrapes, e.g. for bold, subscripts and perhaps even tables to facilitate the subsequent matching of Word text to the RMarkdown file.
https://github.com/davidgohel/officer/issues/132
Write a package that can translate the scraped Tracked changes to CriticMarkup into the RMarkdown file.
Generate a key (paragraph)->(lines) that matches paragraphs scraped from Word (without any of the tracked changes) to lines in the RMarkdown. The problem is that we don't know what was generated using code, and what was directly written as Rmd. The first step would be to find lines in the RMarkdown file that should form paragraphs (exclude R chunks, but not inline R). Then, ensuring the order remains the same, compare these lines (remove newlines) to paragraphs scraped from the Word document, using a regexp symbol for "any char, any length" in the place of inline R chunks. Next, split paragraphs with inline chunks into sub-paragraphs in order to be able to apply tracked changes and comments to either the inline code, before, or after the inline chunk more easily. Finally, the paragraphs that could not be matched were likely generated within code chunks and should be matched to the appropriate code chunks, determined from the order of the paragraphs.
Using the generated key, apply tracked changes (as CriticMarkup) to the RMarkdown file. Any changes made to code chunks should be reported as a CriticMarkup comment around that code chunk (or group of code chunks if there is no markdown in between chunks).
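The matching step of the key described above could be sketched roughly as follows in base R. This is an illustration under assumptions, not an existing package: the function names rmd_line_to_regex() and match_paragraphs() are hypothetical, the sketch takes the first matching line and does not enforce the ordering constraint, and it does not split paragraphs into sub-paragraphs. Each RMarkdown line becomes a regular expression in which literal text is quoted and every inline R chunk is replaced by ".*" ("any char, any length").

```r
# Turn one Rmd line into a regex: literal text is quoted with \Q...\E and
# each inline `r ...` chunk becomes ".*".
# (Assumes the text itself does not contain the sequence \E.)
rmd_line_to_regex <- function(line) {
  body <- gsub("`r [^`]*`", "\\\\E.*\\\\Q", line)
  paste0("^\\Q", body, "\\E$")
}

# For each Word paragraph, return the index of the first Rmd line whose
# pattern matches it, or NA if none does (likely chunk-generated text).
match_paragraphs <- function(rmd_lines, word_paragraphs) {
  patterns <- vapply(rmd_lines, rmd_line_to_regex, character(1),
                     USE.NAMES = FALSE)
  vapply(word_paragraphs, function(p) {
    hit <- which(vapply(patterns, grepl, logical(1), x = p, perl = TRUE,
                        USE.NAMES = FALSE))
    if (length(hit) > 0) hit[1] else NA_integer_
  }, integer(1), USE.NAMES = FALSE)
}

# Hypothetical example data.
rmd_lines <- c("The mean was `r mean(x)` units.", "No inline code here.")
word_paragraphs <- c("No inline code here.",
                     "The mean was 4.2 units.",
                     "Table generated by a chunk")

key <- match_paragraphs(rmd_lines, word_paragraphs)
print(key)
```

Paragraphs that come back as NA are the candidates to be assigned to code chunks based on their position.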
I suggest you try trackdown https://claudiozandonella.github.io/trackdown/
trackdown offers a simple answer to collaborative writing and editing of R Markdown (or Sweave) documents. Using trackdown, the local .Rmd (or .Rnw) file is uploaded as plain-text in Google Drive where, thanks to the easily readable Markdown (or LaTeX) syntax and the well-known online interface offered by Google Docs, collaborators can easily contribute to the writing and editing of the narrative part of the document. After integrating all authors’ contributions, the final document can be downloaded and rendered locally.
Using Google Docs, anyone can collaborate on the document as no programming experience is required, they only have to focus on the narrative text ignoring code jargon.
Moreover, you can hide code chunks setting hide_code = TRUE (they will be automatically restored when downloaded). This prevents collaborators from inadvertently making changes to the code that might corrupt the file and it allows collaborators to focus only on the narrative text ignoring code jargon.
You can also upload the actual output (i.e., the resulting compiled document) to Google Drive together with the .Rmd (or .Rnw) document. This helps collaborators evaluate the overall layout, figures and tables, and it allows them to use comments on the PDF to propose and discuss suggestions.
I know this is an old post, but for future askers, there is now a package available that can do (mostly) this:
The {redoc} package can output to Word, and by storing the R code internally within the Word document, it can also dedoc() a Word file back into RMarkdown. It uses the CriticMarkup syntax discussed in another answer.
