Sphinx for writing "parallel text" - restructuredtext

Problem: I am trying to write "parallel text" using reStructuredText. By "parallel text", I mean something like annotated works of fiction, where the text is in two columns. The left column contains the main text, and the right column contains commentary. I will be using Sphinx to generate HTML and LaTeX documentation from it.
I have the following requirements:
As mentioned above, I should be able to typeset text in two columns, one for the main text and one for the annotations.
The annotations can be "sentence level", and will not always be "paragraph level". I.e., I want to be able to annotate different sentences in a paragraph, or the whole paragraph.
It will be great to have a mode where all the annotations are turned off, so the output HTML and LaTeX contain only the main text. In this case, I would like to be able to use the whole "real estate" of the medium, rather than just a column.
It will be very nice to have a "list of annotations" feature if possible.
I am pretty new to reStructuredText and to Sphinx, but have considerable experience with Python. I am looking for some ideas about how to do what I want to do. I have been reading about reStructuredText and also about writing Sphinx extensions, so writing an extension to Sphinx is not out of the question.
Has anyone done something similar before?
Thanks!

These seem very similar to footnotes? I would suggest having a look at http://ignorethecode.net/blog/2010/04/20/footnotes/
If they suit your purpose, integrating them should not be too difficult. Sphinx outputs footnotes with a special class. Replacing the
$("a[rel='footnote']")
in the code with a jQuery CSS selector of your choice should give you what you desire.

Related

Postscript parser - add hyperlinks to text

I need to take a list of questions in a PDF, and hyperlink the answer to each question.
I have currently converted the PDF file to PostScript. However, PostScript is a very complicated language, and I don't see how to programmatically hyperlink each question of the format Question #i: to a link such as example.com/answers/i/. How can I accomplish this?
PostScript isn't merely complicated, it's a complete programming language. This means that the way your answer is expressed in the program is entirely arbitrary.
Assuming you are using the same conversion process each time, you can probably assume that it's deterministic in its behaviour (i.e. it converts the same input to the same output every time), in which case you can probably look for the result in the output.
But basically, you're on your own here, there isn't some magic solution I can give you.
I'd suggest that you're doing it wrong anyway. PostScript isn't PDF, and it doesn't have any concept of a hyperlink. So this suggests to me that you intend to use a pdfmark extension operator, and then pass the resulting PostScript back through a Distiller-like application in order to get a PDF back out again.
Converting to PostScript and back to PDF really just confuses the issue. Assuming that the PDF is a form (again, by implication from the question and answer format) you can extract the form field readily enough from the PDF file directly. Then you can replace it with a /Link annotation.
In short, don't do this by going to PostScript and back, do it all in PDF.
If there's a reason why you can't do this, then you're going to have to explain it.

R ReporteRs Package - Add text after bookmark

I'm creating a tool to automatically generate reports for people in the office using the ReporteRs package in R. We have a standard set of tables/graphs, but the number of times a single table or graph may appear will vary from person to person. Due to this, I cannot make a single template with a fixed number of bookmarks.
I was hoping to get around this by having a bookmark on the first figure title and then repeatedly adding figure titles and graphs/tables underneath that one bookmark.
The 'addParagraph' function will only replace the bookmarked paragraph, so it will not work. I also tried replacing the bookmarked paragraph with a set of paragraphs, but since I have to alternate text/tables the bookmark gets placed onto two paragraphs after the first iteration and does not work after that.
Is there any way to simply add a piece of text after a bookmarked paragraph?
No, that's not possible with ReporteRs. Maybe you could use the officer package instead; its cursor_* functions would help you.

How to check if a paragraph is part of a text in R

I have one paragraph of text (a vector of words) and I would like to see if it is "part" of a long text (a vector of words). However, I know that this paragraph does not appear in the text in its exact form, but with slight changes: a few words could be missing, the order could be slightly different, some words could be inserted as parenthetical elements, etc.
I am currently implementing solutions "by hand", such as looking if most of the words of the paragraph are in the text, looking the distance between these words, their order, etc...
I was however wondering if there is no built-in method to do that?
I already checked the tm package, but it does not seem to do that...
Any idea?
I fear that you are stuck with hand-writing an approach, e.g. grep-ing some word groups and having some kind of matching threshold.
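The "matching threshold" idea can be sketched as follows. This is only an illustration, written in Python rather than R to keep it short; the function names, the window slack of 1.5 and the threshold of 0.8 are all assumptions made for the example, not part of tm or any other package. It slides a window over the long text and scores each window by the fraction of the paragraph's words it contains, ignoring order and tolerating inserted words.

```python
from collections import Counter


def best_window_overlap(paragraph, text, slack=1.5):
    """Return (score, start) for the window of `text` sharing the most words
    with `paragraph`. Both arguments are lists of words.

    score is the fraction of paragraph words (counted with multiplicity)
    found in the window; start is the window's index in `text`.
    The window is `slack` times the paragraph length, to allow insertions.
    """
    need = Counter(paragraph)
    total = sum(need.values())
    if total == 0 or not text:
        return 0.0, 0
    width = min(len(text), max(1, int(len(paragraph) * slack)))
    best_score, best_start = 0.0, 0
    for start in range(len(text) - width + 1):
        have = Counter(text[start:start + width])
        # Word order is ignored; only the multiset overlap is counted.
        shared = sum(min(count, have[word]) for word, count in need.items())
        score = shared / total
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


def is_part_of(paragraph, text, threshold=0.8):
    """Hypothetical decision rule: accept if enough paragraph words co-occur."""
    score, _ = best_window_overlap(paragraph, text)
    return score >= threshold
```

Recounting every window is quadratic in the worst case, which is fine for a single paragraph against a moderately long text; the threshold and slack would need tuning on real data.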

Easy export and table formatting of R dataframe to Word? [closed]

Is there an easy way to automate the conversion of an R data frame to a pretty Word table in APA format for publishing manuscripts? I'm currently doing this by saving the table as a CSV, opening that in Excel, copying the Excel table to Word, and formatting it there. I'm hoping there is a way to automate the formatting in R, so that when I convert the table to Word it is already in APA format, because Word is bad at automation.
Basically, I want to continue writing the manuscript itself in Word, while doing my analyses in R. Then gather all the results in R to a table (with manually modifiable formatting) by a script and convert it to whatever format I could then simply copy-paste to Word (so that the formatting actually holds). When I need to modify the table, I would make the changes in R and then just run the script again without the need to do any changes in Word.
I don't want to learn LaTeX, because everyone in my field uses Word with features like track changes, and I use Zotero add-in for citations, so it's simpler to just keep the writing separate from the analyses. Also, I am a psychologist, not a coder, so learning a lot of new technologies just for this is probably not worth the effort for me. Typically with new technologies come new technical problems, and I am aiming to make my workflow quicker, but not at the cost of unpredictability (which may make it slower exactly at the moment when I cannot afford it).
I found an R+knitr+rmarkdown+pander+pandoc solution "with as little overhead as possible", but it still seems quite heavy because I don't know any of those technologies apart from R. And I'm not eager to start learning all that, as it seems to be aimed at doing the writing and everything else in R to the very end, while I want to separate my writing and my code - I never need code in my writing, only the result tables. In addition, based on the examples, it seems to fetch the values directly from R code (e.g., from summary() to create a descriptive table), while I need to be able to tinker with my table manually before converting it, for instance, writing the title and notes (like a specific note to one cell and explaining it at the bottom). I also found R2wd, but it seems to be an older attempt at the same "whole workflow in R" problem as the solution above. SWord does not seem to be working anymore.
Any suggestions?
(Just to let you know, I am the author of the packages I recommend you...)
You can use package ReporteRs to output your table to Word. See here a tutorial (not mine):
http://www.sthda.com/english/wiki/create-and-format-word-documents-using-r-software-and-reporters-package
FlexTable objects let you format and arrange tables easily with some standard R code. For example, to set the 2nd column in bold, the code looks like:
myFlexTable[, 2] = textBold()
There are (old) examples here:
http://davidgohel.github.io/ReporteRs/flextable_examples.html
These objects can be added to a Word report using the function addFlexTable. The Word report can be generated with the function writeDoc.
If you are working in RStudio, you can print the object and it will be rendered in the html viewer so you can export it in Word when you are satisfied with its content.
You can even add real Word footnotes (see the link below)
http://davidgohel.github.io/ReporteRs/pot_objects.html#pot_footnotes
If you need more tabular output, I also recommend the rtable package, which handles xtable objects (and other things I have to develop to satisfy my colleagues or customers) - a quick demo can be seen here:
http://davidgohel.github.io/tabular/
Hope it helps...
I have had the same need, and I have ended up using the package htmlTable, which is quite 'cost-efficient'. This creates an HTML table (in RStudio it is created in the "Viewer" window in the bottom right), which I just mark using the mouse and copy-paste to Word. (Start marking from the bottom of the table and drag the mouse upwards; that way you are sure to include the start of the HTML code.) Word handles these tables quite nicely. The syntax is quite simple, involving just the function htmlTable(), but it is still able to make somewhat more complex tables, such as grouped rows and primary and secondary column headers (i.e. column headers spanning more than one column). Check out the examples in the vignette.
One note of caution: htmlTable will not work with factor variables, i.e., they will come out as integer numbers (according to factor levels). So read the data using stringsAsFactors = FALSE or convert them using as.character().
Including trailing zeroes can be done using the txtRound function. Example:
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
It is not completely straightforward to assign formatting such as bold or italics, but it can be done by wrapping the table contents in HTML code. If you for instance want to make an entire column bold, it can be done like this (please note the use of single and double quotation marks inside paste0):
library(plyr)
library(htmlTable)
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
# Wrap every value of x in a bold span (note the closing </span> tag)
txt$x <- aaply(txt$x, 1, function(x)
  paste0("<span style='font-weight:bold'>", x, "</span>")
)
htmlTable(txt)
Of course, that would be easier to do in Word. However, it is more interesting to add formatting to numbers according to some criteria. For instance, if we want to emphasize all values of x that are less than 0.2 by applying bold font, we can modify the code above as follows:
library(plyr)
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
txt$x <- aaply(txt$x, 1, function(x)
  if (as.numeric(x) < 0.2) {
    paste0("<span style='font-weight:bold'>", x, "</span>")
  } else {
    paste0("<span>", x, "</span>")
  })
htmlTable(txt)
If you want even fancier emphasis, you can for instance replace the bold font by red background color by using span style='background-color: red' in the code above. All these changes carry over to Word, at least on my computer (Windows 7).
The short answer is "not really." I've never had much luck getting well-formatted tables into MS Word. The best approach I can offer you requires using Rmarkdown to render your tables into an HTML file. You can copy and paste your results from the HTML file to MS Word, but I make no guarantees about how well the formatting will follow.
To format your tables, you can try something like the xtable package, or the pixiedust package. But again, no guarantees that the formatting will transfer.

Mathjax loading issues

I have a web page that renders LaTeX equations using MathJax.
In order to load the equations faster, I'm trying to avoid the preprocessing step by replacing the math delimiters with <span class="MathJax_Preview">[loading...]</span><script type="math/tex;"> latex equation here </script>
The problem is that, while rendering, the HTML entities that occur within an equation are shown as such, and hence the equations are not rendered properly. For example, '&' is used for alignment of multiple steps, but it is displayed as &amp;
Replacing the math delimiters with the <script> tag is done dynamically. If I remove this step, the issue goes away and HTML entities within equations are rendered properly.
How can I solve this?
My ultimate objective is to make the equations load faster and to show a preloader like [loading...] until the maths is typeset fully.
Thanks,
LS Developer
Note that the contents of the <script type="math/tex"> is TeX (or LaTeX), not HTML, and so HTML entities should not be included there. The contents of any <script> in HTML is CDATA, and so no processing, including conversion of entities, is performed within it.
If you are using numeric entities like &#x41; or &#65; then it is easy to replace those by the characters they represent. If you are using named entities, then you will need to translate them into their characters via a table lookup or other process. Better yet would be not to put in entities in the first place. Can you not perform that step? (I'm assuming that is done outside of your control.)
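As a rough sketch of the "translate entities back into characters" step: if the delimiter replacement happened server-side in Python (an assumption; the question does not say where it runs), the standard library's html.unescape performs both the numeric conversion and the named-entity table lookup. The function name wrap_tex is hypothetical, and the markup is the one from the question with the type written as "math/tex".

```python
import html


def wrap_tex(tex_source):
    """Decode HTML entities in a TeX string, then wrap it for MathJax.

    `tex_source` is assumed to be entity-encoded TeX, e.g. "x &amp;= 1".
    """
    # Handles numeric (&#65;, &#x41;) and named (&amp;, &lt;) entities alike.
    tex = html.unescape(tex_source)
    return (
        '<span class="MathJax_Preview">[loading...]</span>'
        '<script type="math/tex">' + tex + "</script>"
    )
```

This should only be applied to input that really is entity-encoded; running it on raw TeX would decode any sequence that merely looks like an entity.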
Note that the preprocessing step in MathJax is actually quite fast, and is not likely to be a bottleneck unless you have an enormous number of equations. It is the conversion to HTML that is the time sink. If you aren't using one of the combined configuration files, you will probably get a better improvement simply by moving to one of those than by removing the preprocessor step. If you already are using a combined configuration file, but aren't using the "-full" version, then moving to that will also speed up the processing of the math (since you won't have to wait for the input and output jax to load when they are first used).

Resources