I am working on a large knitr document that takes some time to calculate the initial results. Is there a way to knit a subsection of the document, ie. after the longer calculations have taken place for the purpose of development? I would prefer not to have to pre-calculate the results, save them and then read them into the document in order to expedite developing the final plots and tables.
Thanks,
John
Consider breaking your document up into smaller 'child' documents. Then you can process just the child you are interested in. As I mentioned in the comments, alternatively, you might consider using the caching features of knitr.
Related
I have several formatted tables (and some graphs) in a single excel book, and I want to use Rmarkdown to grab those tables and graphs and add them to the knitted word document as images. I cannot simply copy the cell range with Readxl or other packages because all cells lose their formatting, and I need the tables in the document to look exactly the same as in the spreadsheet.
Alternatively, I am considering recreating the tables in R, but again the new tables will not have the formatting (though I imagine kable could allow me to recreate all the formatting). This is certainly not ideal, however, since I'm creating the tables twice, so hopefully the copy ability is doable somehow. I know Rmarkdown can use Python, Javascript, and many other languages, so maybe one of those languages is capable of this.
Is there a way to do this?
The exams package is a really fantastic tool for generating exams from R.
I am interested in the possibilities of using it for (programming) assignments. The main difference from an exam is that besides solutions I'd also like hints to be included in the PDF / HTML output file.
Typically I put the hints for (sub)-questions in a separate section at the end of the PDF assignment (using a separate Latex section), but this requires manual labour. These are for students to consult if they need help getting started on any particular exercise, and it avoids having them look at the solutions directly for hints on how to start.
An assignment might look like:
Question 1
Question 2 ...
Question 10
Hints to all questions
I'd be open to changing the exact format as long it is possible to look up hints without looking up the answer, and it remains optional to read the hints.
So in fact I am looking for an intermediate "hints" section between the between the "question" and "solution" section, which is present for some questions but not for all.
My questions: Is this already possible? If not, how could this be implemented using the exams package?
R/exams does not have dedicated/native support for this kind of assignment so it isn't available out of the box. So if you want to get this kind of processing you have to ensure it yourself using LaTeX for PDF or CSS for HTML.
In LaTeX I think it should be possible to do what you want using the newfloat and endfloat packages in the LaTeX template that you pass to exams2pdf(). Any LaTeX template needs to provide {question} and {solution} environments, e.g., the plain.tex template shipped with the package has
\newenvironment{question}{\item \textbf{Problem}\newline}{}
\newenvironment{solution}{\textbf{Solution}\newline}{}
with the exercises embedded as
\begin{enumerate}
%% \exinput{exercises}
\end{enumerate}
Now instead of the \newenvironment{solution}... you could use
\usepackage{newfloat,endfloat}
\DeclareFloatingEnvironment{hint}
\DeclareDelayedFloat{hint}{Hint}
\DeclareFloatingEnvironment{solution}
\DeclareDelayedFloat{solution}{Solution}
This defines two new floating environments {hint} and {solution} which are then declared delayed floats. And then you would need to customize these environments regarding the text displayed within the questions at the beginning and the listing at the end. I'm not sure if this can get you exactly what you want, though, but hopefully it is a useful place to start from.
There are a range of tools available for creating publication quality tables using R, Sweave, and LaTeX.
In particular, there are helper functions like latex in the Hmisc package, and xtable in the xtable package. I've also often written my own code so that I could have complete control over table formatting (e.g., see this example).
However, when preparing publication quality tables a range of issues often arise:
how and when to apply numeric formatting
how to precisely control alignment of columns and cells
how to precisely control cell borders
how to convert variable labels to variable names
and so on
Beyond the high level issues of specifying the desired table format, there are issues of implementation.
When should a helper function such as xtable be used?
Which helper function should be used in a given situation?
How can the default output of helper functions be customised to particular requirements?
Question
It seems to me that the above issues are deserving of a detailed textbook-style introduction.
Are there any online or offline resources that provide a detailed overview of how to produce publication quality tables using R, Sweave, and LaTeX, and that address the issues discussed above?
Just to tie this up with a nice little bow at the time of current writing, the best existant tutorials on publication-quality tables and usage scenarios appear to be an amalgamation of these documents:
A Sweave example (source)
The Joy of Sweave: A Beginner's Guide to Reproducible Research with Sweave (source)
Latex and R via Sweave: An example document how to use Sweave (source)
Sweave = R · LaTeX2 (source)
The xtable gallery (source)
The Sweave Homepage
LaTeX documentation
Going beyond the scope of what currently exists, you may want to ask the author of The Joy of Sweave for a document on publication-quality tables specifically. It seems like he's gone above and beyond this problem in his research. In addition to the questions you've raised, this space specifically could use a style guide that, flatly, does not currently exist.
And, as mentioned in the question errata, this is a perfect example of a question for https://tex.stackexchange.com/. I encourage you to continue to ask specific questions there when you run into any difficulties in your current projects.
The package stargazer can create publication-quality - incl. using templates designed to resemble existing academic journals - from commonly used R statistical functions and packages (lm, glm, plm, svyglm, survival, pscl, AER, and others). Also good for creating summary statistics tables, and can directly output data frame content as well.
There is a tabular function in the tables package which addresses formatting, alignment and label operations. The package has a vignette which is a good starting point.
xtable has worked fine for me so far.
In combination with siunitx, and when necessary, longtable, it can produce pretty effective tables, in my opinion. With packages like booktabs and caption, the aesthetics can be pleasing too.
I am not sure this level of detail was asked for by the OP, but for what it's worth, the basic implementation could be something along these lines: https://tex.stackexchange.com/questions/41067/caption-for-longtable-in-sweave/41183#41183 (my own answer to another question).
I highly recommend ConTeXt which makes use of the TABLE package. There is a Table overview in contextgarden and an exhaustive manual.
I am using knitr through Lyx to create a document. In this document, I use knitr to print about 20 images (through R), and 5 calls from R, along with about 20 pages of text.
I save the pdf file, and it is only 1500 KB, and I can view and recompile it easily. But as soon as I go to print, the printer reads about 200MB of information. It takes a super long time (2+ hours) to print.
I was wondering if you knew the solution for this, or even the cause. I’ve been trying to remedy it by just copying the plots and putting them in as figures, but this obviously defeats the purpose of reproducible research. When I put the plots as pictures, we get down to a pdf size of 367 KB. I am fairly certain it is knitr generated plots that are causing the increase in data. When I changed the plots to pictures, it printed in about 5 minutes (which is still a long time, but much shorter than hours).
I've had this issue before, and I believe that it has something to do with plotting of multiple chains for traceplots. Are these known to take forever to print?
Has anyone else experienced this or know the solution for it?
The default for latex output is PDFs for plots. Presumably there are some effects within the PDF which are very expensive to render for your printer. I would specify an alternative graphics device such as png either per chunk using chunk options or as default for the whole file using opts_chunk$set. The relevant option is dev though you may need to change dpi too.
More details on the knitr page
There are a range of tools available for creating publication quality tables using R, Sweave, and LaTeX.
In particular, there are helper functions like latex in the Hmisc package, and xtable in the xtable package. I've also often written my own code so that I could have complete control over table formatting (e.g., see this example).
However, when preparing publication quality tables a range of issues often arise:
how and when to apply numeric formatting
how to precisely control alignment of columns and cells
how to precisely control cell borders
how to convert variable labels to variable names
and so on
Beyond the high level issues of specifying the desired table format, there are issues of implementation.
When should a helper function such as xtable be used?
Which helper function should be used in a given situation?
How can the default output of helper functions be customised to particular requirements?
Question
It seems to me that the above issues are deserving of a detailed textbook-style introduction.
Are there any online or offline resources that provide a detailed overview of how to produce publication quality tables using R, Sweave, and LaTeX, and that address the issues discussed above?
Just to tie this up with a nice little bow at the time of current writing, the best existant tutorials on publication-quality tables and usage scenarios appear to be an amalgamation of these documents:
A Sweave example (source)
The Joy of Sweave: A Beginner's Guide to Reproducible Research with Sweave (source)
Latex and R via Sweave: An example document how to use Sweave (source)
Sweave = R · LaTeX2 (source)
The xtable gallery (source)
The Sweave Homepage
LaTeX documentation
Going beyond the scope of what currently exists, you may want to ask the author of The Joy of Sweave for a document on publication-quality tables specifically. It seems like he's gone above and beyond this problem in his research. In addition to the questions you've raised, this space specifically could use a style guide that, flatly, does not currently exist.
And, as mentioned in the question errata, this is a perfect example of a question for https://tex.stackexchange.com/. I encourage you to continue to ask specific questions there when you run into any difficulties in your current projects.
The package stargazer can create publication-quality - incl. using templates designed to resemble existing academic journals - from commonly used R statistical functions and packages (lm, glm, plm, svyglm, survival, pscl, AER, and others). Also good for creating summary statistics tables, and can directly output data frame content as well.
There is a tabular function in the tables package which addresses formatting, alignment and label operations. The package has a vignette which is a good starting point.
xtable has worked fine for me so far.
In combination with siunitx, and when necessary, longtable, it can produce pretty effective tables, in my opinion. With packages like booktabs and caption, the aesthetics can be pleasing too.
I am not sure this level of detail was asked for by the OP, but for what it's worth, the basic implementation could be something along these lines: https://tex.stackexchange.com/questions/41067/caption-for-longtable-in-sweave/41183#41183 (my own answer to another question).
I highly recommend ConTeXt which makes use of the TABLE package. There is a Table overview in contextgarden and an exhaustive manual.