I'm writing my results (both text and tables) and the process is quite time-consuming...
I was wondering whether a function in R could help me paste my results into the text so that I could copy it into Word.
For example:
R square = put number here, B = put number here... The difference between the models was significant/nonsignificant with p < put number here
Then I would love to paste it into Word.
Best regards,
Daniel
Couldn't find any function that would help me... Tried Flextable...
If I understand the question correctly, this can largely be done in base R and reported with R Markdown or Quarto. You can either write the document as a .qmd or .Rmd file and export it to Word, or simply render it in RStudio and then copy and paste as you like.
I work the following way:
First assign the models to variables using <-.
Then examine the structure of the model with str().
In the text of the document, use $ with the variable name to access the various parts. Alternatively, type the expression in the console and copy the values from there.
You may want to round() some numbers. Additionally, the function scales::pvalue() is very helpful for p values.
library(scales)
# Generate some model and assign it to a variable
model <- cor.test(cars$speed, cars$dist)
# Examine the structure of the model object
str(model)
Then in the text of an R markdown or Quarto document you can write:
$r$(`r model$parameter`) = `r round(model$estimate, digits = 2)`, $p$= `r pvalue(model$p.value, accuracy = 0.05)`
This will give the following: r(48) = 0.81, p= <0.05
In a code chunk $ is used to access a component. In text $ is used to start an equation. That can be confusing. To access a piece of R code in R Markdown text, use the convention `r <function or variable>` as in `r model$parameter` above. Alternatively, you can simply paste model$parameter into the console and copy the results to your target document.
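For the console route, the sentence from the question can also be assembled in base R with sprintf(). The following is only a sketch: the two models on the built-in mtcars data, the 0.05 threshold and all variable names are assumptions chosen for illustration, not part of the original question.

```r
# Two nested linear models on the built-in mtcars data (illustrative choice)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

r2  <- summary(m2)$r.squared       # R squared of the larger model
b   <- coef(m2)[["wt"]]            # unstandardised coefficient for wt
cmp <- anova(m1, m2)               # F test comparing the two models
p   <- cmp[["Pr(>F)"]][2]          # p value of that comparison

# Pick the wording from the p value (0.05 is an assumed threshold)
verdict <- if (p < 0.05) "significant" else "nonsignificant"

sentence <- sprintf(
  "R square = %.2f, B = %.2f. The difference between the models was %s, p = %s.",
  r2, b, verdict, format.pval(p, digits = 2, eps = 0.001)
)
cat(sentence, "\n")
```

The printed line can be copied straight into Word; the same expressions also work as inline `r ...` code in an R Markdown document.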
You could knit the document to Word. Here's a quick example using inline R:
---
title: "test.Rmd"
output: word_document
---
``` {r, echo = FALSE}
# some variables
a <- 24
b <- 100
```
R square = `r a`, B = `r b` ... The difference between the models was `r if (a == 24) {"significant"} else {"insignificant"}` with p < `r b - a`
I have a .CSV file that includes an ID column and several text columns (title of story, content of story) and columns for a multiple choice questions (each question in a different column). Also, there are columns for a numerical variable (ternary plots).
Here is a screenshot of the CSV file: [image: CSV file]
Now what I'm trying to do is to automatically generate multiple PDF reports for each ID number (generate a unique report for each individual person). With different values in the report depending on the ID column in the CSV.
I thought the best way to do that in R was to create a RMarkdown file and use parameters to make the values of the report match the ID number values.
Here is my code for the RMarkdown file:
---
title: "`r params$new_title`"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  pdf_document:
    latex_engine: xelatex
  html_document:
    df_print: paged
header-includes:
  \usepackage{fontspec}
  \usepackage{fancyhdr}
mainfont: Arial
params:
  id:
    label: ID
    value: 1
    input: select
    choices:
      - 1
      - 2
      - 3
      - 4
      - 5
  new_title: "My Title!"
---
library(tidyverse)
library(ggtern)
library(ggplot2)
library(readr)
library(lubridate)
library(magrittr)
library(rmarkdown)
knitr::opts_chunk$set(echo = FALSE)
data <- readr::read_csv("dummy.csv")
data_id <- data %>%
  filter(id == v)
**Your title:** `r data_id$title`
**Your micro-narrative:** `r data_id$narrative`
Now the code is working, but the formatting in the generated report is not how I want it.
If the same ID number has multiple entries for story title and story content, the values are displayed next to each other. What I want is this:
Story #1 title:
Story #1 content:
Story #2 title:
Story #2 content:
and NOT:
Title: story#1 title, story#2 title, etc...
Content: story#1, story#2, etc...
To automatically generate multiple reports with one click, I created a loop. Here is the code:
require(rmarkdown)
data = read_csv("dummy.csv")
slices = unique(data$id)
for (v in slices){
  render("~/Desktop/My_L3/report.Rmd", output_file = paste0("~/Desktop/report_", v, ".pdf"),
         params=list(new_title=paste("Quarterly Report -", v)))
}
The loop is working and I was able to generate multiple PDFs by just running this code.
Is this the easiest way to do it? Any other way you're aware of?
And lastly, how do I include the multiple choice questions in the RMarkdown file?
For example, if a certain ID number has 3 choices selected (three 1s in the CSV) how do I display the result as the following:
You selected the following choices: bananas, apples, oranges
I would really appreciate your help as I'm an R noob and still learning a lot of stuff.
@badi congrats! For a newcomer to R you have already climbed quite a steep hill.
I hope the following helps you move further:
(I) observation: use of rmarkdown::render(... , params = list(...))
You can pass multiple variables and objects as params to your "report" Rmd.
Thus, when you have lengthy preparatory steps, you can load your data, prepare it, and filter it inside the loop you use to call rmarkdown::render().
E.g. inside your for loop you could do something like df <- data %>% filter(id == v) and then pass df as one of the params, e.g. rmarkdown::render(... , params = list(new_title = paste("Quarterly Report -", v), data = df))
Then define a param for the data frame in the YAML header of the report. I recommend "loading" a dummy object/df as its default value, e.g.
...
params:
  id ...
  data: !r mtcars # dummy object/df - do not forget the !r tag
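Putting observation (I) together, the loop might be wrapped in a small helper along these lines. This is a sketch under assumptions: the function name render_reports, the output file naming and the use of a fresh environment per report are illustrative choices, and the template is assumed to declare both new_title and data under params: in its YAML header.

```r
# Sketch: render one report per id, passing the filtered rows as a parameter.
# `template` must declare `new_title` and `data` under `params:` in its YAML.
render_reports <- function(data, template, out_dir = ".") {
  for (v in unique(data$id)) {
    # Rows belonging to the current id (base R equivalent of filter(id == v))
    df <- data[data$id == v, , drop = FALSE]
    rmarkdown::render(
      input       = template,
      output_file = paste0("report_", v, ".pdf"),
      output_dir  = out_dir,
      params      = list(new_title = paste("Quarterly Report -", v), data = df),
      envir       = new.env()  # fresh environment so reports do not share state
    )
  }
  invisible(NULL)
}

# Possible usage (needs the rmarkdown package, pandoc and a LaTeX installation):
# render_reports(readr::read_csv("dummy.csv"), "report.Rmd", out_dir = "reports")
```

Inside the report, the rows for the current person are then available as params$data, so the Rmd no longer needs to read and filter the CSV itself.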
(II) printing dynamic and static text
There are different ways to achieve this. For your example, it looks like you are looking for something relatively well-formatted that can be constructed from your table columns.
For this sprintf() is your friend. I abbreviated your example with a lighter dataframe.
I print this in the beginning of the document/pdf output.
For this you have to set the chunk option results = "asis" and wrap the sprintf() call in cat(). This lets the template formatting through and stops R/Rmd from adding its own output formatting (e.g. the ## prefix that appears in front of printed output).
The choices can be combined with a paste() call. Of course this can be done with varying levels of sophistication, which I leave to you to explore.
I keep the 1 and NA coding. You can replace these with whatever you think is appropriate (ifelse/case_when, or more complex string substitution operations).
To prepare the list of choices, I just paste everything together:
df <- params$data %>%
  mutate(choice_text = paste(choice1, choice2, choice3, sep = ","))
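If the 1/NA codes should be replaced by the names of the selected choices (as in "bananas, apples, oranges" from the question), one possible base R approach is sketched below. The data frame and its column names are invented for illustration; the real CSV will have its own column names.

```r
# Hypothetical data: one row per story, 1 = selected, NA = not selected
df <- data.frame(
  id      = c(1, 1, 2),
  bananas = c(1, NA, 1),
  apples  = c(1, 1, NA),
  oranges = c(NA, 1, 1)
)
choice_cols <- c("bananas", "apples", "oranges")

# For each row, keep the names of the columns coded 1 and join them
df$choice_text <- apply(df[choice_cols], 1, function(row) {
  paste(choice_cols[!is.na(row) & row == 1], collapse = ", ")
})

df$choice_text
```

The resulting choice_text column can be fed to the sprintf() template in the same way as the pasted version.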
The following code-chunk defines the static/dynamic text template for sprintf() and we iterate over the rows of the data dataframe
# to programmatically print text, sprintf()
# allows you to combine static and dynamic text
# first define a template for how your section should look
# %s points to a string; %f (not used here) caters for (floating-point) numbers
template <- "
## Title %s
With content: %s.
\n You selected the following choices: %s
" # end of your "dynamic" text template
# recall to add an empty line for spacing
# you can force a new line for text entries with \n
# iterate over the input data frame
for (i in seq_len(nrow(df))) {
  current <- df[i, ]
  cat(sprintf(template, current$title, current$content, current$choice_text))
}
With the adequately set-up pdf template, you will get the following.
Note: My report breaks over to a 2nd page, I only show the first page here.
I have noticed some strange behaviour in the R exams package when I load the dplyr library. The example below only works if I explicitly call the dplyr namespace, as indicated in the comments. Notice that the error only occurs in a fresh session, i.e. you need to restart R in order to see what I see. You need to place the code below in a file exam.Rmd, then call
library(exams)
library(dplyr)
exams2html("exam.Rmd") # in pwd
# this is exam.Rmd
```{r datagen,echo=FALSE,results='hide',warning=FALSE,message=FALSE}
df = data.frame(i = 1:4, y = 1:4, group = paste0("g",rep(1:2,2)))
# works:
b2 = diff(dplyr::filter(df,group!="g1")$y)
b3 = diff(dplyr::filter(df,group!="g2")$y)
# messes up the complete exercise:
# b2 = diff(filter(df,group!="g1")$y)
# b3 = diff(filter(df,group!="g2")$y)
nq = 2
questions <- solutions <- explanations <- rep(list(""), nq)
type <- rep(list("num"),nq)
questions[[1]] = "What is the value of $b_2$ rounded to 3 digits?"
questions[[2]] = "What is the value of $b_3$ rounded to 3 digits?"
solutions[[1]] = b2
solutions[[2]] = b3
explanations[[1]] = paste("You have to subtract the conditional mean of group 2 from the reference group 1. gives:",b2)
explanations[[2]] = paste("You have to subtract the conditional mean of group 3 from the reference group 1",b3)
```
Question
========
You are given the following dataset on two variables `y` and `group`.
```{r showdata,echo=FALSE}
# kable(df,row.names = FALSE,align = "c")
df
```
some text with math
$y_i = b_0 + b_2 g_{2,i} + b_3 g_{3,i} + e_i$
```{r questionlist, echo = FALSE, results = "asis"}
answerlist(unlist(questions), markup = "markdown")
```
Solution
========
```{r sollist, echo = FALSE, results = "asis"}
answerlist(unlist(explanations), markup = "markdown")
```
Meta-information
================
extype: cloze
exsolution: `r paste(solutions,collapse = "|")`
exclozetype: `r paste(type, collapse = "|")`
exname: Dummy Manual computation
extol: 0.001
Thanks for raising this issue and to @hrbrmstr for the explanation of one part of the problem. However, one part of the explanation is still missing:
Of course, the root of the problem is that both stats and dplyr export different filter() functions. And it can depend on various factors which function is found first.
In an interactive session it is sufficient to load the packages in the right order with stats being loaded automatically and dplyr subsequently. Hence this works:
library("knitr")
library("dplyr")
knit("exam.Rmd")
It took me a moment to figure out what is different when you do:
library("exams")
library("dplyr")
exams2html("exam.Rmd")
It turns out that in the latter code chunk knit() is called by exams2html() and hence the NAMESPACE of the exams package changes the search path because it fully imports the entire stats package. Therefore, stats::filter() is found before dplyr::filter() unless the code is evaluated in an environment where dplyr was loaded such as the .GlobalEnv. (For more details see the answer by @hrbrmstr.)
As there is no pressing reason for the exams package to import the entire stats package, I have changed the NAMESPACE to import only the required functions selectively (which does not include the filter() function). Please install the development version from R-Forge:
install.packages("exams", repos = "http://R-Forge.R-project.org")
And then your .Rmd can be compiled without dplyr::... just by including library("dplyr") - either within the .Rmd or before calling exams2html(). Both should work now as expected.
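The name clash itself can be inspected from the console. The snippet below is a small sketch using the data frame from the question; the base R subsetting at the end is an alternative added here for illustration, not something either package requires.

```r
df <- data.frame(i = 1:4, y = 1:4, group = paste0("g", rep(1:2, 2)))

# Which attached packages provide a function called filter()?
# With only the default packages attached this lists "package:stats";
# after library(dplyr), "package:dplyr" is listed as well.
find("filter")

# Namespace-qualified calls do not depend on the search path.
ts_result <- stats::filter(1:4, rep(1 / 2, 2))    # moving average from stats
if (requireNamespace("dplyr", quietly = TRUE)) {
  b2 <- diff(dplyr::filter(df, group != "g1")$y)  # row subsetting from dplyr
}

# A dependency-free alternative for this exercise is base subsetting
b2_base <- diff(df$y[df$group != "g1"])
b2_base
```

Qualifying the call with a namespace or using base subsetting keeps the exercise independent of which filter() happens to be found first.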
Using your exam.Rmd, I ran it from the source pane with cmd-enter (I added quiet=FALSE so I could see what was going on).
[Screenshots: source pane, console output after cmd-enter, and the rendered output]
If you read all the way through to the help on knit:
envir: Environment in which code chunks are to be evaluated, for example, parent.frame(), new.env(), or globalenv()).
So parent.frame() or globalenv() is required vs what you did (you don't seem to fully understand environments). You get TRUE from your exists() call because by default inherits is TRUE in the exists function and that tells the function to "[search] the enclosing frames of the environment" (from the help on exists).
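The effect of inherits can be checked in a few lines of base R. The variable name x_demo is made up for this illustration.

```r
x_demo <- 1     # defined in the current environment
e <- new.env()  # new environment whose parent is the current one

# Default inherits = TRUE: the enclosing environments are searched as well
found_inherits <- exists("x_demo", envir = e)

# inherits = FALSE: only e itself is searched, and x_demo is not defined there
found_local <- exists("x_demo", envir = e, inherits = FALSE)

found_inherits  # TRUE
found_local     # FALSE
```

So a TRUE from exists() does not by itself show that an object lives in the environment that was passed in.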
And, you should care deeply about source code and triaging errors. You're using a programming language and open source software and you are right that the library(dplyr) didn't work inside the Rmd due to some terrible code choices in this "great" package and that you don't want pointed out since you don't want to look at source code.
End, as I can do no more for you. I just hope others benefit from this.
When knitting R Markdown in RStudio, I would like all console output from one chunk to be placed together in one code block. How can this be done?
As a workaround, I write two code blocks of the same code and set eval=FALSE on the first block and echo=FALSE on the second.
```{r Vector Demo 2, eval=FALSE}
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
```
```{r Vector Demo 2b, echo=FALSE}
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
```
This however produces the following output:
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
## [1] "numeric"
## [1] "character"
## num [1:5] 1 2 3 4 5
## chr [1:3] "A" "B" "C"
What I would like is to have the output of the second chunk (i.e. Vector Demo 2b) to be placed together in one code block just like the first chunk (i.e. Vector Demo 2).
This is a sample output of how I would prefer to have my result:
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
## [1] "numeric"
## [1] "character"
## num [1:5] 1 2 3 4 5
## chr [1:3] "A" "B" "C"
Note to bounty hunters:
Better still, I would be grateful for a way to have one code chunk first print the input code, and then print the output code. That way I could avoid duplication and possible inconsistencies that could come with it.
Here is a solution to your problem. There are two things to do:
## Test
```{r echo = F, cache = F}
knitr::knit_hooks$set(document = function(x){
gsub("```\n*```r*\n*", "", x)
})
```
```{r VectoDemo, results = 'hold'}
nums = 1:5
char = LETTERS[1:5]
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
```
I have done two things here:
1. Set results = 'hold' to "hold" printing of output until after the source is printed.
2. Added a document hook to collapse successive code chunks with no text in between.
With knitr >= 1.5.23
You can get quite a bit of variety from the built-in knitr options.
For instance... get the same results as Ramnath's hook using collapse=TRUE, results="hold" (no global document hook needed).
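To apply that combination to every chunk rather than one at a time, the two options could be set once in a setup chunk. A minimal sketch, assuming the knitr package is installed (the requireNamespace() guard is only there so the snippet also runs where it is not):

```r
# Setup-chunk sketch: apply the two options to every chunk in the document
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    collapse = TRUE,   # merge source and output into a single block
    results  = "hold"  # print all output after the chunk's source
  )
}
```

Individual chunks can still override either option in their own chunk header.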
If you really want distinct source/output parts as shown in the question then you are already on the right path. Just make use of chunk reuse to get the output combo without having to repeat yourself...
```{r}
nums <- 1:5
char <- LETTERS[1:3]
```
```{r "Vector Demo", eval=FALSE}
# examine the class and structure of vectors
class(nums)
class(char)
str(nums)
str(char)
```
```{r "Vector Demo", echo=FALSE}
```
The last chunk is your basic chunk reuse pattern.
You can see a variety of the basic combinations here.
Since you aren't making a knitr tutorial ignore the source and use the rendered examples.
@Ramnath's solution appears a bit simpler than this one. His is likely better in many (most?) circumstances, but this alternate solution might be good in others:
Test.
```{r echo=F,cache=F}
knitr::knit_hooks$set(document=function(x) {
paste(rapply(strsplit(x, '\n'), function(y) Filter(function(z) !grepl('# HIDEME',z),y)), collapse='\n')
})
```
```{r Vector Demo 1, results='hold', tidy=FALSE}
nums = 1:5
chars = LETTERS[1:5]
# examine the class and structure of vectors
{ # HIDEME
print(class(nums))
print(class(chars))
str(nums)
str(chars)
} # HIDEME
```
Notes:
Use of the brackets ({ and }) around the code keeps the output together. However, commands that return things and don't print things will be silent (unless last), ergo my print addition to those lines. This may or may not be a factor depending on your actual commands.
In my install, for some reason, tidy defaults to TRUE, which was shifting my first # HIDEME comment to before the left bracket (and I have since edited the code to reflect the definition of nums and chars). Odd, but likely a side-effect of source tidying. This is why I force tidy=FALSE. Since this might affect how you present your code, using it as a per-block option at least limits the pretty-printing problem.
The # HIDEME is really just "comment-character plus some obscure string" for easy grepping.
The knit_hook I added is not "simple," but I find it less likely to have side-effects on other chunks within a document. This could likely be done with more specificity (I know @Ramnath has worked on other knitr problems with Yihui, so there could be more "correct" ways to do this with more specificity.) (I tried and failed to do this as an "output" hook instead of "document" hook. Homework.)
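The point about print() inside the brackets can be seen outside knitr as well. The sketch below uses capture.output() purely as a way to collect what would be printed; the variable names implicit and explicit are illustrative.

```r
nums  <- 1:5
chars <- LETTERS[1:5]

# Inside { } only the value of the final expression is auto-printed
implicit <- capture.output({
  class(nums)   # value is discarded silently
  class(chars)  # last expression: auto-printed
})

# Explicit print() calls produce output regardless of position
explicit <- capture.output({
  print(class(nums))
  print(class(chars))
})

implicit  # one line of output
explicit  # two lines of output
```

str() writes its output directly, which is why it needs no print() wrapper in the chunk above.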
On general request, a community wiki on producing latex tables in R. In this post I'll give an overview of the most commonly used packages and blogs with code for producing latex tables from less straight-forward objects. Please feel free to add any I missed, and/or give tips, hints and little tricks on how to produce nicely formatted latex tables with R.
Packages :
xtable : for standard tables of most simple objects. A nice gallery with examples can be found here.
memisc : tool for management of survey data, contains some tools for latex tables of (basic) regression model estimates.
Hmisc contains a function latex() that creates a tex file containing the object of choice. It is pretty flexible, and can also output longtable latex tables. There's a lot of info in the help file ?latex
miscFuncs has a neat function 'latextable' that converts matrix data with mixed alphabetic and numeric entries into a LaTeX table and prints them to the console, so they can be copied and pasted into a LaTeX document.
texreg package (JSS paper) converts statistical model output into LaTeX tables. Merges multiple models. Can cope with about 50 different model types, including network models and multilevel models (lme and lme4).
reporttools package (JSS paper) is another option for descriptive statistics on continuous, categorical and date variables.
tables package is perhaps the most general LaTeX table making package in R for descriptive statistics
stargazer package makes nice comparative statistical model summary tables
Blogs and code snippets
There is the outreg function of Paul Johnson that gives Stata-like tables in Latex for the output of regressions. This one works great.
As given in an earlier question, there's a code snippet to adapt the memisc package for lme4 objects.
Related questions :
Suggestion for R/LaTeX table creation package
Rreport/LaTeX quality output package
sorting a table for latex output with xtable
Any way to produce a LaTeX table from an lme4 mer model fit object?
R data.frame with stacked specified titles for latex output with xtable
Automating adding tables fast to latex from R, with a very flexible and interesting syntax using the formula language
I'd like to add a mention of the "brew" package. You can write a brew template file which would be LaTeX with placeholders, and then "brew" it up to create a .tex file to \include or \input into your LaTeX. Something like:
\begin{tabular}{l l}
A & <%= fit$A %> \\
B & <%= fit$B %> \\
\end{tabular}
The brew syntax can also handle loops, so you can create a table row for each row of a dataframe.
Thanks Joris for creating this question. Hopefully, it will be made into a community wiki.
The booktabs package in LaTeX produces nice-looking tables. Here is a blog post on how to use xtable to create LaTeX tables that use booktabs.
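As a minimal sketch of that combination (assuming the xtable package is installed, and using the built-in mtcars data purely as an example), print.xtable() accepts a booktabs argument:

```r
if (requireNamespace("xtable", quietly = TRUE)) {
  tab <- xtable::xtable(head(mtcars[, 1:4]), caption = "First rows of mtcars")
  # booktabs = TRUE uses \toprule, \midrule and \bottomrule instead of \hline;
  # the LaTeX preamble then needs \usepackage{booktabs}
  tex <- capture.output(print(tab, booktabs = TRUE, include.rownames = FALSE))
  cat(tex, sep = "\n")
}
```

The captured lines can be written to a .tex file with writeLines() and pulled into the main document with \input.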
I would also add the apsrtable package to the mix as it produces nice looking regression tables.
Another Idea: Some of these packages (esp. memisc and apsrtable) allow easy extensions of the code to produce tables for different regression objects. One such example is the lme4 memisc code shown in the question. It might make sense to start a github repository to collect such code snippets, and over time maybe even add it to the memisc package. Any takers?
The stargazer package is another good option. It supports objects from many commonly used functions and packages (lm, glm, svyreg, survival, pscl, AER), as well as from zelig. In addition to regression tables, it can also output summary statistics for data frames, or directly output the content of data frames.
I have a few tricks and workarounds for interesting 'features' of xtable and LaTeX that I'll share here.
Trick #1: Removing Duplicates in Columns and Trick #2: Using Booktabs
First, load packages and define my clean function
<<label=first, include=FALSE, echo=FALSE>>=
library(xtable)
library(plyr)
cleanf <- function(x){
  oldx <- c(FALSE, x[-1]==x[-length(x)])
  # is the value equal to the previous?
  res <- x
  res[oldx] <- NA
  return(res)
}
Now generate some fake data
data<-data.frame(animal=sample(c("elephant", "dog", "cat", "fish", "snake"), 100,replace=TRUE),
                 colour=sample(c("red", "blue", "green", "yellow"), 100,replace=TRUE),
                 size=rnorm(100,mean=500, sd=150),
                 age=rlnorm(100, meanlog=3, sdlog=0.5))
#generate a table
datatable<-ddply(data, .(animal, colour), function(df) {
  return(data.frame(size=mean(df$size), age=mean(df$age)))
})
Now we can generate a table, and use the clean function to remove duplicate entries in the label columns.
cleandata<-datatable
cleandata$animal<-cleanf(cleandata$animal)
cleandata$colour<-cleanf(cleandata$colour)
@
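To see what the clean function does in isolation, here is a small stand-alone sketch on a made-up character vector. The function body is a condensed restatement of cleanf from above so that the snippet runs on its own.

```r
# Condensed restatement of cleanf: blank out values equal to their predecessor
cleanf <- function(x) {
  repeated <- c(FALSE, x[-1] == x[-length(x)])  # TRUE where value repeats
  x[repeated] <- NA
  x
}

animals <- c("cat", "cat", "dog", "dog", "dog", "fish")
cleaned <- cleanf(animals)
cleaned
# [1] "cat"  NA     "dog"  NA     NA     "fish"
```

Only consecutive repeats are blanked, so the data should be sorted by the label columns first, as ddply() does here.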
this is a normal xtable
<<label=normal, results=tex, echo=FALSE>>=
print(
  xtable(
    datatable
  ),
  tabular.environment='longtable',
  latex.environments=c("center"),
  floating=FALSE,
  include.rownames=FALSE
)
@
this is a normal xtable where a custom function has turned duplicates to NA
<<label=cleandata, results=tex, echo=FALSE>>=
print(
  xtable(
    cleandata
  ),
  tabular.environment='longtable',
  latex.environments=c("center"),
  floating=FALSE,
  include.rownames=FALSE
)
@
This table uses the booktab package (and needs a \usepackage{booktabs} in the headers)
\begin{table}[!h]
\centering
\caption{table using booktabs.}
\label{tab:mytable}
<<label=booktabs, echo=F,results=tex>>=
mat <- xtable(cleandata,digits=rep(2,ncol(cleandata)+1))
foo<-0:(length(mat$animal))
bar<-foo[!is.na(mat$animal)]
print(mat,
      sanitize.text.function = function(x){x},
      floating=FALSE,
      include.rownames=FALSE,
      hline.after=NULL,
      add.to.row=list(pos=list(-1,bar,nrow(mat)),
                      command=c("\\toprule ", "\\midrule ", "\\bottomrule ")))
#could extend this with \cmidrule to have a partial line over
#a sub category column and \addlinespace to add space before a total row
@
\end{table}
Two utilities in package taRifx can be used in concert to produce multi-row tables of nested hierarchies.
library(datasets)
library(taRifx)
library(xtable)
test.by <- bytable(ChickWeight$weight, list( ChickWeight$Chick, ChickWeight$Diet) )
colnames(test.by) <- c('Diet','Chick','Mean Weight')
print(latex.table.by(test.by), include.rownames = FALSE, include.colnames = TRUE, sanitize.text.function = force)
# then add \usepackage{multirow} to the preamble of your LaTeX document
# for longtable support, add ,tabular.environment='longtable' to the print command (plus add in ,floating=FALSE), then \usepackage{longtable} to the LaTeX preamble
... and Trick #3 Multiline entries in an Xtable
Generate some more data
moredata<-data.frame(Nominal=c(1:5), n=rep(5,5),
MeanLinBias=signif(rnorm(5, mean=0, sd=10), digits=4),
LinCI=paste("(",signif(rnorm(5,mean=-2, sd=5), digits=4),
", ", signif(rnorm(5, mean=2, sd=5), digits=4),")",sep=""),
MeanQuadBias=signif(rnorm(5, mean=0, sd=10), digits=4),
QuadCI=paste("(",signif(rnorm(5,mean=-2, sd=5), digits=4),
", ", signif(rnorm(5, mean=2, sd=5), digits=4),")",sep=""))
names(moredata)<-c("Nominal", "n","Linear Model \nBias","Linear \nCI", "Quadratic Model \nBias", "Quadratic \nCI")
Now produce our xtable, using the sanitize function to replace column names with the correct Latex newline commands (including double backslashes so R is happy)
<<label=multilinetable, results=tex, echo=FALSE>>=
foo<-xtable(moredata)
align(foo) <- c( rep('c',3),'p{1.8in}','p{2in}','p{1.8in}','p{2in}' )
print(foo,
      floating=FALSE,
      include.rownames=FALSE,
      sanitize.text.function = function(str) {
        str<-gsub("\n","\\\\", str, fixed=TRUE)
        return(str)
      },
      sanitize.colnames.function = function(str) {
        str<-c("Nominal", "n","\\centering Linear Model\\\\ \\% Bias","\\centering Linear \\\\ 95\\%CI", "\\centering Quadratic Model\\\\ \\%Bias", "\\centering Quadratic \\\\ 95\\%CI \\tabularnewline")
        return(str)
      })
@
(although this isn't perfect, as we need \tabularnewline so the table is formatted correctly, and xtable still puts in a final \\, so we end up with a blank line below the table header.)
You can also use the latextable function from the R package miscFuncs:
http://cran.r-project.org/web/packages/miscFuncs/index.html
latextable(M) where M is a matrix with mixed alphabetic and numeric entries outputs a basic LaTeX table onto screen, which can be copied and pasted into a LaTeX document. Where there are small numbers, it also replaces these with index notation (eg 1.2x10^{-3}).
Another R package for aggregating multiple regression models into LaTeX tables is texreg.