I need to use results = "asis" for reasons stated here: https://stackoverflow.com/a/36381976/
However, with that chunk option other output renders poorly. Specifically, I'm having trouble displaying prop.test results, and I expect the same would happen with other object types.
I've provided four options in the example below, all of which fall short in some way:
---
title: "R Notebook"
output:
  html_document:
    df_print: paged
---
```{r, echo=F, message=F, warning=F, results="asis"}
library(knitr)
library(pander)
out <- prop.test(c(10,30), c(20,40))
cat("# Header \n")
cat(" \n## Straight output\n")
out # Only properly renders first line
cat(" \n## Print\n")
print(out) # Only properly renders first line
cat(" \n## Kable\n")
#kable(out) # Will fail: Error in as.data.frame.default(x) : cannot coerce class "htest" to a data.frame
kable(unlist(out)) # Renders everything but in an ugly way
cat(" \n## Pander\n")
pander(out) # Misses confidence interval.
cat(" \n As you can see, Pander misses some information, such as the confidence interval")
```
Pander gets it closest to a nice display but misses some information (confidence interval). Perhaps there's a way to make it display all?
How can I nicely display the output of prop.test and similar?
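One option is to return to results = "markup" (the default) and replace your cat calls with asis_output (from the knitr package). Note that asis_output only takes effect in top-level expressions, so it will not work inside a for loop.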
---
title: "R Notebook"
output:
  html_document:
    df_print: paged
---
```{r, echo=F, message=F, warning=F}
library(knitr)
library(pander)
out <- prop.test(c(10,30), c(20,40))
asis_output("# Header \n")
asis_output(" \n## Straight output\n")
out # Only properly renders first line
asis_output(" \n## Print\n")
print(out) # Only properly renders first line
asis_output(" \n## Kable\n")
#kable(out) # Will fail: Error in as.data.frame.default(x) : cannot coerce class "htest" to a data.frame
kable(unlist(out)) # Renders everything but in an ugly way
asis_output(" \n## Pander\n")
pander(out) # Misses confidence interval.
asis_output(" \n As you can see, Pander misses some information, such as the confidence interval")
```
You can use formattable like this:
library(knitr)
library(formattable)
out <- prop.test(c(10,30), c(20,40))
cat("# Header \n")
cat(" \n## Straight output\n")
out # Only properly renders first line
cat(" \n## Print\n")
print(out) # Only properly renders first line
cat(" \n## Kable\n")
#kable(out) # Will fail: Error in as.data.frame.default(x) : cannot coerce class "htest" to a data.frame
kable(unlist(out)) # Renders everything but in an ugly way
cat(" \n## Pander\n")
df <- data.frame(value = unlist(out))
tdf <- as.data.frame(t(df))
formattable(tdf)
Because everything is now in a data frame, you can keep only the columns you want and rename them as needed.
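If you would rather not depend on formattable, the same idea can be sketched with base R: pull the pieces you care about out of the htest object into a one-row data frame and hand that to kable. The helper name htest_to_df and the column names are made up for this illustration, and it assumes the test returns the usual htest components (statistic, parameter, p.value, conf.int), as prop.test and t.test do.

```r
# Hypothetical helper: flatten selected components of an "htest" object
# into a one-row data frame (names chosen for illustration only).
htest_to_df <- function(x) {
  data.frame(
    method    = x$method,
    statistic = unname(x$statistic),
    df        = unname(x$parameter),
    p_value   = x$p.value,
    conf_low  = x$conf.int[1],
    conf_high = x$conf.int[2],
    stringsAsFactors = FALSE
  )
}

out <- prop.test(c(10, 30), c(20, 40))
res <- htest_to_df(out)
res
```

Inside a chunk, knitr::kable(res, digits = 3) should then render this as a regular table with the confidence interval included.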
I am using RStudio to write my R Markdown files. How can I remove the hashes (##) in the final HTML output file that are displayed before the code output?
As an example:
---
output: html_document
---
```{r}
head(cars)
```
You can include in your chunk options something like
comment=NA # to remove all hashes
or
comment='%' # to use a different character
More help on knitr available from here: http://yihui.name/knitr/options
If you are using R Markdown as you mentioned, your chunk could look like this:
```{r comment=NA}
summary(cars)
```
If you want to change this globally, you can include a chunk in your document:
```{r include=FALSE}
knitr::opts_chunk$set(comment = NA)
```
Just HTML
If your output is just HTML, you can make good use of the PRE or CODE HTML tag.
Example
```{r my_pre_example,echo=FALSE,include=TRUE,results='asis'}
knitr::opts_chunk$set(comment = NA)
cat('<pre>')
print(t.test(mtcars$mpg,mtcars$wt))
cat('</pre>')
```
HTML Result:
Welch Two Sample t-test
data: mtcars$mpg and mtcars$wt
t = 15.633, df = 32.633, p-value < 0.00000000000000022
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
14.67644 19.07031
sample estimates:
mean of x mean of y
20.09062 3.21725
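One caveat with raw <pre> output is that printed results can contain characters such as < or & (the p-value line above does), which HTML treats specially. Here is a minimal sketch of a wrapper that escapes them first. pre_block is a hypothetical helper name, and the sketch assumes HTML output only.

```r
# Hypothetical helper: capture an object's printed form and wrap it in <pre>,
# escaping the characters that are special in HTML.
pre_block <- function(x) {
  lines <- capture.output(print(x))
  lines <- gsub("&", "&amp;", lines, fixed = TRUE)  # must be escaped first
  lines <- gsub("<", "&lt;", lines, fixed = TRUE)
  lines <- gsub(">", "&gt;", lines, fixed = TRUE)
  paste0("<pre>\n", paste(lines, collapse = "\n"), "\n</pre>\n")
}

html <- pre_block(t.test(mtcars$mpg, mtcars$wt))
cat(html)
```

In a chunk with results='asis', the cat(html) call takes the place of the three separate cat/print calls.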
Just PDF
If your output is PDF, you may need a replacement function to escape special characters. Here is what I am using (it relies on str_replace_all from the stringr package):
```r
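library(stringr)  # provides str_replace_all()

# Escape characters that Markdown/LaTeX would otherwise interpret
tidyPrint <- function(data) {
  content <- paste0(data, collapse = "\n\n")
  content <- str_replace_all(content, "\\t", " ")      # tabs to spaces
  content <- str_replace_all(content, "\\ ", "\\\\ ")  # " " becomes "\ "
  content <- str_replace_all(content, "\\$", "\\\\$")  # "$" becomes "\$"
  content <- str_replace_all(content, "\\*", "\\\\*")  # "*" becomes "\*"
  content <- str_replace_all(content, ":", ": ")
  return(content)
}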
```
Example
The code also needs to be a little different:
```{r my_pre_example,echo=FALSE,include=TRUE,results='asis'}
knitr::opts_chunk$set(comment = NA)
resultTTest <- capture.output(t.test(mtcars$mpg,mtcars$wt))
cat(tidyPrint(resultTTest))
```
PDF Result
PDF and HTML
If you really need the page to work for both PDF and HTML output, the last step of tidyPrint should be a little different:
```r
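library(stringr)  # provides str_replace_all()

# Same escaping as before, but the result is wrapped in a <code> tag
tidyPrint <- function(data) {
  content <- paste0(data, collapse = "\n\n")
  content <- str_replace_all(content, "\\t", " ")      # tabs to spaces
  content <- str_replace_all(content, "\\ ", "\\\\ ")  # " " becomes "\ "
  content <- str_replace_all(content, "\\$", "\\\\$")  # "$" becomes "\$"
  content <- str_replace_all(content, "\\*", "\\\\*")  # "*" becomes "\*"
  content <- str_replace_all(content, ":", ": ")
  return(paste("<code>", content, "</code>\n"))
}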
```
Result
The PDF result is the same, and the HTML result is close to the previous one, but with an extra border.
It is not perfect, but it may be good enough.
I'm analyzing some data and would like to check for Simpson's paradox in R. I've installed the Simpsons package and loaded the library. Here is an example based on the package documentation:
---
output: html_document
---
```{r}
library(Simpsons)
#generating data
Coffee1=rnorm(100,100,15)
Neuroticism1=(Coffee1*.8)+rnorm(100,15,8)
g1=cbind(Coffee1, Neuroticism1)
Coffee2=rnorm(100,170,15)
Neuroticism2=(300-(Coffee2*.8)+rnorm(100,15,8))
g2=cbind(Coffee2, Neuroticism2)
Coffee3=rnorm(100,140,15)
Neuroticism3=(200-(Coffee3*.8)+rnorm(100,15,8))
g3=cbind(Coffee3, Neuroticism3)
data2=data.frame(rbind(g1,g2,g3))
colnames(data2) <- c("Coffee","Neuroticism")
example <- Simpsons(Coffee,Neuroticism,data=data2)
plot(example)
```
This is returning a plot with 3 clusters (exactly what I need). However, when I Knit the Rmd file to HTML, I'm getting a lot of equals signs (======) with a percentage next to it like a loading grid which I would like to remove from my final output.
You can suppress output in R Markdown by setting knitr chunk options. To hide all code output other than plots, we can use the following:
---
output: html_document
---
```{r echo=FALSE, results='hide', fig.keep='all', message = FALSE}
library(Simpsons)
#generating data
Coffee1=rnorm(100,100,15)
Neuroticism1=(Coffee1*.8)+rnorm(100,15,8)
g1=cbind(Coffee1, Neuroticism1)
Coffee2=rnorm(100,170,15)
Neuroticism2=(300-(Coffee2*.8)+rnorm(100,15,8))
g2=cbind(Coffee2, Neuroticism2)
Coffee3=rnorm(100,140,15)
Neuroticism3=(200-(Coffee3*.8)+rnorm(100,15,8))
g3=cbind(Coffee3, Neuroticism3)
data2=data.frame(rbind(g1,g2,g3))
colnames(data2) <- c("Coffee","Neuroticism")
example <- Simpsons(Coffee,Neuroticism,data=data2)
plot(example)
```
I would note that this package seems to print a lot more than most packages, so the combination of options is quite long.
An easier method would probably be to move the plot to a separate chunk and have all the analysis run before it. The include argument can be used to suppress all outputs, but this includes plots, hence why we must use two chunks:
```{r, include = FALSE}
# your code to build model
```
```{r}
plot(example)
```
Check out the full list of knitr chunk options in the knitr documentation.
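Another option, if you only want to silence one noisy call rather than a whole chunk, is to swallow its printed output with capture.output. This is a sketch under the assumption that the progress bar is written to standard output (as utils::txtProgressBar does by default). noisy_fit below is a made-up stand-in, not a function from the Simpsons package.

```r
# Stand-in for a function that prints a progress bar while it works
# (hypothetical; used only so the example is self-contained).
noisy_fit <- function() {
  pb <- txtProgressBar(min = 0, max = 3, style = 3)
  for (i in 1:3) setTxtProgressBar(pb, i)
  close(pb)
  42
}

# capture.output() collects everything printed to stdout;
# invisible() discards the captured text, while the assignment is kept.
invisible(capture.output(result <- noisy_fit()))
result
```

Messages and warnings are sent to stderr instead, so they still need message = FALSE and warning = FALSE (or suppressMessages()).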
I wasn't able to find much pertinent to this on Stack Overflow, or the web.
I'm getting this error:
> library(knitr)
> knit2html("pa1_template.rmd")
Error in knit2html("pa1_template.rmd") :
It seems you should call rmarkdown::render() instead of knitr::knit2html() because pa1_template.rmd appears to be an R Markdown v2 document.
I just ran it with rmarkdown::render(), and it created the HTML file. However, my assignment wants me to run it through knit2html() and create an md file.
When I run the Rmd file through the RStudio "Knit HTML" menu option, it creates the HTML file fine.
Any pointers appreciated.
Here is the content of the rmd file:
## Loading and preprocessing the data
Read the data file in.
```{r readfile}
steps<-read.csv("activity.csv",header=TRUE, sep=",")
steps_good<-subset(steps, !is.na(steps))
```
Sum the number of steps per day
```{r summarize/day}
steps_day<-aggregate(steps~date, data=steps_good, sum)
```
Create a histogram of the results
```{r histogram}
hist(steps_day$steps, main="Frequency of Steps/day", xlab="Steps/Day", border="blue", col="orange")
```
# What is the mean total number of steps taken per day?
Calculate the mean of the steps per day
```{r means_steps/day}
mean_steps<-mean(steps_day$steps)
mean_steps
```
Calculate the median of the steps per day
```{r median_steps/day}
med_steps<-median(steps_day$steps)
med_steps
```
#What is the average daily activity pattern?
Get the average steps per 5 minute interval
```{r avg_5_min}
step_5min<-aggregate(steps~interval, data=steps_good, mean)
```
Plot steps against time interval, averaged across all days
```{r plot_interval}
plot(step_5min$interval,step_5min$steps, type="l", main="steps per time interval",ylab="Steps",xlab="Interval")
```
On average, which interval during the day has the most steps.
```{r max_interval}
step_5min$interval[which.max(step_5min$steps)]
```
#Imputing missing values
How many NAs are there in the original table?
```{r NAs}
steps_na<-which(is.na(steps))
length(steps_na)
```
Merge 5 minute interval with original steps table
```{r merge}
steps_filled<-merge(steps, step_5min,by="interval")
```
Replace NA values with mean of steps values for that time interval
```{r replace_na}
steps_na<-which(is.na(steps_filled$steps.x))
steps_filled$steps.x[steps_na]<-steps_filled$steps.y[steps_na]
```
Create a histogram of the results
```{r new_hist}
steps_day_new<-aggregate(steps.x~date, data=steps_filled, sum)
hist(steps_day_new$steps.x, main="Frequency of Steps/day", xlab="Steps/Day", border="blue", col="orange")
```
It looks like the imputing of NA values increases the middle bar (mean/median) height, but other bars seem unchanged.
Calculate the new mean of the steps per day
```{r new_means_steps/day}
mean_steps<-mean(steps_day_new$steps.x)
mean_steps
```
Calculate the new median of the steps per day
```{r new_median_steps/day}
med_steps<-median(steps_day_new$steps.x)
med_steps
```
It looks like the mean did not change, but the median took on the value of the mean, now that some non-integer values were plugged in.
#Are there differences in activity patterns between weekdays and weekends?
Regenerate steps_filled, and flag whether a date is a weekend or a weekday.
Convert resulting column to factor.
```{r fill_weekdays}
steps_filled<-merge(steps, step_5min,by="interval")
steps_filled$steps.x[steps_na]<-steps_filled$steps.y[steps_na]
steps_filled<-cbind(steps_filled, wkday=weekdays(as.Date(steps_filled$date)))
steps_filled<-cbind(steps_filled, day_type="", stringsAsFactors=FALSE)
for(i in 1:nrow(steps_filled)){
if(steps_filled$wkday[i] %in% c("Saturday","Sunday"))
steps_filled$day_type[i]="Weekend"
else
steps_filled$day_type[i]="Weekday"
}
steps_filled$day_type<-as.factor(steps_filled$day_type)
```
Get average steps per interval and day_type
```{r plot_interva_day_type}
steps_interval_day<-aggregate(steps_filled$steps.x,by=list(steps_filled$interval,steps_filled$day_type),mean)
```
Plot the weekend and weekday results in a panel plot.
```{r day_type_plot}
weekday_intervals<-subset(steps_interval_day, steps_interval_day$Group.2=="Weekday",select=c("Group.1","x"))
weekend_intervals<-subset(steps_interval_day, steps_interval_day$Group.2=="Weekend",select=c("Group.1","x"))
par(mfrow=c(1,2))
plot(weekday_intervals$Group.1,weekday_intervals$x,type="l",xlim=c(0,2400), ylim=c(0,225),main="Weekdays",xlab="Intervals",ylab="Mean Steps/day")
plot(weekend_intervals$Group.1,weekend_intervals$x,type="l",xlim=c(0,2400), ylim=c(0,225),main="Weekends",xlab="Intervals",ylab="")
```
In RStudio, you can add keep_md: true in your YAML header:
---
title: "Untitled"
output:
  html_document:
    keep_md: true
---
With this option, you get both HTML and md files.
It worked for me with knit() instead of knit2html(). Try this:
setwd("working_directory")
library(knitr)
knit("PA1_template.Rmd", output = NULL)
Adding output = NULL was key for me.
Good luck!
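For reference, here is a self-contained sketch of what knit() does with an .Rmd file: it runs the chunks and writes a Markdown file. The temporary file and its contents are invented for the example. An explicit output path is passed here only to make the result easy to find; with output = NULL, knit() chooses the output name itself.

```r
# Build a tiny throwaway .Rmd file (contents are illustrative only).
fence <- strrep("`", 3)
rmd <- tempfile(fileext = ".Rmd")
writeLines(c("## Demo", "", paste0(fence, "{r}"), "1 + 1", fence), rmd)

md <- sub("\\.Rmd$", ".md", rmd)
if (requireNamespace("knitr", quietly = TRUE)) {
  # knit() converts the Rmd to md; quiet = TRUE hides the progress output.
  knitr::knit(rmd, output = md, quiet = TRUE)
  cat(readLines(md), sep = "\n")
}
```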
When typesetting an R Markdown document to PDF, if a function draws multiple plots, those plots often appear side-by-side, with only the first plot fully within the margins of the page.
Minimal R Markdown example:
---
title: "Example re plotting problem"
author: "Daniel E. Weeks"
date: "May 3, 2016"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Multiple plots within a loop
```{r}
plots <- function() {
plot(rnorm(100))
hist(rnorm(100))
}
for (i in 1:3) {
plots()
}
```
Here is a screenshot of page 2 of the generated PDF
which shows the problem. I have searched online, but haven't yet found a solution to this problem.
Thank you.
The plot hook solution proposed by user2554330 is simple and works well. So this code draws all the plots within the margins of the resulting PDF:
---
title: "Example re plotting problem"
author: "Daniel E. Weeks"
date: "May 3, 2016"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Multiple plots within a loop
```{r}
plots <- function() {
plot(rnorm(100))
hist(rnorm(100))
}
```
## Call plotting function
```{r}
my_plot_hook <- function(x, options)
paste("\n", knitr::hook_plot_tex(x, options), "\n")
knitr::knit_hooks$set(plot = my_plot_hook)
for (i in 1:3) {
plots()
}
```
The problem is that the generated .tex file has no spaces between the \includegraphics{} calls. LaTeX gives warnings about overfull hboxes, because the graphics aren't big enough to sit alone on a line, and are too big when it puts two on each line.
You can tell LaTeX (TeX really) to output the bad lines without putting two figures on each line by adding
\pretolerance=10000
in the text before the code chunk. You'll probably want to set it back to its default value
\pretolerance=100
after the code chunk, or LaTeX won't try hyphenation afterwards, and text can look really ugly.
Another way to fix this would be to force each figure to be in its own paragraph. You can do this by adding this code
my_plot_hook <- function(x, options)
paste("\n", knitr::hook_plot_tex(x, options), "\n")
knitr::knit_hooks$set(plot = my_plot_hook)
into a code chunk before you do your plotting. This puts a blank line before and after each figure.
I am using RStudio and knitr to knit .Rmd to .docx
I would like to include inline code in figure captions e.g. something like the following in the chunk options:
fig.cap = "Graph of nrow(data) data points"
However, knitr does not evaluate this code, instead just printing the unevaluated command.
Is there a way to get knitr to evaluate r code in figure/table captions?
knitr evaluates chunk options as R code. Therefore, to include a variable value in a figure caption, just compose the required string using paste or sprintf:
fig.cap = paste("Graph of", nrow(data), "data points")
Note that this might be problematic if data is created inside this chunk (and not in a previous chunk) because by default chunk options are evaluated before the chunk itself is evaluated.
To solve this issue, use the package option eval.after to have the option fig.cap be evaluated after the chunk itself has been evaluated:
library(knitr)
opts_knit$set(eval.after = "fig.cap")
Here is a complete example:
---
title: "SO"
output:
  word_document:
    fig_caption: yes
---
```{r fig.cap = paste("Graph of", nrow(iris), "data points.")}
plot(iris)
```
```{r setup}
library(knitr)
opts_knit$set(eval.after = "fig.cap")
```
```{r fig.cap = paste("Graph of", nrow(data2), "data points.")}
data2 <- data.frame(1:10)
plot(data2)
```
The first figure caption works even without eval.after because the iris dataset is always available (as long as datasets has been attached). Generating the second figure caption would fail without eval.after because data2 does not exist before the last chunk has been evaluated.
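The answer mentions sprintf as an alternative to paste but only shows the latter. Here is a small sketch of the sprintf form, using the built-in iris data. The variable name caption is only for illustration, since in practice the expression would go straight into the fig.cap chunk option.

```r
# Build the caption string the same way knitr would when it evaluates fig.cap.
caption <- sprintf("Graph of %d data points.", nrow(iris))
caption
# [1] "Graph of 150 data points."
```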