Knitting an HTML file won't publish the inference command - r

I am currently taking an R course and am struggling with knitting an HTML file.
All the code works fine within RStudio. The file also knits properly; however, it won't show any output for the last command, where I run the inference. I have added the code below.
Any input is much appreciated.
Thanks
Markus
Firstly, we filter for the religions and the year of interest:
```{r filter}
gss2012 = gss %>%
filter(year =="2012")
gssCatPro2012 = gss2012 %>%
filter(relig=="Catholic" | relig=="Protestant")
```
Now we create a first histogram of both religions to get a first idea of the distributions:
```{r plot both rel}
ggplot(data=gssCatPro2012, aes(x=childs))+geom_histogram()
```
Calculate ratio and represent in pie chart:
```{r ratio}
gssCatPro2012 %>%
  summarise(Catholicratio = sum(relig =="Catholic")/n())
percent <- c(32.64,67.36)
lbls <- c("Catholics", "Protestants")
pct <- round(percent/sum(percent)*100)
lbls <- paste(lbls, pct)
lbls <- paste(lbls,"%", sep="")
pie(percent, labels=lbls, col=rainbow(length(lbls)), main="Pie chart Catholics/Protestants")
```
Split data between religions:
```{r split}
gssCat2012 = gssCatPro2012 %>%
  filter(relig=="Catholic")
gssPro2012 = gssCatPro2012 %>%
  filter(relig=="Protestant")
```
Plot first distribution of Catholics, then Protestants:
```{r plot per religion}
ggplot(data=gssCat2012, aes(x=childs))+geom_histogram()
ggplot(data=gssPro2012, aes(x=childs))+geom_histogram()
```
Check if any NAs to clean:
```{r NA}
anyNA(gssCatPro2012$childs)
completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}
gssCatPro2012=completeFun(gssCatPro2012,"childs")
anyNA(gssCatPro2012$childs)
```
Calculate means for both religions:
```{r metrics}
gssCatPro2012 %>%
  group_by(relig) %>%
  summarise(mean_kids=mean(childs), med_kids=median(childs), sd_kids=sd(childs),n=n())
```
Inference
We are going to create a new variable in order to overwrite the content of the old variable relig:
```{create new variable}
gssCatPro2012new <- gssCatPro2012 %>%
  mutate(relignew = ifelse(relig == "Catholic", "Catholic", "Protestant"))
```
Now we can run the inference function and see whether we can reject the null hypothesis or not:
```{hypothesis test}
inference(y = childs, x = relignew, data = gssCatPro2012new, statistic = "mean", type = "ht", null = 0, alternative = "twosided", method = "theoretical")
```

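Rename the chunks to use underscores instead of spaces, and make sure each chunk header begins with a leading "r". A header without the "r" engine is not run as R code by knitr, so the chunk produces no output.
For example, use:
{r create_new_variable}
instead of:
{create new variable}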
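To spot such headers in a longer document, a small check along these lines may help. This is only a sketch: `check_chunk_headers` is a hypothetical helper written for this answer, not a knitr function, and it only looks at the text between the braces.

```r
# Hypothetical helper (not part of knitr): flag chunk headers that lack the
# "r" engine or whose label contains spaces.
check_chunk_headers <- function(lines) {
  # Keep only the text between the braces of each header-like line
  headers <- regmatches(lines, regexpr("\\{[^}]*\\}", lines))
  inner <- gsub("^\\{|\\}$", "", headers)
  # Everything before the first comma is "engine label"
  before_comma <- trimws(sub(",.*$", "", inner))
  # Drop the engine name "r" to isolate the label
  label <- sub("^r[[:space:]]+", "", before_comma)
  data.frame(
    header = headers,
    has_r_engine = grepl("^r([[:space:]]|$)", before_comma),
    label_has_space = grepl("[[:space:]]", label),
    stringsAsFactors = FALSE
  )
}

# Headers taken from the question
headers <- c("{r filter}", "{r plot both rel}",
             "{create new variable}", "{hypothesis test}")
check_chunk_headers(headers)
```

The last two headers come back with `has_r_engine` set to `FALSE`, which matches the two chunks that produce no output when knitting.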

Related

Conditionally display math formulas with variables in R Markdown

I have a problem with rendering formulas that contain variables in R Markdown.
Here are the variables I am using (simple example):
```{r, include=FALSE}
series <- c(1, 2, 3, 4)
count <- length(series)
sum <- sum(series)
sum2 <- sum(series^2)
sq_formula <- TRUE
```
My goal is to print a math expression in the Rmd (output = Word) with knitr, like:
$$S = \frac{\sum{S}}{n} = \frac{`r sum`}{`r count`} = `r mean(series)`$$
if sq_formula is FALSE, otherwise it should be:
$$S = \frac{\sum{S^2}}{n} = \frac{`r sum2`}{`r count`} = `r sum2 / count`$$
One way would be to write the formulas in an R chunk and print one of them conditionally, like:
```{r, include=FALSE}
formula1 <- '$$ \\overline{S} = \\frac{\\sum{S}}{n} = \\frac{\\sum{`r sum`}}{`r count`}$$'
formula2 <- '$$ \\overline{S} = \\frac{\\sum{S^2}}{n} = \\frac{\\sum{`r sum2`}}{`r count`}$$'
```
`r if (sq_formula) {formula2} else {formula1}`
but I can't insert the variables like r sum, r count inside the chunk.
I also tried to handle the problem with the sprintf function, but I haven't found a way to insert the variables into strings in math notation. So I would be grateful for any help.
What is the problem with sprintf? Can't you do the following?
```{r}
formula1 <- sprintf(
  '$$ \\overline{S} = \\frac{\\sum{S}}{n} = \\frac{\\sum{%s}}{%s}$$',
  sum, count
)
```
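Extending the sprintf idea to the conditional case from the question, a minimal sketch could look like the following. The helper name `make_formula` is an assumption introduced here for illustration, and the formula layout follows the two target expressions in the question.

```r
series <- c(1, 2, 3, 4)
sq_formula <- TRUE

# Hypothetical helper: build the display-math string for either variant.
# Backslashes are doubled because they sit inside an R string.
make_formula <- function(series, squared) {
  total <- if (squared) sum(series^2) else sum(series)
  symbol <- if (squared) "S^2" else "S"
  n <- length(series)
  sprintf(
    "$$S = \\frac{\\sum{%s}}{n} = \\frac{%s}{%s} = %s$$",
    symbol, total, n, total / n
  )
}

formula_text <- make_formula(series, sq_formula)
formula_text
```

The resulting string can then be placed in the text with an inline R expression that returns `formula_text`, or written from a chunk with `cat(formula_text)` and the chunk option `results='asis'`.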

kableExtra: How can I make the largest value in a row bold?

Suppose I have a table that looks like this:
x = matrix(runif(10*5),nrow=10,ncol=5)
When I display the matrix using kableExtra, I want the highest value in each row, for say the last 2 rows, to be bold.
I read this document https://rdrr.io/cran/kableExtra/f/inst/doc/awesome_table_in_pdf.pdf closely and could not work out how to use cell_spec correctly to achieve this.
I thought this would be easier than it turned out to be. As far as I can see, this is how to do it:
---
title: "Untitled"
output: pdf_document
---
```{r}
set.seed(123)
library(knitr)
library(kableExtra)
x <- matrix(round(runif(10*5),2), nrow=10,ncol=5)
j1 <- which.max(x[9,])
j2 <- which.max(x[10,])
col <- seq_len(ncol(x))
x[9,] <- x[9,] %>% cell_spec(bold = col == j1)
x[10,] <- x[10,] %>% cell_spec(bold = col == j2)
x %>% kable(booktabs = TRUE, escape = FALSE)
```
A few notes:
I rounded the values so they aren't so ugly when printed.
I couldn't see a way to do everything in one pipeline, though there probably is one. The trouble is that cell_spec is designed to work on vectors, not matrices.
Finally, the escape = FALSE in kable() is essential: otherwise you'll see the code to make it bold, rather than the bold entry itself.
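If more than two rows need the treatment, the same idea can be wrapped in a loop. The following is a sketch rather than a tested recipe: the name `rows_to_mark` is an assumption made here, and `format = "latex"` is passed explicitly so that the snippet does not depend on knitr detecting the output format.

```r
library(knitr)
library(kableExtra)

set.seed(123)
x <- matrix(round(runif(10 * 5), 2), nrow = 10, ncol = 5)

rows_to_mark <- 9:10            # rows whose maximum should be bold
cols <- seq_len(ncol(x))

out <- x                        # turns into a character matrix on first assignment
for (i in rows_to_mark) {
  # Read from the numeric matrix x, write the formatted cells into out
  out[i, ] <- cell_spec(x[i, ], format = "latex",
                        bold = cols == which.max(x[i, ]))
}

kable(out, format = "latex", booktabs = TRUE, escape = FALSE)
```

Reading from `x` and writing to `out` keeps `which.max()` working on numbers even after earlier rows have been converted to formatted strings. With ties, `which.max()` marks only the first maximum.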

knitr: how to print the visible parts of the expression in a custom code chunk using engine_output()?

I am creating a custom code chunk engine that rewrites the expressions a user passes into the code block into valid R code and then executes the analysis. Aside from rewriting the R expressions that the user inputs, the goal is for the code chunk to work like a regular code chunk.
I then execute the result and store it in a variable, which I am trying to output. I am running into two issues here:
It prints invisible values.
knit_print and engine_output do not seem to work with each other, i.e. I want output such as data frames to be printed in the table format, but I am not able to get it to work.
An example of what I am trying to do would be something like:
```{r}
custom_engine <- function(options) {
c = options$code
pasted <- paste(c, collapse = "\n")
code <- parse(text = pasted)
result = lapply(code, eval, envir = .GlobalEnv)
knit_print(result)
engine_output( options, code = c, out = NULL )
}
knitr::knit_engines$set(mmm = custom_engine)
```
```{r}
speed_data <- data.frame(
speed = rlnorm(100, log(200), 1),
device = sample(c("smartphone", "laptop", "tablet"), 100, TRUE, prob= c(0.1, 0.65, 0.25))
)
median = median(speed_data$speed)
iqr = IQR(speed_data$speed, na.rm=TRUE)
```
```{mmm}
result_analysis <- speed_data %>%
filter(speed < median + 3 * iqr & speed > 10) %>%
filter(device != "smartphone")
result_analysis
```
I would like the code above to print the table once, and in the default knit_print() format that RMarkdown does.
Any suggestions would be much appreciated.

Suppress list notation in Rmarkdown from lapply

I am using R Markdown to produce a report. One of the steps uses lapply() with a function that produces a plot, in order to produce multiple plots. The function and lapply work well, but between the plots I get notation showing which element of the list each one is.
```{r}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
```
```{r pressure, echo=FALSE}
myPlotFun <- function(z){
  diamonds %>%
    filter(color == !!z) %>%
    ggplot(aes(x= carat, y = price))+
    geom_point()
}
myList <- c("E","D","H")
lapply(myList, myPlotFun)
```
and I get:
```
## [[1]]
##
## [[2]]
```
How do I hide the list notation lines (e.g. ## [[2]]) in between the plots?
Using include = FALSE hides both the plots and the list notation, which is not what I want. I tried warning = FALSE, but that doesn't help.
These numbers come from printing the list that lapply returns. The easiest way to remove them is to use a for loop instead. Otherwise you could create a hook that removes any output other than plots:
```{r}
def <- knitr::knit_hooks$get("output")
knitr::knit_hooks$set(output = function(x, options) {
x <- def(x, options)
ifelse(!is.null(options$suppress), gsub(pattern = "```.*```", "", x), x)
})
```
Just set suppress = TRUE for the relevant chunks.
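The for loop alternative mentioned above could be sketched as follows. This is a minimal version that assumes only ggplot2 is loaded; it replaces the dplyr filter from the question with base subsetting so that the snippet has a single dependency.

```r
library(ggplot2)

# Same plot as in the question, using base subsetting instead of dplyr::filter
myPlotFun <- function(z) {
  ggplot(diamonds[diamonds$color == z, ], aes(x = carat, y = price)) +
    geom_point()
}

myList <- c("E", "D", "H")

# Nothing is auto-printed inside a loop, so each plot is printed explicitly
# and no list indices appear between the plots
for (z in myList) {
  print(myPlotFun(z))
}
```

If lapply is preferred, printing inside the function and wrapping the call in invisible() avoids the index lines in the same way: `invisible(lapply(myList, function(z) print(myPlotFun(z))))`.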

Loop in rmarkdown

I am relatively new to r and rmarkdown so I apologise in advance if this is a stupid question. This is a simple replication of a bigger dataset.
I have three columns in a dataframe:
df <- data.frame( c("a", "b"), c("c", "d"), c("e", NA))
names(df) <- c("X", "Y", "Z")
I want to show them in a rmarkdown file as follows:
I like a b.
This is c
This is e
This is d
I have written a function that includes
X <- 0
for (i in 1:nrow(df)) {
X[i] <- df$X[[i]] }
Y <- 0
for (i in 1:nrow(df)) {
Y[i] <- df$Y[[i]] }
Z <- 0
for (i in 1:nrow(df)) {
Z[i] <- df$Z[[i]] }
And in the markdown file (the bit I'm struggling with)
I like `r X` ### This is fine
``` {r}
for (i in 1:nrow(df)) {
Y[i]
Z[i] } ### Doesn't work and I want to include text i.e. This is
```
I want to make some sort of loop so that it prints the element in row 1 of column Y, then Z, then moves to the next row, and so on, skipping any NA.
Any help whatsoever would be majorly appreciated! :)
First, a tip about your first loop. A data frame column is already a vector, so you can assign it directly instead of copying it element by element. I recommend reading up on vectorization later. Hence, instead of:
X <- 0
for (i in 1:nrow(df)) {
X[i] <- df$X[[i]] }
try:
X <- as.character(df$X) # the column is already a vector; no loop needed
To answer your main question: you can name your code chunks to keep things organized. Use echo=FALSE if you want only the output and not the code printed. Now that you have your vectors, you can use @jason's suggestion:
I like `r X`
```{r code_chunk1, echo=FALSE}
paste0("This is ", X)
paste0("This is ", Y)
paste0("This is ", paste(Z,collapse = " ")) # if you want them all in the same line
```
Avoid the operator, it can produce unexpected results and create problems without you noticing! Visit this.
There is no need to use loops. However, the elements of df need to be re-arranged to get printed row-wise.
The rmarkdown file below reproduces the expected result:
---
title: Loop in rmarkdown
output: html_document
---
```{r, echo=FALSE}
df <- data.frame( c("a", "b"), c("c", "d"), c("e", NA))
names(df) <- c("X", "Y", "Z")
```
I like `r df$X`
```{r, echo=FALSE, warning=FALSE}
library(magrittr) # use piping to improve readability
df[, 2:3] %>% # treat columns like a matrix
t() %>% # change from row first to column first order
as.vector() %>% # flatten into vector
na.omit() %>% # drop NAs
paste("This is", .) %>% # Prepend text
knitr::kable(col.names = NULL) # print as table
```
The output lists "This is c", "This is e" and "This is d", one per row.
Note that knitr::kable(col.names = NULL) is used to create inline text, i.e., text output not wrapped in a verbatim element.
Alternatively, the chunk option results='asis' can be used:
```{r, echo=FALSE, warning=FALSE, results='asis'}
library(magrittr) # use piping to improve readability
df[, 2:3] %>% # treat columns like a matrix
t() %>% # change from row first to column first order
as.vector() %>% # flatten into vector
na.omit() %>% # drop NAs
paste("This is", ., collapse = "  \n") %>% # Prepend text and collapse into one string
cat() # use cat() instead of print()
```
Note that the 2 blanks before \n are required to indicate a line break in rmarkdown.
