Is there a way to disable R diagnostics for RMD file?
I'm running a simple preliminary code here and getting these warnings "Unknown or uninitialized column: xxx".
library(readr)
dataset = read_csv("breast_cancer.csv")
is.factor(dataset$cellularity)
is.factor(dataset$cancer_type_detailed)
A quick google search tells me these warnings are common with no way to fix it. At least I would like to know how to disable these in my RMD file. Also don't want to have to write in warnings=FALSE for every R chunk.
Link to dataset:
https://www.kaggle.com/raghadalharbi/breast-cancer-gene-expression-profiles-metabric
It's usually a bad idea to suppress warnings: very few of them are false positives. Suppressing them globally is a really bad idea.
In this case, the warnings signal that your dataset does not have columns with those names. From the str(dataset) information, we can see that there is a column named Cellularity and one named Cancer Type Detailed. Those don't match the names you used in your code, cellularity and cancer_type_detailed.
If you get a warning like this, you should examine the dataset to see what the names really are. One way is to print str(dataset); even better is to print names(dataset), which will show very clearly exactly what the names are.
There's one situation where you might not want to see that warning. If you are not sure if a column is included, you might use a test like
if (is.null(dataset$notthere)) { ... }
You'll get the warning from this code when dataset is a tibble. The way to avoid it is to use
if (is.null(dataset[["notthere"]])) ...
or
if (! "notthere" %in% names(dataset)) ...
You can globally hide warnings like this:
```{r setup, include = FALSE}
knitr::opts_chunk$set(warning = FALSE)
```
You add this chunk in the beginning of your Rmarkdown file and warnings are not shown anymore. You can also add other sommon chunk options there if you like.
Related
Using knitr and R Markdown, I can produce a tabularised output from a matrix using the following command:
```{r results='asis'}
kable(head(x))
```
However, I’m searching for a way to make the kable code implicit since I don’t want to clutter the echoed code with it. Essentially, I want this:
```{r table=TRUE}
head(x)
```
… to yield a formatted tabular (rather than the normal output='markdown') output.
I actually thought this must be pretty straightforward since it’s a pretty obvious requirement, but I cannot find any way to achieve this, either via the documentation or on the web.
My approach to create an output hook fails because once the data arrives at the hook, it’s already formatted and no longer the raw data. Even when specifying results='asis', the hook obtains the output as a character string and not as a matrix. Here’s what I’ve tried:
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (! is.null(options$table))
kable(x)
else
default_output_hook(x, options)
)
But like I said, this fails since x is not the original matrix but rather a character string, and it doesn’t matter which value for the results option I specify.
Nowadays one can set df_print in the YAML header:
---
output:
html_document:
df_print: kable
---
```{r}
head(iris)
```
I think other answers are from a time when the following didn't work, but now we can just do :
```{r results='asis', render=pander::pander}
head(x)
```
Or set this for all chunks in the setup chunk, for instance :
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, render=pander::pander)
```
Lacking a better solution I’m currently re-parsing the character string representation that I receive in the hook. I’m posting it here since it kind of works. However, parsing a data frame’s string representation is never perfect. I haven’t tried the following with anything but my own data and I fully expect it to break on some common use-cases.
reparse <- function (data, comment, ...) {
# Remove leading comments
data <- gsub(sprintf('(^|\n)%s ', comment), '\\1', data)
# Read into data frame
read.table(text = data, header = TRUE, ...)
}
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (is.null(options$table))
default_output_hook(x, options)
else {
extra_opts <- if (is.list(options$table)) options$table else list()
paste(kable(do.call(reparse, c(x, options$comment, extra_opts))),
collapse = '\n')
}
)
This will break if the R markdown comment option is set to a character sequence containing a regular expression special char (e.g. *), because R doesn’t seem to have an obvious means of escaping a regular expression.
Here’s a usage example:
```{r table=TRUE}
data.frame(A=1:3, B=4:6)
```
You can pass extra arguments to the deparse function. This is necessary e.g. when the table contains NA values because read.table by default interprets them as strings:
```{r table=list(colClasses=c('numeric', 'numeric'))}
data.frame(A=c(1, 2, NA, 3), B=c(4:6, NA))
```
Far from perfect, but at least it works (for many cases).
Not exactly what you are looking for, but I am posting an answer here (that could not fit in a comment) as your described workflow is really similar to what my initial goal and use-case was when I started to work on my pander package. Although I really like the bunch of chunk options that are available in knitr, I wanted to have an engine that makes creating documents really easy, automatic and without any needed tweaks. I am aware of the fact that knitr hooks are really powerful, but I just wanted to set a few things in my Rprofile and let the literate programming tool its job without further trouble, that ended to be Pandoc.brew for me.
The main idea is to specify a few options (what markdown flavour are you using, what's your decimal mark, favorite colors for your charts etc), then simply write your report in a brew syntax without any chunk options, and the results of your code would be automatically transformed to markdown. Then convert that to pdf/docx/odt etc. with Pandoc.
The following is a segment of code that I need to perform so that in the later stages I can perform other functions to make histograms of the dataset. My problem is that even after correctly importing the dataset, it does not recognize the "Aboard" Variable.
This clearly shows that the variable exists and that the dataset has been imported properly but whenever i try and run the chunks the second chunk comes up with an error saying that it does not recognise the variable. I have also tried to do this with another variable and it comes up with the same error. I do not know why it is happening and if it it because i have missed out a step. I tried to fix this by putting the linecolnames(before) in but it did no good.
setwd("~/Uni Y2/Stats/Group Project")
Before <- read.csv("before_911_no_summary.csv",header = TRUE)
After <- read.csv("after_911_no_summary.csv",header = TRUE)
colnames(Before) <-c("Date","Location","Aboard","Fatalities","Ground",
"Total.dead")
colnames(After) <-c("Date1","Location1","Aboard1","Fatalities1","Ground1",
"Total.dead1")
```
```{r}
Survivors<- (Aboard-Fatalities)
Survivors1<- (Aboard1-Fatalities1)
```
I've been trying to find this on the web but haven't had any luck. I'm working on creating a report with R using knitr and was wondering if anyone knows of a good resource for all options involving <<>>=. I've seen some examples like <<setup, include=FALSE, cache=FALSE>>= but I don't know what these mean and would like to know what else I can do.
To let you close the question - everything is here: http://yihui.name/knitr/options/#chunk_options, I will just paste the most important (in my opinion) options below:
Code Evaluation
eval: (TRUE; logical) whether to evaluate the code chunk; it can also be a numeric vector to select which R expression(s) to evaluate, e.g. eval=c(1, 3, 4) or eval=-(4:5)
Text Results
echo: (TRUE; logical or numeric) whether to include R source code in the output file; besides TRUE/FALSE which completely turns on/off the source code, we can also use a numeric vector to select which R expression(s) to echo in a chunk, e.g. echo=2:3 means only echo the 2nd and 3rd expressions, and echo=-4 means to exclude the 4th expression
results: ('markup'; character) takes these possible values
markup: mark up the results using the output hook, e.g. put results in a special LaTeX
asis: output as-is, i.e., write raw results from R into the output document
hold: hold all the output pieces and push them to the end of a chunk
hide hide results; this option only applies to normal R output (not warnings, messages or errors)
collapse: (FALSE; logical; applies to Markdown output only) whether to, if possible, collapse all the source and output blocks from one code chunk into a single block (by default, they are written to separate <pre></pre> blocks)
warning: (TRUE; logical) whether to preserve warnings (produced by warning()) in the output like we run R code in a terminal (if FALSE, all warnings will be printed in the console instead of the output document); it can also take numeric values as indices to select a subset of warnings to include in the output
error: (TRUE; logical) whether to preserve errors (from stop()); by default, the evaluation will not stop even in case of errors!! if we want R to stop on errors, we need to set this option to FALSE
message: (TRUE; logical) whether to preserve messages emitted by message() (similar to warning)
include: (TRUE; logical) whether to include the chunk output in the final output document; if include=FALSE, nothing will be written into the output document, but the code is still evaluated and plot files are generated if there are any plots in the chunk, so you can manually insert figures; note this is the only chunk option that is not cached, i.e., changing it will not invalidate the cache.
Cache
cache: (FALSE; logical) whether to cache a code chunk; when evaluating code chunks, the cached chunks are skipped, but the objects created in these chunks are (lazy-) loaded from previously saved databases (.rdb and .rdx) files, and these files are saved when a chunk is evaluated for the first time, or when cached files are not found (e.g. you may have removed them by hand)
Plots
fig.path: ('figure/'; character) prefix to be used for figure filenames (fig.path and chunk labels are concatenated to make filenames); it may contain a directory like figure/prefix- (will be created if it does not exist); this path is relative to the current working directory
fig.width, fig.height: (both are 7; numeric) width and height of the plot, to be used in the graphics device (in inches) and have to be numeric
dev: ('pdf' for LaTeX output and 'png' for HTML/markdown; character) the function name which will be used as a graphical device to record plots
An examplary chunk:
```{r global_options, include = FALSE}
knitr::opts_chunk$set(fig.width = 9, fig.height = 4, fig.path = "Figs/", dev = "svg",
echo = FALSE, warning = FALSE, message = FALSE,
cache = FALSE, tidy = FALSE, size = "small")
```
Using knitr and R Markdown, I can produce a tabularised output from a matrix using the following command:
```{r results='asis'}
kable(head(x))
```
However, I’m searching for a way to make the kable code implicit since I don’t want to clutter the echoed code with it. Essentially, I want this:
```{r table=TRUE}
head(x)
```
… to yield a formatted tabular (rather than the normal output='markdown') output.
I actually thought this must be pretty straightforward since it’s a pretty obvious requirement, but I cannot find any way to achieve this, either via the documentation or on the web.
My approach to create an output hook fails because once the data arrives at the hook, it’s already formatted and no longer the raw data. Even when specifying results='asis', the hook obtains the output as a character string and not as a matrix. Here’s what I’ve tried:
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (! is.null(options$table))
kable(x)
else
default_output_hook(x, options)
)
But like I said, this fails since x is not the original matrix but rather a character string, and it doesn’t matter which value for the results option I specify.
Nowadays one can set df_print in the YAML header:
---
output:
html_document:
df_print: kable
---
```{r}
head(iris)
```
I think other answers are from a time when the following didn't work, but now we can just do :
```{r results='asis', render=pander::pander}
head(x)
```
Or set this for all chunks in the setup chunk, for instance :
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, render=pander::pander)
```
Lacking a better solution I’m currently re-parsing the character string representation that I receive in the hook. I’m posting it here since it kind of works. However, parsing a data frame’s string representation is never perfect. I haven’t tried the following with anything but my own data and I fully expect it to break on some common use-cases.
reparse <- function (data, comment, ...) {
# Remove leading comments
data <- gsub(sprintf('(^|\n)%s ', comment), '\\1', data)
# Read into data frame
read.table(text = data, header = TRUE, ...)
}
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (is.null(options$table))
default_output_hook(x, options)
else {
extra_opts <- if (is.list(options$table)) options$table else list()
paste(kable(do.call(reparse, c(x, options$comment, extra_opts))),
collapse = '\n')
}
)
This will break if the R markdown comment option is set to a character sequence containing a regular expression special char (e.g. *), because R doesn’t seem to have an obvious means of escaping a regular expression.
Here’s a usage example:
```{r table=TRUE}
data.frame(A=1:3, B=4:6)
```
You can pass extra arguments to the deparse function. This is necessary e.g. when the table contains NA values because read.table by default interprets them as strings:
```{r table=list(colClasses=c('numeric', 'numeric'))}
data.frame(A=c(1, 2, NA, 3), B=c(4:6, NA))
```
Far from perfect, but at least it works (for many cases).
Not exactly what you are looking for, but I am posting an answer here (that could not fit in a comment) as your described workflow is really similar to what my initial goal and use-case was when I started to work on my pander package. Although I really like the bunch of chunk options that are available in knitr, I wanted to have an engine that makes creating documents really easy, automatic and without any needed tweaks. I am aware of the fact that knitr hooks are really powerful, but I just wanted to set a few things in my Rprofile and let the literate programming tool its job without further trouble, that ended to be Pandoc.brew for me.
The main idea is to specify a few options (what markdown flavour are you using, what's your decimal mark, favorite colors for your charts etc), then simply write your report in a brew syntax without any chunk options, and the results of your code would be automatically transformed to markdown. Then convert that to pdf/docx/odt etc. with Pandoc.
I am trying to run a script that has lots of comments to explain each table, statistical test and graph. I am using RStudio IDE as follows
source(filename, echo=T)
That ensures that the script outputs everything to the console. If I run the following sequence it will send all the output to a txt file and then switch off the output diversion
sink("filenameIwantforoutput.txt")
source(filename, echo=T)
sink()
Alas, I am finding that a lot of my comments are not being outputted. Instead I get
"...but only if we had had an exclusively b .... [TRUNCATED]".
Once before I learned where to preserve the output but that was a few months ago and now I cannot remember. Can you?
Set the max.deparse.length= argument to source. You probably need something greater than the default of 150. For example:
source(filename, echo=TRUE, max.deparse.length=1e3)
And note the last paragraph in the Details section of ?source reads:
If ‘echo’ is true and a deparsed
expression exceeds
‘max.deparse.length’, that many
characters are output followed by ‘
.... [TRUNCATED] ’.
You can make this behavior the default by overriding the source() function in your .Rprofile.
This seems like a reasonable case for overriding a function because in theory the change should only affect the screen output. We could contrive an example where this is not the case, e.g., can capture the screen output and use as a variable like capture.output(source("somefile.R")) but it seems unlikely. Changing a function in a way that the return value is changed will likely come back to bite you or whoever you share your code with (e.g., if you change a default of a function's na.rm argument).
.source_modified <- source
formals(.source_modified)$max.deparse.length <- Inf
# Use 'nms' because we don't want to do it for all because some were already
# unlocked. Thus if we lock them again, then we are changing the previous
# state.
# (https://stackoverflow.com/a/62563023/1376404)
rlang::env_binding_unlock(env = baseenv(), nms = "source")
assign(x = "source", value = .source_modified, envir = baseenv())
rlang::env_binding_lock(env = baseenv(), nms = "source")
rm(.source_modified)
An alternative is to create your own 'alias'-like function. I use the following in my .Rprofile:
s <- source
formals(s)$max.deparse.length <- Inf
formals(s)$echo <- TRUE