I have an R script:
mydat=read.csv("C:/Users/Admin/Downloads/test.csv", sep=";",dec=",")
View(mydat)
str(mydat)
#deleted after FS
mydat$symboling.<-NULL
mydat$make.<-NULL
mydat$num.of.cylinders.<-NULL
mydat$fuel.type.<-NULL
mydat$aspiration.<-NULL
mydat$num.of.cylinders.<-NULL
#this vars have small num. of obs.
mydat$engine.type.<-NULL
mydat$engine.location.<-NULL
mydat$num.of.doors.<-NULL
mydat=na.omit(mydat)
#Feature Selection
library(Boruta) # provides Boruta() and getSelectedAttributes()
FS=Boruta(normalized.losses.~.,data=mydat)
getSelectedAttributes(FS, withTentative = F)
plot(FS, cex.axis=0.5)
#get scatterplot
scatter.smooth(x=mydat$length.,y=mydat$normalized.losses.,main="normalized losses ~ length")
#split sample on train and sample
index <- sample(1:nrow(mydat),round(0.70*nrow(mydat)))
train <- mydat[index,]
test <- mydat[-index,]
I need to save it in R Markdown format (with HTML output).
Of course, in RStudio I can do that via:
File > New File > R Markdown > HTML
and I get this template:
```{r cars}
summary(cars)
```
I don't want to write the ```{r} prefix manually.
Is it possible to have the parts of the code that are separated by comments
#
#
saved as separate chunks in R Markdown format?
As output I expect, for example:
```{r}
mydat$symboling.<-NULL
mydat$make.<-NULL
mydat$num.of.cylinders.<-NULL
mydat$fuel.type.<-NULL
mydat$aspiration.<-NULL
mydat$num.of.cylinders.<-NULL
```
You can use the spin() function from the knitr package.
It will produce an .md file (you can keep the intermediate .Rmd with the precious = TRUE argument), using the '#' character as the 'doc' (documentation) argument:
doc
A regular expression to identify the documentation lines; by default it follows the roxygen convention, but it can be customized, e.g. if you want to use ## to denote documentation, you can use '^##\s*'.
For example:
spin('test.R', precious = TRUE, doc = '#')
produces:
test.Rmd
```{r }
mydat=read.csv("C:/Users/Admin/Downloads/test.csv", sep=";",dec=",")
View(mydat)
str(mydat)
```
deleted after FS
```{r }
mydat$symboling.<-NULL
mydat$make.<-NULL
mydat$num.of.cylinders.<-NULL
mydat$fuel.type.<-NULL
mydat$aspiration.<-NULL
mydat$num.of.cylinders.<-NULL
```
this vars have small num. of obs.
...
test.md
```r
mydat=read.csv("C:/Users/Admin/Downloads/test.csv", sep=";",dec=",")
```
```
## Warning in file(file, "rt"): cannot open file 'C:/Users/Admin/Downloads/
## test.csv': No such file or directory
```
```
## Error in file(file, "rt"): cannot open the connection
```
```r
View(mydat)
```
...
You may also want to look at the stitch() function and its siblings (stitch_rhtml() and stitch_rmd()).
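As a small illustration of the spin() call above, here is a minimal sketch that converts a commented script to Rmd text without knitting it. The script contents are made up for the example. Because doc is a regular expression, I am assuming an anchored pattern such as '^#\\s*' is a safer choice than a bare '#', which would presumably also match a '#' in the middle of a code line.

```r
# Hypothetical three-line script; lines starting with '#' become prose.
script <- c(
  "x <- 1:10",
  "#summarise the data",
  "summary(x)"
)

if (requireNamespace("knitr", quietly = TRUE)) {
  # text = avoids touching files on disk; knit = FALSE returns the Rmd source
  # instead of knitting it.
  rmd <- knitr::spin(text = script, knit = FALSE, format = "Rmd",
                     doc = "^#\\s*")
  # Normalise to one element per line, whatever shape spin() returned.
  rmd_lines <- unlist(strsplit(paste(rmd, collapse = "\n"), "\n", fixed = TRUE))
  writeLines(rmd_lines)
}
```

With a file on disk, the equivalent would be spin('test.R', knit = FALSE, doc = '^#\\s*'), which writes test.Rmd next to the script.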
I've run my analyses in a source Rmd file and would like to knit a clean version from a final Rmd file using only a few of the chunks from the source. I've seen a few answers with regard to pulling all of the chunks from a source Rmd in Source code from Rmd file within another Rmd and How to source R Markdown file like `source('myfile.r')`?. I share the concern with these posts in that I don't want to port out a separate .R file, which seems to be the only way that read_chunk works.
I think I'm at the point where I can import the source Rmd, but now I'm not sure how to call specific chunks from it in the final Rmd. Here's a reproducible example:
SourceCode.Rmd
---
title: "Source Code"
output:
  pdf_document:
    latex_engine: xelatex
---
```{r}
# Load libraries
library(knitr) # Create tables
library(kableExtra) # Table formatting
# Create a dataframe
df <- data.frame(x = 1:10,
                 y = 11:20,
                 z = 21:30)
```
Some explanatory text
```{r table1}
# Potentially big block of stuff I don't want to have to copy/paste
# But I want it in the final document
kable(df, booktabs=TRUE,
      caption="Big long title for whatever") %>%
  kable_styling(latex_options=c("striped","HOLD_position")) %>%
  column_spec(1, width="5cm") %>%
  column_spec(2, width="2cm") %>%
  column_spec(3, width="3cm")
```
[Some other text, plus a bunch of other chunks I don't need for anyone to see in the clean version.]
```{r}
save(df, file="Source.Rdata")
```
FinalDoc.Rmd
---
title: "Final Doc"
output:
  pdf_document:
    latex_engine: xelatex
---
```{r setup, include=FALSE}
# Load libraries and data
library(knitr) # Create tables
library(kableExtra) # Table formatting
opts_chunk$set(echo = FALSE)
load("Source.Rdata")
```
As far as I can tell, this is likely the best way to load up SourceCode.Rmd (from the first linked source above):
```{r}
options(knitr.duplicate.label = 'allow')
source_rmd2 <- function(file, local = FALSE, ...){
  options(knitr.duplicate.label = 'allow')
  tempR <- tempfile(tmpdir = ".", fileext = ".R")
  on.exit(unlink(tempR))
  knitr::purl(file, output=tempR, quiet = TRUE)
  envir <- globalenv()
  source(tempR, local = envir, ...)
}
source_rmd2("SourceCode.Rmd")
```
At this point, I'm at a loss as to how to call the specific chunk table1 from SourceCode.Rmd. I've tried the following as per instructions here with no success:
```{r table1}
```
```{r}
<<table1>>
```
The first seems to do nothing, and the second throws an unexpected input in "<<" error.
I wrote a function source_rmd_chunks() that sources chunk(s) by label name. See gist.
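The gist itself is not reproduced here, so the following is only a rough base-R sketch of the idea, not the gist's implementation. The helper name source_chunks_by_label() and the sample lines are hypothetical. It assumes each chunk header looks like ```{r label, options} and that every chunk has a closing fence.

```r
# Hypothetical helper: evaluate only the chunks whose labels are in `labels`.
source_chunks_by_label <- function(rmd_lines, labels, envir = globalenv()) {
  starts <- grep("^```\\{r", rmd_lines)     # chunk headers
  fences <- grep("^```\\s*$", rmd_lines)    # closing fences
  for (s in starts) {
    # Strip the fence and braces, then take the first comma-separated token.
    header <- sub("^```\\{r[ ,]*", "", sub("\\}\\s*$", "", rmd_lines[s]))
    label <- trimws(strsplit(header, ",", fixed = TRUE)[[1]][1])
    if (is.na(label) || !label %in% labels) next
    end <- fences[fences > s][1]
    code <- rmd_lines[seq_len(end - s - 1) + s]
    eval(parse(text = code), envir = envir)
  }
  invisible(NULL)
}

# Stand-in for readLines("SourceCode.Rmd")
rmd <- c(
  "```{r}",
  "skipped <- TRUE",
  "```",
  "Some explanatory text",
  "```{r table1, echo=FALSE}",
  "tbl <- data.frame(x = 1:3)",
  "```"
)
env <- new.env()
source_chunks_by_label(rmd, "table1", envir = env)
print(ls(env))
```

Only the labelled chunk is evaluated, so objects from the unlabelled chunk never appear in env. Values are not printed automatically; a chunk that ends in a kable() call would need its result printed explicitly.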
I'm trying to create multiple Rmarkdown documents (i.e., letters) that contain all the same text but are addressed to different people and have some unique text for each person. I've been taking a similar approach as the one laid out here:
http://rmarkdown.rstudio.com/articles_mail_merge.html
Basically, I have an R script that creates the Rmarkdown pdfs:
## Packages
library(knitr)
library(rmarkdown)
## Data
personalized_info <- read.csv(file = "meeting_times.csv")
## Loop
for (i in 1:nrow(personalized_info)){
  rmarkdown::render(input = "mail_merge_handout.Rmd",
                    output_format = "pdf_document",
                    output_file = paste("handout_", i, ".pdf", sep=''),
                    output_dir = "handouts/")
}
and a .Rmd file to fill in the text below:
---
output: pdf_document
---
```{r echo=FALSE}
personalized_info <- read.csv("meeting_times.csv", stringsAsFactors = FALSE)
name <- personalized_info$name[i]
time <- personalized_info$meeting_time[i]
```
Dear `r name`,
Your meeting time is `r time`.
See you then!
When I run the above R script, I get a folder named "handouts" with files named "handout_1", "handout_2", etc. I would like the files to be named after the person in the dataset, so I changed "i" to "name" in the loop part of the code. This produces files with names like "handout_Ezra Zanders", but the file name does not match the name of the person in the R Markdown PDF.
Anyone know of a solution for this in the loop part of the script, or another way of doing this?
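You need to define the names inside your loop over i. By the way, because you are using an external script to render your markdown, it is not necessary to read the CSV file again in the markdown: rmarkdown::render() evaluates the document in the calling environment by default (envir = parent.frame()), so the name and time set in the loop are visible to the .Rmd.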
The R script that creates the Rmarkdown pdfs:
## Packages
library(knitr)
library(rmarkdown)
## Data
personalized_info <- read.csv(file = "meeting_times.csv")
## Loop
for (i in 1:nrow(personalized_info)) {
  name <- personalized_info$name[i]
  time <- personalized_info$meeting_time[i]
  rmarkdown::render(input = "mail_merge_handout.Rmd",
                    output_format = "pdf_document",
                    output_file = paste("handout_", name, ".pdf", sep=''),
                    output_dir = "handouts/")
}
and a .Rmd file to fill in the text below:
---
output: pdf_document
---
Dear `r name`,
Your meeting time is `r time`.
See you then!
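To see why this keeps the file name and the letter in sync, here is a minimal sketch without the render() call. The data frame contents and the handout_file() helper are made up for illustration. Replacing spaces and punctuation in the file name is my own assumption, not something the original loop does.

```r
# Hypothetical data standing in for meeting_times.csv
personalized_info <- data.frame(
  name = c("Ezra Zanders", "Ana O'Neil"),
  meeting_time = c("10:00", "10:30"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a file name that is safe on most file systems.
handout_file <- function(name) {
  paste0("handout_", gsub("[^A-Za-z0-9]+", "_", name), ".pdf")
}

for (i in seq_len(nrow(personalized_info))) {
  # name and time come from the same row i, so they always belong together
  name <- personalized_info$name[i]
  time <- personalized_info$meeting_time[i]
  cat(sprintf("%s -> %s (meeting at %s)\n", name, handout_file(name), time))
}
```

In the real loop, handout_file(name) would go into the output_file argument of rmarkdown::render().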
How can I use a variable as the chunk name? I have a child document which gets called a number of times, and I need to advance the chunk labels in such a manner that I can also cross-reference them.
Something like this:
child.Rmd
```{r }
if(!exists('existing')) existing <- 0
existing = existing + 1
myChunk <- sprintf("myChunk-%s",existing)
```
## Analysis Routine `r existing`
```{r myChunk,echo = FALSE}
#DO SOMETHING, LIKE PLOT
```
master.Rmd
# Analysis Routines
Analysis for this can be seen in figures \ref{myChunk-1}, \ref{myChunk-2} and \ref{myChunk-3}
```{r child = 'child.Rmd'}
```
```{r child = 'child.Rmd'}
```
```{r child = 'child.Rmd'}
```
EDIT POTENTIAL SOLUTION
Here is one potential workaround, inspired by SQL injection of all things...
child.Rmd
```{r }
if(!exists('existing')) existing <- 0
existing = existing + 1
myChunk <- sprintf("myChunk-%s",existing)
```
## Analysis Routine `r existing`
```{r myChunk,echo = FALSE,fig.cap=sprintf("The Caption}\\label{%s",myChunk)}
#DO SOMETHING, LIKE PLOT
```
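To make the "injection" explicit, here is a small sketch of the string the fig.cap expression builds. It assumes the caption text ends up verbatim inside a LaTeX \caption{...}; whether that holds depends on the output pipeline, so treat it as an illustration of the idea only.

```r
chunk_label <- sprintf("myChunk-%s", 1)

# The caption closes the \caption{ brace early and opens \label{ instead.
fig_cap <- sprintf("The Caption}\\label{%s", chunk_label)

# Assumed wrapping applied to the caption in LaTeX output
latex <- paste0("\\caption{", fig_cap, "}")
cat(latex, "\n")
```

The closing brace that was meant for \caption ends up closing \label, which is what makes \ref{myChunk-1} resolvable.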
One suggestion is to pre-knit the Rmd file into another Rmd file before knitting and rendering, as follows:
master.Rmd:
# Analysis Routines
Analysis for this can be seen in figures `r paste(paste0("\\ref{", CHUNK_NAME, 1:NUM_CHUNKS, "}"), collapse=", ")`
###
rmdTxt <- unlist(lapply(1:NUM_CHUNKS, function(n) {
  c(paste0("## Analysis Routine ", n),
    paste0("```{r ",CHUNK_NAME, n, ", child = 'child.Rmd'}"),
    "```")
}))
writeLines(rmdTxt)
###
child.Rmd:
```{r,echo = FALSE}
plot(rnorm(100))
```
To knit & render the Rmd:
devtools::install_github("chinsoon12/PreKnitPostHTMLRender")
library(PreKnitPostHTMLRender) #requires version >= 0.1.1
NUM_CHUNKS <- 5
CHUNK_NAME <- "myChunk-"
preknit_knit_render_postrender("master.Rmd", "test__test.html")
Hope it helps. Cheers!
If you're getting to this level of complexity, I suggest you look at the brew package.
That provides a templating engine where you can dynamically create the Rmd for knitting.
You get to reference R variables in the outer brew environment, and build your dynamic Rmd from there.
Dynamic chunk names are possible with knitr::knit_expand(). Arguments are referenced in the child document, including in the chunk headers, using {{arg_name}}.
So my parent doc contains:
```{r child_include, results = "asis"}
###
# Generate a section for each dataset
###
species <- c("a", "b")
out <- lapply(species, function(sp) knitr::knit_expand("child.Rmd"))
res = knitr::knit_child(text = unlist(out), quiet = TRUE)
cat(res, sep = "\n")
```
And my child doc, which has no YAML header, contains:
# EDA for species {{sp}}
```{r getname-{{sp}}}
paste("The species is", "{{sp}}")
```
See here in the RMarkdown cookbook.
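To check what the expanded child looks like before it is knitted, here is a minimal sketch that passes the template as text instead of reading child.Rmd. The template lines mirror the child document above; passing sp explicitly is my own choice for the example.

```r
# Same content as child.Rmd, held in a character vector for the example.
template <- c(
  "# EDA for species {{sp}}",
  "```{r getname-{{sp}}}",
  "paste(\"The species is\", \"{{sp}}\")",
  "```"
)

if (requireNamespace("knitr", quietly = TRUE)) {
  expanded <- vapply(c("a", "b"), function(sp) {
    # {{sp}} is replaced everywhere, including in the chunk header
    paste(knitr::knit_expand(text = template, sp = sp), collapse = "\n")
  }, character(1))
  cat(expanded, sep = "\n\n")
}
```

Each expanded copy has its own chunk label (getname-a, getname-b), so knitting them together does not trigger a duplicate label error.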
I am knitting a .Rmd file and want to have two outputs: the html and a purl'ed R script each time I run knit. This can be done with the following Rmd file:
---
title: "Purl MWE"
output: html_document
---
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r}
x=1
x
```
If you do not name the chunk, it works fine and you get html and .R output each time you run knit() (or click knit in RStudio).
However, if you name the chunk it fails. For example:
---
title: "Purl MWE"
output: html_document
---
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r test}
x=1
x
```
It fails with:
Quitting from lines 7-14 (Purl.Rmd)
Error in parse_block(g[-1], g[1], params.src) : duplicate label 'test'
Calls: <Anonymous> ... process_file -> split_file -> lapply -> FUN -> parse_block
Execution halted
If you comment out the purl() call, it will work with the named chunk. So there is something about how the purl() call is also naming chunks which causes knit() to think there are duplicate chunk names even when there are no duplicates.
Is there a way to include a purl() command inside a .Rmd file so both outputs (html and R) are produced? Or is there a better way to do this? My ultimate goal is to use the new rmarkdown::render_site() to build a website that updates the HTML and R output each time the site is compiled.
You can allow duplicate labels by including options(knitr.duplicate.label = 'allow') within the file as follows:
---
title: "Purl MWE"
output: html_document
---
```{r GlobalOptions}
options(knitr.duplicate.label = 'allow')
```
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r test}
x=1
x
```
This code isn't documented on the knitr website, but you can keep track with the latest changes direct from Github: https://github.com/yihui/knitr/blob/master/NEWS.md
A related approach to @ruaridhw's solution is to wrap knitr::purl() in callr::r(). The function below saves the R chunks from a specified R Markdown file to a temporary .R file:
# RMD to local R temp file
# inspiration: https://gist.github.com/noamross/a549ee50e8a4fd68b8b1
rmd_chunks_to_r_temp <- function(file){
  temp <- tempfile(fileext=".R")
  # needed callr so can use when knitting -- else can bump into "duplicate chunk
  # label" errors when running when knitting
  callr::r(function(file, temp){
    knitr::purl(file, output = temp)
  },
  args = list(file, temp))
}
This function also exists in funspotr:::rmd_chunks_to_r_temp() at brshallo/funspotr.
You can avoid this error with a bash chunk that calls purl in a separate R session. That way there's no need to allow duplicate labels.
An example use case is an Rmd file where the code is run (and not echo'd) throughout the report and then all the code chunks are shown with chunk names and code comments in an Appendix. If you don't require that additional functionality, you only need the part up to and including the bash chunk.
The idea is that report_end signifies where to stop purl such that the appendix code isn't considered "report code". Then read_chunk reads the entire R file into one code chunk which can then be echo'd with syntax highlighting if required.
---
title: "Purl MWE"
output: html_document
---
These code chunks are used in the background of the report however
their source is not shown until the Appendix.
```{r test1, echo=FALSE}
x <- 1
x
```
```{r test2, echo=FALSE}
x <- x + 1
x
```
```{r test3, echo=FALSE}
x <- x + 1
x
```
# Appendix
```{r, eval=TRUE}
report_end <- "^# Appendix"
temp <- tempfile(fileext = ".R")
Sys.setenv(PURL_IN = shQuote("this_file.Rmd"), # eg. knitr::current_input()
           PURL_OUT = shQuote(temp),
           PURL_END = shQuote(report_end))
```
```{bash, include=FALSE}
Rscript -e "lines <- readLines($PURL_IN, warn = FALSE)" \
        -e "knitr::purl(text = lines[1:grep($PURL_END, lines)], output = $PURL_OUT, documentation = 1L)"
```
```{r, include=FALSE}
knitr::read_chunk(temp, labels = "appendix")
unlink(temp)
```
```{r appendix, eval=FALSE, echo=TRUE}
```
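The truncation step inside the bash chunk can be tried on its own. This is a minimal base-R sketch with made-up document lines. Taking only the first match of report_end is my own addition, in case the pattern occurs more than once.

```r
# Hypothetical document lines standing in for readLines("this_file.Rmd")
lines <- c("Intro text", "```{r test1, echo=FALSE}", "x <- 1", "```",
           "# Appendix", "```{r appendix, eval=FALSE, echo=TRUE}", "```")

report_end <- "^# Appendix"

# Keep everything up to and including the first line matching report_end,
# so the appendix chunk itself is never passed to purl().
report_lines <- lines[seq_len(grep(report_end, lines)[1])]
print(report_lines)
```

Only report_lines would then be handed to knitr::purl(text = ...), which is why the appendix chunk does not end up quoting itself.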
I have about 60 .Rdata files in the same directory. The object name is the same in all of those .Rdata files. I want to write some code that loads and prints all 60 .Rdata files, each on a new page. For example, suppose the file names are file_1.rdata, file_2.rdata and file_3.rdata, and the object name in all three .Rdata files is table. The following knitr code shows exactly what I want:
```{r,echo=FALSE}
load("file_1.rdata")
print(table)
```
\pagebreak
```{r,echo=FALSE}
load("file_2.rdata")
print(table)
```
\pagebreak
```{r,echo=FALSE}
load("file_3.rdata")
print(table)
```
\pagebreak
But I have more than 60 files, so writing all the code by hand is impractical. I can write a for loop in an R chunk, but how can I start a new page for each .rdata file?
The for loop would be:
```{r,echo=FALSE}
names <- c("file_1.rdata","file_2.rdata","file_3.rdata")
for(i in 1:length(names)){
  current_object <- names[i]
  load(current_object)
  print(table)
}
```
You can try adding cat("\n\n\\pagebreak\n") inside your for loop, and results='asis' to your chunk options:
```{r,echo=FALSE, results='asis'}
names <- c("file_1.rdata","file_2.rdata","file_3.rdata")
for(i in 1:length(names)){
  current_object <- names[i]
  load(current_object)
  print(table)
  cat("\n\n\\pagebreak\n")
}
```
It works for me with mtcars:
---
title: "test"
output: pdf_document
---
```{r, echo=FALSE, results='asis'}
for (i in 1:3) {
  print(mtcars)
  cat("\n\n\\pagebreak\n")
}
```
NB you might want to look into the function kable to format your tables more nicely. Or using library(xtable):
```{r, echo=FALSE, results='asis'}
for (i in 1:3) {
  print(xtable::xtable(mtcars), type = "latex")
  cat("\n\n\\pagebreak\n")
}
```
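For the 60-file case, the file names do not have to be typed by hand. This is a rough sketch that builds the list with list.files() and loads each file into a scratch environment, so the loaded table does not mask base::table in the session. The temporary directory, the demo files and the print_tables() helper are all made up for the example.

```r
# Stand-ins for the real files: three .rdata files that each hold an object
# named `table`, as in the question.
demo_dir <- tempfile("rdata_demo_")
dir.create(demo_dir)
for (k in 1:3) {
  table <- data.frame(id = k, value = k * 10)
  save(table, file = file.path(demo_dir, sprintf("file_%d.rdata", k)))
}
rm(table)

# Hypothetical helper: print `table` from each file, followed by a page break.
print_tables <- function(files) {
  for (f in files) {
    env <- new.env()
    load(f, envir = env)   # load into a scratch environment
    print(env$table)
    cat("\n\n\\pagebreak\n")
  }
}

files <- list.files(demo_dir, pattern = "\\.rdata$", full.names = TRUE)
print_tables(files)
```

Inside the document, the print_tables(files) call would sit in a chunk with results='asis'. Note that list.files() sorts names alphabetically, so file_10.rdata comes before file_2.rdata unless you reorder the vector yourself.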