Removing last 10 lines of each RMarkdown in a folder - r

I want to run an R script that removes the last 10 lines of each R Markdown file, because those lines contain confidential content. I have been given 10 such R Markdown files in total. Is there a better way to do this than editing them one by one?
I have provided a dummy template of the R Markdown file.
---
title: "Untitled"
author: "Unknown"
date: "28/05/2021"
output: pdf_document
---
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
## Including Plots
You can also embed plots, for example:
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
## Including Plots
You can also embed plots, for example:
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
## Including Plots
You can also embed plots, for example:
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
## Including Plots
You can also embed plots, for example:
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.

You can use readLines and writeLines for this:
txt <- readLines("test.md")
# head() with a negative n drops the last 10 lines, even if the file has fewer than 10
writeLines(head(txt, -10), "test_short.md")
Then use dir to enumerate all files and automate with lapply or a for loop.
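A minimal sketch of that automation, assuming the files end in .Rmd and that the shortened copies should go to a separate folder so the originals stay untouched. The function name trim_rmd_files and its arguments are hypothetical, not part of any package:

```r
# Hypothetical helper: drop the last `n` lines of every .Rmd file in `dir_in`
# and write the shortened copies to `dir_out` (originals are not modified).
trim_rmd_files <- function(dir_in, dir_out, n = 10) {
  dir.create(dir_out, showWarnings = FALSE, recursive = TRUE)
  files <- list.files(dir_in, pattern = "\\.[Rr]md$", full.names = TRUE)
  for (f in files) {
    txt <- readLines(f, warn = FALSE)
    # head(txt, -n) keeps everything except the last n lines and
    # returns an empty vector if the file has n lines or fewer
    writeLines(head(txt, -n), file.path(dir_out, basename(f)))
  }
  invisible(files)
}
```

Calling trim_rmd_files("reports", "reports_short") would then process the whole folder in one go. The folder names here are placeholders.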

Related

Is there a way to add line breaks ONLY when exporting to PDF in R Markdown?

I think the question is quite self-explanatory but for avoidance of doubt I'll explain with more detail below:
I have an R Markdown document that works well if converted to HTML or uploaded to GitHub. When converting to PDF (using LaTeX), the results are not so pretty. I find that the biggest problem in a LaTeX PDF document is line breaks. I can fix the line breaks in the PDF document by adding "\ " characters, but that throws my HTML document out of whack.
Is there a way to manually add line breaks (or "space before/after paragraphs") for the PDF output only?
Thank you!
You can redefine the relevant spacings in the YAML header. \parskip controls the paragraph spacing. Code blocks are shaded using a snugshade environment from the framed package. We can also redefine the shaded environment for code blocks to have some vertical space at the start. Here's a reproducible example. Note: I also added the keep_tex parameter so you can see exactly what the generated tex file looks like, in case this is useful:
---
title: "test"
author: "A.N. Other"
header-includes:
- \setlength{\parskip}{\baselineskip}
- \renewenvironment{Shaded}{\vspace{\parskip}\begin{snugshade}}{\end{snugshade}}
output:
  pdf_document:
    keep_tex: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
Once you have the HTML output, you can simply print the HTML page to PDF. That might be an easy way to keep the original formatting.

Estimating read time for an R Markdown document and printing it possibly below the title

System: Windows 10, RStudio 1.0.153, R 3.4.1
I write blog posts in R Markdown. I would like to use Medium's approach of including an estimated reading time for the blog post just underneath the title, because research has suggested that including reading time in your blog post increases the likelihood that people will read it all. To estimate reading time I would like to use a simplified version of the algorithm used on Medium:
compute the number N of words in the document, preferably excluding the YAML preface and the R code chunks with echo=FALSE (i.e., all the text that my visitors won't see)
compute the number m of imported images and R-generated plots: if counting plots makes the problem too complicated, it's OK to count only imported images. I import each image separately in its own R chunk with the include_graphics function, so even just counting the number of occurrences of the include_graphics keyword would work.
the estimated time in minutes is:
seconds <- N/200*60+12*m
minutes <- round(seconds/60)
fin <- ifelse(minutes>1, " minutes", " minute")
estimation <- paste0("Estimated reading time: ", minutes, fin)
The code for the estimation should be executed each time the R Markdown doc is knitted, so ideally inside an R chunk in the R Markdown doc itself, or in an external function which is then sourced in an R chunk. This way, the estimation would automatically update as I edit the R Markdown document. How can I do that? For example, given the sample R document below, which has N=106 words and m=2 images, the text
"Estimated reading time: 1 minute"
should be included in the HTML output, just below the title, author and date.
Sample R Markdown document:
---
title: "Testy test"
author: "anonymous"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r load_data, echo=FALSE}
include_graphics("doge.jpg")
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
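One way the estimate described above could be computed in base R is sketched below. It is a simplification under stated assumptions: all fenced chunks are excluded from the word count (not only those with echo=FALSE), words are split on whitespace, and only include_graphics calls are counted as images. The function name estimate_reading_time and its arguments are hypothetical.

```r
# Hypothetical helper: estimate reading time from the lines of an R Markdown file.
estimate_reading_time <- function(lines, wpm = 200, sec_per_image = 12) {
  # Drop the YAML header: everything up to the second "---" line
  yaml <- which(grepl("^---\\s*$", lines))
  if (length(yaml) >= 2 && yaml[1] == 1) {
    lines <- lines[-seq_len(yaml[2])]
  }
  # Mark lines inside fenced chunks (the fence lines themselves included)
  fence <- grepl("^```", lines)
  in_chunk <- (cumsum(fence) %% 2 == 1) | fence
  # m: number of chunk lines that call include_graphics()
  m <- sum(grepl("include_graphics", lines[in_chunk], fixed = TRUE))
  # N: whitespace-separated words in the remaining prose
  prose <- paste(lines[!in_chunk], collapse = " ")
  words <- strsplit(trimws(prose), "\\s+")[[1]]
  N <- sum(nzchar(words))
  minutes <- round((N / wpm * 60 + sec_per_image * m) / 60)
  paste0("Estimated reading time: ", minutes,
         ifelse(minutes > 1, " minutes", " minute"))
}
```

Inside the document, a chunk with echo=FALSE placed right after the YAML header could call estimate_reading_time(readLines(knitr::current_input())). Treat that call as an assumption to verify in your own setup, since it relies on knitr reporting the file currently being knitted.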

R Markdown makes custom plot disappear if I set echo=FALSE

I created a custom function which sets mfrow to nxn and creates n^2 scatter plots, with multiple data sets on each plot, based on an input list of data frames. The signature of my plotting function looks like this:
plot.return.list<-function(df.list,num.plot,title)
Where df.list is my list of data frames, num.plot is the total number of plots to generate (used to set mfrow) and title is the overall plot title (the function generates titles for each individual sub-graph).
This creates plots fine when I run the function from the console. However, I'm trying to get this figure into a markdown document using RStudio, like so:
```{r, fig.height=6,fig.width=6}
plot.return.list(f.1.list,4,bquote(atop("Numerical Approximations vs Exact Soltuions for "
,dot(x)==-1*x*(t))))
```
Since I haven't set the echo option in my {r} statement, this prints both the plotting code as well as the plot itself. However, if my first line instead reads:
{r, fig.height=6,fig.width=6,echo=FALSE}
Then both the code AND the plot disappear from the final document.
How do I make the plot appear WITHOUT the code? According to the example RStudio gives, setting echo=FALSE should make the plot appear without the code, but that isn't the behavior I'm observing.
EDIT: I seem to have tracked my problem down to kable. Whether or not I'm making a custom plot-helper function, any call to kable kills my plot. This can be reproduced in a markdown:
---
title: "repro"
author: "Frank Moore-Clingenpeel"
date: "October 9, 2016"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(knitr)
options(default=TRUE)
repro.df<-data.frame((0.1*1:10)%*%t(1:10))
```
```{r, echo=FALSE}
kable(repro.df)
```
```{r, fig.height=6,fig.width=6,echo=FALSE}
plot(repro.df[,1],repro.df[,2])
```
In this code, the plot doesn't appear because I have echo set to FALSE; removing the flag makes the plot visible.
Also note that in my repro code, kable produces a table with a bunch of garbage in the last line--I don't know why, but this isn't true for my full original code, and I don't think it's related to my problem.
Thanks for the reproducible example. From this I can see that the problem is you don't have a newline between your table chunk and your plot chunk.
If you were to knit this and examine the MD file produced by knitr (or set html_document as your output format with keep_md: true to look at it), you would see that the table code and plot code are not separated by a blank line. Pandoc needs this to delimit the end of the table. Without it, it thinks your ![](path/to/image.png) is part of the table and hence puts it as a "junk line" in the table rather than as an image on its own.
Just add a newline between the two chunks and you will be fine. (Tables need to be surrounded with blank lines).
(I know you are compiling to LaTeX, so it may be confusing that I am talking about markdown. When you go from Rmd to PDF, R Markdown uses knitr to go from Rmd to MD, and then pandoc to go from MD to TeX. This is why you still need to make sure your markdown looks OK.)
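If many documents have this problem, the blank line could also be inserted programmatically. The following is only a sketch: separate_chunks is a hypothetical helper, and it assumes a chunk's closing fence is a bare ``` line and the next chunk opens with ```{ on the very next line.

```r
# Hypothetical helper: insert a blank line wherever a chunk's closing fence
# is immediately followed by the next chunk's opening fence.
separate_chunks <- function(lines) {
  n <- length(lines)
  if (n < 2) return(lines)
  closing <- grepl("^```\\s*$", lines[-n])  # line i is a bare closing fence
  opening <- grepl("^```\\{", lines[-1])    # line i + 1 opens a new chunk
  gap_after <- which(closing & opening)
  out <- as.list(lines)
  # Append an empty string after each closing fence that needs a gap
  out[gap_after] <- lapply(out[gap_after], function(x) c(x, ""))
  unlist(out)
}
```

Applied to the lines of the repro document (read with readLines), this leaves everything else unchanged and only adds the separator between the kable chunk and the plot chunk.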

Why does the test-RMarkdown in R-Studio give an html that does not show any results after pressing the knit-button?

I opened a new R Markdown file in RStudio and got the default small working example.
---
title: "test"
author: "Katharina Zweig"
date: "30. Januar 2016"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax
for authoring HTML, PDF, and MS Word documents. For more details on using
R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that
includes both content as well as the output of any embedded R code chunks
within the document. You can embed an R code chunk like this:
```{r}
summary(cars)
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
```
Note that the `echo = FALSE` parameter was added to the code chunk to
prevent printing of the R code that generated the plot.
It says you only need to press the knit button to create an HTML file
containing the text, the code and the results of the code. I got some long
error logs that were hardly helpful. Changing the output to PDF or Word
did not help either: the text was there, the code was there, but there were
no results of running the code. After producing the output, the original file vanished.
What is wrong?
When the knit button is used on a file not yet saved, it asks you under which name to save it. The file needs to be saved as an Rmd file - just give no extension and R-Studio will do it right. Then, the file does not vanish and the resulting document contains the results of the r commands. I thought it asked where to save the output and gave it the extension of the output file, i.e., either myfile.html / myfile.pdf / myfile.doc.
In the chunk option try this:
{r, results='asis'}
summary(cars)
You can also embed plots, for example:
{r, echo=FALSE, results='asis'}
plot(cars)
The results = 'asis' option should output the tables and graphs; if not, please let me know.

Proper R Markdown Code Organization

I have been reading about R Markdown (here, here, and here) and using it to create solid reports. I would like to try to use what little code I am running to do some ad hoc analyses and turn them into more scalable data reports.
My question is rather broad: Is there a proper way to organize your code around an R Markdown project? Say, have one script that generates all of the data structures?
For example: Let's say that I have the cars data set and I have brought in commercial data on the manufacturer. What if I wanted to attach the manufacturer to the current cars data set, and then produce a separate summary table for each company using a manipulated data set cars.by.name as well as plot a certain sample using cars.import?
EDIT: Right now I have two files open. One is an R Script file that has all of the data manipulation: subsetting and re-categorizing values. And the other is the R Markdown file where I am building out text to accompany the various tables and plots of interest. When I call an object from the R Script file--like:
```{r}
table(cars.by.name$make)
```
I get an error saying Error in summary(cars.by.name$make) : object 'cars.by.name' not found
EDIT 2: I found this older thread to be helpful. Link
---
title: "Untitled"
author: "Jeb"
date: "August 4, 2015"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r}
table(cars.by.name$make)
```
```{r}
summary(cars)
summary(cars.by.name)
```
```{r}
table(cars.by.name)
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
plot(cars.import)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
There is a solution for this sort of problem, explained here.
Basically, if you have an .R file containing your code, there is no need to repeat the code in the .Rmd file, but you can include the code from .R file. For this to work, the chunks of code should be named in the .R file, and then can be included by name in the .Rmd file.
test.R:
## ---- chunk-1 ----
table(cars.by.name$make)
test.Rmd
Just once on top of the .Rmd file:
```{r echo=FALSE, cache= F}
knitr::read_chunk('test.R')
```
For every chunk you're including (replace chunk-1 with the label of that specific chunk in your .R file):
```{r chunk-1}
```
Note that it should be left empty (as is) and in run-time your code from .R will be brought over here and run.
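A simpler alternative, which is not the read_chunk mechanism described above, is to source() the whole script once in a setup chunk. Knitting runs in a fresh R session, so objects created interactively are not visible to the document unless the document creates them itself. The sketch below uses a throwaway script in tempdir() with hypothetical contents as a stand-in for the real data-manipulation script.

```r
# Stand-in for the data-manipulation script (hypothetical file and contents)
script <- file.path(tempdir(), "prepare_data.R")
writeLines('cars.by.name <- data.frame(make = c("A", "B", "A"))', script)

# In the .Rmd setup chunk: run the script so its objects exist while knitting
source(script)

# Later chunks can now use the objects the script created
make_counts <- table(cars.by.name$make)
make_counts
```

The trade-off is that the whole script runs on every knit, whereas read_chunk lets you pull in individual named pieces.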
I often have many reports that need to run the same code with slightly different parameters. What I typically do is call all my "stats" functions in a separate script, save the results, and then just reference them in the report. The way to do this is as follows:
---
title: "Untitled"
author: "Author"
date: "August 4, 2015"
output: html_document
---
```{r, echo=FALSE, message=FALSE}
directoryPath <- "rawPath" ##Something like /Users/userid/RDataFile
fullPath <- file.path(directoryPath,"myROutputFile.RData")
load(fullPath)
```
Some Text, headers whatever
```{r}
summary(myStructure$value1) #Where myStructure was saved to the .RData file
```
You can save an RData file by using the save.image() command.
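As a rough sketch of the full round trip: the object name myStructure and the file name mirror the example above, while the data and the use of tempdir() are placeholders. Note that save() stores only the objects you name, whereas save.image() stores the entire workspace.

```r
# In the analysis script: build the objects the report needs (hypothetical data)
myStructure <- list(value1 = c(1, 2, 3, 4, 5))

# tempdir() stands in for the "rawPath" directory used in the report
fullPath <- file.path(tempdir(), "myROutputFile.RData")
save(myStructure, file = fullPath)

# In the R Markdown chunk: the knitting session starts empty, so load first
rm(myStructure)
load(fullPath)
summary(myStructure$value1)
```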
Hope that helps!
