Place line break within a single cell in R data frame - r

I am working on a Shiny app that collects some user inputs, performs some data processing and finally sends an email to the user with the required output.
While creating the table that I need to send to the user, there is one cell within the table where I need to include line breaks.
You can refer to the image below for the desired output:
I have tried using the R code below:
a <- data.frame("SrNo" = 1, "Description" = "Name: John \n Age: 45")
I have also tried other variations of line breaks, such as \\n (with two backslashes) and \r\n.
However, I only get the following output with a space rather than a line break.
I have also tried implementing the solution provided at the following link - although without much success.
line break within cell for huxtable table
It would be great if you could help me place the line break within the Description column of the data frame.

Related

Looping variables in the parameters of the YAML header of an R Markdown file and automatically outputting a PDF for each variable

I am applying for junior data analyst positions and have come to the realization that I will be sending out a lot of cover letters.
To (somewhat) ease the pain and suffering that this will entail, I want to automate the parts of the cover letter that are suited to automation, and I will be using R Markdown to (hopefully) achieve this.
For the purposes of this question, let's say that the parts I am looking to automate are the position applied for and the company looking to hire someone for that position, both of which are used in the header of the cover letter.
These are the steps I envision in my mind's eye:
Gather the positions of interest and corresponding companies in an Excel spreadsheet. This gives an Excel sheet with two columns containing the variables position and company, respectively.
Read the Excel file into the R Markdown as a data frame/tibble (let's call this jobs).
Define two parameters in the YAML header of the .Rmd file to look something like this:
---
output: pdf_document
params:
  position: jobs$position[i]
  company: jobs$company[i]
---
The heading of the cover letter would then look something like this:
"Application for the position as `r params$position` at `r params$company`"
To summarize: so that I don't have to change the parameter values manually for each cover letter, I would like to read an Excel file with the position titles and company names, loop these through the parameters in the YAML header, and then have R Markdown output a PDF for each pair of position and company (and ideally have the name of each PDF include the position title and company name for easier identification when sending the letters out). Is that possible? (Note: the position titles and company names do not necessarily have to be stored in an Excel file; that's just how I've collected them.)
Hopefully, the above makes clear what I am trying to achieve.
Any nudges in the right direction are greatly appreciated!
EDIT (11 July 2021):
I have partly arrived at an answer to this.
The trick is to define a function that includes the rmarkdown::render function. This function can then be included in a nested for-loop to produce the desired PDF files.
Again, assuming that I want to automate the position and the company, I defined the rendering function as follows (in a script separate from the "main" .Rmd file containing the text [named "loop_test.Rmd" here]):
render_function <- function(position, company) {
  rmarkdown::render(
    # Name of the 'main' .Rmd file
    'loop_test.Rmd',
    # What should the output PDF files be called?
    output_file = paste0(position, '-', company, '.pdf'),
    # Define the parameters that are used in the 'main' .Rmd file
    params = list(position = position, company = company),
    # Evaluate the document in the calling environment
    # (the argument is spelled envir, not evir)
    envir = parent.frame()
  )
}
Then, use the function in a for-loop:
for (position in positions$position) {
  for (company in positions$company) {
    render_function(position, company)
  }
}
Here, the Excel file containing the relevant positions has been read in as positions, with two variables called position and company.
I tested this method using 3 "observations" each for position and company ("Position 1", "Position 2" and "Position 3", and "Company 1", "Company 2" and "Company 3"). One problem with the above method is that it produces 3^2 = 9 reports. For example, Position 1 is used in letters for Company 1, Company 2 and Company 3. I obviously only want to match Position 1 with Company 1. Does anyone have any idea how to achieve this? It is not much of a problem for two variables with only three observations, but my intent is to use several additional parameters. The number of companies (i.e. "observations") is, unfortunately, also likely to be quite large before I can end my search... With, say, 5-6 parameters and 20 companies, the number of reports output will obviously become ridiculous.
As said, I am almost there, but any nudges in the right direction for how to restrict the output to only "match" the company with the position would be highly appreciated.
You can iterate over the rows instead, so that each position is used only with the company in the same row:
for (i in seq_len(nrow(positions))) {
  render_function(positions$position[i], positions$company[i])
}
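To make the difference between the nested loops and the row-wise approach concrete, here is a minimal sketch. It assumes a small stand-in data frame and a hypothetical helper, output_name(), that only builds the file name, so it runs without rmarkdown or a .Rmd file. Map() is used here as one alternative to the index loop; in real use, render_function would presumably take the helper's place.

```r
# Hypothetical stand-in for the data frame read from the Excel file
positions <- data.frame(
  position = c("Position 1", "Position 2", "Position 3"),
  company  = c("Company 1", "Company 2", "Company 3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: builds the output file name only
output_name <- function(position, company) {
  paste0(position, "-", company, ".pdf")
}

# Nested loops visit every combination of position and company
nested <- character(0)
for (position in positions$position) {
  for (company in positions$company) {
    nested <- c(nested, output_name(position, company))
  }
}

# Map() walks the two columns in parallel, one call per row
paired <- unlist(
  Map(output_name, positions$position, positions$company),
  use.names = FALSE
)

length(nested)  # number of files from the nested loops
length(paired)  # number of files from the row-wise pairing
```

With more parameters, the same pattern should extend by adding further columns to the data frame and passing them as extra arguments to Map().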

Why is R merging all the rows in my CSV file as one whole document?

I am using R for a sentiment analysis. My source file which contains around 50 reviews made by guests has been created in Excel (with each review recorded in a single row and single column). So, all reviews are found in Column A, with no headers. The file has then been saved as a csv file and stored in a folder.
My R code is as follows:
library(tm)
docs<-Corpus(DirSource('E:/Sentiment Analysis'))
#checking a particular review in the document
writeLines(as.character(docs[[20]]))
Running that last line gives me an out-of-bounds error.
When I change it to writeLines(as.character(docs[[1]])), R displays all the reviews as if they were one whole paragraph.
How can I correct this issue?
The tm::Corpus() function used with DirSource() treats each file as a separate document, rather than each line within one file as a separate document.
To read each row of a text file as a separate document, one can use the Corpus(VectorSource()) syntax.
As an example, we'll create a text file, read it from a directory to illustrate how Corpus() behaves with DirSource(), versus how we would read it with VectorSource().
# represents the contents of the text file that was stored in
# ./data/textMining/ExcelFile1.csv
aTextFile <- "This is line one of text.
This is line two of text. This is a second sentence in line two."
library(tm)
# read as the OP read it
corpusDir <- "./data/textMining"
aCorpus <- Corpus(DirSource(corpusDir))
length(aCorpus) # shows only one item in list, entire file
# use pipe as separator because documents include commas.
aDataFrame <- read.table("./data/textMining/ExcelFile1.csv", header = FALSE,
                         sep = "|", stringsAsFactors = FALSE)
# use VectorSource to treat each row as a separate document
aCorpus <- Corpus(VectorSource(aDataFrame$V1))
# print the two documents
aCorpus[1]$content
aCorpus[2]$content
...and the output. First, the length of the corpus as we read it with DirSource():
> length(aCorpus) # shows only one item in list, entire file
[1] 1
Second, we'll print the two rows from the second read, illustrating that they are treated as separate documents.
> aCorpus <- Corpus(VectorSource(aDataFrame$V1))
> aCorpus[1]$content
[1] "This is line one of text."
> aCorpus[2]$content
[1] "This is line two of text. This is a second sentence in line two. "
>
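If the file really does hold exactly one review per line, base R's readLines() is another way to get one element per row before handing the result to VectorSource(). The sketch below is only an illustration: it writes a temporary file as a hypothetical stand-in for the CSV, and the tm call is left as a comment so that the snippet runs without the package installed. Note that if Excel wrapped any review in quotes when exporting, readLines() would keep those quotes.

```r
# Hypothetical stand-in for the CSV file: one review per line
reviews_file <- tempfile(fileext = ".csv")
writeLines(
  c("This is line one of text.",
    "This is line two of text. This is a second sentence in line two."),
  reviews_file
)

# readLines() returns a character vector with one element per line
reviews <- readLines(reviews_file)
length(reviews)

# With tm loaded, each element would become its own document:
# aCorpus <- Corpus(VectorSource(reviews))
```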

Generating a table in word from rmarkdown using the flextable package error

I have been trying to generate a table in R Markdown with output to Word that looks like this (a very common table format in the chemical sciences):
I started with kable, using markdown syntax to get the subscripts etc. (e.g. [FeBr~2~(dpbz)~2~]), which worked in the Word document. However, I could not modify the table design and, most importantly, I could not figure out how to get the headings to display properly. So I moved on to the flextable package. Here is my code so far (still a work in progress):
```{r DipUVvis,echo=FALSE, anchor='Table S', tab.cap="Summary of catalytic reactions monitored with *in situ* UV-Vis spectroscopy."}
df<-data.frame(Entry=c('AMM 51^*a*^','AMM 52^*a*^','AMM 53^*a*^','AMM 54^*a*^','AMM 57^*b*^','AMM 58^*c*^','AMM 59^*d*^'),
Precat=c('[FeBr~2~(dpbz)~2~] (4.00)','[FeBr~2~(dpbz)~2~] (2.00)','[FeBr~2~(dpbz)~2~] (1.00)','[FeBr~2~(dpbz)~2~] (0.50)','[FeBr~2~(dpbz)~2~] (2.00)','[FeBr(dpbz)~2~] (1.00)','[FeBr~2~(dpbz)~2~] (2.00)'),
Nucl=c('Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)','Zn(4-tolyl)~2~/2 MgBr~2~ (100)'),
BnBr=c(0,0,0,0,'42 + 42',42,42))
tbl<-regulartable(df)
tbl<-set_header_labels(tbl,Entry='Entry',Precat='Pre-catalyst (mM)',Nucl='Nucleophile (mM)',BnBr='BnBr (mM)')
tbl <- align( tbl, align = "center", part = "all" )
tbl<-autofit(tbl)
tbl
```
This took care of the headers, and with a bit of tweaking of the remaining parameters I think I can get the table to look like the one in the picture above. The resulting table looks fine in the RStudio console from a formatting perspective:
However, there are two major issues:
1) The subscripts/superscripts are not being translated.
2) When I knit to Word, instead of a table I get 5 pages of code, which as far as I understand must be the HTML code?
After many hours of trying to sort this out, I found that one possible cause is RStudio using an old version of pandoc (https://github.com/davidgohel/flextable/issues/34). That was indeed the case for me, so I fixed it by moving the newly installed pandoc files into the directory where RStudio looks for them and renaming them. This appears to have worked (see the console section of the second figure). However, it didn't change anything. Then I tried adding this to my code:
knit_print(tbl)
This keeps giving an error:
Error in knit_print.flextable(tbl) : render_flextable needs to be used as a renderer for a knitr/rmarkdown R code chunk (render by rmarkdown).
Interestingly, when I removed the last line (tbl) from the R chunk in RStudio and added the following below the chunk (not in it):
`r tbl`
The table was generated in Word (of course, I still didn't get the subscripts and superscripts right). It also had the figure caption at the top rather than the bottom, a desirable side effect of generating the table after the main R chunk.
Any ideas about what is going on and how I can get the correct table output in Word? I'm really confused here, so thank you in advance for your help.
UPDATE: If I remove anchor = 'Table S' from the chunk header, the table comes out OK (still without the subscripts or superscripts, though), but then I can't automatically number the tables (I have used this: https://gist.github.com/benmarwick/f3e0cafe668f3d6ff6e5 for autonumbering and cross-referencing).

How to set a "formatted value" in googleVis?

I am using googleVis and shiny to (automatically) create an organizational chart.
Similar to this question:
Google Visualization: Organizational Chart with fields named the same, I want to use formatted values in googleVis to be able to create fields in an organizational chart that have the same name. I suspect it has something to do with roles, but I cannot figure out the correct syntax.
The help page for gvisOrgChart mentions formatted values but does not say how to set them:
"You can specify a formatted value to show on the chart instead, but the unformatted value is still used as the ID."
## modified example from help page
library(googleVis)
Regions[7,1] = Regions[8,1] # artificially create duplicated name in another parent node
Org <- gvisOrgChart(Regions)
plot(Org)
In the above example, the duplicated name (Mexico) is only shown once in the chart. I want both of them to be drawn (one under the Europe parent node and one under the America parent node).
Thank you for your help
cateraner
After talking to one of the developers of the googleVis package, I now have a solution to the problem. The formatted value ends up wrapped in extra speech marks (double quotes), which have to be removed before the text is usable as HTML.
## modified example from help page
library(googleVis)
# add new entry
levels(Regions$Region) = c(levels(Regions$Region), "{v: 'Germany.2', f: 'Germany'}")
Regions[8,1] = "{v: 'Germany.2', f: 'Germany'}"
Org <- gvisOrgChart(Regions)
# remove extra speech marks (double quotes)
Org$html$chart <- gsub("\"\\{v", "\\{v", Org$html$chart)
Org$html$chart <- gsub("\\}\"", "\\}", Org$html$chart)
plot(Org)
In the resulting chart, "Germany" appears twice, once under the "America" node and once under the "Europe" node. In the same way, you could add HTML formatting to your text (color, font, etc.).
Thanks to Markus Gesmann for helping me with that.
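To see what the two gsub() calls do in isolation, here is a small sketch that applies them to a plain string. The string chart_js is a hypothetical fragment made up for illustration; the real contents of Org$html$chart will look different, and googleVis is not needed to run this.

```r
# Hypothetical fragment of generated chart data, with the formatted
# value wrapped in extra double quotes
chart_js <- "[\"{v: 'Germany.2', f: 'Germany'}\", \"America\", 50]"

# Drop the double quote in front of {v ...
cleaned <- gsub("\"\\{v", "\\{v", chart_js)
# ... and the double quote after the closing }
cleaned <- gsub("\\}\"", "\\}", cleaned)

cat(chart_js, "\n")
cat(cleaned, "\n")
```

Other quoted values, such as "America" in this made-up fragment, are left alone because they are not adjacent to a brace.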

R - How to create array from console input

Hi all and thanks in advance for all your help.
In R, I'm sending a command to an external Windows program using system(command), which in turn outputs lines (with multiple values per line) that I see directly on the R console. They look something like this:
a,b,c,d,e,f,g,h
1,2,3,4,5,6,7,8
3,4,5,7,1,3,4,9
7,5,3,1,8,1,5,7
What I would like to do is create an array that has the top row as column names and each subsequent row from the input should be the values that go into these columns. Any and all help in making this work would be very appreciated.
This is my first foray into this territory so I'm quite stuck as to how to do it. I've meddled with scan(), pipe() and readLines() but haven't been able to succeed. I have no particular attachment to system(command), any function that will run the executable that will give me the output I need is fine by me if it helps achieve what I want.
The comment made by user1935457 did the trick.
read.table(text = system(command, intern=TRUE), sep = ",", header=TRUE)
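As a self-contained illustration, the sketch below replaces the system() call with a hard-coded character vector, console_lines, which is a hypothetical stand-in for what system(command, intern = TRUE) would return (one string per output line). The as.matrix() step is an optional extra, assuming a matrix rather than a data frame is wanted.

```r
# Hypothetical stand-in for system(command, intern = TRUE)
console_lines <- c(
  "a,b,c,d,e,f,g,h",
  "1,2,3,4,5,6,7,8",
  "3,4,5,7,1,3,4,9",
  "7,5,3,1,8,1,5,7"
)

# The first line becomes the column names, the rest become rows
result <- read.table(text = console_lines, sep = ",", header = TRUE)

# Optional: convert the data frame to a numeric matrix
result_matrix <- as.matrix(result)

print(result)
```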
