Getting Stargazer Column labels to print on two or three lines? - r

I have some models whose titles need to be long to be fully explanatory. It would be helpful to print their descriptors or titles on two lines. The call below accepts the line-break character, but the resulting LaTeX output doesn't recognize it.
var1<-rnorm(100)
var2<-rnorm(100)
df<-data.frame(var1, var2)
mod<-lm(var1~var2)
library(stargazer)
stargazer(mod, column.labels='my models\nneed long titles')
Thanks.

One option: You can insert latex codes for newline (\\) and table column alignment (&). Note that each \ needs to be "escaped" in R with another \.
stargazer(mod, column.labels='my models\\\\ & need long titles')
Another way is to use \multirow. This could be easier for more complex tables with headings of different lengths on each column. You will need to add \usepackage{multirow} to your document preamble.
stargazer(mod, column.labels='\\multirow{2}{4 cm}{my models need long titles}')
You will also need to post-edit the LaTeX output from stargazer to insert some extra lines (using \\) below the variable headings so that the rest of the table is moved down as well (as in the excerpt below):
& \multirow{2}{4 cm}{my models need long titles} \\
\\ % note the extra line inserted here before the horizontal rule
\hline \\[-1.8ex]

For multi-line column labels in html, the word you are looking for is:
<br>
For example, I have R code that reads a CSV, replaces the labels with versions containing line breaks, and writes the table out as HTML like this:
filename <- "table1.csv"
pubTable <- read.table(file = filename, header = TRUE, sep = ",")
# Each <br> forces a line break inside the rendered HTML label
labels <- c("Period", "2D and 3D <br>Image <br>Analysis", "Document <br>Processing",
            "Biometric <br>Identification", "Image <br>Databases", "Video <br>Analysis",
            "Biomedical <br>and <br>Biological")
library(stargazer)
stargazer(pubTable, type = "html", rownames = FALSE,
          summary = FALSE, out = "Table1.html", covariate.labels = labels)
The resulting table shows each label broken across several lines [1].
[1] Adopted from Conte, Donatello, et al. "Thirty years of graph matching in pattern recognition." International Journal of Pattern Recognition and Artificial Intelligence 18.03 (2004): 265-298.

Related

How to sort lines of text alphabetically based on a part of each line?

I have a text file that contains abbreviations like so (simplified example):
\item[3D] Three-dimensional
\item[PCA] Principal Component Analysis
\item[RF] Random Forest
\item[ANN] Artificial Neural Networks
I want to manipulate these lines in R so that the abbreviations (e.g. ANN) are sorted in an alphabetical order and an abbreviation that starts with a number (e.g. 3D) comes after the last abbreviation that starts with letter. \item[]s should be ignored and left unmodified as they are going to be used in a LaTeX file.
My desired output is:
\item[ANN] Artificial Neural Networks
\item[PCA] Principal Component Analysis
\item[RF] Random Forest
\item[3D] Three-dimensional
I would be interested in solving this using tidyverse but any other solution will be useful too.
Here’s a ‘tidyverse’ solution:
library(tidyverse)  # provides %>%, tibble(), extract(), arrange() and pull()
sorted_lines = readLines(your_file) %>%
    tibble(text = .) %>%
    extract(text, into = 'abbr', regex = r'(\\item\[([^]]*)\])', remove = FALSE) %>%
    arrange(abbr) %>%
    pull(text)
Result:
\item[3D] Three-dimensional
\item[ANN] Artificial Neural Networks
\item[PCA] Principal Component Analysis
\item[RF] Random Forest
However, there’s really no need to use tidy data manipulation here. You can equivalently use (mostly1) base R functions:
library(stringr)  # for str_match()
lines = readLines(your_file)
abbreviations = str_match(lines, r'(\\item\[([^\]]*)\])')[, 2L]
sorted_lines = lines[order(abbreviations)]
Note that both solutions produce a different ordering than in your question, because they will order “3D” before “ANN”, as is conventional. Are you sure you want to put numbers at the end?
In both cases, the code extracts the abbreviation from each line of text via the regular expression r'(\\item\[([^]]*)\])', and then sorts the lines by these abbreviations.
The regular expression uses R 4.0’s new raw string literals: r"(…)". This allows us to use backslashes inside the string without having to escape them. Without raw string literals, the regular expression would look like this: '\\\\item\\[([^\\]]*)\\]'. That’s just unnecessarily hard to read.
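If numbers really should come last, as the question asks, one possible sketch is to order first on whether the abbreviation starts with a digit and then on the abbreviation itself. This version uses base R only; the sample lines are copied from the question, and the variable names are illustrative rather than part of either solution above.

```r
# Sample input copied from the question (backslashes escaped for R)
lines <- c(
  "\\item[3D] Three-dimensional",
  "\\item[PCA] Principal Component Analysis",
  "\\item[RF] Random Forest",
  "\\item[ANN] Artificial Neural Networks"
)

# Extract the text between \item[ and ] from each line
abbr <- sub("^\\\\item\\[([^]]*)\\].*$", "\\1", lines)

# FALSE sorts before TRUE, so abbreviations starting with a letter come first
starts_with_digit <- grepl("^[0-9]", abbr)

# Primary key: digit flag; secondary key: the abbreviation itself
sorted_lines <- lines[order(starts_with_digit, abbr)]
writeLines(sorted_lines)
```

Because `order()` accepts several keys, the same pattern extends to other tie-breaking rules without changing the extraction step.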
1 I’m using str_match from ‘stringr’, since the pattern extraction functions in base R are a pain to use.

Greek letters in chunk are not shown properly

I have created an r chunk in an r markdown document, in which I have calculated some parameter values that I am required to find for a homework. Now I would like for my knitted PDF-document to show the sentence "Our estimate of $\beta_1$ is -0.2186". However, the portion of the code for the greek letter beta ($\beta_1$) is being shown in the PDF the same way it's written here, not as the actual greek letter.
I have already tried installing LaTeX-packages in the document header (e.g. \usepackage{mathtools}), which made no difference.
cigs_mean <- mean(smoke$cigs) #find y-bar
educ_mean <- mean(smoke$educ) #find x-bar
beta1_hat <- (cov(smoke$educ,smoke$cigs))/(var(smoke$educ)) #find beta1-hat
beta0_hat <- (cigs_mean-(beta1_hat*educ_mean)) #find beta0-hat
print(paste0("Our estimate of $\beta_1$ is ", round(beta1_hat, digits=4)))
I just want for the document to show a greek letter beta with subscript 1, rather than replicating the code I have written ($\beta_1$)
Backslashes in R character strings have a special meaning as escape characters, and must themselves be escaped. Otherwise, your string '$\beta$' is read by R as '$' ‹backspace› 'e' 't' 'a' '$'.
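The difference is easy to check in the console. The following is a minimal sketch; the variable names `bad` and `good` are only illustrative.

```r
bad  <- "$\beta_1$"    # "\b" is parsed as a single backspace character
good <- "$\\beta_1$"   # "\\" yields one literal backslash

nchar(bad)    # 8: $, backspace, e, t, a, _, 1, $
nchar(good)   # 9: $, \, b, e, t, a, _, 1, $

# The second character of `bad` has code point 8 (backspace)
utf8ToInt(substr(bad, 2, 2))
```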
Furthermore, print is the wrong function to use here: its purpose is to provide output in the interactive R console, never for actual output to a document. Use cat instead.
Finally, if you haven’t already done so, you need to tell knitr to emit the output of this code chunk as-is, so that it is processed as document text rather than shown as verbatim code output:
```{r results = 'asis'}
…
cat(paste0("Our estimate of $\\beta_1$ is ", round(beta1_hat, digits=4), "\n"))
```

in R, how to load text lines containing newlines, and display them as multi-line output in mtext

Suppose I create two strings, each of which contains newlines, and assign them to variables as shown below.
question_one<- 'What is your answer?\n\nYes\nNo\nMaybe'
question_two<- 'What is your reply?\n\nOne\nTwo\nThree'
Then writeLines(question_one) and mtext(side=1, question_one) output the question on 5 separate lines (the second of which is blank); this is exactly the output I am after.
What I can't do is, start with those two strings as the two lines in a 2-line text document, bring them into my R session by using something like
filename="/path/sample_questions.txt"
my_scanned_questions <- scan(filename, what="", sep="\r", allowEscapes=F)
and then use mtext(side=1, my_scanned_questions[i]) to generate the output text of my ith question on 5 lines.
I have tried various combinations of sep values, different numbers (1,2,4) of backslashes in my .txt file, etc, allowEscapes as T and F, but the closest I can get is mtext(side=1, my_scanned_questions[1]) outputs the string 'q1 What is your answer?\n\nYes\nNo\nMaybe' on one display line.
I've had no better luck using readLines instead of scan.
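I created a text file with three lines of text in it (its URL is in the code below).
When the file is read, each \n arrives as two literal characters, a backslash followed by n (which R prints as "\\n"). You need to replace that two-character sequence with a real newline: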
a <- readLines("https://gist.githubusercontent.com/corynissen/541100602ac4bc7b3dc6/raw/1adc16160a8a2aa705b7bbb3810723cc416079a1/a2.txt")
plot(1:10)
a <- gsub("\\\\n", "\n", a)
mtext(side=1, a[1])
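The same idea can be tried without a network connection. The sketch below assumes a temporary file standing in for the questions file; the file contents mirror the strings from the question, and `fixed = TRUE` is used here so the pattern needs only one level of escaping.

```r
# Write two questions to a temporary file; each "\\n" is stored
# as a literal backslash followed by the letter n
tmp <- tempfile(fileext = ".txt")
writeLines(c("What is your answer?\\n\\nYes\\nNo\\nMaybe",
             "What is your reply?\\n\\nOne\\nTwo\\nThree"), tmp)

raw_questions <- readLines(tmp)

# Replace the two-character sequence backslash + n with a real newline
questions <- gsub("\\n", "\n", raw_questions, fixed = TRUE)

# The first question now spans five lines (the second one is blank)
pieces <- strsplit(questions[1], "\n", fixed = TRUE)[[1]]
length(pieces)  # 5
```

After this conversion, `mtext(side=1, questions[1])` receives a string with real newlines, which is what `writeLines(question_one)` was working with in the question.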

Omit floating and document environments from stargazer regression table output

I just started using the stargazer package to make regression tables in R, but can't figure out how to write table output to a .tex file without either floating or document environments (and preamble in the case of the document environment). That is, I just want the tabular environment. My workflow is to keep the table's floating environment, with the associated captions and labels, in the body of the paper and link to the table's tabular environment with \input{}.
Is this possible?
# bogus data
my.data <- data.frame(y = rnorm(10), x = rnorm(10))
my.lm <- lm(y ~ x, data=my.data)
# if I write to file, then I can omit the floating environment,
# but not the document environment
# (i.e., file contains `\documentclass{article}` etc.)
stargazer(my.lm, float=FALSE, out="option_one.tex")
# if I write to a text connection with `sink`,
# then I can omit both floating and document environments,
# but not commands
# (i.e., log contains `sink` and `stargazer` commands)
con <- file("option_two.tex")
sink(con)
stargazer(my.lm, float=FALSE)
sink()
Save your stargazer results to an object:
res <- stargazer(my.lm, float=FALSE)
If you take a look at the contents of res, you'll see it's just a character vector with one element per line of output. Write this to a file using cat() like this:
cat(res, file="tab_results.tex", sep="\n")
The sep="\n" is only required because the lines of text in the res object don't contain any line breaks themselves. If we use the default sep=" ", the table will be written to the tex file as one long line.
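The effect of sep can be seen without running stargazer at all. In this sketch a short, made-up character vector stands in for res, and temporary files stand in for tab_results.tex.

```r
# Hypothetical stand-in for the character vector returned by stargazer
res <- c("\\begin{tabular}{lc}", "x & 1 \\\\", "\\end{tabular}")

with_newlines <- tempfile(fileext = ".tex")
default_sep   <- tempfile(fileext = ".tex")

cat(res, file = with_newlines, sep = "\n")  # one element per line
cat(res, file = default_sep)                # default sep = " ": one long line

n_newlines <- length(readLines(with_newlines, warn = FALSE))  # 3
n_default  <- length(readLines(default_sep, warn = FALSE))    # 1
```

writeLines(res, "tab_results.tex") is an equivalent base R alternative that also ends the file with a final newline.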
Hope this helps.

How to output to pdf using knitr/xtable when tex has square braces ( [ )?

I am trying to output the freq table generated by cut to a PDF file using knitr and xtable. However, when I include the option include.rownames=FALSE the output is not processed and an error is reported whereas the code works with include.rownames=TRUE. Test code is below:
\documentclass[12pt]{article}
\begin{document}
<<test_table, results = "asis",echo=FALSE>>=
library(xtable)
x <- sample(10,100,replace=TRUE)
breaks <- c(1,2,5,10)
x.freq <- data.frame(table(cut(x, breaks, right=FALSE)))
print(xtable(x.freq),include.rownames=TRUE)
@
\end{document}
When I use include.rownames=TRUE I get the output below.
1 [1,2) 5
2 [2,5) 35
3 [5,10) 49
whereas when I use include.rownames=FALSE I get an error:
Test.tex:71: LaTeX Error: \begin{tabular} on input line 58 ended by \end{document}.
I believe that I am getting the error because of the presence of the square braces ' [ ' in the text source.
Does anyone know how to solve this problem?
The problem is that the end of each row in the table is a \\ which has an optional argument to specify how much space to leave before the next row, for example, \\[1in]. There's allowed to be white space between the \\ and the [, so in this case, it's trying to read the [2,5) as that argument.
A simple workaround is to prefix each factor label with an empty LaTeX group, {}, which prints nothing but means the row no longer starts with [. By default, however, print.xtable will "sanitize" the braces so that they are printed literally, so you need to turn that off by passing your own sanitize function; identity leaves the text untouched.
levels(x.freq$Var1) <- paste0("{}", levels(x.freq$Var1))
print(xtable(x.freq), include.rownames=FALSE, sanitize.text.function=identity)
