How can I display UTF-8 characters in knitr chunk outputs? [duplicate] - r

Is there a set of best practices or documentation for working with Unicode in knitr and Rmarkdown? I can't seem to get any glyphs to show up properly when knitting a document.
For example, this works in the console (in Rstudio):
> cat("\U2660 \U2665 \U2666 \U2663")
♠ ♥ ♦ ♣
But when knitting I get this:
[Screenshots of the HTML output and the Word output]

It looks like an encoding issue specific to Windows, and may be related to this issue: https://github.com/hadley/evaluate/issues/59. Unfortunately, we have to wait for a fix in base R. However, if you don't have to use cat(), and the expression is a top-level expression in your code chunk (e.g. not inside a for-loop or if-statement), I guess this may work:
knitr::asis_output("\U2660 \U2665 \U2666 \U2663")
It passes the character string directly to knitr and bypasses cat(), since knitr cannot reliably catch multibyte characters written out by cat() on Windows -- it depends on whether the characters can be represented by your system's native encoding.
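Whether cat() output survives depends on the native encoding, as noted above. A minimal sketch for checking this in your own session, using only base R (the variable names are illustrative, not part of knitr):

```r
# Card-suit glyphs written with \u escapes, so the string is UTF-8
# regardless of how this file itself is saved
suits <- "\u2660 \u2665 \u2666 \u2663"

# Does the session use UTF-8 as its native encoding?
is_utf8_session <- isTRUE(l10n_info()[["UTF-8"]])

# Round trip through the native encoding; FALSE suggests the glyphs
# cannot be represented natively, which is the case where cat() output
# may get mangled during knitting
survives_native <- enc2native(suits) == suits

print(c(utf8_session = is_utf8_session, survives = survives_native))
```

If the second value is FALSE, routing the string through knitr::asis_output() instead of cat() is probably the safer choice.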

For anyone else who came across this after trying to get emoji support in Rstudio/Rmarkdown documents, another possible issue is that if the file encoding isn't set to UTF-8, the resulting compiled document won't support emojis either.
In order for emoji to work in Rmarkdown, you must change the file encoding of the Rmd document. Go to File -> Reopen with encoding, then select UTF-8.
Once you have ensured the file is open in UTF-8 encoding, you should be able to compile with emoji support.
You should even be able to paste emoji from a browser directly into the document. 😺
It is probably a good idea to change the default encoding for all files to UTF-8 so that you don't have to deal with this issue again.
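If you want to check from R whether a file really contains UTF-8, something like the following might help. It is only a sketch: it writes a throwaway file (a stand-in for your own .Rmd, with made-up content) and reads it back, relying on validUTF8() from recent versions of base R.

```r
# Throwaway file standing in for your .Rmd (hypothetical content)
path <- tempfile(fileext = ".Rmd")
text <- "Phew, that was close \U1F605"

# useBytes = TRUE writes the UTF-8 bytes as they are, without
# re-encoding them to the native encoding first
writeLines(text, path, useBytes = TRUE)

# Read the file back, declaring its contents as UTF-8
lines <- readLines(path, encoding = "UTF-8")

# TRUE for every line if the file holds valid UTF-8
is_valid <- validUTF8(lines)
print(is_valid)

unlink(path)
```

For a real document, point readLines() at your own file; any FALSE in the result hints that the file was saved in some other encoding.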

Unicode: Inline
Phew, that was close `r knitr::asis_output("\U1F605 \U2660 \U2665 \U2666 \U2663")`
Unicode: Block
```{r, echo=FALSE}
knitr::asis_output("Phew, that was close \U1F605 \U2660 \U2665 \U2666 \U2663")
```
The emo package
Unfortunately, this package isn't yet on CRAN, but it can be installed with devtools::install_github("hadley/emo")
emo::ji("face")
There are some more examples here

Related

Cross-references and figures don't render when knitting to PDF since RStudio installed babel packages (after changing the language to "fr-FR")

I really need help this time:
RStudio auto-installed some babel packages after I specified the language of an academic paper in the YAML header (lang: "fr-FR"). Since then, when I knit to PDF, tables and figures no longer render: computations printed in the text are correct, but the resulting PDF has no figures and no tables, and cross-references no longer work (e.g., the PDF now contains "Figure #ref(tab:repartition-transport)" where, before the language change, a number was printed, like "Figure 1").
I tried setting the language back to "en-EN", but cross-references, tables and figures still don't render in the PDF.
When knitting finishes, this message is shown:
Package babel Warning: No hyphenation patterns were preloaded for the language 'French' into the format.
Avis : (babel) Please, configure your TeX system to add them and rebuild the format.
Now I will use the patterns preloaded for \language=nohyphenation instead on input line 87.
I don't understand.
Since I don't know how to remove babel (which is not in the packages list), I tried running tinytex::check_installed("babel"), which returns TRUE. Any help is much appreciated, since I don't know what the problem is.
Thanks

rmarkdown: new behaviour when knitting pdf documents

I tried to knit an old rmarkdown document to pdf recently. In the document, I had used the tilde symbol to denote a non-breaking space, e.g. 'Figure~2'. This syntax now seems to behave differently: it prints 'Figure~2' verbatim, with the tilde printed in the document. There are many other differences; for instance, % would once be interpreted as a comment, and now it is printed.
I'm using Debian stretch with RStudio-1.2.1335. I can't find any documentation of this change in rmarkdown, pandoc or RStudio. Does anyone know what caused this change? Or how to revert to the old behaviour? Thanks.
The pandoc solution is to simply escape a space:
This is a short\ sentence.
Then a tilde will appear in the tex output.
What might work as well is &nbsp;:
This is a short&nbsp;sentence.
And if you really like your TeX then use \protect{~}:
This is a short\protect{~}sentence.

How to pass options to a LaTeX font in R Markdown?

rmarkdown, following {xe|lua}latex, allows you to specify fonts for the main text, sans-serif text, monospaced text (most notably code chunks!) and math in the YAML header. At least for PDF rendering via xetex, this works.
However, I found no (documented) way to pass options to the underlying \setxxxfont LaTeX command. For example, the YAML fragment:
```
monofont: Inconsolata
```
generates the following LaTeX fragment:
\setmonofont[Mapping=tex-ansi]{Inconsolata}
I have two questions with this:
1. Why is Mapping=tex-ansi added, and how can I control it? (I'm working in UTF-8...)
2. How can I set additional font options, e.g. \setmonofont[Scale=0.91]{TeX Gyre Cursor}?
The R Markdown book and the Pandoc's User's Guide did not reveal anything pertinent.
Background:
When R Markdown converts the knitted document to the output format (PDF), a pandoc template is used. The template is stored within the package, and the placeholders that get replaced by the YAML variables are written in $variable$ notation.
1. Encoding
Mapping=tex-ansi is added as a workaround for an issue reported on GitHub. I would therefore be cautious about deleting it, as there may be side effects.
If you do indeed wish to change this code, you will have to make a copy of the LaTeX template file used to convert the document. You can find the default template here. See here for some more information on providing custom templates.
2. Additional Font Options
You can use the monofontoptions YAML variable to pass additional arguments to the font options.
Documentation of the variables that can be used with LaTeX output is available in the pandoc documentation.
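As a sketch of the second point, the header below combines the font and Scale value from the question with monofontoptions. The xelatex engine setting is an assumption based on the question's mention of xetex; adjust it to your own setup.

```
---
output:
  pdf_document:
    latex_engine: xelatex
monofont: TeX Gyre Cursor
monofontoptions: Scale=0.91
---
```

With this header, Scale=0.91 should be passed through to the options of the generated \setmonofont command.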

Corrupted Rmarkdown script: How can I get the Cyrillic characters back?

I had been working for weeks on a script with lots of Cyrillic characters (both inside and outside chunks). One day I opened a new Rmarkdown script, in which I wrote in English, while the other document was still open in my R session. When I returned to the Cyrillic document, everything written had turned into something like this: 8 иÑлÑ 1995 --> ÐлаÑÑÑ - наÑодÑ
The question is: what is the source of the problem? And how can the corrupted script be restored to its original form (with the Cyrillic characters)?
UPDATE!!
I have tried reopening the RStudio script with the encodings CP1251, CP1252, windows1251 and UTF8, but it does not work; the weird symbols just change to other weird symbols. The problem is that I saved the document with the default encoding (CP1251 and windows1251) at the very beginning.
Solution:
If you work with Cyrillic and Latin characters, always save the RStudio script with UTF-8 encoding when your computer runs Windows (I do not know about Mac). If you close the script and open it again, reopen the file with UTF-8 encoding.
Assuming you're using RStudio: open your *.Rmd file and then try to reopen it "with encoding", using the File menu (File -> Reopen with Encoding).
Select "Show all encodings" and choose your specific encoding; I suggest windows-1251 for Cyrillic text.
Note: Apparently the issue can also occur when the *.Rmd file is opened "standalone" at one time and from within an R Project at another.
Hope that would help.
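The garbled sample in the question looks like UTF-8 bytes that were displayed as a single-byte Western encoding. That reading is an assumption, not something confirmed in the thread. Under that assumption, the damage and its reversal can be sketched in base R as follows; the word used is purely illustrative, and the repair only works if no bytes were lost when the garbled text was saved.

```r
# Illustrative Cyrillic word, written with \u escapes so it is UTF-8
original <- "\u043a\u043d\u0438\u0433\u0430"

# Simulate the damage: treat the UTF-8 bytes as if they were Latin-1
garbled <- iconv(original, from = "latin1", to = "UTF-8")

# Reverse it: convert back to Latin-1 bytes, then declare them UTF-8
repaired <- iconv(garbled, from = "UTF-8", to = "latin1")
Encoding(repaired) <- "UTF-8"

print(identical(repaired, original))
```

The same two repair steps could be applied to lines read from the damaged file with readLines(), ideally on a copy of the file.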

How to use UTF-8 characters in dygraph title in R

Using Rstudio [Windows8], when I use the dygraph function to plot a time series, I have a problem when trying to use UTF-8 characters in the main title.
library(dygraphs)
dygraph(AirPassengers, main = "Título")
This results in a title: "T?tulo"
I have tried to convert "Título" to the UTF-8 encoding, but it doesn't work.
You can use enc2utf8.
dygraph(AirPassengers, main = enc2utf8("Título"))
You need to make sure your locale settings support the character that you want to use, and that the file is saved with the right encoding. Saving as UTF-8 worked for me.
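One way to take the file encoding out of the picture is to write the accented letter as a Unicode escape. This is a small sketch in base R; whether it also fixes the title in the RStudio viewer on Windows is untested here.

```r
# "\u00ed" is the escape for the accented i; strings built from \u
# escapes are marked as UTF-8 no matter how the script file is saved
title <- "T\u00edtulo"

print(Encoding(title))   # "UTF-8"
print(nchar(title))      # 6 characters

# Hypothetical usage, assuming the dygraphs package is installed:
# dygraphs::dygraph(AirPassengers, main = title)
```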
I was able to replicate your situation in Windows 7 and tried a bunch of things. Embedded in Rmarkdown, here is a minimal working example.
```{r}
Sys.setlocale("LC_ALL","German")
#note that windows locale names are different from unix & mac, usually
#the name of nationality works here.
#also works with "Faroese", "Hungarian", and others who have this letter.
#the locale has to be set in a preceding block to take effect.
```
```{r}
Encoding("Título")
library(dygraphs)
dygraph(AirPassengers, main = "Título")
```
You can check the encoding assigned to the title with Encoding(). Under locales like Faroese, Hungarian, and German, "Título" is encoded as latin1 or unknown, both of which seem to cause no problems for dygraph's JavaScript. Under UTF-8 it was written as <U+00ED>, which was a problem for the JavaScript, as well as for some, but not all, other functions. With a matching locale, converting to UTF-8 as @Michele recommended has the same result.
Also, if the title doesn't appear in many places, you can manually find and replace it in the generated HTML/JavaScript file. The problem occurs on conversion, so once the file has been generated, the title variable can be changed successfully. The letter still shows as a question mark in the RStudio "Viewer" output, but I recommend generating the full file for JavaScript output as a matter of routine, as I've seen other functions malfunction in the viewer window.
