R plot title encoding in PDF

This question is related to: Rhtml: Warning: conversion failure on '<var>' in 'mbcsToSbcs': dot substituted for <var> and R doesn't open with UTF-8
I use Ubuntu and cannot display the Turkish character ı in the title of a plot:
myScript.r:
pdf(file='/home/sait/Desktop/abc.pdf')
plot(1:7,1:7,main='geziparkı')
I get the following warning messages when I run the script with Rscript myScript.r:
Warning messages:
1: In title(...) :
conversion failure on 'geziparkı' in 'mbcsToSbcs': dot substituted for <c4>
2: In title(...) :
conversion failure on 'geziparkı' in 'mbcsToSbcs': dot substituted for <b1>
3: In title(...) :
conversion failure on 'geziparkı' in 'mbcsToSbcs': dot substituted for <c4>
4: In title(...) :
conversion failure on 'geziparkı' in 'mbcsToSbcs': dot substituted for <b1>
I added the line pdf.options(encoding='ISOLatin2.enc') at the top of my script, as suggested in the related questions, but it did not help.
Do I need to change something in my Ubuntu locale settings? My sessionInfo() is as follows:
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=tr_TR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=tr_TR.UTF-8 LC_COLLATE=tr_TR.UTF-8
[5] LC_MONETARY=tr_TR.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=tr_TR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
PS: I continued investigating this issue and realized that with .png it works perfectly; the problem occurs only with .pdf.

I finally found the solution.
Replacing pdf(file='/home/sait/Desktop/abc.pdf') with
cairo_pdf('/home/sait/Desktop/abc.pdf', family="DejaVu Sans") did the trick.
I do not know exactly what this changes, but I tried many other things and nothing else worked.
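A likely reason for the difference, offered as a sketch rather than a definitive diagnosis: pdf() converts text to a single-byte encoding (hence the 'mbcsToSbcs' warnings), whereas cairo_pdf() draws UTF-8 text through Cairo. The example below assumes R was built with Cairo support and that the "DejaVu Sans" font is installed; the output path is a hypothetical temporary file.

```r
# Hypothetical output location; replace with your own path.
out <- file.path(tempdir(), "abc.pdf")

# cairo_pdf() is only usable when R was compiled with Cairo support.
if (capabilities("cairo")) {
  cairo_pdf(out, family = "DejaVu Sans")   # font family taken from the answer above
  plot(1:7, 1:7, main = "gezipark\u0131")  # \u0131 is the dotless i
  dev.off()                                # close the device so the file is written
} else {
  message("This R build has no Cairo support; cairo_pdf() is unavailable.")
}
```

Writing the character as the escape \u0131 keeps the script itself ASCII-only, so the result does not depend on the encoding of the source file.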

Related

Foreign (Hebrew, Chinese) characters: Tidyverse incorrect display in console but correct in View() [duplicate]

In at least some cases, Asian characters print correctly when they are in a matrix or a vector, but not when they are in a data.frame. Here is an example:
q<-'天'
q # Works
# [1] "天"
matrix(q) # Works
# [,1]
# [1,] "天"
q2<-data.frame(q,stringsAsFactors=FALSE)
q2 # Does not work
# q
# 1 <U+5929>
q2[1,] # Works again.
# [1] "天"
Clearly, my device is capable of displaying the character, but when it is in a data.frame, it does not work.
Doing some digging, I found that the print.data.frame function runs format on each column. It turns out that if you run format.default directly, the same problem occurs:
format(q)
# "<U+5929>"
Digging into format.default, I find that it is calling the internal format, written in C.
Before I dig any further, I want to know if others can reproduce this behaviour. Is there some configuration of R that would allow me to display these characters within data.frames?
My sessionInfo(), if it helps:
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.0.1
I hate to answer my own question, but although the comments and answers helped, they weren't quite right. In Windows, it doesn't seem like you can set a generic 'UTF-8' locale. You can, however, set country-specific locales, which will work in this case:
Sys.setlocale("LC_CTYPE", locale="Chinese")
q2 # Works fine
# q
#1 天
But, it does make me wonder why exactly format seems to use the locale; I wonder if there is a way to have it ignore the locale in Windows. I also wonder if there is some generic UTF-8 locale that I don't know about on Windows.
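To see whether the current session can represent such characters natively, one can inspect the locale and the string's declared encoding. This is a minimal sketch using only base R; the variable names are illustrative, and what print() shows for the data.frame depends on the platform and R version.

```r
q <- "\u5929"          # the character from the question, written as a Unicode escape

loc <- l10n_info()     # describes the current locale
loc[["UTF-8"]]         # TRUE when the session runs in a UTF-8 locale
Encoding(q)            # declared encoding of the string
utf8ToInt(q)           # code point of the character as an integer

q2 <- data.frame(q, stringsAsFactors = FALSE)
print(q2)              # may show <U+5929> when the locale cannot represent the character
```

If loc[["UTF-8"]] is FALSE, output that passes through format() is converted to the native encoding, which is consistent with the behaviour described above.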
I blogged about Unicode and R a few days ago. I think your R editor uses UTF-8, which gives the illusion that R on your Windows machine handles UTF-8 characters.
The short answer: when you want to process Unicode text (here, Chinese), don't use English Windows; use a Chinese version of Windows, or Linux, which uses UTF-8 by default.
Session info on my Ubuntu machine:
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

Just started learning R, unable to render demo("graphics")

I googled this issue and couldn't find anything, but hopefully someone on SO can help diagnose it!
While running on R version 3.4.1, I tried running
demo("graphics")
and the result was a blank window, unable to render any of the colors/graph (see image).
After exiting out of the window and pressing return again, another blank window (exact same thing) pops up. When I exit out of the second window, I am left with the following command line output:
> demo("graphics")
demo(graphics)
---- ~~~~~~~~
Type <Return> to start :
> # Copyright (C) 1997-2009 The R Core Team
>
> require(datasets)
> require(grDevices); require(graphics)
> ## Here is some code which illustrates some of the differences between
> ## R and S graphics capabilities. Note that colors are generally specified
> ## by a character string name (taken from the X11 rgb.txt file) and that line
> ## textures are given similarly. The parameter "bg" sets the background
> ## parameter for the plot and there is also an "fg" parameter which sets
> ## the foreground color.
>
>
> x <- stats::rnorm(50)
> opar <- par(bg = "white")
> plot(x, ann = FALSE, type = "n")
Hit <Return> to see next plot:
Error in plot.new() : attempt to plot on null device
I checked, and the "graphics" package definitely exists and is in libPath. Moreover, it appears that demo("persp") also fails in a similar way.
Does anyone know what might be causing this issue?
Edit 1:
Thanks for responding!
I am running this from bash, on Ubuntu 16.04.
Here is the requested terminal output:
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.1
> dev.off()
Error in dev.off() : cannot shut down device 1 (the null device)
Edit 2:
It turns out that when I call
dev.new()
before demo(), the demo runs. Without that call, demo() still fails. This doesn't explain why demo() doesn't work from the start, though.
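The "null device" in the error means that no graphics device was open when plot.new() ran. A small diagnostic sketch, assuming nothing beyond base R (the helper name describe_device is made up for this example): it reports the current device, the default device R would open, and whether X11 is available, then opens a device only in an interactive session.

```r
# Hypothetical helper: collect basic facts about the graphics setup.
describe_device <- function() {
  list(
    current = names(dev.cur()),       # "null device" when nothing is open
    default = getOption("device"),    # device that R opens on demand
    x11     = capabilities("X11")     # whether an X11 display can be used
  )
}

info <- describe_device()
str(info)

# Open a device explicitly before plotting, as in Edit 2 above.
if (interactive() && dev.cur() == 1L) dev.new()
```

Comparing the reported default device with the available capabilities may help narrow down why no device opens on demand.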

R/data.table: read multiline script with fread

I'm using data.table::fread to read input from a shell script. For readability I want to split the script on multiple lines using the line continuation character '\'.
However, fread doesn't seem to like shell scripts on multiple lines.
Examples:
library(data.table)
fread("cat test1.txt test2.txt") ## OK
Now split script on two lines:
fread("cat test1.txt \
test2.txt")
Error in fread("cat test.txt \n test.txt") :
Expected sep (' ') but new line, EOF (or other non printing character) ends field 0 when detecting types ( first): test.txt
## Same problem
fread("cat test.txt \\
test.txt")
Is there any escape sequence or switch I'm missing?
If not, these are possible solutions I guess: 1) Don't split script at all 2) write script to a file and call that file with fread.
These are my settings:
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.10.4
loaded via a namespace (and not attached):
[1] tools_3.2.3 chron_2.3-46 tcltk_3.2.3
Embedding the command in paste() is an alternative:
fread(paste("cat test1.txt",
"test2.txt"))
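The reason paste() helps, as far as I understand fread's input handling: a string that contains a newline is treated as literal data rather than as a shell command, and the backslash followed by a line break produces exactly such a string (the error message shows it as "\n"). The sketch below uses only base R to compare the two strings; the fread call is left as a comment because it needs data.table and the two files.

```r
# A command string with an embedded newline, like the one in the question.
bad <- "cat test1.txt \ntest2.txt"

# Building the command with paste() keeps the source readable
# while the resulting string stays on one line.
cmd <- paste("cat test1.txt",
             "test2.txt")

grepl("\n", bad, fixed = TRUE)   # TRUE: contains a newline
grepl("\n", cmd, fixed = TRUE)   # FALSE: a single-line command
cmd

# data.table::fread(cmd)         # run this where test1.txt and test2.txt exist
```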
If you are looking for an easy way to read multiple text files, you could either use
fread("cat t*.txt")
or, if the .txt files don't follow that file-name pattern, move them to a sub-directory (say 'data') and read them all as below:
fread("cat data/*")

R plots some unicode characters but not others

Our sysadmin just upgraded our operating system to SLES12SP1. I reinstalled R v3.2.3 and tried to make plots. I use cairo_pdf and try to make a plot with the x-label \u0298, i.e. the solar symbol, but it doesn't work: the label comes out blank. For example:
cairo_pdf('Rplots.pdf')
plot(1, xlab='\u0298') # the x-label comes up blank
dev.off()
This used to work, but for some reason it does not anymore. It works with other characters, e.g.
cairo_pdf('Rplots.pdf')
plot(1, xlab='\u2113') # the x-label comes up with the \ell symbol
dev.off()
When I just paste in the solar symbol, i.e.
plot(1, xlab='ʘ')
then I get the warning
Warning messages:
1: In title(...) :
conversion failure on 'ʘ' in 'mbcsToSbcs': dot substituted for <ca>
The machine is German, but I am using the US English UTF-8 locale:
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: SUSE Linux Enterprise Server 12 SP1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
Any tips on how I can get the solar symbol to appear?
Note: I suppose with a new system you should first do:
capabilities() #And see what the result for cairo is.
Here are a couple of ideas, although one of them requires knowing which fonts you are using, so the output of l10n_info()$MBCS and names(X11Fonts()) might be needed.
Option 1) The Hershey fonts have all the astrological signs as special escape characters. See page 4 of the output of:
demo(Hershey) # has \\SO as the escape sequence for the "solar" symbol.
The code of the draw.vf.cell function shows that it uses the text function to plot those characters, so using it to label an axis requires adding xpd=TRUE to the arguments:
plot(1, xlab="") ; text(1, .45, "\\SO" , vfont=c("serif", "plain"), xpd=TRUE )
Option 2) Find the solar symbol in the font of your choice. You might try setting the font to something other than "Helvetica". See ?X11, which has a section on Cairo fonts. The help page for the points function includes a function called TestChars that prints character glyphs in various fonts to your output device. In this case your output device might be either cairo_pdf or x11. On my device (the Mac fork of UNIX) the Arial font has this output:
png(type="cairo-png");plot(1, xlab="\u0298");dev.off()
My observation over the years of similar questions leads me to believe that Cairo graphics are more reliably cross-platform. But since R can be compiled without cairo support, it's not a sure thing.
Maybe your text editor is using latin1, so you would be sending latin1 characters to your console.
Look at the encoding
Encoding('ʘ')
and/or try
plot(1, xlab=iconv('ʘ', from='latin1', to="UTF-8"))
but be careful: the encoding could change while copying.
If you use Notepad++, you can convert between the different encodings in the text editor.
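One way to check what R actually received is to look at the code point and the bytes of the label. A short sketch using only base R; the variable name lab is illustrative. U+0298 encodes to the bytes ca 98 in UTF-8, which matches the <ca> in the warning above, so that warning suggests the string was already valid UTF-8 and the device was what could not convert it.

```r
lab <- "\u0298"                      # the label from the question as a Unicode escape

Encoding(lab)                        # declared encoding of the string
utf8ToInt(lab)                       # code point as an integer (0x0298 is 664)

bytes <- charToRaw(enc2utf8(lab))    # the UTF-8 bytes behind the character
bytes                                # ca 98
```

Running the same checks on a label pasted from the editor shows whether it arrives with the same bytes as the escaped version.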

knitr updated from 1.2 to 1.4 error: Quitting from lines

I recently updated knitr to 1.4, and since then my .Rnw files don't compile.
The document is rich (7 chapters, included with child="").
Now, in the recent knitr version I get an error message:
Quitting from lines 131-792 (/DATEN/anna/tex/CoSta/chapter1.Rnw)
Quitting from lines 817-826 (/DATEN/anna/tex/CoSta/chapter1.Rnw)
Fehler in if (eval) { :
Argument kann nicht als logischer Wert interpretiert werden
(The last two lines are the German version of "Error in if (eval) { : argument is not interpretable as logical", i.e. knitr expects a logical value and does not get one.)
Two figures end at lines 131 and 817. Compiling these snippets separately works.
I have no idea how to resolve this problem.
Thanks in advance for any hints that help resolve my issue.
Here is the sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] knitr_1.4
loaded via a namespace (and not attached):
[1] compiler_2.15.1 digest_0.6.3 evaluate_0.4.7 formatR_0.9
[5] stringr_0.6.2 tcltk_2.15.1
Following the suggestions of Hui, I ran each chapter separately with
knit("chapter1.Rnw")
and so on. No error message occurs, and separate tex files are created. To provide more information I display part of the code.
There is a main document in which several options are set:
<<options-setting,echo=FALSE>>=
showthis <- FALSE
evalthis <- FALSE
evalchapter <- TRUE
opts_chunk$set(comment=NA, fig.width=6, fig.height=4)
@
Each chapter is then included via a child chunk, e.g. chapter1 is called from
<<child-chapter1, child='chapter1.Rnw', eval=evalchapter>>=
@
The error message that appears when knitting the main Rnw file was given above.
The related Figure environment is as follows
\begin{figure}[ht]
\centering
<<wuerfel-simulation,echo=showthis,fig.height=5>>=
data.sample6 <- sample(1:6,repl=TRUE,100)
table(data.sample6)
barplot(table(data.sample6)/100,col=5,main="Haeufigkeiten beim Wuerfeln")
@
\caption{Visualisierung beim W"urfeln. 100 Versuche.}
\label{fig:muent-vis}
\end{figure}
This is not very advanced, but the error is still the one given before.
The "Quitting from lines" message covers a long stretch of text, from line 131 (end of the first chunk) to line 792 (beginning of the following chunk), which is
<< zeiten, echo=showthis,eval=evalthis>>=
zeiten <- c(17,16,20,24,22,15,21,15,17,22)
max(zeiten)
mean(zeiten)
zeiten[4] <- 18; zeiten
mean(zeiten)
sum(zeiten > 20)
@
Is there a problem with correctly closing a chunk?
I have now located the error and can provide a short piece of code that reproduces the error message. It concerns conditional evaluation of child documents involving \Sexpr:
The main file is the following
\documentclass{article}
\begin{document}
<<options-setting,echo=FALSE>>=
evalchapter <- TRUE
@
<<test,child="test-child.Rnw", eval=evalchapter>>=
@
\end{document}
The related child file 'test-child.Rnw' is
<<no-sexpr>>=
t <- 2:4
@
text \Sexpr{(t <- 2:4)}
Knitting this as is gives the error message from above. If I remove the \Sexpr from the child, everything works nicely.
Everything also works nicely if I remove the condition from the call of the child file, i.e. without 'eval=evalchapter'.
Since I use \Sexpr quite often, I would like to have a solution to this problem. As I mentioned earlier, there were no problems up to knitr version 1.2.
This is related to a change in knitr 1.3 and mentioned in the NEWS:
added an argument options to knit_child() to set global chunk options for child documents; if a parent chunk calls a child document (via the child option), the chunk options of the parent chunk will be used as global options for the child document, e.g. for <<foo, child='bar.Rnw', fig.path='figure/foo-'>>=, the figure path prefix will be figure/foo- in bar.Rnw; see How to avoid figure filenames in child calls for an application
And this caused a bug for inline R code. In your case, the chunk option eval=evalchapter was not evaluated when it is used for evaluating inline code. I have fixed the bug in the development version v1.4.5 on Github.
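The class of error can be reproduced in base R. This is only an illustration of what happens when an unevaluated option reaches an if() test, not knitr's actual internals; the variable names opt, res and ok are made up for the example.

```r
evalchapter <- TRUE

# The chunk option as written in the document, still an unevaluated symbol.
opt <- quote(evalchapter)

# Testing the symbol directly fails, because it is not a logical value.
res <- tryCatch(
  if (opt) "run" else "skip",
  error = function(e) conditionMessage(e)
)
res   # the error message, in the language of the current locale

# Evaluating the option first gives the intended result.
ok <- if (eval(opt)) "run" else "skip"
ok
```

With the option evaluated before the test, the condition behaves like the plain eval=TRUE case that worked in the question.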

Resources