X11 forwarding over ssh for R: why this warning? - r

I need to use R (3.0.2 from pkgsrc) on a remote server (NetBSD) over an ssh connection with X11 forwarding. plot(1) generates the expected graphic on my local machine; however, R also returns the warnings below in my terminal session.
> plot(1)
Warning messages:
1: In (function (display = "", width, height, pointsize, gamma, bg, :
locale not supported by Xlib: some X ops will operate in C locale
2: In (function (display = "", width, height, pointsize, gamma, bg, :
X cannot set locale modifiers
I don't know whether this signals problems I may encounter later, but I'd like to get everything set up and configured correctly. Would someone please clarify the meaning of the warnings and explain how to address them?
Edit for more info:
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64--netbsd (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> names(X11Fonts())
[1] "serif" "sans" "mono" "Times" "Helvetica"
[6] "CyrTimes" "CyrHelvetica" "Arial" "Mincho"

The warnings mean that Xlib does not support the locale your R session is running in: sessionInfo() shows the C locale, so X operations (including font selection) fall back to the C locale. On Linux and other Unix-like systems you should set a UTF-8 locale prefixed with a language code.
For example, for English in the US, you would set it to 'en_US.UTF-8'.
Try setting the system locale with the Sys.setlocale command like so:
Sys.setlocale("LC_CTYPE", "en_US.UTF-8")
Sys.setlocale("LC_ALL", "en_US.UTF-8")
Alternatively, set it in your shell configuration file (e.g. .bashrc on the remote server) so that every new session picks it up:
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
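After restarting the ssh session (or re-exporting the variables), a quick sanity check from within R (just a sketch; it assumes the en_US.UTF-8 locale is actually installed on the NetBSD server) would be:
Sys.getlocale("LC_CTYPE")  # should now report "en_US.UTF-8" instead of "C"
l10n_info()                # the UTF-8 element should be TRUE in a UTF-8 locale
X11()                      # re-open the forwarded device; the Xlib warnings should be gone
plot(1)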

R source() encoding bug?

I have found a very strange bug concerning the encoding of character constants in R.
main.R:
options(encoding = "UTF-8")
print(Sys.getlocale())
print(getOption("encoding"))
print("first run")
source("internal.R")
print("")
print("second run")
source("internal.R", encoding = "UTF-8")
print("")
internal.R:
print(Sys.getlocale())
print(getOption("encoding"))
char_constant="Тут не просто живут баги, тут у них гнездо"
print(Encoding(char_constant))
Now let's see the output; press the Source button in RStudio:
[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "first run"
[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "unknown"
[1] ""
[1] "second run"
[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "UTF-8"
[1] ""
Notice the difference in encoding. "unknown" first time and "UTF-8" second time.
There is an obvious small bug here: source() ignores the default encoding option.
The real problem is that mixing different encodings in data.table causes a lot of trouble, and RStudio produces a "UTF-8"-marked constant when you execute just that one line but an "unknown"-marked constant when you source the whole file.
Does anybody have an idea what is going on and how to work around it?
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin14.5.0 (64-bit)
Running under: OS X 10.12.4 (unknown)
locale:
[1] ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.3.0
On Windows, R's source function does not work with files that include characters that aren't part of the current system encoding. You may have trouble with RStudio's Run All and Source on Save commands, as they rely on source.
Take a look at: https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding
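Until that behaviour changes, a defensive workaround (just a sketch; source_utf8 is a hypothetical helper, not part of any package) is to pass the encoding explicitly on every source() call instead of relying on options(encoding):
# source() only marks constants as UTF-8 when told explicitly, so wrap it
source_utf8 <- function(file, ...) source(file, encoding = "UTF-8", ...)
source_utf8("internal.R")  # Encoding() of the constants now reports "UTF-8"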

R plots some unicode characters but not others

Our sysadmin just upgraded our operating system to SLES 12 SP1. I reinstalled R v3.2.3 and tried to make plots. I use cairo_pdf and try to make a plot with the x-label being \u0298, i.e. the solar symbol, but it doesn't work: the label just comes out blank. For example:
cairo_pdf('Rplots.pdf')
plot(1, xlab='\u0298') # the x-label comes up blank
dev.off()
This used to work, but for some reason it does not anymore. It works with other characters, e.g.
cairo_pdf('Rplots.pdf')
plot(1, xlab='\u2113') # the x-label comes up with the \ell symbol
dev.off()
When I just paste in the solar symbol, i.e.
plot(1, xlab='ʘ')
then I get the warning
Warning messages:
1: In title(...) :
conversion failure on 'ʘ' in 'mbcsToSbcs': dot substituted for <ca>
The machine is German, but I am using the US English UTF-8 locale:
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: SUSE Linux Enterprise Server 12 SP1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
Any tips on how I can get the solar symbol to appear?
Note: I suppose with a new system you should first do:
capabilities()  # and see what the result for cairo is
A couple of ideas, although one of them requires knowing which fonts you are using, so the output of l10n_info()$MBCS and names(X11Fonts()) might be needed.
Option 1) The Hershey fonts have all the astrological signs as special escape characters. Page 4 of the output of:
demo(Hershey) # has \\SO as the escape sequence for the "solar" symbol.
Looking at the code for the draw.vf.cell function, we see that it uses the text function to plot those characters, so using it to label an axis requires adding xpd=TRUE to the arguments:
plot(1, xlab="") ; text(1, .45, "\\SO" , vfont=c("serif", "plain"), xpd=TRUE )
Option 2) Find the solar symbol in a font of your choice. You might try setting the font to something other than "Helvetica"; see ?X11, which has a section on Cairo fonts. The examples on the points function's help page include a function called TestChars that lets you print character glyphs from various fonts to your output device; in this case the output device might be either cairo_pdf or x11. On my device (the Mac fork of UNIX) the Arial font produces this output:
png(type="cairo-png");plot(1, xlab="\u0298");dev.off()
My observation over the years of similar questions leads me to believe that Cairo graphics are more reliably cross-platform. But since R can be compiled without cairo support, it's not a sure thing.
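If the default family simply lacks the glyph, another thing to try (a sketch; it assumes a font such as DejaVu Sans is installed on the SLES machine and actually contains U+0298) is to name an alternative family when opening the cairo device:
cairo_pdf('Rplots.pdf', family = 'DejaVu Sans')  # the family name is resolved via fontconfig
plot(1, xlab = '\u0298')                         # the glyph is taken from the named font if present
dev.off()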
Maybe your text editor is using latin1; in that case you would be sending latin1 characters to your console.
Look at the encoding:
Encoding('ʘ')
and/or try
plot(1, xlab=iconv('ʘ', from='latin1', to="UTF-8"))
but be careful: the encoding could change while copying and pasting.
If you use Notepad++ you can convert in the text editor between the different encodings.

using -knitr- to weave Rnw files in RStudio

This seems to be a recurrent problem for anyone who wants to write dynamic documents with knitr in RStudio (see also here, for instance).
Unfortunately I haven't found a solution on Stack Overflow or by googling more generally.
Here is a toy example I am trying to compile in RStudio. It is the minimal-example-002.Rnw (link):
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
Here is a code chunk.
<<foo, fig.height=4>>=
1+1
letters
chartr('xie', 'XIE', c('xie yihui', 'Yihui Xie'))
par(mar=c(4, 4, .2, .2)); plot(rnorm(100))
@
You can also write inline expressions, e.g. $\pi=\Sexpr{pi}$, and \Sexpr{1.598673e8} is a big number.
\end{document}
My problem is that I am not able to compile the PDF in RStudio using knitr, while if I change the default weaving option to Sweave I get the final PDF.
More specifically, I work on Windows 7 with the latest RStudio version (0.98.1103), I weave the file using the knitr option, and I have disabled the "always enable Rnw concordance" box.
Did this happen to you?
Any help would be highly appreciated, thank you very much.
EDIT
Apparently it is not an RStudio problem, as I tried to compile the document from R with:
library('knitr')
knit('minimal_ex.Rnw')
and I get the same error:
processing file: minimal_ex.Rnw
|
| | 0%
|
|...................... | 33%
ordinary text without R code
|
|........................................... | 67%
label: foo (with options)
List of 1
$ fig.height: num 4
Quitting from lines 8-10 (minimal_ex.Rnw)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 3, 0
In addition: Warning messages:
1: In is.na(res[, 1]) :
is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(res) : is.na() applied to non-(list or vector) of type 'NULL'
EDIT 2:
This is my session info:
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.10.5
loaded via a namespace (and not attached):
[1] tools_3.1.1
After spending hours trying to figure out the problem, I updated R (to v3.2.0) and everything works fine now.
It is not clear whether the problem was due to some package conflict, but it certainly wasn't an RStudio problem (as I had initially thought).
To add a little to this: it seems to be a bug with the echo parameter, which defaults to TRUE. Setting it to FALSE with knitr and pdfLaTeX as the renderer worked for me. In case you're in a situation where you can't update because of dependencies and/or permission issues, this might be a helpful ad hoc fix, since the error message is pretty useless.
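For reference, that workaround can be applied to the whole document from a setup chunk, which is the standard knitr way of changing chunk defaults (a sketch, not a fix for the underlying bug):
<<setup, include=FALSE>>=
library(knitr)
opts_chunk$set(echo = FALSE)  # suppress echoing of chunk code, which is what triggered the error here
@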

Force character vector encoding from "unknown" to "UTF-8" in R

I have a problem with inconsistent encoding of character vector in R.
The text file which I read a table from is encoded (via Notepad++) in UTF-8 (I tried UTF-8 without BOM, too).
I want to read the table from this text file, convert it to a data.table, set a key and make use of binary search. When I tried to do so, the following warning appeared:
Warning message:
In [.data.table(poli.dt, "żżonymi", mult = "first") :
A known encoding (latin1 or UTF-8) was detected in a join column. data.table compares the bytes currently, so doesn't support
mixed encodings well; i.e., using both latin1 and UTF-8, or if any unknown encodings are non-ascii and some of those are marked known and
others not. But if either latin1 or UTF-8 is used exclusively, and all
unknown encodings are ascii, then the result should be ok. In future
we will check for you and avoid this warning if everything is ok. The
tricky part is doing this without impacting performance for ascii-only
cases.
and binary search does not work.
I realised that my data.table key column contains both "unknown" and "UTF-8" Encoding types:
> table(Encoding(poli.dt$word))
unknown UTF-8
2061312 2739122
I tried to convert this column (before creating a data.table object) with the use of:
Encoding(word) <- "UTF-8"
word<- enc2utf8(word)
but with no effect.
I also tried a few different ways of reading a file into R (setting all helpful parameters, e.g. encoding = "UTF-8"):
data.table::fread
utils::read.table
base::scan
colbycol::cbc.read.table
but with no effect.
==================================================
My R.version:
> R.version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 0.3
year 2014
month 03
day 06
svn rev 65126
language R
version.string R version 3.0.3 (2014-03-06)
nickname Warm Puppy
My session info:
> sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Polish_Poland.1250 LC_CTYPE=Polish_Poland.1250 LC_MONETARY=Polish_Poland.1250
[4] LC_NUMERIC=C LC_TIME=Polish_Poland.1250
base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.9.2 colbycol_0.8 filehash_2.2-2 rJava_0.9-6
loaded via a namespace (and not attached):
[1] plyr_1.8.1 Rcpp_0.11.1 reshape2_1.2.2 stringr_0.6.2 tools_3.0.3
The Encoding function returns unknown if a character string has a "native encoding" mark (CP-1250 in your case) or if it's in ASCII.
To discriminate between these two cases, call:
library(stringi)
stri_enc_mark(poli.dt$word)
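As a toy illustration of that distinction (made-up strings, not your data): Encoding() reports pure-ASCII strings as "unknown", while stri_enc_mark() reports them as ASCII:
x <- c("abc", "za\u017c\u00f3\u0142\u0107")  # plain ASCII vs. a string with Polish letters
Encoding(x)       # [1] "unknown" "UTF-8"
stri_enc_mark(x)  # [1] "ASCII"   "UTF-8"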
To check whether each string is a valid UTF-8 byte sequence, call:
all(stri_enc_isutf8(poli.dt$word))
If it's not the case, your file is definitely not in UTF-8.
I suspect that you haven't forced the UTF-8 mode in the data read function (try inspecting the contents of poli.dt$word to verify this statement). If my guess is true, try:
read.csv2(file("filename", encoding="UTF-8"))
or
poli.dt$word <- stri_encode(poli.dt$word, "", "UTF-8") # re-mark encodings
If data.table still complains about the "mixed" encodings, you may want to transliterate the non-ASCII characters, e.g.:
stri_trans_general("Zażółć gęślą jaźń", "Latin-ASCII")
## [1] "Zazolc gesla jazn"
I could not find a solution myself for a similar problem.
I could not translate characters with unknown encoding from the txt file back into something more manageable in R.
Therefore, I was in a situation where the same character appeared more than once in the same dataset, because it was encoded differently ("X" in a Latin setting and "X" in a Greek setting).
However, the txt saving operation preserved that encoding difference, which is of course correct behaviour.
I tried some of the above methods, but nothing worked.
The problem is well described as: R "cannot distinguish ASCII from UTF-8 and the bit will not stick even if you set it".
A good workaround is to "export your data.frame to a temporary CSV file and reimport it with data.table::fread(), specifying Latin-1 as the source encoding".
Reproducing/copying the example given in the above source:
library(data.table)
df <- your_data_frame_with_mixed_utf8_or_latin1_and_unknown_str_fields
fwrite(df,"temp.csv")
your_clean_data_table <- fread("temp.csv",encoding = "Latin-1")
I hope it will help someone.

Could not find any X11 fonts error

I am starting to get into R development and I was following a tutorial that at a certain point opens an X11 window to display graphics, but when that window opens I get the following error:
Error in axis(side = side, at = at, labels = labels, ...) : could
not find any X11 fonts Check that the Font Path is correct. In
addition: Warning messages: 1: In function (display = "", width,
height, pointsize, gamma, bg, : locale not supported by Xlib: some
X ops will operate in C locale 2: In function (display = "", width,
height, pointsize, gamma, bg, : X cannot set locale modifiers
I have been Googling around but I can't find out how to fix the "Font Path" for this application. Does anybody know?
EDIT
The output of sessionInfo():
> sessionInfo()
R version 2.13.2 (2011-09-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] galgo_1.1 R.oo_1.8.2 R.methodsS3_1.2.1
loaded via a namespace (and not attached):
[1] tools_2.13.2
When doing names(X11Fonts()):
> names(X11Fonts())
[1] "serif" "sans" "mono"
>
I "followed" the admin manual, and set the lines in the .bashrc
Setting for the new UTF-8 terminal support in Lion.
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
From http://www.mail-archive.com/r-sig-mac@r-project.org/msg01027.html
What does this return:
capabilities("X11")
If you are on .Platform$OS.type == "windows" then you may need to do some further research; I doubt that X11 is installed there by default. But your edit shows that you are on a Mac, so try this:
names(X11Fonts())
# results on my device
[1] "serif" "sans" "mono" "Times"
[5] "Helvetica" "CyrTimes" "CyrHelvetica" "Arial"
[9] "Mincho"
When I execute X11() at the R console in the Mac GUI I get an X11 window, and choosing X11 > About X11 I see that I have "XQuartz 2.1.6 (xorg-server 1.4.2-apple33)". I am still using Leopard, but I thought recent versions of Macs installed X11 support by default, and I don't remember needing to point R in the right direction to find it either.
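For completeness: if the font path is fine but a particular family is simply not registered with R, extra X11 font mappings can be added from within R. This is essentially the example from ?X11Fonts (it assumes the X server actually provides a Utopia font to map):
# Register an additional X11 font under the name "utopia"
utopia <- X11Font("-*-utopia-*-*-*-*-*-*-*-*-*-*-*-*")
X11Fonts(utopia = utopia)
names(X11Fonts())  # "utopia" should now appear in the list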
