I'm creating a plot in R, and need to add an en dash to some axis labels, as opposed to your everyday hyphen.
axis(1, at=c(0:2), labels=c("0-10","11-30","31-70"))
I'm running R version 2.8.1 on Linux.
Old question but still a problem...
I'm using R vsn 3.3.2 on OSX 10.12.2, plotting with plot() to a pdf file that I import into Affinity Designer vsn 1.5.4. Axis labels of the form "2-0" show up in Affinity Designer with the dash overlapping the "0". I don't know if the problem lies with Affinity Designer or the pdf file or what. It would be nice to be able to try various Unicode dash characters, but R and pdf files both seem to not yet be fully equipped to deal with Unicode using the default fonts.
Solution: the cairo_pdf() device, which ships with R in grDevices (no extra package to load, provided your R build has cairo support):
d = 0:11
names(d) = paste(0:11, "-", 11:0, sep="")
names(d) = gsub("-", "\U2012", names(d)) # U+2012 is "figure dash"
d
barplot(d)
cairo_pdf(filename="x.pdf", width=11, height=8)
barplot(d)
dev.off()
The dashes show up in the R console, default R plotting device, and the pdf file viewed with both Preview and Affinity Designer.
In this example, you can use the expression() function: plotmath typesets the "-" as a minus sign, which is longer than a hyphen and looks much like an en dash:
axis(1,
     at=c(0:2),
     labels=c(expression(0-10),
              expression(11-30),
              expression(31-70)))
You're using Linux, so depending on how well R understands unicode, you could map one of your spare keyboard keys to the Compose Key and then just type it out. To get a —, press Compose and then the normal - key two or three times (depending on your system's mappings). Note that when using the Compose key, you don't hold it down - just press the keys in sequence.
Exactly how you'd enable that varies, but in Ubuntu, System->Preferences->Keyboard, Layout tab, Layout Options button, and select something appropriate for the "Compose key position" item. I usually use the Menu key.
Edit: My mistake, you wanted an en-dash, not an em-dash. The en-dash (–) is Compose dash dash period, rather than Compose dash dash dash.
An MDPI journal asked me to change the hyphens in my axis labels to en dashes.
Using base graphics, I solved the problem by simply replacing each "-" with "\u2013", without spaces. A complete axis() call looks like this:
axis(1, 1:2, c("20\u201329", "40\u201349"))
In my case the two labels expressed two age groups. I used it in R 4.1.3 and windows 10.
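If there are many labels, the replacement can be done programmatically rather than by hand. The following is only a sketch: to_en_dash() is a made-up helper name, not something from base R, and it assumes every "-" in the labels marks a range (it would also convert the sign of a negative number).

```r
# Hypothetical helper: replace every ASCII hyphen with an en dash (U+2013).
# fixed = TRUE treats "-" as a literal character, not a regular expression.
to_en_dash <- function(x) gsub("-", "\u2013", x, fixed = TRUE)

age_groups <- c("20-29", "40-49")
age_labels <- to_en_dash(age_groups)

# Use the result wherever the labels are drawn, for example:
#   axis(1, at = seq_along(age_labels), labels = age_labels)
```

Whether the en dash then shows up correctly still depends on the graphics device and font, as discussed in the other answers.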
In order to get some space for a special ylab text, I use mar=c(5,7,4,2). This provides me with 7 lines of space for ylab. On the default device (screen) everything functions as anticipated. However, I cannot get this output to any other device than the screen.
par(mar=c(5,7,4,2))
png(file="a.png", width=500, height=500)
plot(1,1,ylab="A very very long axis title\nthat need special care",xlab="",type="n")
I verified the same behavior with png, tiff, and pdf. It seems that the maximum printable margin size on these devices is 4; anything beyond that gets cut off. The same happens with xlab, e.g. when using mgp=c(5,1,0): mgp=c(4,1,0) (line 4) is the last line that is printed on any device other than the screen.
Upgrading to the latest R version does not change this behavior, and it is the same on Windows and Ubuntu.
Any advice on the root cause of this behavior is appreciated.
The problem is the order of your statements. The par() call applies to the current device. Since you open the png() device after that, it doesn't have any effect. Just put things in this order and they'll be fine:
png(file="a.png", width=500, height=500)
par(mar=c(5,7,4,2))
plot(1,1,ylab="A very very long axis title\nthat need special care",xlab="",type="n")
dev.off()
This gives this image in the file:
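That par() settings belong to a single device, not to the session, can be checked directly. A minimal sketch, using two temporary pdf files (the file and variable names are arbitrary):

```r
# Open a first device and change its margins.
first_file  <- tempfile(fileext = ".pdf")
second_file <- tempfile(fileext = ".pdf")

pdf(first_file)
par(mar = c(5, 7, 4, 2))
mar_first <- par("mar")    # margins of the first device: 5 7 4 2

# Opening a second device makes it the current one; it starts from the defaults.
pdf(second_file)
mar_second <- par("mar")   # default margins: 5.1 4.1 4.1 2.1

# Close both devices again.
dev.off()
dev.off()

print(mar_first)
print(mar_second)
```

The second device never sees the mar setting made before it was opened, which is the situation in the question.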
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="ক্ত", family="Kohinoor Bangla")
On my machine, the Bengali conjunct cluster "ক্ত" is rendered as its constituents plus a virama:
I have tried several different fonts to no avail. Is there a trick to making conjuncts render correctly?
EDIT:
Explicitly using the Unicode escapes still doesn't render correctly:
This renders correctly for me:
print(stringi::stri_enc_toutf8("\u0995\u09cd\u09a4"))
This still gives me the exact same result as before
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="\u0995\u09cd\u09a4", family="Kohinoor Bangla")
Why is there a difference between the console output and ggplot output?
I'm not familiar with the Bengali language, but if you would look up the unicode characters for the text that you want to render, you could simply use those in geom_text()
# According to unicode code chart, these are some Bengali characters
# U+099x4
# U+09Ex3
ggplot(data=NULL,aes(x=1,y=1))+
# Substitute 'U+' by '\u', leave the 'x' out
geom_text(size = 10, label = "\u0994\u09E3")
Substitute the unicode characters as you see fit.
Hope that helped!
EDIT: I tried your last piece of code, which gave me a warning about the font not being installed. So I ran it without the family = "Kohinoor Bangla":
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="\u0995\u09cd\u09a4")
Which gave me the following output:
From a visual comparison with the character that you posted, it looks quite similar. Next, I ran the same piece of code on my work computer, which gave me the following output:
The difference between work and home is that work runs Linux while home runs Windows, and work has R 3.4.4 while home has R 3.5.3. Both use RStudio and ggplot2 3.2.0. I can't update R at work because of backwards-compatibility issues, so I can't check whether the R version is the problem. However, you could check whether your version of R is older than 3.5.3 and see if updating fixes the problem. Otherwise, I would guess it is a platform issue.
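Before suspecting the R version, it may be worth confirming that the string itself is what you expect. A small sketch (variable names are only for illustration) that lists the code points of the label:

```r
# The conjunct from the question, written with explicit escapes.
conjunct <- "\u0995\u09cd\u09a4"

# utf8ToInt() returns one integer per Unicode code point.
code_points <- utf8ToInt(conjunct)
code_point_labels <- sprintf("U+%04X", code_points)

print(code_point_labels)   # "U+0995" "U+09CD" "U+09A4"
```

If the three code points are there (KA, virama, TA), the data is fine, and the difference between console and plot most likely comes from how each output path chooses fonts and shapes the text, not from the string.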
On some computers, the following code used in conjunction with the packages siar and SIBER does not render the delta and/or permil symbol correctly in the axes labels. Instead, either a blank axis label, or text such as "\u2030" is rendered in its place.
plot(0,xlab = expression(paste(delta^13,"C (\u2030)")))
One often-encountered problem is that your computer's region settings (i.e. those of your operating system, not of R or RStudio) are set to use a non-UTF-8 character set. If you type
Sys.getlocale()
in the R command window, you should see something like
"en_IE.UTF-8/en_IE.UTF-8/en_IE.UTF-8/C/en_IE.UTF-8/en_IE.UTF-8"
which for me means I'm using UTF-8 in English with Irish region settings.
If you don't see UTF-8, then \u2030 and other character codes won't work.
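The same check can be done in code instead of by reading the locale string. This is a sketch using base R's l10n_info(); the messages are only illustrative:

```r
# l10n_info() describes the current locale; its "UTF-8" element is TRUE
# when R is running in a UTF-8 locale.
locale_info <- l10n_info()
is_utf8 <- isTRUE(locale_info[["UTF-8"]])

if (is_utf8) {
  message("UTF-8 locale: escapes such as \\u2030 should be usable in labels.")
} else {
  message("Non-UTF-8 locale: ", Sys.getlocale("LC_CTYPE"))
}
```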
I constructed a dendrogram in R with the code:
data(iris)
aver<-sapply(iris[,-5],function(x) by(x,iris$Species,mean))
matrix<-dist(aver)
clust<-hclust((matrix),"ave")
clust$labels<-row.names(aver)
plot(as.dendrogram(clust))
I wanted to save the dendrogram as svg file using the code:
install.packages("Cairo")
library(Cairo)
svg("plot.svg")
plot(as.dendrogram(clust))
dev.off()
Here the problem started:
When I imported the "plot.svg" into Inkscape (ver: 0.48.4) and selected any label (e.g. "setosa") it was not recognized as a text, but rather as some "user defined" object. Specifically, when I selected any "letter" in the label and inspect it with the XML Editor (ctrl+shift+X) in Inkscape I obtained this information:
**id**: use117
**x**: 142.527344
**xlink:href**: #glyph0-8
**y**: 442.589844
On the other hand, when I manually wrote "setosa" using "create and edit text objects" tool, and inspected in XML Editor, it returned:
**id**: text4274
**sodipodi:linespacing**: 125%
**style**: font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;line-height:125%;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;font-family:Palatino Linotype;-inkscape-font-specification:Palatino Linotype
**transform**: scale(0.8,0.8)
**x**: 176.02016
**xml:space**: preserve
**y**: 596.96674
Judging by the "id" attribute in the XML Editor, Inkscape apparently does not recognize the labels as text. Hence, I cannot change the font or size, or use any other function that applies to text objects in Inkscape.
Here is the svg file, that I made with the previous code and imported into Inkscape
I repeated the previous steps with other versions of Inkscape and R, but the result was the same.
Here is the question:
Do you have any suggestions for how to get the labels as text objects instead of "user defined" objects (or whatever they are) when importing SVG files from R into Inkscape?
UPDATE
#baptiste linked to the SO thread where #Oscar Perpiñán suggested three packages (gridSVG, SVGAnnotation and RSVGTipsDevice) that manipulate SVG. Unfortunately, none of the suggested packages solved the text issue.
So far I have found an SO thread where #Mo Sander suggested the RSvgDevice package, since it preserves text objects rather than emitting glyphs. After getting stuck on the installation, I found that RSvgDevice is only available for 32-bit installations and R < 2.15.0. Otherwise, R returns this warning message:
Warning message:
package ‘RSvgDevice’ is not available (for R version 3.0.1)
Apart from requiring an older R version, RSvgDevice is currently the only package that can preserve text objects in SVG.
I'm a bit late to the party, but I've been dealing with this myself. I found a trick to make it work. First, I export the plot as PDF instead of SVG because PDF fonts are recognized by inkscape.
This, however, brings a new problem: the text often ends up being defined letter by letter, meaning that you can change the font but the spacing stays fixed, which becomes immensely annoying. I found that this was due to the x coordinate being defined for each letter.
I wrote a Perl script and put it in this gist to remove all the trailing coordinates. After that I was able to manipulate all the fonts as I wished. Note that this will only work for horizontal text.
Hope that helps this problem you had over a year ago :)
This is a failing in Cairo. Major, from my point of view.
The cairo SVG surface (i.e. the back-end in Cairo used to "draw" on SVG) simply does not support the "text" tag. It does not understand about strings at all. Instead, it places each character (glyph) individually. So any SVG generated with Cairo is not useful if you want to post-process contained text with a vector editor. :(
The only mention I found on the cairo list was this one:
http://lists.cairographics.org/archives/cairo/2011-February/021777.html
The svglite package exports text on Linux as desired.
[EDIT] According to this thread, there is also a way to remove the squeezing of the edited text into the fixed box width. Just remove the textLength field from the object in the XML editor.
Cheers
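For reference, a sketch of what the svglite route can look like for the dendrogram from the question. It assumes the svglite package is installed (the code skips the export otherwise), and it uses tapply() instead of by() to compute the species means; the output path is a temporary file.

```r
# Rebuild the dendrogram from the question.
aver  <- sapply(iris[, -5], function(x) tapply(x, iris$Species, mean))
clust <- hclust(dist(aver), method = "average")

svg_file <- tempfile(fileext = ".svg")
has_text_tags <- NA   # stays NA if svglite is not available

if (requireNamespace("svglite", quietly = TRUE)) {
  svglite::svglite(svg_file, width = 7, height = 7)
  plot(as.dendrogram(clust))
  dev.off()

  # Check whether the file contains <text> elements rather than glyph references.
  svg_lines <- readLines(svg_file, warn = FALSE)
  has_text_tags <- any(grepl("<text", svg_lines, fixed = TRUE))
}

print(has_text_tags)
```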
I can't comment directly on mgrewe's answer because of my low reputation, but thank you for the solution.
Implemented the textLength edit into R:
svgitem<-readLines('file.svg')
svgitem<-gsub('textLength=','tL=',svgitem)
writeLines(svgitem,'without_textLength.svg')
With the without_textLength.svg file, the text box no longer seems to be affected when editing in Inkscape, and the old textLength value is kept under the new name 'tL'.
Thanks again mgrewe, I lost so many hours reformatting text in Inkscape before seeing your answer.
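A variant of the same idea removes the attribute entirely instead of renaming it. This is only a sketch on two made-up example lines; the regular expression assumes the attribute is written as textLength="..." or textLength='...' on a single line, which may not hold for every SVG writer.

```r
# Two illustrative <text> elements, one with each quoting style.
svg_lines <- c(
  "<text x='10' y='20' textLength='42.5px' lengthAdjust='spacingAndGlyphs'>setosa</text>",
  "<text x=\"10\" y=\"40\" textLength=\"51px\">virginica</text>"
)

# Drop the attribute together with its leading whitespace.
# \\1 matches the same quote character that opened the value.
cleaned <- gsub("\\s+textLength=([\"'])[^\"']*\\1", "", svg_lines, perl = TRUE)

print(cleaned)
```

For a real file, read it with readLines(), apply the same gsub() call, and write the result back with writeLines(), as in the code above.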
R is clearly not using the standard SVG text objects for producing its labels. I have no idea why. I am not an R user.
Perhaps by default it uses its own custom font that it manually inserts glyph-by-glyph into the output. Are you using the same font in both cases? In Inkscape you are using Palatino. Is that what you are using for the labels in R?
Some of you may have seen my blog post on this topic, where I wrote the following code after wanting to help a friend produce half-filled circles as points on a graph:
TestUnicode <- function(start="25a0", end="25ff", ...)
{
  nstart <- as.hexmode(start)
  nend <- as.hexmode(end)
  r <- nstart:nend
  s <- ceiling(sqrt(length(r)))
  par(pty="s")
  plot(c(-1,(s)), c(-1,(s)), type="n", xlab="", ylab="",
       xaxs="i", yaxs="i")
  grid(s+1, s+1, lty=1)
  for(i in seq(r)) {
    try(points(i%%s, i%/%s, pch=-1*r[i],...))
  }
}
TestUnicode(9500,9900)
This works (i.e. produces a nearly-full grid of cool dingbatty symbols):
on Ubuntu 10.04, in an X11 or PNG device
on Mandriva Linux distribution, same devices, with locally built R, once pango-devel was installed
It fails to varying degrees (i.e. produces a grid partly or entirely filled with dots or empty rectangles), either silently or with warnings:
on the same Ubuntu 10.04 machine in PDF or PostScript (tried setting font="NimbusSan" to use URW fonts, doesn't help)
on MacOS X.6 (quartz, X11, Cairo, PDF)
For example, trying all the available PDF font families:
flist <- c("AvantGarde", "Bookman","Courier", "Helvetica", "Helvetica-Narrow",
           "NewCenturySchoolbook", "Palatino", "Times","URWGothic",
           "URWBookman", "NimbusMon", "NimbusSan", "NimbusSanCond",
           "CenturySch", "URWPalladio","NimbusRom")
for (f in flist) {
  fn <- paste("utest_",f,".pdf",sep="")
  pdf(fn,family=f)
  TestUnicode()
  title(main=f)
  dev.off()
  embedFonts(fn)
}
on Ubuntu, none of these files contains the symbols.
It would be nice to get it to work on as many combinations as possible, but especially in some vector format and double-especially in PDF.
Any suggestions about font/graphics device configurations that would make this work would be welcomed.
I think you are out of luck, Ben: according to some notes by Paul Murrell, pdf() can only handle single-byte encodings. Multi-byte encodings need to be converted to the single-byte equivalent, and therein lies the rub; by definition, a single-byte encoding cannot contain all the glyphs that can be represented in a multi-byte encoding such as UTF-8.
Paul's notes can be found here wherein he suggests a couple of solutions using Cairo-based PDF devices, using cairo_pdf() on suitably-endowed Linux and Mac OS systems, or via the Cairo package under MS Windows.
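As a rough sketch of the cairo_pdf() route for the original goal (half-filled circles as plotting symbols): the code below only draws when the R build has cairo support and the session runs in a UTF-8 locale, and it writes to a temporary file. Whether the glyphs actually appear still depends on a font containing U+25D0 to U+25D3 being available to cairo, which this code does not check.

```r
# Code points of the four half-filled circle symbols (U+25D0 .. U+25D3).
half_circles <- 0x25D0:0x25D3

out_file <- tempfile(fileext = ".pdf")
can_draw <- capabilities("cairo")[[1]] && isTRUE(l10n_info()[["UTF-8"]])

if (can_draw) {
  cairo_pdf(out_file, width = 4, height = 4)
  # A negative pch value is interpreted as a Unicode code point.
  plot(seq_along(half_circles), rep(1, length(half_circles)),
       pch = -half_circles, cex = 3, xlab = "", ylab = "")
  dev.off()
}
```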
I have found the cairo_pdf device to be completely insufficient: the output is markedly different from both pdf and on-screen rendering, and its plotmath support is sketchy.
However, there’s a rather simple workaround on OS X: Use the “normal” quartz device and set its type to pdf:
quartz(type = 'pdf', file = 'output.pdf')
Unfortunately, on my computer this ignores the font family and always uses Helvetica (although the documentation claims that the default is Arial).
There are at least two other gotchas:
pdf converts hyphens to minuses. This may not even always be what you want but it’s quite useful to properly typeset negative numbers. The linked thread describes workarounds for this.
It’s of course platform specific and only works on OS X.
(I realise that OP briefly mentions the Quartz device but this thread is frequently viewed and I think this solution needs more prominence.)
Another solution might be to use tikzDevice which can now use XeLaTeX with Unicode characters. The resulting tex file can then be compiled to produce a pdf. The problem is still that you must have a font on your system that contains the characters.
library(tikzDevice)
options(tikzXelatexPackages=c(getOption('tikzXelatexPackages'),
'\\setromanfont{Courier New}'))
tikz(engine='xetex',standAlone=T)
TestUnicode(9500,9900)
dev.off()
The first time, this will take a LONG time.
Have you tried embedding a font in the PDF, or including one for Mac users that would work?