Unicode subscript in R - r

I want to write \sigma^2_i using unicode I can get two-thirds of the way there with:
"\u03C3\U00B2"
I can't for the life of me figure out how to add subscript. According to this site where I got the unicode for superscript 2 the correct unicode for subscript i is "\u1D62", but in R this does not print subscript i.
I know that for text in plots you can use expression, but this is not for a plot, so expression does not work.
Any help would be appreciated.
edit: it appears that using cat evaluates unicode in cases when just printing doesn't necessarily evaluate the unicode. I still don't understand why the unicode characters are inconsistent. Per this website the unicode character for subscript k is "\u2096", but cat("\u2096") prints a thick |

Related

How to generate all possible unicode characters?

If we type in letters we get all lowercase letters from english alphabet. However, there are many more possible characters like ä, é and so on. And there are symbols like $ or (, too. I found this table of unicode characters which is exactly what I need. Of course I do not want to copy and paste hundreds of possible unicode characters in one vector.
What I've tried so far: The table gives the decimals for (some of) the unicode characters. For example, see the following small table:
Glyph Decimal Unicode Usage in R
! 33 U+0021 "\U0021"
So if type "\U0021" we get a !. Further, paste0("U", format(as.hexmode(33), width= 4, flag="0")) returns "U0021" which is quite close to what I need but adding \ results in an error:
paste0("\U", format(as.hexmode(33), width= 4, flag="0"))
Error: '\U' used without hex digits in character string starting ""\U"
I am stuck. And I am afraid even if I figure out how to transform numbers to characters usings as.hexmode() there is still the problem that there are not Decimals for all unicode characters (see table, Decimals end with 591).
Any idea how to generate a vector with all the unicode characters listed in the table linked?
(The question started with a real world problem but now I am mostly simply eager to know how to do this.)
There may be easier ways to do this, but here goes. The Unicode package contains everything you need.
First we can get a list of unicode scripts and the block ranges:
library(Unicode)
uranges <- u_scripts()
Check what we've got:
head(uranges, 3)
$Adlam
[1] U+1E900..U+1E943 U+1E944..U+1E94A U+1E94B U+1E950..U+1E959 U+1E95E..U+1E95F
$Ahom
[1] U+11700..U+1171A U+1171D..U+1171F U+11720..U+11721 U+11722..U+11725 U+11726 U+11727..U+1172B U+11730..U+11739 U+1173A..U+1173B U+1173C..U+1173E U+1173F
[11] U+11740..U+11746
$Anatolian_Hieroglyphs
[1] U+14400..U+14646
Next we can convert the ranges into their sequences.
expand_uranges <- lapply(uranges, as.u_char_seq)
To get a single vector of all characters we can unlist it. This won't be easy to work with so really it would be better to keep them as a list:
all_unicode_chars <- unlist(expand_uranges)
# The Wikipedia page linked states there are 144,697 characters
length(all_unicode_chars)
[1] 144762
So seems to be all of them and the page needs updating. They are stored as integers so to print them (assuming the glyph is supported) we can do, for example, printing Japanese katakana:
intToUtf8(expand_uranges$Katakana[[1]])
[1] "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶヷヸヹヺ"

How to print an asterisk ('*') in R Markdown

I am trying to write in my R Markdown "3 times 6:10 and (3 times 6):10" all in complete letters (I am using times instead of *. My purpose is to put * instead of word times without any problem). However, it keeps giving me italic syntax for the part "6:10 and (3". How can I write my *s in R Markdown straight into the document without evoking * syntax which is make letters italic in the middle?
Escape '*' by using '\*' instead
Alternatively, you could use $\times$ to get the '×' symbol
On the web use &ast;
That's one of it's HTML codes.
(Only mention it because google brought me here when I didn't search about the R language and figure others will end up here as well.)
Markdown is quite sensitive to whitespace. If you don't put spaces around the * (line 3 below) you get the problem you're describing (markdown assumes the *s are italic-delimiters).
Some possible solutions
line 5: add spaces, no problem (except you might not want that spacing)
line 7 (#CaptainHat): set times as a LaTeX times symbol
line 9: set in code format
line 11 (#CaptainHat): protect *s with backslashes
Also tried #CaptainHat's suggestions as well as type-setting in code format ...)

How does bquote treat Unicode characters when using annotate R?

I was trying to include some Unicode characters as well as some expressions inside a plot. The Unicode character alone or in the axis works but when adding other elements does not show as expected. How can get the respective Unicode symbol?
ggplot()+
annotate("text",x=1,y=1.3,
label=bquote(C[1]~R^2*text[Psi]~"Currency \u20BF"),colour="blue",size=8)+
annotate("text",x=1,y=1.2,
label=bquote("Currency\u20BF"),colour="red",size=8)+
annotate("text",x=1,y=1.1,
label="Currency\u20BF",colour="red",size=8)+
labs(x=("Currency \u20BF"))

Insert specific unicode symbols inside values of data.frame variable

In my data.frame I would like to add two variables, "A" and "B", whose values contain respectively an n with the i subscript and an n with the s subscript.
As I have understood so far, it's not possible to specify an expression for the values of a variable, and hence to add special characters it's necessary to use unicode symbols. Some of this unicodes work in R, as for example the greek letter "mu", identified with the unicode \U00B5, or numeric subscripts, as you can see in this reprex in your R console:
x <- data.frame("A" = c("\U00B5"),
"B" = c("B\U2082"))
print(x)
These unicodes work also if I decide to put this variable in a ggplot() object, because I will display the correct symbol ("mu" for example) on the axis text or the facets.
The problem is that when I do the same for the subscripts of i (unicode: \U1D62) and s (unicode: \u209B), R doesn't recognise the unicode and prints the whole string inside the variable name.
Do you know how I can resolve this issue and if this unicode works on every operating system?
Thanks
Is there a reason you can't use the expression() function? It seems this would solve your problem (at least concerning greek letters).
Here is the site i used to learn how to input greek letters into my R/ggplot-legends.
https://stats.idre.ucla.edu/r/codefragments/greek_letters/
Altough it is not exactly the answer you look for, i still hope it helps!
If you are on Windows 10 recently updated with as of April 2018 Update:
Use the Windows key + '.' (e.g. hold together Windows Key plus period) in your text editor. This brings up Microsoft Emoji keyboard.
Select the Greek letters variable for your script.
The R Console will not accept the Greek letters as variables directly but only the from the editor script. Some of the Greek letters don't translate to English (like "µ" or "ß".) You can paste and copy them from ls() output to access. You may be able to use some math symbols as well for variable names. I can't however, get this to work with source(). That must be a text encoding problem.

how to put y axis greek letters in Veusz plot?

I want to put Capitalomega with index DE and k label:
and then ı want to show on the y axis label? How to do them?
Generally you can use tex symbols in Veusz. Therefore, you can write \Omega_{DE} and \Omega_{k} for your request. See details here (Sec. 2.4 Text).
Veusz understands a limited set of LaTeX-like formatting for text. There are some differences (for example, "10^23" puts the 2 and 3 into superscript), but it is fairly similar. You should also leave out the dollar signs. Veusz supports superscripts ("^"), subscripts ("_"), brackets for grouping attributes are "{" and "}".
Supported LaTeX symbols include: \AA, \Alpha, \Beta, \Chi, \Delta, \Epsilon, \Eta, \Gamma, \Iota, \Kappa, \Lambda, \Mu, \Nu, \Omega, \Omicron, \Phi, \Pi, \Psi, \Rho, \Sigma, \Tau, \Theta, \Upsilon, \Xi, \Zeta, \alpha, \approx, \ast, \asymp, \beta, \bowtie, \bullet, \cap, \chi, \circ, \cup, \dagger, \dashv, \ddagger, \deg, \delta, \diamond, \divide, \doteq, \downarrow, \epsilon, \equiv, \eta, \gamma, \ge, \gg, \in, \infty, \int, \iota, \kappa, \lambda, \le, \leftarrow, \lhd, \ll, \models, \mp, \mu, \neq, \ni, \nu, \odot, \omega, \omicron, \ominus, \oplus, \oslash, \otimes, \parallel, \perp, \phi, \pi, \pm, \prec, \preceq, \propto, \psi, \rhd, \rho, \rightarrow, \sigma, \sim, \simeq, \sqrt, \sqsubset, \sqsubseteq, \sqsupset, \sqsupseteq, \star, \stigma, \subset, \subseteq, \succ, \succeq, \supset, \supseteq, \tau, \theta, \times, \umid, \unlhd, \unrhd, \uparrow, \uplus, \upsilon, \vdash, \vee, \wedge, \xi, \zeta. Please request additional characters if they are required (and exist in the unicode character set). Special symbols can be included directly from a character map.
Other LaTeX commands are supported. "\" breaks a line. This can be used for simple tables. For example "{a\b} {c\d}" shows "a c" over "b d". The command "\frac{a}{b}" shows a vertical fraction a/b.
Also supported are commands to change font. The command "\font{name}{text}" changes the font text is written in to name. This may be useful if a symbol is missing from the current font, e.g. "\font{symbol}{g}" should produce a gamma. You can increase, decrease, or set the size of the font with "\size{+2}{text}", "\size{-2}{text}", or "\size{20}{text}". Numbers are in points.
Various font attributes can be changed: for example, "\italic{some italic text}" (or use "\textit" or "\emph"), "\bold{some bold text}" (or use "\textbf") and "\underline{some underlined text}".
Example text could include "Area / \pi (10^{-23} cm^{-2})", or "\pi\bold{g}".
Veusz plots these symbols with Qt's unicode support. You can also include special characters directly, by copying and pasting from a character map application. If your current font does not contain these symbols then you may get a box character.
In addition to the answer OmG posted, you can also directly enter the character (via a character map application or copy and paste), as Veusz supports unicode characters.

Resources