If we type in letters we get all lowercase letters of the English alphabet. However, there are many more possible characters, like ä and é, and there are symbols like $ or (, too. I found this table of Unicode characters, which is exactly what I need. Of course, I do not want to copy and paste hundreds of Unicode characters into one vector.
What I've tried so far: The table gives the decimal codes for (some of) the Unicode characters. For example, see the following small table:
Glyph Decimal Unicode Usage in R
! 33 U+0021 "\U0021"
So if we type "\U0021" we get a !. Further, paste0("U", format(as.hexmode(33), width = 4, flag = "0")) returns "U0021", which is quite close to what I need, but adding \ results in an error:
paste0("\U", format(as.hexmode(33), width= 4, flag="0"))
Error: '\U' used without hex digits in character string starting ""\U"
I am stuck. And I am afraid that even if I figure out how to transform numbers to characters using as.hexmode(), there is still the problem that the table does not give decimals for all Unicode characters (see the table: the decimals end at 591).
Any idea how to generate a vector with all the unicode characters listed in the table linked?
(The question started with a real world problem but now I am mostly simply eager to know how to do this.)
There may be easier ways to do this, but here goes. The Unicode package contains everything you need.
First we can get a list of Unicode scripts and their block ranges:
library(Unicode)
uranges <- u_scripts()
Check what we've got:
head(uranges, 3)
$Adlam
[1] U+1E900..U+1E943 U+1E944..U+1E94A U+1E94B U+1E950..U+1E959 U+1E95E..U+1E95F
$Ahom
[1] U+11700..U+1171A U+1171D..U+1171F U+11720..U+11721 U+11722..U+11725 U+11726 U+11727..U+1172B U+11730..U+11739 U+1173A..U+1173B U+1173C..U+1173E U+1173F
[11] U+11740..U+11746
$Anatolian_Hieroglyphs
[1] U+14400..U+14646
Next we can convert the ranges into their sequences.
expand_uranges <- lapply(uranges, as.u_char_seq)
To get a single vector of all characters we can unlist it. This won't be easy to work with so really it would be better to keep them as a list:
all_unicode_chars <- unlist(expand_uranges)
# The Wikipedia page linked states there are 144,697 characters
length(all_unicode_chars)
[1] 144762
So that seems to be all of them, and the page needs updating. They are stored as integers, so to print them (assuming the glyph is supported) we can use intToUtf8(). For example, printing Japanese katakana:
intToUtf8(expand_uranges$Katakana[[1]])
[1] "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶヷヸヹヺ"
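If only the decimal codes from the question's table (33 to 591) are needed, base R may already be enough. The "\U" escape is resolved when R parses a string literal, which is why it cannot be assembled with paste0(); intToUtf8() does the conversion at run time instead. A minimal sketch, where the range 33:591 is taken from the question and the variable names are only illustrative:

```r
# Decimal code points from the question's table (33 = "!", last listed = 591).
codes <- 33:591

# multiple = TRUE returns one string per code point instead of one long string.
chars <- intToUtf8(codes, multiple = TRUE)

# Label each character with its "U+XXXX" form so it can be looked up by name.
names(chars) <- sprintf("U+%04X", codes)

chars[["U+0021"]]  # the exclamation mark from the small table in the question
head(chars, 5)
```

Note that this range also covers the control characters between 127 and 159, which have no visible glyph, so you may want to drop those before using the vector.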
I am creating my series of glyphs in a custom font, but when the font is read by other programs I can only see the glyphs that occupy the code points from U+0020 up to U+007F. Is there a setting to allow all other characters to be read? In my font, all characters after that range are called "control characters".
Thanks for any tips.
Glyphs which you add to your font appear based on the character that their encoding matches. If you add characters with no encoding, of course they won't be available. You could consider giving them encodings of e.g. U+F000, U+F001, etc., which are in the Unicode Private Use Area. Then you could paste such characters into your program, or if you use Linux, some desktop environments allow you to press Ctrl+Shift+U and type e.g. f000 to get the character U+F000.
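To get such characters onto the clipboard for testing, any tool that can turn a code point into a string will do. Here is a small sketch in R (chosen only for illustration; starting at U+F000 follows the example above, and the count of three glyphs is an assumption):

```r
# A few code points from the Basic Multilingual Plane Private Use Area
# (U+E000..U+F8FF); starting at U+F000 follows the example above.
pua_codes <- 0xF000L + 0:2

# One string per code point; these can be copied into another program.
pua_chars <- intToUtf8(pua_codes, multiple = TRUE)

# Show the "U+XXXX" form next to the decimal value.
data.frame(code_point = sprintf("U+%04X", pua_codes), decimal = pua_codes)
```

Whether anything visible appears depends on the font in use actually having glyphs at those code points.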
I am trying to write "3 times 6:10 and (3 times 6):10" in my R Markdown document. (I have spelled out "times" here; what I actually want is to put * in place of the word "times".) However, it keeps rendering the part "6:10 and (3" in italics. How can I write my *s straight into the R Markdown document without triggering the * syntax, which makes the text in between italic?
Escape '*' by using '\*' instead
Alternatively, you could use $\times$ to get the '×' symbol
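If the text is generated from R (for example in an inline chunk) rather than typed by hand, the same escaping can be applied programmatically. A minimal sketch, assuming the expression is held in a character string (the variable names are made up for illustration):

```r
# The text from the question, with literal asterisks.
txt <- "3*6:10 and (3*6):10"

# fixed = TRUE treats both the pattern and the replacement literally,
# so every * becomes \* in the resulting string.
escaped <- gsub("*", "\\*", txt, fixed = TRUE)

# cat() shows the string as Markdown will see it.
cat(escaped, "\n")
```

The doubled backslash is only R's string-literal syntax; the resulting string contains a single backslash before each asterisk.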
On the web use &#42;
That's one of its HTML codes.
(Only mention it because google brought me here when I didn't search about the R language and figure others will end up here as well.)
Markdown is quite sensitive to whitespace. If you don't put spaces around the * (line 3 below) you get the problem you're describing (markdown assumes the *s are italic-delimiters).
Some possible solutions
line 5: add spaces, no problem (except you might not want that spacing)
line 7 (#CaptainHat): set times as a LaTeX times symbol
line 9: set in code format
line 11 (#CaptainHat): protect *s with backslashes
(Also tried #CaptainHat's suggestions as well as typesetting in code format ...)
I want to write a capital omega with the subscripts DE and k as a label:
and then I want to show them in the y-axis label. How can I do that?
Generally, you can use TeX symbols in Veusz, so you can write \Omega_{DE} and \Omega_{k} for the labels you want. See details here (Sec. 2.4 Text).
Veusz understands a limited set of LaTeX-like formatting for text. There are some differences (for example, "10^23" puts the 2 and 3 into superscript), but it is fairly similar. You should also leave out the dollar signs. Veusz supports superscripts ("^"), subscripts ("_"), brackets for grouping attributes are "{" and "}".
Supported LaTeX symbols include: \AA, \Alpha, \Beta, \Chi, \Delta, \Epsilon, \Eta, \Gamma, \Iota, \Kappa, \Lambda, \Mu, \Nu, \Omega, \Omicron, \Phi, \Pi, \Psi, \Rho, \Sigma, \Tau, \Theta, \Upsilon, \Xi, \Zeta, \alpha, \approx, \ast, \asymp, \beta, \bowtie, \bullet, \cap, \chi, \circ, \cup, \dagger, \dashv, \ddagger, \deg, \delta, \diamond, \divide, \doteq, \downarrow, \epsilon, \equiv, \eta, \gamma, \ge, \gg, \in, \infty, \int, \iota, \kappa, \lambda, \le, \leftarrow, \lhd, \ll, \models, \mp, \mu, \neq, \ni, \nu, \odot, \omega, \omicron, \ominus, \oplus, \oslash, \otimes, \parallel, \perp, \phi, \pi, \pm, \prec, \preceq, \propto, \psi, \rhd, \rho, \rightarrow, \sigma, \sim, \simeq, \sqrt, \sqsubset, \sqsubseteq, \sqsupset, \sqsupseteq, \star, \stigma, \subset, \subseteq, \succ, \succeq, \supset, \supseteq, \tau, \theta, \times, \umid, \unlhd, \unrhd, \uparrow, \uplus, \upsilon, \vdash, \vee, \wedge, \xi, \zeta. Please request additional characters if they are required (and exist in the unicode character set). Special symbols can be included directly from a character map.
Other LaTeX commands are supported. "\\" breaks a line. This can be used for simple tables. For example "{a\\b} {c\\d}" shows "a c" over "b d". The command "\frac{a}{b}" shows a vertical fraction a/b.
Also supported are commands to change font. The command "\font{name}{text}" changes the font text is written in to name. This may be useful if a symbol is missing from the current font, e.g. "\font{symbol}{g}" should produce a gamma. You can increase, decrease, or set the size of the font with "\size{+2}{text}", "\size{-2}{text}", or "\size{20}{text}". Numbers are in points.
Various font attributes can be changed: for example, "\italic{some italic text}" (or use "\textit" or "\emph"), "\bold{some bold text}" (or use "\textbf") and "\underline{some underlined text}".
Example text could include "Area / \pi (10^{-23} cm^{-2})", or "\pi\bold{g}".
Veusz plots these symbols with Qt's unicode support. You can also include special characters directly, by copying and pasting from a character map application. If your current font does not contain these symbols then you may get a box character.
In addition to the answer OmG posted, you can also directly enter the character (via a character map application or copy and paste), as Veusz supports unicode characters.
When I am trying to paste the character » (right double angle quotes) in Unix from my Notepad, it's converting to /273. The corresponding Hex value is BB and the Decimal value is 187.
My actual requirement is to have this character as the file delimiter when I export a .dat file from a database table. So, this character was put in as the delimiter after each column name. But, while copy-pasting, it's getting converted to /273.
Any idea about how to fix this? I am on Solaris (SunOS 5.10).
Thanks,
Visakh
ASCII only defines the character codes up to 127 (0x7F) - everything after that is another encoding, such as ISO-8859-1 or UTF-8. Make sure your locale is set to the encoding you are trying to use - the locale command will report your current locale settings, the locale(5) and environ(5) man pages cover how to set them. A much more in-depth introduction to the whole character encoding concept can be found in Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
The character code 0xBB is shown as » in the ISO-8859-1 character chart, so that's probably the character set you want. The locale would then be something like en_US.ISO8859-1, which gives you that character set with US/English messages, date formats, currency settings, etc.
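The three numbers in the question are very likely the same byte written in different bases, assuming the /273 seen in the terminal is the octal escape \273. A quick check, sketched here in R purely for the arithmetic (the variable names are illustrative):

```r
# Octal 273 as a decimal number.
code <- strtoi("273", base = 8L)

# The same value in hexadecimal.
hex <- format(as.hexmode(code))

# Interpret the single byte 0xBB as ISO-8859-1 and convert it to UTF-8.
ch <- iconv(rawToChar(as.raw(code)), from = "ISO-8859-1", to = "UTF-8")

code             # 187
hex              # "bb"
utf8ToInt(ch)    # 187, i.e. U+00BB
charToRaw(ch)    # c2 bb: the same character takes two bytes in UTF-8
```

So the terminal appears to be showing the raw byte as an escape instead of interpreting it in the ISO-8859-1 character set, which is consistent with a locale mismatch.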