Is the Tibetan vowel u supported by HarfBuzz? - harfbuzz

Does HarfBuzz support the Tibetan vowel u well? For Uniscribe (usp) and the HarfBuzz library, the shaped result is different for the same Tibetan characters. Here are the results:
usp:
harfbuzz:
The Tibetan characters are "U+0F45 U+0F74 U+0F74 U+0F74"; the string length is 4.
I don't know why the results differ, or how to fix it.

HarfBuzz stacks diacritics for various scripts (Arabic as well, for example), and what you see is its difference from Uniscribe, which doesn't have that feature. So there isn't anything you need to do on your side, unless you insist on having that dotted circle for some specific reason.
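
If you want to see exactly what HarfBuzz produces for that string, a minimal sketch using the uharfbuzz Python bindings could look like the following; the font file name is only a placeholder for any font with Tibetan coverage.

import uharfbuzz as hb

# Placeholder path: any font that covers Tibetan will do.
with open("NotoSerifTibetan-Regular.ttf", "rb") as f:
    blob = hb.Blob(f.read())
face = hb.Face(blob)
font = hb.Font(face)

buf = hb.Buffer()
buf.add_str("\u0F45\u0F74\u0F74\u0F74")  # TIBETAN LETTER CA + three TIBETAN VOWEL SIGN U
buf.guess_segment_properties()           # infers script=Tibetan, direction=LTR
hb.shape(font, buf)

for info, pos in zip(buf.glyph_infos, buf.glyph_positions):
    print(info.codepoint, info.cluster, pos.x_advance, pos.y_offset)

The cluster values and y offsets show whether the vowel signs were stacked onto the base consonant, which is the behaviour described above; Uniscribe instead shows the dotted circle for the extra vowels.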

Related

Determine user location based on latitude-longitude

I am planning to build a system that allows a user to enter 10 values (digits and characters), from which I can determine their location.
I would like to use some mathematics, or anything else, that allows me to convert the (latitude, longitude) pair into one string of digits and characters.
Is it possible to do that? If yes, please give me a hint on how I can do it!
Thanks
At a code length of 10 characters, an Open Location Code (a.k.a. “Plus Code”) gives about 14m of resolution. Usually you'd have a + between the first 8 and the last 2 characters, but you can infer that. You can type and find these codes easily in Google Maps.
Geohash uses base 32 instead of base 20, so each character provides more information. 8 characters there already give you 19m resolution, the way I read Wikipedia. There is a chance you'd accidentally have obscenities in your code, though, which other codes try harder to avoid.
Geohash-36 uses a base-36 character set and avoids vowels (to prevent obscenities), but relies on character case. Wikipedia gives the accuracy of 10 characters as ⅙ m.
All of these are well documented and probably have freely accessible reference implementations, too. You can also read about the design principles behind these.
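
To make the idea concrete, here is a minimal Python sketch of geohash-style encoding: repeatedly bisect the longitude and latitude ranges, interleave the resulting bits, and map each 5-bit group onto geohash's base-32 alphabet. It only illustrates the principle; the reference implementations mentioned above handle decoding, neighbours and edge cases.

# Geohash alphabet: base 32, omitting a, i, l and o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, length=8):
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    even = True  # even bit positions encode longitude
    while len(bits) < length * 5:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        even = not even
    chars = []
    for i in range(0, len(bits), 5):
        value = 0
        for b in bits[i:i + 5]:
            value = (value << 1) | b
        chars.append(BASE32[value])
    return "".join(chars)

print(geohash_encode(48.8583, 2.2945))  # 8 characters, roughly the 19 m resolution mentioned above

Plus Codes work on the same divide-the-area principle, just with a base-20 alphabet chosen to avoid spelling words.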

MathML <mo> uses

What do the following snippets of code do in MathML files? I removed those lines and it still worked fine for me.
<mo>⁡</mo>
<mo>⁢</mo>
<mo></mo>
Answering any of them, or just letting me know what they are, would be very much appreciated.
The first two are U+2061 FUNCTION APPLICATION and U+2062 INVISIBLE TIMES. They help indicate semantic information; see this Wikipedia entry.
The last one could be anything, since it lies in the Unicode Private Use Area, which is provided so that font developers can store glyphs that do not correspond to regular Unicode positions. (Unless it's a typo and really U+6349, in which case it's a Han character.)
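
As a quick check on what those invisible characters are, a couple of lines of Python (nothing MathML-specific) will name them:

import unicodedata

# The two invisible operators from the question; their only job is to make the
# structure (function application vs. multiplication) explicit to software.
for ch in ("\u2061", "\u2062"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# U+2061 FUNCTION APPLICATION
# U+2062 INVISIBLE TIMES

Removing them usually leaves the visual rendering unchanged, which is why the files still "worked fine"; what is lost is the machine-readable distinction between, say, applying a function to x and multiplying by x.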

Why is there both setf/setb and setaf/setab in tput?

I'm trying to use tput to set foreground and background colors in my terminal in a device independent way.
If the whole purpose of termcap/terminfo/tput is to be device independent, why are there both versions that explicitly use ANSI controls (setaf/setab) and versions that do not (should not)?
This discussion quotes terminfo(5), which in turn quotes standards that explicitly say that those are to be implemented using ANSI and non-ANSI controls, respectively.
Why isn't there just setf/setb and they always set the foreground and background colors? I don't care how it's done, that's why I use tput!
Why isn't there just setf/setb and they always set the foreground and background colors
are actually two questions!
The first part, why there are ANSI and non-ANSI terminal commands, would take too long to explain, and it's unnecessary as the history is explained quite well on Wikipedia.
The second part could perhaps be freely rephrased as "what's the difference?" or "what can I do about it?".
Difference:
ANSI-type terminals use a different mapping between colour numbers and colours than non-ANSI terminals. For example, the code for yellow on one would be cyan on the other. There are simply two different mapping tables. These things are described quite well on Wikipedia.
What you can do about it:
Discover which type of terminal you have, and use the corresponding command.
Or modify your termcap.
None of these solutions are fully generic though, unfortunately.
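
One way to see which family your terminal description actually defines, and how the two mappings differ, is a quick probe through Python's curses binding (a read-only sketch; run it in the terminal you care about):

import curses

curses.setupterm()  # load the terminfo entry for the current $TERM

# setaf/setab are the ANSI variants, setf/setb the legacy ones; an entry may
# define one family, the other, or both. Colour number 3 is yellow in the
# ANSI table but cyan in the legacy table, which is the mapping difference above.
for cap in ("setaf", "setab", "setf", "setb"):
    seq = curses.tigetstr(cap)
    print(cap, "->", None if not seq else curses.tparm(seq, 3))

If only one family prints an escape sequence, that is the one your terminal description supports.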

How does the 68000 internally represent instructions?

How does the 68000 internally represent instructions?
I've read that there are different instruction formats: the single effective address operation word format, and the brief and full extension word formats. The single effective address operation word seems to represent the instruction, with the lower 6 bits of this word giving the addressing mode and register. Does this addressing mode and register tell you whether a brief or full extension word follows, which in turn represents the operands for the instruction? Do you know a better manual than the 68000 Programmer's Reference Manual?
Thanks in advance
The actual internal representation is a combination of "microcode" and "nanocode". The 68000 has 544 17-bit microcode words, which dispatch to 366 68-bit nanocode words.
While this may not be what you wanted to know, this link may provide some insights:
http://www.easy68k.com/paulrsm/doc/dpbm68k1.htm
Right; on the m68000, indexed modes use the brief extension word. In "Address Register Indirect with Index (8-Bit Displacement) Mode" (d8,An,Xn), the BEW is filled with D/A (whether Xn is a data or address register), Xn (the register number), W/L (to treat the Xn contents as 16 or 32 bits), scale set to 0 (see note), and the 8-bit displacement.
On the other hand, in other modes, like the 16-bit displacement mode "Address Register Indirect with Displacement" (d16,An), the extension is only a word holding the displacement.
The note: in the brief extension word the m68000 doesn't support the 2 bits for scale, so they are set to 0; scaling via the scale bits of the BEW, and full extension words, are only supported on m68020 and later CPUs. http://etd.dtu.dk/thesis/264182/bac10_19.pdf
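
For illustration only, here is a rough Python sketch (not a disassembler) of pulling the effective-address mode and register fields out of the low 6 bits of an operation word and deciding whether extension words follow; the field layout is the one described in the 68000 Programmer's Reference Manual, but the helper name is made up for this example.

def decode_ea(opword):
    """Return (mode, register, needs_extension) for the low 6 bits of an opword."""
    mode = (opword >> 3) & 0b111   # bits 5-3: addressing mode
    reg = opword & 0b111           # bits 2-0: register number (sub-mode when mode == 7)
    # (d16,An), (d8,An,Xn), absolute, PC-relative and immediate operands all
    # need one or more extension words after the operation word.
    needs_extension = mode in (5, 6) or (mode == 7 and reg in (0, 1, 2, 3, 4))
    return mode, reg, needs_extension

# 0x3029 encodes MOVE.W (d16,A1),D0: source mode 5, so a 16-bit displacement
# word follows the operation word.
print(decode_ea(0x3029))  # (5, 1, True)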

Should I use utf-8 encoding for an online course?

Hello, this is my question:
I am currently working on an introductory course on R programming for people with zero background in programming (people studying biology, veterinary science, medicine, economics, ...), so they tend not to be very tech savvy and to use Windows. After they download and open the R scripts that I prepared, they are going to find badly encoded characters every now and then (as the course is in Spanish and has many accents). This happens because my scripts are saved with UTF-8 encoding, which is not supported by default on Windows.
The options to avoid this nuisance are:
change all my scripts to the encoding WINDOWS-1252
instruct everyone to change their encoding to UTF-8
The first option is more annoying for me, but it prevents the students from being distracted by a quite minor detail.
The second option has no clear advantages from the pedagogic point of view, so I'd like to ask what virtues you think it has...
Thanks in advance!
I would highly recommend instructing them to change their encoding to UTF-8. I've had the same issue on numerous occasions with web-app scripting, and generally speaking it's a lot more hassle to go through the code than to instruct the customer (or, in your case, the student) to use the UTF-8 encoding.
After all, the course you're giving is an introductory course; you might want to consider briefly covering the topic and explaining the differences between the two encodings, and more specifically: what happens when it doesn't work?
You have a golden opportunity to save yourself some time later down the road, and possibly avoid the "Why are there question marks all over my screen?" question altogether!
Maybe you can avoid non-ASCII characters in your scripts. For example, to represent the Greek "mu" character, you could use
> mu <- "\u03BC"
> Encoding(mu) <- "UTF-8"
> mu
[1] "μ"
Now if you print mu on the console, it is displayed correctly. In the script, you did not use any non-ASCII character at all.
