I'm working on a PDF to HTML project. In the original .ai file, some numeric characters are displayed in a box:
Although I know the font used in the file is GothicMB101Pro DeBold-83pv-RKSJ-H, I don't have the font file on my machine (and of course the original designer is long gone). In my copy of Illustrator, it appears like this:
The 1) part is a single character, not "1" followed by ")", so at least I know it's not some form of kerning but a single Unicode character. But I couldn't find any match in my search. The "enclosed numeric" characters such as ① aren't the same.
Since I'm not sure which character it is, and I'm not very knowledgeable in Japanese (where this seems to be a very common occurrence), I can't satisfy my client's requirement.
What are those characters and how do I get them onscreen?
My guess is that, since the output you see without the original font installed consists of two characters, the original also consisted of two characters: a regular one (here, the digit 1) followed by a combining character. There is a combining enclosing square (U+20DE), and that is probably what is rendered as the closing parenthesis ")" in your output. Using the digit 1 followed by the enclosing square gives me the required result (at least in my browser, in the Stack Overflow answer editor), as shown below:
1⃞
If the enclosing square does not render, the fault probably lies with the font used as a fallback. Without knowing exactly which font is used as the replacement, it is hard to say whether the issue can be worked around.
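A minimal Python sketch of the idea above, assuming the combining character in question really is U+20DE (COMBINING ENCLOSING SQUARE). The helper names `boxed` and `as_html_entities` are made up for illustration; the second one shows one way the sequence could be written into HTML as numeric character references. Whether the box actually renders still depends on the font the browser picks.

```python
import unicodedata

# Assumption: the box is produced by U+20DE, COMBINING ENCLOSING SQUARE.
ENCLOSING_SQUARE = "\u20de"


def boxed(digit: str) -> str:
    """Return the digit followed by the combining enclosing square."""
    return digit + ENCLOSING_SQUARE


def as_html_entities(text: str) -> str:
    """Write every non-ASCII code point as a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):04X};" for ch in text)


sample = boxed("1")
print(sample)                                    # two code points, one visual unit
print([unicodedata.name(ch) for ch in sample])   # names of both code points
print(as_html_entities(sample))                  # form that can be pasted into HTML
```

If the source text turns out to use a different code point, inspecting it with `unicodedata.name` in the same way should reveal which one it is.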
I'm trying to assign the result of a chained matrix multiplication in Maxima to a new variable. As a new user, I'm not sure why line %o6 isn't the same as the previous one and doesn't fully evaluate the chain. Also, why is it that when I enter the new variable name "B", I simply get "B" back and not ([32, 32], [32, 32])? These are basic questions, I know, but I've searched the documentation and tutorials for hours, and the syntax I'm supposed to use to get the output I was expecting is still unclear to me.
I can't tell for sure, but it appears that the problem is that B : A.A.A is entered holding the shift key for at least one of the spaces, and Shift+Space is interpreted as non-breaking space instead of ordinary space. This appears to be a known bug or at least a serious misfeature in wxMaxima; see: https://github.com/wxMaxima-developers/wxmaxima/issues/1031
(I say misfeature because Shift+Space --> non-breaking space is documented in the wxMaxima documentation, but it seems like a classic example of "bad affordance"; it is all too easy to do the wrong thing without knowing it. Anyway this is just my opinion.)
I built wxMaxima from current source code and it appears that Shift+Space is now not interpreted as non-breaking space in code, so B : A.A.A should have the expected effect even if shift key is held while typing space. The current version is 19.07.0-DevelopmentSnapshot. I poked through the commit log a bit, but I can't figure out which commit changed the behavior of Shift+Space, so it's possible that the problem is not fixed and it is just fortuitous that I am not encountering it.
There are two workarounds, if one doesn't want to hazard an upgrade. (1) Omit spaces. (2) Be careful to only type space without shift.
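To check whether a pasted input line actually contains a non-breaking space, something like the following Python sketch could be used. It is not part of Maxima or wxMaxima; the helper names `find_nbsp` and `normalize_spaces` are hypothetical, and the sample string only imitates what Shift+Space might have inserted.

```python
from typing import List

NBSP = "\u00a0"  # NO-BREAK SPACE, looks identical to an ordinary space


def find_nbsp(code: str) -> List[int]:
    """Return the index of every non-breaking space in the input text."""
    return [i for i, ch in enumerate(code) if ch == NBSP]


def normalize_spaces(code: str) -> str:
    """Replace non-breaking spaces with ordinary spaces."""
    return code.replace(NBSP, " ")


# Assumed example: the space after "B" was typed with Shift held down.
typed = "B" + NBSP + ": A.A.A"
print(find_nbsp(typed))
print(normalize_spaces(typed))
```

An empty list from `find_nbsp` would suggest that the problem lies elsewhere.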
Hope this is helpful in some way.
I've been playing around with Google's OCR recently using the default tutorial and was trying to parse numbers. I've seen previous issues dealing with numbers on license plates, but I was wondering whether there is a solution when special characters affect the OCR results. Most notably, including the '#' character with a number (#1, #2, etc., as shown below) results in the output ##Z#T#, and occasionally even gives me Chinese characters, even after I set the to/from language settings to English.
Numbers with pound sign
For a similar comparison, the image below is easily read by the OCR:
Numbers without pound sign
Is there a setting I'm missing that can improve the results, or is this just a limitation of the model?
I've tried to follow a tutorial on adding a comment rule to Beyond Compare, but I am still unable to mark commented lines as unimportant differences. I would like to compare R files. This is how I configured the grammar rules.
If possible, I would like to ignore a commented line only if the content of the line is otherwise equal. In other words, if the two lines would still differ after removing the comment symbol, I would like them to be marked as important differences.
Here is the actual result of the comparison. Strangely, when there are two comment symbols (#), the line appears as a minor difference.
Beyond Compare doesn't support what you're trying to do. The comparison for each character checks both the character itself and the grammar type of the element. For example, comparing an identifier to a string will always show the characters as completely different even if the strings themselves are identical.
In your example, since they're different grammar types, every character is considered a difference. On the left they're comments, so unimportant and normally drawn as blue differences, but you're ignoring unimportant differences so they're shown as matching/black instead. On the right, they're important text, so they're drawn as red differences.
The lines that are comments on both sides are showing as matching because (A) they're all the same character and grammar type, so, aside from the # leading character, they are treated as matches, and (B) you're ignoring unimportant differences. (B) means that you could actually have anything for the content of the comments on each side and it would still show up as matching.
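The rule described above can be pictured with a toy Python model. This is only an illustration of "compare the character and its grammar type together", not Beyond Compare's actual implementation; the names `tagged` and `count_mismatches` and the type labels "comment" and "code" are invented for the sketch.

```python
def tagged(text: str, grammar_type: str) -> list:
    """Pair every character with the grammar type of the element it belongs to."""
    return [(ch, grammar_type) for ch in text]


def count_mismatches(left: list, right: list) -> int:
    """Count positions where the character or its grammar type differs."""
    return sum(1 for a, b in zip(left, right) if a != b)


same_text = "x <- 1"

# Identical text, but a comment on one side and code on the other:
# every position differs because the grammar types differ.
print(count_mismatches(tagged(same_text, "comment"), tagged(same_text, "code")))

# Identical text with the same grammar type on both sides: no differences.
print(count_mismatches(tagged(same_text, "comment"), tagged(same_text, "comment")))
```

In this model, identical text never matches across grammar types, which is why a commented-out line and its uncommented twin show up as fully different.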
The base R function factor() interprets character elements consisting of blank space as valid factor elements instead of NA. What is the benefit of interpreting blank space character elements like this? Is it a legacy feature that is kept as it is to maintain compatibility?
Example:
factor(c("a","a","","b"))
I realize that this isn't an ordinary problem that can be solved with a reproducible example as a starting point, but I decided to give it a try anyway. The design decision to have factor() interpret blank space character elements like this confounds me. It seems to me that it would simplify things with no clear disadvantages to interpret these elements as NA instead.
What is the benefit of interpreting blank space character elements like this?
Because empty string data usually means “this is an empty string”, and not “this is missing data”.
It depends on the usage, of course: an empty “name” field is most likely missing data. But an empty “title” field is just that: no title. How else would you encode the lack of a title (assuming “Mr” and “Mrs” have a separate field, which may not be the case)?
For factors, having empty labels makes less sense. However, R tends to convert strings to factors quite liberally (especially when reading tabular data from files), and treating all those empty values as NA would cause a lot of mis-annotated data. In general, such implicit conversions should always be lossless, i.e. preserve the whole domain of values being converted.
I'm supposed to make some modifications to a PHP web site which uses a font with Arabic-style numbers.
I've been asked to switch the number style (language) to the English style using the same font. Is that achievable?
Arabic (red) & English (green) numbering:
In principle, it is possible to create a font that has alternate glyphs for Arabic digits, selectable with OpenType font features and looking like common (European) digits. However, I do not know any such font, and such an approach would be odd on several accounts. The Arabic digits have been encoded as separate characters, and treating the difference between them and common digits as merely a glyph difference would deviate from normal reasonable practices.
Thus, the change, if desired, should be made at the character level. The details depend on the context, but the principle is simple: common digits are U+0030...U+0039 and Arabic digits are U+0660...U+0669, both in numeric order, so at the character code level it is simply a matter of adding or subtracting a constant.
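The constant-offset idea can be sketched as follows. The site in question is PHP, so this Python version is only meant to show the arithmetic; the function names are made up for the example, and only the range U+0660...U+0669 mentioned above is handled.

```python
ARABIC_ZERO = 0x0660   # first of the Arabic digits U+0660...U+0669
COMMON_ZERO = 0x0030   # first of the common digits U+0030...U+0039
OFFSET = ARABIC_ZERO - COMMON_ZERO


def to_common_digits(text: str) -> str:
    """Map Arabic digits to common digits; leave other characters alone."""
    return "".join(
        chr(ord(ch) - OFFSET) if ARABIC_ZERO <= ord(ch) <= ARABIC_ZERO + 9 else ch
        for ch in text
    )


def to_arabic_digits(text: str) -> str:
    """Map common digits to Arabic digits; leave other characters alone."""
    return "".join(
        chr(ord(ch) + OFFSET) if COMMON_ZERO <= ord(ch) <= COMMON_ZERO + 9 else ch
        for ch in text
    )


price = "\u0661\u0662\u0663"     # Arabic digits one, two, three
print(to_common_digits(price))
print(to_arabic_digits("2024"))
```

Because both digit ranges are in numeric order, the same offset works for every digit and the two functions undo each other.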