I am using the Chunk Five font on my website via @font-face in CSS. When I use the original font files in Photoshop, the Turkish characters display fine, but after converting the font for @font-face use, the Turkish characters are not displayed. I have shared a screenshot below;
I've tried converting to different font-face formats. I also tried converting with the subsetting support enabled and checked the Turkish option, and I entered ş,Ş,İ,ı,ğ,Ğ,ü,Ü,Ç,ç,Ö,ö in the converter's Single Characters field. Unfortunately, it hasn't worked for me. How can I fix this problem?
Thanks and Regards.
You can try converting the fonts with other sources; that should fix it. I know this is an old post, but maybe it helps you.
Some sources:
http://convertfonts.com/
https://www.web-font-generator.com/
http://www.flaticon.com/font-face
https://fontie.flowyapps.com/home
I have experienced the exact same problem when dealing with font conversions for Turkish. At first I ran a conversion using FontSquirrel's tool (available here), but it turned out the conversion was stripping these much-needed characters for the Turkish language.
One of the references from @Karmacoma's answer, Fontie, was very interesting and did the trick for me, because it offers advanced options, which give us more control over the conversion process.
In order to cover the special characters in Turkish, you must use Switch to advanced view and run the conversion with Latin Extended-A enabled.
I went to Wikipedia for the list of characters covered by Latin Extended-A; you can find them here.
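To see why Latin Extended-A is the block that matters: the Turkish-specific letters straddle two Unicode blocks, so a subset that stops at Latin-1 keeps ü/Ç/ö but drops ş/İ/ğ. A small sketch (TypeScript, purely for inspection) that labels each letter from the question with its block:

// Rough block labels: Basic Latin U+0000-007F, Latin-1 Supplement U+0080-00FF,
// Latin Extended-A U+0100-017F.
const block = (cp: number) =>
  cp < 0x80 ? 'Basic Latin'
  : cp <= 0xff ? 'Latin-1 Supplement'
  : cp <= 0x17f ? 'Latin Extended-A'
  : 'other';

for (const ch of 'şŞİığĞüÜÇçÖö') {
  const cp = ch.codePointAt(0)!;
  console.log(ch, 'U+' + cp.toString(16).toUpperCase().padStart(4, '0'), block(cp));
}

Running it shows ş, Ş, İ, ı, ğ and Ğ in Latin Extended-A and the rest in Latin-1 Supplement, which matches the behaviour described above.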
I have created a font subset for the two fonts I use.
But if I open the browser and inspect a given H1 tag which should use only this font, it shows that 2 fonts are used, because one character is taken from the fallback font Open Sans:
The exact HTML tag:
<strong class="headline1">Carservice Meisterwerkstatt</strong>
The CSS which is used (BTW: PT Sans uses the same font subsetting, so the next fallback for those 5 glyphs is Open Sans):
To determine the subset I ran glyphhanger http://localhost:3000 and added its output as the whitelist to the following command:
glyphhanger --whitelist=U+A,U+20-23,U+25-29,U+2B-3B,U+3F-57,U+59,U+5A,U+5F,U+61-7D,U+A9,U+C4,U+D6,U+DC,U+E4,U+F6,U+FC,U+F002,U+F017,U+F0F1,U+F2B5,U+F2DC,U+F46D,U+F500,U+F530,U+F5E1,U+F63B,U+F7D9 --subset=Dosis-VariableFont_wght.ttf
What I am searching for is a way to figure out which 5 glyphs are taken from Open Sans. Is there a way to get this in the dev console?
For testing purposes, I've changed the fallback to another font face, to see immediately when another font is used as fallback. But as you can see, even with Alfredo as the fallback it is not visible which 5 glyphs are using it.
I've now tried removing each single character from the tag's content in the dev console and checking when the font mixing appears. I figured out that it appears only if I have 2 characters with a whitespace in between: r M
But if I enter only a character (or word) with a whitespace before or after it, it doesn't happen; neither with " M" nor with "M ".
I found that there is more than one kind of space character. There are many (see https://emptycharacter.com/, under the topic "Unicode empty characters").
So it seems the issue, at least, is that the font subset doesn't include the needed code points.
If anybody knows how to easily figure out which exact code points the browser requests from the font, you are very welcome to post it here as a comment.
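One way to narrow it down in the dev console is to dump every code point in the headline's text: a look-alike space such as the non-breaking space U+00A0 stands out immediately, and note that the whitelist above contains U+20 but not U+A0, so a non-breaking space would have to fall back. A rough sketch (TypeScript; .headline1 is the class from the tag above):

// Log every character of the headline together with its code point, so that
// look-alike whitespace (U+00A0, U+2009, ...) missing from the subset shows up.
const el = document.querySelector('.headline1');
for (const ch of el?.textContent ?? '') {
  const cp = ch.codePointAt(0)!;
  console.log(JSON.stringify(ch), 'U+' + cp.toString(16).toUpperCase().padStart(4, '0'));
}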
I am working on a project which requires converting PDF to text. The PDF contains Hindi (in the Mangal font, to be specific) along with English.
100% of the English is converted to text. The conversion of the Hindi part is around 95% accurate; the remaining 5% of the Hindi text either comes out blank or as junk like " ा". I could figure out that the characters with combining vowel signs are not getting converted to text properly.
I am using the following command:
pdftotext -enc UTF-8 pdfname.pdf textname.txt
The PDF uses the following fonts:
name                 type          emb  sub  uni
ZDPKEY+Mangal        CID TrueType  yes  yes  yes
Mangal               TrueType      no   no   no
Helvetica-Bold       Type 1        no   no   no
CODUBM+Mangal-Bold   CID TrueType  yes  yes  yes
Mangal-Bold          TrueType      no   no   no
Times-Roman          Type 1        no   no   no
Helvetica            Type 1        no   no   no
The following is the result of the conversion; the left side is the original PDF, the right side is the text opened in Notepad:
http://preview.tinyurl.com/qbxud9o
My question is whether the 5% missing/junk characters can be correctly captured as text with open-source packages. I would appreciate your inputs!
Change your command to:
pdftotext -enc "UTF-8" pdfname.pdf textname.txt
This has worked for me; it should similarly work for you.
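To measure how much of the Hindi actually survives the conversion, one rough heuristic is to scan the output for "dangling" vowel signs: a Devanagari matra such as ा (U+093E) must follow a consonant, so one that follows whitespace or starts a line marks a broken spot, exactly like the " ा" junk above. A hypothetical sketch (TypeScript; the file name is taken from the command above):

// Flag Devanagari dependent vowel signs (U+093E-U+094C, U+0962-U+0963) that
// appear right after whitespace or at the start of a line.
import { readFileSync } from 'fs';

const text = readFileSync('textname.txt', 'utf-8');
const danglingMatra = /(^|\s)([\u093E-\u094C\u0962\u0963])/gmu;

for (const m of text.matchAll(danglingMatra)) {
  const cp = m[2].codePointAt(0)!.toString(16).toUpperCase();
  console.log(`dangling vowel sign U+${cp} at index ${m.index}`);
}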
I am new to the world of Adobe InDesign and the IDML file format. I am trying to understand the IDML file format so that I can create IDML files dynamically through code!
I am going through the IDML file format specification and have found references to "Mojikumi Tables", "Kinsoku Tables" and "Aki". Though the documentation defines various attributes for these elements, there's no clear explanation of what these elements actually are.
Any pointers or links to relevant articles would be really helpful.
Thanks.
These are all additional typography settings used in laying out Japanese text.
Kinsoku: A rule set in the Japanese language that is used to determine characters that are not permitted at the beginning or end of a line. Reference.
Mojikumi: Determines spacing between punctuation, symbols, numbers, and other character classes in Japanese type. Reference.
Aki: Means space in Japanese:
"When the glyphs that correspond to characters of different character
classes come together in a run of text, there is spacing behaviour. In
other words, extra space, measured using a fraction of an em, is
introduced depending on which two character classes are in proximity*.
Typical values are one-fourth and one-half of an em"
(Footnote: * 'In Japanese this space is referred to as aki, which simply means
"space"')
Reference and source for this quote.
Here's a link to a book that should provide more information: CJKV Information Processing, 2nd Edition
I want to support German, French & Spanish characters in a particular field of my website. I need a regex for this. Presently I am using:
^[\w\s-\+\$\*\.\?\:\;\!\,"'\%\&\/\(\)\#\#«»£°¿¡_ÀÂÆÇÈÉÊËÎÏÔŒÙÛÜàâæçèéêëîïôœùûüÄÖäößÁÍÑÓÚáíñóú\u201E\u201C\u201D\u20AC]{1,255}$
This regex basically uses all the char set from the 3 languages I mentioned.
Is there a neat way to avoid this lengthy regex? I tried a /p{L}/p{Z} regex; however, this didn't work.
My website is in ASP.NET.
/p{L}/p{Z} is wrong; it should be \p{L}\p{Z}.
All the letters like "ÀÂÆÇÈ" shouldn't be needed; they are all included in \w in .NET!
You don't need most of the escaping inside a character class.
You can't write something like " in a character class; the only thing that happens is that every single character of it is added to the class.
This should be quite similar to what you used:
^[-\p{L}\p{N}\p{P}\p{Z}_+$*%&/##«»£°\u201E\u201C\u201D\u20AC]{1,255}$
I haven't checked those Unicode code points at the end of the class; I don't know if they are needed or not.
For an explanation of all the \p{...} items see Unicode Regular Expressions on regular-expressions.info
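The question is about .NET, but the same Unicode-category shorthands work in most modern engines, so the pattern is easy to sanity-check outside ASP.NET too. A quick sketch in TypeScript (the test strings are my own examples, not from the question):

// Same Unicode-category character class, checked with JavaScript's `u` flag.
const field = /^[-\p{L}\p{N}\p{P}\p{Z}_+$*%&\/#«»£°\u201E\u201C\u201D\u20AC]{1,255}$/u;

for (const s of ['Straße, München!', "L'été à Paris (1998)", '¿Cómo estás? ¡Bien!']) {
  console.log(s, '=>', field.test(s));  // all three should print true
}

In .NET the equivalent check would be Regex.IsMatch(input, pattern) with the same character class, since .NET's \p{...} uses the same Unicode categories.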
I'm trying to parse a file that looks sort of hex encoded, but mostly not. I contacted support for the vendor who created the file and they said that it can be parsed using "an 0x116 offset".
What is a 0x116 offset?
It took me 2 weeks to get an answer from the vendor on my first question, so I wanted to see if someone here could help me make sense of it. Thank you!
"0x116 offset" means nothing. It could be a value that needs to be added to words or subtracted to remove some naive encoding, or anything else for that matter.
Could you post a part of the file? Is it binary or text? Could you define "mostly not"?
What vendor/software package/device does this file come from?