I am using Visual FoxPro 9 and want to print Unicode characters in a report (FRX).
I understand the report listener can be extended to show Unicode. I need code that extends ReportListener to do this.
I've never had to work with Unicode within VFP either, or spent any time working with Reports, but the Help for the Render method of the ReportListener does mention Unicode:
cContentsToBeRendered
Indicates the text to be rendered for Expression (Field) and Label layout elements.
For Picture layout elements sourced from a file, cContentsToBeRendered contains the filename.
When specifying a filename for an image, ReportListener provides cContentsToBeRendered
as a DBCS string, which is the standard format for strings in Visual FoxPro.
However, when indicating text to be rendered, ReportListener provides
cContentsToBeRendered as a Unicode string, appropriately translated to the correct
locale using any regional script information associated with this layout control in
its report definition file (frx) record.
If your derived class sends the text value through some additional processing, such as
storage in a table, you can use the STRCONV() function, and its optional regional
script parameter, to convert the string to DBCS first. For more information, see
STRCONV( ) Function.
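To make the quoted conversion a little more concrete, here is a rough Python sketch (not VFP code) of what going from a Unicode string to a single-byte code-page string involves. It assumes the Unicode form is UTF-16LE and picks Windows-1251 (Cyrillic) as the regional script purely for illustration; the helper name is hypothetical.

```python
def unicode_bytes_to_codepage(utf16le_bytes: bytes, codepage: str) -> bytes:
    """Decode UTF-16LE bytes and re-encode them in a single-byte code page."""
    text = utf16le_bytes.decode("utf-16-le")
    # errors="replace" substitutes "?" for characters the code page lacks
    return text.encode(codepage, errors="replace")

# Assumed input: a Unicode (UTF-16LE) string, two bytes per character here
rendered = "Привет".encode("utf-16-le")
stored = unicode_bytes_to_codepage(rendered, "cp1251")

print(len(rendered), len(stored))
print(stored.decode("cp1251"))
```

The choice of code page matters: the same bytes converted with the wrong regional script lose the characters that script cannot represent.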
I could be wrong, but I believe VFP does NOT support Unicode and only works with the base ASCII character set. Then again, I've never needed to use Unicode, and I have used FoxPro since the beginning of its lifetime.
I would imagine Rick Strahl's article "Using Unicode in Visual FoxPro Web and Desktop Applications" would be fairly definitive on the topic.
Is there any tag that tells the browser to simply print what is inside the tag, without caring about the syntax of what is inside the tag? I'm trying to print a few unicode characters, but the browser keeps giving errors, even if I paste the character directly inside of a pre tag, without using ampersands.
I'm trying to print © inside of a div tag, but putting that character inside of a div tag results in an "improperly formatted" error (the page doesn't even show up in Mozilla Firefox, and the sentence with the copyright symbol isn't printed in Microsoft Edge).
The page is being served as application/xhtml+xml.
Here is the code:
<footer>©</footer>
and here is the error:
XML Parsing Error: not well-formed Location: http://programcode.net/ Line Number 19, Column 13:
<footer></footer>
------------^
If I do this:
<footer><pre>© </pre></footer>
then the same error occurs:
XML Parsing Error: not well-formed Location: http://programcode.net/ Line Number 19, Column 18:
<footer><pre> </pre></footer>
-----------------^
I tried declaring utf-8 and utf-32 (in both the meta tag in the xhtml file, and .htaccess), but the error still occurred.
XHTML is awesome because it uses the XML parser which is extremely strict. When you have an error you know you have an error and that you need to fix it. I've seen a person spend three days trying to figure out why Safari wouldn't work but all the other browsers worked fine (he was missing a quote around an element's attribute).
What you need to do is encode HTML entities. There are a few websites that show you the full Unicode ranges and their characters. I recommend using https://unicode-table.com/en/ because it's less intimidating.
Once you're there, search for the copyright symbol.
Next you'll click the obvious symbol and you'll end up on the copyright page.
You're looking for the HTML-code (the proper terminology when speaking with other professionals is "numeric HTML entity"). Never use the loose "Entity" (&copy;); always use the numeric HTML entity (&#169;).
So your code should look like the following:
<footer>&#169;</footer>
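One plausible reading of the original error (an assumption, since the file's actual bytes are not shown) is that the © was saved as a single Latin-1 byte while the XML parser expected UTF-8. The Python sketch below feeds a few variants of the footer to a strict XML parser with no DTD; `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: bytes) -> bool:
    """Return True if a strict XML parser accepts the bytes."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

raw_latin1 = "<footer>©</footer>".encode("latin-1")  # © as the lone byte 0xA9
raw_utf8 = "<footer>©</footer>".encode("utf-8")      # © as the bytes 0xC2 0xA9
numeric = b"<footer>&#169;</footer>"                 # numeric character reference
named = b"<footer>&copy;</footer>"                   # named entity, no DTD loaded

for label, doc in [("latin-1", raw_latin1), ("utf-8", raw_utf8),
                   ("numeric", numeric), ("named", named)]:
    print(label, is_well_formed(doc))
```

The numeric reference is pure ASCII, so it parses no matter how the file was saved, which is why it is the robust choice here.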
XHTML, CSS and JavaScript handle HTML entities a bit differently.
For JavaScript escapes, start from the code point as listed (U+00A9), replace the uppercase 'U' with a lowercase 'u', remove the '+', and put a backslash in front. Here is an example that you can run from any browser's web developer console:
alert('Look at my \u00A9 date!');
Note that you must keep the double zeroes for the copyright symbol: \u must be followed by exactly four hex digits, so removing them will break the code.
For CSS Entities it's a little simpler:
h1::after {content: '\00A9'; display: block; float: left;}
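All of these forms are derived from the same code point. As a small sketch in Python (the function name is made up, and the JavaScript form shown only covers characters up to U+FFFF):

```python
def escape_forms(ch: str) -> dict:
    """Build the XHTML, JavaScript and CSS escapes for one character."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "xhtml_numeric": f"&#{cp};",      # decimal numeric character reference
        "xhtml_hex": f"&#x{cp:X};",       # hexadecimal numeric character reference
        "javascript": f"\\u{cp:04X}",     # four hex digits, BMP characters only
        "css": f"\\{cp:04X}",             # backslash followed by hex digits
    }

for name, form in escape_forms("©").items():
    print(f"{name}: {form}")
```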
Why is this so complex?
There are eight bits to a byte (one megabit a second is really only 125,000 bytes, or 125 kilobytes, a second). A single byte can hold only 256 distinct values, so many characters cannot be represented by a single byte. Unicode (the universal character set) has several encodings, but most websites are moving to UTF-8. Some languages (such as Chinese, to the best of my understanding) use a symbol for an entire word, so their "alphabet" is much longer. All these characters have to somehow be represented by bytes that you do not see. There is a big move to support UTF-8 natively everywhere (especially the web). Pretty much anything above character code 127 should be encoded when using XHTML. It may or may not work natively, and that is a more advanced topic for a different question. Hopefully this gives you enough insight to get moving and grooving though. 😊
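To see the byte counts for yourself, here is a short Python sketch; the sample characters are arbitrary picks.

```python
samples = "A©京😊"

# Number of UTF-8 bytes each character needs
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, length in utf8_lengths.items():
    print(f"U+{ord(ch):04X} needs {length} byte(s): {ch.encode('utf-8').hex()}")
```

Only the plain ASCII letter fits in one byte; everything above code 127 takes two to four bytes in UTF-8.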
Part of my source code displays English text that uses common text formatting characters (e.g. \n, as in "This is my\nstring"). When I call lupdate and open the .ts file in Qt Linguist, it correctly displays the formatted text in the source text preview space (that is, without the literal \n and the like).
The problem is that, when I translate the string and put the formatting characters in the translation and run the application with the translation file, my app reads the special characters as normal ones!
How can I overcome this problem? How should I enter the necessary text formatting so that translation doesn't introduce this kind of bug?
When you translate your app, press Enter wherever you see \n in the source text.
For example, if you see this string in the source code:
This is my\nstring
then in Linguist you should write:
This is my[press Enter here]
string.
When you run the app with this translation, you'll see that everything works.
I hope it helps.
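The reason, as far as I can tell, is that \n in source code is turned into a real line feed by the compiler, while text typed into the translation field is taken literally. A small Python sketch of the difference between the two (Python's raw string stands in for the literally typed text):

```python
escaped = r"This is my\nstring"   # backslash + 'n': two ordinary characters
actual = "This is my\nstring"     # one real line-feed character

print(len(escaped), len(actual))
print(escaped.splitlines())
print(actual.splitlines())
```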
Context: ASP.NET MVC running in IIS, with a UTF-8 %-encoded URL.
Using the standard project template, and a test-action in HomeController like:
public ActionResult Test(string id)
{
    return Content(id, "text/plain");
}
This works fine for most %-encoded UTF-8 routes, such as:
http://mydevserver/Home/Test/%e4%ba%ac%e9%83%bd%e5%bc%81
with the expected result 京都弁
However using the route:
http://mydevserver/Home/Test/%ee%93%bb
the url is not received correctly.
Aside: %ee%93%bb is %-encoded code-point 0xE4FB; basic-multilingual-plane, private-use area; but ultimately - a valid unicode code-point; you can verify this manually, or via:
string value = ((char) 0xE4FB).ToString();
string encoded = HttpUtility.UrlEncode(value); // %ee%93%bb
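The same check can be done outside .NET. A quick Python sketch using only the standard library:

```python
from urllib.parse import quote, unquote

value = chr(0xE4FB)                 # private-use code point from the question
encoded = quote(value)              # percent-encodes the UTF-8 bytes
decoded = unquote("%ee%93%bb")      # decodes as UTF-8 by default

print(encoded.lower())
print(decoded == value)
print(unquote("%e4%ba%ac%e9%83%bd%e5%bc%81"))
```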
Now, what happens next depends on the web-server; on the Visual Studio Development Server (aka cassini), the correct id is received - a string of length one, containing code-point 0xE4FB.
If, however, I do this in IIS or IIS Express, I get a different id, specifically "î“»", code-points: 0xEE, 0x201C, 0xBB. You will immediately recognise the first and last as the start and end of our percent-encoded string... so what happened in the middle?
Well:
code-point 0x93 is “ (source)
code-point 0x201c is “ (source)
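The observed string can be reproduced by decoding the three bytes as Windows-1252 instead of UTF-8. This Python sketch assumes Windows-1252 is the code page involved, which the question does not state:

```python
raw = bytes.fromhex("ee93bb")        # the percent-decoded bytes

misread = raw.decode("cp1252")       # each byte treated as one character
code_points = [hex(ord(c)) for c in misread]

print(misread)
print(code_points)
```

Under that assumption no special quote handling is needed to explain the result: byte 0x93 simply is the left double quotation mark in Windows-1252.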
It looks to me very much like IIS has performed some kind of quote-translation when processing my url. Now maybe this might have uses in a few scenarios (I don't know), but it is certainly a bad thing when it happens in the middle of a %-encoded UTF-8 block.
Note that HttpContext.Current.Request.RawUrl also shows this translation has occurred, so this does not look like an MVC bug; note also Darin's comment, highlighting that it works differently in the path vs query portion of the url.
So (two-parter):
is my analysis missing some important subtlety of unicode / url processing?
how do I fix it? (i.e. make it so that I receive the expected character)
id = Encoding.UTF8.GetString(Encoding.Default.GetBytes(id));
This will give you your original id.
IIS uses Default (ANSI) encoding for path characters. Your url encoded string is decoded using that and that is why you're getting a weird thing back.
To get the original id you can convert it back to bytes and get the string using utf8 encoding.
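For comparison, here is the same round trip sketched in Python, assuming the server's ANSI code page is Windows-1252. The function names are hypothetical, and the second one falls back to the input when the round trip is impossible, since a correctly received id must not be "repaired".

```python
def repair(misdecoded: str, ansi_codepage: str = "cp1252") -> str:
    """Re-encode with the ANSI code page, then decode the bytes as UTF-8."""
    return misdecoded.encode(ansi_codepage).decode("utf-8")

def repair_or_keep(value: str, ansi_codepage: str = "cp1252") -> str:
    """Like repair(), but return the input unchanged if it cannot be converted."""
    try:
        return repair(value, ansi_codepage)
    except UnicodeError:  # covers both the encode and the decode step
        return value

print(repair("î“»") == "\ue4fb")
print(repair_or_keep("京都弁"))
```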
See Unicode and ISAPI Filters
ISAPI Filter is an ANSI API - all values you can get/set using the API
must be ANSI. Yes, I know this is shocking; after all, it is 2006 and
everything nowadays are in Unicode... but remember that this API
originated more than a decade ago when barely anything was 32bit, much
less Unicode. Also, remember that the HTTP protocol which ISAPI
directly manipulates is in ANSI and not Unicode.
EDIT: Since you mentioned that it works with most other characters so I'm assuming that IIS has some sort of encoding detection mechanism which is failing in this case. As a workaround though you can prefix your id with this char and then you can easily detect if the problem occurred (if this char is missing). Not a very ideal solution but it will work. You can then write your custom model binder and a wrapper class in ASP.NET MVC to make your consumption code cleaner.
Once Upon A Time, URLs themselves were not in UTF-8. They were in the ANSI code page. This facilitates the fact that they often are used to select, well, pathnames in the server's file system. In ancient times, IE had an option to tell whether you wanted to send UTF-8 URLs or not.
Perhaps buried in the bowels of the IIS config there is a place to specify the URL encoding, and perhaps not.
Ultimately, to get around this, I had to use request.ServerVariables["HTTP_URL"] and some manual parsing, with a bunch of error-handling fallbacks (additionally compensating for some related glitches in Uri). Not great, but only affects a tiny minority of awkward requests.
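The general shape of that manual parsing, sketched in Python rather than C#: take the still percent-encoded URL, pull out the last path segment, and decode it as UTF-8 with a fallback. The function name and the fallback to Windows-1252 are assumptions for illustration, not the code actually used.

```python
from urllib.parse import unquote, urlsplit

def last_segment(raw_url: str) -> str:
    """Decode the final path segment of a still percent-encoded URL."""
    path = urlsplit(raw_url).path
    segment = path.rsplit("/", 1)[-1]
    try:
        # Strict UTF-8 first: invalid byte sequences raise instead of guessing
        return unquote(segment, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Assumed fallback for legacy clients that sent ANSI bytes
        return unquote(segment, encoding="cp1252", errors="replace")

print(last_segment("/Home/Test/%ee%93%bb") == "\ue4fb")
print(last_segment("http://mydevserver/Home/Test/%e4%ba%ac%e9%83%bd%e5%bc%81"))
```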
How do I add an html entity to my CSV?
I have an ASP.NET / SQL Server application that generates HTML, Excel, and CSV files. Some of the data needs to have the &Dagger; entity (‡) in it. How do I get it to output to my CSV correctly? If I have it like this: ‡, then it gets screwed up, but if I output it with the entity code, the CSV outputs that text.
Non-printable characters in a field are sometimes escaped using one of several C-style character escape sequences: \### and \o### Octal, \x## Hex, \d### Decimal, and \u#### Unicode.
So just escape your non-ascii character C#-style and you'll be fine.
I'm not sure what you mean by "it gets screwed up".
Regardless, it is up to the receiving program or application to properly interpret the characters.
What this means is that if you put &Dagger; in your CSV file then the application that opens the CSV will have to look for those entities and understand what to do with them. For example, the opening application would have to run an HTML entity decoder in order to properly display it.
If you are looking at the CSV file with notepad (for example) then of course it won't decode the entities because notepad has no clue what html entities are or even what to do when it finds them.
Even Internet Explorer wouldn't convert the entities for display when opening a CSV file. Now if you gave it a .html extension then IE would handle the display of the file with its HTML rendering engine.
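Since the consumer will not decode entities, one option is to decode them before writing and emit the real character in a declared encoding. The sketch below is Python rather than ASP.NET, the function name is hypothetical, and it assumes the consumer honours a UTF-8 byte order mark (many spreadsheet programs do, but check yours).

```python
import csv
import html
import io

def rows_to_csv_bytes(rows):
    """Decode HTML entities in every cell and return UTF-8 CSV bytes with a BOM."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([html.unescape(cell) for cell in row])
    # "utf-8-sig" prepends the BOM bytes EF BB BF
    return buffer.getvalue().encode("utf-8-sig")

data = rows_to_csv_bytes([["note", "&Dagger; see footnote"]])
print(data)
```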
I'm building an automated RSS feed in ASP.NET and occurrences of apostrophes and hyphens are rendering very strangely:
"Here's a test" is rendering as "Hereâ€™s a test"
I have managed to circumvent a similar problem with the pound sign (£) by escaping the ampersand and building the HTML escape for £ manually, as shown in the extract below:
sArticleSummary = sArticleSummary.Replace("£", "&amp;pound;")
But the following attempt is failing to resolve the apostrophe issue; we still get â€™ on the screen.
sArticleSummary = sArticleSummary.Replace("’", "&amp;rsquo;")
The string in the database (SQL2005) for all intents and purposes appears to be plain text - can anyone advise why what seem to be plain text strings keep coming out in this manner, and if anyone has any ideas as to how to resolve the apostrophe issue that'd be appreciated.
Thanks for your help.
[EDIT]
Further to Vladimir's help, it now looks as though the problem is that somewhere between the database and it being loaded into the string var the data is converting from an apostrophe to â€™ - has anyone seen this happen before or have any pointers?
Thanks
I would guess the column in your SQL 2005 database is defined as a varchar(N), char(N) or text. If so, the conversion is due to the database driver using a different code page setting to that set in the database.
I would recommend changing this column (and any others that may contain non-ASCII data) to nvarchar(N), nchar(N) or nvarchar(max) respectively, which can then contain any Unicode code point, not just those defined by the code page.
All of my databases now use nvarchar/nchar exclusively to avoid these types of encoding issues. The Unicode fields use twice as much storage space but there'll be very little performance difference if you use this technique (the SQL engine uses Unicode internally).
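As a rough illustration of how a code-page mismatch produces this kind of output, and of the storage trade-off, here is a Python sketch. The actual code pages in the asker's setup are unknown; UTF-8 bytes misread as Windows-1252 is assumed here because it is a common combination.

```python
text = "Here’s a test"   # curly apostrophe, U+2019

# UTF-8 bytes misread as Windows-1252: one character becomes three
mojibake = text.encode("utf-8").decode("cp1252")
pound = "£".encode("utf-8").decode("cp1252")

# Storage: one byte per character in a code page, two in UTF-16
single_byte = len(text.encode("cp1252"))
double_byte = len(text.encode("utf-16-le"))

print(mojibake)
print(pound)
print(single_byte, double_byte)
```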
Transpires that the data (whilst showing in SQLServer plain) is actually carrying some MS Word special characters.
Assuming you get Unicode characters from the database, the easiest way is to let System.Xml.dll take care of the conversion for you by building the RSS feed with an XmlDocument object. (I'm not sure about the exact elements found in an RSS feed.)
XmlDocument rss = new XmlDocument();
rss.LoadXml("<?xml version='1.0'?><rss />");
XmlElement element = rss.DocumentElement.AppendChild(rss.CreateElement("item")) as XmlElement;
element.InnerText = sArticleSummary;
or with Linq.Xml:
XDocument rss = new XDocument(
    new XElement("rss",
        new XElement("item", sArticleSummary)
    )
);
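The same idea, letting an XML library do the escaping instead of calling Replace by hand, looks like this in Python. It is only a sketch of the principle: the element names are placeholders rather than a valid RSS structure, and the sample text is made up.

```python
import xml.etree.ElementTree as ET

summary = "Here’s a test: £5 & up"

rss = ET.Element("rss")
item = ET.SubElement(rss, "item")
item.text = summary   # assign raw text; the serializer escapes it

# ASCII output forces every non-ASCII character into a numeric reference
ascii_xml = ET.tostring(rss, encoding="us-ascii")
print(ascii_xml)

# Parsing the result back yields the original text
round_trip = ET.fromstring(ascii_xml).find("item").text
print(round_trip == summary)
```

Because the output is pure ASCII, it survives being served or stored under the wrong encoding label.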
I would just put "Here's a test" into a CDATA tag. Easy and it works.
<![CDATA[Here's a test]]>