I have been pushing to finish off this project since 1 week and have been stuck with one problem that god knows how I missed. We have a sql server database that has 2 tables (of concern here).
Table 1 (id,langCode,englishText,translatedTextInUnicode{nvarchar(MAX)})
Table 2 (id,langCode,englishText,translatedTextInUtf-8{nvarchar(MAX)})
Application is layered app like db -> linq-sql ->wcf -> ASP.net 3.5
I have tried many things like using <meta> tag, using <globalization> tag in web.config but to no avail. Really struggling to show utf-characters in general and both when used with an accordion control on aspx pages. All I get is garbage text.
Need some help!
Besides the usual link to Joel's seminal article about The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), I can only strongly recommend to use the same encoding throughout the whole database.
Related
Here’s the basic question…
I have a long HTML document (a contract with 100+ pages) that ultimately needs to be a PDF document with headers and footers (page numbers). What is the best tool/language for making this happen?
Here’s the back story…
I work at a satellite office for a low-tech construction company that issues contracts to subcontractors, and because I am the only one who is able to unjam the printer, I have become the defacto IT person in the company. In the past, to make a contract, someone has had to go through a MS Word document (the boiler plate contract) and type in the necessary information to produce a contract.
About a year ago, I got so frustrated with that methodology that I created a MS Access Database where a user could add information using Access forms and then a mail merge with MS Word to populate a contract. This has been a HUGE improvement plus we have been able to start tracking money a lot more easily using the other database features. The database is stored on a shared computer in the satellite office. However, this system only works IF the individual users have MS Access and MS Word installed on their individual machines and only if they are physically connected to our local network.
With the success of this system at the satellite office, I am now attempting to create a web-based version of this tool that everyone in the company can use that only relies on standard software on individual machines and can be accessible anywhere.
I have converted a computer into a server for development purposes using XAMPP, created a SQL database, created HTML forms, and am using PHP to run queries. Over the past few months, I have crash coursed my way through myriad languages including CSS, and have finally gotten everything to the point that the system will create an HTML version of the contract with everything populated. Now I just need to format it for printing (ideally to a virtual PDF printer) with headers and footers (page numbers). This should be the easiest part, right?
CSS with the #media: print tags would, on the surface, appear to be the best way to make this happen because CSS3 uses tags like “#top-left” and “content: counter(page)” to do everything that I want; however, after investing a lot of time setting everything up, it appears that only Foxfire kind of supports this and IE and Chrome absolutely do not.
Headers and footers overlap body content, and I can’t get the pagination to work at all. Apparently these are common frustrations.
In my hunting, I ran across a program called Prince that would seem to do what I want (and quite a bit more), but the price tag on that is way more than I am willing to pay.
I can’t believe that what I want to do is a new or unique thing. I suspect I am just not searching for the right keywords. Is there a better tool/technique out there for converting HTML to a printer-friendly format without spending a ton of money?
I feel your pain. But the only solution I've found that really works is to use a PDF library to write the formatted text to a PDF directly from PHP (or Python or another language, but you mentioned PHP and I've done that). I've used R&OS quite a bit:
http://pdf-php.sourceforge.net/
It may take a little while to get up to speed, but you can do pretty much anything with it, including easily create nicely formatted tables, flowing text and embedded images. The catch is that, with the exception of a few tags like <b></b> and <i></i> you don't get to use any HTML or CSS - essentially you write two output routines, one for HTML and one for PDF.
We have run into a Section 508 compliance issue with SSRS reports.
Specifically, the issue is with tabular data tables. SSRS reports do not generate table headers with the tag which is a Section 508 requirement for accessible HTML tables.
After quite some research and debugging we found out a way to make SSRS generate HTML tables in a way that the header cells have Ids and data cells have headers matching the corresponding header Ids. This is very close but still not enough apparently. The must be used to render the headers (row and col headers).
By the way, there are two provisions regarding accessible HTML tables in Section 508: g and h. Details are found at Section 508 Laws and Regulations.
We are using SQL Server 2008 R2 SP2 and SSRS version 10.50.4319 and we are displaying, the reports through the ReportViewer (version 10), in ASP.NET web applications running on .NET 4.0.
Does anyone know if this is even possible? Any help is greatly appreciated!
I have not worked with SSRS, but 1194.22 g and h are either/or, meaning you don't apply both to the same table. Ideally you would use g for basic tables which are one dimensional[1]. You would use h for a multi-leveled table[1][2]. To my knowledge, no HTML/XHTML specification says you cannot use the id/headers method on one-dimensional tables. From a developer's point of view, we should use scopes whenever possible, so we don't have to touch every cell.
In your situation, it seems like you must use the second method.
This is very close but still not enough apparently. The must be used to render the headers (row and col headers).
Who told you this? If you are working with an agency, I'd request to see if they have a place to get help with 508, for a second opinion, or with the 508 Coordinator, who should roughly say this. Maybe you missed adding an id somewhere, which would make the table blow up.
1- This page probably isn't the best source to use, but was the first non-W3C page that had both types of tables.
2- This table is non-compliant.
Update to my question.
Contacted Microsoft's SSRS team. Their reply was: it was their design to generate table headers with and data cells like . No need for further questions.
But still not sure why they chose to render table header with instead of . Mind boggling.
This is more of an advise / best practice question that I'm hoping someone has come across before and can give me a steer.
I need to build a web application (the client would like webforms because that's what their developers know for when i hand it over)
Essentially when the client logs in, they will pick a language then I need to replace the text for menus, input boxes etc. The client wants to add their translations and update them at any time.
Ideas I have looked at are:
Holding the translations in resource files, building an editor in to the web application and then adding attributes on the fly to my viewmodels.
Holding the translations in sql server so i have the name, language and translation as a lookup e.g. Home | French | Maison. Then on pre-render I'll scrape the screen for any controls needing translation in the menu, labels, text areas.
Does anyone know of any good examples or had the experience of doing this themselves.
I've a similar situation, and chose to store data in SQL.
Translation mistakes happen often, and you don't want to recompile or disassemble every time.
It is possible to avoid the need to republish, but I've found it just more intuitive and straightforward to maintain SQL.
Bottom line, it depends on the amount of data you have. If it's more than just a couple of keywords, it sounds like a job for SQL to me.
Edit:
In a similar question, users recommend using resources, claiming it is the standard method.
However, if your users are going to make changes to values on regular basis (not because of mistake correction, but because data actually changes), then SQL seems best fit for the job.
I'm writing an asp.net application that will need to be localized to several regions other than North America. What do I need to do to prepare for this globalization? What are your top 1 to 2 resources for learning how to write a world ready application.
A couple of things that I've learned:
Absolutely and brutally minimize the number of images you have that contain text. Doing so will make your life a billion percent easier since you won't have to get a new set of images for every friggin' language.
Be very wary of css positioning that relies on things always remaining the same size. If those things contain text, they will not remain the same size, and you will then need to go back and fix your designs.
If you use character types in your sql tables, make sure that any of those that might receive international input are unicode (nchar, nvarchar, ntext). For that matter, I would just standardize on using the unicode versions.
If you're building SQL queries dynamically, make sure that you include the N prefix before any quoted text if there's any chance that text might be unicode. If you end up putting garbage in a SQL table, check to see if that's there.
Make sure that all your web pages definitively state that they are in a unicode format. See Joel's article, mentioned above.
You're going to be using resource files a lot for this project. That's good - ASP.NET 2.0 has great support for such. You'll want to look into the App_LocalResources and App_GlobalResources folder as well as GetLocalResourceObject, GetGlobalResourceObject, and the concept of meta:resourceKey. Chapter 30 of Professional ASP.NET 2.0 has some great content regarding that. The 3.5 version of the book may well have good content there as well, but I don't own it.
Think about fonts. Many of the standard fonts you might want to use aren't unicode capable. I've always had luck with Arial Unicode MS, MS Gothic, MS Mincho. I'm not sure about how cross-platform these are, though. Also, note that not all fonts support all of the Unicode character definition. Again, test, test, test.
Start thinking now about how you're going to get translations into this system. Go talk to whoever is your translation vendor about how they want data passed back and forth for translation. Think about the fact that, through your local resource files, you will likely be repeating some commonly used strings through the system. Do you normalize those into global resource files, or do you have some sort of database layer where only one copy of each text used is generated. In our recent project, we used resource files which were generated from a database table that contained all the translations and the original, english version of the resource files.
Test. Generally speaking I will test in German, Polish, and an Asian language (Japanese, Chinese, Korean). German and Polish are wordy and nearly guaranteed to stretch text areas, Asian languages use an entirely different set of characters which tests your unicode support.
Learn about the System.Globalization namespace:
System.Globalization
Also, a good book is NET Internationalization: The Developer's Guide to Building Global Windows and Web Applications
Would be good to refresh a bit on Unicodes if you are targeting other cultures,languages.
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
This is a hard problem. I live in Canada, so multilingualism is a big issue. In all my years of doing software development, I've never seen a solution that I liked. I've seen a lot of solutions that worked, and got the job done, but they've always felt like a big kludge. I would go with #harriyott, and make sure that none of your strings are actually in code. A resource file works well for desktop applications. However in ASP.Net, I'd recommend using the database. #John Christensen also has some good pointers.
Make sure you're compiling with Code Analysis turned on, and pay attention to the Globalization warnings that it gives you. Keep data in an invariant format (CultureInfo.InvariantCulture) until you display it to the user (then use CultureInfo.CurrentCulture).
I would seriously consider reading the following code project article:
Globalization and localization demystified in ASP.NET 2.0
It covers everything from Cultures and Locales, setting the threads current culture, resource files, encodings, you name it!
And of course it's loaded with pretty pictures and examples :-). Good luck!
I would suggest:
Put all strings in either the database or resource files.
Allow extra space for translated text, as some (e.g. German) are wordier.
How should I store (and present) the text on a website intended for worldwide use, with several languages? The content is mostly in the form of 500+ word articles, although I will need to translate tiny snippets of text on each page too (such as "print this article" or "back to menu").
I know there are several CMS packages that handle multiple languages, but I have to integrate with our existing ASP systems too, so I am ignoring such solutions.
One concern I have is that Google should be able to find the pages, even for foreign users. I am less concerned about issues with processing dates and currencies.
I worry that, left to my own devices, I will invent a way of doing this which work, but eventually lead to disaster! I want to know what professional solutions you have actually used on real projects, not untried ideas! Thanks very much.
I looked at RESX files, but felt they were unsuitable for all but the most trivial translation solutions (I will elaborate if anyone wants to know).
Google will help me with translating the text, but not storing/presenting it.
Has anyone worked on a multi-language project that relied on their own code for presentation?
Any thoughts on serving up content in the following ways, and which is best?
http://www.website.com/text/view.asp?id=12345&lang=fr
http://www.website.com/text/12345/bonjour_mes_amis.htm
http://fr.website.com/text/12345
(these are not real URLs, i was just showing examples)
Firstly put all code for all languages under one domain - it will help your google-rank.
We have a fully multi-lingual system, with localisations stored in a database but cached with the web application.
Wherever we want a localisation to appear we use:
<%$ Resources: LanguageProvider, Path/To/Localisation %>
Then in our web.config:
<globalization resourceProviderFactoryType="FactoryClassName, AssemblyName"/>
FactoryClassName then implements ResourceProviderFactory to provide the actual dynamic functionality. Localisations are stored in the DB with a string key "Path/To/Localisation"
It is important to cache the localised values - you don't want to have lots of DB lookups on each page, and we cache thousands of localised strings with no performance issues.
Use the user's current browser localisation to choose what language to serve up.
You might want to check GNU Gettext project out - at least something to start with.
Edited to add info about projects:
I've worked on several multilingual projects using Gettext technology in different technologies, including C++/MFC and J2EE/JSP, and it worked all fine. However, you need to write/find your own code to display the localized data of course.
If you are using .Net, I would recommend going with one or more resource files (.resx). There is plenty of documentation on this on MSDN.
As with most general programming questions, it depends on your needs.
For static text, I would use RESX files. For me, as .Net programmer, they are easy to use and the .Net Framework has good support for them.
For any dynamic text, I tend to store such information in the database, especially if the site maintainer is going to be a non-developer. In the past I've used two approaches, adding a language column and creating different entries for the different languages or creating a separate table to store the language specific text.
The table for the first approach might look something like this:
Article Id | Language Id | Language Specific Article Text | Created By | Created Date
This works for situations where you can create different entries for a given article and you don't need to keep any data associated with these different entries in sync (such as an Updated timestamp).
The other approach is to have two separate tables, one for non-language specific text (id, created date, created user, updated date, etc) and another table containing the language specific text. So the tables might look something like this:
First Table: Article Id | Created By | Created Date | Updated By | Updated Date
Second Table: Article Id | Language Id | Language Specific Article Text
For me, the question comes down to updating the non-language dependent data. If you are updating that data then I would lean towards the second approach, otherwise I would go with the first approach as I view that as simpler (can't forget the KISS principle).
If you're just worried about the article content being translated, and do not need a fully integrated option, I have used google translation in the past and it works great on a smaller scale.
Wonderful question.
I solved this problem for the website I made (link in my profile) with a homemade Python 3 script that translates the general template on the fly and inserts a specific content page from a language requested (or guessed by Apache from Accept-Language).
It was fun since I got to learn Python and write my own mini-library for creating content pages. One downside was that our hosting didn't have Python 3, but I made my script generate static HTML (the original one was examining User-agent) and then upload it to server. That works so far and making a new language version of the site is now a breeze :)
The biggest downside of this method is that it is time-consuming to write things from scratch. So if you want, drop me line and I'll help you use my script :)
As for the URL format, I use site.com/content/example.fr since this allows Apache to perform language negotiation in case somebody asks for /content/example and has a browser tell that it likes French language. When you do this Apache also adds .html or whatever as a bonus.
So when a request is for example and I have files
example.fr
example.en
example.vi
Apache will automatically proceed with example.vi for a person with Vietnamese-configured browser or example.en for a person with German-configured browser. Pretty useful.