How should I store localized versions of user-entered data in my database?

I am working for a client on a web app that requires localization in 3 languages (English and 2 others). I understand how to use resources in an ASP.NET application to display localized versions of static data. However, I am not sure how to approach the issue of localized user-entered data. For example, an administrator may want to add some new metadata to the application (e.g. a new product category). This will eventually need to be translated into all 3 languages, but it will initially be entered in whatever language the administrator knows. Since this kind of data is not static, we store it in the database. Should we add a culture code to the primary key to differentiate different localized versions of the same data? Is there a "best practice" or pattern I'm not aware of for this kind of problem?

Have a child table for your entity, with a composite PK of MainItemID and LanguageCode (EN, DE, FR, etc.). This child table stores your language-specific text.
If you always have English, or English is the fallback, then you could keep the English text in the main table and use the child table only for DE, FR, etc. A LEFT JOIN and ISNULL will take care of the fallback.
Either way is OK depending on your exact needs, though I suspect the first fits yours. Of course, you'd need to ensure you have at least one child row on data entry of, say, a new product category.
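A minimal T-SQL sketch of this layout (table and column names are illustrative, not from the question):
CREATE TABLE ProductCategory (
    ProductCategoryID int IDENTITY PRIMARY KEY,
    Name nvarchar(100) NOT NULL  -- English / fallback text
);

CREATE TABLE ProductCategoryTranslation (
    ProductCategoryID int NOT NULL REFERENCES ProductCategory (ProductCategoryID),
    LanguageCode char(2) NOT NULL,  -- EN, DE, FR, ...
    Name nvarchar(100) NOT NULL,
    PRIMARY KEY (ProductCategoryID, LanguageCode)
);

-- Requested language, falling back to the main table's English text
-- (@LanguageCode is supplied by the application)
SELECT pc.ProductCategoryID,
       ISNULL(t.Name, pc.Name) AS Name
FROM ProductCategory pc
LEFT JOIN ProductCategoryTranslation t
    ON t.ProductCategoryID = pc.ProductCategoryID
   AND t.LanguageCode = @LanguageCode;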

I would suggest you make a table to track the languages, and then use a LanguageID foreign key in the other tables instead of a language code.
Language(LanguageID, Name)
E.g., if you are storing localized text in a table:
LocalizedTextTable(ID, Text, LanguageID)
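As DDL, that might look like this (a sketch; names are illustrative):
CREATE TABLE Language (
    LanguageID int IDENTITY PRIMARY KEY,
    Name nvarchar(50) NOT NULL  -- 'English', 'German', ...
);

CREATE TABLE LocalizedTextTable (
    ID int IDENTITY PRIMARY KEY,
    Text nvarchar(max) NOT NULL,
    LanguageID int NOT NULL REFERENCES Language (LanguageID)
);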

My solution was to create a single string column which holds encoded data for all supported languages. Special application logic is required to insert and extract the data.
A specialized text editor supporting multilingual data helped a lot too.
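The poster doesn't say what the encoding looks like; purely as a hypothetical illustration, language-tagged segments could be pulled apart with a scalar function like this:
-- Hypothetical encoding: '[EN]Coffee[DE]Kaffee[FR]Cafe'
CREATE FUNCTION dbo.ExtractLocalized (@encoded nvarchar(max), @lang char(2))
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @tag nvarchar(4) = '[' + @lang + ']';
    DECLARE @start int = CHARINDEX(@tag, @encoded);
    IF @start = 0 RETURN NULL;  -- language not present
    SET @start = @start + LEN(@tag);
    DECLARE @next int = CHARINDEX('[', @encoded, @start);
    IF @next = 0 SET @next = LEN(@encoded) + 1;
    RETURN SUBSTRING(@encoded, @start, @next - @start);
END;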

Related

When creating a new Dataverse table, why does it come with automatic columns?

I am new to Dataverse, moving from the SQL Server world, and just created my first Dataverse table (a Standard table). Upon creation, the table has lots of what I assume are automatically added columns, including "Owner", "Status", and "Version Number". I come from the SQL Server background, where new tables come "empty", with no columns. I do not think I need these automatically added columns (this is just going to be a small log table that holds datetime, action, etc. columns).
Would it break anything if these automatically-added columns were deleted? Also, if anyone could provide information about why these columns are included, that would help. I have researched these questions online, but found very little. Thank you in advance.
They are standard, out of the box attributes that you can't remove.
You can change the Ownership within the Table Type to "Organization" when creating the table to remove the Owner column; however, the rest are created as part of every table.
There is some high-level detail in the docs:
https://learn.microsoft.com/en-us/powerapps/maker/data-platform/entity-overview
Dataverse (earlier called the Common Data Service) is Dynamics CRM under the hood. It's a SaaS-model online CRM offering that comes with some fundamental building blocks.
When you create a table (entity) it comes with columns (attributes), relationships, views, forms, dashboards, etc.
A UCI model-driven app can be put together quickly from these components, with all CRUD operations and no code, through simple configuration and customization.
To support these barebones capabilities, every table gets the necessary attributes, like name, currency, statecode, statuscode, createdby, createdon, modifiedby and modifiedon; security fields like owner, owning business unit and owning team; and change-tracking and concurrency fields like row version.
You can leave them alone, as they are part of the platform, and do your customization as you need.

How to set up a data model for a customizable application

I have an ASP.NET data entry application that is used by multiple clients. The application consists of multiple data entry modules that are common to all clients.
I now have multiple clients that want their own custom module added which will typically consist of a dozen or so data points. Some values will be text, others numeric, some will be dropdown selections, etc.
I'm in need of suggestions for handling the data model for this. I have two thoughts on how to handle it. The first would be to create a new table for each new module for each client. This is pretty clean, but I don't particularly like it. My other thought is to have one table with columns for each custom data point for each client. This table would end up with a lot of columns and a lot of NULL values. I don't really like either solution and suspect there's a better way to do this, so any feedback you have will be appreciated.
I'm using SQL Server 2008.
As always with these questions, "it depends".
The dreaded key-value table.
This approach relies on a table which lists the fields and their values as individual records.
CustomFields(clientId int, fieldName sysname, fieldValue varbinary)
Benefits:
Infinitely flexible
Easy to implement
Easy to index
Non-existing values take no space
Disadvantage:
Showing a list of all records with the complete field list is a very dirty query
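To illustrate that point: flattening the rows back into columns takes one conditional aggregate per field (a sketch against the CustomFields table above; field names are illustrative):
SELECT clientId,
       MAX(CASE WHEN fieldName = 'Weight'
                THEN CAST(fieldValue AS nvarchar(100)) END) AS Weight,
       MAX(CASE WHEN fieldName = 'Fragrance'
                THEN CAST(fieldValue AS nvarchar(100)) END) AS Fragrance
       -- ...and one more CASE per field; fieldValue is varbinary, so every
       -- value must also be cast back to its real type
FROM CustomFields
GROUP BY clientId;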
The Microsoft way
The Microsoft answer to this kind of problem is "sparse columns" (introduced in SQL Server 2008)
Benefits:
Blessed by the people who design SQL Server
Records can be queried without having to apply fancy pivots
Fields without data don't take space on disk
Disadvantages:
Many technical restrictions
A new field requires DDL (an ALTER TABLE)
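A sketch of the sparse-column flavor (names illustrative); note that each new field is the DDL mentioned above:
CREATE TABLE ClientDataSparse (
    clientId int NOT NULL,
    Weight decimal(10,2) SPARSE NULL,  -- takes no space on disk when NULL
    Fragrance nvarchar(100) SPARSE NULL
);

-- Adding a new custom field later requires DDL:
ALTER TABLE ClientDataSparse ADD Color nvarchar(50) SPARSE NULL;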
The xml tax
You can add an xml field to the table which will be used to store all the "extra" fields.
Benefits:
Unlimited flexibility
Can be indexed
Storage efficient (when it fits in a page)
With some XPath gymnastics the fields can be included in a flat recordset
Schema can be enforced with schema collections
Disadvantages:
Not clearly visible what's in the field
XQuery support in SQL Server has gaps, which makes getting your data a real nightmare sometimes
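A sketch of the xml variant, including the XPath gymnastics needed to flatten a field into the recordset (names illustrative):
CREATE TABLE ClientDataXml (
    clientId int NOT NULL,
    ExtraFields xml NULL
);

INSERT INTO ClientDataXml (clientId, ExtraFields)
VALUES (1, '<fields><Weight>12.5</Weight><Fragrance>Rose</Fragrance></fields>');

-- Pull one "extra" field out as a regular column
SELECT clientId,
       ExtraFields.value('(/fields/Weight)[1]', 'decimal(10,2)') AS Weight
FROM ClientDataXml;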
There are maybe more solutions, but to me these are the main contenders. Which one to choose:
Key-value seems appropriate when the number of extra fields is limited (say, no more than 10-20 or so).
Sparse columns are more suitable for data with many properties that are filled out infrequently, so they sound more appropriate when you can have many extra fields.
An xml column is very flexible, but a pain to query. It is appropriate for solutions that write rarely and query rarely, i.e. don't run aggregates etc. on the data stored in this field.
I'd suggest you go with the first option you described. I wouldn't overthink it. The second option you outlined would be a bad idea in my opinion.
If there are fields common to all the modules you're adding to the system, you should consider keeping those in a single table, then have other tables with the fields specific to a particular module related back to the primary key in the common table. This is basically table inheritance (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server) and will centralize the common module data and make it easier to query across modules.
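A minimal sketch of that table-inheritance layout, using a hypothetical custom module (names illustrative):
-- Fields common to every module
CREATE TABLE ModuleEntry (
    ModuleEntryID int IDENTITY PRIMARY KEY,
    ClientID int NOT NULL,
    EnteredOn datetime NOT NULL DEFAULT GETDATE()
);

-- Fields specific to one client's custom module, sharing the parent's key
CREATE TABLE WidgetModuleEntry (
    ModuleEntryID int PRIMARY KEY REFERENCES ModuleEntry (ModuleEntryID),
    WidgetWeight decimal(10,2) NULL,
    WidgetColor nvarchar(50) NULL
);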

Entity Attribute Value (EAV) vs. XML Column for New Product Attributes

I have an existing, mature schema to which we need to add some new Product attributes. For example, we have Products.Flavor, and now need to add new attributes such as Weight, Fragrance, etc. Rather than continue to widen the Products table, I am considering a couple of other options. The first is a new Attributes table, which will effectively be a property bag for arbitrary attributes, and a ProductsAttributes table to store the mappings (and values) for a particular product's attributes. This is the Entity-Attribute-Value (EAV) pattern, as I've come to understand it. The other option is to add a new column to the Products table called Attributes, which is of type XML. Here, we can arbitrarily add attributes to any product instance without adding new tables.
What are the pros/cons to each approach? I'm using SQL Server 2008 and ASP.NET 4.0.
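For concreteness, a sketch of the EAV option as described (assuming an existing Products table keyed by ProductID; names illustrative):
CREATE TABLE Attributes (
    AttributeID int IDENTITY PRIMARY KEY,
    Name nvarchar(100) NOT NULL  -- 'Weight', 'Fragrance', ...
);

CREATE TABLE ProductsAttributes (
    ProductID int NOT NULL REFERENCES Products (ProductID),
    AttributeID int NOT NULL REFERENCES Attributes (AttributeID),
    Value nvarchar(max) NULL,
    PRIMARY KEY (ProductID, AttributeID)
);

-- The XML alternative is a single new column instead:
-- ALTER TABLE Products ADD Attributes xml NULL;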
This is (imho) one of the classic database design issues. Call it "attribute creep", perhaps, as once you start, there will always be another attribute or property to add. The key decision is: do you store the data within the database using the basic tools provided by the database (tables and columns) to structure and format the data, or do you store the data in some other fashion (XML and name/value pairs being the most common alternates)? Simply put, if you store the data in a form other than that supported by the DBMS system, then you lose the power of the DBMS system to manage, maintain, and work with that data. This is not much of a problem if you only need to store it as "blob data" (dump it all in, pump it all out), but once you have to start seeking, sorting, or filtering by this data, it can get very ugly very fast.
With that said, I do have strong opinions on name/value pairs and XML, but alas, none are positive. If you do have to store your data this way, and yes, it can be an entirely valid business/design decision, then I would recommend looking long and hard at how the data you need to store in the database will be used and accessed in the future. Weigh the pros and cons of each methodology in light of how it will be used, and pick the one that's easiest to manage and maintain. (Don't pick the one that's easiest to implement; you'll be supporting it for a lot longer than you'll be writing it.)
(It's long, but the "RLH" essay is a classic example of name/value pairs run amok.)
(Oh, and if you're using it, look into SQL Server 2008's "Sparse Columns" option. Doesn't sound like what you need, but you never know.)

Use ASP.NET Profile or not?

I need to store a few attributes of an authenticated user (I am using Membership API) and I need to make a choice between using Profiles or adding a new table with UserId as the PK. It appears that using Profiles is quick and needs less work upfront. However, I see the following downsides:
The profile values are squished into a single ntext column. At some point in the future, I will have SQL scripts that may update users' attributes. Querying an ntext column and trying to update a value in it sounds a little buggy to me.
If I choose to add a new user specific property and would like to assign a default for all the existing users, would it be possible?
My first impression has been that using profiles may cause maintenance headaches in the long run. Thoughts?
There was an article on MSDN (now on ASP.NET: http://www.asp.net/downloads/sandbox/table-profile-provider-samples) that discusses how to make a Table Profile Provider. The idea is to store the Profile data in the columns of a table rather than in a single row, making it easier to query with just SQL.
More to that point, SQL Server 2005/2008 provides support for getting data via services and CLR code. You could conceivably access the Profile data via the API instead of the underlying tables directly.
As to point #2, you can set defaults to properties, and while this will not update other profiles immediately, the profile would be updated when next it is accessed.
Seems to me you have answered your own question. If your point 1 is likely to happen, then a SQL table is the only sensible option.
Check out this question...
ASP.NET built-in user profile vs. old-style user class/tables
The first hint that the built-in profiles are badly designed is their use of delimited data in a relational database. There are a few cases that delimited data in a RDBMS makes sense, but this is definitely not one of them.
Unless you have a specific reason to use ASP.Net Profiles, I'd suggest you go with the separate tables instead.
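A minimal sketch of the separate-table route (column names are illustrative; UserId keys to the Membership aspnet_Users table):
CREATE TABLE UserAttributes (
    UserId uniqueidentifier PRIMARY KEY REFERENCES aspnet_Users (UserId),
    DisplayName nvarchar(100) NULL,
    TimeZone nvarchar(50) NULL
);

-- Point 2 in the question (defaulting existing users) then becomes a plain UPDATE:
UPDATE UserAttributes SET TimeZone = 'UTC' WHERE TimeZone IS NULL;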

What is the best way to implement multilingual domain objects using NHibernate?

What is the best way to design the domain objects which can have multi-lingual fields? An example can be a Product class with Description being multi-lingual.
I have found a few links but could not decide which one is the best way:
http://fabiomaulo.blogspot.com/2009/06/localized-property-with-nhibernate.html
(This stores all localised language data in one field. Can be a problem if we query from Sql)
http://ayende.com/Blog/archive/2006/12/26/LocalizingNHibernateContextualParameters.aspx
(This one has a warning at the beginning that it is a hack and no longer supported)
http://www.webdevbros.net/2009/06/24/create-a-multi-languaged-domain-model-with-nhibernate-and-c/
(This does not describe how multilingual data will be structured in the database.)
Does anyone have experience with using NHibernate with multi-lingual data? Is there a better way?
The third option looks great. The Hibernate mapping is given, but not the database schema - if that's what you are missing, then I'll sketch it out here:
dictionary
----------
ID: int - identity
name: nvarchar(255)
phrase
------
dictionary_id:int (fkey dictionary.ID)
culture_id:int (LCID)
phrase:nvarchar(255) - this is the default size - seems too small
According to this blog entry, 255 is the default string length for String values. To overcome the short string length on the phrase text, you can change the <element> tag to
<element column="phrase" type="String" length="4001"></element>
To use this in your domain model, you add a PhraseDictionary property to your entity where you want translatable text, e.g. the title property or description property.
I think the article describes a great approach, and is the one that I would go for.
EDIT: In response to the comments, make the length less than 4001 if you know the absolute maximum size is less than that, as this will typically be faster. Also, NHibernate will lazily fetch the collection, but it may fetch all the items at once. You can profile to determine if this has any performance implications. (If you have only a handful of languages then I doubt you will see a difference.) If you have many languages (Say 50+) then it may be worthwhile creating custom properties to fetch the localized text. These will issue queries to fetch specifically the text required. More importantly, you may be able to fetch all the text for a given entity in one query, rather than each localized text property as a separate query.
Note that this extra effort is only needed if profiling gives you reason to be concerned about the performance. Chances are that the implementation in the article as is will function more than adequately.
I only have experience with Hibernate, but since NHibernate is so similar:
One option is to define a component type MultilingualString with members for each language (this assumes the set of languages is known at coding time). This type is also a convenient location to place a getter for the string by language id.
class MultilingualString {
    String english;
    String chinese;
    String klingon;

    // assumes Language is an enum with ENGLISH, CHINESE and KLINGON values
    String forLanguage(Language lang) {
        switch (lang) {
            case ENGLISH: return english;
            case CHINESE: return chinese;
            case KLINGON: return klingon;
            default: throw new IllegalArgumentException("unsupported language: " + lang);
        }
    }
}
This results in the strings for all languages being stored in separate columns in the database while the representation in the object world retains fine granularity.
The advantage is that no join is required to fetch the strings. On the other hand, the only way not to fetch a string with this approach is to use a projection, which is a severe limitation if the strings are large, numerous and rarely needed.
If you do this a lot, writing a UserType might be worth it.
From a strictly database-oriented standpoint with SQL Server, you should have one table with all of the base data (record key, dates, numbers, etc.) and one table with all of the translatable string data. Let's call the two tables Base and Base_Description.
Base ensures that there is a single key for each record; the key might be a string or an auto-generated id, depending on your particular use case.
The Base_Description table is related to the Base table, but also contains a value to select the language that the data is in. In my projects we use the langid column from sys.languages, because we can set the language of the connection with SET LANGUAGE and then grab it with @@LANGID for most operations.
In our testing we found this to be significantly faster than having multiple fields for each language, it also allows you to add other languages more easily. We are also using SQL Server Full-Text indexing and it fully works with this method. You should index in the neutral language and then you can pick the language to search against at run time (also filtering against the LangID column in Base_Description).
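A sketch of that two-table layout with the langid filter (names illustrative):
CREATE TABLE Base (
    BaseID int IDENTITY PRIMARY KEY,
    CreatedOn datetime NOT NULL,
    Price decimal(10,2) NULL
);

CREATE TABLE Base_Description (
    BaseID int NOT NULL REFERENCES Base (BaseID),
    LangID smallint NOT NULL,  -- langid from sys.languages
    Description nvarchar(max) NOT NULL,
    PRIMARY KEY (BaseID, LangID)
);

-- SET LANGUAGE German;  -- for example; this sets @@LANGID for the connection
SELECT b.BaseID, d.Description
FROM Base b
JOIN Base_Description d
    ON d.BaseID = b.BaseID
   AND d.LangID = @@LANGID;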
Do your requirements include the domain objects actually having multiple-language properties in the same object? And, if so, is it unlimited translations stored in the object (in a collection, say - in which case I would say that it would need to be just like any master/detail or parent/child collection) or fixed translations, in which case the languages (and thus the mapping to results of a stored proc or whatever) have to be determined statically anyway?
In many internationalized applications I worked on, the data was in only one language - customer names, the product names (there was no point in mapping even identical products used in one country to products in another, they all had different distributors and different SKUs, and of course localized pricing). The interface was also only in one language (at a time). So all the domain objects only required one language at a time. Thus the language of the translation would be determined when the object was instantiated.
We had translation user interfaces which allowed users to update the translated texts, but these only required two languages at a time (local and the default). I can see this being closest to what you are talking about. I guess that you would have child collections for each translatable property with all the possible translations in the collection. This would probably be closest to the second solution in the third article you linked. Of course, at this point you would also need to see if you want eager/lazy loading etc.
