Almost all the applications I've worked on involve some look-up values. For example, a list of languages (English, French, etc.) often needs to be displayed on a WebForm for the user to choose from.
The common data structure for such look-up values includes an integer Id and a string name. Since these look-up values are used so frequently and are unlikely to change, most of the time, instead of grabbing them from the database, I just define a global enum in C# like this:
enum Language : int { English = 1, French = 2 }
I've been handling look-up values like this for years now. I know it may not be the best way to handle them; for example, every time a new language is added to the system, somebody needs to remember to update that enum.
I guess this is a very common scenario, just wondering how you guys handle this type of look-up values.
Thanks,
I usually put them in the database anyway, because you never know if and when they'll need to change. There have been times when, by design, I know the list will not change, and for those I will use enums instead. In all other cases, I will use a database table.
Depending on the data requirements, I typically handle small look-ups like that in a configuration file, loaded once and cached as an application-level variable shared by all instances of the app. The data is read once when the application is started or recycled and then stays available until the application dies.
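Roughly, the idea looks like this - a minimal sketch where the class name and the hard-coded values are just placeholders for the real config read:

using System.Collections.Generic;

// Minimal sketch: load the look-up values once and keep them for the life of the app.
// LoadLanguages() stands in for however the values are actually read
// (configuration file, database, etc.); the hard-coded entries are placeholders.
public static class LookupCache
{
    private static readonly IDictionary<int, string> languages = LoadLanguages();

    public static IDictionary<int, string> Languages
    {
        get { return languages; }
    }

    private static IDictionary<int, string> LoadLanguages()
    {
        return new Dictionary<int, string>
        {
            { 1, "English" },
            { 2, "French" }
        };
    }
}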
That said, I have a lot of intermediate data of a similar nature that requires regular maintenance. It is easier to put that into a database, because at least then you can build a small maintenance application to handle it. If you put it in an enum, you have to recompile and redeploy.
I ran into a similar issue with resource files. Since resource files are compiled, they require another redeployment.
You may benefit from T4 Templates. I haven't used these yet because we haven't had a requirement to meet the need, but I'd like to use them eventually.
Here's a great article on generating enums from a database table (the code is hard to read on the page; there is a link to the source at the end of the article):
http://www.thecodejunkie.com/2008/11/generate-enums-from-database-using-text.html
Also, Scott Hanselman has a list of great T4 templates/articles on his blog.
T4 Templates on MSDN
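If it helps to picture the approach, here is a plain C# sketch of the same idea (not the linked T4 template itself); the table and column names are assumptions:

using System.Data.SqlClient;
using System.Text;

// Hand-rolled sketch of the idea behind the article: read a look-up table and emit
// C# enum source. The real T4 template does this at design time; the table and
// column names here are assumptions.
public static class EnumGenerator
{
    public static string GenerateLanguageEnum(string connectionString)
    {
        var sb = new StringBuilder();
        sb.AppendLine("public enum Language");
        sb.AppendLine("{");

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT Id, Name FROM dbo.Language", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    sb.AppendFormat("    {0} = {1},", reader.GetString(1), reader.GetInt32(0));
                    sb.AppendLine();
                }
            }
        }

        sb.AppendLine("}");
        return sb.ToString();
    }
}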
I am currently reworking the data system of our application. Basically, it is designed so that people can add all the custom fields they want, with only a few constant/always-there fields.
Our current design is giving us plenty of maintenance problems. What we do is dynamically (at runtime) add a column to the database for each field. We have to have a meta table and other cruft to maintain all of these dynamic columns.
Now we are looking at EAV, but it doesn't seem much better. Basically, we have many different types of fields, so there would be a StringValues table, an IntegerValues table, and so on, which makes things that much worse.
I am wondering whether using JSON or XML blobs in the database may be a better solution, specifically because in most use cases, when we retrieve anything out of these tables, we need the entire row. The problem is that we need to be able to create reports for this data as well. No solution really makes custom queries look easy, and searching across such a blob database will surely be a performance nightmare when reports are run.
Each "row" needs to have anywhere from about 15 to 100 (possibly more) attributes/columns associated with it.
We are using SQL Server 2008, and the application interfacing with the database is a C# web application (so, ASP.NET).
What do you think? Use EAV, or blobs, or something else entirely? (Also, yes, I know a schema-free database like MongoDB would be awesome here, but I can't convince my boss to use it.)
What about the xml datatype? Advanced querying is possible against this type.
We've used the xml type with good success. We do most of our heavy lifting at the code level using linq to parse out values. Our schema is somewhat fixed, so that may not be an option for you.
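As an illustration of that kind of code-level parsing, a minimal LINQ to XML sketch might look like this; the element names and blob shape are assumptions, not our actual schema:

using System.Linq;
using System.Xml.Linq;

// Minimal sketch of pulling a value out of an xml column at the code level.
// Assumes a blob shaped like:
// <Attributes><Attribute name="Color">Red</Attribute></Attributes>
public static class XmlAttributeReader
{
    public static string GetValue(string xmlFromDatabase, string attributeName)
    {
        XElement root = XElement.Parse(xmlFromDatabase);

        return root
            .Elements("Attribute")
            .Where(a => (string)a.Attribute("name") == attributeName)
            .Select(a => (string)a)
            .FirstOrDefault();
    }
}

// Usage (row.CustomFields being whatever column holds the blob):
// string color = XmlAttributeReader.GetValue(row.CustomFields, "Color");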
One interesting feature of SQL Server is the sql_variant type. It's fully supported in .NET and quite easy to use. The advantage is that you don't need to create StringValue, IntValue, etc. columns, just one Value column that can contain all the simple types.
This very specific type favors the EAV option, IMHO.
It has some drawbacks, though (sorting, distinct selects, etc.), so if you want to use it, make sure you read all the documentation and understand its limits.
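For a feel of how it reads from .NET, here is a rough sketch; the table and column names are assumptions, and ADO.NET surfaces sql_variant as a plain object holding the underlying type:

using System.Data;
using System.Data.SqlClient;

// Rough sketch of reading from an EAV table whose Value column is sql_variant.
// ExecuteScalar returns the underlying CLR type (boxed int, string, DateTime, ...).
public static class EavReader
{
    public static object GetAttributeValue(string connectionString, int entityId, string attributeName)
    {
        const string sql =
            "SELECT Value FROM dbo.EntityAttribute " +
            "WHERE EntityId = @EntityId AND AttributeName = @AttributeName";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@EntityId", SqlDbType.Int).Value = entityId;
            command.Parameters.Add("@AttributeName", SqlDbType.NVarChar, 100).Value = attributeName;

            connection.Open();
            return command.ExecuteScalar();
        }
    }
}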
Create a table with your known columns and "X" sparse columns, using sequential names such as DataColumn0001, DataColumn0002, etc. When there is a definition for a new column, just rename a column and start inserting data. The great advantage of sparse columns is that they are indexable.
More info at this link.
Bluntly, what you're doing is fighting a database that doesn't support your type of data. You should work with a medium that meets your needs; that includes NoSQL databases such as RavenDB, MongoDB, DocumentDB and CouchBase, or Postgres on the RDBMS side, to name several.
You are using the tool in a capacity it was not designed for, one where it actively limits you from achieving success. NoSQL databases frequently use JSON as the underlying storage because JSON is inherently schemaless. Want to add a property? Go ahead. Want to add a whole sub-collection? Go ahead. NoSQL databases were created, in part, specifically to remove the rigid schema requirements of an RDBMS.
2015 edit: Postgres now natively supports JSON, so it is a viable option among RDBMSs. My answer is still correct in that you need to use the correct tool for the problem; it is a polyglot persistence world.
I'm using ASP.NET and I need to build an application where we can easily create forms without recreating the database, and preferably without changing the create/read/update/delete queries. The goal is to allow customers to create their own forms with dropdowns, textboxes, checkboxes, even a many-to-one relationship to another simple form (that's stretching it). The users do not have to be able to create the forms themselves, but I don't want to be adding tables, fields, queries, web pages, etc. each time a new form is requested or modified.
2 questions:
1) How do I structure a flexible database to do this (in SQL Server)? I can think of two ways:
a) Create a table for each datatype (int, varchar(x), smalldatetime, bit, etc.). Writing adequate queries against this would be very difficult.
b) Create the form table with lots of extra fields of various datatypes in case the user needs 5 integers or 5 date fields. This seems the easiest, but is probably a pretty inefficient use of space.
2) How do I build the forms? I thought about creating an XML sheet that lists the validations, data types, controls to display, etc. Then I would parse the XML to build the form on the fly, probably using CSS to do the layout (that would have to be manual, which is OK).
Is there a better/best way? Is there something out there that I could look at to get ideas? Any help is much appreciated.
This sounds like a potential candidate for an InfoPath solution. At first blush, it will do most/all of what you are asking.
This article gives a brief overview of creating an InfoPath form that is based on a SQL data source.
http://office.microsoft.com/en-us/infopath-help/design-a-form-template-based-on-a-microsoft-sql-server-database-HP010086639.aspx
I have built a completely custom solution like you are describing, and if I ever did it again I would probably opt for either 1) a third-party product or 2) less functionality. You can spend 90% of your time working on 10% of the feature set.
EDIT: I reread your questions; here is some additional feedback.
1 - Flexible data structure: A couple things to keep in mind are performance and the ability to write reports against the data. The more generic the data structure, the harder these will be to achieve (again, speaking from experience).
Somewhat contrary to both performance and report-readiness, Microsoft SharePoint uses XML fragments/documents in generic tables for maximum flexibility. I can't argue with the features of SharePoint, so this does get the job done and greatly simplifies the data structure. XML will perform well when done correctly, but most people will find it more difficult to write queries against XML. Even though XML is a "first class citizen" to SQL Server, it may or may not perform as well as an optimized table structure.
2 - Forms: I have implemented custom forms using XML transformed by XSLT. XML is often a good choice for storing form structure; XSLT is a monster unto itself, but it is very powerful. For what it's worth, InfoPath stores its form structure as XML.
I've also implemented dynamic forms using custom controls in .Net. This is a very object-oriented approach, but (depending on the complexity of the UI) can require a significant amount of code.
Lastly (again using SharePoint as an example), Microsoft implemented horrendously complicated XML list/form definitions in SharePoint 2007. The complexity defeats many of the benefits. In other words, if you go the XML route, make your structures clean and simple or you will have a maintenance nightmare on your hands.
EDIT #2: In reference to Scott's question below, here's a high-level data structure that will avoid duplicated data and doesn't rely on XML for the majority of the form definition.
Caveat: I just put this design together in SQL Management Studio...I only spent 10 minutes on it; developing a flexible form system is not a trivial task, so this is an over-simplification. It does not address the storage of user-entered data, just the definition of the form.
The tables:
Form - top-level form table which contains (as you would guess) the collection of fields that comprise the form.
Field - generic fields that could be reused across forms. For example, you don't want 50 different "Last Name" fields for 50 different forms. Note the "DataTypeId" column: you could store any type you wanted in it, like "number", "free text", or even a value that indicates the user should pick from a list.
FormField - allows a form to contain 0-many fields in its definition. You could even extend this table to indicate that the user can add as many instances of a field as they need.
Constraint - basically a look-up table that defines a constraint type (maybe it's max length, max occurrences, required, etc.).
FormFieldConstraint - relates a constraint to a particular instance of a form field. This table combines a specific form with a specific field with a specific constraint. Note the metadata column; this potentially would be a good use for XML to store the specifics of the constraint.
Essentially, I suggest building a normalized database with few or no null values and no duplicated data. A structure as I've described would get you on the path to that goal.
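If it's easier to read as code, here is the same structure sketched as plain C# classes; the property names are my guesses from the description above, not a definitive design:

// Rough sketch of the described tables as plain C# classes.
public class Form
{
    public int FormId { get; set; }
    public string Name { get; set; }
}

public class Field
{
    public int FieldId { get; set; }
    public string Name { get; set; }
    public int DataTypeId { get; set; }      // e.g. number, free text, pick-from-list
}

public class FormField
{
    public int FormFieldId { get; set; }
    public int FormId { get; set; }
    public int FieldId { get; set; }
    public bool AllowMultiple { get; set; }  // "add as many of this field as needed"
}

public class Constraint
{
    public int ConstraintId { get; set; }
    public string Name { get; set; }         // max length, max occurrences, required, ...
}

public class FormFieldConstraint
{
    public int FormFieldConstraintId { get; set; }
    public int FormFieldId { get; set; }
    public int ConstraintId { get; set; }
    public string Metadata { get; set; }     // could hold XML with the constraint specifics
}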
I think if you need truly dynamic forms saved into a database, you'd have to create a sort of "dictionary" data table.
For example...
UserForms
---------
FormID
FieldName
FieldValue
FormID relates back to the parent form (so you can aggregate all of the fields for one form). FieldName is the name of the field the text was entered into. FieldValue is the value entered/selected for that field.
Obviously this isn't a perfect solution. You could have issues typing your data dynamically, but I leave the implementation of that up to you.
Anyways, hopefully this gives you somewhere to start thinking about how you'd like to accomplish things. Good luck!
P.S. I've found using WebForms with .NET to be a total pain when doing dynamic form generation. In the instances where I had to do it, I ditched them almost entirely and used native HTML elements, then rewired my form by pulling the necessary values from Request. Probably not a perfect solution either, but it worked best for me.
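For what it's worth, here is a minimal sketch of capturing posted values into that dictionary-style table; the names are assumptions, and typing/validation are deliberately ignored:

using System;
using System.Collections.Generic;
using System.Web;

// Minimal sketch: turn posted values into dictionary-style rows (FormID / FieldName / FieldValue).
public class UserFormRow
{
    public int FormId { get; set; }
    public string FieldName { get; set; }
    public string FieldValue { get; set; }
}

public static class DynamicFormCapture
{
    public static IList<UserFormRow> ReadRows(HttpRequest request, int formId)
    {
        var rows = new List<UserFormRow>();

        foreach (string key in request.Form.AllKeys)
        {
            // Skip ASP.NET's own hidden fields such as __VIEWSTATE.
            if (key == null || key.StartsWith("__", StringComparison.Ordinal))
                continue;

            rows.Add(new UserFormRow
            {
                FormId = formId,
                FieldName = key,
                FieldValue = request.Form[key]
            });
        }

        return rows;
    }
}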
We created a forms system like the one you're describing with a schema very similar to the one at the end of Tim's post. It has been pretty complicated, and we really had to wrestle with the standard WebForms controls like the DetailsView and GridView to make them be able to perform CRUD operations on groups of answers. They're used to binding straight to properties on an object, and we're forcing them to look up a field ID in a dictionary first. They don't like it. You may want to consider using MVC.
One tricky question is how to store the answers. We ended up using a table that's keyed on FieldId, InstanceId (for example, if 10 people are filling out your form, there are 10 instances), and RowNumber (because we support multi-row forms for things like employment history). Instead of doing it this way, I would recommend making your AnswerRow a first-class concept with its own table tied to an Instance, and having the answers be linked to the AnswerRow and Field.
In order to support different value types, we had to create multiple answer fields on our answer table (AnswerChar, AnswerDate, AnswerInt, AnswerDecimal). Each of these maps to one or more Field Types (Date, Time, Number, etc.). Each Field Type has its own logic to represent the value in the UI, and put the value into the appropriate property on our answer objects.
All in all, it's been a lot of work. It's worth it for us, since this is the crux of the product, but you will definitely want to keep the cost in mind before embarking on a project like this.
I am struggling with the following use-case:
User amends an existing order. The order is complex - lots of related 'entities' (addresses, post options, suppliers, makes, models, various items, etc.) - across multiple HTTP posts.
User wants to discard the changes.
--
I have an order entity, and as the user edits it I am making various changes to the entity associations, e.g. changing order.address, order.items.add(item)...
In a single post this is fine, but across posts I don't know how best to store the state. If I store the entities, then I cannot save the changes, as they are spread across different data contexts. I have read that it is bad practice to store the data context in session state, i.e. a long-lived context. I can't save changes after each edit/post because then I cannot roll them back (?). I really would like to work with the entities during the editing process rather than doing one big save at the end (taking the UI settings and applying them in one chunk).
This must be a pretty common problem - it's driving me mad. Any help really appreciated.
Cheers!
We have a similar problem where we are building a complex business object through a multi-page wizard.
Instead of creating a partially complete business object at each step of the wizard, we create a dedicated wizard object that looks pretty similar to the business object and populate that through the wizard. At each step, the wizard object is saved to the database. At the end, the user can accept it, at which point it is converted into a real business object and becomes visible to everyone else; or they can bin it, and no one else ever knows it existed.
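A very rough sketch of that shape, with hypothetical names (the draft is persisted between posts and either converted or binned at the end):

// Hypothetical draft object that mirrors the real order and survives across posts.
public class OrderDraft
{
    public int OrderDraftId { get; set; }
    public int UserId { get; set; }
    public string AddressData { get; set; }   // or separate draft tables/columns
    public string ItemsData { get; set; }
    public bool IsCompleted { get; set; }
}

public interface IOrderDraftStore
{
    OrderDraft Load(int draftId);
    void Save(OrderDraft draft);   // called at the end of each post
    void Delete(int draftId);      // "bin it" - discard the changes
}

public class OrderCheckout
{
    private readonly IOrderDraftStore store;

    public OrderCheckout(IOrderDraftStore store)
    {
        this.store = store;
    }

    public void Accept(int draftId)
    {
        OrderDraft draft = store.Load(draftId);
        // Convert the draft into the real Order entity here, save it with a fresh,
        // short-lived context, then remove the draft.
        store.Delete(draftId);
    }

    public void Discard(int draftId)
    {
        store.Delete(draftId);
    }
}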
If this kind of approach is not suitable, I suspect you're looking at some kind of difference tracking, either at the entity or the database level. Neither is simple to implement, work with, or manage in a system. The former would involve calculating and storing the n changes to the entities and developing an algorithm to undo them; the latter depends on your RDBMS, but might include versioned rows or similar.
Yes, it's a pretty common situation for us. In most scenarios we use the MVC approach. Even without an actual ASP.NET MVC project, we use a similar ViewModel with our Views/Pages/Scenarios etc., where there is no direct/single entity mapping to the Business Layer (in other words, Business.Entities). This is pretty similar to DTOs.
It is always easier to use disconnected EF. We retrieve data and discard the context, then transform the entities into ViewModels/DTOs if necessary. When you need to persist changes, all you have to do is create a new context, find the latest entity instance, and apply the changes.
The Views/Pages/Controllers will manage these ViewModels/DTOs. Tracking changed and deleted content can be done by introducing a HistoryList<T> (you could extend a List<T> to implement this), as sketched below.
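A minimal sketch of what such a HistoryList<T> could look like (an illustration only, not a full change tracker):

using System.Collections.Generic;

// Minimal sketch: a List<T> that also remembers what was removed, so deletions
// can be replayed against a new context later.
public class HistoryList<T> : List<T>
{
    private readonly List<T> removedItems = new List<T>();

    public IEnumerable<T> RemovedItems
    {
        get { return removedItems; }
    }

    public new bool Remove(T item)
    {
        bool removed = base.Remove(item);
        if (removed)
        {
            removedItems.Add(item);
        }
        return removed;
    }
}

Note that hiding List<T>.Remove with new only kicks in when callers use the HistoryList<T> type directly; a fuller version would probably wrap a list rather than inherit from it.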
Once that's done, a Controller/Workflow/Component can inspect the ViewModel/DTO and apply the necessary changes to your entities, using a new context to retrieve and persist them.
It involves a bit of coding, and I would say it's not a perfect solution, since it has its own pros and cons.
/KP
As the title is a bit self-explanatory, I want to achieve the goal of a multi-language website. I am using an Entity Data Model with MS SQL 2005. I came up with the idea of making a separate database for each language, with exactly the same model and relations, so I can use the Entities constructor that takes a connection string and switch the site to the selected language.
I am using an ascx as the language control, which fires an event; the parent aspx gets the selected language as an integer (from the event args) and calls a method containing the same LINQ queries, but the Entity context is created with the connection string of that language's database.
This is the only solution I could come up with, because I think adding a new language will require a copy of the English database, imported into Access and sent to the translator, then exported back, after which the model will (HOPEFULLY) still fit.
My question is whether this is really a good approach, or whether I am missing something that will create a greater hassle for me. Thanks in advance.
A multi-database setup is not a good solution as soon as entities in the different databases have relations to each other. Generally, a good approach is to work with labels in one default language. These labels can either be in a well-defined format (e.g. 'LABEL.TEXT_HELLO') or just in the base language (e.g. 'Hello World').
So all you have to do is build a table of translations where the base-language text is the key and, hopefully, each key has a value containing the translation. Once you have the translations, you can write a method on the frontend that renders the labels in the language used by the user.
In Zend Framework, for example, you write <h1><?= $this->translate('Hello World'); ?></h1> instead of just <h1>Hello World</h1>.
The good thing about this is that if a translation is missing, you can still use the fallback (in this case English) to show the user at least something.
That way, you can manage your app in one database, and users who speak several languages do not have to switch between applications and content.
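In C# terms, the translate-with-fallback idea could look roughly like this; the dictionary shape and names are assumptions, and in practice it would be loaded from the translations table and cached:

using System.Collections.Generic;

// Rough sketch: look the base-language label up in a per-language dictionary and
// fall back to the label itself if no translation exists.
public class Translator
{
    // languageCode -> (baseText -> translatedText), e.g. loaded from the translations table.
    private readonly IDictionary<string, IDictionary<string, string>> translations;

    public Translator(IDictionary<string, IDictionary<string, string>> translations)
    {
        this.translations = translations;
    }

    public string Translate(string baseText, string languageCode)
    {
        IDictionary<string, string> forLanguage;
        string translated;

        if (translations.TryGetValue(languageCode, out forLanguage) &&
            forLanguage.TryGetValue(baseText, out translated))
        {
            return translated;
        }

        return baseText; // fallback: show the base language rather than nothing
    }
}

// Usage: translator.Translate("Hello World", "fr") would return the French text
// if that pair exists in the table, otherwise "Hello World".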
cheers
My approach: create a table Language that lists all the available languages. Relate each table that should be localized to Language. Now, you can easily access the localized content e.g.
Content[content_ID].HeadLine.Where(hl => hl.Language.id == "en-US")
I look forward to seeing what other people suggest, as I am still learning DB design and the EDM myself.
OK, if you want to be able to easily implement a new language, then reinventing the internationalization features already built in to ASP.NET is not the way to go, because it isn't "easy".
At least, not as easy as using a satellite resource DLL. Your translators will use off-the-shelf tooling to translate your resources, and ASP.NET will automatically select the correct DLL based on the user's current culture.
Read up on ASP.NET internationalization/globalization features; there's no need to invent your own.
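As a small illustration of the built-in approach (the base name and resource key below are assumptions), a ResourceManager picks the right satellite assembly for the current UI culture automatically:

using System.Resources;
using System.Threading;

// Minimal sketch: "MyApp.Resources.Strings" and "PrintThisArticle" are assumed names.
public static class Localization
{
    private static readonly ResourceManager resources =
        new ResourceManager("MyApp.Resources.Strings", typeof(Localization).Assembly);

    public static string PrintThisArticle
    {
        get
        {
            // The thread's current UI culture (e.g. "fr-FR") selects the French
            // satellite DLL if one is deployed, otherwise the neutral resources.
            return resources.GetString("PrintThisArticle", Thread.CurrentThread.CurrentUICulture);
        }
    }
}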
How should I store (and present) the text on a website intended for worldwide use, with several languages? The content is mostly in the form of 500+ word articles, although I will need to translate tiny snippets of text on each page too (such as "print this article" or "back to menu").
I know there are several CMS packages that handle multiple languages, but I have to integrate with our existing ASP systems too, so I am ignoring such solutions.
One concern I have is that Google should be able to find the pages, even for foreign users. I am less concerned about issues with processing dates and currencies.
I worry that, left to my own devices, I will invent a way of doing this which works, but eventually leads to disaster! I want to know what professional solutions you have actually used on real projects, not untried ideas! Thanks very much.
I looked at RESX files, but felt they were unsuitable for all but the most trivial translation solutions (I will elaborate if anyone wants to know).
Google will help me with translating the text, but not storing/presenting it.
Has anyone worked on a multi-language project that relied on their own code for presentation?
Any thoughts on serving up content in the following ways, and which is best?
http://www.website.com/text/view.asp?id=12345&lang=fr
http://www.website.com/text/12345/bonjour_mes_amis.htm
http://fr.website.com/text/12345
(These are not real URLs; I was just showing examples.)
Firstly, put all languages under one domain - it will help your Google ranking.
We have a fully multi-lingual system, with localisations stored in a database but cached with the web application.
Wherever we want a localisation to appear we use:
<%$ Resources: LanguageProvider, Path/To/Localisation %>
Then in our web.config:
<globalization resourceProviderFactoryType="FactoryClassName, AssemblyName"/>
FactoryClassName then implements ResourceProviderFactory to provide the actual dynamic functionality. Localisations are stored in the DB with a string key "Path/To/Localisation"
It is important to cache the localised values - you don't want to have lots of DB lookups on each page, and we cache thousands of localised strings with no performance issues.
Use the user's current browser localisation to choose what language to serve up.
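To make the moving parts a bit more concrete, here is a cut-down sketch of such a factory/provider pair; the DB access is only hinted at (LoadFromDatabase is a hypothetical helper) and the cache is deliberately simplistic (not thread-safe):

using System.Collections.Generic;
using System.Globalization;
using System.Resources;
using System.Web.Compilation;

// Cut-down sketch of the factory referenced from web.config above.
public class DatabaseResourceProviderFactory : ResourceProviderFactory
{
    public override IResourceProvider CreateGlobalResourceProvider(string classKey)
    {
        return new DatabaseResourceProvider(classKey);
    }

    public override IResourceProvider CreateLocalResourceProvider(string virtualPath)
    {
        return new DatabaseResourceProvider(virtualPath);
    }
}

public class DatabaseResourceProvider : IResourceProvider
{
    private readonly string classKey;

    // Simplistic in-memory cache of culture|key -> localised value.
    private static readonly Dictionary<string, string> cache = new Dictionary<string, string>();

    public DatabaseResourceProvider(string classKey)
    {
        this.classKey = classKey;
    }

    public object GetObject(string resourceKey, CultureInfo culture)
    {
        string cultureName = (culture ?? CultureInfo.CurrentUICulture).Name;
        string cacheKey = cultureName + "|" + classKey + "|" + resourceKey;

        string value;
        if (!cache.TryGetValue(cacheKey, out value))
        {
            value = LoadFromDatabase(resourceKey, cultureName); // hypothetical DB lookup
            cache[cacheKey] = value;
        }
        return value;
    }

    public IResourceReader ResourceReader
    {
        // Only needed for implicit/local resource expressions; omitted in this sketch.
        get { return null; }
    }

    private static string LoadFromDatabase(string resourceKey, string cultureName)
    {
        // Placeholder: query the localisation table by key and culture here.
        return resourceKey;
    }
}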
You might want to check out the GNU Gettext project - at least as something to start with.
Edited to add info about projects:
I've worked on several multilingual projects using Gettext in different technologies, including C++/MFC and J2EE/JSP, and it all worked fine. However, you do of course need to write or find your own code to display the localized data.
If you are using .Net, I would recommend going with one or more resource files (.resx). There is plenty of documentation on this on MSDN.
As with most general programming questions, it depends on your needs.
For static text, I would use RESX files. For me, as .Net programmer, they are easy to use and the .Net Framework has good support for them.
For any dynamic text, I tend to store that information in the database, especially if the site maintainer is going to be a non-developer. In the past I've used two approaches: adding a language column and creating different entries for the different languages, or creating a separate table to store the language-specific text.
The table for the first approach might look something like this:
Article Id | Language Id | Language Specific Article Text | Created By | Created Date
This works for situations where you can create different entries for a given article and you don't need to keep any data associated with these different entries in sync (such as an Updated timestamp).
The other approach is to have two separate tables: one for the non-language-specific data (id, created date, created user, updated date, etc.) and another containing the language-specific text. The tables might look something like this:
First Table: Article Id | Created By | Created Date | Updated By | Updated Date
Second Table: Article Id | Language Id | Language Specific Article Text
For me, the question comes down to updating the non-language dependent data. If you are updating that data then I would lean towards the second approach, otherwise I would go with the first approach as I view that as simpler (can't forget the KISS principle).
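As a small sketch of reading with the second approach (entity and property names are my assumptions about the tables above):

using System.Linq;

// Mirrors the second (language-specific) table described above.
public class ArticleText
{
    public int ArticleId { get; set; }
    public int LanguageId { get; set; }
    public string LanguageSpecificArticleText { get; set; }
}

public static class ArticleQueries
{
    public static string GetArticleText(IQueryable<ArticleText> articleTexts, int articleId, int languageId)
    {
        return articleTexts
            .Where(t => t.ArticleId == articleId && t.LanguageId == languageId)
            .Select(t => t.LanguageSpecificArticleText)
            .FirstOrDefault(); // null means no translation stored for that language
    }
}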
If you're just worried about the article content being translated and don't need a fully integrated option, I have used Google Translate in the past and it works great on a smaller scale.
Wonderful question.
I solved this problem for the website I made (link in my profile) with a homemade Python 3 script that translates the general template on the fly and inserts a specific content page in the requested language (or the one guessed by Apache from Accept-Language).
It was fun, since I got to learn Python and write my own mini-library for creating content pages. One downside was that our hosting didn't have Python 3, but I made my script generate static HTML (the original one examined the User-Agent) and then uploaded it to the server. That has worked so far, and making a new language version of the site is now a breeze :)
The biggest downside of this method is that it is time-consuming to write things from scratch. So if you want, drop me a line and I'll help you use my script :)
As for the URL format, I use site.com/content/example.fr, since this allows Apache to perform language negotiation when somebody asks for /content/example and their browser says it prefers French. When you do this, Apache also adds .html or whatever as a bonus.
So when a request comes in for example and I have the files
example.fr
example.en
example.vi
Apache will automatically serve example.vi to a person with a Vietnamese-configured browser, or example.en to a person with a German-configured browser. Pretty useful.