How to deal with complex databases using an ORM on Android? - sqlite

I can't find out how to deal properly with complex databases using an ORM on Android. I tried to find an open source project to see how it works, but I can't find one that suits what I'm looking for.
I learned about relational databases some years ago and worked on huge SQL Server and Oracle databases. The first thing I learned when designing a database is to avoid storing the same data several times. The second thing I learned is never to do in code what you can do in SQL. So I'm facing several problems with Android and ORMs, since it looks like you absolutely have to use an ORM on Android to be a good developer.
Let's take an example: say we have 100 buildings with 50 people in each building, and every building has a different address. I want to get all people with their building address. I can't put this in one table, or the same strings will exist many times in the database: since there are 50 people at each address, a single table would store each address string 50 times. So I create another table with only buildings and define a relationship between the two tables. This is a trivial case, but I have often seen Android apps store the same data many times in one or two tables. What's the point of using a relational database if you replicate data?
Again, this is a simplistic example, but when you have 20 or 30 tables with complex relationships, ORM-style queries can quickly become unreadable compared to SQL. Moreover, ORMs generally do not support all SQL join types. You then fall back on raw SQL queries, but what's the point of using an ORM in that case? You can't use the object mapping, because you are not returning a table you can map to a class but the result of a query. Or maybe there is something I didn't understand. What's the point of using an ORM if you don't use the object-relational mapping advantage, or if it makes the queries hard to maintain?
I have also seen a lot of code where the ORM is used to fetch data from several tables and the filtering and joining is then done in code. What's the point of using a relational database if you have to do this in code? Some years ago this was seen as the worst thing to do, but now I see it all the time on Android.
So another solution is to create a view in the database and map my object to this view. That way I can use both the power of SQL and the object-relational mapping of the ORM. But several ORMs don't support views, such as GreenDao, which is one of the most widely used ORMs today as far as I know.
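To make the view idea concrete, here is a minimal sketch in Python with the standard `sqlite3` module (not Android code and not any particular ORM's API). The table, view, and class names are made up for illustration; the point is only that the join lives in SQL while the application maps one flat row to one object.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PersonWithAddress:
    """Read-only object mapped to one row of the person_with_address view."""
    person_id: int
    name: str
    address: str


def build_demo_db() -> sqlite3.Connection:
    """Create a normalized schema where each address is stored exactly once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE building (
            id      INTEGER PRIMARY KEY,
            address TEXT NOT NULL
        );
        CREATE TABLE person (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            building_id INTEGER NOT NULL REFERENCES building(id)
        );
        -- The join lives in SQL; application code only sees the flat view.
        CREATE VIEW person_with_address AS
            SELECT p.id AS person_id, p.name AS name, b.address AS address
            FROM person AS p
            JOIN building AS b ON b.id = p.building_id;
        """
    )
    conn.executemany(
        "INSERT INTO building (id, address) VALUES (?, ?)",
        [(1, "1 Main Street"), (2, "2 Oak Avenue")],
    )
    conn.executemany(
        "INSERT INTO person (id, name, building_id) VALUES (?, ?, ?)",
        [(1, "Alice", 1), (2, "Bob", 1), (3, "Carol", 2)],
    )
    conn.commit()
    return conn


def load_people(conn: sqlite3.Connection) -> list[PersonWithAddress]:
    """Map every row of the view onto a PersonWithAddress object."""
    rows = conn.execute(
        "SELECT person_id, name, address FROM person_with_address ORDER BY person_id"
    )
    return [PersonWithAddress(*row) for row in rows]
```

Calling `load_people(build_demo_db())` returns three objects, while the `building` table still holds each address string only once.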
None of the examples I can find deal with complex databases, or they rely on this kind of bad practice. Or at least it was considered bad practice for years. Has that changed?
So what's the best way to deal with "complex" databases on Android?

Related

Using LMDB to implement a sqlite-alike relational database, relevant resources?

For educational reasons I want to build a full, functional relational database. I'm aware LMDB has been used as a storage backend for sqlite, but I don't know C. I'm on .NET, and I'm not interested in just duplicating a "traditional" RDBMS (so, for example, I'm not worried about implementing a SQL parser, since I'm building my own custom scripting language), but in exposing the full relational model.
Consider this question similar to "How do I implement a programming language on top of LLVM?" before wondering why I'm not using sqlite or similar.
From the material I've read, LMDB looks great, especially because it provides transactions and reliability, plus the low-level plumbing. How that translates to changes that touch several rows in several tables is another question.
Is there material that explains how a relational layer is implemented on top of something like LMDB? Is using LMDB (or its competitors) good enough, or is there a better way to get results?
Is it possible to use LMDB to store other structures such as hash tables, arrays and (the one I'm most interested in, for a columnar database) bitmap arrays, i.e., similar to redis?
P.S.: Is there a forum or another place to discuss this subject further?
I had this idea too. You should realize that this is tons of work and most likely no one will care. I haven't built a full-blown relational database, as that is a crazy amount of work for one person. You could check it out here
Anyway, I've used leveldb (and later rocksdb), which gives you key-value pairs sorted by key, the ability to get a value by key, iteration over keys, atomic writes of many values (WriteBatch), and a consistent view of the data at a given time (snapshots). These features are enough to build correct thread-safe reading of table rows (using snapshots), correct all-or-nothing writing of data and related indexes (using WriteBatch), and even transactions.
Each column has its on-disk index (keys sorted by values), so you can efficiently perform various operations on it, and the keys are also stored with the values themselves, so you can efficiently read the values for a given id.
This setup is efficient for writing and reading with the available operations on tables with little data (say, fewer than a million rows). However, as a table grows, iterating over many keys can become slow. To solve this and to add a group-by statement, I decided to add in-memory indexes, but that's another story. So all in all it might be a fun idea, but in reality it is a lot of work with often frustrating results. Why would you want to do that?
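The row-plus-index layout described in this answer can be sketched roughly as follows. This is an illustration in Python, not the answerer's code: `SortedKV` is a hypothetical in-memory stand-in for an ordered store such as leveldb, the key formats are invented, and the batch write is not actually atomic here. It also assumes column values are strings without `/` in them.

```python
import bisect


class SortedKV:
    """Tiny in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._keys = []   # sorted list of keys
        self._data = {}   # key -> value

    def write_batch(self, puts):
        """Apply all puts together; a real store would make this atomic."""
        for key, value in puts:
            if key not in self._data:
                bisect.insort(self._keys, key)
            self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._data[key]


def insert_row(store, table, row_id, row):
    """Write the row and one index entry per column in a single batch."""
    puts = [(f"row/{table}/{row_id:08d}", row)]
    for column, value in row.items():
        # Index key sorts by column value first, then by row id.
        puts.append((f"idx/{table}/{column}/{value}/{row_id:08d}", row_id))
    store.write_batch(puts)


def find_by(store, table, column, value):
    """Look up rows through the column index instead of scanning the table."""
    prefix = f"idx/{table}/{column}/{value}/"
    return [store.get(f"row/{table}/{row_id:08d}")
            for _, row_id in store.scan_prefix(prefix)]
```

After inserting a few rows with `insert_row(store, "people", 1, {"name": "Ann", "city": "Paris"})`, a call like `find_by(store, "people", "city", "Paris")` only touches the index entries under that prefix plus the matching rows.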

What's a good strategy to move data from a SQL database to Google NDB?

Hello there fellow netizens,
I have a SQL database (about 600MB) that I want to import into my GAE app. I know that one possibility would be to simply use Google Cloud SQL, but I'd rather have the data available in NDB to get the benefits thereof. So I'm wondering, how should I think about converting the SQL schema into an NDB schemaless structure? Should I simply set up Kinds to mirror each table? How ought I deal with foreign keys that relate different tables?
Any pointers are greatly appreciated!
- Lee
How should I think about converting the SQL schema into a NDB schemaless structure?
If you are planning to transfer your SQL data to the Datastore, you need to think about how these two systems are very different.
Should I simply set up Kinds to mirror each table?
In thinking about making this transfer, simple analogies like this will only get you so far. Thinking SQL on a schemaless DB can get you in serious trouble due to the difference in implementation, even if at first it helps to think of a Kind as a table, Entity properties as columns, etc... In short, no, you should not simply set up Kinds to mirror each table. You could, but it depends what kind of operations you want to support on these entities, how often these ops will occur, what kind of queries your system relies on, etc...
How ought I deal with foreign keys that relate different tables?
Honestly, if you're looking to use MySQL-specific features like foreign keys, your data model will require a lot of rethinking. A "foreign key" could be as little as keeping, in an Entity of one Kind, a key reference to an Entity of the other Kind.
I would suggest that you stick with Cloud SQL if your data storage solution is already built in SQL, unless you are willing to A) rethink your whole data model, B) implement the new data model, C) transfer the data you currently have, and D) rewrite all code that interacts with data storage (unless you use an ORM, in which case your life might be easier in this respect).
Depending how complex your SQL db is, and how much time you feel it will take to migrate to Datastore, and how much time/brainpower you are willing to commit to learning a new system and new ways of thinking, you should either stick with SQL or do the above steps to rebuild your data storage solution.

ADO.NET Entity Framework or ADO.NET

I'm starting a new project based on ASP.NET and Windows server.
The application is planned to be pretty big and to serve a large number of clients pulling and updating frequently changing data.
I have previously created projects with Linq-To-Sql or with Ado.Net.
My plan for this project is to use VS2010 and the new EF4 framework.
It would be great to hear other programmers' opinions about development with Entity Framework:
Pros and cons from previous experience?
Do you think EF4 is ready for production?
Should I take the risk or just stick with plain old ADO.NET?
Whether EF4 is really ready for production is a bit hard to say since it's not officially been released yet.... but all the preliminary experiences and reports about it seem to indicate it's quite good.
However: you need to take into consideration what EF is trying to solve; it's a two-layer approach, one layer maps to your physical storage schema in your database (and supports multiple backends), and the second layer is your conceptual model you program against. And of course, there's the need for a mapping between those two layers.
So EF4 is great if you have a large number of tables, if you have multiple backends to support, if you need to be able to map a physical schema to a different conceptual schema, and so forth. It's great for complex enterprise level applications.
BUT that comes at a cost - those extra layers do have an impact on performance, complexity, maintainability. If you need those features, you'll be happy to pay that price, no question. But do you need that??
Sure, you could go back to straight ADO.NET - but do you really want to fiddle around with DataTables, DataRows, and untyped Row["RowName"] constructs again?? REALLY???
So my recommendation would be this:
if you need only SQL Server as your backend
if you have a fairly simple and straightforward mapping of one database table to one entity object in your model
then: use Linq-to-SQL ! Why not?? It's still totally supported by Microsoft in .NET 4 - heck, they even did bugfixes and added a few bits and pieces - it's fast, it's efficient, it's lean and mean - so why not??
My advice is to use both. At first I thought I would only use LINQ to SQL and never have to touch ADO.NET ever again (which made me happy, lol).
Now I am using both because there are some things LINQ to SQL (and any ORM like EF) can't do. I had to do some mass inserts; I first did them with LINQ to SQL, and 500 records took over 6 minutes (2 minutes for validation rules, the rest for inserting into the database).
I changed it to SqlBulkCopy and now it is down to 2 minutes and 4 seconds (4 seconds to do all the inserts).
But like marc_s said, I really did not want to fiddle around with DataTables, DataRows, and untyped Row["RowName"].
Say my table had 10 columns and was called TableA. I used LINQ to SQL to create a TableA object (new TableA()) and populated it with data. I would then pass this object to a method that created the DataRow.
So LINQ to SQL saved me some time, because I probably would have written a class anyway: I would not have wanted to pass 10 parameters into the method that builds the DataRow. I also feel it gives back a bit of type safety, as you have to pass in the right object to use that method, so there is less chance of passing in the wrong data.
Finally, you can still use LINQ to SQL to call stored procedures, and that is about one line of code.
So I would use both: whenever you notice that LINQ to SQL (or in your case EF) is slow, just write a stored procedure and call it through EF. If you need straight ADO.NET, evaluate what you need to do; maybe you can use EF for most of the code (so you can at least work with objects) and ADO.NET only for that small portion, much like what I did with SqlBulkCopy.
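The same pattern (typed objects for the application code, one bulk operation for the write) can be sketched outside .NET. The example below uses Python and `sqlite3` purely as an analogy; it is not SqlBulkCopy, and `TableARow`, its columns, and `make_db` are hypothetical names chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class TableARow:
    """Typed stand-in for the generated TableA class (hypothetical columns)."""
    id: int
    name: str
    amount: float


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the demo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    return conn


def bulk_insert(conn: sqlite3.Connection, rows: list[TableARow]) -> int:
    """Insert all rows in one transaction with a single prepared statement."""
    with conn:  # commits once on success, rolls back on error
        cursor = conn.executemany(
            "INSERT INTO table_a (id, name, amount) VALUES (?, ?, ?)",
            (astuple(row) for row in rows),
        )
    return cursor.rowcount
```

Because `bulk_insert` only accepts `TableARow` objects, callers keep working with typed data, while the database sees a single batched write instead of one round trip and one commit per record.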
EF 4 is now more similar to LINQ to SQL, in the good ways: it has the foreign keys right in the object, it has add methods right in the object sets, and it has a lot of other nice features. The designer is much improved, and the major plus is that it works with SQL Server and Oracle, and maybe some others (as long as the provider supports it), whereas LINQ to SQL works only with SQL Server.
EF is the future: ADO.NET Data Services is a web service add-on for it, it supports POCO and T4 generation, and any new features will build on it (LINQ to SQL is in maintenance mode only, and data sets won't be getting any more changes).
HTH.

Architectural Design DAL Layer

I am working on the architecture of a mid-sized web application, and for my DAL I have 3 options:
1) Traditional stored-procedure-based architecture (using the NTiers template of CodeSmith)
2) LINQ to SQL (or the PLINQO template of CodeSmith)
3) LINQ to Entities
Of these, LINQ to Entities is out of reach: we need to start the application very quickly, we don't have sufficient skills for it, and since the team has never worked with any OR/M tool, the learning curve would be steep (this is what I read somewhere).
I prefer to go ahead with LINQ to SQL. My only fear is that Microsoft is not going to support or enhance LINQ to SQL further, but from my point of view that is not an issue, as it already has every feature my project requires.
Now my question is: should I use LINQ to SQL or stick with the traditional architecture?
Or is there any other option?
EDIT: I am going to use SQL Server as the database, and the application does not need to interact with any other database.
One of the most important objectives in designing the DAL is fast development and maintainability in the face of future database table changes, as fields may be added or removed in the future.
Also, if you feel that some ORM tool is really good and does not have a steep learning curve, we can use that as well.
Please provide suggestions.
As you are working on a medium-sized project, I would suggest LINQ to SQL because of these advantages.
Advantages of using LINQ to SQL:
• No magic strings, like you have in SQL queries
• Intellisense
• Compile-time check when the database changes
• Faster development
• Unit of work pattern (context)
• Auto-generated domain objects that are usable in small projects
• Lazy loading
• Learning to write LINQ queries/lambdas is a must for .NET developers anyway
Regarding performance:
• Most likely performance is not going to be a problem in most solutions. Premature optimization is an anti-pattern. If you later see that some areas of the application are too slow, you can analyze these parts, and in some cases even swap some LINQ queries for stored procedures or ADO.NET.
• In many cases the lazy loading feature can speed things up, or at least simplify the code a lot.
Regarding debugging:
• In my opinion debugging Linq2Sql is much easier than debugging either stored procedures or ADO.NET. I recommend that you take a look at Linq2Sql Debug Visualizer, which lets you see the query and even execute it to see the result while debugging.
• You can also configure the context to write all SQL queries to the console window, more information here
Regarding another layer:
• Linq2Sql can be seen as another layer, but it is purely a data access layer. Stored procedures are also another layer of code, and I have seen many cases where part of the business logic has been implemented in stored procedures. This is much worse in my opinion, because you are then splitting the business layer across two places, and it becomes harder for developers to get a clear view of the business domain.
There is no absolutely preferred way of writing a DAL. These are all options. Which one to choose depends on your project, your skills and your inclinations.
Normally, with LINQ you can expect to be more productive. On the other hand, a DAL built with stored procedures can be expected to perform faster.
The issue only arises when you need specific queries to be blazingly fast and the default LINQ to SQL provider cannot generate them that way. In that case you will have to plug your custom stored procedures into your LINQ code where needed.
Regarding LINQ to SQL support and further development: it was shelved a long time ago, so there is no further official development. Note that this applies to LINQ to SQL as a relational solution (EF will take over that role), not to the core LINQ functionality.
Entity Framework in its v1 received heavy criticism. You're advised to wait until v2 comes out.
The most important limitation of LINQ to SQL (compared with Entity Framework or any other popular ORM) is that it doesn't support 1 to n mappings. That is, each of your LINQ to SQL classes can only map to a single table; it cannot represent some sort of view over several others. Maybe that's not important to you, but maybe it is. It depends on your project.
The argument of stored procedures vs ORM's is long-standing and unlikely to be resolved any time soon. My recommendation would be to go with an ORM (Linq-to-Sql in your case).
Yes, stored procedures will always be faster since the queries are precompiled. The real question you have to ask yourself is whether your system is so performance-intensive that your users will actually notice the difference. Keep in mind that using stored procedures means you will need to write all your queries by hand, whereas an ORM does this for you. This usually means that an ORM will speed up your development.
Since you mention that speeding up development time is one of your goals I would recommend Linq-to-Sql - otherwise you will basically write the entire DAL yourself.
All of the options you've provided have significant drawbacks. None of them meet the requirements you've set out.
You need to prioritize what is most important for you.
If learning curve is your biggest issue, stay away from all ORMs if you are already comfortable with ADO.NET, DataTables, etc.
If development speed is your biggest issue, you should learn an ORM and go that route. The easiest ORM to recommend is NHibernate. Every other ORM has significant weaknesses. NHibernate works in the vast majority of projects, whereas other ORMs are much more situationally appropriate (depending on your DB design, model design, agility requirements, legacy schema support, etc.). All ORMs have learning curves, they just come into play at different times and in different ways.
Just to expand on @Developer Art's answer: using the traditional stored procedure approach enables you to put business logic in the database. Usually you will want to avoid this, but sometimes it is necessary. Not to mention that you can also enforce constraints and permissions at the database level with this approach. It all depends on your requirements.
With the limitations mentioned, I would say just stick to ad hoc/custom queries and ADO.NET and not go for any jazzy stuff. Also, the notion that a stored-procedure-based DAL is faster rests on lame arguments such as "stored procedures are precompiled", but they are not. All they have is a query plan cache. So the less you invest in stored procedures, the better off you are. My advice: ADO.NET and custom dynamic queries constructed from entity objects.

ORMs that generate database structures from classes

I have looked at NHibernate and EntitySpaces and they both seem to work differently.
In EntitySpaces, you define the database tables and table relationships and the classes are generated for you.
In NHibernate, you define the classes and the table relationships are generated for you. This is what I am looking for.
Are there any other ASP.NET ORMs that generate tables from classes like NHibernate?
Any recommendations?
DataObjects.Net also uses "Code first" (Model first) approach.
See http://wiki.dataobjects.net/index.php?title=Features
Linq to SQL can create the database table structures and relationships from the classes, with the dataContext.CreateDatabase() method.
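As a language-neutral illustration of what "generating tables from classes" means, here is a small Python sketch that derives a CREATE TABLE statement from a class definition. It does not show how CreateDatabase() or any of the ORMs mentioned here work internally; the `SQL_TYPES` mapping, the `Customer` class, and the "a field named id is the primary key" convention are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Customer:
    id: int
    name: str
    balance: float


def create_table_sql(model) -> str:
    """Derive a CREATE TABLE statement from a dataclass definition."""
    columns = []
    for field in fields(model):
        column = f"{field.name} {SQL_TYPES[field.type]}"
        if field.name == "id":  # convention assumed here: 'id' is the key
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {model.__name__.lower()} ({', '.join(columns)})"


def create_schema(conn: sqlite3.Connection, *models) -> None:
    """Create one table per model class."""
    for model in models:
        conn.execute(create_table_sql(model))
```

With this approach the class is the single source of truth: `create_schema(sqlite3.connect(":memory:"), Customer)` builds the table, and adding a field to `Customer` changes the generated schema.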
Mindscape LightSpeed offers this ability as part of complete schema round-tripping.
Hope this helps
http://www.mindscape.co.nz/blog/index.php/2008/06/17/schema-round-tripping-in-the-lightspeed-designer/
I also prefer an approach where I have full control over what gets generated. With most ORMs I get classes that match my tables. I believe that using my own domain objects, derived from my business and not from the underlying data store, is the right way to go. The class hierarchies that represent my business data should be 100% independent of the data store.
LightSpeed has a really good Visual Studio designer that supports both generating .NET entity classes from the database and updating the database from your .NET entities.
This is something that NHibernate does.
And on the subject that Draemon started: my personal view is that unless performance is your absolute first priority and all other things must suffer to make that happen (e.g. when writing software for a manufacturing fab), you will be better off working on the domain model first.
My reasoning: you spend a lot more time coding against the domain than you do against the database itself -- especially when using an orm. So spend that time wisely.
I had fairly good success working with Genome ORM. It does many jobs for you. You can first design your domain model and then generate the DB scripts from it. Besides this, Genome generates DTOs for you. It is pretty good at that and saves developers a lot of time.
http://www.genom-e.com
