Should I make multiple SQLite databases for better concurrency? - sqlite

I'm very new to SQL and relational databases (I just started learning last week). I'm in the process of upgrading my website, which currently keeps all its data in XML files. That works, but from what I hear the new site would be better served by a relational database, and SQLite looks like the best fit for me. One of my concerns is concurrency. About 99% of the data will be read-only 99% of the time, which I understand SQLite is pretty good at, but other things, like page view counters for certain pages, will constantly require small writes. I'm still learning database design and want to do it right. Would it make sense to make separate databases for things that get written to a lot, so that the main database is far less susceptible to concurrency issues? Is it possible to do a "foreign key" type reference across databases (I haven't used foreign keys yet, but I think I understand them), since each view count would point to some primary key in the main database? Thanks for any help!

SQLite is good to use in embedded systems (like mobile phones and tablets) and small desktop applications (Chrome, Firefox, Thunderbird, etc). However, when you need to have many concurrent readers and writers (typical for websites), you should not use it.
Splitting your data across many databases also adds a lot of operational overhead. For example, it will be difficult to join data from different databases: you must use ATTACH, and by default you can only ATTACH up to 10 databases. And the concurrency issues will still not go away 100%.
Instead, use a real database server like PostgreSQL or MySQL. Not only will it be faster, these databases also provide real concurrent access to your data over the network, which SQLite cannot do.
My personal preference is PostgreSQL, but if your web hosting does not provide PostgreSQL, you can use MySQL. In that case, please use a fully transactional engine like InnoDB.
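For reference, the ATTACH approach mentioned above can be sketched roughly as follows, using Python's built-in sqlite3 module. The file names (main.db, counters.db), the table names and the record_view helper are all made up for this illustration. The page_id link is only a convention here: as far as I know, SQLite does not enforce foreign key constraints across database files.

```python
import os
import sqlite3
import tempfile

# Hypothetical file and table names, chosen only for this illustration.
workdir = tempfile.mkdtemp()
main_path = os.path.join(workdir, "main.db")
counters_path = os.path.join(workdir, "counters.db")

conn = sqlite3.connect(main_path)
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO pages (id, title) VALUES (?, ?)",
    [(1, "Home"), (2, "About")],
)
conn.commit()

# ATTACH makes the second file visible under the schema name "counters".
conn.execute("ATTACH DATABASE ? AS counters", (counters_path,))
conn.execute(
    "CREATE TABLE counters.page_views ("
    "page_id INTEGER PRIMARY KEY, views INTEGER NOT NULL DEFAULT 0)"
)


def record_view(page_id):
    """Increment the counter for a page; the write only touches counters.db."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE counters.page_views SET views = views + 1 WHERE page_id = ?",
            (page_id,),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO counters.page_views (page_id, views) VALUES (?, 1)",
                (page_id,),
            )


record_view(1)
record_view(1)

# A join across the two files, using the attached schema name as a prefix.
rows = conn.execute(
    "SELECT p.title, COALESCE(v.views, 0) "
    "FROM pages AS p "
    "LEFT JOIN counters.page_views AS v ON v.page_id = p.id "
    "ORDER BY p.id"
).fetchall()
print(rows)
```

Because the counter writes go to a separate file, they should not take a write lock on main.db, but every query that needs both files has to go through a connection that has attached them.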

Related

SQL When to create a new database?

I have three different applications, they all share the ASP.NET membership aspect of the database and almost definitely they won't share anything else.
Should I have a separate database for each of the applications, or would one suffice?
All the application tables are prefixed, so that wouldn't be a problem in integration. Although I was wondering if there would be any performance issues, or if having all three applications share the same database would be some kind of grave mistake.
The applications in question are three web applications: the "main site", a forum and a bug tracker. I'm wondering if this is viable because integration could be easier if I had a single database. For instance, the bug tracker registers the ASP.NET membership tables in its own db connection, and it even creates an "admin" user, whereas the db that is actually supposed to hold the membership tables would be the "main site" one.
Update: I added a bounty to this question since the answers seem to have pretty split opinions about whether or not I should use multiple databases for different applications that share only membership providers.
Separate apps = separate databases - unless you have to "squeeze" everything into a single DB (e.g. on a shared web hoster).
Separate databases can be backed up (and restored!) separately.
Separate databases can be distributed onto other servers when needed.
Separate databases can be tweaked individually.
I have always found it would be better to have more databases so that it is easier to:
Migrate to more servers if needed
Manage security / access easier
Easier (and Faster) restores and backups
I would actually go with four databases. A Membership database, and then one for each application (if the membership is truly shared). This will allow you to lock security across applications as well.
Looking at your question closer... You say that the data would "likely not be shared"... will a lot of your queries be joining tables with the membership? If so, might be easier if they are in the same database. However if you are going with a more entity based approach, I would think you would still be better with multiple databases. You might even want to look at something like an LDAP database or some other type of caching for your membership database to speed things up.
You should use the same database unless you have a current need to place them in separate databases - HOWEVER where possible you should architect your system so that you could move the data into a separate database should the need arise.
In practise this means that you should keep SQL procedures working the smallest amount of data possible - i.e. Don't have multi-step stored procs which do lots of separate actions. Have separate usps and call each from code.
Reasons to use separate databases:
1) Unrelated data - Group data that is interrelated - and once databases get beyond a certain complexity, look to separate out blocks of related data into separate databases in order to simplify.
2) Data that is of either higher importance (e.g. Personal Details) should be separated to allow for greater security measures: e.g. screening this data from developers
3) or lower importance (e.g. Logging Info) - this probably does not need backing up - and if it's particularly voluminous, you probably don't want it increasing the time taken to back up the main site database.
4) Used by applications living on different servers at different locations. Quite obviously you want to site data as close as possible to the consuming application.
Without really knowing the size and scale of your system, it is difficult to give a full opinion. If it's just your own site, one db may work for now. If it's commercial, then I'd have 4 dbs from the word go: Membership details, Forum, Bug Tracker and MainSite related stuff.
Thus in code you would have a Membership manager which only talks to the Membership db, A BugManager, A ForumManager and anything else will only talk to the MainSite db. I can't think of any reason you'd need any of these databases talking to each other.
Just my inclination: although the three apps might not share much (not yet, anyway: but what happens when a forum post wants to reference a bug report?), they all belong to the same "system," so to speak.
I would definitely put all of the tables in just one database.
In my opinion, it is better to split the database for increased flexibility, security, efficiency, and scalability.
If a requirement that is common to all three applications is added in the future (you never know), a single shared database might be a little difficult to maintain.
For example: User login /audit trace for your 3 applications.
It may sound like I'm wandering a bit, but have you taken into account another possibility, that is separating all the authentication/membership functionality into an application itself?
From your description it seems you may add another application in the future. It would start to look like a network of sites, much like 37signals web apps, Google web apps or MSN web apps.
And thus, you may go for a kind of Single-Sign-On / Connect service. This one single application may offer authentication methods via web services or any other mechanism, and it will have its own DB for you to tweak, modify, back up and move without affecting the other apps. I have run into this situation many times myself, and so I love how easy it is to share your Google or Facebook login among applications.
Perhaps I'm seeing it from a little higher perspective than yours, sorry if it's the case. If this is not an option, you may keep 4 databases: 1 for each application and 1 for the membership provider, which has its own connectionstring most of the time.
Of course it depends on the size of your applications' footprint on DB-level. 10 tables per app is OK, 150 tables per app would make the DB a little ugly to us, that being a personal preference.
Good luck with whatever option you choose.
The membership framework allows for partitioning across multiple applications, so you probably should have the following configuration:
Membership Database
Application 1 Database
Application 2 Database
Application 3 Database
Then, in each of the application databases, create synonyms that point to the membership database's tables for when you need to write your own queries that access both application data and membership data. Synonyms are easy to maintain and allow you change where the database is without changing any dependencies on those tables as the synonym names don't change.
Your application configuration in Web.config will determine how the data is partitioned in the membership database as you specify an ApplicationName that should be different for each app.

Rearchitecture ASP.NET app by replacing SQL Server with NoSQL

We have an ASP.NET app with SQL Server & it is a photo & video sharing site.
Details of photos and videos are stored in tables & the files are in the file system.
Database has 75 tables and 225 stored procedures. The app will be ready for production deployment within next 6 months.
Due to long-term growth concerns, we decided to switch to a NoSQL (MongoDB) database.
We have a few questions regarding the best way to approach this:
Is it better to deploy the app with SQL Server backend and migrate to NoSQL later?
OR re-architecture now and rewrite/recreate database, tables, procedures and data layer
How difficult will it be to re-architect/recode with MongoDB? Any tools or BKMs?
EDIT:
Our app is Youtube+Flickr type site where user will share photos and videos with lots of comments, tags and ratings (photo\video & comments).
Is NoSQL a better database to move to? Reason for moving: cost + read query speed
Please help me with your valuable advice.
Thank you very much.
Change is always exponentially more expensive the later it is introduced to a project. This is a core principle of software engineering. You should do this now.
That said, I question your long-term vision. Relational databases, used properly, have a lot of performance in them.
This question raises more questions than answers.
Have you benchmarked your current implementation in terms of requests/responses?
Why MongoDB out of all possible NoSQL databases? (Don't get me wrong, I love Mongo, but love and hype should not weigh in technology choices)
Are you certain you will get the large userbase you're expecting? Why are you so certain?
Using stored procs seems to tip off that you aren't using an ORM? Why not?
Generally, I'm against these types of re-architectures. Firstly, you need to get your whole team acclimated to how Mongo affects development. Secondly, your ops team needs to get acclimated to how to deploy and maintain a Mongo installation. More likely than not, this will prevent you from launching in a timeline you want to launch.
I'd say that you should probably launch as is, fix the ORM part if you aren't using one, benchmark your app, benchmark a prototype of your app backed by Mongo and if the performance advantages are so big that it warrants the pain of re-architecture do it.
To your latter question, there aren't any tools right now, as far as I can tell, that'll automate or semi-automate the database import/export from SQL Server to Mongo. There are barely tools to do that for MySQL.
I did such a migration a few months ago, during the early development stage of an ASP.NET website. It was a hard decision, but I could concentrate on that migration. The reason I did it was that I couldn't trust the ORM anymore, and there were some very slow queries that I had no idea how to optimize.
During the coding phase, what I figured out was that I was spending a lot of time on the data model in SQL Server (using Entity) and on all the plumbing code.
Now there are no more stored procedures (C# and LINQ code instead) and no more two layers to maintain (the code is the model).
My small experience says: the earlier the better. But don't get me wrong: before migrating you really have to think in documents rather than in RDBMS terms. This means you may have to partially change the business data model to correctly utilize MongoDB features; otherwise you could get bad performance, and MongoDB is useless for bad models.
Another point is the admin stuff. You'll have to learn MongoDB administration quickly to be up to speed. And even if the tools are good, they differ completely from the SQL Server tools.
In conclusion, if you're convinced MongoDB is your future data store and search database (and it was in my case), read the documentation and take time to do some proofs of concept. Then you can think in documents and load test your new model.
Your core question appears to be whether to make the switch to MongoDB now, or deploy on SQL and go to MongoDB in a future release.
You do not appear to be using an ORM (e.g. NHibernate, Entity Framework.) Setting other concerns aside, if you're convinced that you want to go to NoSQL, then I would do it now rather than later. Unless you integrate a Provider model for your data access, changing the underlying data access strategy after it is already established would be difficult.
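To illustrate what a provider-style data access layer might look like, here is a minimal sketch in Python. All names (PhotoStore, SqlPhotoStore, DocumentPhotoStore, publish) are hypothetical. SQLite stands in for the relational backend and a plain dict stands in for a document store, so this is not MongoDB or SQL Server code, only the shape of the abstraction.

```python
import sqlite3
from typing import Optional, Protocol


class PhotoStore(Protocol):
    """Hypothetical storage interface; the rest of the app depends only on this."""

    def add(self, photo_id: int, title: str) -> None: ...

    def title_of(self, photo_id: int) -> Optional[str]: ...


class SqlPhotoStore:
    """Relational provider (SQLite used here as a stand-in)."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE photos (id INTEGER PRIMARY KEY, title TEXT)"
        )

    def add(self, photo_id: int, title: str) -> None:
        with self._conn:
            self._conn.execute(
                "INSERT INTO photos (id, title) VALUES (?, ?)", (photo_id, title)
            )

    def title_of(self, photo_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT title FROM photos WHERE id = ?", (photo_id,)
        ).fetchone()
        return row[0] if row else None


class DocumentPhotoStore:
    """Document-style provider (a dict of documents used as a stand-in)."""

    def __init__(self) -> None:
        self._docs = {}

    def add(self, photo_id: int, title: str) -> None:
        self._docs[photo_id] = {"title": title}

    def title_of(self, photo_id: int) -> Optional[str]:
        doc = self._docs.get(photo_id)
        return doc["title"] if doc else None


def publish(store: PhotoStore, photo_id: int, title: str) -> Optional[str]:
    """Application code: unaware of which backend it is talking to."""
    store.add(photo_id, title)
    return store.title_of(photo_id)


results = [
    publish(SqlPhotoStore(), 1, "Sunset"),
    publish(DocumentPhotoStore(), 1, "Sunset"),
]
print(results)
```

With something like this in place, swapping the backend later means writing one new provider class rather than touching every caller. It does not remove the data migration itself.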
I agree. Switching now is better, if only to avoid the data migration headache switching post-deployment will require.

asp.net (mvc) and mysql, what am I getting in to here?

I'm building an ASP.NET MVC app, and I want to know the ramifications of switching from SQL Server 2008 to MySQL.
Apart from some syntax tweaks, what other things should I take into consideration (technically speaking, of course) if I want to move over to MySQL?
convert sprocs to inline queries
transaction and locking maybe handled differently
others?
There are some differences with how the two treat some kinds of locking and concurrency, etc. but for 95% of web applications those kinds of issues simply never come into play. If you're doing standard CRUD, maybe some transactions, executing a few stored procedures? No difference to speak of except the syntax, a good reference to which can be found here.
I really recommend checking out DbLinq, which is based on LINQ to SQL but supports lots of different SQL databases. It gets us much closer to making applications truly db-agnostic - you can swap out the SQL Server provider for MySQL, PostgreSQL, Oracle, Firebird, SQLite, Ingres - and all the LINQ expressions stay exactly the same. No need to tweak any queries.

What is Sqlite used for?

I don't know how authoritative this is but I found this:
http://www.sqlite.org/cvstrac/wiki?p=PerformanceConsiderations
and it doesn't seem good to have a lot of connections to sqlite. This seems to be bad for the web and most applications that have more than a few users. I'm having a hard time thinking of what sqlite would be used for when you don't need that many connections. Every program I can think of needs users, lots of them sometimes, so what would I use a database for that doesn't allow that many connections? I thought about prototypes but why would I use that when I can just connect to a larger database? Embedded apps maybe?
Thank you.
EDIT: Thanks everyone. I looked at the page recommended below but am confused about something:
Under appropriate uses for sqlite it has:
Situations Where SQLite Works Well
•Websites
SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.
Situations Where Another RDBMS May Work Better
•Client/Server Applications
If you have many client programs accessing a common database over a network, you should consider using a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, the file locking logic of many network filesystems implementation contains bugs (on both Unix and Windows). If file locking does not work like it should, it might be possible for two or more client programs to modify the same part of the same database at the same time, resulting in database corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is that you should avoid using SQLite in situations where the same database will be accessed simultaneously from many computers over a network filesystem.
The Question:
I'm going to show my ignorance here but what is the difference between these two?
This is answered well by sqlite itself : Appropriate use of sqlite
Another way to look at SQLite is this:
SQLite is not designed to replace Oracle. It is designed to replace fopen().
It's good for situations where you don't have access to a "real" database and still want the power of a relational db. For example, Firefox stores a bunch of information about your settings/history/etc in an SQLite database. You can't expect everyone that runs firefox to have MySQL or postgre installed on their machine.
It's also perfectly capable of running relatively-low traffic, read-heavy websites. The performance of it is overall very good, it's more than the large majority of websites need for their traffic levels.
It's often used for embedded applications.
It can be very handy to use a database like storage when you have no access to a database service. So SQLite is used since it's just a file you store somewhere.
I also find that SQLite is good for getting a prototype application together pretty quickly without the overhead of having a separate DB server or bogging down a development environment with an instance of MySQL/Oracle/Whatever.
Also easy to pick up and move the database to a different machine if you need to.
The iPhone uses it for call history, SMS messages, contacts, and other types of data. Like Ólafur Waage said, it is good for embedded applications on mobile devices because it's lightweight. I have also used it in standalone applications. It is easy to use and available on most platforms.
Think about simple client or desktop apps that could make use of a db, like as a poor example, an address book. Rather than bundling a huge db engine like mysql or postgre with your deliverable, sqlite is very lightweight and easy to include with your finished app.
This FLOSS Weekly podcast episode talks with the creator of SQLite and, among other things, goes over the kinds of things you would use it for: everything from file systems for mobile phones to smallish web sites.
In the simplest terms, SQLite is a public-domain software package that provides a relational database management system, or RDBMS. Relational database systems are used to store user-defined records in large tables. In addition to data storage and management, a database engine can process complex query commands that combine data from multiple tables to generate reports and data summaries. Other popular RDBMS products include Oracle Database, IBM’s DB2, and Microsoft’s SQL Server on the commercial side, with MySQL and PostgreSQL being popular open source products.
The “Lite” in SQLite does not refer to its capabilities. Rather, SQLite is lightweight when it comes to setup complexity, administrative overhead, and resource usage.
For more detailed information about SQLite, visit the link below:
http://blog.developeronhire.com/what-is-sqlite-sqlite/
Thank you.
What the above two answers say. Expanding slightly on Chad Birch's answer, it's the calls to the SQLite db, and a rather poor implementation of sync(), that cause FF3 to be so slow on Linux.

How Scalable is SQLite? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I recently read this question about SQLite vs MySQL, and the answer pointed out that SQLite doesn't scale well. The official website sort of confirms this.
So how scalable is SQLite, and what are its uppermost limits?
Yesterday I released a small site* to track your rep that used a shared SQLite database for all visitors. Unfortunately, even with the modest load that it put on my host, it ran quite slowly. This is because the entire database was locked every time someone viewed the page, since every page view contained updates/inserts. I soon switched to MySQL, and while I haven't had much time to test it out, it seems much more scalable than SQLite. I just remember slow page loads and occasionally getting a "database is locked" error when trying to execute queries from the shell in sqlite. That said, I am running another site from SQLite just fine. The difference is that the site is static (i.e. I'm the only one that can change the database), so it works just fine for concurrent reads. Moral of the story: only use SQLite for websites where updates to the database happen rarely (less often than every page load).
edit: I just realized that I may not have been fair to SQLite - I didn't index any columns in the SQLite database when I was serving it from a web page. This partially caused the slowdown I was experiencing. However, the observation of database-locking stands - if you have particularly onerous updates, SQLite performance won't match MySQL or Postgres.
another edit: Since I posted this almost 3 months ago I've had the opportunity to closely examine the scalability of SQLite, and with a few tricks it can be quite scalable. As I mentioned in my first edit, database indexes dramatically reduce query time, but this is more of a general observation about databases than it is about SQLite. However, there is another trick you can use to speed up SQLite: transactions. Whenever you have to do multiple database writes, put them inside a transaction. Instead of writing to (and locking) the file each and every time a write query is issued, the write will only happen once when the transaction completes.
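The transaction trick can be sketched like this with Python's built-in sqlite3 module. The hits table and the two helper functions are invented for the example, and no timings are claimed here: the point is only the difference between committing after every row and committing once per batch.

```python
import os
import sqlite3
import tempfile


def open_db():
    """Create a fresh on-disk database with a hypothetical 'hits' table."""
    path = os.path.join(tempfile.mkdtemp(), "hits.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE hits (page TEXT, n INTEGER)")
    conn.commit()
    return conn


def write_one_by_one(conn, rows):
    # One transaction (and one trip to the file) per row.
    for row in rows:
        conn.execute("INSERT INTO hits (page, n) VALUES (?, ?)", row)
        conn.commit()


def write_batched(conn, rows):
    # A single transaction around all rows: one commit at the end.
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany("INSERT INTO hits (page, n) VALUES (?, ?)", rows)


rows = [("home", i) for i in range(50)]

slow = open_db()
write_one_by_one(slow, rows)

fast = open_db()
write_batched(fast, rows)

slow_count = slow.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
fast_count = fast.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
print(slow_count, fast_count)
```

Both versions store the same rows. The batched one takes the write lock once instead of once per row, which is where the speedup described above would come from.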
The site that I mention I released in the first paragraph has been switched back to SQLite, and it's running quite smoothly once I tuned my code in a few places.
* the site is no longer available
SQLite is scalable in single-user terms: I have a multi-gigabyte database that performs very well, and I haven't had many problems with it.
But it is single-user, so it depends on what kind of scaling you're talking about.
In response to comments. Note that there is nothing that prevents using an Sqlite database in a multi-user environment, but every transaction (in effect, every SQL statement that modifies the database) takes a lock on the file, which will prevent other users from accessing the database at all.
So if you have lots of modifications done to the database, you're essentially going to hit scaling problems very quick. If, on the other hand, you have lots of read access compared to write access, it might not be so bad.
So SQLite will of course function in a multi-user environment, but it won't perform well under heavy writes.
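A small experiment along these lines, using Python's built-in sqlite3 module, shows the locking behaviour. The notes table and connection names are made up for the sketch. It assumes the default rollback-journal mode, where (as far as I understand) an open write transaction blocks other writers, while readers can still read the last committed state until the writer commits.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# explicit BEGIN/COMMIT statements below control the transaction.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE notes (body TEXT)")
writer.execute("INSERT INTO notes VALUES ('first')")

# timeout=0: fail immediately instead of waiting for the lock to clear.
other = sqlite3.connect(path, timeout=0, isolation_level=None)

writer.execute("BEGIN IMMEDIATE")  # take the write lock right away
writer.execute("INSERT INTO notes VALUES ('second')")

# A read on the other connection still works and sees only committed data.
visible = other.execute("SELECT COUNT(*) FROM notes").fetchall()[0][0]

# A second writer is refused while the first transaction is still open.
error_text = ""
try:
    other.execute("INSERT INTO notes VALUES ('third')")
    second_write_ok = True
except sqlite3.OperationalError as exc:
    second_write_ok = False
    error_text = str(exc)

writer.execute("COMMIT")

# Once the first writer has committed, the second write goes through.
other.execute("INSERT INTO notes VALUES ('third')")
final = other.execute("SELECT COUNT(*) FROM notes").fetchall()[0][0]
print(visible, second_write_ok, error_text, final)
```

In a real application you would normally leave the timeout at a non-zero value so that a blocked writer waits briefly instead of failing at once.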
SQLite drives the sqlite.org web site and others that have lots of traffic. They suggest that if you have less than 100k hits per day, SQLite should work fine. And that was written before they delivered the "Writeahead Logging" feature.
If you want to speed things up with SQLite, do the following:
upgrade to SQLite 3.7.x
Enable write-ahead logging
Run the following pragma: "PRAGMA cache_size = Number-of-pages;" The default size (Number-of-pages) is 2000 pages, but if you raise that number, then you will raise the amount of data that is running straight out of memory.
You may want to take a look at my video on YouTube called "Improve SQLite Performance With Writeahead Logging" which shows how to use write-ahead logging and demonstrates a 5x speed improvement for writes.
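The steps above can be tried out from Python's built-in sqlite3 module roughly like this. The file name and the value 4000 are arbitrary choices for the sketch. WAL needs an on-disk database (an in-memory one reports a different journal mode), and the cache_size setting applies only to the connection that issues it.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "site.db")
conn = sqlite3.connect(path)

# Version of the SQLite library bundled with Python; WAL needs 3.7.0 or later.
print(sqlite3.sqlite_version)

# The pragma returns the journal mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Raise the page cache for this connection (4000 is an arbitrary example).
conn.execute("PRAGMA cache_size = 4000")
cache_pages = conn.execute("PRAGMA cache_size").fetchone()[0]

print(mode, cache_pages)
```

The journal mode is stored in the database file, so later connections will also use WAL. The cache size has to be set again on each new connection.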
Sqlite is a desktop or in-process database. SQL Server, MySQL, Oracle, and their brethren are servers.
Desktop databases are by their nature not a good choice for any application that needs to support concurrent write access to the data store. This includes, at some level, most web sites ever created. If you even have to log in for anything, you probably need write access to the DB.
Have you read the SQLite docs - http://www.sqlite.org/whentouse.html ?
SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.
SQLite scalability will highly depend on the data used, and their format. I've had some tough experience with extra long tables (GPS records, one record per second). Experience showed that SQLite would slow down in stages, partly due to constant rebalancing of the growing binary trees holding the indexes (and with time-stamped indexes, you just know that tree is going to get rebalanced a lot, yet it is vital to your searches). So in the end at about 1GB (very ballpark, I know), queries become sluggish in my case. Your mileage will vary.
One thing to remember, despite all the bragging, SQLite is NOT made for data warehousing. There are various uses not recommended for SQLite. The fine people behind SQLite say it themselves:
Another way to look at SQLite is this: SQLite is not designed to replace Oracle. It is designed to replace fopen().
And this leads to the main argument (not quantitative, sorry, but qualitative), SQLite is not for all uses, whereas MySQL can cover many varied uses, even if not ideally. For example, you could have MySQL store Firefox cookies (instead of SQLite), but you'd need that service running all the time. On the other hand, you could have a transactional website running on SQLite (as many people do) instead of MySQL, but expect a lot of downtime.
I think that a single (1) web server serving hundreds of clients appears on the backend as a single connection to the database, doesn't it?
So there is no concurrent access to the database, and therefore we can say that the database is working in 'single user mode'. It makes no sense to discuss multi-user access in such a circumstance, and so SQLite works as well as any other server-based database.
Think of it this way. SQLite will be locked every time someone uses it (SQLite doesn't lock on reading). So if you're serving up a web page or an application that has multiple concurrent users, only one could use your app at a time with SQLite. So right there is a scaling issue. If it's a one-person application, say a music library where you hold hundreds of titles, ratings, information, usage, playing, play time, then SQLite will scale beautifully, holding thousands if not millions of records (hard drive willing).
MySQL, on the other hand, works well for server apps where people all over will be using it concurrently. It doesn't lock and it is quite large in size. So for your music library MySQL would be overkill as only one person would see it, UNLESS this is a shared music library where thousands add to it or update it. Then MySQL would be the one to use.
So in theory MySQL scales better than SQLite because it can handle multiple users, but it is overkill for a single-user app.
SQLite's website (the part that you referenced) indicates that it can be used for a variety of multi-user situations.
I would say that it can handle quite a bit. In my experience it has always been very fast. Of course, you need to index your tables, and when coding against it you need to make sure you use parameterized queries and the like. Basically the same stuff you would do with any database to improve performance.
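As a rough illustration of the indexing and parameterized-query advice, here is a short sketch with Python's built-in sqlite3 module. The posts table, the index name and the posts_by helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts (author, body) VALUES (?, ?)",
    [("alice", "hello"), ("bob", "hi"), ("alice", "again")],
)
# Index the column used in the WHERE clause below.
conn.execute("CREATE INDEX idx_posts_author ON posts (author)")
conn.commit()


def posts_by(author):
    # The ? placeholder passes the value separately from the SQL text,
    # so user input is never spliced into the statement.
    return conn.execute(
        "SELECT body FROM posts WHERE author = ? ORDER BY id", (author,)
    ).fetchall()


alice = [row[0] for row in posts_by("alice")]

# Input that would break a string-concatenated query is just a literal here.
injected = posts_by("alice' OR '1'='1")

# EXPLAIN QUERY PLAN shows whether the index is picked up for the lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT body FROM posts WHERE author = ?", ("alice",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(alice, injected, plan_text)
```

The exact wording of the query plan differs between SQLite versions, so it is safer to look for the index name in it than to compare the whole string.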
