How Scalable is SQLite? [closed] - sqlite

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I recently read this Question about SQLite vs MySQL and the answer pointed out that SQLite doesn't scale well and the official website sort-of confirms this, however.
How scalable is SQLite and what are its upper most limits?

Yesterday I released a small site* to track your rep that used a shared SQLite database for all visitors. Unfortunately, even with the modest load that it put on my host it ran quite slowly. This is because the entire database was locked every time someone viewed the page because it contained updates/inserts. I soon switched to MySQL and while I haven't had much time to test it out, it seems much more scaleable than SQLite. I just remember slow page loads and occasionally getting a database locked error when trying to execute queries from the shell in sqlite. That said, I am running another site from SQLite just fine. The difference is that the site is static (i.e. I'm the only one that can change the database) and so it works just fine for concurrent reads. Moral of the story: only use SQLite for websites where updates to the database happen rarely (less often than every page loaded).
edit: I just realized that I may not have been fair to SQLite - I didn't index any columns in the SQLite database when I was serving it from a web page. This partially caused the slowdown I was experiencing. However, the observation of database-locking stands - if you have particularly onerous updates, SQLite performance won't match MySQL or Postgres.
another edit: Since I posted this almost 3 months ago I've had the opportunity to closely examine the scalability of SQLite, and with a few tricks it can be quite scalable. As I mentioned in my first edit, database indexes dramatically reduce query time, but this is more of a general observation about databases than it is about SQLite. However, there is another trick you can use to speed up SQLite: transactions. Whenever you have to do multiple database writes, put them inside a transaction. Instead of writing to (and locking) the file each and every time a write query is issued, the write will only happen once when the transaction completes.
The site that I mention I released in the first paragraph has been switched back to SQLite, and it's running quite smoothly once I tuned my code in a few places.
* the site is no longer available

Sqlite is scalable in terms of single-user, I have multi-gigabyte database that performs very well and I haven't had much problems with it.
But it is single-user, so it depends on what kind of scaling you're talking about.
In response to comments. Note that there is nothing that prevents using an Sqlite database in a multi-user environment, but every transaction (in effect, every SQL statement that modifies the database) takes a lock on the file, which will prevent other users from accessing the database at all.
So if you have lots of modifications done to the database, you're essentially going to hit scaling problems very quick. If, on the other hand, you have lots of read access compared to write access, it might not be so bad.
But Sqlite will of course function in a multi-user environment, but it won't perform well.

SQLite drives the sqlite.org web site and others that have lots of traffic. They suggest that if you have less than 100k hits per day, SQLite should work fine. And that was written before they delivered the "Writeahead Logging" feature.
If you want to speed things up with SQLite, do the following:
upgrade to SQLite 3.7.x
Enable write-ahead logging
Run the following pragma: "PRAGMA cache_size = Number-of-pages;" The default size (Number-of-pages) is 2000 pages, but if you raise that number, then you will raise the amount of data that is running straight out of memory.
You may want to take a look at my video on YouTube called "Improve SQLite Performance With Writeahead Logging" which shows how to use write-ahead logging and demonstrates a 5x speed improvement for writes.

Sqlite is a desktop or in-process database. SQL Server, MySQL, Oracle, and their brethren are servers.
Desktop databases are by their nature not a good choices for any application that needs to support concurrent write access to the data store. This includes at some level most web sites ever created. If you even have to log in for anything, you probably need write access to the DB.

Have you read this SQLite docs - http://www.sqlite.org/whentouse.html ?
SQLite usually will work great as the
database engine for low to medium
traffic websites (which is to say,
99.9% of all websites). The amount of web traffic that SQLite can handle
depends, of course, on how heavily the
website uses its database. Generally
speaking, any site that gets fewer
than 100K hits/day should work fine
with SQLite. The 100K hits/day figure
is a conservative estimate, not a hard
upper bound. SQLite has been
demonstrated to work with 10 times
that amount of traffic.

SQLite scalability will highly depend on the data used, and their format. I've had some tough experience with extra long tables (GPS records, one record per second). Experience showed that SQLite would slow down in stages, partly due to constant rebalancing of the growing binary trees holding the indexes (and with time-stamped indexes, you just know that tree is going to get rebalanced a lot, yet it is vital to your searches). So in the end at about 1GB (very ballpark, I know), queries become sluggish in my case. Your mileage will vary.
One thing to remember, despite all the bragging, SQLite is NOT made for data warehousing. There are various uses not recommended for SQLite. The fine people behind SQLite say it themselves:
Another way to look at SQLite is this: SQLite is not designed to replace Oracle. It is designed to replace fopen().
And this leads to the main argument (not quantitative, sorry, but qualitative), SQLite is not for all uses, whereas MySQL can cover many varied uses, even if not ideally. For example, you could have MySQL store Firefox cookies (instead of SQLite), but you'd need that service running all the time. On the other hand, you could have a transactional website running on SQLite (as many people do) instead of MySQL, but expect a lot of downtime.

i think that a (in numbers 1) webserver serving hunderts of clients appears on the backend with a single connection to the database, isn't it?
So there is no concurrent access in the database an therefore we can say that the database is working in 'single user mode'. It makes no sense to diskuss multi-user access in such a circumstance and so SQLite works as well as any other serverbased database.

Think of it this way. SQL Lite will be locked every time someone uses it (SQLite doesn't lock on reading). So if your serving up a web page or a application that has multiple concurrent users only one could use your app at a time with SQLLite. So right there is a scaling issue. If its a one person application say a Music Library where you hold hundreds of titles, ratings, information, usage, playing, play time then SQL Lite will scale beautifully holding thousands if not millions of records(Hard drive willing)
MySQL on the other hand works well for servers apps where people all over will be using it concurrently. It doesn't lock and it is quite large in size. So for your music library MySql would be over kill as only one person would see it, UNLESS this is a shared music library where thousands add or update it. Then MYSQL would be the one to use.
So in theory MySQL scales better then Sqllite cause it can handle mutiple users, but is overkill for a single user app.

SQLite's website (the part that you referenced) indicates that it can be used for a variety of multi-user situations.
I would say that it can handle quite a bit. In my experience it has always been very fast. Of course, you need to index your tables and when coding against it, you need to make sure you use parameritized queries and the like. Basically the same stuff you would do with any database to improve performance.

Related

Should I make multiple SQLite databases for better concurrency?

I'm very new to SQL and relational databases (just started learning last week) and I'm in the process of upgrading my website and currently keep all my data in XML files. It works, but the new site would be better suited from what I hear a relational database can do, and it looks like SQLite is best for me. One of my concerns is concurrency, even though 99% of the data will be read-only (which I understand SQLite is pretty good at) 99% of the time. Other things, like page view counters for certain pages will constantly require small writes. I'm still learning database design and want to do it right. Would it make sense to make separate databases for things that get written to a lot, that way making the main database far less susceptible to concurrency issues? Is it possible to do a "foreign key" type reference (I still haven't used foreign keys yet, but think I understand them) across databases? As each view count would point to some primary key in the main database. Thanks for any help!
SQLite is good to use in embedded systems (like mobile phones and tablets) and small desktop applications (Chrome, Firefox, Thunderbird, etc). However, when you need to have many concurrent readers and writers (typical for websites), you should not use it.
Even if you split your data in many databases, it has a lot of operational overhead. For example, it will be difficult to join data from different databases - you must use ATTACH, and by default you can only ATTACH up to 10 databases. And concurrency issues will still not go away 100%.
Instead, use real database like PostgreSQL or MySQL. Not only it will be faster, these databases provide real concurrent access to your data over the network, which SQLite cannot do.
My personal preference is PostgreSQL, but if your web hosting does not provide PostgreSQL, you can use MySQL, but then please use fully transactional engine like InnoDB.

Rearchitecture ASP.NET app by replacing SQL Server with NoSQL

We have an ASP.NET app with SQL Server & it is a photo & video sharing site.
Details of photos and videos are stored in tables & the files are in the file system.
Database has 75 tables and 225 stored procedures. The app will be ready for production deployment within next 6 months.
Due to longer time growth concerns, we decided to switch to NoSQL (MongoDB) database.
We have few questions regarding the best way to approach this:
Is it better to deploy the app with SQL Server backend and migrate to NoSQL later?
OR re-architecture now and rewrite/recreate database, tables, procedures and data layer
How difficult will it be re-architecture/recode with MongoDB? Any tools or BKMs?
EDIT:
Our app is Youtube+Flickr type site where user will share photos and videos with lots of comments, tags and ratings (photo\video & comments).
Is NoSQL a better database to move to? Reason for moving: cost + read query speed
Please help me with you valuable advise.
Thank you very much.
Change is always exponentially more expensive the later it is introduced to a project. This is a core principle of software engineering. You should do this now.
That said, I question your long-term vision. Relational databases, used properly, have a lot of performance in them.
This question raises more questions than answers.
Have you benchmarked your current implementation in terms of requests/responses?
Why MongoDB out of all possible NoSQL databases? (Don't get me wrong, I love Mongo, but love and hype should not weigh in technology choices)
Are you certain you will get the large userbase you're expecting? Why are you so certain?
Using stored procs seems to tip off that you aren't using an ORM? Why not?
Generally, I'm against these types of re-architectures. Firstly, you need to get your whole team acclimated to how Mongo affects development. Secondly, your ops team needs to get acclimated to how to deploy and maintain a Mongo installation. More likely than not, this will prevent you from launching in a timeline you want to launch.
I'd say that you should probably launch as is, fix the ORM part if you aren't using one, benchmark your app, benchmark a prototype of your app backed by Mongo and if the performance advantages are so big that it warrants the pain of re-architecture do it.
To your latter question, there aren't any tools right now, as far as I can tell, that'll automate or semi-automate the database import/export from SQL Server to Mongo. There are barely tools to do that for MySQL.
I've done such a migration a few month ago, during the early developement stage of a website in ASP.NET. It was a hard decision, but I could concentrate on that migration. The reason why I did this migration was the ORM that I couldn't trust anymore and some very slow queries that I had no idea how to optimize.
During coding phase, what I figured out was : I was spending a lot of time with the data model in SQL Server (using Entity) and all the plumbery code.
Now, no more store procedures (C# and Linq code instead), no more 2 layers to maintain (the code is the model).
My small experience says : The earlier the better but don't get me wrong, before migrating you really have to think in Document rather than in RDBMS. This means you may have to partially change the businness DataModel to correctly utilize MongoDB features, otherwise you could get bad performances and Mongo DB is useless for bad models.
Another point is the admin stuff. You'll have to quickly learn Mongo DB admin to be up to speed. And even if the tools are good, they completely differ from SQL Server tools.
In conclusion, If you're convinced MongoDB is your future data store and search database,
(and it was in my case), read documentation, take time to do some Proof Of Concept. Then you can think Document and load test you new model.
Your core question appears to be whether to make the switch to MongoDB now, or deploy on SQL and go to MongoDB in a future release.
You do not appear to be using an ORM (e.g. NHibernate, Entity Framework.) Setting other concerns aside, if you're convinced that you want to go to NoSQL, then I would do it now rather than later. Unless you integrate a Provider model for your data access, changing the underlying data access strategy after it is already established would be difficult.
I agree. Switching now is better, if only to avoid the data migration headache switching post-deployment will require.

Can/should I run my web site against a SQLite database?

I'm about to build a new personal blog/portfolio site (which will be written in ASP.NET), and I'm going to run it against a SQLite database. There are a few reasons for this:
The site will not be getting a lot
of traffic, and from what I've read,
SQLite is able to support quite a
lot of concurrent users for reading
anyway
I can back up all the content
easily, just by downloading the db
over FTP
I don't have to pay my hosting
company every month for a huge
SQL2008 database that I'm hardly
using
So, should I go for it, or is this a crazy idea?
I'm not so sure about #2 (what happens if SQLite makes changes to the file while the FTP program is reading it?) but other than that, there is no reason to prefer one DB over the other (unless one of those DBs just can't do what you need).
[EDIT] Use an online backup to create the file for FTP download. That will make sure the file content is intact.
Even better, add a page (with password) to your site which creates the file at the press of a button, so your browser can download it.
It's just fine for a low traffic site as long as it's mostly read traffic. If it were me, I'd use SQL Compact Edition instead (same benefits as Sqlite- single file, no server), just because I'm a LINQ-head and the LINQ providers are "in the box" for it, but Sqlite has a decent LINQ library and managed support as well. Make sure your hosting company allows unmanaged code, or that you use the managed port of Sqlite (don't know its current stability though).
SQLite can handle this easily - go for it.
You should check, but I think that the Express version of SQL 2008 is free of charge.
Anyway, I've been working with SQLite from .NET environment, and it works quite fine (but I haven't done any load test).
And if you're not decided yet, you still can use a LINQ provider which will allow you later to switch from one database to another without rewriting your SQL code (I think to DbLinq, for example).
If you plan to backup you database, you must ensure first that it is not used at the moment.
SQLite answer this for you:
http://sqlite.org/whentouse.html
low-medium volume = okay,
high volume = don't use it
in your case its a-ok to use sqlite
Generally, yes.
But you should be aware of the fact that SQLite does not support everything that you might be used to from a 'real' DBMS. E.g. there are no constraints like foreign keys, unique indexes and the like, and AFAIK some (more advanced) datatypes are not available.
You should check for the various limitations here and here. If you can get along with that there's no reason to not use SQLite.
A rule of thumb is that if the site can run on one server
then SQLite is sufficient. That is what the creator of
SQLite, D. Richard Hipp, said at approximately 13 min
30 secs into episode 26 of the FLOSS Weekly
podcast.
Direct audio link (MP3 file, 24 MB, 51 min 15 sec).
I'd say no. First off, I don't know who you are using for a provider, but with my provider (goDaddy), it's pretty cheap at $2.99 a month or so. I get 1 sql server db and 10 mysql dbs.
I don't know how much cheaper this can get.
Secondly, why risk it? Most provider plans include at least MySQL database. You can hook up with that.
In general, SQLite isn't meant for a high-traffic website, but it can do quite well on websites getting 100,000 hits/day or less. The SQLite org website gets more than 500,000 hits/day, and generates 2 million or more DB interactions/day ... all handled by SQLite.
Here are some things that will dramatically speed up SQLite's performance:
Index your tables
Use transactions for multiple commands instead of executing one at a time.
Learn about write-ahead logging
Do a Google search on each of the above with SQLite ... your DB performance will improve dramatically.
An SQLite DB can actually be faster than a MySQL, PostGRE, MS SQL Server DB, or other hosted server-based DBs for 2 reasons:
1). SQLite is usually stored on the same machine as the website, rather than a separate server machine, eliminating round trip network latency response times.
2.) For smaller read/write requests, the SQLite engine is executing far less code, which can also be faster.
For a smaller website, a smaller DB engine like SQLite could actually be faster and more efficient.
Are you using any SQL functionality? SUM, AVG, SORT BY, etc, if yes go use SQLite. If not, just use plain txt files to store your data. Also make sure that the database is outside the httpdocs folder or it is not web accessible.

batch manipulations for online web app

A customer has a web based inventory management system. The system is proprietary and complicated. it has around 100 tables in the DB and complex relationships between them. it has ~1500000 items.
The customer is doing some reorganisations in his processes and now has the need to make massive updates and manipulation to the data (only data changes, no structural changes). The online screens do not permit such work, since they where designed at the begining without this requirement in mind.
The database is MS Sql 2005, and the application is an asp.net running on IIS.
one solution is to build for him new screens where he could visialize the data in grids and do the required job on a large amount of records. This will permit us to use the already existing functions that deal with single items (we just need to implement a loop). At this moment the customer is aware of 2 kinds of such massive manipulations he wants to do, but says there will be others.This will require design, coding, and testing everytime we have a request.
However the customer needs are urgent because of some regulatory requirements, so I am wondering if it will be more efficient to use some kind of mapping between MSSQL and Excel or Access to expose the needed informations. make the changes in Excel or Access then save in the DB. may be using SSIS to do this.
I am not familiar with SSIS or other technologies that do such things, that's why I am not able to judge if the second solution is indeed efficient and better than the first. of course the second solution will require some work and testing, but will it be quicker and less expensive?
the other question is are there any other ways to do this?
any ideas will be greatly appreciated.
Either way, you are going to need testing.
Say you export 40000 products to Excel, he re-organizes them and then you bring them back into a staging table(s) and apply the changes to your SQL table(s). Since Excel is basically a freeform system, what happens if he introduces invalid situations? Your update will need to detect it, fail and rollback or handle it in some specified way.
Anyway, both your suggestions can be made workable.
Personally, for large changes like this, I prefer to have an experienced database developer develop the changes in straight SQL (either hardcoding or table-driven), test it on production data in a test environment (doing a table compare between before and after) and deploy the same script to production. This also allows the use of the existing stored procedures (you are using SPs, to enforce a consistent interface to this complex database, right?) so that basic database logic already in place is simply re-used.
I doubt Excel will be able to deal with 1.5mil elements/rows.
When you say to visualise data in grids - how will your customer make changes? Manually or is there some automation behind it? I would strongly encourage automation (since you know about only 2 types of changes at the moment). Maybe even a simple standalone "converter" application - don't make part of the main program - it will be too tempting for them in the future to manually edit data straight in the DB tables.
Here is a strategy that I think will get you from A to B in the shortest amount of time.
one solution is to build for him new
screens where he could visialize the
data in grids and do the required job
on a large amount of records.
It's rarely a good idea to build an interface into the main system that you will only use once or twice. It takes extra time and you'll probably spend more time maintaining it than using it.
This will permit us to use the already
existing functions that deal with
single items (we just need to
implement a loop)
Hack together your own crappy little interface in a .NET Application, whose sole purpose is to fulfill this one task. Keep it around in your "stuff I might use later" folder.
Since you're dealing with such massive amounts of data, make sure you're not running your app from a remote location.
Obtain a copy of SQL 2005 and install it on a virtualization layer. Copy your production database over to this virtualized SQL server. Take a snap shot of your virtualized copy before you begin testing. Write and test your app against this virtualized copy. Roll back to your original snap shot each time you test. Keep changing your code, testing, and rolling back until your app can flawlessly perform the desired changes.
Then, when the time comes for you to change the production database, you can sit back and relax while your app does all of the changes. Since the process will likely take a while, so add some logging so you can check the status as it runs.
Oh yeah, make sure you have a fresh backup before you run your big update.

What is Sqlite used for?

I don't know how authoritative this is but I found this:
http://www.sqlite.org/cvstrac/wiki?p=PerformanceConsiderations
and it doesn't seem good to have a lot of connections to sqlite. This seems to be bad for the web and most applications that have more than a few users. I'm having a hard time thinking of what sqlite would be used for when you don't need that many connections. Every program I can think of needs users, lots of them sometimes, so what would I use a database for that doesn't allow that many connections? I thought about prototypes but why would I use that when I can just connect to a larger database? Embedded apps maybe?
Thank you.
EDIT: Thanks everyone. I look at the page recommended below but an confused about something:
Under appropriate uses for sqlite it has:
Situations Where SQLite Works Well
•Websites
SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.
Situations Where Another RDBMS May Work Better
•Client/Server Applications
If you have many client programs accessing a common database over a network, you should consider using a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, the file locking logic of many network filesystems implementation contains bugs (on both Unix and Windows). If file locking does not work like it should, it might be possible for two or more client programs to modify the same part of the same database at the same time, resulting in database corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is that you should avoid using SQLite in situations where the same database will be accessed simultaneously from many computers over a network filesystem.
The Question:
I'm going to show my ignorance here but what is the difference between these two?
This is answered well by sqlite itself : Appropriate use of sqlite
Another way to look at SQLite is this:
SQLite is not designed to replace Oracle. It is designed to replace fopen().
It's good for situations where you don't have access to a "real" database and still want the power of a relational db. For example, Firefox stores a bunch of information about your settings/history/etc in an SQLite database. You can't expect everyone that runs firefox to have MySQL or postgre installed on their machine.
It's also perfectly capable of running relatively-low traffic, read-heavy websites. The performance of it is overall very good, it's more than the large majority of websites need for their traffic levels.
It's often used for embedded applications.
It can be very handy to use a database like storage when you have no access to a database service. So SQLite is used since it's just a file you store somewhere.
I also find that using SQLite is good for getting a prototype application together pretty quickly without the overhead of having a seperate DB server or bogging a development environment with an instance of MySQL/Oracle/Whatever.
Also easy to pick up and move the database to a different machine if you need to.
The iPhone uses it for call history, SMS messages, contacts, and other type of data. Like Ólafur Waage said, good for embedded applications on mobile device because it's lightweight. I have used it also on stand alone applications. Easy to use and available on most platforms.
Think about simple client or desktop apps that could make use of a db, like as a poor example, an address book. Rather than bundling a huge db engine like mysql or postgre with your deliverable, sqlite is very lightweight and easy to include with your finished app.
This FLOSS Weekly podcast episode talks with the creator of SQLite and covers among other things goes over the type of things you would use it for. Everything from file systems for mobile phones to smallish web sites.
In the simplest terms, SQLite is a public-domain software package that provides a
relational database management system, or RDBMS. Relational database systems are
used to store user-defined records in large tables. In addition to data storage and management,
a database engine can process complex query commands that combine data
from multiple tables to generate reports and data summaries. Other popular RDBMS
products include Oracle Database, IBM’s DB2, and Microsoft’s SQL Server on the
commercial side, with MySQL and PostgreSQL being popular open source products.
The “Lite” in SQLite does not refer to its capabilities. Rather, SQLite is lightweight
when it comes to setup complexity, administrative overhead, and resource usage.
For detail info and solution about SQLite visit the link below:
http://blog.developeronhire.com/what-is-sqlite-sqlite/
Thank you.
What the above two answers say. Expanding slightly on Chad Birch's answer, its teh calls to the SQLite db, and a rather poor implementation of sync() that causes FF3 to be so slow in linux.

Resources