Data sharing - SQLite vs Shared Memory IPC

I would like to get your opinion regarding a design implementation for data sharing.
I am working on an embedded Linux device (MIPS, 200 MHz) and I want to have some sort of data sharing between multiple processes, each of which can read or write multiple parameters at once.
This data holds ~200 string parameters which are updated every second.
Each process may access the data around 10 times per second.
I would very much like to make the design efficient (CPU / memory).
This data is not required to be persistent and will be recreated every reboot.
Currently, I am considering two options:
Using shared memory IPC (SHM) + a semaphore (locking the whole SHM).
Using an SQLite memory-based DB.
For either option, I will supply a C interface library which will perform all the logic of DB operation.
For SHM, this means locking/unlocking the semaphore and accessing the parameters, which can be treated as an indexed array.
For SQLite, my library will be a wrapper around the SQLite interface library, so the processes will not have to know SQL syntax (some parsing will be needed for queries and replies).
I believe that shared memory is more efficient:
No need to build and parse SQL, and the data is accessed as an array.
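For illustration, here is a minimal sketch of what the SHM option could look like, assuming a fixed-size table of 200 parameters of up to 64 bytes each and a process-shared POSIX semaphore (all names, sizes and the initialization scheme here are hypothetical, not a finished design):

#include <fcntl.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define PARAM_COUNT 200
#define PARAM_LEN   64

/* Layout of the shared segment: one lock plus an indexed array of strings. */
struct param_table {
    sem_t lock;                          /* process-shared semaphore */
    char  value[PARAM_COUNT][PARAM_LEN]; /* parameter i lives at value[i] */
};

/* Create (or attach to) the segment; the first caller initializes the lock.
   In a real design the lock initialization must itself be synchronized,
   e.g. done once by an init process at boot. */
static struct param_table *table_open(int *created)
{
    int fd = shm_open("/param_table", O_CREAT | O_EXCL | O_RDWR, 0600);
    *created = (fd >= 0);
    if (fd < 0)
        fd = shm_open("/param_table", O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (*created)
        ftruncate(fd, sizeof(struct param_table));
    struct param_table *t = mmap(NULL, sizeof(*t),
                                 PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (t == MAP_FAILED)
        return NULL;
    if (*created)
        sem_init(&t->lock, 1 /* shared between processes */, 1);
    return t;
}

/* Write a single parameter under the lock; reads would look the same. */
static void param_set(struct param_table *t, int i, const char *s)
{
    sem_wait(&t->lock);
    strncpy(t->value[i], s, PARAM_LEN - 1);
    t->value[i][PARAM_LEN - 1] = '\0';
    sem_post(&t->lock);
}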
That said, there are some pros to using SQLite as well:
Already working and debugged (at the DB level).
Adds flexibility.
Widely used in many embedded systems.
Getting to the point: performance-wise, I have no experience with SQLite, so I would appreciate it if you could share your opinions and experience.
Thanks

SQLite's in-memory databases cannot be shared between processes, but you could put the DB file into tmpfs.
However, SQLite does not do any synchronization between processes. It does lock the DB file to prevent update conflicts, but if one process finds the file already locked, it just waits for a random amount of time.
For efficient communication between processes, you need to use a mechanism like SHM/semaphores or pipes.
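As a rough sketch of the tmpfs variant: open the database file on a tmpfs mount and set a busy timeout so a locked database is retried instead of failing immediately. The mount point, file name and timeout below are placeholders, not recommendations:

#include <sqlite3.h>
#include <stdio.h>

/* Assumes a tmpfs is mounted first, e.g.:
   mount -t tmpfs -o size=4m tmpfs /run/params */
int main(void)
{
    sqlite3 *db;
    if (sqlite3_open("/run/params/params.db", &db) != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }
    /* Retry for up to 200 ms when another process holds the lock,
       instead of getting SQLITE_BUSY straight away. */
    sqlite3_busy_timeout(db, 200);

    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS params(id INTEGER PRIMARY KEY, value TEXT);",
        NULL, NULL, NULL);
    sqlite3_exec(db,
        "INSERT OR REPLACE INTO params(id, value) VALUES(1, 'hello');",
        NULL, NULL, NULL);

    sqlite3_close(db);
    return 0;
}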

Related

Caching of data in a text file — Better options

I am working on an application at the moment that is using as a caching strategy the reading and writing of data to text files in a read/write directory within the application.
My gut reaction is that this is sooooo wrong.
I am of the opinion that these values should be stored in the ASP.NET Cache or another dedicated in-memory cache such as Redis or something similar.
Can you provide any data to back up my belief that writing to and reading from text files as a form of cache on the webserver is the wrong thing to do? Or provide any data to prove me wrong and show that this is the correct thing to do?
What other options would you provide to implement this caching?
EDIT:
In one example, a complex search is performed based on a keyword. The result from this search is a list of Guids. This is then turned into a concatenated, comma-delimited string, usually less than 100,000 characters. This is then written to a file using that keyword as its name, so that other requests using this keyword will not need to perform the complex search. There is an expiry - I think three days or something, but I don't think it needs to (or should) be that long.
I would normally use the ASP.NET Server Cache to store this data.
I can think of four reasons:
Web servers are likely to have many concurrent requests. While you can write logic that manages file locking (mutexes, volatile objects), implementing that is a pain and requires abstraction (an interface) if you plan to be able to refactor it in the future--which you will want to do, because eventually the demand on the filesystem resource will be heavier than what can be addressed in a multithreaded context.
Speaking of which, unless you implement paging, you will be reading and writing the entire file every time you access it. That's slow. Even paging is slow compared to an in-memory operation. Compare what you think you can get out of the disks you're using with the Redis benchmarks from New Relic. Feel free to perform your own calculation based on the estimated size of the file and the number of threads waiting to write to it. You will never match an in-memory cache.
Moreover, as previously mentioned, asynchronous filesystem operations have to be managed while waiting for synchronous I/O operations to complete. Meanwhile, you will not have data consistent with the operations the web application executes unless you make the application wait. The only way I know of to fix that problem is to write to and read from a managed system that's fast enough to keep up with the requests coming in, so that the state of your cache will almost always reflect the latest changes.
Finally, since you are talking about a text file, and not a database, you will either be determining your own object notation for key-value pairs, or using some prefabricated format such as JSON or XML. Either way, it only takes one failed operation or one improperly formatted addition to render the entire text file unreadable. Then you either have the option of restoring from backup (assuming you implement version control...) and losing a ton of data, or throwing away the data and starting over. If the data isn't important to you anyway, then there's no reason to use the disk. If the point of keeping things on disk is to keep them around for posterity, you should be using a database. If having a relational database is less important than speed, you can use a NoSQL context such as MongoDB.
In short, by using the filesystem and text, you have to reinvent the wheel more times than anyone who isn't a complete masochist would enjoy.

In-memory SQLite database shared by multiple processes

I have a system where multiple processes successfully share a single SQLite disk-based database. The size and nature of the database are such that faster access is always desirable, and the database is temporary anyway, so keeping it fully in memory sounds like a good idea. I know SQLite supports in-memory databases, but it appears as if there is no way to share an in-memory database with another process (or at least this is how I understand it). Considering SQLite seems to use file mappings, I see no reason why a process-shared in-memory database could not exist (at least in theory).
I am keen to know if anybody knows a way to do this or has some other suggestion.
It is true that SQLite does not support sharing an in-memory database with other processes. There is little reason to implement such a feature, because the use cases are mostly artificial. You cite performance as a use case, but you can just create a file-based database on a tmpfs if you are on Linux. Otherwise you can still use a number of pragmas, such as PRAGMA synchronous=OFF;, to speed up your database by giving up durability. Going further, you can use PRAGMA journal_mode=MEMORY; to prepare commits in memory, or even PRAGMA journal_mode=OFF; if you do not need transaction support at all.
One of the main reasons for the lack of support is the need for locking. SQLite needs some means to lock the database, and currently these locking operations are tied to the file operations in the SQLite VFS implementation. You might still be able to implement your own VFS module that works in memory, but you risk ending up implementing a filesystem.
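A minimal sketch of the pragma combination mentioned above, applied right after opening a file-backed database (the path under tmpfs is an assumption for illustration):

#include <sqlite3.h>

/* Open a database kept on tmpfs and trade durability for speed. */
static sqlite3 *open_fast_db(const char *path)   /* e.g. "/dev/shm/cache.db" */
{
    sqlite3 *db;
    if (sqlite3_open(path, &db) != SQLITE_OK)
        return NULL;
    sqlite3_exec(db, "PRAGMA synchronous=OFF;",     NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA journal_mode=MEMORY;", NULL, NULL, NULL);
    /* or, if transactions are not needed at all:
       sqlite3_exec(db, "PRAGMA journal_mode=OFF;", NULL, NULL, NULL); */
    return db;
}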

sqlite multiple databases insert/update/delete

I was wondering how SQLite behaves when it is given multiple databases to insert/update/delete in at the same time. Does it spawn multiple processes, which could in theory give better concurrency than using a single database/single process, or does it use the same process for each?
Searching through the documentation didn't provide me with a definitive answer. I am aware that SQLite isn't the most ideal environment for multiple writes, as the database resides in a single file. But does that mean that multiple files = different write processes?
databaseOne = connectToSqlite('databaseOne');
databaseTwo = connectToSqlite('databaseTwo');

function write() {
    queryDatabaseOne("INSERT INTO SOME_TABLE VALUES (SOME_VALUES)");
    queryDatabaseTwo("INSERT INTO SOME_TABLE VALUES (SOME_VALUES)");
}
So, two different sqlite databases, and two inserts executed in parallel, towards tables in the two databases.
Thanks
Normally, database queries are blocking - they do not return until they are complete. This helps secure the integrity of the database. The SQLite API is blocking.
Of course, if you have multiple databases, then you can write a multi-threaded application with non-blocking routines that call the SQLite API, and then code overlapping, parallel inserts to the multiple databases. You will have to be careful about all the usual things in a multithreaded application - the SQLite API will neither help nor hinder - with the added complication of ensuring that there is no possibility of overlapping accesses to the SAME database.
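A sketch of the overlapping-inserts idea, with one thread per database and one connection per thread (error handling omitted; the table and file names are made up for the example):

#include <pthread.h>
#include <sqlite3.h>

/* Each worker opens its own connection to its own database file. */
static void *insert_worker(void *arg)
{
    const char *path = arg;
    sqlite3 *db;
    sqlite3_open(path, &db);
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(v TEXT);", NULL, NULL, NULL);
    sqlite3_exec(db, "INSERT INTO t VALUES('some value');",   NULL, NULL, NULL);
    sqlite3_close(db);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    /* Two independent database files, so the inserts do not contend for one lock. */
    pthread_create(&a, NULL, insert_worker, (void *)"databaseOne.db");
    pthread_create(&b, NULL, insert_worker, (void *)"databaseTwo.db");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}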

Why is SQLite fit for template cache?

The benefits of using SQLite storage for the template cache are faster read and write operations when the number of cache elements is important.
I've never used it yet, but how can using SQLite be faster than the plain file system?
IMO the overhead (initiating a connection) will make it slower.
BTW, can someone provide a demo of how to use SQLite?
There is no real notion of "initiating a connection": an SQLite database is stored as a single file in the local filesystem, so there is nothing like a network connection.
I suppose using an SQLite database can be seen as fast because there is only one file (the database) and not one file per template, and each access to a file costs some resources; the operating system might be able to cache accesses to one big file more efficiently than several accesses to several distinct small files.
About a "demo how to use SQLite", it depends on the language you'll be using, but you can start by taking a look at the SQLite documentation and the API that's available in your programming language; accessing an SQLite DB is not that hard. Basically, you have to:
"Connect" to the DB -- i.e. open the file
Issue some SQL queries
Close the connection
It's not much different from any other DB engine: the biggest difference is that there is no need to set up any DB server.
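Since a demo was asked for, here is the bare connect / query / close cycle using the C API; bindings in other languages follow the same shape. The file and table names are invented, and the templates table is assumed to exist already:

#include <sqlite3.h>
#include <stdio.h>

/* Print every row of a SELECT; used as the sqlite3_exec() callback. */
static int print_row(void *unused, int ncols, char **vals, char **names)
{
    (void)unused;
    for (int i = 0; i < ncols; i++)
        printf("%s=%s ", names[i], vals[i] ? vals[i] : "NULL");
    printf("\n");
    return 0;
}

int main(void)
{
    sqlite3 *db;
    sqlite3_open("templates.db", &db);                     /* 1. open the file */
    sqlite3_exec(db, "SELECT name, body FROM templates;",  /* 2. run a query   */
                 print_row, NULL, NULL);
    sqlite3_close(db);                                     /* 3. close         */
    return 0;
}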
The benefits of SQLite over a standard file system would lie in its caching mechanism. SQLite stores data in pages and caches pages in memory. Repeated calls for data that are on pages already in memory will skip a call out to the file system.
There is some overhead in using SQLite, though. When you connect to an SQLite database, the engine reads and parses the schema. On our system, that takes 30 ms (although it's usually less than 1 ms for smaller schemas; we have just under a hundred tables and hundreds of triggers and indexes).
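If the schema-parsing cost at connect time matters, the usual mitigations are to open the database once and reuse the handle, and, if memory allows, to enlarge the page cache. A sketch, where the file name and the 2000-page figure are arbitrary examples:

#include <sqlite3.h>

/* Open once at startup and reuse the handle, so the schema is parsed only once. */
static sqlite3 *db;

void cache_init(void)
{
    sqlite3_open("templates.db", &db);
    /* Keep roughly 2000 pages of the database cached in memory. */
    sqlite3_exec(db, "PRAGMA cache_size=2000;", NULL, NULL, NULL);
}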

In memory database with socket capability

Python --> SQLite --> ASP.NET C#
I am looking for an in-memory database application that does not have to write the data it receives to disk. Basically, I'll be having a Python server which receives gaming UDP data, translates the data, and stores it in the memory database engine.
I want to stay away from writing to disk, as it takes too long. The data is not important; if something goes wrong, it simply flushes and fills up with the next wave of data sent by players.
Next, another ASP.NET server must be able to connect to this in memory database via TCP/IP at regular intervals, say once every second, or 10 seconds. It has to pull this data, and this will in turn update on a website that displays "live" game data.
I'm looking at SQLite and wondering: is this the right tool for the job? Does anyone have any suggestions?
Thanks!!!
This sounds like a premature optimization (apologies if you've already done the profiling). What I would suggest is to go ahead and write the system in the simplest, cleanest way, but put a bit of abstraction around the database bits so they can easily be swapped out. Then profile it and find your bottleneck.
If it turns out it is the database, optimize the database in the usual way (indexes, query optimizations, etc...). If it's still too slow, most databases support an in-memory table format. Or you can mount a RAM disk and put individual tables or the whole database on it.
Totally not my field, but I think Redis is along these lines.
Whether SQLite is applicable depends on your data complexity.
If you need to perform complex queries on relational data, then it might be a viable option. If your data is flat (i.e. not relational) and processed as a whole, then some Python-internal data structures might be applicable.
Perhaps AppFabric would work for you?
http://msdn.microsoft.com/en-us/windowsserver/ee695849.aspx
SQLite doesn't allow remote "connections" as far as I know; it only supports being invoked as an in-process library. However, you could try to use MySQL which, while heavier, supports remote connections and does have in-memory tables.
See http://dev.mysql.com/doc/refman/5.5/en/memory-storage-engine.html
