Flask: manage db connection to an in-memory SQLite database

I have a Flask application that needs to store some information from requests. The information is quite short-lived, and if the server is restarted I do not need it any more, so I do not really need persistence.
I have read here that an SQLite database held in memory can be used for that. What is the best way to manage the database connection? In the Flask documentation, connections to the database are created on demand, but my database will be deleted when I close the connection.
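For reference, the on-demand pattern from the Flask docs looks roughly like this (a minimal sketch), which shows the problem: every new connection to :memory: is a fresh, empty database, and closing the connection throws the contents away.

    import sqlite3
    from flask import Flask, g

    app = Flask(__name__)
    DATABASE = ':memory:'  # each new connection gets a fresh, empty database

    def get_db():
        db = getattr(g, '_database', None)
        if db is None:
            db = g._database = sqlite3.connect(DATABASE)
        return db

    @app.teardown_appcontext
    def close_connection(exception):
        db = getattr(g, '_database', None)
        if db is not None:
            db.close()  # with :memory:, this discards everything stored so far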

The problem with using an in-memory SQLite db is that it cannot be accessed from multiple threads.
http://www.sqlite.org/inmemorydb.html
Compounding the problem, you are likely going to have more than one process running your app, which makes an in-memory global variable out of the question as well.
So unless you can be certain that your app will only ever require a single thread in a single process (which is unlikely), you're going to need to either:
Use the disk to store state, such as an on-disk sqlite db, or even just some file you parse.
Use a daemonized process that runs separately from your application to manage the state.
I'd personally go with option 2.
You can use memcached for this, running on a central server or even on your app server if you've only got one. This will allow you to store state (including Python objects!) temporarily in memory, and you can even set timeout values for when the data should expire, which from the sound of things might be useful for your app.
Since you're using Flask, you've got some really good built-in support for using a memcached cache; check it out here: http://flask.pocoo.org/docs/patterns/caching/
As for getting memcached running on your server, it's really just an apt-get or yum install away. Let me know if you have questions and I'll be happy to update.
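A minimal sketch of that pattern (assuming memcached on its default port and a memcache client library such as python-memcached installed; the key and values here are just examples):

    from werkzeug.contrib.cache import MemcachedCache

    cache = MemcachedCache(['127.0.0.1:11211'])

    # Store short-lived request state with a five-minute expiry.
    cache.set('request-state-42', {'step': 3, 'token': 'abc'}, timeout=300)

    state = cache.get('request-state-42')  # None once it has expired or been evicted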

Related

Wrong dependency on IIS restart for getting changed data from SQL Server

I am working on an ASP.NET WebForms application with Entity Framework. For some reports it also uses a DLL in which we have explicit queries (ADO) to get the records from SQL Server.
The problem is that when I change a column such as ParentID in SQL Server, I must restart the website in IIS to see the change, and this solves the problem. This dependency is not logical, and I want to know why it happens. Is it related to caching because of the method called in the DLL?
How can I solve this problem?
When you run a query against SQL server (or any database, really), the result that you see is not the data "in the database", so to speak. The query returns a copy of that data that belongs only to you. The copy of the data gets sent over the network, to the client - in your case, an ASP.NET web application - and the application does whatever it needs to do, such as show it to a user.
Once the query which retrieved the data is complete, there is no longer any link between the data in the client, and the data in the database. There is no continuous, "live" connection between the two, even if your actual database connection is still open. The database connection is merely a way to send queries to the server, and for it to send copies of the data back.
It's like taking a copy of a file from a different machine. If you copy a file from my machine, and then I update my copy, your copy doesn't instantly get updated.
If you want data in some user interface to stay perfectly up to date with the data that actually exists in the database, you have a difficult problem to solve. There is no "easy" way to do this. Or perhaps more accurately, there is no simple or efficient way to do this.
This might seem odd to you. You're thinking "well, why not? Why doesn't it just show me the values as they actually exist?". The reason is that these systems need to be able to support many users - often thousands at once - who are all both reading the database and writing to it. Imagine someone was in the middle of updating data in the database, but then they rollback their transaction. Should you see the data as it was being modified, but not committed? What if two users are trying to update "the same" data at once? All sorts of concurrency questions come into play, which basically boils down to questions about locking.
What you are encountering here is a basic principle of multi-threaded environments, which translates to systems with multiple clients: Data can't be accessed directly by multiple people at the same time. Instead, you give each person their own immutable copy.
In a web application things are even more disconnected. When the browser requests the web page, the server side of the web application gets a copy of the data from the database, and then transmits that to the browser. Once the page is loaded there is no longer any link between the web server and the database server, or any link between the web server and the web browser at the client, and certainly no link between the web browser and the database.
Ultimately, this is one of the "hard problems" in computer science. You want to know how to tell the client to invalidate its "cache" and refresh its local data. There are a few mechanisms provided by .NET to do this with SQL Server, but they are quite technical. One of them is query notifications.
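A rough sketch of query notifications via SqlDependency (the connection string, table name, and refresh handler here are placeholders, and the feature needs Service Broker enabled on the database):

    using System.Data.SqlClient;  // SqlDependency, SqlConnection, SqlCommand

    SqlDependency.Start(connectionString);  // per-AppDomain setup

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("SELECT ParentID FROM dbo.Records", conn))
    {
        // Query notifications need an explicit column list and two-part table names.
        var dependency = new SqlDependency(cmd);
        dependency.OnChange += (sender, e) => RefreshLocalData();  // placeholder handler
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            // Executing the command registers the subscription; consume the copy as usual.
        }
    }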

How to start Apache Jena Fuseki as a read-only service (but also initially populate it with data)

I have been running an Apache Jena Fuseki server on a closed port for a while. At present my other apps can access it via localhost.
Following their instructions I start this service as follows:
./fuseki-server --update --mem /ds
This creates an updatable in-memory database.
The only way I currently know how to add data to this database is using the built-in HTTP request tools:
./s-post http://localhost:3030/ds/data
This works great, except now I want to expose this port so that other people can query the dataset. However, I don't want to allow people to update or change the database; I just want them to be able to query the information I originally loaded into the database.
According to the documentation (http://jena.apache.org/documentation/serving_data/), I can make the database read-only by starting it without the update option.
Data can be updated without access control if the server is started
with the --update argument. If started without that argument, data is
read-only.
But when I start the server this way, I am no longer able to populate it with the initial dataset.
So, MY QUESTION: how do I start an in-memory Fuseki database which I can populate with my original dataset but then disallow further HTTP updates?
(My guess is that I need another way of populating the Fuseki database that does not use the HTTP protocol, but I'm not sure.)
Here are some options:
1/ Use the TDB tools to build a database offline, then start the server read-only on that TDB database (see the example commands after this list).
2/ Like (1), but use --update to build a persistent database, then stop the server and restart it without --update. The database is now read-only. --update affects the services available; it does not affect the data in any other way.
Having a persistent database has the huge advantage that you can start and stop the server without needing to reload data.
3/ Use a web server to pass query requests through to the Fuseki server, and have the Fuseki server listen only on localhost. You can update from the local machine; external people can't.
4/ Use Fuseki2 and adjust the security settings to allow update only from localhost but query from anywhere.
What you can't do is update a TDB database currently being served by Fuseki.
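For option 1, the steps look roughly like this (a sketch; the data file and database directory are placeholders, and tdbloader ships with Apache Jena):

    # Build the TDB database offline from your dataset
    tdbloader --loc=/path/to/DB data.ttl
    # Serve it read-only: no --update, so only query services are exposed
    ./fuseki-server --loc=/path/to/DB /ds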

SQLite shared cache

I have a huge (>10 GB) SQLite database that is shared among many processes (up to the CPU core count, all the same executable). This is a specialized application, so RAM is not an issue, and I want to cache as much of the database in memory as possible. I found out about PRAGMA cache_size and I am successfully using it, but this blows the RAM usage out of proportion, as each of the many processes has its own private cache.
Now, I found SQLite Shared-Cache Mode, but I can't see whether it applies to different processes or just to threads in one process. I have run some tests which confirm the latter, but I am not sure if I am doing something wrong or whether something else needs to be done to make this work.
That page explains that "the same cache can be shared across an entire process".
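In other words, shared-cache mode helps only between connections inside one process (threads, for example), not across processes. A minimal Python sketch of the in-process case, with a placeholder file name:

    import sqlite3

    # Two connections in the same process share one page cache via the URI parameter.
    db1 = sqlite3.connect('file:big.db?cache=shared', uri=True)
    db2 = sqlite3.connect('file:big.db?cache=shared', uri=True)

    # The per-process cache size is what multiplies RAM usage across processes;
    # a negative value is in KiB, so this asks for roughly 1 GiB.
    db1.execute('PRAGMA cache_size = -1048576')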
In theory, you could try to configure your OS so that the entire database is held in the file cache.
If the amount of data in individual queries is small, it might be worthwhile to use a client/server database so that the caching needs to be done only in the server process.

Automatically switch state management if SQL Server is unavailable

This may be a dumb question, and based on the fact that googling has failed me I'm betting the answer is "no", but I thought I'd ask in case someone else has figured it out.
We're finally putting our website on a server farm, which means we can't use InProc session management. We're using SQLServer mode instead, but we had a situation where our SQL Cluster crashed. During this time, none of our newer web apps were able to load because of an inability to connect to the session database.
So here's the question: is it possible to automatically fall back to a different session-management mode (StateServer, for example) or dynamically change the connection string so that we can use a backup SQL Server?
For now, our plan is to use DNS and if the main SQL Cluster fails, simply switch the DNS to a backup, but that's a manual task, and takes some time. We were hoping to have some sort of automatic failover.
I am afraid that there is no way. Switching the session-state mode would also break your application, because users won't be able to find the data that was stored in their sessions. So the advice I can give you is the following: use a dedicated SQL Server for the sessions; don't use the same server as the one serving your application data. And if you can, progressively update your application so that it uses sessions less and less, storing very small amounts of data, until you completely get rid of them. Make it stateless. Then your application will become very scalable.
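For the dedicated-server part, the relevant web.config section looks roughly like this (a sketch; the server name and timeout are placeholders):

    <sessionState mode="SQLServer"
                  sqlConnectionString="Data Source=SESSIONS-SQL;Integrated Security=True"
                  timeout="20" />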

using SQLite with mod_perl

I have been successfully using SQLite as a data store for my web applications, but now I am implementing a web site with mod_perl, and am running into database locking issues.
As expected, my entire web application is loaded by the Plack Apache handler (Plack::Handler::Apache2) when the web server is started. Well, the first db query creates a lock on the entire database, and any subsequent query that has to modify the db fails.
What is my way out? Can I use SQLite in a persistent web environment or not? Should I be looking for some other db store?
I am not a fan of MySQL and don't want to use it. I could potentially use Postgres, but I'd rather use something lightweight, and preferably SQL-based, since using key/value databases such as Tokyo Cabinet would require learning a whole new way of doing things. I'd really rather use SQLite.
If you have an open statement handle to the database, it can cause this issue. I have had problems where iterating over a result set during a long process caused the lock to stick around.
Try to fetch all the rows for the query and call $sth->finish() to clear up the lock. You will use a little more memory, but you will avoid the locking.
Knowing you are going to do this, you can make use of $sth->fetchall_arrayref() or $sth->fetchall_hashref().
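For example (a sketch assuming an already-connected $dbh and a hypothetical table; draining the result set up front lets finish() release the lock before the slow work starts):

    my $sth = $dbh->prepare('SELECT id, payload FROM requests');  # hypothetical table
    $sth->execute;
    my $rows = $sth->fetchall_arrayref({});  # pull the whole result set into memory
    $sth->finish;                            # statement handle done; the lock is freed

    for my $row (@$rows) {
        process_slowly($row->{id}, $row->{payload});  # placeholder for the long work
    }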
Use Tokyo Cabinet's table database.
