SQLAlchemy Connections, pooling, and SQLite

So, my design calls for a separate SQLite file for each "project". I am reading through the SQLAlchemy pooling docs more carefully. My guess right now is that I don't want to fool with pooling at all; what I really want is a separate engine for each project. Agree?
In that case, when I create the engine, either I connect to a file named by convention, or I create a new SQLite file and supply a schema template?

Ehm, what? Connection pools contain many connections to the same (database) server. It takes time to establish a new connection, so when there are many short-lived processes using the same database, it's handy to have a pool of already established connections. The processes can check out a connection, do their thing, and return it, without having to wait while a new connection is opened.
In any case, all connections go to the same database, given by the URI passed to create_engine.
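A rough sketch of that checkout cycle with SQLAlchemy (the file name is hypothetical, and the exact pool class SQLAlchemy picks for SQLite files varies by version):
from sqlalchemy import create_engine, text

# Each engine manages its own pool for a single database URI.
engine = create_engine('sqlite:///project1.db')

conn = engine.connect()  # check a connection out of the engine's pool
print(conn.execute(text('SELECT 1')).fetchall())
conn.close()             # hand it back to the pool rather than closing it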

First, some vocabulary. SQLAlchemy defines schemas with MetaData objects, which contain objects representing tables and other database entities. MetaData objects can optionally be "bound" to engines, which are what you think of as "pools."
To create a standard schema and use it across multiple databases, you'll want to create one MetaData object and use it with several engines, each engine being a database you connect to. Here's an example from the interactive IPython prompt. Note that each of these SQLite engines connects to a different in-memory database; connection1 and connection2 do not connect to the same database:
In [1]: from sqlalchemy import *
In [2]: metadata = MetaData()
In [3]: users_table = Table('users', metadata,
   ...:     Column('id', Integer, primary_key=True),
   ...:     Column('name', String))
In [4]: connection1 = create_engine('sqlite:///:memory:')
In [5]: connection2 = create_engine('sqlite:///:memory:')
In [6]: ## Create necessary tables
In [7]: metadata.create_all(bind=connection1)
In [8]: metadata.create_all(bind=connection2)
In [9]: ## Insert data
In [10]: connection1.execute(users_table.insert().values(name='Mike'))
In [11]: connection2.execute(users_table.insert().values(name='Jim'))
In [12]: print(connection1.execute(users_table.select()).fetchall())
[(1, 'Mike')]
In [13]: print(connection2.execute(users_table.select()).fetchall())
[(1, 'Jim')]
As you can see, I connected to two SQLite databases and executed statements on each using a common schema stored in my MetaData object. If I were you, I'd start by just using the create_engine method and not worry about pooling. When it comes time to optimize, you can tweak how databases are connected to using arguments to create_engine.
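For example, a minimal sketch of that tweaking (file and URL names hypothetical; poolclass, pool_size, and max_overflow are standard create_engine arguments):
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# One engine per project file. NullPool opens and closes a real connection
# on each use, which is often fine for local SQLite files.
project_engine = create_engine('sqlite:///project1.db', poolclass=NullPool)

# For a server database you would size the pool instead, e.g.:
# create_engine('postgresql://user@host/dbname', pool_size=5, max_overflow=10)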

Related

Create a dynamic database connection in Airflow DAG

I am using Apache Airflow 2.2.3, and I know we can create connections via admin/connections. But I am trying to find a way to create a connection using dynamic DB server details.
My DB host, user, and password details come through the DAGRun input config, and I need to read and write data to the DB.
You can read connection details from the DAGRun config:
# Say we gave input {"username": "foo", "password": "bar"}
from airflow.models.connection import Connection

def mytask(**context):
    username = context["dag_run"].conf["username"]
    password = context["dag_run"].conf["password"]
    connection = Connection(login=username, password=password)
However, all operators in Airflow that require a connection take an argument conn_id, a string identifying the connection in the metastore/env var/secrets backend. At the moment it is not possible to provide a Connection object directly.
Therefore, if you implement your own Python functions (and use the PythonOperator or @task decorator) or implement your own operators, you can create a Connection object and perform whatever logic you need with it. But this will not work with any other existing operators in Airflow.
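If you do go the custom-task route, a minimal sketch might look like this (the conn_type and the config keys beyond username/password are assumptions, and connecting via Connection.get_uri() requires the matching SQLAlchemy driver to be installed):
from airflow.models.connection import Connection
from sqlalchemy import create_engine

def mytask(**context):
    conf = context["dag_run"].conf
    connection = Connection(
        conn_type="mysql",            # assumed DB type; adjust to yours
        host=conf["host"],
        login=conf["username"],
        password=conf["password"],
        schema=conf.get("database"),  # hypothetical optional key
    )
    # Render the connection as a SQLAlchemy-style URI and connect manually.
    engine = create_engine(connection.get_uri())
    with engine.connect() as conn:
        ...  # read from / write to the database here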

Is there any good way to pass in bytecode directly to sqlite3?

I'm working on a tool that allows Python developers to write Pythonic code to interact with a sqlite3 database, similar to SQLAlchemy but without the "translation" phase. If I can generate a sqlite3 prepared statement, how can I pass it directly to the evaluation system?
As a rough example, here's how I envision a user interacting with my tool:
my_table = Table("field1", "field2", "field3")
my_table.insert("foo", "bar", "baz")
select = my_table.select("field1")
---------------
print(select)
>>> ["foo"]
There is no (public) API in SQLite3 that allows you to execute pre-built SQLite bytecode. The bytecode for an SQL statement can be viewed with SQLite's EXPLAIN command, but that is meant for debugging and learning purposes, not for what you're trying to do.
And for most purposes, you shouldn't need this. If you're worried about the time spent compiling a prepared statement, note that sqlite3_stmt objects can be stored for the lifetime of the sqlite3 database connection they were created with. Prepared statements that have been executed can be reset, allowing them to be executed again. So as long as the database connection exists, you can compile a statement once and use it as many times as you need.
But that's about it. There is no mechanism to persist a prepared statement beyond the lifespan of the sqlite3 connection. You can't extract the bytecode through any public API, and you can't use bytecode you've obtained to reconstitute a prepared statement.
If you want persistence beyond the connection, then you need to store the SQL statement text somewhere persistent, and simply recompile the prepared statement when you reconnect to the database. That one recompilation (or many, depending on how many statements you store) shouldn't be much of a burden over the lifespan of your application.
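As an aside, Python's sqlite3 module already does this caching for you: each connection keeps an internal cache of compiled statements (the cached_statements argument to connect), so reusing the same parameterized SQL text avoids recompiling it. A minimal sketch:
import sqlite3

# Each connection caches compiled statements; 128 is an arbitrary cache size.
conn = sqlite3.connect(':memory:', cached_statements=128)
conn.execute('CREATE TABLE t (field1 TEXT)')

insert_sql = 'INSERT INTO t (field1) VALUES (?)'  # compiled once, reused below
for value in ('foo', 'bar', 'baz'):
    conn.execute(insert_sql, (value,))

conn.commit()
print(conn.execute('SELECT field1 FROM t WHERE field1 = ?', ('foo',)).fetchall())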

specify default schema for a database in db2 client

Is there any way to specify a default schema for cataloged databases in the DB2 client on AIX?
The problem is that when it connects to the DB, it takes the user ID as the default schema, and that's where it fails.
We have too many scripts running transactions against the DB without specifying a schema in their DB2 SQL statements, so changing the scripts is not feasible.
Also, we can't create users to match the schema.
You can try issuing SET SCHEMA <your schema> before executing your queries.
NOTE: I'm not sure this works (I don't have a DB2 database at hand at the moment, but it appears to), and it may depend on your DB2 version.
You can create a stored procedure that just changes the current schema, and then set that SP as the connect procedure. You can test some conditions before making the schema change, for example whether the stored procedure is being executed directly from the AIX server with a given user.
You configure the database to run this SP each time a connection is established by modifying connect_proc:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.admin.config.doc/doc/r0057371.html
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.admin.dbobj.doc/doc/c0057372.html
You can create aliases in the new user's schema that point to the tables in the other schema. See these links:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0000910.html
http://bytes.com/topic/db2/answers/181247-do-you-have-always-specify-schema-when-using-db2-clp
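If the connection is made from Python rather than the CLP, a rough equivalent with the ibm_db driver (all connection details hypothetical) is to issue the schema switch immediately after connecting:
import ibm_db

# Connect, then switch the default schema before any unqualified statements run.
conn = ibm_db.connect(
    "DATABASE=mydb;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;"
    "UID=appuser;PWD=secret;", "", "")
ibm_db.exec_immediate(conn, "SET CURRENT SCHEMA MYSCHEMA")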

How does one use the "create database" statement for Oracle express 11g?

According to one of my posts (below), it seems that there is no such thing as a database in Oracle. What we call a database in MySQL and MS SQL Server is called a schema in Oracle.
If that is the case, then why do the Oracle docs mention the CREATE DATABASE statement?
For the record, I am using Oracle 11g and the Oracle SQL Developer GUI tool.
Post:
How to create a small and simple database using Oracle 11g and SQL Developer?
The CREATE DATABASE statement from the Oracle docs is given below. If there is no database concept, then how did this command come into the picture?
CREATE DATABASE
CREATE DATABASE [ database ]
{ USER SYS IDENTIFIED BY password
| USER SYSTEM IDENTIFIED BY password
| CONTROLFILE REUSE
| MAXDATAFILES integer
| MAXINSTANCES integer
| CHARACTER SET charset
| NATIONAL CHARACTER SET charset
| SET DEFAULT
{ BIGFILE | SMALLFILE } TABLESPACE
| database_logging_clauses
| tablespace_clauses
| set_time_zone_clause
}... ;
There is a concept of a "database" in Oracle. What the term "database" means in Oracle is simply different from what it means in MySQL or SQL Server.
Since you are using the Express Edition, Oracle automatically runs the CREATE DATABASE statement as part of the installation process. You can only have one Express Edition database on a single machine. If you are installing a different edition, you can choose whether to have the installer create a database as part of the installation process or to do that manually via the CREATE DATABASE statement later. If you are just learning Oracle, you're much better off letting Oracle create the database for you at installation time: you can only create a database via command-line tools (not SQL Developer), and it is rare that someone just starting out would need to tweak database settings in a way the installer didn't prompt for.
In Oracle, a "database" is a set of data files that includes the data files for the SYS and SYSTEM schemas (which contain all the Oracle data dictionary tables), the data files for the TEMP tablespace (where sorts and other temporary operations occur), and the data files for whatever schemas you want to create. In SQL Server and other RDBMSs, these would be separate "databases". In SQL Server, you have a master database, a tempdb database, additional databases for different products (e.g., msdb for the SQL Server Agent), and then additional user-defined databases. In Oracle, these would all be separate schemas in a larger container that Oracle refers to as a "database".
Occasionally, a DBA will want to run multiple Oracle databases on the same server, most commonly when different packaged applications have different requirements about database versions or parameters. If you want to run application A, which requires an 11.2 database, and application B, which doesn't support 11.2 yet, you would need two different databases on the server. The DBA could create a separate database and a separate instance, but that doubles the memory requirements, doubles the number of background processes required to run the database, and generally makes things less scalable. It's necessary if you really want to run different versions of the database simultaneously, but it's not ideal.
The person who answered your original question is correct. The DDL (Data Definition Language) above prepares a space for schemas, which are analogous to MySQL's "databases". The statement defines characteristics of that space, such as the time zone, the space allotted for tables, the character set encoding, the administrative accounts, etc. You would then issue DDL statements such as those in your other post to create schemas, which define what each user can see.

Database for replication or simple transferring data

I will try to describe my problem of choosing the right technology.
I have many machines that store data locally in a database, and there is one client machine with its own database. What I need is to pull the data from all the machines and put it in the client's database.
For now I have started implementing some RPC, but I don't know if it's a good idea, because I need to take care of each table manually. The database is SQLite.
Which is better: making RPC calls, or finding some lightweight database with replication? Maybe a NoSQL DB like MongoDB?
I have a similar setup where I have a couple of servers that collect various statistics and store them in sqlite3 databases. Combining them is really easy. I have a Python script that connects to each server and downloads each database file into a temporary folder. I then open the first one, ATTACH each remaining file, and run an INSERT ... SELECT for each table to merge all the other databases into one combined database:
import sqlite3

def mergeDatabases(curs, j):
    # Attach each downloaded database file, copy its rows into the
    # combined database's tables, then detach it.
    for i in range(2, j):
        print("merging in database%d" % i)
        curs.execute("ATTACH '/tmp/database%d.sl3' AS foo%d;" % (i, i))
        curs.execute("INSERT INTO db SELECT * FROM foo%d.db;" % i)
        curs.execute("INSERT INTO vars SELECT * FROM foo%d.vars;" % i)
        curs.execute("DETACH foo%d;" % i)

conn = sqlite3.connect('/tmp/database1.sl3')
curs = conn.cursor()
mergeDatabases(curs, 8)
conn.commit()
