PostgreSQL 9.1 - Find the difference between two databases

I have a specific architecture to set up with PostgreSQL.
I have a system that is based on two databases, N and N+1.
Database N is available to clients in read-only mode, and database N+1 is available to clients for modification.
The client can also send two commands to the system:
An "apply" command: all the modifications made on the N+1 db are kept and the new state of the system is a readonly db with N+1 data and a N+2 db with same data available for writes.
A "reset" command: the N+1 db is dropped and a new copy of the N database is made for writes access for the users.
My first idea was to keep the two databases in one PostgreSQL instance, run pg_dump and pg_restore on an apply or reset command, and rename the database on apply (N+1 -> N). The db may reach a size of 8 GB, so I am currently testing such a dump & restore on a CentOS 6.7 VM.
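Concretely, the rename variant of the apply command would look roughly like this. This is only a minimal sketch using Python's subprocess and the standard psql/dropdb client tools; the database names db_n and db_n1 are placeholders, and CREATE DATABASE ... TEMPLATE is used here as a file-level copy instead of dump & restore (both the rename and the template copy require that no clients are connected to the databases involved):

import subprocess

# Sketch of the "apply" command; db_n / db_n1 are placeholder names.
# 1. Drop the old read-only database N.
subprocess.run(["dropdb", "db_n"], check=True)
# 2. Promote N+1 to be the new read-only N.
subprocess.run(["psql", "-d", "postgres", "-c",
                "ALTER DATABASE db_n1 RENAME TO db_n;"], check=True)
# 3. Recreate N+1 as a copy of the new N, open for client writes.
subprocess.run(["psql", "-d", "postgres", "-c",
                "CREATE DATABASE db_n1 TEMPLATE db_n;"], check=True)

A reset command would be the same idea without the rename: drop db_n1 and recreate it from the db_n template.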
Then I looked at the pg_basebackup command, to set up a hot standby database that would be the read-only one. The problem is that such an architecture is based on the idea of data replication from a master to a slave, and that is something I don't want, since the client can issue a reset command that will drop the N+1 db.
The thing is, I don't know whether a system based on daily dump/restore is viable, or whether PostgreSQL offers a simple way to handle two databases with the same schema and "detect and apply" the differences between them. With that ability I could, on an apply command, copy only the difference from N+1 to N, and the reverse on a reset command.
Any ideas?

Related

Scan an entire DynamoDB table and update records based on a condition

We have a business requirement to deprecate certain field values ("State"). So we need to scan the entire db, find these deprecated field values, take the last record for that partition key (there can be multiple records for the same partition key; the sort key is LastUpdatedTimeepoch), and then update that record. Right now the table contains around 600k records. What's the best way to do this without bringing down the db service in production?
I see that this thread could help me:
https://stackoverflow.com/questions/36780856/complete-scan-of-dynamodb-with-boto3
But my main concern is this:
This is a one-time activity. As it will take time, we cannot run it in AWS Lambda, since it would exceed the 15-minute limit. So where can I keep the code running?
Create an EC2 instance, assign it a role with access to DynamoDB, and run the function on that instance.
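If you run it on EC2, a paginated scan keeps memory bounded and lets you track progress via LastEvaluatedKey. A minimal boto3 sketch, assuming a hypothetical table "Events" with partition key "Id", sort key "LastUpdatedTimeepoch", and placeholder deprecated values:

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Events")  # hypothetical table name

deprecated = {"OLD_STATE_A", "OLD_STATE_B"}  # placeholder values
latest = {}  # partition key -> newest matching item seen so far

scan_kwargs = {}
while True:
    page = table.scan(**scan_kwargs)
    for item in page["Items"]:
        if item.get("State") in deprecated:
            pk = item["Id"]  # hypothetical partition key name
            prev = latest.get(pk)
            if prev is None or item["LastUpdatedTimeepoch"] > prev["LastUpdatedTimeepoch"]:
                latest[pk] = item
    if "LastEvaluatedKey" not in page:
        break
    scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

# Update only the most recent record for each partition key.
# "State" is a DynamoDB reserved word, hence the #s alias.
for pk, item in latest.items():
    table.update_item(
        Key={"Id": pk, "LastUpdatedTimeepoch": item["LastUpdatedTimeepoch"]},
        UpdateExpression="SET #s = :new",
        ExpressionAttributeNames={"#s": "State"},
        ExpressionAttributeValues={":new": "MIGRATED"},  # placeholder
    )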

DynamoDB limitations when deploying MoonMail

I'm trying to deploy MoonMail on AWS. However, I receive this exception from CloudFormation:
Subscriber limit exceeded: Only 10 tables can be created, updated, or deleted simultaneously
Is there another way to deploy without opening a support case and asking them to raise my limit?
This is an AWS limit for APIs: (link)
API-Specific Limits

CreateTable/UpdateTable/DeleteTable

In general, you can have up to 10 CreateTable, UpdateTable, and DeleteTable requests running simultaneously (in any combination). In other words, the total number of tables in the CREATING, UPDATING, or DELETING state cannot exceed 10.

The only exception is when you are creating a table with one or more secondary indexes. You can have up to 5 such requests running at a time; however, if the table or index specifications are complex, DynamoDB might temporarily reduce the number of concurrent requests below 5.
You could try to open a support request with AWS to raise this limit for your account, but I don't feel this is necessary. It seems that you could create the DynamoDB tables a priori, using the AWS CLI or an AWS SDK, and point MoonMail at those pre-created tables. Using the SDK (example), you could create those tables sequentially, without ever hitting this simultaneous-creation limit.
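A sketch of that sequential approach with boto3. The table definition below is a placeholder, not MoonMail's actual schema; the waiter blocks until each table is ACTIVE, so at most one table is ever in the CREATING state:

import boto3

client = boto3.client("dynamodb")

tables = [
    {
        "TableName": "MoonMail-Example",  # placeholder definition
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    },
    # ... the remaining table definitions from s-resources-cf.json ...
]

for spec in tables:
    client.create_table(**spec)
    # Wait for this table to become ACTIVE before creating the next one.
    client.get_waiter("table_exists").wait(TableName=spec["TableName"])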
Another option is to edit the s-resources-cf.json file to include only 10 tables and deploy. After that, add the missing tables and deploy again.
Whatever solution you apply, consider creating an issue ticket in MoonMail's repo, because as it stands now, it does not work on the first try (there are 12 tables in the resources file).

Database for replication or simple transferring data

I will try to describe my problem of choosing the right technology.
I have many machines that store data locally in a database, and there is one client machine with its own database. What I need is to pull the data from all the machines and put it in the client's database.
For now I have started implementing some RPC, but I don't know if it's a good idea, because I need to take care of each table manually. The database is SQLite.
Which is better: making RPC calls, or finding some lightweight database with replication? Maybe a NoSQL db like MongoDB?
I have a similar setup: a couple of servers that collect various statistics and store them in a sqlite3 database. Combining them is really easy. A Python script connects to each server and downloads its database file into a temporary folder. I then open the first one, ATTACH each remaining file, and run an INSERT ... SELECT * for each table to merge all the other databases into a combined database:
import sqlite3

def mergeDatabases(curs, j):
    # Merge databases 2..j-1 into the already-open database 1.
    for i in range(2, j):
        print("merge in database%d" % i)
        curs.execute("ATTACH '/tmp/database%d.sl3' AS foo%d;" % (i, i))
        curs.execute("INSERT INTO db SELECT * FROM foo%d.db;" % i)
        curs.execute("INSERT INTO vars SELECT * FROM foo%d.vars;" % i)
        curs.execute("DETACH foo%d;" % i)

conn = sqlite3.connect('/tmp/database1.sl3')
curs = conn.cursor()
mergeDatabases(curs, 8)
conn.commit()

Can I read and write to a SQLite database concurrently from multiple connections?

I have a SQLite database that is used by two processes. I am wondering, with the most recent version of SQLite: while one process (connection) starts a transaction to write to the database, will the other process be able to read from the database simultaneously?
I collected information from various sources, mostly from sqlite.org, and put it together:
First, by default, multiple processes can have the same SQLite database open at the same time, and several read accesses can be satisfied in parallel.
In the case of writing, a single write locks the database for a short time, during which nothing, not even a read, can access the database file at all.
Beginning with version 3.7.0, a new "Write-Ahead Logging" (WAL) option is available, in which reading and writing can proceed concurrently.
By default, WAL is not enabled. To turn WAL on, refer to the SQLite documentation.
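In Python's sqlite3 module, for example, turning it on is a single statement (a sketch; the file name is a placeholder):

import sqlite3

conn = sqlite3.connect("shared.db")
# journal_mode is a property of the database file: once switched to
# WAL it stays in WAL for all connections until changed back.
conn.execute("PRAGMA journal_mode=WAL")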
SQLite3 explicitly allows multiple connections:
(5) Can multiple applications or multiple instances of the same application access a single database file at the same time?

Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.
For sharing connections, use SQLite3 shared cache:
Starting with version 3.3.0, SQLite includes a special "shared-cache" mode (disabled by default).

In version 3.5.0, shared-cache mode was modified so that the same cache can be shared across an entire process rather than just within a single thread.

5.0 Enabling Shared-Cache Mode

Shared-cache mode is enabled on a per-process basis. Using the C interface, the following API can be used to globally enable or disable shared-cache mode:

int sqlite3_enable_shared_cache(int);

Each call to sqlite3_enable_shared_cache() effects subsequent database connections created using sqlite3_open(), sqlite3_open16(), or sqlite3_open_v2(). Database connections that already exist are unaffected. Each call to sqlite3_enable_shared_cache() overrides all previous calls within the same process.
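From Python, one way to request shared-cache mode is a URI filename rather than the C API (a sketch; "shared.db" is a placeholder, and whether shared cache actually helps depends on your build and use case):

import sqlite3

# "cache=shared" asks for shared-cache mode; uri=True makes sqlite3
# interpret the first argument as a URI filename.
conn = sqlite3.connect("file:shared.db?cache=shared", uri=True)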
I had a similar code architecture to yours: a single SQLite database that process A read from while process B wrote to it concurrently, based on events (in Python 3.10.2, using the most up-to-date sqlite3 version). Process B was continually updating the database, while process A was reading from it to check the data. My issue was that it worked in debug mode, but not in "release" mode.
To solve my particular problem I used Write-Ahead Logging, which is referenced in previous answers. After creating my database in process B (write mode) I added the line:
cur.execute('PRAGMA journal_mode=wal')
where cur is the cursor object created from establishing the connection.
This set the journal to WAL mode, which allows concurrent access for multiple reads (but still only one write). In process A, where I was reading the data, I included the following before connecting to the same database:
time.sleep(0.5)
Setting a sleep timer before the connection was made fixed my issue with it not working in "release" mode.
In my case I did not have to manually set any checkpoints, locks, or transactions. Your use case might be different from mine, however, so further research is most likely required. Nevertheless, I hope this post helps and saves everyone some time!
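Putting that description together, a minimal sketch of the two sides might look like this (the file name, table, and sleep duration are placeholders from my setup, not a general recipe):

import sqlite3
import time

DB = "events.db"  # placeholder path shared by both processes

def writer_process_b():
    conn = sqlite3.connect(DB)
    conn.execute("PRAGMA journal_mode=wal")  # allow readers during writes
    conn.execute("CREATE TABLE IF NOT EXISTS events (ts REAL, payload TEXT)")
    conn.execute("INSERT INTO events VALUES (?, ?)", (time.time(), "tick"))
    conn.commit()

def reader_process_a():
    time.sleep(0.5)  # give the writer time to create the database first
    conn = sqlite3.connect(DB)
    for row in conn.execute("SELECT * FROM events"):
        print(row)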

Some basic Oracle concepts

Hi:
In our new application we have to use Oracle as the db, whereas we used MySQL/SQL Server before. Coming to Oracle, I am confused by its concepts, for example: the tablespace, the object, the schema, table, index, procedure, database link, ... :(
And the schema seems tied closely to the user; I cannot make sense of it.
When we used MySQL, I just knew that one database contains many tables and many users, and users have different permissions on different tables.
But in Oracle, everything is different.
Can anyone tell me some basic concepts of Oracle, and point me to some quick-start docs?
Oracle has specific meanings for commonly-used terms, and you're right, it is confusing. I'll build a hierarchy of terms from the bottom up:
Database - In Oracle, the database is the collection of files that make up your overall collection of data. To get a handle on what Oracle means, picture the database management system (dbms) in a non-running state. All those files are your "database."
Instance - When you start the Oracle software, all those files become active, things get loaded into memory, and there's an entity to which you can connect. Many people would use the term "database" to describe a running dbms, but, once everything is up-and-running, Oracle calls it an, "instance."
Tablespace - An abstraction that allows you to think about a chunk of storage without worrying about the physical details. When you create a user, you ask Oracle to put that user's data in a specific tablespace. Oracle manages storage via the tablespace metaphor.
Data file - The physical files that actually store the data. Data files are grouped into tablespaces. If you use all the storage you have allocated to a user, or group of users, you add data files (or make the existing files bigger) to the tablespace they're configured to use.
User - An abstraction that encapsulates the privileges, authentication information, and default storage areas for an account that can log on to an Oracle instance.
Schema - The tables, indices, constraints, triggers, etc. that are owned by a particular user. There is a one-to-one correspondence between users and schemas. The schema has the same name as the user. The difference between the two is that the user concept is all about account information, while the schema concept deals with logical database objects.
This is a very simplified list of terms. There are different states of "running" for an Oracle instance, for example, and it's easy to get into very nuanced discussions of what things mean. Here's a practical exercise that will let you put your hands on these things, and will make the distinctions clearer:
Start an already-created Oracle instance. This step will transform a group of files, or as Oracle would say, a database, into a running Oracle instance.
Create a tablespace with the CREATE TABLESPACE command. You'll have to specify some data files to put into the tablespace, as well as some storage parameters.
Create a user with the CREATE USER command. You'll see that the items you have to specify have to do with passwords, privileges, quotas, and the like. Specify that the user's data be stored in the tablespace you created in step 2.
Connect to the Oracle using the credentials you created with the new user from step 3. Type, "SELECT * FROM CAT". Nothing should come back yet. Your user has a schema, but it's empty.
Run a CREATE TABLE command. INSERT some data into the table. The schema now contains some objects.
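As a sketch, steps 2 through 5 might look like this when driven from Python with the python-oracledb driver (all names, passwords, file paths, and the DSN are placeholders; CREATE TABLESPACE requires DBA privileges):

import oracledb

# Step 1 assumed done: the instance is already running.
admin = oracledb.connect(user="system", password="***", dsn="localhost/XEPDB1")
cur = admin.cursor()

# Step 2: a tablespace backed by one data file (path is a placeholder).
cur.execute("CREATE TABLESPACE demo_ts "
            "DATAFILE '/u01/oradata/demo_ts01.dbf' SIZE 100M")

# Step 3: a user whose objects default to that tablespace.
cur.execute("CREATE USER demo IDENTIFIED BY demo_pw "
            "DEFAULT TABLESPACE demo_ts QUOTA UNLIMITED ON demo_ts")
cur.execute("GRANT CREATE SESSION, CREATE TABLE TO demo")

# Steps 4-5: connect as the new user; the schema starts out empty.
user = oracledb.connect(user="demo", password="demo_pw", dsn="localhost/XEPDB1")
ucur = user.cursor()
ucur.execute("SELECT * FROM CAT")
print(ucur.fetchall())  # [] -- no objects in the schema yet
ucur.execute("CREATE TABLE t (id NUMBER)")
ucur.execute("INSERT INTO t VALUES (1)")
user.commit()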
Tablespaces: these are basically storage definitions. When defining a table or index, etc., you can specify storage options simply by putting your table in a specific tablespace.
Table, index, procedure: these are pretty much the same as in other databases.
User, schema: explained well above.
Database link: you can join table A in instance A and table B in instance B using a database link between the two instances (while logged in to one of them).
Object: has properties (like the columns in a table) and methods that operate on those properties (pretty much as in OO design); these are not widely used.
A few links:
Start page for the 11g Release 2 docs: http://www.oracle.com/pls/db112/homepage
Database Concepts, table of contents: http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/toc.htm
