Meteor: Delay on direct db inserts (with external MongoDB) - meteor

I have a C application that inserts data directly to the database of my Meteor application. The app works fine (withoud delays) when I run it in development mode (with "meteor"). However, if I run the app as a node app (bundled) and with external MongoDB, there's an annoying delay in screen updates (5-10s).
I have noticed some previous discussions about this:
Meteor: server-side db insert delays
Using node ddp-client to insert into a meteor collection from Node
Questions:
Is there any other way than building a server-side API for doing the db inserts through Meteor?
Why the delay is only when using external MongoDB?
Is there a way in Meteor to shorten this database polling interval?

You need to enable oplog tailing. Without oplog tailing, when your C program makes a database write, the Meteor server doesn't realise anything has changed until it polls MongoDB again. With oplog tailing, it can pick up the changes much more quickly and efficiently. In development mode, oplog tailing is enabled automatically, but for production it needs some additional setup.
Your MongoDB must be set up as a replica set (a replica set of one node does work).
You have to pass in a mongo URL for the replica set's local database with the environment variable MONGO_OPLOG_URL.
For more information, see this article.

Related

DB is still writable after changing to read only

I'm trying to make DB read-only for MariaDB, but it is not working correctly until we restart the service (Spring Boot application). The service keeps writing the data until we restart.
How can we make DB read-only so that it immediately stops writing regardless of service restart?
I tried commands:
1.
FLUSH TABLES WITH READ LOCK; 
SET GLOBAL read_only = 1; 
FLUSH TABLES WITH READ LOCK; 
None of the above fulfills the requirement. Please suggest if there's a different approach to do that.

How to set up a database on startup in AWS amplify?

I have an amplify and I get how to add an api function, use a lambda layer etc. What I don't see is how to create the database on startup -- it appears from the documentation that this is done from a CloudFormation stack, but I still can't see how to ensure that the database is set up on startup of the app (or build the tables if not) using something like SQLAlchemy.
What's the intended flow here?

Airflow Metadata DB = airflow_db?

I have a project requirement to back-up Airflow Metadata DB to some data warehouse (but not using an Airflow DAG). At the same time, the requirement mentions some connection called airflow_db.
I am quite new to Airflow, so I googled a bit on the topic. I am a bit confused about this part. Our Airflow Metadata DB is PostgreSQL (this is built from docker-compose, so I am tinkering on a local install), but when I look at Connections in Airflow Web UI, it says airflow_db is MySQL.
I initially assumed that they are the same, but by the looks of it, they aren't? Can someone explain the difference and what they are for?
Airflow creates airflow_db Conn Id with MySQL by default (see source code)
Default connections are not really useful in production system. It's just a long list of stuff that you are probably not going to use.
Airflow 1.1.10 introduced the ability not to create the default list by setting:
load_default_connections = False in airflow.cfg (See PR)
To give more background the connection list is where hooks find the information needed in order to connect to a service. It's not related to the backend database. Though the backend is db like any db and if you wish to allow hooks to interact with it you can define it in the list like any other connection (which is probably why you have this as option in the default).

Firebase + Datastore = need_index

I'm working through the appengine+go tutorial, which connects in with Firebase: https://cloud.google.com/appengine/docs/standard/go/building-app/. The code is available at https://github.com/GoogleCloudPlatform/golang-samples/tree/master/appengine/gophers/gophers-6, which aside from my Firebase keys is identical.
I have it working locally just fine under dev_appserver.py, and it queries the Vision API and adds labels. However, after I deploy to appengine I get an index error on datastore. If I go to the Firebase console, I see the collection (Post) and the field (Posted) which is a timestamp.
If I change this line: https://github.com/GoogleCloudPlatform/golang-samples/blob/master/appengine/gophers/gophers-6/main.go#L193 to remove the Order("-Posted") then everything works (it's important to note that any Order call causes it to error, except the test records I've posted come in random order.
The error message when running in appengine is: "Getting posts: API error 4 (datastore_v3: NEED_INDEX): no matching index found."
I've attempted to create a composite index, or test locally with --require_indexes=true and it hasn't helped me debug the issue.
Edit: I've moved this over to use Firebase's Datastore libraries directly, instead of the GCP updates. I never solved this particular issue, but was able to move forward with my app actually working :)
By default the local development server automatically creates the composite indexes needed for the actual queries invoked in your app. From Creating indexes using the development server:
The development web server (dev_appserver.py) automatically adds
items to this file when the application tries to execute a query that
needs an index that does not have an appropriate entry in the
configuration file.
In the development server, if you exercise every query that your app
will make, the development server will generate a complete list of
entries in the index.yaml file.
When the development web server adds a generated index definition to
index.yaml, it does so below the following line, inserting it if
necessary:
# AUTOGENERATED
The development web server considers all index definitions below this
line to be automatic, and it might update existing definitions below
this line as the application makes queries.
But you also need to deploy the generated index configurations to the Datastore and let the Datastore update indexing information (i.e. the indexes to get into the Serving state) for the respective queries to not hit the NEED_INDEX error. From Updating indexes:
You upload your index.yaml configuration file to Cloud Datastore
with the gcloud command. If the index.yaml file defines any
indexes that don't exist in Cloud Datastore, those new indexes are
built.
It can take a while for Cloud Datastore to create all the indexes and
therefore, those indexes won't be immediately available to App Engine.
If your app is already configured to receive traffic, then exceptions
can occur for queries that require an index that is still in the
process of being built.
To avoid exceptions, you must allow time for all the indexes to build.
For more information and examples about creating indexes, see
Deploying a Go App.
To upload your index configuration to Cloud Datastore, run the
following command from the directory where your index.yaml is located:
gcloud datastore create-indexes index.yaml
For information, see the gcloud datastore reference.
You can use the GCP Console, to check the status of your indexes.

Can I read and write to a SQLite database concurrently from multiple connections?

I have a SQLite database that is used by two processes. I am wondering, with the most recent version of SQLite, while one process (connection) starts a transaction to write to the database will the other process be able to read from the database simultaneously?
I collected information from various sources, mostly from sqlite.org, and put them together:
First, by default, multiple processes can have the same SQLite database open at the same time, and several read accesses can be satisfied in parallel.
In case of writing, a single write to the database locks the database for a short time, nothing, even reading, can access the database file at all.
Beginning with version 3.7.0, a new “Write Ahead Logging” (WAL) option is available, in which reading and writing can proceed concurrently.
By default, WAL is not enabled. To turn WAL on, refer to the SQLite documentation.
SQLite3 explicitly allows multiple connections:
(5) Can multiple applications or multiple instances of the same
application access a single database file at the same time?
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
For sharing connections, use SQLite3 shared cache:
Starting with version 3.3.0, SQLite includes a special "shared-cache"
mode (disabled by default)
In version 3.5.0, shared-cache mode was modified so that the same
cache can be shared across an entire process rather than just within a
single thread.
5.0 Enabling Shared-Cache Mode
Shared-cache mode is enabled on a per-process basis. Using the C
interface, the following API can be used to globally enable or disable
shared-cache mode:
int sqlite3_enable_shared_cache(int);
Each call sqlite3_enable_shared_cache() effects subsequent database
connections created using sqlite3_open(), sqlite3_open16(), or
sqlite3_open_v2(). Database connections that already exist are
unaffected. Each call to sqlite3_enable_shared_cache() overrides all
previous calls within the same process.
I had a similar code architecture as you. I used a single SQLite database which process A read from, while process B wrote to it concurrently based on events. (In python 3.10.2 using the most up to date sqlite3 version). Process B was continually updating the database, while process A was reading from it to check data. My issue was that it was working in debug mode, but not in "release" mode.
In order to solve my particular problem I used Write Ahead Logging, which is referenced in previous answers. After creating my database in Process B (write mode) I added the line:
cur.execute('PRAGMA journal_mode=wal') where cur is the cursor object created from establishing connection.
This set the journal to wal mode which allows for concurrent access for multiple reads (but only one write). In Process A, where I was reading the data, before connecting to the same database I included:
time.sleep(0.5)
Setting a sleep timer before a connection was made to the same database fixed my issue with it not working in "release" mode.
In my case: I did not have to manually set any checkpoints, locks, or transactions. Your use case might be different than mine however, so research is most likely required. Nevertheless, I hope this post helps and saves everyone some time!

Resources