Airflow: Storing a Databricks Connection in Environment Variables

I want to store my Databricks connection information as an environment variable,
as mentioned in
https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html
I am also looking at the following:
https://docs.databricks.com/dev-tools/data-pipelines.html
it says to set the login as: {"token": "abc", "host": "123"}
I'm not sure what to export… does anyone have a clue? I have the token etc., but what is the export statement?

If you have already created the connection from the Airflow UI, open a terminal and enter this command: airflow connections get your_connection_id.
Example:
$ airflow connections get sqlite_default
Id: 40
Conn Id: sqlite_default
Conn Type: sqlite
Host: /tmp/sqlite_default.db
Schema: null
Login: null
Password: null
Port: null
Is Encrypted: false
Is Extra Encrypted: false
Extra: {}
URI: sqlite://%2Ftmp%2Fsqlite_default.db
The URI key holds the value you can create the environment variable from. Following this example, it would be:
export AIRFLOW_CONN_MY_PROD_DATABASE='sqlite://%2Ftmp%2Fsqlite_default.db'
Hope that works for you!
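For the Databricks case specifically, here is a hedged sketch of the export statement. The workspace URL and token below are placeholders, and the sketch relies on the fact that query parameters in an Airflow connection URI are parsed into the connection's Extra field, which is where the Databricks hook looks for the token:

export AIRFLOW_CONN_MY_DATABRICKS='databricks://my-workspace.cloud.databricks.com?token=dapiXXXXXXXX'

The suffix after AIRFLOW_CONN_ becomes the connection id (here my_databricks, case-insensitive), so reference that id in your Databricks operator or hook.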

Related

How to Fix NO_SECRET warning thrown by Next-Auth

I have a Next.js application that uses NextAuth. While in development I continuously keep getting the warning stipulating that I need to set a secret, but I don't know where I should set it.
Following this reference I see I just need to run openssl rand -base64 32 to get a secret, but I have no idea where to put it.
In [...nextauth].js, outside providers and callbacks, you can set the secret and its value. As it is recommended to store such values in an environment variable, you can do the following:
export default NextAuth({
  providers: [
  ],
  callbacks: {
  },
  secret: process.env.JWT_SECRET,
});
Run the command openssl rand -base64 32 in your Linux terminal; it will generate a token you can put in an .env file under the variable name NEXTAUTH_SECRET=token_generated. The [next-auth][warn][NO_SECRET] warning will then no longer show up in the console.
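A one-liner sketch that does both steps at once, assuming you keep your variables in .env.local (one of the files Next.js loads automatically):

# generate a secret and append it to the env file in one go
echo "NEXTAUTH_SECRET=$(openssl rand -base64 32)" >> .env.local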
In my case I had to upgrade the next and next-auth modules to the latest versions.
next@12.1.6 and next-auth@4.5.0 worked for me.
Set the environment variable (e.g. in your .env file):
NEXTAUTH_SECRET=secret_string_here
You can generate the secret key using the command openssl rand -base64 32 in a command prompt or Windows PowerShell.
And in the NextAuth configuration file [...nextauth].ts:
export default NextAuth({
  providers: [
  ],
  callbacks: {
  },
  secret: process.env.NEXTAUTH_SECRET,
});
It is said that if secret is not defined in [...nextauth].ts, NEXTAUTH_SECRET is loaded from the environment, but I added it explicitly and it works like a charm. :)

Sequelize-cli returns "Unknown Database" when doing migrations

I have been using Sequelize migrations all this while with no issues.
For example, on our development server:
"development": {
"username": "root",
"password": "password",
"database": "db",
"host": "127.0.0.1",
"dialect": "mysql"
}
running sequelize-cli works fine:
npx sequelize db:migrate
results:
Sequelize CLI [Node: 12.16.1, CLI: 6.2.0, ORM: 6.3.5]
Loaded configuration file "config\config.json".
Using environment "development".
No migrations were executed, database schema was already up to date.
The same goes for our production server, where the database is on a different server than the app:
"production": {
"username": "root",
"password": "password",
"database": "db",
"host": "172.xx.xx.11",
"dialect": "mysql"
}
Recently we upgraded our production setup to three database servers running MariaDB (a Galera cluster), managed by a load balancer (MaxScale). Using the same setup as before, it is now something like:
server a: 172.xx.xx.11,
server b: 172.xx.xx.12,
server c: 172.xx.xx.13,
load balancer: 172.xx.xx.10
Our new config looks like:
"production": {
"username": "root",
"password": "password",
"database": "db",
"host": "172.xx.xx.10",
"dialect": "mysql"
}
There is no firewall opening between the app server and the database servers directly, only between the app server and the load balancer.
Testing the connection between the app server and the load balancer with Sequelize shows no issue:
it passes through if the username and password are correct,
while a wrong username or password gives
ERROR: Access denied for user 'root'@'172.xx.xx.10' (using password: YES)
No issue there; I'm just saying that there is a connection.
Then there is also no issue using:
npx sequelize db:drop
or
npx sequelize db:create
resulting in
Sequelize CLI [Node: 12.16.1, CLI: 6.2.0, ORM: 6.3.5]
Loaded configuration file "config\config.json".
Using environment "production".
Database db created.
I verified on all our database servers that the database was indeed dropped and created.
But when I try running migrations, this happens:
Sequelize CLI [Node: 12.16.1, CLI: 6.2.0, ORM: 6.3.5]
Loaded configuration file "config\config.json".
Using environment "production".
ERROR: Unknown database 'db'
I have verified that all our database servers do have that 'db' database; it was even created by Sequelize based on the config. But somehow Sequelize can't seem to recognize that 'db' database.
Please help if you have any experience like this before, and do let me know if you need more info.
Thanks.
You can enable the verbose log level in MaxScale by adding log_info=true under the [maxscale] section. This should help explain what is going on and why it is failing.
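For reference, a minimal sketch of that change (/etc/maxscale.cnf is the typical location; adjust to your install):

# /etc/maxscale.cnf
[maxscale]
log_info=true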
It is possible that Sequelize does something that assumes it's working with the same database server. For example, doing an INSERT and immediately reading the inserted value will always work on a single server but with a distributed setup, it's possible the values haven't replicated to all nodes.
If you can't find an explanation as to why it behaves like this or you think MaxScale is doing something wrong, please open a bug report on the MariaDB Jira under the MaxScale project.
Turns out the maxscale user didn't have enough privileges. Granting the SHOW DATABASES privilege to the maxscale user fixed my issue.
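In case it saves someone a lookup, a sketch of the grant. The 'maxscale'@'%' account is an assumption; use your actual user and host, and depending on your cluster setup you may need to apply it on each node:

GRANT SHOW DATABASES ON *.* TO 'maxscale'@'%';
FLUSH PRIVILEGES;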
more info:
https://mariadb.com/kb/en/mariadb-maxscale-14/maxscale-configuration-usage-scenarios/#service
Related issue on MariaDB Jira

Passing SQL credentials to shinyproxy app

I have a working ShinyProxy app with LDAP authentication. However, for retrieving data from the SQL database I currently use (not recommended) a hardcoded connection string in my R code with the credentials in it (I use a service user because my end users don't have permission to query the database):
con <- DBI::dbConnect(odbc::odbc(),
                      encoding = "latin1",
                      .connection_string = 'Driver={Driver};Server=Server;Database=dbb;UID=UID;PWD=PWD')
I tried to replace the connection string with an environment variable that I pass from my Linux host to the container. This works when running the container outside ShinyProxy, i.e. by passing the environment variables at runtime with the following Docker command:
docker run -it --env-file env.list app123
However, when using ShinyProxy, it is not clear to me how to configure this in the YAML config file. How do I pass the equivalent of --env-file env.list at this level so that it is picked up in the launched containers?
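For context, the env.list used with docker run --env-file is just a plain file with one VAR=value pair per line; the variable names here are made up:

DB_UID=service_user
DB_PWD=s3cret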
Any help kindly appreciated!
From this closed issue: https://github.com/openanalytics/shinyproxy/issues/99
Your application.yaml could look something like this:
proxy:
  title: Open Analytics Shiny Proxy
  logo-url: http://www.openanalytics.eu/sites/www.openanalytics.eu/themes/oa/logo.png
  landing-page: /
  heartbeat-rate: 10000
  heartbeat-timeout: 60000
  port: 8080
  authentication: simple
  admin-groups: admin
  # Example: 'simple' authentication configuration
  users:
    - name: admin
      password: password
      groups: admin
  # Docker configuration
  docker:
    internal-networking: true
  specs:
    - id: 01_hello
      display-name: Hello Application
      description: Application which demonstrates the basics of a Shiny app
      container-cmd: ["R", "-e", "shinyproxy::run_01_hello()"]
      container-image: openanalytics/shinyproxy-demo
      container-env-file: /app/shinyproxy/test.env
      container-env:
        bar: baz
      access-groups: admin
      container-network: shinyproxy_reprex_default

logging:
  file:
    shinyproxy.log
Specifically, it seems you can set environment variables from a file using container-env-file.
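On the R side, the app can then build the connection string from those variables instead of hardcoding credentials. A minimal sketch, assuming the env file defines the hypothetical variables DB_UID and DB_PWD:

# read credentials injected by ShinyProxy via container-env-file
con <- DBI::dbConnect(odbc::odbc(),
                      encoding = "latin1",
                      .connection_string = sprintf(
                        'Driver={Driver};Server=Server;Database=dbb;UID=%s;PWD=%s',
                        Sys.getenv("DB_UID"), Sys.getenv("DB_PWD")))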

Invalid directory for database creation on dynamodb start

I am getting the error Invalid directory for database creation. when I start serverless-dynamodb-local.
Below is my local configuration
dynamodb:
  stages:
    - test
  start:
    port: 8000
    sharedDb: true
    dbPath: dynamodb-local
Is there anything I am missing?
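One common cause of this message is that DynamoDB Local does not create the dbPath directory for you, so it fails if the directory does not exist (or is not writable). A hedged sketch of the usual fix, assuming dbPath resolves relative to where you start the service:

# create the data directory first (dbPath is not auto-created)
mkdir -p dynamodb-local
npx serverless dynamodb start --stage test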

How to write airflow logs to Elasticsearch?

I am using Airflow 1.10.5. I can't seem to find complete documentation or a sample of how to set up remote logging using Elasticsearch. I saw the Airflow documentation about logging, but it wasn't helpful. I am trying to write the Airflow (not task) logs to ES.
As far as I understand the docs, the ES log handler can only read from ES. You would have to set up your logging to print into a file, then use something like Filebeat to post the file contents to ES, and Airflow can then read them back...
https://airflow.readthedocs.io/en/stable/howto/write-logs.html#writing-logs-to-elasticsearch
Writing Logs to Elasticsearch
Airflow can be configured to read task logs from Elasticsearch and optionally write logs to stdout in standard or json format. These logs can later be collected and forwarded to the Elasticsearch cluster using tools like fluentd, logstash or others.
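The "write logs to stdout in standard or json format" part of that quote maps to a couple of airflow.cfg options. A sketch based on the docs' Elasticsearch section, with the caveat that 1.10.x releases prefix these keys (e.g. elasticsearch_write_stdout), so check your version's default config:

[elasticsearch]
write_stdout = True
json_format = True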
I was able to achieve this using the Filebeat shipper.
Input config section in filebeat.yml
<snip>
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /path/to/logs/*.log
</snip>
Output config section in filebeat.yml
<snip>
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  # Protocol - either `http` (default) or `https`.
  #protocol: "https"
  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "elastic"
  password: "changeme"
</snip>
The Filebeat documentation is a good doc to read, especially about the Airflow --> ES path.
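To round it out, the standard Filebeat commands to validate and run that config (run from the directory containing filebeat.yml):

# check the config parses before starting
filebeat test config -c filebeat.yml
# run in the foreground, logging to stderr
filebeat -e -c filebeat.yml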
