How to write airflow logs to Elasticsearch? - airflow

I am using Airflow 1.10.5. Can't seem to find complete documentation or sample on how to setup remote logging using Elasticsearch. I saw airflow documentation about logging, but it wasn't helpful. I am trying to write the airflow (not task) logs to ES.

As far as I understand the docs, the ES log handler can only read from ES. You would have to setup your logging to print into a file, then use something like filebeat to post the file content to ES and Airflow can then read them back...
https://airflow.readthedocs.io/en/stable/howto/write-logs.html#writing-logs-to-elasticsearch
Writing Logs to Elasticsearch
Airflow can be configured to read task
logs from Elasticsearch and optionally write logs to stdout in
standard or json format. These logs can later be collected and
forwarded to the Elasticsearch cluster using tools like fluentd,
logstash or others.

I was able to achieve using [filebeat][1] shipper.
Input config section in filebeat.yml
</snip>
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /path/to/logs/*.log
</snip>
Output config section in filebeat.yml
<snip>
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]
# Protocol - either `http` (default) or `https`.
#protocol: "https"
# Authentication credentials - either API key or username/password.
#api_key: "id:api_key"
username: "elastic"
password: "changeme"
</snip>
Good doc to read especially about airflow --> ES.

Related

Enabling account deletion on nats server

I was trying to prune some users from my nats server by doing:
nsc push --system-account SYS -u nats://localhost:4222 -P
but I got the following error:
server nats-comm-2 responded with error: delete accounts request by SOME_KEY_VALUE failed - delete must be enabled in server config
The meaning of the error is pretty obvious, when I examine the help documentation for nsc push -P:
Only works with nats-resolver enabled nats-server. Mutually exclusive of account-removal/diff
But I'm not sure how to enable this in my nats server config. How do I allow for account pruning?
I found documentation in the resolver section, here, showing that I could add allow_delete: true to the config, but as the YAML format is in camel-case, I had to modify it to be allowDelete: true instead.
nats:
auth:
enabled: true
resolver:
type: full
allowDelete: true

How to log request channel into a log file?

I am currently running a Symfony 5 project in dev environment.
I would like to output requests logs (like 10:01:39 request.INFO Matched route "login_route") into a file.
I have the following config/packages/dev/monolog.yaml file:
monolog:
handlers:
main:
type: stream
path: "%kernel.logs_dir%/%kernel.environment%.log"
level: debug
channels: [event]
With the YAML above, it logs correctly into the file /tmp/dev-logs/dev.log when I execute bin/console cache:clear.
But, it does not log anything when I perform requests on the application, no matter if I set channels: [request] or channels: ~ or even no channel param at all.
How can I edit the settings of that monolog.yaml file in order to log request channel logs ?
I have found the answer! This is very specific to my configuration.
In fact, I have two Docker containers (that both mount the project directory as a volume) for development:
one for code edition (with a linter, syntax checker, specific vim configuration...etc...)
one to access the application through HTTP using PHP-FPM (the one that is used when I make HTTP requests on the app)
So, when I perform a bin/console cache:clear from the first container I use for development, it logs into the /tmp/dev-logs/dev.log file of that first container; but when I execute HTTP requests, it logs into the /tmp/dev-logs/dev.log file of that second container;
I was checking the first container file only while it was logging into the second container file instead. So, I was simply not checking the right file.
Everything works. :)

Missing encryption key to decrypt file with.Ask your team for your master ... it in the ENV['RAILS_MASTER_KEY']. Platform.sh Deployment aborting,

ERROR MESSAGE:
W: Missing encryption key to decrypt file with. Ask your team for your master key and write it to /app/config/master.key or put it in the ENV['RAILS_MASTER_KEY'].
when deploying my project on Platform.sh, the operation failed because of the lack of the decryption key. from my google search, I found that the decryption key.
My Ubuntu .bashrc
export RAILS_MASTER_KEY='ad5e30979672cdcc2dd4f4381704292a'
rails project configuration for PLATFORM.SH
.platform.app.yaml
# The name of this app. Must be unique within a project.
name: app
type: 'ruby:2.7'
# The size of the persistent disk of the application (in MB).
disk: 5120
mounts:
'web/uploads':
source: local
source_path: uploads
relationships:
postgresdatabase: 'dbpostgres:postgresql'
hooks:
build: |
gem install bundler:2.2.5
bundle install
RAILS_ENV=production bundle exec rake assets:precompile
deploy: |
RACK_ENV=production bundle exec rake db:migrate
web:
upstream:
socket_family: "unix"
commands:
start: "\"unicorn -l $SOCKET -E production config.ru\""
locations:
'/':
root: "\"public\""
passthru: true
expires: "24h"
allow: true
routes.yaml
# Each route describes how an incoming URL is going to be processed by Platform.sh.
"https://www.{default}/":
type: upstream
upstream: "app:http"
"https://{default}/":
type: redirect
to: "https://www.{default}/"
services.yaml
# The name given to the PostgreSQL service (lowercase alphanumeric only).
dbpostgres:
type: postgresql:13
# The disk attribute is the size of the persistent disk (in MB) allocated to the service.
disk: 5120
db:
type: postgresql:13
disk: 5120
configuration:
extensions:
- pgcrypto
- plpgsql
- uuid-ossp
environments/production.rb
config.require_master_key = true
I suspect that the master.key is not accessible during deployment, and I don't understand how to solve the problem.
From what I understand, your export is in your .bashrc on your local machine, so it won't be accessible when deploying on Platform.sh. (The logs you see in your terminal when building and deploying are streamed, this doesn't happen on your machine.)
You need to make the RAILS_MASTER_KEY accessible on Platform.sh. To do so, this variable needs to be declared in your project.
Given the nature of the variable, I would suggest to use the Platform CLI to create this variable.
If this variable should be accessible on all your environments, you can make it a project level variable.
$ platform variable:create --level project --sensitive true env:RAILS_MASTER_KEY <your_key>
If it should only be accessible for a specific environment, then you need an environment level variable:
$ platform variable:create --level environment --environment '<your_envrionment>' --inheritable false --sensitive true env:RAILS_MASTER_KEY '<your_key>'
The env: prefix in the variable names tells Platform.sh to expose the variable with the rest of the environment variables. More information about this in the variables prefix section of the environment variables documentation page.
You could do the same via the management console if you prefer to avoid the command line.
Environment variables can also be configured directly in your .platform.app.yaml file, as described here. Keep in mind that this file being versioned, you should not use this method for sensitive information, such as encryption keys, API keys, and other kind of secrets.
The RAILS_MASTER_KEY environment variable should now be accessible during your Platform.sh deployment.

Task fails due to not being able to read log file

Composer is failing a task due to it not being able to read a log file, it's complaining about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I try viewing the file in the google cloud console and it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overriding other text.
I can't show the entire file but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
Where the #-#{} pieces seems to be "on top of" the typical log.
I faced the same problem. In my case the problem was that I removed the google_gcloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name.
[core]
remote_log_conn_id = google_cloud_default
Then check the credentials used for that connection name has the right permissions to access the GCS bucket.
I'm having a similar problem with viewing logs in GCP Cloud Composer. It doesn't appear to be preventing the failing DAG task from running though. What it looks like is a permissions error between the GKE and Storage Bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket in the same directory as your /dags folder where you should also see a logs/ folder.
Your helm chart should setup global env:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then, you should deploy a Dockerfile with root account only (not airflow account), additionaly, you set up your helm uid, gid as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade helm chart with new config
*** Unable to read remote log from gs://bucket
1)Found the solution after assigning the roles to the service account
2)The SA key(json or txt) to be added and configured to the connection in the
remote_log_conn_id = google_cloud_default
3)restart the scheduler and webserver of the airflow
4)restart the dags on the airflow
you can find the logs on the GCS bucket where its configured

How to disable interception of errors by Ingress in a Tectonic kubernetes setup

I have a couple of NodeJS backends running as pods in a Kubernetes setup, with Ingress-managed nginx over it.
These backends are API servers, and can return 400, 404, or 500 responses during normal operations. These responses would provide meaningful data to the client; besides the status code, the response has a JSON-serialized structure in the body informing about the error cause or suggesting a solution.
However, Ingress will intercept these error responses, and return an error page. Thus the client does not receive the information that the service has tried to provide.
There's a closed ticket in the kubernetes-contrib repository suggesting that it is now possible to turn off error interception: https://github.com/kubernetes/contrib/issues/897. Being new to kubernetes/ingress, I cannot figure out how to apply this configuration in my situation.
For reference, this is the output of kubectl get ingress <ingress-name>: (redacted names and IPs)
Name: ingress-name-redacted
Namespace: default
Address: 127.0.0.1
Default backend: default-http-backend:80 (<none>)
Rules:
Host Path Backends
---- ---- --------
public.service.example.com
/ service-name:80 (<none>)
Annotations:
rewrite-target: /
service-upstream: true
use-port-in-redirects: true
Events: <none>
I have solved this on Tectonic 1.7.9-tectonic.4.
In the Tectonic web UI, go to Workloads -> Config Maps and filter by namespace tectonic-system.
In the config maps shown, you should see one named "tectonic-custom-error".
Open it and go to the YAML editor.
In the data field you should have an entry like this:
custom-http-errors: '404, 500, 502, 503'
which configures which HTTP responses will be captured and be shown with the custom Tectonic error page.
If you don't want some of those, just remove them, or clear them all.
It should take effect as soon as you save the updated config map.
Of course, you could to the same from the command line with kubectl edit:
$> kubectl edit cm tectonic-custom-error --namespace=tectonic-system
Hope this helps :)

Resources