We are running the most recent Airflow release, v1.10.7, and have set catchup_by_default=False in airflow.cfg, catchup=False in every one of our DAGs, and catchup=False in the default args.
Airflow still backfills and runs the jobs. We really want to use Airflow, but this is a show stopper. Also, why does a DAG run automatically when it is first created, even though catchup=False?
Thanks for your help.
Sample dag:
from airflow import DAG
from utils.general import Schedule, default_args, AbcComputeOperator
from airflow.operators.http_operator import SimpleHttpOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'abc',
    'depends_on_past': False,
    'start_date': datetime(2020, 1, 1),
    'email': ['abc@abc.com'],
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 0
}
DAG_py:
with DAG(
        'abc_dag',
        catchup=False,
        default_args=default_args,
        schedule_interval=Schedule.daily_0800.value) as dag:

    t = SimpleHttpOperator(
        task_id='send_abc_alerts',
        http_conn_id='core',
        dag=dag,
        method='GET',
        data={'Url': 'abc'},
        endpoint='abc/callapi',
        log_response=True,
        response_check=lambda r: True
    )
Docker-compose.yml:
version: '3.7'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=abc
      - POSTGRES_DB=airflow
    logging:
      options:
        max-size: 10m
        max-file: "3"

  webserver:
    image: puckel/docker-airflow:1.10.7
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
      - EXECUTOR=Local
      - AIRFLOW__CORE__DEFAULT_TIMEZONE=America/New_York
      - AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT=False
      - AIRFLOW__WEBSERVER__BASE_URL=http://1.1.1.1:8080
      - AIRFLOW__SMTP__SMTP_STARTTLS=False
      - AIRFLOW__SMTP__SMTP_PORT=587
      - AIRFLOW__SMTP__SMTP_HOST=0.0.0.0
    logging:
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - /home/ec2-user/airflow/dags:/usr/local/airflow/dags
      # - ./plugins:/usr/local/airflow/plugins
    ports:
      - "8080:8080"
      - "587:587"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
Related
I want to schedule Linux patching and restart the minions if the updates succeed. I created a state for the OS update process that sends an event to the event bus, and then a reactor that listens for the success tag and reboots the server, but the reactor does not react to anything and no action is taken.
# /srv/salt/os/updates/linuxupdate.sls
uptodate:
  pkg.uptodate:
    - refresh: True

event-completed:
  event.send:
    - name: 'salt/minion-update/success'
    - require:
      - pkg: uptodate

event-failed:
  event.send:
    - name: 'salt/minion-update/failed'
    - onfail:
      - pkg: uptodate

# /etc/salt/master.d/reactor.conf
reactor:
  - 'salt/minion-update/success':
    - /srv/reactor/reboot.sls

# /srv/reactor/reboot.sls
reboot_server:
  local.state.sls:
    - tgt: {{ data['id'] }}
    - arg:
      - os.updates.reboot-server
I've got a Docker Compose stack with a bunch of containers, and I am attempting to collect both the Docker container metrics from those containers and the host metrics from the Ubuntu server the containers run on. I am getting the Docker container stats, but not the Ubuntu host metrics: the stats from the non-Docker input plugins (inputs.diskio, inputs.mem, etc.) are for the telegraf container itself.
I found this thread and opened up the volumes, but still nothing: https://community.influxdata.com/t/how-can-we-collect-host-machine-metrics-while-telegraf-is-running-in-docker-container/12005
Here is my compose file:
version: "3"
services:
telegraf:
image: telegraf:1.20.3
volumes:
- ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
- /sys:/rootfs/sys:ro
- /proc:/rootfs/proc:ro
- /etc:/rootfs/etc:ro
environment:
HOST_PROC: /rootfs/proc
HOST_SYS: /rootfs/sys
HOST_ETC: /rootfs/etc
vote:
build: ./vote
# use python rather than gunicorn for local dev
command: python app.py
depends_on:
- redis
volumes:
- ./vote:/app
ports:
- "5000:80"
networks:
- front-tier
- back-tier
result:
build: ./result
# use nodemon rather than node for local dev
command: nodemon server.js
depends_on:
- db
volumes:
- ./result:/app
ports:
- "5001:80"
- "5858:5858"
networks:
- front-tier
- back-tier
worker:
build:
context: ./worker
depends_on:
- redis
- db
networks:
- back-tier
redis:
image: redis:5.0-alpine3.10
volumes:
- "./healthchecks:/healthchecks"
healthcheck:
test: /healthchecks/redis.sh
interval: "5s"
ports: ["6379"]
networks:
- back-tier
db:
image: postgres:9.4
environment:
POSTGRES_USER: "postgres"
POSTGRES_PASSWORD: "postgres"
volumes:
- "db-data:/var/lib/postgresql/data"
- "./healthchecks:/healthchecks"
healthcheck:
test: /healthchecks/postgres.sh
interval: "5s"
networks:
- back-tier
volumes:
db-data:
networks:
front-tier:
back-tier:
Here is the agent config:
[agent]
  interval = "10s"

[[inputs.mem]]

[[inputs.disk]]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.ethtool]]

[[inputs.procstat]]
  pattern = ".*"

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  gather_services = false
  container_names = []
  source_tag = true
  container_name_include = []
  container_name_exclude = []
  timeout = "5s"
  perdevice = true
  docker_label_include = []
  docker_label_exclude = []

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = true
  report_active = true
How do I get the container host metrics??
I'm trying to connect Dapr to NATS with the JetStream functionality enabled.
I want to start everything with docker-compose. The NATS service starts, and when I run the NATS CLI with nats -s "nats://localhost:4222" server check jetstream, I get OK JetStream | memory=0B memory_pct=0%;75;90 storage=0B storage_pct=0%;75;90 streams=0 streams_pct=0% consumers=0 consumers_pct=0%, indicating that NATS with JetStream is working fine.
Unfortunately, Dapr first returns a warning and then an error:
warning: error creating pub sub %!s(*string=0xc0000ca020) (pubsub.jetstream/v1): couldn't find message bus pubsub.jetstream/v1" app_id=conversation-api1 instance=50b51af8e9a8 scope=dapr.runtime type=log ver=1.3.0
error: process component conversation-pubsub error: couldn't find message bus pubsub.jetstream/v1" app_id=conversation-api1 instance=50b51af8e9a8 scope=dapr.runtime type=log ver=1.3.0
I followed the instructions on the official site.
docker-compose.yaml
version: '3.4'
services:
  conversation-api1:
    image: ${DOCKER_REGISTRY-}conversationapi1
    build:
      context: .
      dockerfile: Conversation.Api1/Dockerfile
    ports:
      - "5010:80"

  conversation-api1-dapr:
    container_name: conversation-api1-dapr
    image: "daprio/daprd:latest"
    command: [ "./daprd", "--log-level", "debug", "-app-id", "conversation-api1", "-app-port", "80", "--components-path", "/components", "-config", "/configuration/conversation-config.yaml" ]
    volumes:
      - "./dapr/components/:/components"
      - "./dapr/configuration/:/configuration"
    depends_on:
      - conversation-api1
      - redis
      - nats
    network_mode: "service:conversation-api1"

  nats:
    container_name: "Nats"
    image: nats
    command: [ "-js", "-m", "8222" ]
    ports:
      - "4222:4222"
      - "8222:8222"
      - "6222:6222"

  # OTHER SERVICES...
conversation-pubsub.yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: conversation-pubsub
  namespace: default
spec:
  type: pubsub.jetstream
  version: v1
  metadata:
    - name: natsURL
      value: "nats://host.docker.internal:4222" # already tried with nats for host
    - name: name
      value: "conversation"
    - name: durableName
      value: "conversation-durable"
    - name: queueGroupName
      value: "conversation-group"
    - name: startSequence
      value: 1
    - name: startTime # in Unix format
      value: 1630349391
    - name: deliverAll
      value: false
    - name: flowControl
      value: false
conversation-config.yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: config
  namespace: default
spec:
  tracing:
    samplingRate: "1"
    zipkin:
      endpointAddress: "http://zipkin:9411/api/v2/spans"
The problem was an old Dapr version. I was using 1.3.0, and JetStream support was only introduced in 1.4.0. Pulling the latest daprio/daprd image fixed my problem. Also, there is no need for nats://host.docker.internal:4222; nats://nats:4222 works as expected.
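A minimal sketch of the two changes described above, assuming the same service names as in the compose file (any 1.4.0+ tag of daprio/daprd should work):
# docker-compose.yaml: pin daprd to a JetStream-capable release instead of 1.3.0
conversation-api1-dapr:
  image: "daprio/daprd:1.4.0"

# conversation-pubsub.yaml: address the NATS container by its compose service name
- name: natsURL
  value: "nats://nats:4222"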
I have:
a simple Python app ("iss-web") writing JSON log output to stdout
the Python app ("iss-web") is within a Docker Container
the Python app ("iss-web") Container logging driver is set to "fluentd"
a separate Container running "fluent/fluent-bit:1.7" to collect the Python app JSON log output
Loki 2.2.1 deployed via a Container to receive the Python app log output from fluent-bit
Grafana connected to Loki to visualize the log data
The issue is that the "log" field is not parsed by fluent-bit, so in Loki/Grafana the content of the "log" field is not broken out into "Detected fields".
"iss-web" docker-compose.yml
version: '3'
services:
  iss-web:
    build: ./iss-web
    image: iss-web
    container_name: iss-web
    env_file:
      - ./iss-web/app.env
    ports:
      - 46664:46664
    logging:
      driver: fluentd
      options:
        tag: iss.web

  redis:
    image: redis
    container_name: redis
    ports:
      - 6379:6379
    logging:
      driver: "json-file"
      options:
        max-file: ${LOG_EXPIRE}
        max-size: ${LOG_SEGMENT}
"fluent-bit" docker-compose.yml
version: '3'
services:
  fluent-bit:
    image: fluent/fluent-bit:1.7
    container_name: fluent-bit
    environment:
      - LOKI_URL=http://135.86.186.75:3100/loki/api/v1/push
    user: root
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsers.conf:/fluent-bit/etc/parsers.conf
    ports:
      - "24224:24224"
      - "24224:24224/udp"
fluent-bit.conf
[SERVICE]
    Flush           1
    Daemon          Off
    log_level       debug
    Parsers_File    /fluent-bit/etc/parsers.conf

[INPUT]
    Name    forward
    Listen  0.0.0.0
    port    24224

[FILTER]
    Name          parser
    Match         iss.web
    Key_Name      log
    Parser        docker
    Reserve_Data  On
    Preserve_Key  On

[OUTPUT]
    Name    loki
    Match   *
    host    135.86.186.75
    port    3100
    labels  job=fluentbit

[OUTPUT]
    Name    stdout
    Match   *
parsers.conf
I've tried with/without Time_Key, Time_Format, Time_Keep
[PARSER]
    Name    docker
    Format  json
    #Time_Key     time
    #Time_Format  %Y-%m-%dT%H:%M:%S.%L
    #Time_Keep    On
    # Command        | Decoder      | Field | Optional Action
    # ===============|==============|=======|=================
    #Decode_Field_As   escaped_utf8   log     do_next
    Decode_Field_As    json           log
fluent-bit log extract
[0] iss.web: [1620640820.000000000, {"log"=>"{'timestamp': '2021:05:10 11:00:20.439513', 'epoch': 1620640820.4395688, 'pid': 1, 'level': 'DEBUG', 'message': '/ping', 'data': {'message': 'PONG', 'timestamp': '1620640820.4394963', 'version': '0.1'}}", "container_id"=>"bffd720e9ac1e8c3992c1120eed37e00c536cd44ec99e9c13cf690d840363f80", "container_name"=>"/iss-web", "source"=>"stdout"}]
Grafana/Loki screen
I would expect the "Detected fields" to contain pid=1, message=/ping etc
I needed a "json.dumps" in my "logger":
import json

# get_now(), get_epoch() and get_pid() are the app's own helpers (defined elsewhere)
def log(message, level="INFO", **extra):
    out = {"timestamp": get_now(), "epoch": get_epoch(), "pid": get_pid(), "level": level, "message": message}
    if extra:
        out |= extra  # dict merge operator (Python 3.9+)
    print(json.dumps(out), flush=True)  # emit real JSON instead of a str()-formatted dict
    return True
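As a quick check (a hypothetical call mirroring the /ping entry above), the logger now emits one JSON object per line, which the json parser in fluent-bit can decode:
log("/ping", level="DEBUG", data={"message": "PONG", "version": "0.1"})
# stdout: {"timestamp": "...", "epoch": ..., "pid": 1, "level": "DEBUG", "message": "/ping", "data": {"message": "PONG", "version": "0.1"}}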
My issue is that I cannot set up basic authentication for my frontend app through Traefik.
This is how I have configured Traefik:
traefik.yml
global:
  checkNewVersion: true
  sendAnonymousUsage: false

entryPoints:
  https:
    address: :443
  http:
    address: :80
  traefik:
    address: :8080

tls:
  options:
    foo:
      minVersion: VersionTLS12
      cipherSuites:
        - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
        - "TLS_RSA_WITH_AES_256_GCM_SHA384"

providers:
  providersThrottleDuration: 2s
  docker:
    watch: true
    endpoint: unix:///var/run/docker.sock
    exposedByDefault: false
    network: web

api:
  insecure: true
  dashboard: true

log:
  level: INFO

certificatesResolvers:
  default:
    acme:
      storage: /acme.json
      httpChallenge:
        entryPoint: http
docker-compose.yml
version: '3'
services:
  traefik:
    image: traefik:v2.0
    restart: always
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
      - "/srv/traefik/traefik.yml:/etc/traefik/traefik.yml"
      - "/srv/traefik/acme.json:/acme.json"
    networks:
      - web

networks:
  web:
    external: true
And here is the frontend app, exposed through the Traefik Docker provider, with my basic auth label:
version: '3.7'
services:
  frontend:
    image: git.xxxx.com:7000/dockerregistry/registry/xxxx
    restart: "always"
    networks:
      - web
    volumes:
      - "/srv/config/api.js:/var/www/htdocs/api.js"
      - "/srv/efs/workspace:/var/www/htdocs/stock"
    labels:
      - traefik.enable=true
      - traefik.http.routers.frontend-http.rule=Host(`test.xxxx.com`)
      - traefik.http.routers.frontend-http.service=frontend
      - traefik.http.routers.frontend-http.entrypoints=http
      - traefik.http.routers.frontend.tls=true
      - traefik.http.routers.frontend.tls.certresolver=default
      - traefik.http.routers.frontend.entrypoints=http
      - traefik.http.routers.frontend.rule=Host(`test.xxxx.com`)
      - traefik.http.routers.frontend.service=frontend
      - traefik.http.middlewares.frontend.basicAuth.users=test:$$2y$$05$$c45HvbP0Sq9EzcfaXiGNsuuWMfPhyoFZVYgiTylpMMLtJY2nP1P6m
      - traefik.http.services.frontend.loadbalancer.server.port=8080

networks:
  web:
    external: true
I never get the login prompt, so I'm wondering whether I'm missing a container label for this.
Thanks in advance! Joaquin
Firstly, the labels should be wrapped in quotation marks ("...").
Secondly, I think you are missing a label on the frontend app.
Basic auth takes two steps and should look like this:
- "traefik.http.routers.frontend.middlewares=frontend-auth"
- "traefik.http.middlewares.frontend-auth.basicauth.users=test:$$2y$$05$$c45HvbP0Sq9EzcfaXiGNsuuWMfPhyoFZVYgiTylpMMLtJY2nP1P6m"
In your Docker Compose file, don't add the "middlewares" label for Traefik; instead do it with a traefik.yml file passing the providers.file option, where you define the routers, services, middlewares, etc. In that "providers file" you set the middlewares under http.routers.traefik. This may sound super confusing at the beginning, but it is not that hard, trust me.
Let's do a YAML case (you can convert it to "TOML" here).
This example assumes you have a Docker Compose file specifically for Traefik. I haven't tried using the same Docker Compose file with other services in it (like WordPress, databases or whatever), since I keep those in a different path.
docker-compose.yml
version: '3.1'
services:
  reverse-proxy:
    image: traefik:v2.4
    [ ... ]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Map the dynamic conf into the container
      - ./traefik/config.yml:/etc/traefik/config.yml:ro
      # Map the static conf into the container
      - ./traefik/traefik.yml:/etc/traefik/traefik.yml:ro
    # Note you don't use "traefik.http.routers.<service>.middlewares" etc. here
    [ ... ]
In this case I set/get the config files for Traefik in ./traefik (relative to the docker-compose.yml file).
./traefik/config.yml
http:
  routers:
    traefik:
      middlewares: "basicauth"
      [ ... ]
  middlewares:
    basicauth:
      basicAuth:
        removeHeader: true
        users:
          - <user>:<password>
          # password should be generated using `htpasswd` (md5, sha1 or bcrypt)
[ ... ]
Here you can set the basicauth name as you wish (since that's the middleware name you'll see in the Dashboard), so you could do:
http:
  routers:
    traefik:
      middlewares: "super-dashboard-auth"
      [ ... ]
  middlewares:
    super-dashboard-auth:
      basicAuth:
        removeHeader: true
        users:
          - <user>:<password>
          # password should be generated using `htpasswd` (md5, sha1 or bcrypt)
[ ... ]
Note that basicAuth must remain as is. Also, here you don't need the "double dollar" method to escape it (as in the label approach), so after creating the user password you should enter it exactly as htpasswd created it.
# BAD
user:$$2y$$10$$nRLqyZT.64JI/CD/ym65UGDn8HaY0D6CBTxhe6JXf9u4wi5bEMdh.
# GOOD
user:$2y$10$nRLqyZT.64JI/CD/ym65UGDn8HaY0D6CBTxhe6JXf9u4wi5bEMdh.
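For reference, a bcrypt entry like the one above can be generated with apache2-utils (user and password here are placeholders):
# -n prints to stdout, -b takes the password on the command line, -B selects bcrypt
htpasswd -nbB user 'your-password'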
Of course, you may want to read this data from an .env file rather than hardcode the strings; in that case you need to pass the environment variables from the docker-compose.yml using environment, like this:
services:
  reverse-proxy:
    image: traefik:v2.4
    container_name: traefik
    [ ... ]
    environment:
      TRAEFIK_DASHBOARD_USER: "${TRAEFIK_DASHBOARD_USER}"
      TRAEFIK_DASHBOARD_PWD: "${TRAEFIK_DASHBOARD_PWD}"
      # And any other env. var. you may need
    [ ... ]
and use it like this in your traefik/config.yml file:
[ ... ]
  middlewares:
    super-dashboard-auth:
      basicAuth:
        removeHeader: true
        users:
          - "{{env "TRAEFIK_DASHBOARD_USER"}}:{{env "TRAEFIK_DASHBOARD_PWD"}}"
[ ... ]
After that, include the previous file in providers.file.filename:
./traefik/traefik.yml
[ ... ]
api:
  dashboard: true
  insecure: false

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    [ ... ]
  file:
    filename: /etc/traefik/config.yml
    watch: true
[ ... ]
And then simply docker-compose up -d
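As a quick sanity check (hypothetical hostname), requesting the protected router without credentials should now come back with a 401 and a WWW-Authenticate: Basic header:
curl -I https://traefik.example.com/dashboard/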
I configure it this way:
Generate a password with apache2-utils, e.g.:
htpasswd -nb admin secure_password
Set up traefik.toml:
[entryPoints]
  [entryPoints.web]
    address = ":80"
    [entryPoints.web.http.redirections.entryPoint]
      to = "websecure"
      scheme = "https"
  [entryPoints.websecure]
    address = ":443"

[api]
  dashboard = true

[certificatesResolvers.lets-encrypt.acme]
  email = "your_email@your_domain"
  storage = "acme.json"
  [certificatesResolvers.lets-encrypt.acme.tlsChallenge]

[providers.docker]
  watch = true
  network = "web"

[providers.file]
  filename = "traefik_dynamic.toml"
Set up traefik_dynamic.toml:
[http.middlewares.simpleAuth.basicAuth]
  users = [
    "admin:$apr1$ruca84Hq$mbjdMZBAG.KWn7vfN/SNK/"
  ]

[http.routers.api]
  rule = "Host(`monitor.your_domain`)"
  entrypoints = ["websecure"]
  middlewares = ["simpleAuth"]
  service = "api@internal"
  [http.routers.api.tls]
    certResolver = "lets-encrypt"
Set up the Traefik service:
services:
  reverse-proxy:
    image: traefik:v2.3
    restart: always
    command:
      - --api.insecure=true
      - --providers.docker
    ports:
      - "80:80"
      - "443:443"
    networks:
      - web
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.toml:/traefik.toml
      - ./traefik_dynamic.toml:/traefik_dynamic.toml
      - ./acme.json:/acme.json
Regarding this part of the documentation: if you are using Docker labels for your settings, configure it as follows. For example:
labels:
  - "traefik.http.middlewares.foo-add-prefix.addprefix.prefix=/foo"
  - "traefik.http.routers.router1.middlewares=foo-add-prefix@docker"
I had the same issue, and I was missing the provider namespace @docker in the middleware name.
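Applied to the frontend router from the question, that would look roughly like this (a sketch; frontend-auth is only an example middleware name):
labels:
  - "traefik.http.middlewares.frontend-auth.basicauth.users=test:$$2y$$05$$c45HvbP0Sq9EzcfaXiGNsuuWMfPhyoFZVYgiTylpMMLtJY2nP1P6m"
  - "traefik.http.routers.frontend.middlewares=frontend-auth@docker"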