sqlite3.OperationalError: no such table: dag when running in tox environment - airflow

I have a GitHub Action that runs tox. The tox file runs some very basic tests on Apache Airflow. For example:
workflow.yml
- uses: actions/checkout@v1
- some more steps
- name: Run tox
  run: tox
tox.ini
[tox]
envlist = py36

[testenv]
deps =
    pytest
commands =
    pytest
test_file.py
import os
from airflow.models import DagBag

def test_dags():
    dagbag = DagBag(dag_folder=os.path.join(os.path.dirname(os.path.abspath('task_manager_airflow')), 'dags'),
                    include_examples=False)
    dag = dagbag.get_dag(dag_id='some_dag')
Why do I get a no such table: dag error when the GitHub Action runs? When I run it locally on my machine it works perfectly.
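A likely cause is that the fresh tox virtualenv on the runner has no initialized Airflow metadata database, so the dag table does not exist there; locally it works because the database was initialized at some point. A minimal sketch of one way to handle this, assuming Airflow 1.10.x (on 2.x the command is airflow db init) and an isolated AIRFLOW_HOME inside the tox env:

[tox]
envlist = py36

[testenv]
deps =
    pytest
    apache-airflow
setenv =
    # keep the metadata DB inside the tox env so CI behaves like a clean local run (path is an assumption)
    AIRFLOW_HOME = {envtmpdir}/airflow
commands_pre =
    # create the SQLite metadata DB (including the dag table) before the tests run
    airflow initdb
commands =
    pytest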

Related

Airflow KubernetesExecutor doesn't spawn KubernetesPodOperator tasks and complete successfully

I'm testing the KubernetesPodOperator, running Airflow 2.2.1 with the cncf.kubernetes provider 2.1.0 on my minikube.
I'm having issues trying to spawn a mock task:
from kubernetes.client import models as k8s
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

init_environments = [
    k8s.V1EnvVar(name='AIRFLOW__KUBERNETES__POD_TEMPLATE_FILE', value='""'),
    k8s.V1EnvVar(name='KUBERNETES__POD_TEMPLATE_FILE', value='""'),
    k8s.V1EnvVar(name='POD_TEMPLATE_FILE', value='""'),
]

other_task = KubernetesPodOperator(
    dag=dag,
    task_id="ingestion_kube",
    env_vars=init_environments,
    cmds=["bash", "-cx"],
    arguments=["echo 10 \n\n\n\n\n\n\n"],
    name="base",
    image="meltano-flieber",
    image_pull_policy="IfNotExists",
    in_cluster=True,
    namespace="localkubeflow",
    is_delete_operator_pod=False,
    pod_template_file=None,
    get_logs=True,
)
The task pod from the KubernetesExecutor that executes the Airflow task completes successfully, but there is no sign of the pod-operator pod.
Relevant logs of the Airflow task:
[2021-11-23 21:30:28,213] {dagbag.py:500} INFO - Filling up the DagBag from /opt/***/dags/meltano/meltano_ingest_pendo.py
Running <TaskInstance: meltano_tasks.ingestion_kube manual__2021-11-23T21:30:10.788963+00:00 [queued]> on host meltanotasksingestionkube.996b19ee10464c2f8683cdfc8ce7303
And in Airflow the task looks as if it failed, but without any relevant logs. Does anybody have something related to this, or any suggestions?
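One way to check whether the operator ever asked Kubernetes for a pod is to inspect the target namespace directly; a small diagnostic sketch (the namespace matches the operator above, and the base-* pod name prefix is an assumption based on name="base"):

# list pods in the namespace the operator targets, including ones that already finished
kubectl get pods -n localkubeflow --sort-by=.metadata.creationTimestamp

# if a pod with a base-* name appears, see why it failed or never started
kubectl describe pod <pod-name> -n localkubeflow
kubectl logs <pod-name> -n localkubeflow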

GitHub Actions Runner listener exited with error code null

On my server, if I run
sudo ./svc.sh status
it reports that the status is active, but Runner listener exited with error code null.
On my GitHub account's Actions runners page the runner is offline.
The runner should be Idle, as far as I know.
This is my workflow
name: Node.js CI
on:
  push:
    branches: [ dev ]
jobs:
  build:
    runs-on: self-hosted
    strategy:
      matrix:
        node-version: [ 12.x ]
        # See supported Node.js release schedule at https://nodejs.org/en/about/releases/
    steps:
      - uses: actions/checkout@v2
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v1
        with:
          node-version: ${{ matrix.node-version }}
I don't have any build scripts here because I push the build folder directly to the GitHub repo.
How do I fix this error?
I ran into this issue when I had first installed the runner application as $USER1 but then configured it as a service while I was $USER2.
To fix it, I ran cd /home/$USER1/actions-runner && sudo ./svc.sh uninstall to remove the service.
I changed the owner of all files in /home/$USER1/actions-runner/ with
sudo chown -R $USER2 /home/$USER1/actions-runner.
Then I ran ./config.sh remove --token $HASHNUMBER (you can get this by following this page) to remove the runner configuration.
I also removed it from the GitHub settings runners page.
Finally, I installed everything again as the same user, and it worked.
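A condensed sketch of the recovery above; $USER1, $USER2 and $TOKEN stand for the original install user, the intended service user and the removal token, and the commented reinstall commands are the standard runner setup steps rather than anything specific to this repo:

# remove the service that was registered under the wrong user
cd /home/$USER1/actions-runner && sudo ./svc.sh uninstall

# hand the runner files over to the user that will actually run it
sudo chown -R $USER2 /home/$USER1/actions-runner

# deregister the runner (token comes from the repo's Settings -> Actions -> Runners page)
./config.sh remove --token $TOKEN

# then, as that same user, configure and install the service again
# ./config.sh --url https://github.com/<owner>/<repo> --token <registration-token>
# sudo ./svc.sh install && sudo ./svc.sh start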
I ran into this same issue and it turned out to be a disk space problem. Make sure your server has enough allocated storage.

Codefresh allure report: Test reporter requires CF_BRANCH_TAG_NORMALIZED variable for upload files

Setup:
Upon merge to master, a Codefresh build job builds the image and pushes it to the Docker registry.
A Codefresh test-run job picks up the new image and runs the tests.
At the end of the test-run CF job, the Allure report building step runs.
Results:
The 3rd step fails with the message in the title, but only when the job has run all the way through the pipeline.
It passes fine if I rerun the job manually (steps 1 and 2 are not executed in this case).
Notes:
Manually adding that tag does not help.
Test execution pipeline:
stages:
  - "clone"
  - "create"
  - "run"
  - "get_results"
  - "clean_up"
steps:
  clone:
    title: "Cloning repository"
    type: "git-clone"
    repo: "repo/repo"
    # CF_BRANCH value is auto set when pipeline is triggered
    revision: "${{CF_BRANCH}}"
    git: "github"
    stage: "clone"
  create:
    title: "Spin up ec2 server on aws"
    image: mesosphere/aws-cli
    working_directory: "${{clone}}" # Running command where code cloned
    commands:
      - export AWS_ACCESS_KEY_ID="${{AWS_ACCESS_KEY_ID}}"
      - export AWS_SECRET_ACCESS_KEY="${{AWS_SECRET_ACCESS_KEY}}"
      - export AWS_DEFAULT_REGION="${{AWS_REGION}}"
      - aws cloudformation create-stack --stack-name yourStackName --template-body file://cloudformation.yaml --parameters ParameterKey=keyName,ParameterValue=qaKeys
    stage: "create"
  run:
    title: "Wait for results"
    image: mesosphere/aws-cli
    working_directory: "${{clone}}" # Running command where code cloned
    commands:
      # wait for results in s3
      - apk update
      - apk upgrade
      - apk add bash
      - export AWS_ACCESS_KEY_ID="${{AWS_ACCESS_KEY_ID}}"
      - export AWS_SECRET_ACCESS_KEY="${{AWS_SECRET_ACCESS_KEY}}"
      - export AWS_DEFAULT_REGION="${{AWS_REGION}}"
      - chmod +x ./wait-for-aws.sh
      - ./wait-for-aws.sh
      # copy results objects from s3
      - aws s3 cp s3://${S3_BUCKETNAME}/ ./ --recursive
      - cp -r -f ./_result_/allure-raw $CF_VOLUME_PATH/allure-results
      - cat test-result.txt
    stage: "run"
  get_results:
    title: Generate test reporting
    image: codefresh/cf-docker-test-reporting
    tag: "${{CF_BRANCH_TAG_NORMALIZED}}"
    working_directory: '${{CF_VOLUME_PATH}}/'
    environment:
      - BUCKET_NAME=yourName
      - CF_STORAGE_INTEGRATION=integrationName
    stage: "get_results"
  clean_up:
    title: "Remove cf stack and files from s3"
    image: mesosphere/aws-cli
    working_directory: "${{clone}}" # Running command where code cloned
    commands:
      # wait for results in s3
      - apk update
      - apk upgrade
      - apk add bash
      - export AWS_ACCESS_KEY_ID="${{AWS_ACCESS_KEY_ID}}"
      - export AWS_SECRET_ACCESS_KEY="${{AWS_SECRET_ACCESS_KEY}}"
      - export AWS_DEFAULT_REGION="${{AWS_REGION}}"
      # delete stack
      - aws cloudformation delete-stack --stack-name stackName
      # remove all files from s3
      # - aws s3 rm s3://bucketName --recursive
    stage: "clean_up"
Adding CF_BRANCH_TAG_NORMALIZED as a tag won't help in that case.
CF_BRANCH_TAG_NORMALIZED needs to be set as an environment variable for this step.
Taking a look at the source code of codefresh/cf-docker-test-reporting,
https://github.com/codefresh-io/cf-docker-test-reporting/blob/master/config/index.js
env: {
    // bucketName - only bucket name, with out subdir path
    bucketName: ConfigUtils.getBucketName(),
    // bucketSubPath - parsed path to sub folder inside bucket
    bucketSubPath: ConfigUtils.getBucketSubPath(),
    // originBucketName - origin value that can contain subdir need to use it in some cases
    originBucketName: process.env.BUCKET_NAME,
    apiKey: process.env.CF_API_KEY,
    buildId: process.env.CF_BUILD_ID,
    volumePath: process.env.CF_VOLUME_PATH,
    branchNormalized: process.env.CF_BRANCH_TAG_NORMALIZED,
    storageIntegration: process.env.CF_STORAGE_INTEGRATION,
    logLevel: logLevelsMap[process.env.REPORT_LOGGING_LEVEL] || INFO,
    sourceReportFolderName: (allureDir || 'allure-results').trim(),
    reportDir: ((reportDir || '').trim()) || undefined,
    reportIndexFile: ((reportIndexFile || '').trim()) || undefined,
    reportWrapDir: _.isNumber(reportWrapDir) ? String(reportWrapDir) : '',
    reportType: _.isString(reportType) ? reportType.replace(/[<>]/g, 'hackDetected') : 'default',
    allureDir,
    clearTestReport
},
you can see that CF_BRANCH_TAG_NORMALIZED is taken directly from the environment.
My assumption is that whatever normally triggers your build does not set this environment variable. It is usually set automatically when you have a git trigger, e.g. from GitHub.
When you start your pipeline manually you probably set the variable, and that's why it works then.
You should check how your pipelines are usually triggered and if the variable is set (automatically or manually).
Here's some more documentation about these variables:
https://codefresh.io/docs/docs/codefresh-yaml/variables/#system-provided-variables
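One way to act on this is to pass the variable explicitly in the reporting step's environment list; a minimal sketch of the adjusted step (whether you forward the pipeline variable or hard-code a fallback for manually started runs is your choice):

get_results:
  title: Generate test reporting
  image: codefresh/cf-docker-test-reporting
  working_directory: '${{CF_VOLUME_PATH}}/'
  environment:
    - BUCKET_NAME=yourName
    - CF_STORAGE_INTEGRATION=integrationName
    # forward the branch variable so the reporter sees it as an env var;
    # replace the right-hand side with a literal value if no git trigger sets it
    - CF_BRANCH_TAG_NORMALIZED=${{CF_BRANCH_TAG_NORMALIZED}}
  stage: "get_results"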

Airflow is using SequentialExecutor despite setting executor to LocalExecutor in airflow.cfg

I'm having trouble getting the LocalExecutor to work.
I created a Postgres database called airflow and granted all privileges to the airflow user. Finally, I updated my airflow.cfg file:
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor, DaskExecutor, KubernetesExecutor
executor = LocalExecutor
# The SqlAlchemy connection string to the metadata database.
# SqlAlchemy supports many different database engine, more information
# their website
sql_alchemy_conn = postgresql+psycopg2://airflow:[MY_PASSWORD]@localhost:5432/airflow
Next I ran:
airflow initdb
airflow scheduler
airflow webserver
I thought it was working, but I noticed my dags were taking a long time to finish. Upon further inspection of my log files, I noticed that they say Airflow is using the SequentialExecutor.
INFO - Job 319: Subtask create_task_send_email [2020-01-07 12:00:16,997] {__init__.py:51} INFO - Using executor SequentialExecutor
Does anyone know what could be causing this?
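A common first check, sketched here under the assumption that it is run in the same shell and Python environment as the scheduler, is to confirm which configuration Airflow actually loads, since a different AIRFLOW_HOME or an AIRFLOW__CORE__EXECUTOR environment variable overrides what is written in airflow.cfg:

# which AIRFLOW_HOME is in effect? (empty means the default ~/airflow)
echo "$AIRFLOW_HOME"

# is the executor overridden through the environment?
env | grep AIRFLOW__CORE__EXECUTOR

# what do the loaded settings actually resolve to?
python -c "from airflow.configuration import conf; print(conf.get('core', 'executor'))"
python -c "from airflow.configuration import conf; print(conf.get('core', 'sql_alchemy_conn'))"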

dag_id could not be found: dag_id. Either the dag did not exist or it failed to parse after upgrading airflow from 1.7.3 to 1.10.1

I upgraded Airflow from version 1.7.3 to 1.10.1. After upgrading the scheduler, webserver and workers, the DAGs stopped working, showing the error below on the scheduler:
Either the dag did not exist or it failed to parse.
I have not made any changes to the config. While investigating, the scheduler logs show the problem. Earlier the scheduler ran the task as:
Adding to queue: airflow run <dag_id> <task_id> <execution_date> --local -sd DAGS_FOLDER/<dag_filename.py>
Now it runs it with an absolute path:
Adding to queue: airflow run <dag_id> <task_id> <execution_date> --local -sd /<PATH_TO_DAGS_FOLDER>/<dag_filename.py>
PATH_TO_DAGS_FOLDER is like /home/<user>/Airflow/dags...
which is the same path it pushes to the workers; since the worker runs as a different user, it cannot find the DAG location that was specified.
I am not sure how to tell the worker to look in its own Airflow home directory rather than the scheduler's.
I am using MySQL as the backend and RabbitMQ for message passing.
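One workaround often used for this kind of path mismatch, sketched under the assumption that only the home directories differ between the scheduler and worker users, is to make the scheduler's absolute DAGs path resolvable on each worker host:

# on each worker host: let the scheduler's absolute DAGs path resolve to the worker's real DAGs folder
sudo mkdir -p /home/<scheduler_user>/Airflow
sudo ln -s /home/<worker_user>/Airflow/dags /home/<scheduler_user>/Airflow/dags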
