How to check the output of airflow test?

I tried the Airflow tutorial DAG and it works with the scheduler; I can see the logs generated by the scheduled runs. But if I test it from the command line, I don't see any output:
airflow test my_tutorial_2 templated 2018-09-08
[2018-09-10 15:41:43,121] {__init__.py:51} INFO - Using executor SequentialExecutor
[2018-09-10 15:41:43,281] {models.py:258} INFO - Filling up the DagBag from /Users/xiang/Documents/BigData/airflow/dags
[2018-09-10 15:41:43,338] {example_kubernetes_operator.py:54} WARNING - Could not import KubernetesPodOperator: No module named 'kubernetes'
[2018-09-10 15:41:43,339] {example_kubernetes_operator.py:55} WARNING - Install kubernetes dependencies with: pip install airflow['kubernetes']
That is all the output; my task's output is not there.
The airflow version is:
▶ pip list
Package Version
---------------- ---------
alembic 0.8.10
apache-airflow 1.10.0

If you use Airflow v1.10, you can set the propagate attribute of the task instance logger to True; the log records will then be propagated to the root logger, which uses the console handler and prints to sys.stdout.
Adding ti.log.propagate = True
after line 589 of site-packages/airflow/bin/cli.py does the trick.
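For reference, a rough sketch of what that edit looks like; the exact code around line 589 may differ slightly between 1.10.x releases, but ti is the TaskInstance the CLI builds before running or testing the task:

# site-packages/airflow/bin/cli.py (Airflow 1.10.0), in the function that runs the task
ti = TaskInstance(task, args.execution_date)   # existing line (approximate context)
ti.log.propagate = True                        # added: forward task log records to the root logger's console handler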

I've since found that while setting 'console' as a handler for the airflow.task logger lets you see the output of 'airflow test' commands, it also seems to cause 'airflow run' commands to enter an infinite loop and run out of memory, so I would only do this in an environment where you only run 'airflow test' commands.
Why this happens I don't know yet, and whether there's a way to achieve it without breaking 'airflow run' is unclear to me.
The default logging config for Airflow 1.10.0 has the following loggers available:
'loggers': {
    'airflow.processor': {
        'handlers': ['processor'],
        'level': LOG_LEVEL,
        'propagate': False,
    },
    'airflow.task': {
        'handlers': ['task'],
        'level': LOG_LEVEL,
        'propagate': False,
    },
    'flask_appbuilder': {
        'handler': ['console'],
        'level': FAB_LOG_LEVEL,
        'propagate': True,
    }
},
and the airflow.task logger (which is the logger used when running your task) uses the 'task' handler:
'handlers': {
    'console': {
        'class': 'airflow.utils.log.logging_mixin.RedirectStdHandler',
        'formatter': 'airflow',
        'stream': 'sys.stdout'
    },
    'task': {
        'class': 'airflow.utils.log.file_task_handler.FileTaskHandler',
        'formatter': 'airflow',
        'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
        'filename_template': FILENAME_TEMPLATE,
    },
    'processor': {
        'class': 'airflow.utils.log.file_processor_handler.FileProcessorHandler',
        'formatter': 'airflow',
        'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER),
        'filename_template': PROCESSOR_FILENAME_TEMPLATE,
    },
},
which (unless changed) will only write the output of the task to a log file. If you want to see the output on stdout as well, then you need to add the console handler to the list of handlers used by the airflow.task logger:
'airflow.task': {
    'handlers': ['task', 'console'],
    'level': LOG_LEVEL,
    'propagate': False,
},
This can be done either by setting up a custom logging configuration class that overrides the default configuration, or by editing the default settings file:
wherever_you_installed_airflow/site-packages/airflow/config_templates/airflow_local_settings.py
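For example, a minimal custom logging config module might look roughly like this (the module name my_log_config.py is only a placeholder; the one changed line is the handlers list shown above):

# my_log_config.py -- any module importable from PYTHONPATH
from copy import deepcopy

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# write task output to the per-task log file AND echo it to the console
LOGGING_CONFIG['loggers']['airflow.task']['handlers'] = ['task', 'console']

Then point Airflow at it in airflow.cfg (the option lives under [core] in 1.10 and moved to [logging] in later versions):

logging_config_class = my_log_config.LOGGING_CONFIG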

I ran into this problem as well with Airflow 1.10.0. As Louis Genasi mentioned, airflow run would go into a death spiral with the default settings and a console handler. I suspect there may be a bug in the default logging class in 1.10.0.
I got around the issue by changing the logging handler to Python's logging.StreamHandler (which appears to be the default in Airflow < 1.10.0):
'handlers': {
    'console': {
        'class': 'logging.StreamHandler',
        'formatter': 'airflow',
        'stream': 'ext://sys.stdout'
    },
},
'loggers': {
    'airflow.processor': {
        'handlers': ['console'],
        'level': LOG_LEVEL,
        'propagate': False,
    },
    'airflow.task': {
        'handlers': ['console'],
        'level': LOG_LEVEL,
        'propagate': False,
    },
    'flask_appbuilder': {
        'handler': ['console'],
        'level': FAB_LOG_LEVEL,
        'propagate': True,
    }
},
'root': {
    'handlers': ['console'],
    'level': LOG_LEVEL,
}

Related

Doesn't the airflow ecosystem share a single root logger?

I am trying to ship all airflow logs to kafka by attaching a new handler to the root logger, but not all logs are being published. Do I need to configure something else here?
This is what I'm doing:
custom_log_config.py
from copy import deepcopy

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# Configure a new handler for publishing logs to kafka
environment = get_app_env()
LOGGING_CONFIG["handlers"]["kafka_handler"] = {
    "class": "com.test.log_handler.KafkaHandler",
    "formatter": "airflow",
    "version": environment.version,
    "log_file": log_file,
    "filters": ["mask_secrets"],
}

# Attach handler to root logger of airflow
LOGGING_CONFIG["root"]["handlers"].append("kafka_handler")
And finally I'm setting airflow configs to use the new logger class described above:
airflow.logging__logging_config_class=com.test.log_handler.custom_log_config.LOGGING_CONFIG
While some logs do flow to Kafka, I'm missing the task run logs (e.g. records emitted from taskinstance.py, standard_task_runner.py, cli_action_loggers.py).
Look at the DEFAULT_LOGGING_CONFIG:
'loggers': {
    'airflow.processor': {
        'handlers': ['processor'],
        'level': LOG_LEVEL,
        'propagate': False,
    },
    'airflow.task': {
        'handlers': ['task'],
        'level': LOG_LEVEL,
        'propagate': False,
        'filters': ['mask_secrets'],
    },
    'flask_appbuilder': {
        'handlers': ['console'],
        'level': FAB_LOG_LEVEL,
        'propagate': True,
    },
},
You will find that tasks have a separate logger, "airflow.task", whose propagate flag is False, so task log records never reach the root logger and a handler attached only to root never sees them. The Kafka handler has to be attached to "airflow.task" as well, as sketched below.
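A sketch of the extra lines for custom_log_config.py, reusing the kafka_handler defined in the question:

# task logs do not propagate to root, so attach the handler to the task logger directly
LOGGING_CONFIG["loggers"]["airflow.task"]["handlers"].append("kafka_handler")

# optional: also capture DAG-processing logs, which are likewise non-propagating
LOGGING_CONFIG["loggers"]["airflow.processor"]["handlers"].append("kafka_handler")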

.eslintrc error Parsing error: Maximum call stack size exceeded

I am using Vite, ESLint, and Vue 3 with TypeScript. When I run the command:
npm run lint
I get numerous errors:
0:0 error Parsing error: Maximum call stack size exceeded
My .eslintrc.js setup is:
module.exports = {
  parserOptions: {
    parser: 'vue-eslint-parser',
    sourceType: 'module'
  },
  env: {
    node: true,
  },
  extends: [
    'eslint:recommended',
    'plugin:vue/vue3-recommended',
    'prettier'
  ],
  rules: {
    // override/add rules settings here, such as:
    // 'vue/no-unused-vars': 'error'
  }
}
This fixed the problem; the working .eslintrc.js:
module.exports = {
  parser: "vue-eslint-parser",
  parserOptions: {
    parser: "@typescript-eslint/parser",
    sourceType: "module"
  },
  env: {
    node: true,
  },
  extends: [
    'eslint:recommended',
    'plugin:vue/vue3-recommended',
    'prettier'
  ],
  rules: {
    // override/add rules settings here, such as:
    // 'vue/no-unused-vars': 'error'
  }
}
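Note that this config assumes both parsers are available as dev dependencies; if they are missing, install them with something like:
$ npm install --save-dev vue-eslint-parser @typescript-eslint/parser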

How to pass arguments in mup config for meteor run command?

Mup version (1.4.3):
module.exports = {
  servers: {
  },
  meteor: {
    // TODO: change app name and path
    name: 'e-commerce',
    path: '../../store', //diff
    servers: {
      // one: {},
      two: {},
      three: {},
      four: {},
      five: {},
      six: {},
      seven: {}
    },
    buildOptions: {
      serverOnly: true,
      cleanAfterBuild: false,
    },
    env: {
      // TODO: Change to your app's url
      // If you are using ssl, it needs to start with https://
      ROOT_URL: 'https://someurl.com',
      MONGO_URL: 'xxx',
    },
    docker: {
      // change to 'kadirahq/meteord' if your app is not using Meteor 1.4
      //image: 'abernix/meteord:base',
      image: 'abernix/meteord:node-8.4.0-base'
      // imagePort: 80, // (default: 80, some images EXPOSE different ports)
    },
    deployCheckWaitTime: 80,
    enableUploadProgressBar: true
  }
};
Output of the command after 2 GB of the 8 GB of memory is filled:
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
I have detailed my error in the Meteor forum here.
It turns out not enough memory is allocated and I need to increase it, as described here.
I need to pass an argument when mup runs my app; for example, main.js --node-args="--max-old-space-size=6144".
How do I pass arguments with the mup config?
Any suggestion or help is much appreciated.
I just forked the repo, modified it, and created a new image. It is running already. Thanks for the hint.
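For reference, one route that might avoid a custom image is passing the flag through an environment variable in the mup config. This is only a sketch: it assumes the Node version inside the meteord image (8.4.0 here) accepts --max-old-space-size via NODE_OPTIONS, which is not true of every Node release, so the custom-image approach above remains the safer bet if it does not work:

// mup.js (sketch) -- unchanged settings omitted
env: {
  ROOT_URL: 'https://someurl.com',
  MONGO_URL: 'xxx',
  NODE_OPTIONS: '--max-old-space-size=6144' // assumption: honoured by the node process in the container
},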

Using sw-precache configured with runtimeCaching is not loading sw-toolbox

According to the sw-precache documentation (https://github.com/GoogleChrome/sw-precache#runtime-caching), including a runtime-caching configuration for sw-precache should itself take care of including sw-toolbox for runtime caching of dynamic content. I have tried this with sw-precache's CLI as well as with grunt-sw-precache. My configuration for Grunt is as follows:
grunt.initConfig({
  'sw-precache': {
    build: {
      baseDir: './public',
      workerFileName: 'service-worker.js',
      appendTimestamp: true,
      cacheId: 'cnbc-polymer-cache-20',
      clientsClaim: true,
      directoryIndex: 'index.html',
      navigateFallback: 'index.html',
      skipWaiting: true,
      maximumFileSizeToCacheInBytes: (1024000 * 20),
      staticFileGlobs: [
        '/src/**/*',
        '/index.html',
        '/manifest.json',
        '/bower_components/**/*',
        '/images/**/*.*',
        '/favicon.ico'
      ],
      verbose: true,
      runtimeCaching: [{
        urlPattern: /franchise/,
        handler: 'cacheFirst',
        options: {
          debug: true,
          cache: {
            maxEntries: 10,
            name: 'franchise-cache',
            maxAgeSeconds: 180
          }
        }
      }, {
        urlPattern: /story/,
        handler: 'cacheFirst',
        options: {
          debug: true,
          cache: {
            maxEntries: 10,
            name: 'story-cache',
            maxAgeSeconds: 180
          }
        }
      }]
    }
  }
});
And when trying with the CLI I used the following sw-precache-config.js:
module.exports = {
  baseDir: './public',
  workerFileName: 'service-worker.js',
  appendTimestamp: true,
  cacheId: 'cnbc-polymer-cache-20',
  clientsClaim: true,
  directoryIndex: 'index.html',
  navigateFallback: 'index.html',
  skipWaiting: true,
  maximumFileSizeToCacheInBytes: (1024000 * 20),
  staticFileGlobs: [
    '/src/**/*',
    '/index.html',
    '/manifest.json',
    '/bower_components/**/*',
    '/images/**/*.*',
    '/favicon.ico'
  ],
  verbose: true,
  runtimeCaching: [{
    urlPattern: /franchise/,
    handler: 'cacheFirst',
    options: {
      debug: true,
      cache: {
        maxEntries: 10,
        name: 'franchise-cache',
        maxAgeSeconds: 180
      }
    }
  }, {
    urlPattern: /story/,
    handler: 'cacheFirst',
    options: {
      debug: true,
      cache: {
        maxEntries: 10,
        name: 'story-cache',
        maxAgeSeconds: 180
      }
    }
  }]
};
All configuration options other than the runtimeCaching options are being applied to the generated service-worker.js file.
My package.json is configured to use "^4.2.3" of sw-precache and "^3.4.0" of sw-toolbox.
I have not seen anyone else mention having this problem. Can anyone comment on what might be preventing sw-precache from respecting my runtimeCaching options?
Sadly, grunt-sw-precache does not depend on the newest sw-precache, so the runtimeCaching option and other improvements (such as how sw-precache now handles request redirects correctly) are missing.
I made a clone of the repo with the necessary changes here. I have no intention of publishing it to npm, but it works as a temporary solution (so refer to my GitHub repo in your package.json!).
Please check and make sure that you have installed Grunt.
grunt-sw-precache can be installed using the following command:
$ npm install grunt-sw-precache --save-dev
Enable grunt-sw-precache by adding the following to your Gruntfile:
grunt.loadNpmTasks('grunt-sw-precache');
Then, you might want to try using handler: 'networkFirst' instead of handler: 'cacheFirst'.
As mentioned in this tutorial,
Try to handle the request by fetching from the network. If it succeeds, store the response in the cache. Otherwise, try to fulfill the request from the cache. This is the strategy to use for basic read-through caching.
You may visit this GitHub post for more information on how and why you'd use sw-precache and sw-toolbox libraries together and also The offline cookbook for more information on caching strategies.
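For illustration, here is the suggested change applied to the first runtimeCaching entry from the question; only the handler value changes:

runtimeCaching: [{
  urlPattern: /franchise/,
  handler: 'networkFirst', // was 'cacheFirst'
  options: {
    debug: true,
    cache: {
      maxEntries: 10,
      name: 'franchise-cache',
      maxAgeSeconds: 180
    }
  }
}]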

How do you use connect-livereload in grunt?

I was struggling to get my server to reload, so I followed some links and ended up at connect-livereload.
However, when I try to use this grunt code in my Gruntfile
connect: {
  options: {
    port: 3000,
    hostname: 'localhost'
  },
  dev: {
    options: {
      middleware: function (connect) {
        return [
          require('connect-livereload')(), // <--- here
          checkForDownload,
          mountFolder(connect, '.tmp'),
          mountFolder(connect, 'app')
        ];
      }
    }
  }
}
I get this error in my grunt console:
Warning: checkForDownload is not defined Use --force to continue.
Aborted due to warnings.
The documentation for connect-livereload is so horrendous; there is no mention of what checkForDownload or mountFolder is supposed to be.
Does anyone know?
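For what it's worth, neither of those names comes from connect-livereload itself: mountFolder is a small helper that older Yeoman-style Gruntfiles define near the top of the file, and checkForDownload is presumably a custom middleware from whichever example that snippet was copied from. A rough sketch of the usual mountFolder helper, as an assumption about what that example expected:

// commonly defined near the top of Yeoman-generated Gruntfiles
var mountFolder = function (connect, dir) {
  return connect.static(require('path').resolve(dir));
};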
