I have a question about the DruidOperator. I see that the test succeeds, but I get this error:
File "/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 792, in get_adapter
raise InvalidSchema(f"No connection adapters were found for {url!r}")
I use a DAG like this:
DRUID_CONN_ID = "druid_ingest_conn_id"

ingestion = DruidOperator(
    task_id='ingestion',
    druid_ingest_conn_id=DRUID_CONN_ID,
    json_index_file='ingestion.json'
)
I also tried changing the DAG in other ways, but I got the same error.
As another step, I changed the operator type as shown below, but then I got a different error:
ingestion_2 = SimpleHttpOperator(
    task_id='test_task',
    method='POST',
    http_conn_id=DRUID_CONN_ID,
    endpoint='/druid/indexer/v1/task',
    data=json.dumps(read_file),
    dag=dag,
    do_xcom_push=True,
    headers={
        'Content-Type': 'application/json'
    },
    response_check=lambda response: response.json()['Status'] == 200,
)
{"error":"Missing type id when trying to resolve subtype of [simple type, class org.apache.druid.indexing.common.task.Task]: missing type id property 'type'\n at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 1]"}
Finally, I tried using an HTTP connection in the DruidOperator, but then I got this error:
raise AirflowException(f'Did not get 200 when submitting the Druid job to {url}')
So I am confused and need some help. Thanks for any answers.
P.S.: We use Airflow version 2.3.3.
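A note on the first error: requests raises InvalidSchema("No connection adapters were found for ...") whenever the URL it is given does not start with http:// or https://. In the Druid provider around Airflow 2.3, the hook builds the ingest URL roughly as f'{conn_type}://{host}:{port}/{endpoint}', with endpoint read from the connection's extra field, so a connection whose type is not an HTTP scheme produces exactly this exception. The second error ("missing type id property 'type'") comes from Druid itself and usually means the JSON body that reached the Overlord is missing the top-level "type" field that every ingestion task spec must declare. Below is a minimal sketch (not from the question) of registering the connection so its URL resolves to an http:// scheme; the host, port, and endpoint values are placeholders:

from airflow.models import Connection
from airflow.utils.session import create_session

# Placeholder host/port/endpoint; adjust for your cluster.
druid_conn = Connection(
    conn_id='druid_ingest_conn_id',
    conn_type='http',  # an http(s) type avoids requests' InvalidSchema error
    host='druid-overlord.example.com',
    port=8081,
    extra='{"endpoint": "druid/indexer/v1/task"}',
)

with create_session() as session:
    session.add(druid_conn)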
How can I get the reason for the failure of an operator without going into the logs? I want to post the reason as a notification through Slack.
Thanks,
Xi
I can think of one way of doing this, as below:
Set error notifications: https://www.astronomer.io/guides/error-notifications-in-airflow/
Also create a Slack email alias for DMs: https://slack.com/help/articles/206819278-Send-emails-to-Slack
Another way is using the Slack API from Airflow: https://medium.com/datareply/integrating-slack-alerts-in-airflow-c9dcd155105
Check the above for SlackAPIPostOperator.
context.get('exception') is what gives the exact reason for the failure.
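For the SlackAPIPostOperator route, here is a minimal hedged sketch; the connection id, channel, and message text are placeholders, not taken from the linked post:

from airflow.providers.slack.operators.slack import SlackAPIPostOperator

# Placeholder connection/channel; the text field is templated at runtime.
notify_slack = SlackAPIPostOperator(
    task_id='notify_slack',
    slack_conn_id='slack_api',  # connection holding the Slack API token
    channel='#alerts',
    username='airflow',
    text='Task {{ ti.task_id }} in DAG {{ ti.dag_id }} failed.',
)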
Example of an on_failure_callback using the Slack webhook:
# Airflow 2.x import paths
from airflow.hooks.base import BaseHook
from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator

def task_fail_slack_alert(context):
    SLACK_CONN_ID = 'slack'
    slack_webhook_token = BaseHook.get_connection(SLACK_CONN_ID).password
    slack_msg = """
:red_circle: Task Failed.
*Task*: {task}
*Dag*: {dag}
*Execution Time*: {exec_date}
*Log Url*: {log_url}
*Error*: {exception}
""".format(
        task=context.get('task_instance').task_id,
        dag=context.get('task_instance').dag_id,
        exec_date=context.get('execution_date'),
        log_url=context.get('task_instance').log_url,
        exception=context.get('exception'),
    )
    failed_alert = SlackWebhookOperator(
        task_id='slack_test',
        http_conn_id='slack',
        webhook_token=slack_webhook_token,
        message=slack_msg,
        username='airflow',
        dag=dag,
    )
    return failed_alert.execute(context=context)

# Attach the callback to any task, for example an EMR step sensor:
step_checker = EmrStepSensor(
    task_id='watch_step',
    job_flow_id="{{ task_instance.xcom_pull('create_job_flow', key='return_value') }}",
    step_id="{{ task_instance.xcom_pull(task_ids='add_steps', key='return_value')[0] }}",
    aws_conn_id='aws_default',
    on_failure_callback=task_fail_slack_alert,
)
I have a problem with the usage of on_failure_callback. I defined my error callback function to perform two HTTP POST requests, with a logging.error() message between them, but I notice that only one of them is executed. Is there some delay or something that I am missing here? Please help.
def custom_failure_function(context):
    logging.error("These task instances ahhh")
    to_json = json.loads(t_teams)
    var1 = json.dumps(to_json)
    print(var1)
    r = requests.post('https://myteamschannel/teams', data=var1, verify=False)
    logging.error("hello")
    runID = 'OPERATION_CONTEXT .OCV8.TEST2 alarm_object 193'
    headers = {'Content-Type': 'text/xml'}
    alarmRequest = '<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><soapenv:Header xmlns:wsa=\"http://www.w3.org/2005/08/addressing\"><wsu:Timestamp xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsu:Created>2014-05-22T11:57:38.267Z</wsu:Created><wsu:Expires>2014-05-22T12:02:38.000Z</wsu:Expires></wsu:Timestamp><wsse:Security xmlns:wsse=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd\" soapenv:mustUnderstand=\"1\"><wsse:UsernameToken xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsse:Username>girws</wsse:Username><wsse:Password Type=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText\">Temip</wsse:Password></wsse:UsernameToken></wsse:Security></soapenv:Header> <soapenv:Body> <oper:Set_Request xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><EntitySpec><Natural> '+ runID + '</Natural></EntitySpec><Arguments> <Attribute_Values><Filtering_Type>' + 'AUTOFAIL' + '</Filtering_Type></Attribute_Values></Arguments></oper:Set_Request> </soapenv:Body> </soapenv:Envelope>'
    r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object', header=headers, data=alarmRequest, verify=True)
    logging.error("FAILED TASK")
    logging.error("============================================")
The logs from my Airflow run are below. It stops at the "hello" message and never prints "FAILED TASK".
*** Reading local file: /data/airflow//logs/MOE_TEST_DAG/TeamsTest/2021-10-02T08:24:14.821970+00:00/3.log
[2021-10-02 10:24:36,535] {MOE_TEST.py:132} ERROR - These task instances ahhh
[2021-10-02 10:24:36,987] {MOE_TEST.py:138} ERROR - hello
From your description it's more likely that there is an issue with requests.post(); try adding a timeout to the request so a hanging connection cannot stall the callback indefinitely. Note also that requests.post() takes headers=, not header=; passing an unexpected header= keyword raises a TypeError, which would abort the callback right after the "hello" log line:

def custom_failure_function(context):
    ...
    try:
        r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object',
                          headers=headers,  # headers=, not header=
                          data=alarmRequest, verify=True, timeout=5)
    except requests.Timeout:
        logging.error("request timeout")
    except requests.ConnectionError:
        logging.error("request connection error")
    logging.error("FAILED TASK")
    logging.error("============================================")
I'm writing a function to call the Rasa API for intent prediction. Here is my code:
def run_test():
    url = "http://localhost:5005/model/parse"
    obj = {"text": "What is your name?"}
    response = requests.post(url, data=obj)
    print(response.json())
I also start the Rasa server with this command: rasa run -m models --enable-api --cors "*" --debug
In the terminal where I executed run_test(), I got this result:
{'version': '2.7.1', 'status': 'failure', 'message': 'An unexpected error occurred. Error: Failed when parsing body as json', 'reason': 'ParsingError', 'details': {}, 'help': None, 'code': 500}
Can anybody help me solve this problem? Thank you very much!
I found the solution: you only need to json.dumps() the object, because a Python object is different from a JSON object.
def run_test():
    url = "http://localhost:5005/model/parse"
    obj = {"text": "What is your name?"}
    response = requests.post(url, data=json.dumps(obj))
    print(response.json())
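As a side note (not part of the original answer): requests can do this serialization itself via the json= parameter, which also sets the Content-Type: application/json header for you:

import requests

def run_test():
    url = "http://localhost:5005/model/parse"
    obj = {"text": "What is your name?"}
    # json= serializes the dict and sets Content-Type: application/json
    response = requests.post(url, json=obj)
    print(response.json())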
I am fairly new to Airflow and I am currently trying to pass information between my SimpleHttpOperators.
This is where the data is retrieved:
request_city_information = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='basic_city_information',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    data=f'[out:json]; node[name={name_city}][capital]; out center;',
    response_filter=lambda response: response.json()['elements'][0],
    dag=dag,
)
And then I want to use the response from this in the following operator:
request_city_attractions = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='city_attractions',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    data=f"[out:json];(nwr[tourism='attraction'][wikidata](around:{search_radius},"
         f"{request_city_information.xcom_pull(context='ti')['lat']}"
         f",10););out body;>;out skel qt;",
    dag=dag,
)
As you can see, I tried to access the response via request_city_information.xcom_pull(context='ti'), but my context seems to be wrong here.
As my data is already written to XCom, I take it that I don't need xcom_push=True, as suggested here.
There seem to be changes to XCom since Airflow 2.x, as many of the suggested solutions I found do not work for me.
I believe there is a major gap in my thought process; I just don't know where.
I would appreciate any references to examples or help! Thanks in advance.
I have now solved it with a completely different approach. If you know how the first one could work, I would be happy about an explanation (see also the sketch after the solution below).
Here is my solution:
with DAG(
    'city_info',
    default_args=default_args,
    description='xcom test',
    schedule_interval=None,
) as dag:

    # TODO: tasks with conn_id
    def get_city_information(**kwargs):
        payload = f'[out:json]; node[name={name_city}][capital]; out center;'
        # TODO: request as a Connection
        r = requests.post('https://overpass-api.de/api/interpreter', data=payload)
        ti = kwargs['ti']
        ti.xcom_push('basic_city_information', r.json())

    get_city_information_task = PythonOperator(
        task_id='get_city_information_task',
        python_callable=get_city_information
    )

    def get_city_attractions(**kwargs):
        ti = kwargs['ti']
        city_information = ti.xcom_pull(task_ids='get_city_information_task', key='basic_city_information')
        payload = f"[out:json];(nwr[tourism='attraction'][wikidata](around:{search_radius}" \
                  f",{city_information['elements'][0]['lat']},{city_information['elements'][0]['lon']}" \
                  f"););out body;>;out skel qt;"
        r = requests.post('https://overpass-api.de/api/interpreter', data=payload)
        # TODO: JSON as object
        ti.xcom_push('city_attractions', r.json())

    get_city_attractions_task = PythonOperator(
        task_id='get_city_attractions_task',
        python_callable=get_city_attractions
    )

    get_city_information_task >> get_city_attractions_task
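As for why the first approach failed: xcom_pull cannot be called on an operator object while the DAG file is being parsed; the pull has to happen at task runtime. Since data is a templated field of SimpleHttpOperator, one way is to defer the pull with Jinja. The sketch below assumes the first task's response_filter from the question (so the XCom value is a dict with 'lat' and 'lon' keys); the hard-coded search radius is a placeholder:

request_city_attractions = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='city_attractions',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    # 'data' is templated, so the XCom pull is evaluated at runtime
    data="[out:json];(nwr[tourism='attraction'][wikidata]"
         "(around:1000,"  # placeholder search radius
         "{{ ti.xcom_pull(task_ids='basic_city_information')['lat'] }},"
         "{{ ti.xcom_pull(task_ids='basic_city_information')['lon'] }}"
         "););out body;>;out skel qt;",
    dag=dag,
)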
I've been trying to create an OpenStack image specifying the kernel id and ramdisk id, using the OpenStack Unified SDK (https://github.com/openstack/python-openstacksdk), but without success. I know this is possible, because the OpenStack CLI has these parameters, as shown on this page (http://docs.openstack.org/cli-reference/glance.html#glance-image-create), where the CLI has the "--kernel-id" and "--ramdisk-id" parameters. I've used these parameters in the terminal and confirmed they work, but I need to use them in Python.
I'm trying to use the upload_image method, as described here: http://developer.openstack.org/sdks/python/openstacksdk/users/proxies/image.html#image-api-v2, but I can't get the attrs parameter right. The documentation only says it is supposed to be a dictionary. Here is the code I'm using:
...
atrib = {
    'properties': {
        'kernel_id': 'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
        'ramdisk_id': 'cfef18e0-006e-477a-a098-593d43435a1e'
    }
}

with open(file) as fimage:
    image = image_service.upload_image(
        name=name,
        data=fimage,
        disk_format='qcow2',
        container_format='bare',
        **atrib)
...
And here is the error I'm getting:
File "builder.py", line 121, in main
**atrib
File "/usr/lib/python2.7/site-packages/openstack/image/v2/_proxy.py", line 51, in upload_image
**attrs)
File "/usr/lib/python2.7/site-packages/openstack/proxy2.py", line 193, in _create
return res.create(self.session)
File "/usr/lib/python2.7/site-packages/openstack/resource2.py", line 570, in create
json=request.body, headers=request.headers)
File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 675, in post
return self.request(url, 'POST', **kwargs)
File "/usr/lib/python2.7/site-packages/openstack/session.py", line 52, in map_exceptions_wrapper
http_status=e.http_status, cause=e)
openstack.exceptions.HttpException: HttpException: Bad Request, 400 Bad Request
Provided object does not match schema 'image': {u'kernel_id': u'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1', u'ramdisk_id': u'cfef18e0-006e-477a-a098-593d43435a1e'} is not of type 'string'
Failed validating 'type' in schema['additionalProperties']: {'type': 'string'}
On instance[u'properties']: {u'kernel_id': u'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1', u'ramdisk_id': u'cfef18e0-006e-477a-a098-593d43435a1e'}
I already tried the update_image method, also without success: passing the kernel id and ramdisk id as strings creates the instance, but it does not boot.
Does anyone know how to solve this?
Which version of the glance API do you use?
I have read the code in openstackclient/image/v1/images.py and openstackclient/v1/shell.py:
## in shell.py
def do_image_create(gc, args):
    ...
    fields = dict(filter(lambda x: x[1] is not None, vars(args).items()))
    raw_properties = fields.pop('property')
    fields['properties'] = {}
    for datum in raw_properties:
        key, value = datum.split('=', 1)
        fields['properties'][key] = value
    ...
    image = gc.images.create(**fields)

## in images.py
def create(self, **kwargs):
    ...
    for field in kwargs:
        if field in CREATE_PARAMS:
            fields[field] = kwargs[field]
        elif field == 'return_req_id':
            continue
        else:
            msg = 'create() got an unexpected keyword argument \'%s\''
            raise TypeError(msg % field)
    hdrs = self._image_meta_to_headers(fields)
    ...
    resp, body = self.client.post('/v1/images',
                                  headers=hdrs,
                                  data=image_data)
    ...
and in openstackclient/v2/shell.py and openstackclient/image/v2/images.py (I have debugged this too):
## in shell.py
def do_image_create(gc, args):
    ...
    raw_properties = fields.pop('property', [])
    for datum in raw_properties:
        key, value = datum.split('=', 1)
        fields[key] = value
    ...
    image = gc.images.create(**fields)

## in images.py
def create(self, **kwargs):
    """Create an image."""
    url = '/v2/images'
    image = self.model()
    for (key, value) in kwargs.items():
        try:
            setattr(image, key, value)
        except warlock.InvalidOperation as e:
            raise TypeError(utils.exception_to_str(e))
    resp, body = self.http_client.post(url, data=image)
    ...
It seems that you can create an image your way in version 1.0 of the API, but in version 2.0 you should pass kernel_id and ramdisk_id as top-level keys, like this:

atrib = {
    'kernel_id': 'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
    'ramdisk_id': 'cfef18e0-006e-477a-a098-593d43435a1e'
}

However, the OpenStack SDK does not seem to pass those two arguments through to the request (there is no body definition for them in openstack/image/v2/image.py), so you would have to modify the OpenStack SDK to support this.
By the way, the OpenStack code differs a little between versions, but many things are the same.
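For completeness, a hedged sketch of the same create call made directly against python-glanceclient v2 (the client whose source is quoted above), where kernel_id and ramdisk_id are plain top-level attributes. The auth URL, credentials, and file name are placeholders, not taken from the original question:

from keystoneauth1 import session
from keystoneauth1.identity import v3
from glanceclient import Client

# Placeholder keystone credentials; adjust for your cloud.
auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='secret',
                   project_name='admin',
                   user_domain_id='default', project_domain_id='default')
glance = Client('2', session=session.Session(auth=auth))

# In the v2 API, kernel_id and ramdisk_id are top-level string attributes,
# not nested under 'properties' as in v1.
image = glance.images.create(
    name='my-image',
    disk_format='qcow2',
    container_format='bare',
    kernel_id='d84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
    ramdisk_id='cfef18e0-006e-477a-a098-593d43435a1e',
)
glance.images.upload(image.id, open('my-image.qcow2', 'rb'))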