Airflow - Druid Operator is not getting host

I have a question about the Druid Operator. I see that this test passes, but I get this error:
File "/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 792, in get_adapter
    raise InvalidSchema(f"No connection adapters were found for {url!r}")
My DAG looks like this:
DRUID_CONN_ID = "druid_ingest_conn_id"

ingestion = DruidOperator(
    task_id='ingestion',
    druid_ingest_conn_id=DRUID_CONN_ID,
    json_index_file='ingestion.json'
)
I also tried variations of the DAG, but I get the same error.
As another step, I switched to a SimpleHttpOperator, but then I have a different error:
ingestion_2 = SimpleHttpOperator(
    task_id='test_task',
    method='POST',
    http_conn_id=DRUID_CONN_ID,
    endpoint='/druid/indexer/v1/task',
    data=json.dumps(read_file),
    dag=dag,
    do_xcom_push=True,
    headers={
        'Content-Type': 'application/json'
    },
    response_check=lambda response: response.json()['Status'] == 200,
)
{"error":"Missing type id when trying to resolve subtype of [simple type, class org.apache.druid.indexing.common.task.Task]: missing type id property 'type'\n at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 1]"}
Finally, I tried giving an HTTP connection to the DruidOperator, but then I get an error like this:
raise AirflowException(f'Did not get 200 when submitting the Druid job to {url}')
So I am confused and could use some help. Thanks for any answers.
P.S.: We use Airflow version 2.3.3.
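A note that may help narrow down the first traceback: requests raises InvalidSchema when the URL it receives does not start with a scheme it has an adapter for, i.e. http:// or https://. That points at the URL the operator builds from the druid_ingest_conn_id connection being incomplete (missing scheme or host), rather than at the ingestion spec itself. A minimal illustration of what requests will accept:

import requests

# A requests Session only mounts adapters for these URL prefixes;
# anything else (e.g. a URL built from an empty connection host)
# fails with InvalidSchema, as in the traceback above.
session = requests.Session()
print(list(session.adapters))  # ['https://', 'http://']

So it is worth checking that the Druid connection actually defines a host, a port, and an http/https scheme.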

Related

How to get the reason for failure using Slack in Airflow 2.0

How can I get the reason for the failure of an operator without going into the logs? I want to post the reason as a notification through Slack.
Thanks,
Xi
I can think of one way of doing this, as below.
Set error notifications: https://www.astronomer.io/guides/error-notifications-in-airflow/
Also create a Slack email alias for DMs: https://slack.com/help/articles/206819278-Send-emails-to-Slack
The other way is using the Slack API from Airflow: https://medium.com/datareply/integrating-slack-alerts-in-airflow-c9dcd155105
Check the above for SlackAPIPostOperator.
context.get('exception') is what gives the exact reason for the failure.
Example of an on_failure_callback using Slack:
step_checker = EmrStepSensor(
    task_id='watch_step',
    job_flow_id="{{ task_instance.xcom_pull('create_job_flow', key='return_value') }}",
    step_id="{{ task_instance.xcom_pull(task_ids='add_steps', key='return_value')[0] }}",
    aws_conn_id='aws_default',
    on_failure_callback=task_fail_slack_alert,
)
# Airflow 2.x import paths for the hook and operator used below
# (the operator requires the apache-airflow-providers-slack package)
from airflow.hooks.base import BaseHook
from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator

def task_fail_slack_alert(context):
    SLACK_CONN_ID = 'slack'
    slack_webhook_token = BaseHook.get_connection(SLACK_CONN_ID).password
    slack_msg = """
    :red_circle: Task Failed.
    *Task*: {task}
    *Dag*: {dag}
    *Execution Time*: {exec_date}
    *Log Url*: {log_url}
    *Error*: {exception}
    """.format(
        task=context.get('task_instance').task_id,
        dag=context.get('task_instance').dag_id,
        exec_date=context.get('execution_date'),
        log_url=context.get('task_instance').log_url,
        exception=context.get('exception'),
    )
    failed_alert = SlackWebhookOperator(
        task_id='slack_test',
        http_conn_id='slack',
        webhook_token=slack_webhook_token,
        message=slack_msg,
        username='airflow',
        dag=dag,
    )
    return failed_alert.execute(context=context)
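A small addition beyond the answer above: if you want the same alert for every task in a DAG instead of attaching it to each operator, the callback can also be registered once through default_args, since those kwargs are passed to every operator's constructor. A minimal sketch, with an illustrative DAG id:

from datetime import datetime

from airflow import DAG

# Hypothetical DAG applying task_fail_slack_alert (defined above)
# to all of its tasks via default_args.
with DAG(
    dag_id='example_with_slack_alerts',
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    default_args={'on_failure_callback': task_fail_slack_alert},
) as dag:
    ...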

Airflow Error callback "on_failure_callback" is not executing all the lines in the function

I have a problem with the usage of on_failure_callback. I have defined my error callback function to perform two HTTP POST requests, and I have added a logging.error() message between the two. I notice that only one is getting executed. Is there some delay or something that I am missing here?
Please help.
def custom_failure_function(context):
    logging.error("These task instances ahhh")
    to_json = json.loads(t_teams)
    var1 = json.dumps(to_json)
    print(var1)
    r = requests.post('https://myteamschannel/teams', data=var1, verify=False)
    logging.error("hello")
    runID = 'OPERATION_CONTEXT .OCV8.TEST2 alarm_object 193'
    headers = {'Content-Type': 'text/xml'}
    alarmRequest = '<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><soapenv:Header xmlns:wsa=\"http://www.w3.org/2005/08/addressing\"><wsu:Timestamp xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsu:Created>2014-05-22T11:57:38.267Z</wsu:Created><wsu:Expires>2014-05-22T12:02:38.000Z</wsu:Expires></wsu:Timestamp><wsse:Security xmlns:wsse=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd\" soapenv:mustUnderstand=\"1\"><wsse:UsernameToken xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsse:Username>girws</wsse:Username><wsse:Password Type=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText\">Temip</wsse:Password></wsse:UsernameToken></wsse:Security></soapenv:Header> <soapenv:Body> <oper:Set_Request xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><EntitySpec><Natural> '+ runID + '</Natural></EntitySpec><Arguments> <Attribute_Values><Filtering_Type>' + 'AUTOFAIL' + '</Filtering_Type></Attribute_Values></Arguments></oper:Set_Request> </soapenv:Body> </soapenv:Envelope>'
    r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object', header=headers, data=alarmRequest, verify=True)
    logging.error("FAILED TASK")
    logging.error("============================================")
The logs of my Airflow run are below. It's stopping at the "hello" message and not printing "FAILED TASK".
*** Reading local file: /data/airflow//logs/MOE_TEST_DAG/TeamsTest/2021-10-02T08:24:14.821970+00:00/3.log
[2021-10-02 10:24:36,535] {MOE_TEST.py:132} ERROR - These task instances ahhh
[2021-10-02 10:24:36,987] {MOE_TEST.py:138} ERROR - hello
From your description it's more likely that there is an issue with requests.post(). Try adding a timeout to the request:
def custom_failure_function(context):
    ...
    try:
        # note: the keyword must be headers=, not header= as in the question
        r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object',
                          headers=headers, data=alarmRequest, verify=True, timeout=5)
    except requests.Timeout:
        logging.error("request timeout")
    except requests.ConnectionError:
        logging.error("request connection error")
    logging.error("FAILED TASK")
    logging.error("============================================")

Failed when parsing body as json - calling the Rasa API with requests.post()

I'm writing a function to call the Rasa API for intent prediction. Here is my code:
import requests

def run_test():
    url = "http://localhost:5005/model/parse"
    obj = {"text": "What is your name?"}
    response = requests.post(url, data=obj)
    print(response.json())
I also start the Rasa server with this command: rasa run -m models --enable-api --cors "*" --debug
And here is what I got from the Rasa server terminal:
In the terminal where I executed run_test(), I got this result:
{'version': '2.7.1', 'status': 'failure', 'message': 'An unexpected error occurred. Error: Failed when parsing body as json', 'reason': 'ParsingError', 'details': {}, 'help': None, 'code': 500}
Can anybody help me solve this problem? Thank you very much!
I found the solution:
You only need to json.dumps() the object, because a dict in Python is not the same thing as a JSON string.
import json
import requests

def run_test():
    url = "http://localhost:5005/model/parse"
    obj = {"text": "What is your name?"}
    response = requests.post(url, data=json.dumps(obj))
    print(response.json())
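As a side note beyond the fix above: requests can do this serialization for you via its json= parameter, which also sets the Content-Type: application/json header automatically:

import requests

# Equivalent call against the same local Rasa endpoint, letting
# requests serialize the dict and set the JSON content type.
response = requests.post("http://localhost:5005/model/parse",
                         json={"text": "What is your name?"})
print(response.json())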

Pass information between two SimpleHttpOperators with xcom_pull()

I am fairly new to Airflow and I am currently trying to pass information between my SimpleHttpOperators.
This is where the data is retrieved:
request_city_information = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='basic_city_information',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    data=f'[out:json]; node[name={name_city}][capital]; out center;',
    response_filter=lambda response: response.json()['elements'][0],
    dag=dag,
)
And then I want to use the response from this in the following operator:
request_city_attractions = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='city_attractions',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    data=f"[out:json];(nwr[tourism='attraction'][wikidata](around:{search_radius},"
         f"{request_city_information.xcom_pull(context='ti')['lat']}"
         f",10););out body;>;out skel qt;",
    dag=dag,
)
As you can see, I tried to access the response via request_city_information.xcom_pull(context='ti'). However, my context seems to be wrong here.
As my data is already written into XCom, I take it that I don't need XCOM_push='True', as suggested here.
There seem to be changes to XCom since Airflow 2.x, as many of the suggested solutions I found do not work for me.
I believe there is a major gap in my thought process; I just don't know where.
I would appreciate any references to examples, or any other help!
Thanks in advance
I have now solved it with a completely different approach. If you know how the first one could work, I would be happy to get an explanation for that.
Here is my solution:
# imports added for completeness (Airflow 2.x paths)
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    'city_info',
    default_args=dafault_args,
    description='xcom test',
    schedule_interval=None,
) as dag:

    # TODO: Tasks with conn_id
    def get_city_information(**kwargs):
        payload = f'[out:json]; node[name={name_city}][capital]; out center;'
        # TODO: Request as a Connection
        r = requests.post('https://overpass-api.de/api/interpreter', data=payload)
        ti = kwargs['ti']
        ti.xcom_push('basic_city_information', r.json())

    get_city_information_task = PythonOperator(
        task_id='get_city_information_task',
        python_callable=get_city_information
    )

    def get_city_attractions(**kwargs):
        ti = kwargs['ti']
        city_information = ti.xcom_pull(task_ids='get_city_information_task', key='basic_city_information')
        payload = f"[out:json];(nwr[tourism='attraction'][wikidata](around:{search_radius}" \
                  f",{city_information['elements'][0]['lat']},{city_information['elements'][0]['lon']}" \
                  f"););out body;>;out skel qt;"
        r = requests.post('https://overpass-api.de/api/interpreter', data=payload)
        # TODO: JSON as object
        ti.xcom_push('city_attractions', r.json())

    get_city_attractions_task = PythonOperator(
        task_id='get_city_attractions_task',
        python_callable=get_city_attractions
    )

    get_city_information_task >> get_city_attractions_task
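As for why the first approach failed: xcom_pull was being called on the operator object while the DAG file was parsed, before any task had run, so there was no real task-instance context to pull from. Since data is a templated field on SimpleHttpOperator, one way the original design could work, sketched here untested, is to move the pull into a Jinja template so it is evaluated at execution time; the response_filter in the first task already stores the selected element as that task's return value:

# Hypothetical rewrite of the second operator: the XCom pull happens
# inside Jinja at run time instead of at DAG-parse time.
request_city_attractions = SimpleHttpOperator(
    http_conn_id='overpass',
    task_id='city_attractions',
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method='POST',
    data="[out:json];(nwr[tourism='attraction'][wikidata]"
         "(around:{{ params.radius }},"
         "{{ ti.xcom_pull(task_ids='basic_city_information')['lat'] }},"
         "{{ ti.xcom_pull(task_ids='basic_city_information')['lon'] }}"
         "););out body;>;out skel qt;",
    params={'radius': 1000},
    dag=dag,
)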

OpenStack SDK - How to create an image with kernel id and ramdisk parameters?

I've been trying to create an OpenStack image informing the Kernel Id and Ramdisk Id, using the OpenStack Unified SDK (https://github.com/openstack/python-openstacksdk), but without success. I know this is possible, because the OpenStack CLI has these parameters, as shown on this page (http://docs.openstack.org/cli-reference/glance.html#glance-image-create), where the CLI has the "--kernel-id" and "--ramdisk-id" parameters. I've used these parameters in the terminal and confirmed they work, but I need to use them in Python.
I'm trying to use the upload_image method, as described here: http://developer.openstack.org/sdks/python/openstacksdk/users/proxies/image.html#image-api-v2, but I can't get the attrs parameter right. The documentation only says it is supposed to be a dictionary. Here is the code I'm using:
...
atrib = {
    'properties': {
        'kernel_id': 'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
        'ramdisk_id': 'cfef18e0-006e-477a-a098-593d43435a1e'
    }
}

with open(file) as fimage:
    image = image_service.upload_image(
        name=name,
        data=fimage,
        disk_format='qcow2',
        container_format='bare',
        **atrib)
...
And here is the error I'm getting:
File "builder.py", line 121, in main
    **atrib
File "/usr/lib/python2.7/site-packages/openstack/image/v2/_proxy.py", line 51, in upload_image
    **attrs)
File "/usr/lib/python2.7/site-packages/openstack/proxy2.py", line 193, in _create
    return res.create(self.session)
File "/usr/lib/python2.7/site-packages/openstack/resource2.py", line 570, in create
    json=request.body, headers=request.headers)
File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 675, in post
    return self.request(url, 'POST', **kwargs)
File "/usr/lib/python2.7/site-packages/openstack/session.py", line 52, in map_exceptions_wrapper
    http_status=e.http_status, cause=e)
openstack.exceptions.HttpException: HttpException: Bad Request, 400 Bad Request
Provided object does not match schema 'image': {u'kernel_id': u'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1', u'ramdisk_id': u'cfef18e0-006e-477a-a098-593d43435a1e'} is not of type 'string'
Failed validating 'type' in schema['additionalProperties']: {'type': 'string'}
On instance[u'properties']: {u'kernel_id': u'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1', u'ramdisk_id': u'cfef18e0-006e-477a-a098-593d43435a1e'}
I already tried to use the update_image method, but without success. Passing the kernel id and ramdisk id as strings creates the image, but it does not boot.
Does anyone knows how to solve this?
What version of the Glance API do you use?
I have read the code in openstackclient/image/v1/images.py and openstackclient/v1/shell.py:
## in shell.py
def do_image_create(gc, args):
    ...
    fields = dict(filter(lambda x: x[1] is not None, vars(args).items()))
    raw_properties = fields.pop('property')
    fields['properties'] = {}
    for datum in raw_properties:
        key, value = datum.split('=', 1)
        fields['properties'][key] = value
    ...
    image = gc.images.create(**fields)

## in images.py
def create(self, **kwargs):
    ...
    for field in kwargs:
        if field in CREATE_PARAMS:
            fields[field] = kwargs[field]
        elif field == 'return_req_id':
            continue
        else:
            msg = 'create() got an unexpected keyword argument \'%s\''
            raise TypeError(msg % field)
    hdrs = self._image_meta_to_headers(fields)
    ...
    resp, body = self.client.post('/v1/images',
                                  headers=hdrs,
                                  data=image_data)
    ...
and in openstackclient/v2/shell.py and openstackclient/image/v2/images.py (and I have debugged this too):
## in shell.py
def do_image_create(gc, args):
    ...
    raw_properties = fields.pop('property', [])
    for datum in raw_properties:
        key, value = datum.split('=', 1)
        fields[key] = value
    ...
    image = gc.images.create(**fields)

## in images.py
def create(self, **kwargs):
    """Create an image."""
    url = '/v2/images'
    image = self.model()
    for (key, value) in kwargs.items():
        try:
            setattr(image, key, value)
        except warlock.InvalidOperation as e:
            raise TypeError(utils.exception_to_str(e))
    resp, body = self.http_client.post(url, data=image)
    ...
It seems that you can create an image your way in version 1.0, but in version 2.0 you should pass kernel_id and ramdisk_id as below:
atrib = {
    'kernel_id': 'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
    'ramdisk_id': 'cfef18e0-006e-477a-a098-593d43435a1e'
}
However, the OpenStack SDK does not seem to be able to pass those two arguments through to the request (there is no corresponding body definition in openstack/image/v2/image.py), so you would have to modify the OpenStack SDK to support this.
BTW, the OpenStack code differs a little between versions, but many things are the same.
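For illustration, here is a minimal sketch of what that v2-style call would look like, with the attributes at the top level instead of nested under 'properties'. It reuses the image_service proxy and variables from the question, is untested, and, as noted above, the SDK may still drop the two attributes unless it is patched:

# Hypothetical v2-style upload: kernel_id/ramdisk_id passed as
# top-level attributes rather than inside a 'properties' dict.
atrib = {
    'kernel_id': 'd84e1f2b-8d8c-4a4a-8858-77a8d5a93cb1',
    'ramdisk_id': 'cfef18e0-006e-477a-a098-593d43435a1e',
}

with open(file, 'rb') as fimage:
    image = image_service.upload_image(
        name=name,
        data=fimage,
        disk_format='qcow2',
        container_format='bare',
        **atrib)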
