Using LocalStack (AWS), how can I listen to an SNS event in Chalice? - amazon-sns

I have a LocalStack setup and I can successfully publish SNS from my Chalice application by pointing it at the endpoint URL like so:
_sns_topic = session.resource('sns', endpoint_url='http://localhost:4566')
However, one of our endpoints needs to listen to this SNS event locally, and it is currently set up using @app.on_sns_message():
@app.on_sns_message(topic='topicNameHere')
def onSnsEvent(event) -> None:
    # do something
I checked the documentation but I can't seem to find how to point this handler so that it listens to local/LocalStack events instead.
Any tips and/or ideas?

I was able to get this running on my side by following the official Chalice docs and using LocalStack's recommended Chalice wrapper. Here are the steps:
Start LocalStack: localstack start -d
Create an SNS topic: awslocal sns create-topic --name my-demo-topic --region us-east-1 --output table | cat (I am using my-demo-topic).
Install chalice-local: pip3 install chalice-local.
Create a new project: chalice-local new-project chalice-demo-sns
Keep the following code in app.py inside chalice-demo-sns:
from chalice import Chalice

app = Chalice(app_name='chalice-sns-demo')
app.debug = True

@app.on_sns_message(topic='my-demo-topic')
def handle_sns_message(event):
    app.log.debug("Received message with subject: %s, message: %s",
                  event.subject, event.message)
Deploy it: chalice-local deploy
Use boto3 to publish messages on SNS:
$ python
>>> import boto3
>>> endpoint_url = "http://localhost.localstack.cloud:4566"
>>> sns = boto3.client('sns', endpoint_url=endpoint_url)
>>> topic_arn = [t['TopicArn'] for t in sns.list_topics()['Topics']
... if t['TopicArn'].endswith(':my-demo-topic')][0]
>>> sns.publish(Message='TestMessage1', Subject='TestSubject1',
... TopicArn=topic_arn)
{'MessageId': '12345', 'ResponseMetadata': {}}
>>> sns.publish(Message='TestMessage2', Subject='TestSubject2',
... TopicArn=topic_arn)
{'MessageId': '54321', 'ResponseMetadata': {}}
Check the logs: chalice-local logs -n handle_sns_message
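As an optional sanity check, you can also confirm with plain boto3 that chalice-local deploy actually subscribed the Lambda to the topic; this assumes the same LocalStack endpoint and topic name as above:
import boto3

# Point boto3 at LocalStack, same endpoint as in the publish example above.
endpoint_url = "http://localhost.localstack.cloud:4566"
sns = boto3.client('sns', endpoint_url=endpoint_url)

topic_arn = [t['TopicArn'] for t in sns.list_topics()['Topics']
             if t['TopicArn'].endswith(':my-demo-topic')][0]

# After a successful deploy there should be a 'lambda' subscription whose
# Endpoint is the ARN of the handle_sns_message function.
for sub in sns.list_subscriptions_by_topic(TopicArn=topic_arn)['Subscriptions']:
    print(sub['Protocol'], sub['Endpoint'])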

Related

Can't access the FastAPI page using the public IPv4 address of the deployed AWS EC2 instance with the uvicorn service running

I was testing a simple FastAPI backend by deploying it on an AWS EC2 instance. The service runs fine on the default port 8000 on the local machine. When I ran the script on the EC2 instance with
uvicorn main:app --reload it also started fine, with the following output:
INFO: Will watch for changes in these directories: ['file/path']
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [4698] using StatReload
INFO: Started server process [4700]
INFO: Waiting for application startup.
INFO: Application startup complete.
Then, in the EC2 security group configuration, TCP traffic on port 8000 was allowed, as shown in the image below.
ec2 security group port detail
Then, to test and access the service, I opened the public IPv4 address with the port, https://ec2-public-ipv4-ip:8000/, in Chrome.
But there is no response whatsoever.
The webpage is as below:
result webpage
The error in the console is as below:
VM697:6747 crbug/1173575, non-JS module files deprecated.
(anonymous) # VM697:6747
The FastAPI main file contains:
from fastapi import FastAPI, Form, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi.encoders import jsonable_encoder
import joblib
import numpy as np
import os
from own.preprocess import Preprocess
import sklearn

col_data = joblib.load("col_bool_mod.z")

app = FastAPI()

@app.get("/predict")
async def test():
    return jsonable_encoder(col_data)

@app.post("/predict")
async def provide(data: list):
    print(data)
    output = main(data)
    return output

def predict_main(df):
    num_folds = len(os.listdir("./models/"))
    result_li = []
    for fold in range(num_folds):
        print(f"predicting for fold {fold} / {num_folds}")
        model = joblib.load(f"./models/tabnet/{fold}_tabnet_reg_adam/{fold}_model.z")
        result = model.predict(df)
        print(result)
        result_li.append(result)
    return np.mean(result_li)

def main(data):
    df = Preprocess(data)
    res = predict_main(df)
    print(res)
    return {"value": f"{np.float64(res).item():.3f}" if res >= 0 else f"{np.float64(0).item()}"}
The service runs fine with the same steps for the React frontend on port 3000, but FastAPI on port 8000 is somehow not working.
Thank you for your patience.
I wanted to retrieve basic API responses from the FastAPI/uvicorn server deployed on an AWS EC2 instance, but there is no response even with port 8000 open and the EC2 instance reachable on its public IPv4 address.
One way the problem is fixed is by adding the public IPv4 address followed by port 3000 to the CORS origins. But the remaining issue is hiding the GET request data shown in the browser when the API is accessed directly on port 8000.
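For reference, here is a minimal sketch of the two settings discussed above, assuming the React frontend is served from port 3000 and using a placeholder for the public IP: the startup log shows uvicorn listening on 127.0.0.1, which is only reachable from the instance itself, so the server needs to bind to 0.0.0.0 to be reachable on the public IPv4 address, and the CORS middleware can then be restricted to the frontend origin.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
import uvicorn

app = FastAPI()

# Allow only the React frontend origin (placeholder public IP) to call the API.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://ec2-public-ipv4-ip:3000"],
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/predict")
async def test():
    return {"status": "ok"}

if __name__ == "__main__":
    # Bind to all interfaces; the log above shows uvicorn on 127.0.0.1 only,
    # which is not reachable from outside the instance.
    uvicorn.run(app, host="0.0.0.0", port=8000)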

SFTP with Google Cloud Composer

I need to upload a file via SFTP into an external server through Cloud Composer. The code for the task is as follows:
from airflow import DAG
from airflow.operators.python_operator import PythonVirtualenvOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime, timedelta

def make_sftp():
    import paramiko
    import pysftp
    import os
    from airflow.contrib.hooks.ssh_hook import SSHHook
    import subprocess
    ssh_hook = SSHHook(ssh_conn_id="conn_id")
    sftp_client = ssh_hook.get_conn().open_sftp()
    return 0

etl_dag = DAG("dag_test",
              start_date=datetime.now(tz=local_tz),
              schedule_interval=None,
              default_args={
                  "owner": "airflow",
                  "depends_on_past": False,
                  "email_on_failure": False,
                  "email_on_retry": False,
                  "retries": 5,
                  "retry_delay": timedelta(minutes=5)})

sftp = PythonVirtualenvOperator(task_id="sftp",
                                python_callable=make_sftp,
                                requirements=["sshtunnel", "paramiko"],
                                dag=etl_dag)

start_pipeline = DummyOperator(task_id="start_pipeline", dag=etl_dag)

start_pipeline >> sftp
In "conn_id" I have used the following options: {"no_host_key_check": "true"}. The DAG runs for a couple of seconds and then fails with the following message:
WARNING - Remote Identification Change is not verified. This wont protect against Man-In-The-Middle attacks
[2022-02-10 10:01:59,358] {ssh_hook.py:171} WARNING - No Host Key Verification. This wont protect against Man-In-The-Middle attacks
Traceback (most recent call last):
  File "/tmp/venvur4zvddz/script.py", line 23, in <module>
    res = make_sftp(*args, **kwargs)
  File "/tmp/venvur4zvddz/script.py", line 19, in make_sftp
    sftp_client = ssh_hook.get_conn().open_sftp()
  File "/usr/local/lib/airflow/airflow/contrib/hooks/ssh_hook.py", line 194, in get_conn
    client.connect(**connect_kwargs)
  File "/opt/python3.6/lib/python3.6/site-packages/paramiko/client.py", line 412, in connect
    server_key = t.get_remote_server_key()
  File "/opt/python3.6/lib/python3.6/site-packages/paramiko/transport.py", line 834, in get_remote_server_key
    raise SSHException("No existing session")
paramiko.ssh_exception.SSHException: No existing session
Do I have to set other options? Thank you!
Configuring the SSH connection with key pair authentication
To SSH into the host as a user with username “user_a”, an SSH key pair should be generated for that user and the public key should be added to the host machine. The following are the steps that create an SSH connection to the “user_a” user, which has the required write permissions.
Run the following commands on the local machine to generate the required SSH key:
ssh-keygen -t rsa -f ~/.ssh/sftp-ssh-key -C user_a
“sftp-ssh-key” → Name of the pair of public and private keys (Public key: sftp-ssh-key.pub, Private key: sftp-ssh-key)
“user_a” → User in the VM that we are trying to connect to
chmod 400 ~/.ssh/sftp-ssh-key
Now, copy the contents of the public key sftp-ssh-key.pub into ~/.ssh/authorized_keys of your host system. Check for necessary permissions for authorized_keys and grant them accordingly using chmod.
I tested the setup with a Compute Engine VM. In the Compute Engine console, edit the VM settings to add the contents of the generated SSH public key into the instance metadata. Detailed instructions can be found here. If you are connecting to a Compute Engine VM, make sure that the instance has the appropriate firewall rule to allow the SSH connection.
Upload the private key to the client machine. In this scenario, the client is the Airflow DAG so the key file should be accessible from the Composer/Airflow environment. To make the key file accessible, it has to be uploaded to the GCS bucket associated with the Composer environment. For example, if the private key is uploaded to the data folder in the bucket, the key file path would be /home/airflow/gcs/data/sftp-ssh-key.
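Before wiring the key into Airflow, a small paramiko sketch can confirm that the generated key actually works; the hostname below is a placeholder for the external server, and the key path is the Composer data-folder path from the previous step:
import paramiko

host = "external-server-hostname"  # placeholder for the external SFTP server

client = paramiko.SSHClient()
# Skip host key verification for this one-off test only.
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, username="user_a",
               key_filename="/home/airflow/gcs/data/sftp-ssh-key")

sftp = client.open_sftp()
print(sftp.listdir("."))  # list the remote home directory to confirm access
sftp.close()
client.close()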
Configuring the SSH connection with password authentication
If password authentication is not configured on the host machine, follow the below steps to enable password authentication.
Set the user password using the below command and enter the new password twice.
sudo passwd user_a
To enable SSH password authentication, you must SSH into the host machine as root to edit the sshd_config file.
/etc/ssh/sshd_config
Then, change the line PasswordAuthentication no to PasswordAuthentication yes. After making that change, restart the SSH service by running the following command as root.
sudo service ssh restart
Password authentication has been configured now.
Creating connections and uploading the DAG
1.1 Airflow connection with key authentication
Create a connection in Airflow with the below configuration or use the existing connection.
Extra field
The Extra JSON dictionary would look like this. Here, we have uploaded the private key file to the data folder in the Composer environment's GCS bucket.
{
    "key_file": "/home/airflow/gcs/data/sftp-ssh-key",
    "conn_timeout": "30",
    "look_for_keys": "false"
}
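Alternatively, if you prefer to create this connection programmatically rather than through the Airflow UI, a rough sketch could look like the following; the host is a placeholder and the conn_id matches the sftp_connection used in the DAG further below:
import json
from airflow import settings
from airflow.models import Connection

conn = Connection(
    conn_id="sftp_connection",
    conn_type="ssh",
    host="external-server-hostname",  # placeholder for the external server
    login="user_a",
    extra=json.dumps({
        "key_file": "/home/airflow/gcs/data/sftp-ssh-key",
        "conn_timeout": "30",
        "look_for_keys": "false",
    }),
)

session = settings.Session()
session.add(conn)
session.commit()
session.close()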
1.2 Airflow connection with password authentication
If the host machine is configured to allow password authentication, these are the changes to be made in the Airflow connection.
The Extra parameter can be empty.
The Password parameter is the user_a's user password on the host machine.
The task logs show that the password authentication was successful.
INFO - Authentication (password) successful!
Upload the DAG to the Composer environment and trigger the DAG. I was facing a key validation issue with the latest version of the paramiko library (2.9.2). I tried downgrading paramiko, but the older versions do not seem to support OpenSSH keys. I found an alternative, paramiko-ng, in which the validation issue has been fixed, and changed the Python dependency from paramiko to paramiko-ng in the PythonVirtualenvOperator.
from airflow import DAG
from airflow.operators.python_operator import PythonVirtualenvOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime, timedelta

def make_sftp():
    import paramiko
    from airflow.contrib.hooks.ssh_hook import SSHHook

    ssh_hook = SSHHook(ssh_conn_id="sftp_connection")
    sftp_client = ssh_hook.get_conn().open_sftp()
    print("=================SFTP Connection Successful=================")

    remote_host = "/home/sftp-folder/sample_sftp_file"      # file path in the host system
    local_host = "/home/airflow/gcs/data/sample_sftp_file"  # file path in the client system

    sftp_client.get(remote_host, local_host)  # GET operation to copy the file from host to client
    sftp_client.close()
    return 0

etl_dag = DAG("sftp_dag",
              start_date=datetime.now(),
              schedule_interval=None,
              default_args={
                  "owner": "airflow",
                  "depends_on_past": False,
                  "email_on_failure": False,
                  "email_on_retry": False,
                  "retries": 5,
                  "retry_delay": timedelta(minutes=5)})

sftp = PythonVirtualenvOperator(task_id="sftp",
                                python_callable=make_sftp,
                                requirements=["sshtunnel", "paramiko-ng", "pysftp"],
                                dag=etl_dag)

start_pipeline = DummyOperator(task_id="start_pipeline", dag=etl_dag)

start_pipeline >> sftp
Results
The sample_sftp_file has been copied from the host system to the specified Composer bucket.

Boto3 unable to connect to local DynamoDB running in Docker container

I'm at a complete loss. I have a Docker container running DynamoDB locally. From the terminal window, I run:
docker run -p 8010:8000 amazon/dynamodb-local
to start the container. It starts fine. I then run:
aws dynamodb list-tables --endpoint-url http://localhost:8010
to verify that the container and the local instance are working fine. I get:
{
"TableNames": []
}
That's exactly what I expect. It tells me that the aws client can connect to the local DB instance properly.
Now the problem: I open a Python shell and type:
import boto3
db = boto3.client('dynamodb', region_name='us-east-1', endpoint_url='http://localhost:8010', use_ssl=False, aws_access_key_id='my_secret_key', aws_secret_access_key='my_secret_access_key', verify=False)
print(db.list_tables())
I get a ConnectionRefusedError. I have tried the connection with and without the secret keys, with and without use_ssl and verify, and nothing works. At this point I'm thinking it must be a bug with boto3. What am I missing?
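One thing worth noting is that ConnectionRefusedError is a network-level failure rather than an authentication problem, so a quick check that leaves boto3 out of the picture is to open a raw socket to the same host and port used in the docker run command above:
import socket

# Try to reach the port that Docker maps to the DynamoDB Local container.
try:
    with socket.create_connection(("localhost", 8010), timeout=2):
        print("localhost:8010 is reachable from this Python process")
except OSError as exc:
    print(f"Cannot reach localhost:8010 from here: {exc}")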

Airflow metrics with prometheus and grafana

Does anyone know how to send metrics from Airflow to Prometheus? I'm not finding many documents about it. I tried the Airflow operator metrics dashboard in Grafana, but it doesn't show any metrics; all it says is "no data points".
By default, Airflow doesn't have any support for Prometheus metrics. There are two ways I can think of to get metrics in Prometheus.
Enable StatsD metrics and then export it to Prometheus using statsd exporter.
Install third-party/open-source Prometheus exporter agents (ex. airflow-exporter).
If you are going with the 2nd approach, the Airflow Helm Chart also provides support for that.
Edit
If you're using statsd exporter, here is a good resource for the Grafana dashboard and exporter config.
This is how it worked for me:
Running Airflow in Docker using this doc
Added this configuration under the environment section of the docker-compose file downloaded in the previous step:
AIRFLOW__SCHEDULER__STATSD_ON: 'true'
AIRFLOW__SCHEDULER__STATSD_HOST: statsd-exporter
AIRFLOW__SCHEDULER__STATSD_PORT: 9125
AIRFLOW__SCHEDULER__STATSD_PREFIX: airflow
Now run the statsd_exporter:
docker run -d -p 9102:9102 -p 9125:9125 -p 9125:9125/udp \
    -v $PWD/statsd_mapping.yml:/tmp/statsd_mapping.yml \
    prom/statsd-exporter --statsd.mapping-config=/tmp/statsd_mapping.yml
Get the statsd_mapping.yml contents from Here
Now do docker-compose up to run Airflow, trigger some workflow, and you should see the metrics at http://localhost:9102/metrics
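As a purely illustrative check from Python, you can fetch that endpoint and look for metric names carrying the airflow prefix configured above:
import requests

# Fetch the Prometheus-format metrics exposed by statsd_exporter.
resp = requests.get("http://localhost:9102/metrics", timeout=5)
airflow_metrics = [line for line in resp.text.splitlines()
                   if line.startswith("airflow")]
print("\n".join(airflow_metrics[:20]))  # show the first few airflow metrics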
If you installed your Airflow with statsd support:
pip install 'apache-airflow[statsd]'
you can enable Airflow's statsd metrics in the [scheduler] section of your airflow.cfg file, something like this:
[scheduler]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
Then, you can install a tool called statsd_exporter, which captures statsd-format metrics and converts them to Prometheus format, making them available at the /metrics endpoint for Prometheus to scrape.
There is a docker image available on DockerHub called astronomerinc/ap-statsd-exporter that already maps Airflow statsd metrics to Prometheus metrics.
References:
https://airflow.apache.org/docs/stable/metrics.html
https://github.com/prometheus/statsd_exporter
https://hub.docker.com/r/astronomerinc/ap-statsd-exporter/tags

Apache airflow REST API call fails with 403 forbidden when API authentication is enabled

Apache Airflow REST API fails with 403 forbidden for the call:
"/api/experimental/test"
Configuration in airflow.cfg
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
[api]
rbac = True
auth_backend = airflow.contrib.auth.backends.password_auth
After setting all this, the Docker image is built and run as a container.
Created the airflow user as follows:
airflow create_user -r Admin -u admin -e admin@hpe.com -f Administrator -l 1 -p admin
Login with credentials for Web UI works fine.
Whereas logging in to the REST API is not working.
HTTP Header for authentication:
Authorization BASIC YWRtaW46YWRtaW4=
Airflow version: 1.10.9
By creating the user in the following manner, we can access the Airflow experimental API using credentials.
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser

user = PasswordUser(models.User())
user.username = 'new_user_name'
user.email = 'new_user_email@example.com'
user.password = 'set_the_password'

session = settings.Session()
session.add(user)
session.commit()
session.close()
exit()
When the user is created with the "airflow create_user" command, we cannot access the Airflow experimental APIs.
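With a user created this way, the experimental endpoint from the question can then be called with basic auth, for example via requests (the host and port below are placeholders for wherever the webserver is exposed):
import requests

base_url = "http://localhost:8080"  # placeholder for the Airflow webserver address

resp = requests.get(
    f"{base_url}/api/experimental/test",
    auth=("new_user_name", "set_the_password"),  # the credentials created above
)
print(resp.status_code, resp.text)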
