Task not setting variables properly? - volt

Having some issues getting a variable passed to the client. The task that gets run is this:
def getmaps
  $mapnames_current = []
  url = 'http://s3-ap-northeast-1.amazonaws.com/splatoon-data.nintendo.net/stages_info.json'
  resp = Net::HTTP.get_response(URI.parse(url))
  buffer = resp.body
  result = JSON.parse(buffer)
  result.each do |gamemode|
    gamemode['stages'].each do |stage|
      $mapnames_current << $mapnames.key(stage['id'])
    end
  end
end
and $mapnames_current gets called here:
<div class="text-center">
<h2>Current Maps!</h2>
<h3>Turf War:</h3>
<h2>{{ $mapnames_current[0] }}</h2>
<h2>{{ $mapnames_current[1] }}</h2>
<h3>Ranked:</h3>
<h2>{{ $mapnames_current[2] }}</h2>
<h2>{{ $mapnames_current[3] }}</h2>
</div>
I'm not sure what's going wrong here. Shouldn't the $mapnames_current variable be accessible everywhere?

Global variables aren't synced between the client and the server. The only thing passed back to the client from a task is the task's return value, which should be something that can be serialized to JSON (or a date type).
There's a good screencast on tasks here: http://datamelon.io/blog/2015/creating-volt-task-objects.html

Related

Accessing task_instance or ti via simpleHttpOperator to do an xcom push

TLDR
In the python callable for a simpleHttpOperator response function, I am trying to push an xcom that combines information from two sources under a specified key (a hash of the filename/path and an object lookup from a DB).
Longer Tale
I have a filesensor written which grabs all new files and passes them to MultiDagRun to parallel process the information (scientific) in the files as xcom. Works great. The simpleHttpOperator POSTs filepath info to a submission api and receives back a task_id which it must then read as a response from another (slow running) api to get the result. This I all have working fine. Files get scanned, it launches multiple dags to process, and returns objects.
But... I cannot puzzle out how to push the result to an xcom inside the python response function for the simpleHttpOperator.
My Google-, SO-, and Reddit-fu has failed me here (and it seems overkill to use the pythonOperator, though that's my next stop). I notice a lot of people asking similar questions, though.
How do you use context or ti or task_instance or context['task_instance'] with the response function? (I cannot use the "Returned Value" xcom because, as far as I know, I need distinct xcom keys for the parallel processing.) As the default, I have context set to true in the default_args.
I'm sure I am missing something simple here, but I'm stumped as to what it is. (Note: I did try **kwargs and ti = kwargs['ti'] in the code below as well before hitting SO.)
def _handler_object_result(response, file):
    # Note: related to api I am calling not Airflow internal task ids
    header_result = response.json()
    task_id = header_result["task"]["id"]
    api = "https://redacted.com/api/task/result/{task_id}".format(task_id=task_id)
    resp = requests.get(api, verify=False).json()
    data = json.loads(resp["data"])
    file_object = json.dumps(data["OBJECT"])
    file_hash = hash(file)
    # This is the part that is not working as I am unsure how
    # to access the task instance to do the xcom_push
    ti.xcom_push(key=file_hash, value=file_object)
    if ti.xcom_pull(key=file_hash):
        return True
    else:
        return False
and the Operator:
object_result = SimpleHttpOperator(
    task_id="object_result",
    method='POST',
    data=json.dumps({"file": "{{ dag_run.conf['file'] }}", "keyword": "object"}),
    http_conn_id="coma_api",
    endpoint="/api/v1/file/describe",
    headers={"Content-Type": "application/json"},
    extra_options={"verify": False},
    response_check=lambda response: _handler_object_result(response, "{{ dag_run.conf['file'] }}"),
    do_xcom_push=False,
    dag=dag,
)
I was really expecting the task_instance object to be available in some fashion, either by default or through configuration, but each variation that has worked elsewhere (filesensor, pythonOperator, etc.) hasn't worked here, and I have been unable to google the magic words to make it accessible.
You can try using the get_current_context() function in your response_check function:
from airflow.operators.python import get_current_context

def _handler_object_result(response, file):
    # Note: related to api I am calling not Airflow internal task ids
    header_result = response.json()
    task_id = header_result["task"]["id"]
    api = "https://redacted.com/api/task/result/{task_id}".format(task_id=task_id)
    resp = requests.get(api, verify=False).json()
    data = json.loads(resp["data"])
    file_object = json.dumps(data["OBJECT"])
    file_hash = hash(file)
    ti = get_current_context()["ti"]  # <- Try this
    ti.xcom_push(key=file_hash, value=file_object)
    if ti.xcom_pull(key=file_hash):
        return True
    else:
        return False
That function is a nice way of still accessing the task's execution context when context isn't explicitly handy or you don't want to pass context attrs around to access it deep in your logic stack.
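One caveat about the key itself, offered as a sketch rather than a confirmed fix: Python's built-in hash() returns an int, and for strings it is salted per interpreter process unless PYTHONHASHSEED is fixed, so another task process would generally compute a different value for the same path. XCom keys are strings, so a deterministic digest may be a safer key. The helper name xcom_key_for and the "file_" prefix below are hypothetical.

```python
import hashlib


def xcom_key_for(file_path: str, prefix: str = "file_") -> str:
    """Return a deterministic string key for a file path.

    Unlike the built-in hash(), a SHA-256 digest is identical in every
    process, so the task that pulls can recompute the key used to push.
    """
    digest = hashlib.sha256(file_path.encode("utf-8")).hexdigest()
    # 16 hex characters keep the key short; use the full digest if
    # collisions are a concern.
    return prefix + digest[:16]


key = xcom_key_for("/data/incoming/sample_001.dat")
print(key)
```

Inside the handler this would replace file_hash = hash(file), for example ti.xcom_push(key=xcom_key_for(file), value=file_object).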

Fastapi how to add variable to all router with TemplateResponse

I use Jinja templates and return data to the HTML this way:
@router.get('/', response_class=HTMLResponse)
async def main_page(request: Request,
                    members: List = [e.value for e in MembersQuantity],
                    activity_service: ActivityService = Depends(),
                    ):
    activity = await activity_service.get()
    return templates.TemplateResponse('base.html',
                                      context={
                                          'request': request,
                                          'members': members,
                                          'activities': activity,
                                      }
                                      )
There are dozens of such routes in the project, and each of them needs the same context variables. I need to pass variables to all the routes, for example user.
I can pass the user object (it is stored in Redis) this way:
@router.get('/', response_class=HTMLResponse)
async def main_page(request: Request,
                    members: List = [e.value for e in MembersQuantity],
                    activity_service: ActivityService = Depends(),
                    ):
    activity = await activity_service.get()
    user = None
    if await is_authenticated(request):
        user = await get_current_user(request)
    return templates.TemplateResponse('base.html',
                                      context={
                                          'request': request,
                                          'members': members,
                                          'activities': activity,
                                          'user': user,
                                      }
                                      )
template
<html>
<head>
<title>Item Details</title>
<link href="{{ url_for('static', path='/styles.css') }}" rel="stylesheet">
</head>
<body>
<a class="" href="{{ url_for('user_notification', pk=user.user_id) }}">
</body>
</html>
But repeating this in every route is clearly wrong.
I know of ways to pass global values through middleware, using request.state or response.set_cookie, but these methods are not suitable for security reasons.
How can I add a value to the template context globally, for all the routes?
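No answer is recorded for this question, so the following is only one possible sketch, not an established FastAPI pattern: collect the shared values in a single helper and call it from every route. The names build_context, COMMON_LOADERS, load_user, and FakeRequest are hypothetical, and load_user stands in for the is_authenticated / get_current_user logic shown above. The helper is plain Python, so it runs without FastAPI installed.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Loader = Callable[[Any], Awaitable[Any]]


async def load_user(request: Any) -> Any:
    # Hypothetical stand-in for the real lookup:
    #   if await is_authenticated(request):
    #       return await get_current_user(request)
    return getattr(request, "fake_user", None)


# Every entry is resolved for every page; add shared variables here.
COMMON_LOADERS: Dict[str, Loader] = {"user": load_user}


async def build_context(request: Any, **extra: Any) -> Dict[str, Any]:
    """Merge the shared variables with the route-specific ones."""
    context: Dict[str, Any] = {"request": request}
    for name, loader in COMMON_LOADERS.items():
        context[name] = await loader(request)
    context.update(extra)  # route-specific values win on a name clash
    return context


class FakeRequest:
    """Minimal stand-in for fastapi.Request, used only for this demo."""
    fake_user = {"user_id": 7}


demo = asyncio.run(build_context(FakeRequest(), members=[1, 2, 3]))
print(sorted(demo))
```

A route would then end with return templates.TemplateResponse('base.html', context=await build_context(request, members=members, activities=activity)), so a new shared variable only has to be added in one place.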

Web scraping returns empty list

I am trying to scrape the info below from https://www.dsmart.com.tr/yayin-akisi. However, the code below returns an empty list. Any idea?
<div class="col"><div class="title fS24 paBo30">NELER OLUYOR HAYATTA</div><div class="channel orangeText paBo30 fS14"><b>24 | 34. KANAL | 16 Nisan Perşembe | 6:0 - 7:0</b></div><div class="content paBo30 fS14">Billur Aktürk’ün sunduğu, yaşam değerlerini sorgulayan program Neler Oluyor Hayatta, toplumsal gerçekliğin bilgisine ulaşma noktasında sınırları zorluyor. </div><div class="subTitle paBo30 fS12">Billur Aktürk’ün sunduğu, yaşam değerlerini sorgulayan program Neler Oluyor Hayatta, toplumsal gerçekliğin bilgisine ulaşma noktasında sınırları zorluyor. </div></div>
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url="https://www.dsmart.com.tr/yayin-akisi"
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "lxml")
for link in page_soup.find_all("div", {"class":"col"}):
    print(link)
This page is rendered in the browser. The HTML you're downloading contains only links to JS files, which render the page content later.
You can use a real browser to render the page (selenium, splash, or similar technologies), or work out how the page receives the data you need.
Long story short, the data rendered on this page is requested from this link: https://www.dsmart.com.tr/api/v1/public/epg/schedules?page=1&limit=10&day=2020-04-16
It is well-formatted JSON, so it's very easy to parse. My recommendation is to download it with the requests module, which can return a JSON response as a dict.
This website is populated by GET calls to their API. You can see the GET calls in the Network tab of your browser's (Chrome/Firefox) devtools.
import requests
URL = 'https://www.dsmart.com.tr/api/v1/public/epg/schedules'
# parameters that you can tweak or add in a loop
# e.g. for page in range(1, 10): to get multiple pages
params = dict(page=1, limit=10, day='2020-04-16')
r = requests.get(URL, params=params)
assert r.ok, 'issues getting data'
data = r.json()
# data is a dictionary that you can grab values out of using keys
print(data)
In cases like this, using BeautifulSoup is unwarranted.
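The loop over pages mentioned in the comment above can be sketched as follows. This version uses only the standard library (urllib instead of requests) so that it is self-contained; the function names schedule_url and fetch_page are hypothetical, and nothing is assumed about the layout of the returned JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.dsmart.com.tr/api/v1/public/epg/schedules"


def schedule_url(page: int, limit: int, day: str) -> str:
    """Build the API URL for one page of the schedule."""
    query = urlencode({"page": page, "limit": limit, "day": day})
    return BASE_URL + "?" + query


def fetch_page(page: int, limit: int = 10, day: str = "2020-04-16"):
    """Download one page and decode the JSON body (performs a network call)."""
    with urlopen(schedule_url(page, limit, day)) as resp:
        return json.load(resp)


# Only the URLs are built here; call fetch_page(n) to actually download.
urls = [schedule_url(page, 10, "2020-04-16") for page in range(1, 4)]
print(urls[0])
```

The first URL is the same link quoted in the answer above; the other two differ only in the page parameter.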

csrf_tokens do not match in flask & nginx

I am trying to set up a server using nginx + gunicorn + flask.
If I run the app with python alone, csrf_tokens work without any problems.
However, if I run it behind nginx + gunicorn, this error occurs:
400 BAD Request - The CSRF session token is missing. or The CSRF tokens do not match.
Are there additional settings I should have configured in nginx for sessions?
Or did I miss something?
app/__init__.py
from flask_wtf.csrf import CsrfProtect
csrf = CsrfProtect()
def create_app(config_name):
    app = Flask(__name__, instance_path='/instance')
    app.config.from_object(config[config_name])
    config[config_name].init_app(app)
    bootstrap.init_app(app)
    moment.init_app(app)
    csrf.init_app(app)
    app.config.update(CSRF_ENABLED=app.config['CSRF_ENABLED'])
    # CSRF_ENABLED = True
    return app
login.html
<form action="{{url_for('.login')}}" class="form-signin text-center" method="POST">
{{ form.csrf_token }}
</form>
I often use the combination of flask + gunicorn + nginx in my projects. For my forms I use a different approach:
form.py:
from flask_wtf import FlaskForm
from wtforms import StringField, SubmitField, IntegerField
from wtforms.validators import DataRequired, Optional

class IdentityForm(FlaskForm):
    age = IntegerField("Type your age", validators=[Optional()])
    name = StringField("Type your name*", validators=[DataRequired()])
    submit = SubmitField("Submit")
page.html
<form action="" method="post" novalidate>
{{ form.hidden_tag() }}
{{ form.age.label }}<br>
{{ form.age() }}
{{ form.name.label }}<br>
{{ form.name() }}
{{ form.submit() }}
</form>
In this small example, the part that concerns us is the form.hidden_tag() call on the HTML side. It generates a hidden field containing a token that is used to protect the form against CSRF attacks.
For this to work, the SECRET_KEY variable must be defined in the Flask configuration:
SECRET_KEY = os.environ.get('SECRET_KEY') or 'do-not-get-tired-youll-never-find'
SECRET_KEY is a cryptographic key used to generate signatures and tokens. Flask-WTF uses it to protect forms against CSRF attacks.
And that's all. Flask-WTF takes care of the rest for you.
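One possible reason the tokens match under a single python process but not behind nginx + gunicorn, offered as an assumption about the asker's setup rather than a diagnosis: if SECRET_KEY were generated randomly at startup, every worker process would sign with a different key, and a token issued by one worker would fail validation in another. The sketch below illustrates the idea with the standard library's hmac module. It is a simplified stand-in, not Flask-WTF's actual implementation, and all names in it are hypothetical.

```python
import hashlib
import hmac
import os


def sign(token: str, secret_key: bytes) -> str:
    """Return a hex signature of the token under the given key."""
    return hmac.new(secret_key, token.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(token: str, signature: str, secret_key: bytes) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(token, secret_key), signature)


token = "csrf-token-for-one-session"

# Case 1: every worker loads the same key (e.g. from the environment).
shared_key = b"same-key-in-every-worker"
same_key_ok = verify(token, sign(token, shared_key), shared_key)

# Case 2 (assumed misconfiguration): each worker generates its own key.
worker_a_key = os.urandom(32)
worker_b_key = os.urandom(32)
different_key_ok = verify(token, sign(token, worker_a_key), worker_b_key)

print(same_key_ok, different_key_ok)
```

Reading the key from an environment variable, as in the configuration line above, is one way to give every worker the same value.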

Getting a src from an img tag using beautifulsoup

This is my last cry for help. I'm trying to do some cool embedding with my Discord bot, and the only problem is that I can't get the image from the website to work. Can anyone help? This is mostly what other people have told me to use, and the code I've found here is not working.
async def events(self, ctx):
    """Top GTAO bonuses going on right now!"""
    if ctx.message.server.me.bot:
        try:
            await self.bot.delete_message(ctx.message)
        except:
            await self.bot.send_message(ctx.message.author, 'Could not delete your message on ' + ctx.message.server.name)
    url = "https://socialclub.rockstargames.com/"
    async with aiohttp.get(url) as response:
        soupObject = BeautifulSoup(await response.text(), "html.parser")
    try:
        rm = "[Read More](https://socialclub.rockstargames.com/events)"
        img = "https://i.imgur.com/0Gu4sSK.png"
        avi = "https://i.imgur.com/s5O1yD2.png"
        bonus1 = soupObject.find(class_='bonuses').find('ul').get_text()
        evpic = soupObject.find(class_='eventThumb').find('img').get('src')
        # EMBED
        data = discord.Embed(title='GTA Online Bonuses', description='The Current GTA Online Bonuses', colour=0xE4BA22)
        data.set_author(name='Rockstar Games', icon_url=avi)
        data.add_field(name="This week: \n", value=bonus1)
        data.add_field(name="--------", value=rm)
        data.set_image(url=evpic)
        data.set_thumbnail(url=img)
        await self.bot.say(embed=data)
    except discord.HTTPException:
        await self.bot.say("I need the `Embed links` permission to send this OR error")
Checking the website, Rockstar doesn't use the src attribute on their images, because loading is handled by some internal JS:
>>> soup.find(attrs={'class':'eventThumb'})
<div class="eventThumb">
<img class="lazyload" data-src="https://prod.cloud.rockstargames.com/global/Events/20449/829a53e7-d14e-4de8-a17b-ccb06becfed6.jpg"/>
</div>
>>> _.img
<img class="lazyload" data-src="https://prod.cloud.rockstargames.com/global/Events/20449/829a53e7-d14e-4de8-a17b-ccb06becfed6.jpg"/>
>>> _.get('data-src')
'https://prod.cloud.rockstargames.com/global/Events/20449/829a53e7-d14e-4de8-a17b-ccb06becfed6.jpg'
So to fix, you would need to change your .get('src') to a .get('data-src')
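If the page might mix lazily loaded and regular images, the lookup can fall back from data-src to src. The following is a minimal sketch; the helper name image_url is hypothetical. It relies only on a dict-style .get(), which a BeautifulSoup Tag provides for its attributes, so it is demonstrated here with plain dicts to stay runnable without bs4.

```python
from typing import Any, Optional


def image_url(tag: Any) -> Optional[str]:
    """Return the image URL from a lazily loaded or a regular <img>.

    `tag` only needs a dict-style .get(); a BeautifulSoup Tag and a
    plain dict of attributes both qualify.
    """
    return tag.get("data-src") or tag.get("src")


# Plain dicts standing in for the attributes of two <img> tags.
lazy = {"class": "lazyload", "data-src": "https://example.com/a.jpg"}
plain = {"src": "https://example.com/b.jpg"}
print(image_url(lazy), image_url(plain))
```

In the bot code this would be used as evpic = image_url(soupObject.find(class_='eventThumb').find('img')).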
