Creating dynamic date-time text that changes every second in Streamlit

I would like to display the output of datetime.now(), refreshed every second, in the Streamlit web UI.
from datetime import datetime
datetime.now()
# print this output every one second:
# datetime.datetime(2020, 5, 19, 4, 22, 40, 921985)
What I have already tried
#!/usr/bin/env python3
import streamlit as st
from datetime import datetime
timenow = str(datetime.now())
st.write(timenow)

I suppose it depends on whether you need exactly one-second resolution or not, but the solution is approximately:
import time
from datetime import datetime
import streamlit as st
t = st.empty()
while True:
    t.markdown("%s" % str(datetime.now()))
    time.sleep(1)
The while loop keeps the process going forever. Because the st.empty() call sits outside the loop, we keep modifying the same t placeholder: on each iteration, the markdown string is overwritten with the current datetime.now() value.
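If you do need the ticks to line up with whole seconds, a small variant of the same pattern (a sketch, not from the original answer) sleeps only for the remainder of the current second instead of a fixed second:
import time
from datetime import datetime
import streamlit as st
t = st.empty()
while True:
    t.markdown(str(datetime.now()))
    # sleep to the next whole-second boundary instead of a fixed 1 s,
    # so rendering time does not accumulate as drift
    time.sleep(1 - time.time() % 1)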

Display Time With Infinite While Loop AT END OF Streamlit Code
The placeholder code in the while loop is likely not needed; using t.write or t.markdown as @Randy Zwitch does is fine. I just wanted to show that IF you use this while-loop approach, put it at the end of your Streamlit script. Even then, watch how the time display is interrupted when the button state is updated in the code above the while loop.
import streamlit as st
from datetime import datetime
import time
# Section 1
button = st.button('Button')
button_placeholder = st.empty()
button_placeholder.write(f'button = {button}')
time.sleep(2)
button = False
button_placeholder.write(f'button = {button}')
# Section 2
time_placeholder = st.empty()
while True:
    timenow = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    time_placeholder.write(timenow)
    time.sleep(1)

Related

How do I update values in streamlit on a schedule?

I have a simple Streamlit app that is meant to show the latest data. I want it to refresh the data every 5 seconds, but right now the only way I have found to do that is via st.experimental_rerun; this is the core code:
import streamlit as st
import time
current_time = int(time.time())
if 'last_run' not in st.session_state:
    st.session_state['last_run'] = current_time
@st.experimental_singleton
def load_data():
    ...
    return data
data = load_data()
if current_time > st.session_state['last_run'] + 5:  # check every 5 seconds
    load_data.clear()  # clear cache
    st.session_state['last_run'] = current_time
    st.experimental_rerun()
However, the st.experimental_rerun() makes the user experience terrible; are there any other ideas?
You can try using the schedule library together with st.empty().
Example:
import time
import streamlit as st
from schedule import every, repeat, run_pending
def load_data():
    ...
    return data
def main():
    data = load_data()
    # Do something with data
with st.empty():
    @repeat(every(5).seconds)
    def refresh_data():
        main()
    while True:
        run_pending()
        time.sleep(1)
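For reference, the same loop can be written without the decorator by registering the job explicitly; this is a sketch that assumes the main() function from the snippet above:
import time
import schedule
import streamlit as st
# register the refresh job explicitly instead of using the @repeat decorator
schedule.every(5).seconds.do(main)
with st.empty():
    main()  # draw once immediately instead of waiting for the first tick
    while True:
        schedule.run_pending()
        time.sleep(1)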

How to get data from a PowerBI app as an external user

I am interested in downloading data from a national public dataset on vaccines in Italy:
https://app.powerbi.com/view?r=eyJrIjoiMzg4YmI5NDQtZDM5ZC00ZTIyLTgxN2MtOTBkMWM4MTUyYTg0IiwidCI6ImFmZDBhNzVjLTg2NzEtNGNjZS05MDYxLTJjYTBkOTJlNDIyZiIsImMiOjh9
In particular, I am interested in downloading the last table.
I tried scraping the HTML, but it seems the data is not stored directly in the HTML source of the page.
Then I thought of using the code below in Python 3.9:
import time
import pyautogui
import pyperclip
import win32api
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
from win32con import MOUSEEVENTF_WHEEL
driver = webdriver.Chrome()
LINK = ".tableEx .bodyCells div:nth-child(1) > .pivotTableCellWrap:nth-child("
TO_ADD = "1)"
data = []
action = ActionChains(driver)
driver.get("https://app.powerbi.com/view?r=eyJrIjoiMzg4YmI5NDQtZDM5ZC00ZTIyLTgxN2MtOTBkMWM4MTUyYTg0IiwidCI6ImFmZDBhNzVjLTg2NzEtNGNjZS05MDYxLTJjYTBkOTJlNDIyZiIsImMiOjh9")
driver.set_window_size(784, 835)
driver.execute_script("window.scrollTo(0,0)")
for i in range(1, 293):
    print(i)
    if i % 10 == 0:
        # every 10 rows, scroll the table with the mouse wheel
        win32api.SetCursorPos((1625, 724))
        win32api.mouse_event(MOUSEEVENTF_WHEEL, x, y, -3, 0)
        time.sleep(6)
    action = ActionChains(driver)
    TO_ADD = str(i) + ")"
    # right-click the cell and walk down the context menu
    action.move_to_element(driver.find_element(By.CSS_SELECTOR, LINK + TO_ADD)).perform()
    action.context_click().send_keys(Keys.ARROW_DOWN).send_keys(Keys.ARROW_DOWN).perform()
    time.sleep(0.5)
    # locate the "copy" menu entries on screen from screenshots and click them
    x, y = pyautogui.locateCenterOnScreen(r'C:\Users\migli\Documents\Learning\copy.png')
    pyautogui.moveTo(x, y, 0.1)
    pyautogui.click()
    time.sleep(0.5)
    x, y = pyautogui.locateCenterOnScreen(r'C:\Users\migli\Documents\Learning\copiaselez.png')
    pyautogui.moveTo(x, y, 0.1)
    pyautogui.click()
    data.append(pyperclip.paste())
Images used for the clicks (matched on screen by locateCenterOnScreen): copiaselez.png and copy.png.
This seems to achieve what I am trying to do, but it blocks around the 14th cycle and I don't know why. Maybe I should scroll down the page in some manner; I tried doing that manually during a sleep inserted around the 10th cycle, but that errors out as well.
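One idea I am experimenting with (a sketch, untested) is letting the browser scroll the target cell into view via JavaScript instead of moving the mouse wheel at fixed screen coordinates:
# hypothetical replacement for the win32api wheel scrolling above:
# have the browser scroll the cell into view before interacting with it
cell = driver.find_element(By.CSS_SELECTOR, LINK + TO_ADD)
driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", cell)
action.move_to_element(cell).perform()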
I also thought of using an API, but there does not seem to be one.
Any idea is welcome.

How to store SQL output in a pandas DataFrame using Airflow?

I want to read data from SQL into a pandas DataFrame, do some data transformations, and then load it into another table using Airflow.
The issue I am facing is that the connection strings for the tables are accessible only through Airflow, so I need to use Airflow as the medium to read and write data.
How can this be done?
My code:
Task1 = PostgresOperator(
    task_id='Task1',
    postgres_conn_id='REDSHIFT_CONN',
    sql="SELECT * FROM Western.trip LIMIT 5",
    params={'limit': '50'},
    dag=dag,
)
The output of the task needs to be stored in a DataFrame (df) and, after the transformations, loaded back into another table.
How can this be done?
I doubt there's a built-in operator for this, but you can easily write a custom operator:
Extend PostgresOperator, or just BaseOperator / any other operator of your choice. All custom code goes into the overridden execute() method.
Then use PostgresHook to obtain a pandas DataFrame by invoking its get_pandas_df() function.
Perform whatever transformations you have to do on your pandas df.
Finally, use its insert_rows() function to insert the data into the destination table.
UPDATE-1
As requested, I'm hereby adding the code for the operator:
from typing import Dict, Any, List, Tuple
from airflow.hooks.postgres_hook import PostgresHook
from airflow.operators.postgres_operator import PostgresOperator
from airflow.utils.decorators import apply_defaults
from pandas import DataFrame
class MyCustomOperator(PostgresOperator):
    @apply_defaults
    def __init__(self, destination_table: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.destination_table: str = destination_table
    def execute(self, context: Dict[str, Any]):
        # create PostgresHook
        self.hook: PostgresHook = PostgresHook(postgres_conn_id=self.postgres_conn_id,
                                               schema=self.database)
        # read data from Postgres-SQL query into pandas DataFrame
        df: DataFrame = self.hook.get_pandas_df(sql=self.sql, parameters=self.parameters)
        # perform transformations on df here
        df['column_to_be_doubled'] = df['column_to_be_doubled'].multiply(2)
        ..
        # convert pandas DataFrame into list of tuples
        rows: List[Tuple[Any, ...]] = list(df.itertuples(index=False, name=None))
        # insert list of tuples in destination Postgres table
        self.hook.insert_rows(table=self.destination_table, rows=rows)
Note: The snippet is for reference only; it has NOT been tested
References
Pandas convert DataFrame into Array of tuples
Further modifications / improvements
The destination_table param can be read from a Variable.
If the destination table doesn't necessarily reside in the same Postgres schema, we can take another param like destination_postgres_conn_id in __init__ and use it to create a destination_hook, on which we can invoke the insert_rows method (a sketch follows below).
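For illustration, a minimal sketch of that second improvement; the names MyCrossDbOperator, destination_postgres_conn_id and destination_hook are assumptions for illustration, not an existing Airflow API, and like the snippet above it has not been tested:
from typing import Dict, Any, List, Tuple
from airflow.hooks.postgres_hook import PostgresHook
from airflow.operators.postgres_operator import PostgresOperator
class MyCrossDbOperator(PostgresOperator):
    # hypothetical variant: read via postgres_conn_id, write via a second conn id
    def __init__(self, destination_table: str, destination_postgres_conn_id: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.destination_table = destination_table
        self.destination_postgres_conn_id = destination_postgres_conn_id
    def execute(self, context: Dict[str, Any]):
        # source hook reads from the connection the operator was given
        source_hook = PostgresHook(postgres_conn_id=self.postgres_conn_id)
        df = source_hook.get_pandas_df(sql=self.sql, parameters=self.parameters)
        # transformations go here, as in MyCustomOperator above
        rows: List[Tuple[Any, ...]] = list(df.itertuples(index=False, name=None))
        # destination hook writes to the second connection
        destination_hook = PostgresHook(postgres_conn_id=self.destination_postgres_conn_id)
        destination_hook.insert_rows(table=self.destination_table, rows=rows)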
Here is a very simple and basic example to read data from a database into a dataframe.
from airflow.hooks.mysql_hook import MySqlHook
# Get the hook
mysqlserver = MySqlHook("Employees")
# Execute the query
df = mysqlserver.get_pandas_df(sql="select * from employees LIMIT 10")
Kudos to y2k-shubham for the get_pandas_df() tip.
I also save the dataframe to a file to pass it to the next task (this is not recommended when using clusters, since the next task could be executed on a different server).
This full code should work as it is.
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.utils.dates import days_ago
from airflow.hooks.mysql_hook import MySqlHook
dag_id = "db_test"
args = {
    "owner": "airflow",
}
base_file_path = "dags/files/"
def export_func(task_instance):
    import time
    # Get the hook
    mysqlserver = MySqlHook("Employees")
    # Execute the query
    df = mysqlserver.get_pandas_df(sql="select * from employees LIMIT 10")
    # Generate somewhat unique filename
    path = "{}{}_{}.ftr".format(base_file_path, dag_id, int(time.time()))
    # Save as a binary feather file
    df.to_feather(path)
    print("Export done")
    # Push the path to xcom
    task_instance.xcom_push(key="path", value=path)
def import_func(task_instance):
    import pandas as pd
    # Get the path from xcom
    path = task_instance.xcom_pull(key="path")
    # Read the binary file
    df = pd.read_feather(path)
    print("Import done")
    # Do what you want with the dataframe
    print(df)
with DAG(
    dag_id,
    default_args=args,
    schedule_interval=None,
    start_date=days_ago(2),
    tags=["test"],
) as dag:
    export_task = PythonOperator(
        task_id="export_df",
        python_callable=export_func,
    )
    import_task = PythonOperator(
        task_id="import_df",
        python_callable=import_func,
    )
    export_task >> import_task
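A small note on the XCom plumbing (standard Airflow behavior, though not used in the answer above): a PythonOperator pushes its callable's return value to XCom automatically under the key return_value, so export_func could simply return the path and import_func could pull it by task id. A sketch, reusing the names from the full example:
def export_func():
    import time
    mysqlserver = MySqlHook("Employees")
    df = mysqlserver.get_pandas_df(sql="select * from employees LIMIT 10")
    path = "{}{}_{}.ftr".format(base_file_path, dag_id, int(time.time()))
    df.to_feather(path)
    # returning the value pushes it to XCom under the key "return_value"
    return path
def import_func(task_instance):
    import pandas as pd
    # pull the other task's return value by task id
    path = task_instance.xcom_pull(task_ids="export_df")
    print(pd.read_feather(path))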

How to use apscheduler to trigger job every minute between specific times?

I'm using the Python apscheduler module. Is it possible to trigger a job every minute between 7:30 AM and 11:30 PM every day?
I've tried the following solution, but I don't know how to add the constraint on minutes.
from apscheduler.schedulers.background import BackgroundScheduler
def job_function():
    print("Hello World")
sched = BackgroundScheduler()
sched.add_job(job_function, 'cron', hour='7-23', minute='*')
sched.start()
You can use the new OrTrigger to combine several CronTriggers to cover the whole time span:
from apscheduler.triggers.combining import OrTrigger
from apscheduler.triggers.cron import CronTrigger
trigger = OrTrigger([
    CronTrigger(hour='7', minute='30-59'),
    CronTrigger(hour='8-22', minute='*'),
    CronTrigger(hour='23', minute='0-30')
])
sched.add_job(job_function, trigger)
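If your APScheduler version predates the combining triggers (OrTrigger arrived in the 3.1 series, to the best of my knowledge), an equivalent sketch is to register the same function as three separate cron jobs; this assumes the sched and job_function from the question:
# same 7:30-23:30 window expressed as three separate cron jobs
sched.add_job(job_function, 'cron', hour='7', minute='30-59')
sched.add_job(job_function, 'cron', hour='8-22', minute='*')
sched.add_job(job_function, 'cron', hour='23', minute='0-30')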

How to sort a QTableWidget with my own code?

I am using Qt 4.5.2 on Linux. I have a simple QTableWidget in which a column displays dates in a human-friendly format. Unfortunately, "human-friendly" dates are not easy to sort correctly, so in the QTableWidget I keep a hidden column with the UNIX timestamp corresponding to each date.
I am trying to make sure that, whenever a request to sort on the DATE column is issued, the sort is actually done on the (invisible) TIMESTAMP column. I tried reimplementing sortByColumn (this is in Python) by subclassing QTableWidget and defining:
def sortByColumn(self, col, order):
    print 'got request to sort col %d in order %s' % (col, str(order))
Yet, whenever I click on one of the headers of my table the normal sort method continues to be called.
How can I override it?
You could derive your own class from QTableWidgetItem and then write your own __lt__ operator; this would alleviate the need for an extra column as well. Something along the lines of:
from PyQt4 import QtCore, QtGui
import sys
import datetime
class MyTableWidgetItem(QtGui.QTableWidgetItem):
    def __init__(self, text, sortKey):
        # call custom constructor with UserType item type
        QtGui.QTableWidgetItem.__init__(self, text, QtGui.QTableWidgetItem.UserType)
        self.sortKey = sortKey
    # Qt uses a simple < check for sorting items, so override it to use the sortKey
    def __lt__(self, other):
        return self.sortKey < other.sortKey
app = QtGui.QApplication(sys.argv)
window = QtGui.QMainWindow()
window.setGeometry(0, 0, 400, 400)
table = QtGui.QTableWidget(window)
table.setGeometry(0, 0, 400, 400)
table.setRowCount(3)
table.setColumnCount(1)
date1 = datetime.date.today()
date2 = datetime.date.today() + datetime.timedelta(days=1)
date3 = datetime.date.today() + datetime.timedelta(days=2)
item1 = MyTableWidgetItem(str(date1.strftime("%A %d. %B %Y")), str(date1))
item2 = MyTableWidgetItem(str(date2.strftime("%A %d. %B %Y")), str(date2))
item3 = MyTableWidgetItem(str(date3.strftime("%A %d. %B %Y")), str(date3))
table.setItem(0, 0, item1)
table.setItem(2, 0, item2)
table.setItem(1, 0, item3)
table.setSortingEnabled(True)
window.show()
sys.exit(app.exec_())
This produced correct results for me; you can run it yourself to verify. The cell text displays strings like "Saturday 20. February 2010", but when you sort on the column it correctly sorts by the sortKey field, which is "2010-02-20" (ISO formatted).
Oh, it should also be noted that this will NOT work with PySide, because it appears that the __lt__ operator is not bound there, whereas it is with PyQt4. I spent a while trying to debug why it wasn't working, then switched from PySide to PyQt4 and it worked fine. You may notice that __lt__ is not listed here:
http://www.pyside.org/docs/pyside/PySide/QtGui/QTableWidgetItem.html
but it is here:
http://doc.qt.digia.com/4.5/qtablewidgetitem.html#operator-lt
