Bokeh data is visualized only after script has ended

My script runs through some sequential data generation steps that finally produce a series of graphs. After each step I want to visualize the graphs that are available so far, but I only get a visualization after the script has completely finished.
This is my code:
#!/usr/bin/python3
from bokeh.plotting import curdoc, figure
import numpy as np

def plot_init(title):
    global plot_circle
    p = figure(title=title)
    plot_circle = p.circle([],[])
    curdoc().add_root(p)

def plot(data):
    global plot_circle
    plot_circle.data_source.stream(data)
    input("Continue...")

plot_init("graph 1")
temp_x = np.random.rand(10)
temp_y = np.random.rand(10)
plot({'x': temp_x, 'y': temp_y})
plot_init("graph 2")
temp_x = np.random.rand(10)
temp_y = np.random.rand(10)
plot({'x': temp_x, 'y': temp_y})
I start the bokeh server like this: bokeh serve bokeh-test.py
This is what happens:
1. I launch the server; the console is waiting.
2. I open the URL in the browser, the script starts and "Continue..." is written on the console. Nothing is displayed yet, the browser is still loading.
3. I press a key, the script continues and "Continue..." is written on the console. Nothing is displayed yet, the browser is still loading.
4. I press a key, the script finishes. Now the browser displays the two graphs that were generated.
I had expected the first graph to be displayed after step 2 and the second graph to be added after step 3.
Most examples for 'streaming' data involve add_periodic_callback(), but I don't see how this would fit in my application, because I don't handle data that is streaming in from some external data source, I'm just generating data in the flow of my script.
How can I visualize my data with incremental results?
Best regards,
Vic

One solution is to replace the input() call (press any key) with a button press or a similar widget in your app. Basically, the button's click callback renders graph 1 or graph 2. With this approach Bokeh completes the rendering of each part separately.
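A minimal sketch of that idea, assuming the script is run with `bokeh serve` and a recent Bokeh version. The names `steps`, `layout`, and `show_next_step` are illustrative and not from the original post. Each click generates the data for one step and adds one graph to the page.

```python
from bokeh.layouts import column
from bokeh.models import Button
from bokeh.plotting import curdoc, figure
import numpy as np

# Titles of the graphs to produce, one per button click (illustrative).
steps = ["graph 1", "graph 2"]

button = Button(label="Continue", button_type="success")
layout = column(button)


def show_next_step(event):
    """Generate the data for the next step and add its graph to the page."""
    if not steps:
        return
    title = steps.pop(0)
    p = figure(title=title)
    p.scatter(np.random.rand(10), np.random.rand(10))
    layout.children.append(p)   # the change is sent once the callback returns
    if not steps:
        button.disabled = True  # nothing left to generate


button.on_click(show_next_step)
curdoc().add_root(layout)
```

Because each graph is added inside a callback, the work is split into units that end quickly, which is when the server pushes changes to the browser.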

My problem was that curdoc() does not seem to communicate to the client right away when it's called:
if curdoc() is called in the main flow of the script, it is only effective when the script reaches the end
if curdoc() is called in a callback, it is only effective when the callback returns
So I constructed my script such that the data generation flow is in a separate thread and a series of data handling callbacks is triggered using buttons, to do the drawing. The callbacks wait for the data to be generated using a semaphore and likewise the data generation waits for the user pressing the button using another semaphore.
Perhaps I made it overly complicated, but it works exactly as I want. I have a single flow of data generation and I can after each step inspect the graph and push a button to continue with the next data generation step.
Data is transferred from the data generator thread to the data handler callbacks via global variables.
#!/usr/bin/python3
from bokeh.plotting import curdoc, figure
from bokeh.models import Button
import threading
import numpy as np
import time

handler_lock = threading.Semaphore(0) # handler waiting for data
generator_lock = threading.Semaphore(0) # generator waiting for user pushing the button

# THIS IS A BOKEH CALLBACK:
def data_handler():
    global data
    global title
    global final
    # wait until data is ready
    print("Handler: waiting for data from generator")
    handler_lock.acquire()
    print("Handler: handling data from generator")
    # add new graph
    p = figure(title=title)
    plot_circle = p.circle([],[])
    curdoc().add_root(p)
    plot_circle.data_source.stream(data)
    # add the button for the user to trigger the callback that will draw the next graph
    if not final:
        button = Button(label="Continue data generation", button_type="success")
        button.on_click(data_handler)
        curdoc().add_root(button)
    # unlock the generator to continue generating data
    print("Handler: unlocking generator")
    generator_lock.release()
    # actual drawing is only done after the callback finishes!

# THIS RUNS IN A SEPARATE THREAD:
def data_generation():
    global data
    global title
    global final
    final = False
    # Data generation phase 1 (starts immediately!)
    print("Generator: starting data generation")
    temp_x = np.random.rand(10)
    temp_y = np.random.rand(10)
    #time.sleep(10) # pretending data generation takes long
    print("Generator: Data generated, going to plot")
    data = {'x': temp_x, 'y': temp_y}
    title = "graph 1"
    # unlock the bokeh callback that is waiting for data
    print("Generator: Unlocking handler")
    handler_lock.release()
    # lock the generator until the callback is done
    print("Generator: waiting for handler to finish")
    generator_lock.acquire()
    print("Generator: continuing generating data")
    # Data generation phase 2
    print("Generator: starting data generation")
    temp_x = np.random.rand(10)
    temp_y = np.random.rand(10)
    #time.sleep(10) # pretending data generation takes long
    print("Data generated, going to plot")
    data = {'x': temp_x, 'y': temp_y}
    title = "graph 2"
    final = True # no button needed to continue
    # unlock the bokeh callback that is waiting for data
    print("Generator: unlocking handler")
    handler_lock.release()

# start data generation in a separate thread
t = threading.Thread(target=data_generation)
t.start()
# draw the start button
button = Button(label="Start handling data", button_type="success")
button.on_click(data_handler)
curdoc().add_root(button)
# main script finishes here, but the generator thread is still running and
# iterative bokeh drawing is done in callbacks
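The two-semaphore handshake can be looked at in isolation, without Bokeh. The following is a minimal sketch using only the standard library; `run_demo`, `shared`, and `events` are illustrative names, and each call to `handle()` stands in for one button click.

```python
import threading


def run_demo(steps=3):
    """Alternate between a generator thread and a handler, one step at a time."""
    handler_lock = threading.Semaphore(0)    # released when data is ready
    generator_lock = threading.Semaphore(0)  # released when the handler is done
    shared = {}   # stands in for the global variables
    events = []   # records the order in which things happen

    def generate():
        for step in range(1, steps + 1):
            shared["title"] = f"graph {step}"
            shared["final"] = step == steps
            events.append(f"generated {step}")
            handler_lock.release()        # data is ready for the handler
            if not shared["final"]:
                generator_lock.acquire()  # wait for the next "button click"

    def handle():
        handler_lock.acquire()            # wait until data is ready
        events.append(f"handled {shared['title']}")
        final = shared["final"]
        generator_lock.release()          # let the generator continue
        return final

    worker = threading.Thread(target=generate)
    worker.start()
    while not handle():                   # each call stands in for one click
        pass
    worker.join()
    return events


print(run_demo())
```

The recorded events strictly alternate between "generated" and "handled", because each side blocks on its own semaphore until the other side releases it.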

Related

Perform multiple actions upon pressing an actionButton

I've been struggling to get my Shiny app to perform multiple operations upon pressing a single actionButton. I've looked around a lot and haven't been able to solve it. I've tried eventReactive, observeEvent, putting everything in one block, separating them out - you can see the eventReactive and observeEvent attempts in the server.R code.
I have a button which is supposed to run some models, once pressed. Upon pressing the button, I'd like to do quite a few things:
Print a message letting the user know that something is happening
Concatenate the previously-defined user options into a .sh file and execute model prediction
Periodically check for the model output file
Once output file has been created, print "Done" and save as a global variable for further analysis
This is what I have so far (it doesn't work, the model runs [yay!] and that's it). It's particularly confusing to me that the sys command runs fine, but the "Running" message doesn't get printed. I've made sure that I haven't misspelt any var names.
Dummy python script to produce an output: Let's call it script_shiny.py
import time
import pandas as pd
# construct dummy df
d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}
df = pd.DataFrame(data=d)
time.sleep(60) # let's add a delay to simulate the 'real' situation
df.to_csv("test_out.txt",sep="\t") # create the output file to be detected
The relevant part of ui.R
*CODE TO CAPTURE PARAMS, WORKS FINE*
shiny::actionButton("button", "Run PIDGIN"), # RUN MODELS
tags$br(),
textOutput("pidginrunning"), # OUTPUT RUNNING MESSAGE
textOutput("pidgindone"), # OUTPUT DONE MESSAGE
The relevant part of server.R
# One approach - observeEvent and defining renderText and .sh file creation and running all together
observeEvent(input$button, { # OBSERVE BUTTON BEING PRESSED
  output$pidginrunning <- renderText({ # LET USER KNOW ITS RUNNING - NOT WORKING
    paste0("Running PIDGIN...")
  })
  # CREATE .SH FILE AND RUN DUMMY SCRIPT - WORKING (params previously defined)
  bin_bash <- "#!/bin/bash"
  output_name <<- "test_out.txt"
  runline <- paste0("python script_shiny.py")
  bash_file <- data.frame(c(bin_bash,runline))
  write.table(bash_file,"./run_pidgin.sh",quote=F,row.names=F,col.names=F)
  system("bash -i run_pidgin.sh")
})
# CHECK OUTPUT - NOT WORKING
# Defining output_name in the previous code block works, its defined as a global variable
# Other approach, use eventReactive to define a 'function' which is called separately
checkOutput <- eventReactive(input$button, {
  reactivePoll(1000, session, checkFunc = function() {
    if (file.exists(output_name)) # check for the output file
      assign(x = "preds", # Read it in if so, and define global variable
             value = read.csv(output_name,header=T,sep="\t"),
             envir = .GlobalEnv)
    paste0("Done. Please move onto the Results tab.")
  })
})
output$pidgindone <- renderText({ # PRINT DONE MESSAGE SO USER KNOWS TO MOVE ON
  checkOutput()
})
(Hope that counts as minimum example. As a proxy I suppose you can generate a dummy .sh file which produces a .txt file table, instead of running the models).
Cheers for your time and help.
EDIT: added example script which creates an output which should be detected
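The steps listed in the question boil down to launching the job without blocking and then polling for its output file. A minimal Python sketch of that control flow, outside Shiny, might look like this; `run_and_wait`, the poll interval, and the timeout are illustrative assumptions, and the sketch does not address Shiny's reactivity itself.

```python
import subprocess
import time
from pathlib import Path


def run_and_wait(command, output_path, poll_seconds=0.1, timeout=30.0):
    """Start `command` without blocking, then poll until `output_path` exists."""
    output_path = Path(output_path)
    print("Running...")
    process = subprocess.Popen(command)  # returns immediately
    deadline = time.monotonic() + timeout
    while not output_path.exists():
        # Stop early if the job has ended without creating the file.
        if process.poll() is not None and not output_path.exists():
            raise RuntimeError("job exited without producing the output file")
        if time.monotonic() > deadline:
            process.kill()
            raise TimeoutError("output file did not appear in time")
        time.sleep(poll_seconds)
    process.wait()  # the file exists; let the job finish writing it
    print("Done")
    return output_path.read_text()
```

For the dummy script above, a call such as `run_and_wait(["python", "script_shiny.py"], "test_out.txt", poll_seconds=1, timeout=120)` would print "Running..." right away and "Done" once the file has been written.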

Schedule a task (update data) each monday in Shiny

I have a dashboard living in a Shiny Server pro that shows different analysis. The data is coming from a long query that takes around 20 minutes to be completed.
In my current set up, I have a button that updates the data:
queries new data
transforms the data
saves the data in a file .RData
saves the data in a global object (using data <<-)
Just in case, outside the server and ui functions I have a statement that checks whether the data object exists. If it does not exist, the app reads the data from the .RData file instead of running the query again.
Now I would like to update the data each Monday at 5:00pm (I do not want to open the app and push the button each Monday). I think the best way to do this is a cron job created with cronR. The code would be located in app.R, outside the server and ui functions. Now I have the following questions:
If I am using Shiny Server Pro, how many times will the app create the cron job if the code is located in app.R outside the server and ui functions?
How can I replace the data object in the Shiny app, so that if a user opens the app on Monday after 5:00 pm the data is already in place, without reading the .RData file and of course without running the query again?
What is the best practice?
Just create your cron process with cronR completely outside the shiny application and make sure it saves your data to the correct place.
Create the R code which gets your data:
library(...)
# ...
# x <- mydata
save(x, file = "NewData.Rda")
Create the cron job:
cmd <- cron_rscript("path/to/getdata.R")
cron_add(cmd, frequency = 'daily', id = 'job5', at = '05:00')
I can't see your point 1. The app will not create the cron job if the file is not named "global.R", "ui.R" or "server.R", I think. Also, you don't have to put your code under the /srv/shiny-server/ directory.
For your point 2, check the reactiveFileReader function from the shiny library. This function checks a file's last modified time and re-reads the file if it has changed:
data <- reactiveFileReader(5*60*1000, session = NULL, filePath = "NewData.Rda", readFunc = load)
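The mechanism described here (compare the file's last-modified time and re-read only on change) can be sketched in a few lines of Python. This is an illustration of the idea, not Shiny's implementation; `FileReloader` and its methods are hypothetical names, and the periodic timer that Shiny adds on top is left out.

```python
import os


class FileReloader:
    """Re-read a file only when its last-modified time has changed."""

    def __init__(self, path, read_func):
        self.path = path
        self.read_func = read_func
        self._mtime = None   # modification time seen at the last read
        self._value = None   # cached result of the last read

    def get(self):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            # The file is new or was modified: read it again and cache it.
            self._value = self.read_func(self.path)
            self._mtime = mtime
        return self._value
```

Calling `get()` repeatedly is cheap as long as the file is unchanged, since only the file's metadata is inspected.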

cells with Bokeh show() command run forever - how can I make them stop after plotting?

I have a Jupyter Notebook cell with some code (not mine) that generates a Bokeh plot. The last command in the cell is a Bokeh show() command. What I want is for the plot to appear in the output cell below, and then to be able to continue running subsequent cells in the notebook.
What happens instead is that when I run the cell the plot appears below, but the cell with the code continues to run (asterisk in the left-hand brackets) and never stops. So I can't continue to run any other cells unless I click the stop button (which leaves the plot intact in the output cell, but follows it with many lines of error trace-back messages).
My understanding is that executing the output_notebook() function should result in show() giving the behavior I want, as described here:
https://docs.bokeh.org/en/latest/docs/user_guide/notebook.html
In their screenshot, the cell with the show() command has clearly finished running (no asterisk).
I'm not sure if it makes any difference, but in my case I am running the notebook on a remote server and using port-forwarding to view it on my laptop computer browser.
Does anyone know how I can get the show() command to finish running?
For reference, this is the code:
# Use the PCA projection
dset = proj_pca
# Select the dimensions in the projection
dim1 = 0
dim2 = 1
p = figure(plot_width=945, plot_height=560)
p.title.text = 'PCA projection: PC' + str(dim1+1) + ' vs PC' + str(dim2+1)
for cont in continents:
    for pop in pop_by_continent[cont]:
        projections_within_population = dset[indices_of_population_members[pop]]
        p.circle(projections_within_population[:,dim1], projections_within_population[:,dim2],
                 legend=name_by_code[pop], color = color_dict[pop])
#p.legend.visible = False
p.legend.location = "top_center"
p.legend.click_policy="hide"
output_file("interactive_pca.html", title="PCA projection")
output_notebook()
#save(p)
show(p)

How can I update the ColumnDataSource in a "selected.on_change()" event?

First, I created a scatter plot from geographic coordinates. If I click on one of these circles, a second line plot next to the scatter plot shows further information depending on which circle I clicked. That means I have to replace the ColumnDataSource currently shown in the line plot with a new one. But when I click on one of those circles, the current source is not updated: the line plot still shows the dataset of the old source.
I'll try to give you a short example of what I've done so far:
def callback(attr, old, new):
    # Depending on what circle i've clicked i start a SQL request
    # to gain my dataset i want to plot and the new title of the diagram.
    # To change the title actually works:
    line_plot.title.text = 'new_title'
    # "source_new_values" is a ColumnDataSource created out of a
    # SQL-request of my database.
    # To change the current source doesn't work. The line-plot is still
    # showing the old dataset.
    source_current_values = source_new_values

scatter_plot = figure(x_axis_label='lat', y_axis_label='lon')
scatter_plot.circle(x='long', y='lat', source=source_coordinates)
# I use the indices to identify what circle was clicked.
source_coordinates.selected.on_change('indices', callback)
line_plot = figure(x_axis_label='time', x_axis_type='datetime',
                   y_axis_label='values', title='title')
line_plot.line(x='date', y='value', source=source_current_values)
The solution to that problem: I cannot update the source by assigning a new ColumnDataSource to it, but I can by assigning a dictionary to its data attribute:
source_current_values.data = dict(source_new_values.data)
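A minimal sketch of the difference, assuming the column names from the question. `fetch_values` is a hypothetical stand-in for the SQL request, and plain integers replace the datetime values to keep the example short.

```python
from bokeh.models import ColumnDataSource

# The source the line glyph was created with (column names as in the question).
source_current_values = ColumnDataSource(data={"date": [1, 2], "value": [10, 20]})


def fetch_values(index):
    """Hypothetical stand-in for the SQL request; returns plain column lists."""
    return {"date": [3, 4, 5], "value": [30 + index, 40 + index, 50 + index]}


def callback(attr, old, new):
    if not new:  # nothing is selected
        return
    # Replace the data of the existing source in one assignment.
    # Rebinding the name (source_current_values = ...) would only create a
    # local variable and leave the source used by the glyph untouched.
    source_current_values.data = fetch_values(new[0])
```

The glyph keeps its reference to the original source object, so only changes made to that object, such as assigning to `.data`, reach the plot.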

How to change the extent and position of an existing image in bokeh?

I am displaying 2d data as images of varying shapes in a bokeh server, and therefore need to dynamically update not only the image's data source, but also its dw, dh, x, and y properties. In the dummy example below, these changes are made in a callback function which is connected to a Button widget.
I've figured out that I need to access the glyph attribute of the image's GlyphRenderer object, and I can do so through its update() method (see code). But the changes don't take effect until I click the toolbar's Reset button. I've noticed that the changes also mysteriously take effect the second time I activate the callback() function. What is the proper way to make these changes?
import bokeh.plotting
import bokeh.models
import bokeh.layouts
import numpy as np
# set up the interface
fig1 = bokeh.plotting.figure(x_range=(0, 10), y_range=(0, 10))
im1 = fig1.image([], dw=5, dh=5)
button = bokeh.models.Button(label='scramble')
# add everything to the document
bokeh.plotting.curdoc().add_root(bokeh.layouts.column(button, fig1))
# define a callback and connect it
def callback():
    # this always works:
    im1.data_source.data = {'image': [np.random.random((100,100))]}
    # these changes only take effect after pressing the "Reset"
    # button, or after triggering this callback function twice:
    im1.glyph.update(x=1, y=1, dw=9, dh=9)
button.on_click(callback)
I don't immediately see why your code isn't working. I can suggest explicitly using a ColumnDataSource and linking all of the Image glyph properties to columns in that source. Then you should be able to update source.data in a single line and have all of the updates apply.
Here's some incomplete sample code to suggest how to do that:
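import numpy as np
from bokeh.models import Button, ColumnDataSource, Image
from bokeh.plotting import figure
# the plotting code
plot = figure()
source = ColumnDataSource(data=dict(image=[], x=[], y=[], dw=[], dh=[]))
# every glyph property refers to a column of the source by name
image = Image(image='image', x='x', y='y', dw='dw', dh='dh')
plot.add_glyph(source, glyph=image)
button = Button(label='scramble')  # as in the question
# the callback
def callback():
    # one assignment updates the image and its position and extent together
    source.data = {'image': [np.random.random((100,100))], 'x':[1], 'y':[1], 'dw':[9], 'dh':[9]}
button.on_click(callback)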
