Format NaN value with CustomJSHover bokeh plot - bokeh

I would like to hide the NaN values (created with numpy.nan) of a matrix when plotting with the Bokeh backend. I tried using CustomJSHover, but I could not get anywhere because I got errors even for a simple example.
code:
import numpy as np
import xarray as xr
import holoviews as hv
import geoviews as gv
hv.extension('bokeh','matplotlib')
import cartopy.crs as crs
from bokeh.models import HoverTool, CustomJSHover
x,y = np.mgrid[-50:51, -50:51] * 0.1
r = 0.5*np.sin(np.pi +3*x**2+y**2)+0.5
r[r<0.5]=np.nan
coords=np.arange(0,101)
custom=CustomJSHover(code="""
return value + " tot"
""")
tooltips=[
("value","@image{1.1}"), # GIVES RIGHT VALUE BUT WITH NaNs (0.1..0.6.. etc)
("value","@image{custom}") # GIVES some strange 0th, 1st, ... or NaN
]
hover = HoverTool(tooltips=tooltips, formatters={'image' : custom})
ds = xr.Dataset({'R': (['x', 'y'],r)},coords={'x': (['x'], coords),'y': (['y'], coords)})
ensemble = gv.Image(ds, kdims=['x', 'y'],vdims=[ 'R']).opts(tools=[hover])
ensemble
I would like the values to be shown in {2.1} format, and the NaN values to be white and not shown at all in the hover.
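A minimal sketch of one possible direction, assuming the tooltip field is written as "@image" and a reasonably recent Bokeh release: the key in formatters has to match the field including its "@" prefix, otherwise "{custom}" is not routed to the CustomJSHover. The name nan_aware and the JavaScript body are illustrative, not taken from the question. This only blanks the text for NaN cells; it does not remove the tooltip itself.

```python
from bokeh.models import CustomJSHover, HoverTool

# Hypothetical formatter: one decimal place for numbers, empty text for NaN.
nan_aware = CustomJSHover(code="""
    return Number.isNaN(value) ? "" : value.toFixed(1)
""")

hover = HoverTool(
    tooltips=[("value", "@image{custom}")],
    # The key carries the "@" prefix so it matches the tooltip field above.
    formatters={"@image": nan_aware},
)
```

The resulting hover object could then be passed to .opts(tools=[hover]) as in the question.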

Related

Multiple ValueErrors in using an LSTM model for predicting in Jupyter Notebook

I am currently trying out different models for predicting the price of Bitcoin (daily, from January 1, 2021 to December 31, 2022) for a college project, and I came across a model with source code included on a site. However, no matter how closely I follow the code, it seems to fail at the part that computes the mean squared error and mean absolute percentage error. The site's source code may have left out a function, or there is an error in earlier cells that I fail to see.
The full code is as follows, run in Jupyter Notebook:
Cell 1:
import pandas as pd
stock_data = pd.read_csv('./BTC-USD-Daily.csv',index_col='Date')
stock_data.head()
Cell 2
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import datetime as dt
plt.figure(figsize=(15,10))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=60))
x_dates = [dt.datetime.strptime(d, '%Y-%m-%d').date() for d in stock_data.index.values]
plt.plot(x_dates, stock_data['High'], label='High')
plt.plot(x_dates, stock_data['Low'], label='Low')
plt.xlabel('Time Scale')
plt.ylabel('Scaled US Dollars')
plt.legend()
plt.gcf().autofmt_xdate()
plt.show()
Cell 3
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import *
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
Cell 4
target_y = stock_data['Close']
X_feat = stock_data.iloc[:,0:3]
Cell 5
sc = StandardScaler()
X_ft = sc.fit_transform(X_feat.values)
X_ft = pd.DataFrame(columns=X_feat.columns,
data=X_ft,
index=X_feat.index)
Cell 6
def lstm_split(data, n_steps):
    X, y = [], []
    for i in range(len(data)-n_steps+1):
        X.append(data[i:i + n_steps, :-1])
        y.append(data[i + n_steps-1, -1])
    return np.array(X), np.array(y)
Cell 7
X1, y1 = lstm_split(X_ft.values, n_steps=2)
train_split=0.8
split_idx = int(float(np.ceil(len(X1)*train_split)))
date_index = X_ft.index
X_train, X_test = X1[:split_idx], X1[split_idx:]
y_train, y_test = y1[:split_idx], y1[split_idx:]
X_train_date, X_test_date = date_index[:split_idx], date_index[split_idx:]
print(X1.shape, X_train.shape, X_test.shape, y_test.shape)
Cell 8
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(X_train.shape[1], X_train.shape[2]),
activation='relu', return_sequences=True))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')
lstm.summary()
Cell 9
history=lstm.fit(X_train, y_train,
epochs=100, batch_size=4,
verbose=2, shuffle=False)
Cell 10
y_pred = lstm.predict(X_test)
Cell 11
mse = mean_squared_error(y_test, y_pred, squared=False)
mape = mean_absolute_percentage_error(y_test, y_pred)
print("MSE: ",mse)
print("MAPE: ", mape)
Cell 11 is supposed to return the MSE and MAPE values for y_test and y_pred. Instead, it raises ValueErrors, for example at "mse = mean_squared_error(y_test, y_pred, squared=False)", or a ValueError "Found array with dim 3. Estimator expected <= 2.", while the site's source code shows the expected output.
The code is copied from: https://www.projectpro.io/article/stock-price-prediction-using-machine-learning-project/571#:~:text=The%20idea%20is%20to%20weigh,to%20predict%20future%20stock%20prices.
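One likely cause, sketched here with NumPy stand-ins rather than the real model: with return_sequences=True in Cell 8, the LSTM emits one output per time step, so y_pred would have shape (n_samples, n_steps, 1) while y_test is one-dimensional. The array sizes and values below are hypothetical. Note also that squared=False makes mean_squared_error return the root mean squared error, not the MSE.

```python
import numpy as np

# Hypothetical stand-ins for the arrays from Cells 10 and 11.
n_samples, n_steps = 5, 2
y_test = np.arange(n_samples, dtype=float)      # shape (5,)
y_pred = np.zeros((n_samples, n_steps, 1))      # shape (5, 2, 1): one output per time step
y_pred[:, -1, 0] = y_test + 0.5                 # pretend each last-step prediction is off by 0.5

# Keep only the prediction for the final time step of each window.
y_pred_last = y_pred[:, -1, 0]                  # shape (5,)

# Root mean squared error computed by hand on the now matching 1-D arrays.
rmse = float(np.sqrt(np.mean((y_test - y_pred_last) ** 2)))
print(y_pred.shape, y_pred_last.shape, rmse)
```

An alternative under the same assumption would be to build the LSTM layer with return_sequences=False so the model produces a single value per sample.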

How to show or hide a graph line in Bokeh

How do we toggle a line on and off (hide or show) in Bokeh? The example figure below does not update.
from bokeh.io import output_file, show
from bokeh.layouts import row
from bokeh.plotting import figure
from bokeh.models import CheckboxGroup, CustomJS
output_file("toggle_lines.html")
### Main plot
plot = figure()
# Dummy data for testing
x = list(range(25))
y0 = [ 3**a for a in x]
l0 = plot.line(x, y0, color='blue')
l0.visible = False
checkbox = CheckboxGroup(labels=["l0"], active=[1])
checkbox.js_on_click(CustomJS(args=dict(l0=l0), code="""l0.visible = 0 in checkbox.active;"""))
layout = row(checkbox, plot)
show(layout)
Thank you,
Two main things:
Your active argument should either be [0], indicating that the 0th checkbox in the group is active, or be omitted, in which case all checkboxes default to inactive. By passing [1] you're telling Bokeh there are actually 2 checkboxes and the second one is active, which leads to errors.
You'll need to pass your checkbox object into the JavaScript code via the args in your callback (you've already done this with the line; you just need to include the checkbox group as well).
This code worked for me:
from bokeh.io import output_file, show
from bokeh.layouts import row
from bokeh.plotting import figure
from bokeh.models import CheckboxGroup, CustomJS
output_file("toggle_lines.html")
### Main plot
plot = figure(x_range=(0, 25))
# Dummy data for testing
x = list(range(25))
y0 = [ 3**a for a in x]
l0 = plot.line(x, y0, color='blue')
l0.visible = False
checkbox = CheckboxGroup(labels=["l0"])
checkbox.js_on_click(CustomJS(args=dict(l0=l0, checkbox=checkbox), code="""l0.visible = 0 in checkbox.active;"""))
layout = row(checkbox, plot)
show(layout)
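A side note on the JavaScript expression, offered as a hedged variation rather than a correction: in JavaScript, "0 in checkbox.active" tests whether index 0 exists in the array, not whether the value 0 is in it. With a single checkbox the two coincide, but with several checkboxes they can differ. The sketch below uses includes(0) instead and registers the callback with js_on_change; the names line and toggle are illustrative.

```python
from bokeh.models import CheckboxGroup, CustomJS
from bokeh.plotting import figure

plot = figure()
line = plot.line([0, 1, 2], [1, 3, 9], color="blue")
line.visible = False  # start hidden, matching the unchecked box

checkbox = CheckboxGroup(labels=["l0"], active=[])
toggle = CustomJS(
    args=dict(line=line, checkbox=checkbox),
    # includes(0) checks for the value 0, i.e. "the first box is ticked".
    code="line.visible = checkbox.active.includes(0)",
)
checkbox.js_on_change("active", toggle)
```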

How to use the format parameter of sliders?

Sliders have a format property; see
https://docs.bokeh.org/en/latest/docs/reference/models/widgets.sliders.html
A) Where is the documentation for this property?
B) Is there an example of using the format attribute?
EDIT: is there a way to pass a function that takes the slider value and returns a string?
The formatting documentation can be found on this page, with multiple examples. The slider's value can be read via slider.value.
I also edited an example where I added a formatter for the amplitude slider. The slider values in this example are used to change the sine wave.
You can run this example by using this command: bokeh serve script.py --show
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import row, column
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider, TextInput
from bokeh.plotting import figure
# Set up data
N = 200
x = np.linspace(0, 4*np.pi, N)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))
# Set up plot
plot = figure(plot_height=400, plot_width=400, title="my sine wave",
tools="crosshair,pan,reset,save,wheel_zoom",
x_range=[0, 4*np.pi], y_range=[-2.5, 2.5])
plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)
# Set up widgets
text = TextInput(title="title", value='my sine wave')
offset = Slider(title="offset", value=0.0, start=-5.0, end=5.0, step=0.1)
amplitude = Slider(title="amplitude", value=1.0, start=-5.0, end=5.0, step=0.0000001, format='0.000f') #Slider with different formatting
phase = Slider(title="phase", value=0.0, start=0.0, end=2*np.pi)
freq = Slider(title="frequency", value=1.0, start=0.1, end=5.1, step=0.1)
# Set up callbacks
def update_title(attrname, old, new):
    plot.title.text = text.value
text.on_change('value', update_title)
def update_data(attrname, old, new):
    # Get the current slider values
    a = amplitude.value
    b = offset.value
    w = phase.value
    k = freq.value
    # Generate the new curve
    x = np.linspace(0, 4*np.pi, N)
    y = a*np.sin(k*x + w) + b
    source.data = dict(x=x, y=y)
for w in [offset, amplitude, phase, freq]:
    w.on_change('value', update_data)
# Set up layouts and add to document
inputs = column(text, offset, amplitude, phase, freq)
curdoc().add_root(row(inputs, plot, width=800))
curdoc().title = "Sliders"

Bokeh: Automatically refreshing bokeh plots

I'm trying out an example Bokeh Application (in 'single module format') for generating a chart from a dataset. In the given example, the user on the web page can click on a button and the chart will update with the latest data. I am trying to figure out how I can achieve this same behavior without requiring the user to click on the button. That is, I would like the chart to automatically update/refresh/reload at a specified interval without the need for user interaction. Ideally, I would only have to change something in myapp.py to accomplish this.
The Bokeh version is 0.12.0.
Demo code copied here for convenience:
# myapp.py
import numpy as np
from bokeh.layouts import column
from bokeh.models import Button
from bokeh.palettes import RdYlBu3
from bokeh.plotting import figure, curdoc
# create a plot and style its properties
p = figure(x_range=(0, 100), y_range=(0, 100), toolbar_location=None)
p.border_fill_color = 'black'
p.background_fill_color = 'black'
p.outline_line_color = None
p.grid.grid_line_color = None
# add a text renderer to our plot (no data yet)
r = p.text(x=[], y=[], text=[], text_color=[], text_font_size="20pt",
           text_baseline="middle", text_align="center")
i = 0
ds = r.data_source
# create a callback that will add a number in a random location
def callback():
    global i
    ds.data['x'].append(np.random.random()*70 + 15)
    ds.data['y'].append(np.random.random()*70 + 15)
    ds.data['text_color'].append(RdYlBu3[i%3])
    ds.data['text'].append(str(i))
    ds.trigger('data', ds.data, ds.data)
    i = i + 1
# add a button widget and configure with the callback
button = Button(label="Press Me")
button.on_click(callback)
# put the button and plot in a layout and add to the document
curdoc().add_root(column(button, p))
Turns out there's a method in the Document object:
add_periodic_callback(callback, period_milliseconds)
Not sure why this isn't mentioned outside of the API...
Yeah, add_periodic_callback():
import numpy as np
from bokeh.layouts import column
from bokeh.models import Button
from bokeh.palettes import RdYlBu3
from bokeh.plotting import figure, curdoc
p = figure(x_range=(0, 100), y_range=(0, 100), toolbar_location=None)
p.border_fill_color = 'black'
p.background_fill_color = 'black'
p.outline_line_color = None
p.grid.grid_line_color = None
r = p.text(x=[], y=[], text=[], text_color=[], text_font_size="20pt",
           text_baseline="middle", text_align="center")
i = 0
ds = r.data_source
def callback():
    global i
    ds.data['x'].append(np.random.random()*70 + 15)
    ds.data['y'].append(np.random.random()*70 + 15)
    ds.data['text_color'].append(RdYlBu3[i%3])
    ds.data['text'].append(str(i))
    ds.trigger('data', ds.data, ds.data)
    i = i + 1
curdoc().add_root(column(p))
curdoc().add_periodic_callback(callback, 1000)

Plotting with date times and matplotlib

So, I'm using a function from this website to try to make stick plots of some netCDF4 data. An excerpt of my code is below. I got my data from here.
The stick_plot(time,u,v) function is EXACTLY as it appears on the website I linked, which is why I did not include a copy of that function below.
When I run my code I get the following error. Any idea on how to get around this?
AttributeError: 'numpy.float64' object has no attribute 'toordinal'
The description of time from the netCDF4 file:
<type 'netCDF4._netCDF4.Variable'>
float64 time(time)
long_name: time
standard_name: time
units: days since 1900-01-01 00:00:00Z
axis: T
ancillary_variables: time_quality_flag
data_min: 2447443.375
data_max: 2448005.16667
unlimited dimensions:
current shape = (13484,)
filling off
Here is an excerpt of my code:
imports:
import matplotlib.pyplot as plot
import numpy as np
from netCDF4 import Dataset
import os
from matplotlib.dates import date2num
from datetime import datetime
trying to generate the plots:
path = '/Users/Kyle/Documents/Summer_Research/east_coast_currents/'
currents = [x for x in os.listdir('%s' %(path)) if '.DS' not in x]
for datum in currents:
    working_data = Dataset('%s' %(path+datum), 'r', format = 'NETCDF4')
    u = working_data.variables['u'][:][:100]
    v = working_data.variables['v'][:][:100]
    time = working_data.variables['time'][:][:100]
    q = stick_plot(time,u,v)
    ref = 1
    qk = plot.quiverkey(q, 0.1, 0.85, ref,
                        "%s N m$^{-2}$" % ref,
                        labelpos='N', coordinates='axes')
    _ = plot.xticks(rotation=70)
Joe Kington answered my question. The netCDF4 file read the times in as a datetime object. All I had to do was replace date2num(time) with time which fixed everything.
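For context, the AttributeError indicates that something called toordinal() on plain floats, which only datetime-like objects provide. If the time values were numeric offsets rather than datetime objects, one alternative would be to convert them first. The sketch below uses only the standard library; the epoch follows the "days since 1900-01-01" units shown above, and the sample offsets are hypothetical.

```python
from datetime import datetime, timedelta

# Reference date taken from the "days since 1900-01-01" units attribute.
epoch = datetime(1900, 1, 1)

# Hypothetical offsets in days (not values from the actual file).
day_offsets = [0.0, 1.5, 366.25]

# Each float becomes a datetime, which does have a toordinal() method.
times = [epoch + timedelta(days=d) for d in day_offsets]
ordinals = [t.toordinal() for t in times]
print(times)
```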
