Bokeh: How to fix x-axis in time - datetime

I have produced a time series scatter plot in Bokeh, which updates when a user interactively selects a new time series. However, I want to fix the x-axis between 00:00 and 23:59 for comparison (Bokeh tries to guess the appropriate x-range).
Below is a random snippet of data. In this code, how do I fix the x_range without it changing the scale to microseconds?
import pandas as pd
from bokeh.io import push_notebook, show, output_notebook
from bokeh.plotting import figure
from bokeh.models import Range1d
output_notebook()
data = {'2015-08-20 13:39:46': [-0.02813796, 0],
'2015-08-28 12:6:5': [ 1.32426938, 1],
'2015-08-28 13:42:59': [-0.16289655, 1],
'2015-12-14 16:19:44': [ 2.30476287, 1],
'2016-02-01 17:8:32': [ 0.41165004, 0],
'2016-02-09 11:26:33': [-0.65023149, 0],
'2016-04-08 17:57:47': [ 0.09335096, 1],
'2016-04-27 19:2:15': [ 1.43917208, 0]}
test = pd.DataFrame(data=data).T
test.columns = ["activity","objectID"]
test.index = pd.to_datetime(test.index)
p = figure(plot_width=500, plot_height=250, x_axis_label='X', y_axis_label='Y', x_axis_type="datetime")# x_range = Range1d(# dont know what to put here))
r = p.circle(x=test.index.time, y=test["activity"])
show(p, notebook_handle=True);

I've found a (scrappy) solution for this but it doesn't fix the axes sizes entirely since the size of the axes seems to be dependent on other axes properties such as the length of the y-tick labels.
# a method for setting constant x-axis in hours for bokeh:
day_x_axis = pd.DataFrame(data=[0,0], index=['2015-07-28 23:59:00', '2015-08-28 00:01:00'], columns=["activity"])
day_x_axis.index = pd.to_datetime(day_x_axis.index)
new_time_series = pd.concat((old_time_series, day_x_axis), axis=0) # this will set all other columns you had to NaN.
I fixed my axes entirely by also setting the y_range property when instantiating the figure object.
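An alternative to padding the data, sketched here under the assumption that only the time of day matters for the comparison: move every timestamp onto one reference day and give the figure an explicit range covering that day. The helper names and the reference date are illustrative and not part of Bokeh. `make_figure` imports Bokeh lazily so the two helpers can be checked without it.

```python
import datetime as dt

# Arbitrary reference date (an assumption): only the time of day is plotted.
REFERENCE_DAY = dt.date(2000, 1, 1)


def to_reference_day(timestamps):
    """Move each timestamp onto REFERENCE_DAY, keeping its time of day."""
    return [dt.datetime.combine(REFERENCE_DAY, ts.time()) for ts in timestamps]


def full_day_range():
    """Return (start, end) datetimes spanning the whole reference day."""
    start = dt.datetime.combine(REFERENCE_DAY, dt.time(0, 0))
    return start, start + dt.timedelta(days=1)


def make_figure(timestamps, values):
    """Build a scatter plot whose x-axis always spans one full day."""
    # Imported here so the helpers above work without Bokeh installed.
    from bokeh.models import Range1d
    from bokeh.plotting import figure

    start, end = full_day_range()
    p = figure(x_axis_type="datetime", x_range=Range1d(start=start, end=end))
    p.scatter(x=to_reference_day(timestamps), y=values)
    return p
```

With the data from the question this would be called as `make_figure(test.index, test["activity"])`. Because the range is given as datetimes on a datetime axis, the tick labels should stay in hours rather than falling back to a sub-second scale.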

Related

Simple way to place a label at the top corner of bokeh streaming plots as a python oscilloscope

I want to place a label at the top left corner of each streaming plot, be it one plot, or two plots, etc. The plots are stretched in both directions. For now, I have to manually specify a y position depending on how many plots are shown (y=200 for two plots, and y=440 for one plot). One may resolve it by recording the total range of y values shown in the plot, but it feels too hacky. I'm wondering if there is a simple way to do this. Thanks for any help.
from bokeh.server.server import Server
from bokeh.models import ColumnDataSource, Label
from bokeh.plotting import figure
from bokeh.layouts import column
import numpy as np
import datetime as dt
from functools import partial
import time
def f_random():
    data = np.random.rand()
    data = (dt.datetime.now(), data)
    return data
def f_sinewave():
    data = np.sin(time.time()/1.)
    data = (dt.datetime.now(), data)
    return data
def make_document(doc, functions, labels):
    def update():
        for index, func in enumerate(functions):
            data = func()
            sources[index].stream(new_data=dict(time=[data[0]], data=[data[1]]), rollover=1000)
            annotations[index].text = f'{data[1]: .3f}'
    sources = [ColumnDataSource(dict(time=[], data=[])) for _ in range(len(functions))]
    figs = []
    annotations = []
    for i in range(len(functions)):
        figs.append(figure(x_axis_type='datetime', plot_width=800, plot_height=400, y_axis_label=labels[i]))
        figs[i].line(x='time', y='data', source=sources[i])
        annotations.append(Label(x=10, y=200, text='', text_font_size='20px', text_color='black',
                                 x_units='screen', y_units='screen', background_fill_color='white'))
        figs[i].add_layout(annotations[i])
    doc.add_root(column([fig for fig in figs], sizing_mode='stretch_both'))
    doc.add_periodic_callback(callback=update, period_milliseconds=100)
if __name__ == '__main__':
    # list of functions and labels to feed into the scope
    functions = [f_random, f_sinewave]
    labels = ['random', 'sinewave']
    server = Server({'/': partial(make_document, functions=functions, labels=labels)})
    server.start()
    server.io_loop.add_callback(server.show, "/")
    try:
        server.io_loop.start()
    except KeyboardInterrupt:
        print('keyboard interruption')
For now you could do:
Label(x=10, y=figs[i].plot_height-30, ...)
It seems like allowing negative values to implicitly position against the "opposite" side would be a nice feature (and a good first task for new contributors), so I would encourage you to file a GitHub issue about it.

matplotlib bar plot add legend from categories dataframe column

I am trying to add a legend which should, according to my example, show a red square with the word fruit and a green square with the word veggie.
I tried several things (the example below is just one of many attempts), but I can't get it to work.
Can someone tell me how to solve this problem?
import pandas as pd
from matplotlib import pyplot as plt
data = [['apple', 'fruit', 10], ['nanaba', 'fruit', 15], ['salat','veggie', 144]]
data = pd.DataFrame(data, columns = ['Object', 'Type', 'Value'])
colors = {'fruit':'red', 'veggie':'green'}
c = data['Type'].apply(lambda x: colors[x])
bars = plt.bar(data['Object'], data['Value'], color=c, label=colors)
plt.legend()
The usual way to create a legend for objects which are not in the axes is to create proxy artists, as shown in the matplotlib legend guide.
Here:
colors = {'fruit':'red', 'veggie':'green'}
labels = list(colors.keys())
handles = [plt.Rectangle((0,0),1,1, color=colors[label]) for label in labels]
plt.legend(handles, labels)
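The same proxy-artist idea can be sketched with `matplotlib.patches.Patch`, restricted to the categories that actually occur in the data. The function names below are illustrative. `plot_with_legend` imports matplotlib lazily so the small helper can be checked on its own.

```python
def legend_entries(types, colors):
    """Return (label, color) pairs for the categories present in `types`,
    in order of first appearance."""
    seen = []
    for t in types:
        if t not in seen:
            seen.append(t)
    return [(t, colors[t]) for t in seen]


def plot_with_legend(objects, types, values, colors):
    """Draw a bar chart coloured by category, with one legend entry per category."""
    # Imported here so legend_entries works without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.patches import Patch

    fig, ax = plt.subplots()
    ax.bar(objects, values, color=[colors[t] for t in types])
    # Proxy artists: patches that are never drawn on the axes, only in the legend.
    handles = [Patch(color=c, label=t) for t, c in legend_entries(types, colors)]
    ax.legend(handles=handles)
    return fig, ax
```

For the DataFrame in the question, a call could look like `plot_with_legend(data['Object'], data['Type'], data['Value'], colors)` followed by `plt.show()`.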
So this is a hacky solution and I'm sure there are probably better ways to do this. What you can do is plot individual bar plots that are invisible using width=0 with the original plot colors and specify the labels. You will have to do this in a subplot though.
import pandas as pd
from matplotlib import pyplot as plt
data = [['apple', 'fruit', 10], ['nanaba', 'fruit', 15], ['salat','veggie', 144]]
data = pd.DataFrame(data, columns = ['Object', 'Type', 'Value'])
colors = {'fruit':'red', 'veggie':'green'}
c = data['Type'].apply(lambda x: colors[x])
ax = plt.subplot(111) #specify a subplot
bars = ax.bar(data['Object'], data['Value'], color=c) #Plot data on subplot axis
for i, j in colors.items(): #Loop over color dictionary
    ax.bar(data['Object'], data['Value'],width=0,color=j,label=i) #Plot invisible bar graph but have the legends specified
ax.legend()
plt.show()

Custom legend labels - geopandas.plot()

A colleague and I have been trying to set custom legend labels, but so far have failed. Code and details below - any ideas much appreciated!
Notebook: toy example uploaded here
Goal: change default rate values used in the legend to corresponding percentage values
Problem: cannot figure out how to access the legend object or pass legend_kwds to geopandas.GeoDataFrame.plot()
Data: KCMO metro area counties
Excerpts from toy example
Step 1: read data
# imports
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
# read data
gdf = gpd.read_file('kcmo_counties.geojson')
Option 1 - get legend from ax as suggested here:
ax = gdf.plot('val', legend=True)
leg = ax.get_legend()
print('legend object type: ' + str(type(leg))) # <class NoneType>
plt.show()
Option 2: pass legend_kwds dictionary - I assume I'm doing something wrong here (and clearly don't fully understand the underlying details), but the __doc__ from Geopandas's plotting.py - for which GeoDataFrame.plot() is simply a wrapper - does not appear to come through...
# create number of tick marks in legend and set location to display them
import numpy as np
numpoints = 5
leg_ticks = np.linspace(-1,1,numpoints)
# create labels based on number of tickmarks
leg_min = gdf['val'].min()
leg_max = gdf['val'].max()
leg_tick_labels = [str(round(x*100,1))+'%' for x in np.linspace(leg_min,leg_max,numpoints)]
leg_kwds_dict = {'numpoints': numpoints, 'labels': leg_tick_labels}
# error "Unknown property legend_kwds" when attempting it:
f, ax = plt.subplots(1, figsize=(6,6))
gdf.plot('val', legend=True, ax=ax, legend_kwds=leg_kwds_dict)
UPDATE
Just came across this conversation on adding in legend_kwds - and this other bug? which clearly states legend_kwds was not in most recent release of GeoPandas (v0.3.0). Presumably, that means we'll need to compile from the GitHub master source rather than installing with pip/conda...
I've just come across this issue myself. After following your link to the Geopandas source code, it appears that the colourbar is added as a second axis to the figure, so you have to do something like this to access the colourbar labels (assuming you have plotted a choropleth with legend=True):
# Get colourbar from second axis
colourbar = ax.get_figure().get_axes()[1]
Having done this, you can manipulate the labels like this:
# Get numerical values of yticks, assuming a linear range between vmin and vmax:
yticks = np.interp(colourbar.get_yticks(), [0,1], [vmin, vmax])
# Apply some function f to each tick, where f can be your percentage conversion
colourbar.set_yticklabels(['{0:.2f}%'.format(ytick*100) for ytick in yticks])
This can be done by passing key-value pairs to dictionary argument legend_kwds:
gdf.plot(column='col1', cmap='Blues', alpha=0.5, legend=True, legend_kwds={'label': 'FOO', 'shrink': 0.5}, ax=ax)
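For the percentage labels the question asks for, here is a minimal sketch of both routes. `percent_tick_labels` mirrors the interpolation step above in plain Python. `plot_with_percent_colourbar` assumes a GeoPandas release that accepts `legend_kwds` and forwards them to the matplotlib colourbar, whose `format` argument takes a formatter. Both function names are illustrative.

```python
def percent_tick_labels(ticks, vmin, vmax, decimals=1):
    """Map tick positions in [0, 1] linearly onto [vmin, vmax] and
    format the resulting rates as percentages."""
    values = [vmin + t * (vmax - vmin) for t in ticks]
    return [f"{v * 100:.{decimals}f}%" for v in values]


def plot_with_percent_colourbar(gdf, column):
    """Plot `column` with a colourbar labelled in percent.

    Assumption: legend_kwds is supported and passed through to the colourbar.
    """
    # Imported here so percent_tick_labels works without matplotlib installed.
    from matplotlib.ticker import PercentFormatter

    # xmax=1.0 means a data value of 1.0 is displayed as 100%.
    return gdf.plot(column=column, legend=True,
                    legend_kwds={"format": PercentFormatter(xmax=1.0)})
```

The first helper fits the `set_yticklabels` approach, for example `colourbar.set_yticklabels(percent_tick_labels(colourbar.get_yticks(), vmin, vmax))`, when the colourbar ticks are normalised to the 0 to 1 range as assumed above.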

Smoothen heatmap in plotly

I wanted to create a heatmap of a probability density matrix using plotly.
import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, plot
import plotly.graph_objs as go
probability_matrix = np.loadtxt("/path/to/file")
trace = go.Heatmap(z = probability_matrix)
data=[trace]
plot(data, filename='basic-heatmap')
This gives me an image like this:
I want to smoothen the color of the squares so that the transition between adjacent squares in the image are somewhat "smoother". I was wondering if there is a way of doing that, without manually resizing the matrix using interpolation.
You can use the zsmooth argument which can take three values ('fast', 'best', or False). For example:
from plotly.offline import iplot  # notebook counterpart of plot()
data = [go.Heatmap(z=[[1, 20, 30],
                      [20, 1, 60],
                      [30, 60, 1]],
                   zsmooth='best')]
iplot(data)
Will give you the following smooth heatmap:

How to add permanent name labels (not interactive ones) on nodes for a networkx graph in bokeh?

I am trying to add permanent labels on the nodes of a networkx graph using spring_layout and the Bokeh library. I would like these labels to be repositioned as the graph is scaled or refreshed, the same way the spring layout repositions the nodes.
I tried creating the graph and layout, then getting pos from spring_layout. Calling pos=nx.spring_layout(G) generates a set of positions for the nodes in graph G, whose coordinates I can put into the LabelSet. However, I have to call graph = from_networkx(G, spring_layout, scale=2, center=(0,0)) to draw the network graph, which creates a new set of positions for the nodes. Therefore, the positions of the nodes and the labels will not be the same.
How can I fix this issue?
Thanks for asking this question. Working through it, I've realized that it is currently more work than it should be. I'd very strongly encourage you to open a GitHub issue so that we can discuss what improvements can best make this kind of thing easier for users.
Here is a complete example:
import networkx as nx
from bokeh.io import output_file, show
from bokeh.models import CustomJSTransform, LabelSet
from bokeh.models.graphs import from_networkx
from bokeh.plotting import figure
G=nx.karate_club_graph()
p = figure(x_range=(-3,3), y_range=(-3,3))
p.grid.grid_line_color = None
r = from_networkx(G, nx.spring_layout, scale=3, center=(0,0))
r.node_renderer.glyph.size=15
r.edge_renderer.glyph.line_alpha=0.2
p.renderers.append(r)
So far this is all fairly normal Bokeh graph layout code. Here is the additional part you need to add permanent labels for each node:
from bokeh.transform import transform
# add the labels to the node renderer data source
source = r.node_renderer.data_source
source.data['names'] = [str(x*10) for x in source.data['index']]
# create a transform that can extract the actual x,y positions
code = """
    var result = new Float64Array(xs.length)
    for (var i = 0; i < xs.length; i++) {
        result[i] = provider.graph_layout[xs[i]][%s]
    }
    return result
"""
xcoord = CustomJSTransform(v_func=code % "0", args=dict(provider=r.layout_provider))
ycoord = CustomJSTransform(v_func=code % "1", args=dict(provider=r.layout_provider))
# Use the transforms to supply coords to a LabelSet
labels = LabelSet(x=transform('index', xcoord),
                  y=transform('index', ycoord),
                  text='names', text_font_size="12px",
                  x_offset=5, y_offset=5,
                  source=source, render_mode='canvas')
p.add_layout(labels)
show(p)
Basically, since Bokeh (potentially) computes layouts in the browser, the actual node locations are only available via the "layout provider" which is currently a bit tedious to access. As I said, please open a GitHub issue to suggest making this better for users. There are probably some very quick and easy things we can do to make this much simpler for users.
The code above results in:
A similar solution to @bigreddot's.
#Libraries for this solution
import networkx as nx
import pandas as pd
from bokeh.plotting import figure, ColumnDataSource
from bokeh.models import LabelSet
from bokeh.models.graphs import from_networkx
#Remove randomness
import numpy as np
np.random.seed(1337)
#Load positions
pos = nx.spring_layout(G)
#Dict to df
labels_df = pd.DataFrame.from_dict(pos).T
#Reset index + column names
labels_df = labels_df.reset_index()
labels_df.columns = ["names", "x", "y"]
graph_renderer = from_networkx(G, pos, center=(0,0))
.
.
.
plot.renderers.append(graph_renderer)
#Set labels
labels = LabelSet(x='x', y='y', text='names', source=ColumnDataSource(labels_df))
#Add labels
plot.add_layout(labels)
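The point of this approach is that one `pos` dictionary feeds both the graph renderer and the labels, so the two cannot drift apart. Below is a minimal sketch of the conversion step without pandas. It assumes `pos` maps each node to an (x, y) pair, as `nx.spring_layout` returns. The function names are illustrative, and `make_label_set` imports Bokeh lazily.

```python
def label_columns(pos):
    """Turn a {node: (x, y)} layout into column lists for a data source."""
    names, xs, ys = [], [], []
    for node, (x, y) in pos.items():
        names.append(str(node))
        xs.append(float(x))
        ys.append(float(y))
    return {"names": names, "x": xs, "y": ys}


def make_label_set(pos):
    """Build a LabelSet positioned at the layout coordinates in `pos`."""
    # Imported here so label_columns works without Bokeh installed.
    from bokeh.models import ColumnDataSource, LabelSet

    source = ColumnDataSource(label_columns(pos))
    return LabelSet(x="x", y="y", text="names", source=source,
                    x_offset=5, y_offset=5)
```

Usage would be `plot.add_layout(make_label_set(pos))`, with the same `pos` passed to `from_networkx`.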
Fixed node positions
From the networkx.spring_layout() documentation: you can add a list of nodes with a fixed position as a parameter.
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(0,1),(1,2),(0,2),(1,3)])
pos = nx.spring_layout(g)
nx.draw(g,pos)
plt.show()
Then you can plot the nodes at a fixed position:
pos = nx.spring_layout(g, pos=pos, fixed=[0,1,2,3])
nx.draw(g,pos)
plt.show()
