Linking HoloViews plots with Bokeh customizations - bokeh

I'm struggling with some of the finer points of complex HoloViews plots, especially linking plots and customizing the appearance of fonts and data points.
Using the following code, I can create this plot that has most of the features I want, but am stumped by a few things:
I want one marginal for the whole set of plots linked to 'ewr' (with individual marginals for each of the other axes), ideally on the left of the set; but my attempts to get just one in my definitions of s1 and s2 haven't worked, and I can find nothing in the documentation about moving a marginal to the left (or bottom for that matter).
I want to be able to define tooltips that use columns from my data that are not displayed in the plots. I can see one way of accomplishing this as shown in the commented alternate definition for s1, but that unlinks the plot it creates from the others. How do I create linked plots that have tooltips with elements not in those plots?
For reference, the data used is available here (converted in the code below to a Pandas dataframe, df).
import holoviews as hv
from holoviews import dim, opts
hv.extension('bokeh')
renderer = hv.renderer('bokeh')
from bokeh.models import HoverTool
from holoviews.plotting.links import DataLink
TOOLS="crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,undo,redo,reset,tap,save,box_select,poly_select,lasso_select".split(",")
ht = HoverTool(
    tooltips=[('Name', '@{name}'), ('EWR', '@{ewr}{%0.2f}'), ('Win Rate', '@{winrate}{%d}')],
    formatters={'ewr' : 'printf', 'winrate' : 'printf'})
point_opts = opts.Scatter(fill_color='black', fill_alpha=0.1, line_width=1, line_color='gray', size=5, tools=TOOLS+[ht])
hist_opts = opts.Histogram(fill_color='gray', fill_alpha=0.9, line_width=1, line_color='gray', tools=['box_select'], labelled=[None, None])
#s1 = hv.Scatter(df[['kfai','ewr','name','winrate']]).hist(num_bins=51, dimension='kfai')
s1 = hv.Scatter(df, 'kfai','ewr').hist(num_bins=51, dimension='kfai')
s2 = hv.Scatter(df, 'aerc', 'ewr').hist(num_bins=51, dimension=['aerc',None])
s3 = hv.Scatter(df, 'winrate', 'ewr').hist(num_bins=51, dimension=['winrate','ewr'])
p = (s1 + s2 + s3).opts(point_opts, hist_opts, opts.Layout(shared_axes=True, shared_datasource=True))
renderer.save(p, '_testHV')

Related

Vega-Lite - One plot for multiple datasets

I'm working on a package for Julia with the goal of doing quick plots using Vega-Lite as backend.
As people familiar with Matplotlib know, it is very common to have different sets of vectors and plot all of them in the same figure, each with its own label. For example:
import numpy as np
import matplotlib.pyplot as plt
x = range(0,10)
y = np.random.rand(10)
w = range(0,5)
z = np.random.rand(5)
plt.plot(x,y,label = 'y')
plt.plot(w,z,label = 'z')
plt.legend()
What I'd like to know is how I can do something similar using Vega-Lite (or Altair).
I know that I can make two separate plots and then layer one over the other. My problem is mainly how to get the legends to work, since to get a legend one usually needs an encoding such as "color" pointing to another field in the dataframe.
I've seen similar posts, but they deal with plotting data from different columns. The answer in that case is basically to use the Fold Transform. But for my question this doesn't quite work, because I'm more interested in starting from two different plots, possibly using two different datasets, so "merging" the datasets is not a good solution.
You can take advantage of the fact that in composite charts, Vega-Lite uses shared scales by default. If you assign the color, shape, strokeDash, etc. to a unique value for each layer, an appropriate legend will be generated automatically.
Here is an example, using Altair to generate the Vega-Lite specification:
import pandas as pd
import numpy as np
import altair as alt
x = np.linspace(0, 10)
df1 = pd.DataFrame({
    'x': x,
    'y': np.sin(x)
})
df2 = pd.DataFrame({
    'x': x,
    'y': np.cos(x)
})
chart1 = alt.Chart(df1).transform_calculate(
    label='"sine"'
).mark_line().encode(
    x='x',
    y='y',
    color='label:N'
)
chart2 = alt.Chart(df2).transform_calculate(
    label='"cosine"'
).mark_line().encode(
    x='x',
    y='y',
    color='label:N'
)
alt.layer(chart1, chart2)
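Since the question is about generating Vega-Lite from another language, it may help to see roughly what such a layered specification looks like as plain JSON. The following is a minimal sketch built with the Python standard library only; the helper name `line_layer` and the sample data are hypothetical, and the JSON that Altair actually emits may differ in detail. Each layer carries its own dataset and a calculated constant `label` field, and both layers encode color from that field, so the shared color scale should yield a single legend.

```python
import json
import math


def line_layer(values, label):
    """Build one Vega-Lite layer with its own inline data and a constant label field."""
    return {
        "data": {"values": values},
        # json.dumps(label) yields a quoted string literal, e.g. "sine",
        # which is a valid Vega expression for a constant string.
        "transform": [{"calculate": json.dumps(label), "as": "label"}],
        "mark": "line",
        "encoding": {
            "x": {"field": "x", "type": "quantitative"},
            "y": {"field": "y", "type": "quantitative"},
            "color": {"field": "label", "type": "nominal"},
        },
    }


# Hypothetical sample data: two independent datasets.
xs = [i * 0.5 for i in range(21)]
sine = [{"x": x, "y": math.sin(x)} for x in xs]
cosine = [{"x": x, "y": math.cos(x)} for x in xs]

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "layer": [line_layer(sine, "sine"), line_layer(cosine, "cosine")],
}

spec_json = json.dumps(spec, indent=2)
print(spec_json[:300])
```

The same dictionary structure can be produced from any language that can serialize JSON, which is the part that carries over to a Julia package.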

Custom legend labels - geopandas.plot()

A colleague and I have been trying to set custom legend labels, but so far have failed. Code and details below - any ideas much appreciated!
Notebook: toy example uploaded here
Goal: change default rate values used in the legend to corresponding percentage values
Problem: cannot figure out how to access the legend object or pass legend_kwds to geopandas.GeoDataFrame.plot()
Data: KCMO metro area counties
Excerpts from toy example
Step 1: read data
# imports
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
# read data
gdf = gpd.read_file('kcmo_counties.geojson')
Option 1 - get legend from ax as suggested here:
ax = gdf.plot('val', legend=True)
leg = ax.get_legend()
print('legend object type: ' + str(type(leg))) # <class NoneType>
plt.show()
Option 2: pass a legend_kwds dictionary - I assume I'm doing something wrong here (and clearly don't fully understand the underlying details), but the __doc__ from Geopandas's plotting.py - for which GeoDataFrame.plot() is simply a wrapper - does not appear to come through...
# create number of tick marks in legend and set location to display them
import numpy as np
numpoints = 5
leg_ticks = np.linspace(-1,1,numpoints)
# create labels based on number of tickmarks
leg_min = gdf['val'].min()
leg_max = gdf['val'].max()
leg_tick_labels = [str(round(x*100,1))+'%' for x in np.linspace(leg_min,leg_max,numpoints)]
leg_kwds_dict = {'numpoints': numpoints, 'labels': leg_tick_labels}
# error "Unknown property legend_kwds" when attempting it:
f, ax = plt.subplots(1, figsize=(6,6))
gdf.plot('val', legend=True, ax=ax, legend_kwds=leg_kwds_dict)
UPDATE
Just came across this conversation on adding in legend_kwds - and this other bug? which clearly states legend_kwds was not in most recent release of GeoPandas (v0.3.0). Presumably, that means we'll need to compile from the GitHub master source rather than installing with pip/conda...
I've just come across this issue myself. After following your link to the Geopandas source code, it appears that the colourbar is added as a second axes to the figure, so you have to do something like this to access the colourbar labels (assuming you have plotted a choropleth with legend=True):
# Get colourbar from second axis
colourbar = ax.get_figure().get_axes()[1]
Having done this, you can manipulate the labels like this:
# Get numerical values of yticks, assuming a linear range between vmin and vmax:
yticks = np.interp(colourbar.get_yticks(), [0,1], [vmin, vmax])
# Apply some function f to each tick, where f can be your percentage conversion
colourbar.set_yticklabels(['{0:.2f}%'.format(ytick*100) for ytick in yticks])
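The tick-to-label arithmetic above can be checked in isolation. This is a minimal sketch in plain Python, assuming (as the answer does) that the colourbar ticks are normalized to the range [0, 1]; if your Matplotlib version already reports ticks in data units, the interpolation step is unnecessary. The function name `percent_tick_labels` and the sample values are hypothetical.

```python
def percent_tick_labels(ticks, vmin, vmax):
    """Map normalized ticks in [0, 1] onto [vmin, vmax] and format them as percentages."""
    labels = []
    for tick in ticks:
        # Linear interpolation, the same mapping np.interp(tick, [0, 1], [vmin, vmax]) performs.
        value = vmin + tick * (vmax - vmin)
        labels.append("{0:.2f}%".format(value * 100))
    return labels


# Hypothetical example: rates between 0.0 and 0.2 shown as 0% to 20%.
labels = percent_tick_labels([0.0, 0.25, 0.5, 0.75, 1.0], vmin=0.0, vmax=0.2)
print(labels)
```

The resulting list is what would be passed to colourbar.set_yticklabels(...).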
This can be done by passing key-value pairs to dictionary argument legend_kwds:
gdf.plot(column='col1', cmap='Blues', alpha=0.5, legend=True, legend_kwds={'label': 'FOO', 'shrink': 0.5}, ax=ax)

How to add permanent name labels (not interactive ones) on nodes for a networkx graph in bokeh?

I am trying to add permanent labels to the nodes of a networkx graph drawn with spring_layout and the Bokeh library. I would like these labels to be re-positioned whenever the graph is scaled or refreshed, the same way the spring layout re-positions the nodes.
I tried creating the graph and the layout, then getting pos from spring_layout. Calling pos=nx.spring_layout(G) generates a set of positions for the nodes in graph G, whose coordinates I can put into the LabelSet. However, I have to call graph = from_networkx(G, spring_layout, scale=2, center=(0,0)) to draw the network graph, and this creates a new set of positions for the nodes. Therefore, the positions of the nodes and the labels will not be the same.
How can I fix this issue?
Thanks for asking this question. Working through it, I've realized that it is currently more work than it should be. I'd very strongly encourage you to open a GitHub issue so that we can discuss what improvements can best make this kind of thing easier for users.
Here is a complete example:
import networkx as nx
from bokeh.io import output_file, show
from bokeh.models import CustomJSTransform, LabelSet
from bokeh.models.graphs import from_networkx
from bokeh.plotting import figure
G=nx.karate_club_graph()
p = figure(x_range=(-3,3), y_range=(-3,3))
p.grid.grid_line_color = None
r = from_networkx(G, nx.spring_layout, scale=3, center=(0,0))
r.node_renderer.glyph.size=15
r.edge_renderer.glyph.line_alpha=0.2
p.renderers.append(r)
So far this is all fairly normal Bokeh graph layout code. Here is the additional part you need to add permanent labels for each node:
from bokeh.transform import transform
# add the labels to the node renderer data source
source = r.node_renderer.data_source
source.data['names'] = [str(x*10) for x in source.data['index']]
# create a transform that can extract the actual x,y positions
code = """
    var result = new Float64Array(xs.length)
    for (var i = 0; i < xs.length; i++) {
        result[i] = provider.graph_layout[xs[i]][%s]
    }
    return result
"""
xcoord = CustomJSTransform(v_func=code % "0", args=dict(provider=r.layout_provider))
ycoord = CustomJSTransform(v_func=code % "1", args=dict(provider=r.layout_provider))
# Use the transforms to supply coords to a LabelSet
labels = LabelSet(x=transform('index', xcoord),
                  y=transform('index', ycoord),
                  text='names', text_font_size="12px",
                  x_offset=5, y_offset=5,
                  source=source, render_mode='canvas')
p.add_layout(labels)
show(p)
Basically, since Bokeh (potentially) computes layouts in the browser, the actual node locations are only available via the "layout provider" which is currently a bit tedious to access. As I said, please open a GitHub issue to suggest making this better for users. There are probably some very quick and easy things we can do to make this much simpler for users.
A similar solution to @bigreddot's.
#Libraries for this solution
import networkx as nx
import pandas as pd
from bokeh.plotting import figure, ColumnDataSource
from bokeh.models import LabelSet
from bokeh.models.graphs import from_networkx
#Remove randomness
import numpy as np
np.random.seed(1337)
#Load positions
pos = nx.spring_layout(G)
#Dict to df
labels_df = pd.DataFrame.from_dict(pos).T
#Reset index + column names
labels_df = labels_df.reset_index()
labels_df.columns = ["names", "x", "y"]
graph_renderer = from_networkx(G, pos, center=(0,0))
.
.
.
plot.renderers.append(graph_renderer)
#Set labels
labels = LabelSet(x='x', y='y', text='names', source=ColumnDataSource(labels_df))
#Add labels
plot.add_layout(labels)
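The key idea in the snippet above is to compute the layout once and reuse the same positions for both the graph renderer and the labels. Here is a minimal sketch of the dict-to-columns step in plain Python, without pandas. The `pos` dictionary is a hand-written, hypothetical stand-in for the output of nx.spring_layout(G), and the helper name `positions_to_columns` is an assumption for illustration.

```python
def positions_to_columns(pos):
    """Turn {node: (x, y)} into column lists keyed by 'names', 'x' and 'y'."""
    names, xs, ys = [], [], []
    for node, (x, y) in pos.items():
        names.append(str(node))
        xs.append(float(x))
        ys.append(float(y))
    return {"names": names, "x": xs, "y": ys}


# Hypothetical positions; in the answer these come from nx.spring_layout(G).
pos = {0: (0.0, 0.5), 1: (-0.25, -0.75), 2: (1.0, 0.0)}
columns = positions_to_columns(pos)
print(columns)
```

A dictionary of equal-length columns like this can be passed to ColumnDataSource in place of the DataFrame, while the same `pos` is handed to from_networkx so that nodes and labels stay aligned.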
Fixed node positions
From the networkx.spring_layout() documentation: you can add a list of nodes with a fixed position as a parameter.
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(0,1),(1,2),(0,2),(1,3)])
pos = nx.spring_layout(g)
nx.draw(g,pos)
plt.show()
Then you can plot the nodes at a fixed position:
pos = nx.spring_layout(g, pos=pos, fixed=[0,1,2,3])
nx.draw(g,pos)
plt.show()

how to plot more than two plots using for loop in python?

I'm trying to make 4 plots using a for loop, but I'm not sure how to do it. How can I display the plots one by one, in order? Or save each figure as a PNG?
Here is my code:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from astropy.io import fits
import pyregion
import glob
# read in the image
xray_name = glob.glob("*.fits")
for filename in xray_name:
    f_xray = fits.open(filename)
    #name = file_name[:-len('.fits')]
    try:
        from astropy.wcs import WCS
        from astropy.visualization.wcsaxes import WCSAxes
        wcs = WCS(f_xray[0].header)
        fig = plt.figure()
        ax = plt.subplot(projection=wcs)
        fig.add_axes(ax)
    except ImportError:
        ax = plt.subplot(111)
    ax.imshow(f_xray[0].data, cmap="summer", vmin=0., vmax=0.00038, origin="lower")
    reg_name=glob.glob("*.reg")
    for i in reg_name:
        r =pyregion.open(i).as_imagecoord(header=f_xray[0].header)
        from pyregion.mpl_helper import properties_func_default
        # Use custom function for patch attribute
        def fixed_color(shape, saved_attrs):
            attr_list, attr_dict = saved_attrs
            attr_dict["color"] = "red"
            kwargs = properties_func_default(shape, (attr_list, attr_dict))
            return kwargs
        # select region shape with tag=="Group 1"
        r1 = pyregion.ShapeList([rr for rr in r if rr.attr[1].get("tag") == "Group 1"])
        patch_list1, artist_list1 = r1.get_mpl_patches_texts(fixed_color)
        r2 = pyregion.ShapeList([rr for rr in r if rr.attr[1].get("tag") != "Group 1"])
        patch_list2, artist_list2 = r2.get_mpl_patches_texts()
        for p in patch_list1 + patch_list2:
            ax.add_patch(p)
        #for t in artist_list1 + artist_list2:
        #    ax.add_artist(t)
    plt.show()
The aim of the code is to plot regions on a FITS image. If there is a way to change the background of the image to white while keeping the brighter (central) region as it is, that would be okay too. Thanks.
You are using colormap "summer" with provided limits. It is not clear to me what you want to achieve since the picture you posted looks more or less digital black and white pixelwise.
In matplotlib there are built in colormaps, and all of those have a reversed twin.
'summer' has a reversed twin with 'summer_r'
This can be picked up in the mpl docs at multiple spots, like the colormap example, or in SO answers like this.
Hope that is what you are looking for. For the future, when posting code like this, try to remove all non-relevant portions and, at minimum, provide a description of the data format/type. Best is to also include a small sample of the data and its structure. A piece of code only works together with a set of data, so sharing only one of them is only half the problem formulation.
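As a minimal sketch of the two points raised here, the reversed colormap and the question's request to save one PNG per loop iteration, something like the following should work. The small nested lists are hypothetical stand-ins for the FITS image arrays, the output file names are made up for illustration, and the non-interactive Agg backend is an assumption so that the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: figures are only written to files, not shown
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the image data read from each FITS file.
images = {
    "image_a": [[0.0, 0.1], [0.2, 0.3]],
    "image_b": [[0.3, 0.2], [0.1, 0.0]],
}

out_dir = tempfile.mkdtemp()
saved = []
for name, data in images.items():
    fig, ax = plt.subplots()  # a fresh figure for every image
    # "summer_r" is the reversed twin of the "summer" colormap.
    ax.imshow(data, cmap="summer_r", origin="lower")
    path = os.path.join(out_dir, name + ".png")
    fig.savefig(path)
    plt.close(fig)  # release the figure before the next iteration
    saved.append(path)

print(saved)
```

Creating a new figure inside the loop keeps the images from being drawn on top of each other; calling plt.show() in place of fig.savefig() (with an interactive backend) would display them one at a time instead.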

Multiple Surface Plot including Layout in Plotly?

Plotly has a page for Surface Plots (https://plot.ly/python/3d-surface-plots/) with two nice examples. The first uses Figure and Layout to plot. The second just uses iplot() with no Layout modification, but it demos how to plot multiple surfaces simultaneously. I need both: I need to call fig = go.Figure(data=data, layout=layout), as in the first example (because I need to provide layout settings), but with the data object containing multiple surfaces.
It seems this doesn't work as expected. Any ideas?
To define the layout and plot multiple surfaces you can run something like this:
import plotly.plotly as py
from plotly.graph_objs import Surface
z1 = [
[8.83,8.89,8.81,8.87,8.9,8.87],
[8.89,8.94,8.85,8.94,8.96,8.92],
[8.84,8.9,8.82,8.92,8.93,8.91],
[8.79,8.85,8.79,8.9,8.94,8.92],
[8.79,8.88,8.81,8.9,8.95,8.92],
[8.8,8.82,8.78,8.91,8.94,8.92],
[8.75,8.78,8.77,8.91,8.95,8.92],
[8.8,8.8,8.77,8.91,8.95,8.94],
[8.74,8.81,8.76,8.93,8.98,8.99],
[8.89,8.99,8.92,9.1,9.13,9.11],
[8.97,8.97,8.91,9.09,9.11,9.11],
[9.04,9.08,9.05,9.25,9.28,9.27],
[9,9.01,9,9.2,9.23,9.2],
[8.99,8.99,8.98,9.18,9.2,9.19],
[8.93,8.97,8.97,9.18,9.2,9.18]
]
z2 = [[zij+1 for zij in zi] for zi in z1]
z3 = [[zij-1 for zij in zi] for zi in z1]
data = [
dict(z=z1, type='surface'),
dict(z=z2, showscale=False, opacity=0.9, type='surface'),
dict(z=z3, showscale=False, opacity=0.9, type='surface')]
# Add all layout info here
layout = dict(title = 'Add Layout')
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='multiple-surfaces-add-layout')
