I am trying to manually create a stacked bar plot in Bokeh, as I can in matplotlib. The code runs, but the output plot shows issues I cannot explain or solve. The error always seems to appear at the second level, whether I stack two levels (i.e. number of bars) or three. In the three-level plot, the colors are mismatched because the second level is missing.
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
import numpy as np
output_notebook()
test_data = {
'index' : np.array([1, 2, 3, 4, 5]),
'a' : np.array([10, 20, 30, 40, 50]),
'b' : np.array([10, 10, 10, 10, 10]),
'c' : np.array([10, 10, 10, 10, 10])
}
s = ColumnDataSource(test_data)
alpha_lst = [1, 0.5, 0.1]
keys = list(s.data.keys())[1:-1] # stack 2 levels/bars
# keys = list(s.data.keys())[1:] # stack 3 levels/bars
w_bar = 0.9
level = 0
p = figure(plot_width=600, plot_height=360)
for j in range(len(keys)):
    p.vbar(x=s.data['index'], bottom=level,
           top=level+s.data[keys[j]], width=w_bar, color='#43a2ca',
           line_color='#3A5785', fill_alpha=alpha_lst[j], legend_label=keys[j])
    level += s.data[keys[j]]
p.legend.location = 'top_left'
show(p)
Stacked Bar 2 Levels:
Stacked Bar 3 Levels:
Two reasons why level += s.data[keys[j]] is bad:
1. You change the variable type from int to numpy.array. It's maybe OK in this case, but generally it makes the code harder to make sense of and debug later on.
2. After it becomes an array (i.e. at the end of the very first loop iteration), the next execution of that line mutates the array in place. This means the VBar created by the second call to p.vbar now has a bottom with the same value as its top.
It's up to you how (and whether) you fix #1. To fix #2, which will solve your main problem, just replace level += s.data[keys[j]] with level = level + s.data[keys[j]].
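The aliasing described in point 2 can be reproduced with NumPy alone, without Bokeh. This is only a minimal sketch: the arrays `a` and `b` stand in for the question's columns, and `bottom`/`top` are illustrative names for the values passed to p.vbar.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([10, 10, 10])

# In-place update: `bottom` and `level` refer to the same array object,
# so `level += b` also changes what `bottom` points at.
level = a.copy()
bottom = level
top = level + b          # a new array
level += b               # mutates the shared array in place
bottom_equals_top = np.array_equal(bottom, top)

# Rebinding: `level = level + b` creates a new array and leaves
# the array that `bottom_fixed` refers to untouched.
level = a.copy()
bottom_fixed = level
level = level + b
bottom_unchanged = np.array_equal(bottom_fixed, a)
```

With the in-place version the bar would have zero height (bottom equals top), while the rebinding version keeps the original bottom.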
Related
I created a bar chart, and now I want to set the color of each bar depending on its value on the y-axis. Simplified: if the value is positive the bar should be red, and if the value is negative the bar should be blue.
So far I have only managed to change the color along the x-axis, not the y-axis.
from bokeh.palettes import plasma
source = ColumnDataSource(data={'date' : pd.to_datetime(df_data['date'], format='%Y-%m'), 'values' : df_data['values'], 'color' : plasma(256)})
p = figure(x_axis_label='time',
           x_axis_type='datetime',
           y_axis_label='diff',
           tools=[hover],
           toolbar_location=None,
           title="title")
p.vbar(x = 'date',top = 'values', source=source, width=timedelta(days=20), color = 'color')
I've found an example on:
https://docs.bokeh.org/en/latest/docs/user_guide/categorical.html
But I need to color the bars by their values, not by their number. I know my example makes no sense, but I only want to demonstrate what my expectations are.
OK, I found a solution myself by using the cut function of pandas.
import pandas as pd
import numpy as np
from datetime import timedelta
from bokeh.models import ColumnDataSource
values = df_data['values'].values
# Two bins: [-inf, 0) for negative values, [0, inf) for non-negative values
bins = [-np.inf, 0, np.inf]
categories = pd.cut(values, bins, right=False)
palette = ['blue', 'red']
colors = []
for i in categories.codes:
    colors.append(palette[i])
# Now I can add this column to my ColumnDataSource:
source = ColumnDataSource(data={'date' : pd.to_datetime(df_data['date'], format='%Y-%m'), 'values' : df_data['values'], 'color' : colors})
p.vbar(x = 'date',top = 'values', source=source, width=timedelta(days=20), color = 'color')
Of course this is just a "quick and dirty" solution and there is plenty of room for optimization.
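For a simple two-way split by sign, the binning step could probably be skipped altogether. The following is a hedged sketch with made-up sample values (not the asker's data), using the same convention as the bins above: negative values are blue, zero and positive values are red.

```python
# Hypothetical sample values standing in for df_data['values']
values = [3.5, -1.2, 0.0, -7.0, 2.1]

# One color per bar, chosen by the sign of the value
colors = ['blue' if v < 0 else 'red' for v in values]
```

The resulting list can be used as the 'color' column of the ColumnDataSource in the same way as before.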
I'm trying to visualize a high-dim point set x (here of dim (6 x 42)) in a series of 2D scatter plots (x[1] vs x[2] etc.) using bokeh. [edit2] See this nice example from scikit-opt as a reference. When x[1] occurs in two plots it should interact with the same range and the plots should rescale simultaneously. I have accomplished this, but I don't get it to scale correctly. Here's a minimal example: [edit2]
import bokeh
import bokeh.io
import numpy as np
import bokeh.plotting
bokeh.io.output_notebook()
# That's my fictional dataset
x = np.random.randn(6, 42)
x[2] *= 10
# Build the pairwise scatter plots
kw = dict(plot_width=165, plot_height=165)
# `ranges` stores the range in each dimension,
# used as both, x- and y-range depending on
# where the variable is.
figs, ranges = {}, {}
for r, row in enumerate(x):
    for c, col in enumerate(x):
        if r is not c:
            fig = bokeh.plotting.figure(
                x_range=ranges.get(c, None), y_range=ranges.get(r, None),
                **kw)
            fig.scatter(x=col, y=row)
            fig.xaxis.axis_label = f'Dim {c}'
            fig.yaxis.axis_label = f'Dim {r}'
            if c not in ranges:
                ranges[c] = fig.x_range
            if r not in ranges:
                ranges[r] = fig.y_range
            figs[f'{r}_{c}'] = fig
        else:
            break
# Setup the plotting layout
plots = [[]]
for r, row in enumerate(x):
    for c, col in enumerate(x):
        if r is not c:
            plots[-1].append(figs[f'{r}_{c}'])
        else:
            plots.append([])
            break
staircase = bokeh.layouts.gridplot(plots, **kw)
bokeh.plotting.show(staircase)
When I run this in an IPython notebook (Python >= 3.6), Bokeh sets the scale for dims 1 and 2 correctly. Then it starts to set the scale for the following dimensions as in dim 2. Notice that I scaled dim 2 tenfold to make this point.
Interactively, I can rescale the plot back to optimal settings. However, I'd like to do that by default. What options do I have inside bokeh to rescale? I played a bit with fig.xaxis.bounds, but unsuccessfully. Thanks for your help!
Epilogue:
Following #bigreddot's answer, I added the lines:
for i, X in enumerate(x):
    ranges[i].start = X.min()
    ranges[i].end = X.max()
to fix the starting ranges. I still think that the behaviour is a bug.
From your code and description I still can't quite tell what you are hoping to accomplish. [1] But I will state that the default DataRange1d ranges that plots use automatically make space for all renderers, across all plots they are shared by. In this sense, I see exactly what I would expect when I run your code. If you want something different, there are two things you could control:
DataRange1d has a .renderers property. If you only want the "auto" ranging to be over a subset of the renderers, then you can explicitly set this property to the list you want. Renderers are returned by the glyph functions, e.g. fig.scatter.
Don't use the "auto" ranges. You can also set the x_range and y_range yourself to be Range1d objects. These have start and end properties that you can set, and these will be the definite bounds of the range, e.g. x_range=Range1d(0, 10).
[1] The ranges are linked in what I would consider an odd way, and I can't tell if that is intended. But that is a result of your looping/python code and not Bokeh.
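For the second option, the explicit bounds have to come from somewhere. One possibility, sketched here with NumPy only, is to compute them per dimension from the data. The helper name `padded_bounds` and the 5% padding are assumptions for illustration and are not part of Bokeh.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 42))   # fictional dataset, as in the question
x[2] *= 10

def padded_bounds(values, pad=0.05):
    """Return (start, end) covering `values` plus a small margin (hypothetical helper)."""
    lo, hi = float(values.min()), float(values.max())
    margin = (hi - lo) * pad
    return lo - margin, hi + margin

# One (start, end) pair per dimension
bounds = {i: padded_bounds(row) for i, row in enumerate(x)}
```

Each pair could then be passed as the start and end of a Range1d for that dimension, so that every plot sharing the dimension starts from the same fixed bounds.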
I have produced a time series scatter plot in Bokeh, which updates when a user interactively selects a new time series. However, I want to fix the x-axis between 0000 and 2359 hours for comparison (Bokeh tries to guess the appropriate x-range).
Below is a random snippet of data. In this code, how do I fix the x_range without it changing the scale to microseconds?
import pandas as pd
from bokeh.io import push_notebook, show, output_notebook
from bokeh.plotting import figure
from bokeh.models import Range1d
output_notebook()
data = {'2015-08-20 13:39:46': [-0.02813796, 0],
'2015-08-28 12:6:5': [ 1.32426938, 1],
'2015-08-28 13:42:59': [-0.16289655, 1],
'2015-12-14 16:19:44': [ 2.30476287, 1],
'2016-02-01 17:8:32': [ 0.41165004, 0],
'2016-02-09 11:26:33': [-0.65023149, 0],
'2016-04-08 17:57:47': [ 0.09335096, 1],
'2016-04-27 19:2:15': [ 1.43917208, 0]}
test = pd.DataFrame(data=data).T
test.columns = ["activity","objectID"]
test.index = pd.to_datetime(test.index)
p = figure(plot_width=500, plot_height=250, x_axis_label='X', y_axis_label='Y', x_axis_type="datetime")# x_range = Range1d(# dont know what to put here))
r = p.circle(x=test.index.time, y=test["activity"])
show(p, notebook_handle=True);
I've found a (scrappy) solution for this but it doesn't fix the axes sizes entirely since the size of the axes seems to be dependent on other axes properties such as the length of the y-tick labels.
# a method for setting constant x-axis in hours for bokeh:
day_x_axis = pd.DataFrame(data=[0,0], index=['2015-07-28 23:59:00', '2015-08-28 00:01:00'], columns=["activity"])
day_x_axis.index = pd.to_datetime(day_x_axis.index)
new_time_series = pd.concat((old_time_series, day_x_axis), axis=0) # this will set all other columns you had to NaN.
I fixed my axes entirely by also setting the y_range property when instantiating the figure object.
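A different way to approach the same goal, offered only as a sketch and not as what the poster did, is to move every timestamp onto one common reference day so that fixed limits of 00:00 and 23:59 apply to all series. The reference date and the helper name `to_reference_day` are arbitrary assumptions.

```python
from datetime import date, datetime, time

REFERENCE_DAY = date(2000, 1, 1)  # arbitrary placeholder day (an assumption)

def to_reference_day(ts):
    """Keep the time of day of `ts` but move it onto REFERENCE_DAY."""
    return datetime.combine(REFERENCE_DAY, ts.time())

# Two timestamps taken from the sample data in the question
stamps = [datetime(2015, 8, 20, 13, 39, 46), datetime(2016, 4, 27, 19, 2, 15)]
same_day = [to_reference_day(ts) for ts in stamps]

# Fixed axis limits covering the whole reference day
axis_start = datetime.combine(REFERENCE_DAY, time(0, 0))
axis_end = datetime.combine(REFERENCE_DAY, time(23, 59))
```

The two limits could then serve as the start and end of an explicit x-range on a datetime axis, instead of adding dummy rows to the data.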
I drew the following plot with bokeh.plotting.Figure.line.
How can I add a vertical guideline to emphasize the point at Feb/14?
Here's another plot. This is bokeh.charts.Bar.
I'd like to add a horizontal guideline to emphasize the value 50. I searched the Bokeh docs but had no luck finding a relevant API reference. I would appreciate it if someone could point me in the right direction.
I added a vertical line to a simple line chart by creating a new set of data that corresponded to the vertical line I wanted to create.
from datetime import date
from bokeh.plotting import figure, output_file, show
x = [date(2001,1,1), date(2002,1,1),date(2003,1,1), date(2004,1,1),
     date(2005,1,1), date(2006,1,1),date(2007,1,1), date(2008,1,1),
     date(2009,1,1), date(2010,1,1),date(2011,1,1)]
y = [0, 3, 2, 4, 6, 9, 15, 18, 19, 25, 28]
output_file("lines.html", title="line plot example")
p = figure(title="simple line example",x_axis_type = "datetime")
p.line(x, y)
# Vertical guideline: two points with the same x, spanning the y data range
a = [min(y),max(y)]
b = [date(2009,1,1),date(2009,1,1)]
p.line(b, a ,line_color="red")
show(p)
You can do this fairly easily with the ray glyph in Bokeh. If you set the angle to 1.57079633 (90 degrees in radians), you'll get a vertical ray. Just set the x value to where you want the line and the length to the height of your y axis.
p.ray(x=.5, y=0, length=1, angle=1.57079633, color='black')
You can probably use the new BoxAnnotation (new as of Bokeh 0.9.3) with zero width or height to do this, with slightly better effect:
https://docs.bokeh.org/en/latest/docs/user_guide/annotations.html#box-annotations
It's probably worth adding a LineAnnotation as well, I'll make an issue for it.
I have altered the sample code from the help file for gap.barplot.
twogrp<-c(0, 4, 5, 7, 2,3, 1, 7, 18, 22, 25, 26, 28)
gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),
ylab="Group values",main="Barplot with gap")
When I run the above code, the resulting plot makes a 0 value appear to be a negative value, and a value of 1 look like zero. Is there any way to change this so that if my vector contains a zero value, nothing is plotted, and if there is a value of one, I see a raised bar? I have noticed that the sample files avoid ones and zeros, and that the plots resulting from the sample files have the x axis at y=1.
There is clearly a typo in gap.barplot (plotrix version 3.5-5). Right now it is setting the bottom of the bars to the minimum x value rather than the minimum y value. Here's some code that will copy that function and change that line (if found)
gap.barplot2<-gap.barplot
if (deparse(body(gap.barplot2)[[c(20,4,4)]])==
    "botgap <- ifelse(gap[1] < 0, gap[1], xlim[1])") {
  body(gap.barplot2)[[c(20,4,4)]] <-
    quote(botgap <- ifelse(gap[1] < 0, gap[1], ylim[1]))
} else {
  stop("line not found")
}
Then you can run
gap.barplot2(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),
ylab="Group values",main="Barplot with gap")
to get
There appears to be no easy way to set ylim[1]=0 without also setting ylim[2] (the max y-value). Lattice plotting functions would allow ylim=c(0,NA), which would be nice for forcing a zero line while letting the rest of the function figure out what the default max should be.
So you can use this alternative for now. I would contact the package authors to let them know about this error. You can send them a link to this question if you like.