I wanted to create a heatmap of a probability density matrix using plotly.
import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, plot
import plotly.graph_objs as go
probability_matrix = np.loadtxt("/path/to/file")
trace = go.Heatmap(z = probability_matrix)
data=[trace]
plot(data, filename='basic-heatmap')
This gives me an image like this:
I want to smooth the colors of the squares so that the transition between adjacent squares in the image is "smoother". Is there a way of doing that without manually resizing the matrix using interpolation?
You can use the zsmooth argument which can take three values ('fast', 'best', or False). For example:
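import plotly.graph_objs as go
from plotly.offline import iplot  # iplot renders inline in a Jupyter notebook
data = [go.Heatmap(z=[[1, 20, 30],
                      [20, 1, 60],
                      [30, 60, 1]],
                   zsmooth='best')]
iplot(data)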
This gives you the following smooth heatmap:
I am having trouble with aligning grids on a plot I made. Basically the plots show the result of a 34x34 matrix where each point has a value of 0,1,2,3 and is colored based on this. The lines which outline the cells do not match up perfectly with the coloring of the cells. My code and image are below.
library(raster)
r<-raster(xmn=1,xmx=34,ymn=1,ymx=34,nrows=34,ncols=34)
data1<-read.csv(file ="mat_aligned.csv",row.names = 1)
numbers<-data.matrix(data1)
r[]<-numbers
breakpoints<-c(-1,0.1,1.1,2.1,3.1)
colors<-c("white","blue","green","red")
plot(r,breaks=breakpoints,col=colors)
plot(rasterToPolygons(r),add=TRUE,border='black',lwd=3)
I would appreciate any help with this!
The problem is that the base R plot and the drawing of the grid use different plotting systems. The polygons will stay constant relative to the plotting window (they will appear narrower as the window shrinks), and won't preserve their relationship to the underlying plot axes, whereas the coloured squares will resize to preserve shape. You'll probably find that you can get your grid to match better by resizing your window, but of course, this isn't ideal.
The best way to get round this is to use the specific method designed for plotting SpatialPolygonsDataFrame, which is the S4 class produced by rasterToPolygons. This is, after all, how you're "meant" to create such a plot.
Here's a reprex (obviously I've had to make some random data as yours wasn't shared in the question):
library(raster)
r <- raster(xmn = 1, xmx = 34, ymn = 1, ymx = 34, nrows = 34, ncols = 34)
r[] <- data.matrix(as.data.frame(replicate(34, sample(0:3, 34, TRUE))))
colors <- c("white","blue","green","red")
spplot(rasterToPolygons(r), at = 0:4 - 0.5, col.regions = colors)
Created on 2020-05-04 by the reprex package (v0.3.0)
It is difficult to help if you do not provide a minimal, self-contained, reproducible example. Something like this:
library(raster)
r <- raster(xmn=1,xmx=34,ymn=1,ymx=34,nrows=34,ncols=34)
values(r) <- sample(4, ncell(r), replace=T)
p <- rasterToPolygons(r)
plot(r)
lines(p)
I see what you describe, even though it is minimal. A work-around could be to only plot the polygons
colors<-c("white","blue","green","red")
plot(p, col=colors[p$layer])
I'm using holoviews with bokeh backend for interactive visualizations. I have a histogram with edges and frequency data. What is an elegant way of overlaying my histogram with the cumulative distribution (cdf) curve?
I tried using the cumsum option in hv.dim but don't think I'm doing it right. The help simply says,
Help on function cumsum in module holoviews.util.transform:
cumsum(self, **kwargs)
My code looks something like,
df_hist = pd.DataFrame(columns=['edges', 'freq'])
df_hist['edges'] = [-2, -1, 0, 1, 2]
df_hist['freq'] = [1, 3, 5, 3, 1]
hv.Histogram((df_hist.edges, df_hist.freq))
The result is a histogram plot.
Is there something like a...
hv.Histogram((df_hist.edges, df_hist.freq), type='cdf')
... to show the cumulative distribution?
One possible solution is by using histogram(cumulative=True) as follows:
from holoviews.operation import histogram
histogram(hv.Histogram((df_hist.edges, df_hist.freq)), cumulative=True)
More info on transforming elements here:
http://holoviews.org/user_guide/Transforming_Elements.html
Or a more general solution by turning the original data into a hv.Dataset():
import holoviews as hv
import seaborn as sns
hv.extension('bokeh')
iris = sns.load_dataset('iris')
hv_data = hv.Dataset(iris['petal_width'])
histogram(hv_data, cumulative=True)
But I like using library hvplot, which is built on top of Holoviews, even more:
import hvplot
import hvplot.pandas
iris['petal_width'].hvplot.hist(cumulative=True)
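For reference, here is a library-independent sketch of what a cumulative histogram amounts to, assuming it is conceptually a running sum of the bin frequencies. It reuses the `freq` values from the question; the names `cumulative` and `cdf` are only illustrative and are not part of holoviews or hvplot.

```python
import numpy as np

# Bin frequencies taken from the question's df_hist['freq'].
freq = np.array([1, 3, 5, 3, 1])

# Running sum of the counts: this is the cumulative histogram.
cumulative = np.cumsum(freq)  # [1, 4, 9, 12, 13]

# Dividing by the total turns the counts into an empirical CDF that ends at 1.
cdf = cumulative / cumulative[-1]
```

If you want to draw the curve on top of the histogram yourself, `cdf` is the array you would plot against the bin positions, probably on a secondary axis since its scale differs from the raw counts.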
I am trying to add a legend which should, according to my example, show a red square with the word fruit and a green square with the word veggie.
I tried several things (the example below is just one of many attempts), but I can't get it to work.
Can someone tell me how to solve this problem?
import pandas as pd
from matplotlib import pyplot as plt
data = [['apple', 'fruit', 10], ['nanaba', 'fruit', 15], ['salat','veggie', 144]]
data = pd.DataFrame(data, columns = ['Object', 'Type', 'Value'])
colors = {'fruit':'red', 'veggie':'green'}
c = data['Type'].apply(lambda x: colors[x])
bars = plt.bar(data['Object'], data['Value'], color=c, label=colors)
plt.legend()
The usual way to create a legend for objects which are not in the axes is to create proxy artists, as shown in the matplotlib legend guide.
Here,
colors = {'fruit':'red', 'veggie':'green'}
labels = list(colors.keys())
handles = [plt.Rectangle((0,0),1,1, color=colors[label]) for label in labels]
plt.legend(handles, labels)
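Putting the proxy-artist idea together with the bar chart from the question, a minimal end-to-end sketch could look like the following. It uses `matplotlib.patches.Patch` for the handles instead of `plt.Rectangle`, which is my own substitution, and plain lists instead of a DataFrame to keep it short. The `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Same data as in the question, as plain lists.
objects = ["apple", "nanaba", "salat"]
types = ["fruit", "fruit", "veggie"]
values = [10, 15, 144]
colors = {"fruit": "red", "veggie": "green"}

fig, ax = plt.subplots()
# Color each bar according to its type.
ax.bar(objects, values, color=[colors[t] for t in types])

# One proxy artist per category; these are never drawn on the axes,
# they only give the legend something to display.
handles = [Patch(color=color, label=name) for name, color in colors.items()]
legend = ax.legend(handles=handles)

legend_labels = [text.get_text() for text in legend.get_texts()]
```

The legend then has one entry per type rather than one per bar, which is what the question asks for.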
So this is a hacky solution and I'm sure there are probably better ways to do this. What you can do is plot individual bar plots that are invisible using width=0 with the original plot colors and specify the labels. You will have to do this in a subplot though.
import pandas as pd
from matplotlib import pyplot as plt
data = [['apple', 'fruit', 10], ['nanaba', 'fruit', 15], ['salat','veggie', 144]]
data = pd.DataFrame(data, columns = ['Object', 'Type', 'Value'])
colors = {'fruit':'red', 'veggie':'green'}
c = data['Type'].apply(lambda x: colors[x])
ax = plt.subplot(111) #specify a subplot
bars = ax.bar(data['Object'], data['Value'], color=c) #Plot data on subplot axis
for i, j in colors.items(): #Loop over color dictionary
    ax.bar(data['Object'], data['Value'],width=0,color=j,label=i) #Plot invisible bar graph but have the legends specified
ax.legend()
plt.show()
Surprisingly, nobody has taken the trouble to add an example of 2D histogram plotting to the bokeh gallery.
histogram2d from numpy gives the raw material, but it would be nice to have an example like the one that exists for matplotlib.
Any idea for a short way to make one?
Following up on a proposed answer, let me attach a case in which hexbin does not do the job because hexagons are not a good fit. Also check out the matplotlib result.
Of course I am not saying bokeh cannot do this, but it does not seem straightforward. It would be enough to change the hexbin plot into a square-bin plot, but quad(left, right, top, bottom, **kwargs) does not seem to do this, nor does hexbin have an option to change the "tile" shape.
You can make something close with relatively few lines of code (compare with this example from the matplotlib gallery). Note that bokeh has some examples of hex binning in the gallery here and here. Adapting those and the example provided in the numpy docs, you can get the below:
import numpy as np
from bokeh.plotting import figure, show
from bokeh.layouts import row
# normal distribution center at x=0 and y=5
x = np.random.randn(100000)
y = np.random.randn(100000) + 5
H, xe, ye = np.histogram2d(x, y, bins=100)
# produce an image of the 2d histogram
p = figure(x_range=(min(xe), max(xe)), y_range=(min(ye), max(ye)), title='Image')
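# histogram2d puts x along the first axis, while image expects rows to run along y, so transpose H
p.image(image=[H.T], x=xe[0], y=ye[0], dw=xe[-1] - xe[0], dh=ye[-1] - ye[0], palette="Spectral11")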
# produce hexbin plot
p2 = figure(title="Hexbin", match_aspect=True)
p2.grid.visible = False
r, bins = p2.hexbin(x, y, size=0.1, hover_color="pink", hover_alpha=0.8, palette='Spectral11')
show(row(p, p2))
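If you specifically want square (rectangular) tiles drawn as individual glyphs rather than an image, one option is to turn the histogram2d output into one rectangle per bin. The sketch below only does the numpy part, under the assumption that you then hand the arrays to a glyph that takes left/right/bottom/top coordinates, such as the quad glyph mentioned in the question. The `quads` dictionary and its keys are names I made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = rng.standard_normal(1000) + 5

H, xe, ye = np.histogram2d(x, y, bins=20)

# H[i, j] counts the points falling in x-bin i and y-bin j, so build the
# rectangle corners with the same (x-bin, y-bin) index order.
left, bottom = np.meshgrid(xe[:-1], ye[:-1], indexing="ij")
right, top = np.meshgrid(xe[1:], ye[1:], indexing="ij")

# Flatten everything to one entry per bin.
quads = {
    "left": left.ravel(),
    "right": right.ravel(),
    "bottom": bottom.ravel(),
    "top": top.ravel(),
    "count": H.ravel(),
}
```

Each entry of `quads["count"]` lines up with the rectangle described by the other four arrays, so the counts can drive the fill color through whatever color mapping you prefer.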
I have produced a time series scatter plot in bokeh, which updates when a user interactively selects a new time series. However, I want to fix the x-axis between 0000 and 2359 hours for comparison (bokeh tries to guess the appropriate x-range).
Below is a random snippet of data. In this code, how do I fix the x_range without it changing the scale to microseconds?
import pandas as pd
from bokeh.io import push_notebook, show, output_notebook
from bokeh.plotting import figure
from bokeh.models import Range1d
output_notebook()
data = {'2015-08-20 13:39:46': [-0.02813796, 0],
'2015-08-28 12:6:5': [ 1.32426938, 1],
'2015-08-28 13:42:59': [-0.16289655, 1],
'2015-12-14 16:19:44': [ 2.30476287, 1],
'2016-02-01 17:8:32': [ 0.41165004, 0],
'2016-02-09 11:26:33': [-0.65023149, 0],
'2016-04-08 17:57:47': [ 0.09335096, 1],
'2016-04-27 19:2:15': [ 1.43917208, 0]}
test = pd.DataFrame(data=data).T
test.columns = ["activity","objectID"]
test.index = pd.to_datetime(test.index)
p = figure(plot_width=500, plot_height=250, x_axis_label='X', y_axis_label='Y', x_axis_type="datetime")# x_range = Range1d(# dont know what to put here))
r = p.circle(x=test.index.time, y=test["activity"])
show(p, notebook_handle=True);
I've found a (scrappy) solution for this but it doesn't fix the axes sizes entirely since the size of the axes seems to be dependent on other axes properties such as the length of the y-tick labels.
# a method for setting constant x-axis in hours for bokeh:
day_x_axis = pd.DataFrame(data=[0,0], index=['2015-07-28 23:59:00', '2015-08-28 00:01:00'], columns=["activity"])
day_x_axis.index = pd.to_datetime(day_x_axis.index)
new_time_series = pd.concat((old_time_series, day_x_axis), axis=0) # this will set all other columns you had to NaN.
I fixed my axes entirely by also setting the y_range property when instantiating the figure object.
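An alternative that avoids adding dummy rows is to move every timestamp onto a single reference date, so only the time of day varies, and then compute fixed bounds for that date. The sketch below shows just the pandas side. The reference date and the names `ref`, `time_of_day`, `x_start` and `x_end` are arbitrary choices of mine, and passing the bounds on to the figure's x_range is an assumption I have not tested against the question's bokeh version.

```python
import pandas as pd

# Two of the timestamps from the question, for illustration.
stamps = pd.to_datetime(["2015-08-20 13:39:46", "2016-04-27 19:02:15"])

# Arbitrary reference date; every point is moved onto this day.
ref = pd.Timestamp("2000-01-01")

# Time elapsed since midnight for each timestamp, re-attached to the reference date.
time_of_day = ref + (stamps - stamps.normalize())

# Fixed bounds covering 00:00 to 23:59 on the reference date.
x_start = ref
x_end = ref + pd.Timedelta(hours=23, minutes=59)
```

Plotting `time_of_day` on a datetime axis and using `x_start` and `x_end` as the range bounds should keep the axis in hours regardless of which series is selected.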