How to plot a StatsBase.Histogram object in Julia?

I am using a package (LightGraphs.jl) in Julia that has a predefined histogram method, which creates the degree distribution of a network g.
deg_hist = degree_histogram(g)
I want to plot this, but I am new to plotting in Julia. The object returned is a StatsBase.Histogram, which has the following fields:
StatsBase.Histogram{Int64,1,Tuple{FloatRange{Float64}}}
edges: 0.0:500.0:6000.0
weights: [79143,57,32,17,13,4,4,3,3,2,1,1]
closed: right
How can I use this object to plot the histogram?

I thought this was already implemented, but I just added the recipe to StatPlots. If you check out master, you'll be able to do:
julia> using StatPlots, LightGraphs
julia> g = Graph(100,200);
julia> plot(degree_histogram(g))
For reference, the associated recipe that I added to StatPlots:
@recipe function f(h::StatsBase.Histogram)
    seriestype := :histogram
    h.edges[1], h.weights
end

Use the histogram's .edges and .weights fields to plot it, e.g.:
using PyPlot, StatsBase
a = rand(1000); # generate something to plot
test_hist = fit(Histogram, a)
# line plot
plot(test_hist.edges[1][2:end], test_hist.weights)
# bar plot
bar(0:length(test_hist.weights)-1, test_hist.weights)
xticks(0:length(test_hist.weights), test_hist.edges[1])
Or you could create or extend a plotting function by adding a method like so:
function myplot(x::StatsBase.Histogram)
    ... # your code here
end
Then you will be able to call your plotting functions directly on the histogram object.
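As a minimal sketch of the field-based approach, the snippet below uses only base Julia and the edges and weights printed in the question. The names centers and widths are illustrative, not part of StatsBase. It computes bin midpoints, which are often a more natural x position than the right-hand edges used in the line plot above.

```julia
# Values copied from the Histogram printed in the question.
edges   = 0.0:500.0:6000.0
weights = [79143, 57, 32, 17, 13, 4, 4, 3, 3, 2, 1, 1]

# A histogram with n bins has n + 1 edges.
@assert length(edges) == length(weights) + 1

# Midpoint of each bin, usable as x positions for a line or bar plot.
centers = (edges[1:end-1] .+ edges[2:end]) ./ 2

# Width of each bin, usable as the bar width.
widths = diff(edges)
```

With a real histogram object, edges would be h.edges[1] and weights would be h.weights. Passing centers, weights and widths to a bar-plotting call should then place each bar over its own bin.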

Related

Julia: Passing Plot optional arguments through outer function

I have a function in Julia that produces a plot using the Plots package plot() command. I'd like to set some optional arguments for the plot by passing arguments into my outer function. For example, I'd like to set title and axis labels without needing to program a bunch of if statements to check what parameters I'm trying to pass in. A MWE of what I'd like to do is as follows:
function outer(data; plot_options...)
x = data.x
y = data.y
plot(x,y, plot_options...)
end
So that if I call something like outer(data, title="My Title", lw=2) I produce a plot with title set to "My Title" and a linewidth of 2. Trying the naive thing that I programmed above results in an error.
function outer(data; plot_options...)
x = data.x
y = data.y
plot(x,y; plot_options...)
end
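You missed a semicolon: keyword arguments collected with plot_options... have to be splatted after a ; in the call, as above.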
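To see the effect of the semicolon without loading a plotting package, here is a small sketch in which fake_plot is a hypothetical stand-in for plot(). It simply returns the keyword arguments it received, so the forwarding can be inspected directly.

```julia
# Hypothetical stand-in for plot(): it returns the keywords it was given.
fake_plot(x, y; kwargs...) = Dict{Symbol,Any}(k => v for (k, v) in kwargs)

# Collect arbitrary keywords in the outer function and forward them.
# The `;` before `plot_options...` splats them as keywords, not positionals.
function outer(data; plot_options...)
    x = data.x
    y = data.y
    fake_plot(x, y; plot_options...)
end

data = (x = 1:3, y = [2, 4, 6])   # a NamedTuple with fields x and y
opts = outer(data; title = "My Title", lw = 2)
```

Without the semicolon, the collected pairs are splatted as extra positional arguments, which is why the first version fails.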

Processing statistics in Gadfly

I want to extend the Gadfly package to match my own idiosyncratic preferences. However I am having trouble understanding how to use Gadfly's statistics in a way that allows for their output to be processed before plotting.
For example, say I want to use the x,y aesthetics produced by Stat.histogram. To add these to a plot, I understand I can include Stat.histogram as an argument in a layer(). But what do I do if I want to use Stat.histogram to calculate the x,y aesthetics, edit them using my own code, and then plot these edited aesthetics?
I'm looking for a function like load_aesthetics(layer(x=x, Stat.histogram)), or a field like layer(x=x, Stat.histogram).aesthetics.
You can create your own statistic. See https://github.com/GiovineItalia/Gadfly.jl/issues/894
Building off @bjarthur's answer, I wrote the function below.
"Return the aesthetics produced by a Gadfly Statistic object."
function process_statistic(statistic::Gadfly.StatisticElement,
                           input_aesthetics::Dict{Symbol,<:Any}
                           )
    # Check that all required aesthetics have been provided.
    required_aesthetics = Gadfly.input_aesthetics(statistic)
    for required_aesthetic in required_aesthetics
        if required_aesthetic ∉ keys(input_aesthetics)
            error("Aesthetic $(required_aesthetic) is required")
        end
    end
    # Create the aes object and fill it with the input aesthetics.
    aes = Gadfly.Aesthetics()
    for (key, value) in input_aesthetics
        setfield!(aes, key, value)
    end
    # These need to be passed to the apply_statistic() function. I do
    # not understand them, and the below code might need to be edited
    # for this function to work in some cases.
    scales = Dict{Symbol, Gadfly.ScaleElement}()
    coord = Gadfly.Coord.Cartesian()
    # This function edits the aes object, filling it with the desired aesthetics.
    Gadfly.Stat.apply_statistic(statistic, scales, coord, aes)
    # Return the produced aesthetics in a dictionary.
    outputs = Gadfly.output_aesthetics(statistic)
    return Dict(output => getfield(aes, output) for output in outputs)
end
Example usage:
process_statistic(Stat.histogram(), Dict(:x => rand(100)))

How can you make a stacked area / line chart in Julia with Plots.jl?

I would like to create a stacked area chart, similar to this for example, in Julia using Plots.
I know / suppose that you can do this if you directly use the Gadfly or PyPlot backends in Julia, but I was wondering if there was a recipe for this. If not, how can you contribute to the Plots Recipes? Would be a useful addition.
There's a recipe for something similar in
https://docs.juliaplots.org/latest/examples/pgfplots/#portfolio-composition-maps
For some reason the thumbnail looks broken now though (but the code works).
The exact plot in the MATLAB example can be produced by
plot(cumsum(Y, dims = 2)[:,end:-1:1], fill = 0, lc = :black)
As a recipe that would look like
@userplot AreaChart
@recipe function f(a::AreaChart)
    fillto --> 0
    linecolor --> :black
    seriestype --> :path
    cumsum(a.args[1], dims = 2)[:,end:-1:1]
end
If you want to contribute a recipe to Plots you can open a pull request on Plots or, e.g., on StatsPlots. There's a good description of contributing here: https://docs.juliaplots.org/latest/contributing/
It's a bit of reading, but very generally useful as an introduction to contributing to Julia packages.
You can read this thread in the Julia Discourse forum, where the question is discussed in depth.
One solution posted there using Plots is:
# a simple "recipe" for Plots.jl to get stacked area plots
# usage: stackedarea(xvector, datamatrix, plotsoptions)
using Plots
@userplot StackedArea   # defines stackedarea() and the StackedArea type used below
@recipe function f(pc::StackedArea)
    x, y = pc.args
    n = length(x)
    y = cumsum(y, dims=2)
    seriestype := :shape
    # create a filled polygon for each item
    for c=1:size(y,2)
        sx = vcat(x, reverse(x))
        sy = vcat(y[:,c], c==1 ? zeros(n) : reverse(y[:,c-1]))
        @series (sx, sy)
    end
end
a = [1,1,1,1.5,2,3]
b = [0.5,0.6,0.4,0.3,0.3,0.2]
c = [2,1.8,2.2,3.3,2.5,1.8]
sNames = ["a","b","c"]
x = [2001,2002,2003,2004,2005,2006]
plotly()
stackedarea(x, [a b c], labels=reshape(sNames, (1,3)))
(by user NiclasMattsson)
Other ways presented there include using the VegaLite.jl package.
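Both recipes above rest on the same step: a cumulative sum across the columns of the data matrix. The following sketch uses only base Julia and the example vectors a, b and c from the answer above. The names tops and bottoms are illustrative. It shows the band boundaries that a stacked area chart would draw.

```julia
a = [1, 1, 1, 1.5, 2, 3]
b = [0.5, 0.6, 0.4, 0.3, 0.3, 0.2]
c = [2, 1.8, 2.2, 3.3, 2.5, 1.8]

Y = [a b c]                  # 6×3 matrix, one column per series
tops = cumsum(Y, dims = 2)   # column k is the upper boundary of band k

# The lower boundary of band k is the top of band k-1 (zero for the first band).
bottoms = hcat(zeros(size(Y, 1)), tops[:, 1:end-1])
```

The height of each band, tops .- bottoms, recovers the original series, and the last column of tops is the total across all series.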

Multiple histograms in Julia using Plots.jl

I am working with a large number of observations, and to really get to know the data I want to make histograms using Plots.jl.
My question is how I can draw multiple histograms in one plot, as this would be really handy. I have tried several things already, but I am a bit confused by the different plotting packages in Julia (Plots.jl, PyPlot, Gadfly, ...).
I don't know if it would help for me to post some of my code, as this is a more general question. But I am happy to post it, if needed.
There is an example that does just this:
using Plots
pyplot()
n = 100
x1, x2 = rand(n), 3rand(n)
# see issue #186... this is the standard histogram call
# our goal is to use the same edges for both series
histogram(Any[x1, x2], line=(3,0.2,:green), fillcolor=[:red :black], fillalpha=0.2)
I looked for "histograms" in the Plots.jl repo, found this related issue and followed the links to the example.
With Plots, there are two possibilities to show multiple series in one plot:
First, you can use a matrix, where each column constitutes a separate series:
a, b, c = randn(100), randn(100), randn(100)
histogram([a b c])
Here, hcat is used to concatenate the vectors (note the spaces instead of commas).
This is equivalent to
histogram(randn(100,3))
You can apply options to the individual series using a row matrix:
histogram([a b c], label = ["a" "b" "c"])
(Again, note the spaces instead of commas)
Second, you can use plot! and its variants to update a previous plot:
histogram(a) # creates a new plot
histogram!(b) # updates the previous plot
histogram!(c) # updates the previous plot
Alternatively, you can specify which plot to update:
p = histogram(a) # creates a new plot p
histogram(b) # creates an independent new plot
histogram!(p, c) # updates plot p
This is useful if you have several subplots.
Edit:
Following Felipe Lema's links, you can implement a recipe for histograms that share the edges:
using StatsBase
using PlotRecipes
function calcbins(a, bins::Integer)
    lo, hi = extrema(a)
    StatsBase.histrange(lo, hi, bins) # nice edges
end
calcbins(a, bins::AbstractVector) = bins
@userplot GroupHist
@recipe function f(h::GroupHist; bins = 30)
    args = h.args
    length(args) == 1 || error("GroupHist should be given one argument")
    bins = calcbins(args[1], bins)
    seriestype := :bar
    bins, mapslices(col -> fit(Histogram, col, bins).weights, args[1], 1)
end
grouphist(randn(100,3))
Edit 2:
Because it is faster, I changed the recipe to use StatsBase.fit for creating the histogram.
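The idea behind the shared-edge recipe can be sketched in plain Julia without StatsBase: fix one set of edges, then count every series against it. The helper bincounts below is hypothetical, written only for this illustration, and the sample values are made up. It treats bins as closed on the left, with the final bin also closed on the right.

```julia
# Count the values falling in each bin [edges[i], edges[i+1]).
# A value equal to the last edge is placed in the final bin.
function bincounts(values, edges)
    counts = zeros(Int, length(edges) - 1)
    for v in values
        i = searchsortedlast(edges, v)   # index of the last edge <= v
        if i == length(edges) && v == last(edges)
            i -= 1
        end
        if 1 <= i <= length(counts)      # values outside the edges are ignored
            counts[i] += 1
        end
    end
    return counts
end

x1 = [0.1, 0.2, 0.9]
x2 = [0.5, 1.5, 2.9, 3.0]
edges = 0.0:1.0:3.0                      # the same edges for both series

counts1 = bincounts(x1, edges)
counts2 = bincounts(x2, edges)
```

Because both count vectors refer to the same bins, they can be drawn side by side or overlaid as bars without the bins drifting apart between series.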

R, graph of binomial distribution

I have to write my own function to draw the density function of the binomial distribution and then draw
the appropriate graphs for n = 20 and p = 0.1, 0.2, ..., 0.9. I also need to comment on the graphs.
I tried this:
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
return(barplot(x,names.arg=0:n))
}
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
#OR
graph(20,scan())
My first question: is there any way to avoid writing the line graph(20,p) several times, other than using scan()?
My second question:
I want to see the graphs on one device, or hit ENTER to see the next graph. I wrote
par(mfcol=c(2,5))
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
but the graphs are too tiny. How can I present the graphs nicely, with a headline giving n = 20 and the value of p used to draw each graph? [This can be done by calling mtext() after the function graph, but then I have to write a similar line several times, so I want to do it inside the function graph.]
My last question:
About the comments: the graphs show that as the probability of success p increases, the graph tends to the right, that is, the graph is right skewed.
Is there any way to comment on the graphs programmatically?
This is a job for mapply, since you loop over two variables.
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
barplot(x,names.arg=0:n,
main=sprintf(paste('bin. dist. ',n,p,sep=':')))
}
par(mfcol=c(2,5))
mapply(graph,20,seq(0.1,1,0.1))
Plotting base graphics is one of the times you often want to use a for loop. The reason is that most of the plotting functions return an object invisibly, but you're not interested in these; all you want is the side effect of plotting. A loop ignores the returned objects, whereas the *apply family will waste effort collecting and returning them.
par(mfrow=c(2, 5))
for(p in seq(0.1, 1, len=10))
{
x <- dbinom(0:20, size=20, p=p)
barplot(x, names.arg=0:20, space=0)
}
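On the last question, commenting on the graphs from a program, one possible approach is to compute the skewness of each distribution and turn its sign into a sentence. The sketch below is written in Julia, the language used on the rest of this page, and the formula carries over directly to R. The function names are hypothetical. It relies on the standard closed form for the skewness of Binomial(n, p), which is (1 - 2p) / sqrt(n p (1 - p)).

```julia
# Closed-form skewness of Binomial(n, p); valid for 0 < p < 1.
binom_skewness(n, p) = (1 - 2p) / sqrt(n * p * (1 - p))

# Turn the sign of the skewness into a short textual comment.
function describe(n, p)
    s = binom_skewness(n, p)
    shape = abs(s) < 1e-12 ? "symmetric" : s > 0 ? "right-skewed" : "left-skewed"
    return "n=$n, p=$p: $shape (skewness $(round(s, digits = 3)))"
end

for p in 0.1:0.1:0.9
    println(describe(20, p))
end
```

Under this measure the distribution is right-skewed for p < 0.5, symmetric at p = 0.5 and left-skewed for p > 0.5. So as p increases the mass moves to the right while the longer tail ends up on the left.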
