'stack order' in Gadfly histogram - julia

When plotting a "stacked histogram", I would like the "stack order" to be the same as the legend order - Fair (first / bottom) and Ideal (last / top) - so that the colors are in order from light to dark. Like in this example.
Any idea how to do that? My code so far:
using CSV, DataFrames, Gadfly
download("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/diamonds.csv", "diamonds.csv")
diamonds = DataFrame(CSV.File("diamonds.csv"))
palette = ["#9D95C2", "#B4ADCF", "#C9C4DB", "#DFDCE9", "#F5F3F4"]
plot(
    diamonds,
    x = :price,
    color = :cut,
    Geom.histogram(bincount=50),
    Scale.x_log10,
    Scale.color_discrete_manual(palette..., order = [1, 2, 4, 3, 5]),
    Theme(
        background_color = "white",
        bar_highlight = color("black")
    ),
)

This is actually not that easy.
There is a somewhat related question at How do I sort a bar chart in ascending or descending order in Julia's Gadfly? (Does anyone know a less hacky way?)
I will probably fiddle with it a bit more, but what you can do is use levels on top of order:
Scale.color_discrete_manual(palette...,
    levels = ["Fair", "Good", "Very Good", "Premium", "Ideal"],
    order = [1, 2, 4, 3, 5]),
Then you have three lists: one for the colors, one for the levels, and one for the order. It is a nightmare, but there should be one permutation that looks similar to the seaborn example (I hope).

Related

How to overlay multiple TA in new plot using quantmod?

We can plot a candlestick chart with the chartSeries function, e.g. chartSeries(Cl(PSEC)). I have created some custom values (I1, I2 and I3) which I want to plot together (overlaid) outside the candlestick pattern. I have used addTA() for this purpose:
chartSeries(Cl(PSEC), TA="addTA(I1,col=2);addTA(I2,col=3);addTA(I3,col=4)")
The problem is that it draws four separate plots, for Cl(PSEC), I1, I2 and I3, instead of the two plots I want: Cl(PSEC) and (I1, I2, I3).
EDITED
For clarity, here is sample code that creates the I1, I2 and I3 variables:
library(quantmod)
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
I1=SMA(price,3)
I2=SMA(price,10)
I3=SMA(price,15)
chartSeries(price, TA="addTA(I1,col=2);addTA(I2,col=3);addTA(I3,col=4)")
Here is an option which largely preserves your original code.
You can obtain the desired result using the option on=2 for each TA after the first:
library(quantmod)
getSymbols("PSEC")
price <- Cl(PSEC)
I1 <- SMA(price,3)
I2 <- SMA(price,10)
I3 <- SMA(price,15)
chartSeries(price, TA=list("addTA(I1, col=2)", "addTA(I2, col=4, on=2)",
                           "addTA(I3, col=5, on=2)"), subset = "last 6 months")
If you want to overlay the price and the SMAs in one chart, you can use the option on=1 for each TA.
Thanks to @hvollmeier who made me realize with his answer that I had misunderstood your question in the previous version of my answer.
PS: Note that several options are described in ?addSMA(), including with.col which can be used to select a specific column of the time series (Cl is the default column).
If I understand you correctly, you want the 3 SMAs in a subplot and not in your main chart window. You can do that using newTA.
Using your data:
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
Now plotting a 10,30,50 day SMA in a window below the main window:
chartSeries(price['2016'])
newSMA <- newTA(SMA, Cl, on=NA)
newSMA(10)
newSMA(30,on=2)
newSMA(50,on=2)
The key is the argument on. Use on = NA when defining your new TA function, because the default value for on is 1, which is the main window. on = NA plots in a new window. Then plot the remaining SMAs in the same window as the first SMA. Style the colours etc. to your liking :-).
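As a rough mental model of the on argument described above, here is a toy Python sketch. It is not quantmod code, and the function and series names are invented for illustration. It assumes panel 1 is the main chart, that on = NA (None here) opens a new panel below the existing ones, and that an integer reuses that panel.

```python
def assign_panels(indicators):
    """Toy model: map (name, on) pairs to chart panels.

    on=None stands in for R's NA and opens a new panel; an integer
    draws into that existing panel. Panel 1 holds the price series.
    """
    panels = {1: ["price"]}
    for name, on in indicators:
        if on is None:
            on = max(panels) + 1  # next free panel below the others
        panels.setdefault(on, []).append(name)
    return panels


# Mirrors newSMA(10), newSMA(30, on=2), newSMA(50, on=2) from the answer.
layout = assign_panels([("SMA10", None), ("SMA30", 2), ("SMA50", 2)])
for panel, series in sorted(layout.items()):
    print(panel, series)
```

The first indicator creates panel 2 and the other two join it, which is why only the first call is left at on = NA.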
You may want to consider doing this with the newer charting functions in the quantmod package (chart_Series as opposed to chartSeries).
Pros:
- The plots look cleaner and better (?)
- More flexibility via editing the pars and themes options to chart_Series (see other examples here on SO for the basics of things you can do with pars and themes)
Cons:
- Not well documented.
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
chart_Series(price, subset = '2016')
add_TA(SMA(price, 10))
add_TA(SMA(price, 30), on = 2, col = "green")
add_TA(SMA(price, 50), on = 2, col = "red")
# Make plot all at once (this approach is useful in shiny applications):
print(chart_Series(price, subset = '2016', TA = 'add_TA(SMA(price, 10), yaxis = list(0, 10));
add_TA(SMA(price, 30), on = 2, col = "purple"); add_TA(SMA(price, 50), on = 2, col = "red")'))

Bokeh how to add legend to figure created by multi_line method?

I'm trying to add a legend to a figure that contains two lines created by the multi_line method.
Example:
p = figure(plot_width=300, plot_height=300)
p.multi_line(xs=[[4, 2, 5], [1, 3, 4]], ys=[[6, 5, 2], [6, 5, 7]], color=['blue','yellow'], legend="first")
In this case the legend is only for the first line. When the legend is defined as a list there is an error:
p.multi_line(xs=[[4, 2, 5], [1, 3, 4]], ys=[[6, 5, 2], [6, 5, 7]], color=['blue','yellow'], legend=["first","second"])
Is it possible to add a legend to many lines?
Maintainer Note: PR #8218, which will be merged for Bokeh 1.0, allows legends to be created directly for multi line and patches, without any looping.
To make it faster when you have a lot of data, a big table, etc., you can use a for loop:
1) Make a list of colors and legends
You can always import Bokeh palettes for your colors:
from bokeh.palettes import "your palette"
Check this link: bokeh.palettes
colors_list = ['blue', 'yellow']
legends_list = ['first', 'second']
xs=[[4, 2, 5], [1, 3, 4]]
ys=[[6, 5, 2], [6, 5, 7]]
2) Your figure
from bokeh.plotting import figure, show
p = figure(plot_width=300, plot_height=300)
3) Loop through the above lists and show the plot
for (colr, leg, x, y) in zip(colors_list, legends_list, xs, ys):
    # On Bokeh 1.4 and later, use legend_label=leg instead of legend=leg
    my_plot = p.line(x, y, color=colr, legend=leg)
show(p)
Maintainer Note: PR #8218 which will be merged for Bokeh 1.0, allows legends to be created directly for multi line and patches, without any looping or using separate line calls.
multi_line is intended for conceptually single things, that happen to have multiple sub-components. Think of the state of Texas, it is one logical thing, but it has several distinct (and disjoint) polygons. You might use Patches to draw all the polys for "Texas" but you'd only want one legend overall. Legends label logical things. If you want to label several lines as logically distinct things, you will have to draw them all separately with p.line(..., legend_label="...")
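The one-glyph-per-label approach can be sketched without Bokeh itself. The helper below is hypothetical (it is not part of Bokeh): it only pairs up the parallel lists and rejects mismatched lengths. Each resulting dict is meant to be passed as p.line(**spec), assuming a Bokeh version that accepts legend_label.

```python
def build_line_specs(xs, ys, colors, labels):
    """Pair parallel lists into one keyword dict per line (hypothetical helper)."""
    lengths = {len(xs), len(ys), len(colors), len(labels)}
    if len(lengths) != 1:
        raise ValueError("xs, ys, colors and labels must have the same length")
    return [
        {"x": x, "y": y, "color": color, "legend_label": label}
        for x, y, color, label in zip(xs, ys, colors, labels)
    ]


# Same data as in the question.
specs = build_line_specs(
    xs=[[4, 2, 5], [1, 3, 4]],
    ys=[[6, 5, 2], [6, 5, 7]],
    colors=["blue", "yellow"],
    labels=["first", "second"],
)

# With Bokeh installed, each spec becomes its own glyph and legend entry:
#   for spec in specs:
#       p.line(**spec)
```

Keeping the pairing separate from the plotting makes it easy to catch a missing label before anything is drawn.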
On more recent releases (since 0.12.15, I think) it's possible to add legends to multi_line plots. You simply need to add a 'legend' entry to your data source. Here is an example taken from the Google Groups discussion forum:
import numpy as np
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
data = {'xs': [np.arange(5) * 1, np.arange(5) * 2],
        'ys': [np.ones(5) * 3, np.ones(5) * 4],
        'labels': ['one', 'two']}
source = ColumnDataSource(data)
p = figure(width=600, height=300)
p.multi_line(xs='xs', ys='ys', legend='labels', source=source)
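The data source in this approach needs one entry per line in every column, including the label column. A minimal sketch of assembling such a dict from per-line records follows; the helper name is made up, and it uses plain lists instead of NumPy arrays so it runs without extra packages.

```python
def make_multiline_data(lines):
    """Turn [(label, xs, ys), ...] into equal-length columns (hypothetical helper)."""
    data = {"xs": [], "ys": [], "labels": []}
    for label, xs, ys in lines:
        xs, ys = list(xs), list(ys)
        if len(xs) != len(ys):
            raise ValueError(f"line {label!r}: xs and ys differ in length")
        data["xs"].append(xs)
        data["ys"].append(ys)
        data["labels"].append(label)
    return data


# Same values as the NumPy example above.
data = make_multiline_data([
    ("one", range(5), [3] * 5),
    ("two", range(0, 10, 2), [4] * 5),
])

# With Bokeh installed, the dict can be wrapped as ColumnDataSource(data)
# and the 'labels' column named in the multi_line call, as shown above.
```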

Using optional arguments (...) in a function, as illustrated with new population pyramid plot

Wanting to show the distribution of participants in a survey by level, I came upon the recently-released pyramid package and tried it. As the font on the x-axis is too large and there seem to be no other formatting choices to fix it, I realized I don't know how to add "other options" as permitted by the ... in the pyramid call.
install.packages("pyramid")
library(pyramid)
level.pyr <- data.frame(left = c(1, 4, 6, 4, 41, 17),
right = c(1, 4, 6, 4, 41, 17),
level = c("Mgr", "Sr. Mgr.", "Dir.", "Sr. Dir.", "VP", "SVP+"))
pyramid(level.pyr, Laxis = seq(2,50,6), Cstep = 1, Cgap = .5, Llab = "", Rlab = "", Clab = "Title", GL = T, Lcol = "deepskyblue", Rcol = "deepskyblue", Ldens = -1, main = "Distribution of Participants in Survey")
Agreed, the plot below looks odd because the left and the right sides are the same, not male and female. But my question remains: how do I invoke the options and do something like Laxis.size = 2 or Raxis.font = "bold"?
Alternatives to this new package for creating pyramid plots include plotrix, grid, and base R, as demonstrated here:
population pyramid density plot in r
By the way, if there were a ggplot method, I would welcome trying it.
Contrary to Roland's and now nrussell's guesses (without apparently looking at the code) expressed in comments, the dots arguments will not be passed to pyramid's axis plotting routine, despite this being a base graphics function. The arguments are not even passed to an axis call, although that would have seemed reasonable. The x-axis tick labels are constructed with a call to text(). You could hack the text calls to accept a named argument of your choosing and it would be passed via the dots mechanism. You seem open to other options and I would recommend using plotrix::pyramid.plot since Jim Lemon does a better job of documenting his routines and it's more likely they will be using standard R plotting conventions:
library(plotrix)
# Function signature (shown for reference, not meant to be run):
# pyramid.plot(lx,rx,labels=NA,top.labels=c("Male","Age","Female"),
#   main="",laxlab=NULL,raxlab=NULL,unit="%",lxcol,rxcol,gap=1,space=0.2,
#   ppmar=c(4,2,4,2),labelcex=1,add=FALSE,xlim,show.values=FALSE,ndig=1,
#   do.first=NULL)
# Applied to the data from the question:
with(level.pyr, pyramid.plot(lx=left, rx=right, labels=level,
    gap=5, top.labels=c("", "Title", ""), labelcex=0.6))
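The point about the dots not reaching the text drawing code can be illustrated with a loose Python analogy, where **extra plays the role of R's "...". All function names here are invented. A function may accept extra arguments and still ignore them unless it forwards them explicitly to the inner call.

```python
def draw_text(label, size=1.0):
    """Stand-in for the inner drawing call; reports the size it was given."""
    return f"{label}@{size}"


def pyramid_ignoring(labels, **extra):
    # Accepts extra keyword arguments but never forwards them,
    # so the inner call always uses its own default size.
    return [draw_text(label) for label in labels]


def pyramid_forwarding(labels, **extra):
    # Forwards the extra keyword arguments to the inner call.
    return [draw_text(label, **extra) for label in labels]


print(pyramid_ignoring(["VP", "SVP+"], size=0.6))
print(pyramid_forwarding(["VP", "SVP+"], size=0.6))
```

Only the forwarding variant changes the label size, which mirrors the suggestion to hack the text calls so that they pick up a named argument passed through the dots.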

Heatmap like plot with Lattice

I cannot figure out how the lattice levelplot works. I have played with it for some time now, but could not find a reasonable solution.
Sample data:
Data <- data.frame(x=seq(0,20,1),y=runif(21,0,1))
Data.mat <- data.matrix(Data)
Plot with levelplot:
library(lattice)
rgb.palette <- colorRampPalette(c("darkgreen","yellow", "red"), space = "rgb")
levelplot(Data.mat, main="", xlab="Time", ylab="", col.regions=rgb.palette(100),
          cuts=100, at=seq(0,1,0.1), ylim=c(0,2), scales=list(y=list(at=NULL)))
This is the outcome:
Since I do not understand how levelplot really works, I cannot make it work. What I would like is for the colour strips to fill the whole window at the corresponding x (Time).
Alternative solution with other method.
Basically, I'm trying here to plot the increasing risk over time, where the red is the highest risk = 1. I would like to visualize the sequence of possible increase or clustering risk over time.
From ?levelplot we're told that if the first argument is a matrix then "'x' provides the 'z' vector described above, while its rows and columns are interpreted as the 'x' and 'y' vectors respectively.", so
> m = Data.mat[, 2, drop=FALSE]
> dim(m)
[1] 21 1
> levelplot(m)
plots a levelplot with 21 columns and 1 row, where the levels are determined by the values in m. The formula interface might look like
> df <- data.frame(x=1, y=1:21, z=runif(21))
> levelplot(z ~ y + x, df)
(these approaches do not quite result in the same image).
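To make the quoted rule from ?levelplot concrete, here is a small Python sketch (not lattice code; the function name is made up) that unrolls a matrix into (x, y, z) triples: the row index becomes x, the column index becomes y, and the cell value is z.

```python
def matrix_to_xyz(matrix):
    """Unroll a list-of-rows matrix into (x, y, z) triples, 1-based like R."""
    return [
        (row_idx, col_idx, z)
        for row_idx, row in enumerate(matrix, start=1)
        for col_idx, z in enumerate(row, start=1)
    ]


# A 3 x 1 matrix, analogous to Data.mat[, 2, drop = FALSE] with fewer rows.
m = [[0.2], [0.5], [0.9]]
triples = matrix_to_xyz(m)
for x, y, z in triples:
    print(x, y, z)
```

Every row gets its own x position and there is a single y position, so a one-column matrix yields one strip of cells along x. Passing the full two-column Data.mat instead adds the time values as a second strip, which is likely why the original plot looked wrong.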
Unfortunately I don't know much about lattice, but I noted your "Alternative solution with other method", so may I suggest another possibility:
library(plotrix)
color2D.matplot(t(Data[ , 2]), show.legend = TRUE, extremes = c("yellow", "red"))
Heaps of things to do to make it prettier. Still, a start. Of course it is important to consider the breaks in your time variable. In this very simple attempt, regular intervals are implicitly assumed, which happens to be the case in your example.
Update
Following the advice in the 'Details' section in ?color2D.matplot: "The user will have to adjust the plot device dimensions to get regular squares or hexagons, especially when the matrix is not square". Well, well, quite ugly solution.
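# Open the device first, then set the margins: par() applies to the device
# that is current when it is called. windows() is Windows-only;
# dev.new(width = 10, height = 2.5) is the portable equivalent.
windows(width = 10, height = 2.5)
par(mar = c(5.1, 4.1, 0, 2.1))
color2D.matplot(t(Data[ , 2]),
                show.legend = TRUE,
                axes = TRUE,
                xlab = "",
                ylab = "",
                extremes = c("yellow", "red"))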

spplot() - make color.key look nice

I'm afraid I have a spplot() question again.
I want the colors in my spplot() to represent absolute values, not automatic values as spplot does it by default.
I achieve this by making a factor out of the variable I want to draw (using the command cut()). This works very fine, but the color-key doesn't look good at all.
See it yourself:
library(sp)
data(meuse.grid)
gridded(meuse.grid) = ~x+y
meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2)
meuse.grid$random[meuse.grid$random < 0] <- 0
meuse.grid$random[meuse.grid$random > 10] <- 10
# making a factor out of meuse.grid$random to have absolute values plotted
meuse.grid$random <- cut(meuse.grid$random, seq(0, 10, 0.1))
spplot(meuse.grid, c("random"), col.regions = rainbow(100, start = 4/6, end = 1))
How can I make the color.key on the right look good? I'd like to have fewer ticks and fewer labels (maybe just one label at each extreme of the color.key).
Thank you in advance!
[edit]
To make clear what I mean with absolute values: Imagine a map where I want to display the sea height. Seaheight = 0 (which is the min-value) should always be displayed blue. Seaheight = 10 (which, just for the sake of the example, is the max-value) should always be displayed red. Even if there is no sea on the regions displayed on the map, this shouldn't change.
I achieve this with the cut() command in my example. So this part works fine.
THIS IS WHAT MY QUESTION IS ABOUT
What I don't like is the color description on the right side. There are 100 ticks and each tick has a label. I want fewer ticks and fewer labels.
The way to go is the colorkey argument. For example:
## labels
labelat = c(1, 2, 3, 4, 5)
labeltext = c("one", "two", "three", "four", "five")
## plot
spplot(meuse.grid,
       c("random"),
       col.regions = rainbow(100, start = 4/6, end = 1),
       colorkey = list(
           labels = list(
               at = labelat,
               labels = labeltext
           )
       )
)
First, it's not at all clear what you want here. There are many ways to make the color.key look "nice", and they all start with understanding the data being passed to spplot and what is being asked of it. cut() provides fully formatted intervals like (2.3, 5.34], which need to be handled differently: increasing the margins in the plot, specific formatting and spacing for the labels, etc. This just may not be what you ultimately want.
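For readers unsure what cut() does with fixed breaks, here is a minimal Python sketch of right-closed binning, the default in R's cut(). The helper names, the coarse breaks and the palette are illustrative assumptions. Because the breaks are fixed in advance, a given value always lands in the same interval and so always gets the same color, whatever else is in the data.

```python
from bisect import bisect_left


def cut_right_closed(value, breaks):
    """Return the zero-based index of the interval (breaks[i], breaks[i+1]]
    containing value, or None when value falls outside every interval."""
    i = bisect_left(breaks, value)
    if i == 0 or i == len(breaks):
        return None  # at or below the first break, or above the last
    return i - 1


def interval_label(index, breaks):
    """Format an interval in the style of cut()'s level labels."""
    return f"({breaks[index]}, {breaks[index + 1]}]"


breaks = [0, 2, 4, 6, 8, 10]  # fixed breaks, independent of the data
palette = ["blue", "cyan", "green", "orange", "red"]  # one color per interval

for height in (0.5, 4, 9.7):
    idx = cut_right_closed(height, breaks)
    print(height, interval_label(idx, breaks), palette[idx])
```

Note that a value exactly equal to the lowest break falls outside every interval here, which mirrors R's default include.lowest = FALSE, where such values become NA.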
Perhaps you just want integer values, rounded from the input values?
library(sp)
data(meuse.grid)
gridded(meuse.grid) = ~x+y
meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2)
Round the values (or trunc(), ceiling(), floor() them . . .)
meuse.grid$rclass <- round(meuse.grid$random)
spplot(meuse.grid, c("rclass"), col.regions = rainbow(100, start = 4/6, end = 1))
