Plot not showing in Julia

I have a file named mycode.jl with the following code, taken from here.
using MultivariateStats, RDatasets, Plots
# load iris dataset
println("loading iris dataset:")
iris = dataset("datasets", "iris")
println(iris)
println("loaded; splitting dataset: ")
# split half to training set
Xtr = Matrix(iris[1:2:end,1:4])'
Xtr_labels = Vector(iris[1:2:end,5])
# split other half to testing set
Xte = Matrix(iris[2:2:end,1:4])'
Xte_labels = Vector(iris[2:2:end,5])
print("split; Performing PCA: ")
# Suppose Xtr and Xte are training and testing data matrix, with each observation in a column. We train a PCA model, allowing up to 3 dimensions:
M = fit(PCA, Xtr; maxoutdim=3)
println(M)
# Then, apply PCA model to the testing set
Yte = predict(M, Xte)
println(Yte)
# And, reconstruct testing observations (approximately) to the original space
Xr = reconstruct(M, Yte)
println(Xr)
# Now, we group results by testing set labels for color coding and visualize first 3 principal components in 3D plot
println("Plotting fn:")
setosa = Yte[:,Xte_labels.=="setosa"]
versicolor = Yte[:,Xte_labels.=="versicolor"]
virginica = Yte[:,Xte_labels.=="virginica"]
p = scatter(setosa[1,:],setosa[2,:],setosa[3,:],marker=:circle,linewidth=0)
scatter!(versicolor[1,:],versicolor[2,:],versicolor[3,:],marker=:circle,linewidth=0)
scatter!(virginica[1,:],virginica[2,:],virginica[3,:],marker=:circle,linewidth=0)
plot!(p,xlabel="PC1",ylabel="PC2",zlabel="PC3")
println("Reached end of program.")
I run the above code from a Linux terminal with the command: julia mycode.jl
The code runs fine and reaches the end, but the plot does not appear.
Where is the problem, and how can it be solved?

As the Output section of the Plots docs says:
A Plot is only displayed when returned (a semicolon will suppress the return), or if explicitly displayed with display(plt), gui(), or by adding show = true to your plot command.
You can have MATLAB-like interactive behavior by setting the default value: default(show = true)
The first part about "when returned" is about when you call plot from the REPL (or Jupyter, etc.), and doesn't apply here.
Here, you can use one of the other options:
calling display(p) after the last plot! call (this is the most common way to do it)
calling gui() after the last plot!
adding a show = true argument to the last plot! call
setting the default to always show the plot by setting Plots.default(show = true) at the beginning of the script
Any one of these is sufficient to make the plot window appear.
The plot closes when the Julia process ends. If that's happening too soon, you can either:
Run your code as julia -i mycode.jl at the terminal - this will run your code, display the plot, and then land you at the Julia REPL. This will both keep the plot open, and let you work with the variables in your code further if you need to.
Add a readline() call at the end of your program. Julia will then wait for you to press Enter/Return, and the plot will stay on display until you do.
(Credit to ffevotte on Julia Discourse for these suggestions.)

Related

bokeh axis limits fail when mixing x_range with y_range across multiple plots

I'm trying to visualize a high-dimensional point set x (here of shape 6 x 42) in a series of 2D scatter plots (x[1] vs. x[2], etc.) using bokeh. See this nice example from scikit-opt as a reference. When x[1] occurs in two plots, it should use the same range in both, and the plots should rescale simultaneously. I have accomplished the linking, but I can't get the plots to scale correctly. Here's a minimal example:
import bokeh
import bokeh.io
import numpy as np
import bokeh.plotting
bokeh.io.output_notebook()
# That's my fictional dataset
x = np.random.randn(6, 42)
x[2] *= 10
# Build the pairwise scatter plots
kw = dict(plot_width=165, plot_height=165)
# `ranges` stores the range in each dimension,
# used as both, x- and y-range depending on
# where the variable is.
figs, ranges = {}, {}
for r, row in enumerate(x):
    for c, col in enumerate(x):
        # compare the indices by value, not by identity
        if r != c:
            fig = bokeh.plotting.figure(
                x_range=ranges.get(c, None), y_range=ranges.get(r, None),
                **kw)
            fig.scatter(x=col, y=row)
            fig.xaxis.axis_label = f'Dim {c}'
            fig.yaxis.axis_label = f'Dim {r}'
            if c not in ranges:
                ranges[c] = fig.x_range
            if r not in ranges:
                ranges[r] = fig.y_range
            figs[f'{r}_{c}'] = fig
        else:
            break
# Setup the plotting layout
plots = [[]]
for r, row in enumerate(x):
    for c, col in enumerate(x):
        # compare the indices by value, not by identity
        if r != c:
            plots[-1].append(figs[f'{r}_{c}'])
        else:
            plots.append([])
            break
staircase = bokeh.layouts.gridplot(plots, **kw)
bokeh.plotting.show(staircase)
When I run this in an IPython notebook (Python >= 3.6), bokeh sets the scale for dim 1 and 2 correctly. Then it starts to set the scale for the following dimensions as in dim 2. Notice that I scaled dim 2 tenfold to make this point.
Interactively, I can rescale the plot back to optimal settings. However, I'd like to do that by default. What options do I have inside bokeh to rescale? I played a bit with fig.xaxis.bounds, but unsuccessfully. Thanks for your help!
Epilogue:
Following @bigreddot's answer, I added the lines:
for i, X in enumerate(x):
    ranges[i].start = X.min()
    ranges[i].end = X.max()
to fix the starting ranges. I still think that the behaviour is a bug.
From your code and description I still can't quite tell what you are hoping to accomplish. [1] But I will state that the default DataRange1d ranges that plots use automatically make space for all renderers, across all plots they are shared by. In this sense, I see exactly what I would expect when I run your code. If you want something different, there are two things you could control:
DataRange1d has a .renderers property. If you only want the "auto" ranging to be over a subset of the renderers, then you can explicitly set this property to the list you want. Renderers are returned by the glyph functions, e.g. fig.scatter
Don't use the "auto" ranges. You can also set the x_range and y_range yourself to be Range1d objects. These have start and end properties that you can set, and these will be the definite bounds of the range, e.g. x_range=Range1d(0, 10)
[1] The ranges are linked in what I would consider an odd way, and I can't tell if that is intended. But that is a result of your looping/python code and not Bokeh.
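The second option can be sketched roughly as follows. This is only an illustration under assumptions: the helper names dimension_bounds and make_shared_ranges and the pad parameter are made up here, and the dataset is a plain-Python stand-in for the one in the question. Range1d with start and end is the only Bokeh API used, and it is imported lazily so the bounds computation runs even without Bokeh installed.

```python
import random


def dimension_bounds(data, pad=0.05):
    """Return {dimension index: (start, end)} from each row's min and max.

    `pad` (hypothetical parameter) adds a small margin, because a fixed
    range does not add any padding around the data by itself.
    """
    bounds = {}
    for i, row in enumerate(data):
        lo, hi = min(row), max(row)
        margin = (hi - lo) * pad
        bounds[i] = (lo - margin, hi + margin)
    return bounds


def make_shared_ranges(bounds):
    """Build one explicit Range1d per dimension (requires Bokeh)."""
    from bokeh.models import Range1d  # lazy import, only needed for plotting
    return {i: Range1d(start=lo, end=hi) for i, (lo, hi) in bounds.items()}


# Stand-in for the fictional dataset in the question: 6 dimensions, 42 points,
# with dimension 2 scaled tenfold.
random.seed(0)
data = [[random.gauss(0, 1) for _ in range(42)] for _ in range(6)]
data[2] = [10 * v for v in data[2]]

bounds = dimension_bounds(data)
```

Each figure would then be created with x_range=ranges[c] and y_range=ranges[r], where ranges = make_shared_ranges(bounds), so a dimension keeps the same fixed bounds wherever it appears and the plots stay linked.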

Issue: ggplot2 replicates last plot of a list in grid

I have 16 plots that I want to arrange in a grid with ggplot2. But whenever I plot them, every cell of the grid shows the same plot: the last plot saved in the list is drawn in all 16 places. To reproduce the issue, here is a simple example with two files. Although the data are entirely different, the plots drawn look the same.
library(ggplot2)
library(grid)
library(gridExtra)
library(scales)
set.seed(1006)
date1<- as.POSIXct(seq(from=1443709107,by=3600,to=1446214707),origin="1970-01-01")
power <- rnorm(length(date1),100,5)#with normal distribution
write.csv(data.frame(date1,power),"file1.csv",row.names = FALSE,quote = FALSE)
# Now another dataset with uniform distribution
write.csv(data.frame(date1,power=runif(length(date1))),"file2.csv",row.names = FALSE,quote = FALSE)
path=getwd()
files=list.files(path,pattern="*.csv")
plist<-list()# for saving intermediate ggplots
for(i in 1:length(files))
{
dframe<-read.csv(paste(path,"/",files[i],sep = ""),head=TRUE,sep=",")
dframe$date1= as.POSIXct(dframe$date1)
plist[[i]]<- ggplot(dframe)+aes(dframe$date1,dframe$power)+geom_line()
}
grid.arrange(plist[[1]],plist[[2]],ncol = 1,nrow=2)
You need to remove the dframe from your call to aes. You should do that anyway because you have provided a data-argument. In this case it's even more important because while you save the ggplot-object, things don't get evaluated until the call to plot/grid.arrange. When you do that, it looks at the current value of dframe, which is the last dataset in your iteration.
You need to plot with:
ggplot(dframe)+aes(date1,power)+geom_line()
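The late-evaluation pitfall described above is not specific to ggplot2. As a rough analogy only (this is not how ggplot2 is implemented), the Python sketch below shows the same effect with functions created in a loop. All names (datasets, frame, lazy_plots, bound_plots) are made up for illustration.

```python
# Two stand-in "datasets" with clearly different contents.
datasets = {"file1": [1, 2, 3], "file2": [10, 20, 30]}

# Pitfall: each stored function refers to the variable `frame` by name.
# The name is looked up only when the function is called, after the loop
# has finished, so every function sees the last dataset.
lazy_plots = []
for name in datasets:
    frame = datasets[name]
    lazy_plots.append(lambda: sum(frame))

# Fix: bind the current data to the function at creation time.
bound_plots = []
for name in datasets:
    frame = datasets[name]
    bound_plots.append(lambda frame=frame: sum(frame))

lazy_results = [p() for p in lazy_plots]
bound_results = [p() for p in bound_plots]

print(lazy_results)   # [60, 60]
print(bound_results)  # [6, 60]
```

In the R answer, the corresponding fix is to let aes refer to column names of the data argument, so each plot keeps its own data instead of looking up dframe at draw time.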

Bandwidth when plotting densities in R

When I plot the density for wind direction using circular package, I get an error. The error is shown below. Can someone explain the bw (bandwidth) that I need for the amount of data?
plot(density(dirCir))
Error in density.circular(dirCir) :
argument "bw" is missing, with no default
This is the actual code that I have.
library (circular)
dir <-c(308,351,330,16,3,346,345,345,287,359,345,358,336,335,346,16,325,354,5,354,322,340,6,278,354,343,261,353,288,8)
dirCir <- circular(dir, units ="degrees", template = "geographics")
mean(dirCir)
var(dirCir)
summary(dirCir)
plot(dirCir)
plot(density(dirCir))
rose.diag(dirCir, main = 'dir Data')
points(dirCir)
As @eipi10 says, bw has to be chosen explicitly. Depending on the kernel you choose, large and small values of this bandwidth parameter may produce anything from spiky density estimates to very smooth ones.
Common practice is to try several values and choose the one that seems to describe the data best. However, note that the following functions provide more objective ways of selecting the bw:
# bw.cv.mse.circular(dirCir)
[1] 21.32236
# bw.cv.mse.circular(dirCir, kernel = "wrappednormal")
[1] 16.97266
# bw.cv.ml.circular(dirCir)
[1] 19.71197
# bw.cv.ml.circular(dirCir, kernel = "wrappednormal")
[1] 0.2280636
# bw.nrd.circular(dirCir)
[1] 14.63382
When you run density on an object of class circular, it appears that you have to include a value for bw (bandwidth) explicitly (as the error message indicates). Try this:
plot(density(dirCir, kernel="wrappednormal", bw=0.02), ylim=c(-1,5))
The ylim range is set so that the plot fits inside the plot area without clipping. See the help for density.circular for more info on running the density function on circular objects.

R programming - Graphic edges too large error while using clustering.plot in EMA package

I'm an R beginner and I'm trying to use the clustering.plot function from the R package EMA. My clustering works fine and I can see the results as well. However, when I try to generate a heat map using clustering.plot, I get the error "Error in plot.new (): graphic edges too large". My code is below:
#Loading library
library(EMA)
library(colonCA)
#Some information about the data
data(colonCA)
summary(colonCA)
class(colonCA) #Expression set
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
#Applying average linkage clustering on colonCA data using Pearson correlation
expr_genes <- genes.selection(expr_mat, thres.num=100)
expr_sample <- clustering(expr_mat[expr_genes,],metric = "pearson",method = "average")
expr_gene <- clustering(data = t(expr_mat[expr_genes,]),metric = "pearson",method = "average")
expr_clust <- clustering.plot(tree = expr_sample,tree.sup=expr_gene,data=expr_mat[expr_genes,],title = "Heat map of clustering",trim.heatmap =1)
I do not get any error when it comes to actually executing the clustering process. Could someone help?
In your example, some of the rownames of expr_mat are very long (max(nchar(rownames(expr_mat))) = 271 characters). The clustering.plot function tries to make a margin large enough for all the names, but because the names are so long, there isn't room for anything else.
The really long names seem to have long stretches of periods in them. One way to condense the names of these genes is to replace runs of 2 or more periods with just one, so I would add in this line
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
rownames(expr_mat)<-gsub("\\.{2,}","\\.", rownames(expr_mat))
Then you can run all the other commands and plot like normal.
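To illustrate what that pattern matches, here is the same substitution sketched in Python with re.sub. The R line above remains the actual fix; the helper name collapse_periods and the sample names are invented for this sketch and are not taken from colonCA.

```python
import re


def collapse_periods(name):
    """Replace every run of two or more periods with a single period."""
    return re.sub(r"\.{2,}", ".", name)


# Hypothetical row names, made up for this sketch.
sample_names = ["Hsa.3004", "GENE....LONG......NAME", "a..b.c"]
condensed = [collapse_periods(n) for n in sample_names]

print(condensed)  # ['Hsa.3004', 'GENE.LONG.NAME', 'a.b.c']
```

Single periods are left alone, so names that are already short are unchanged.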

New gplots update in R cannot find function distfun in heatmap.2

I have a bit of R-code to make a heatmap from a correlation matrix, which worked the last time I used it (prior to the 2013 Oct 17 update of gplots; after updating to R Version 3.0.2). This makes me think that something changed in the most recent gplots update, but I can not figure out what.
What used to present a nice plot now gives me this error:
" Error in hclustfun(distfun(x)) : could not find function "distfun" "
and won't plot anything. Below is the code to reproduce the plot (heavily commented as I was using it to teach an undergrad how to use heatmaps for a project). I tried adding the last line to explicitly set the functions, but it didn't help resolve the problem.
EDIT: I changed the last line of code to read:
,distfun =function(c) {as.dist(1-c,upper=FALSE)}, hclustfun=hclust)
and it worked. When I used just "dist=as.dist" I got a plot, but it wasn't sorted right, and several of the dendrogram branches didn't connect to the tree. Not sure what happened, or why this is working, but it appears to be.
Any help would be greatly appreciated.
Thanks in advance,
library(gplots)
set.seed(12345)
randData <- as.data.frame(matrix(rnorm(600),ncol=6))
randDataCorrs <- randData+(rnorm(600))
names(randDataCorrs) <- paste(names(randDataCorrs),"_c",sep="")
randDataExtra <- cbind(randData,randDataCorrs)
randDataExtraMatrix <- cor(randDataExtra)
heatmap.2(randDataExtraMatrix, # sets the correlation matrix to use
symm=TRUE, #tells whether it is symmetrical or not
main= "Correlation matrix\nof Random Data Cor", # Names plot
xlab= "x groups",ylab="", # Sets the x and y labels
scale="none", # Tells it not to scale the data
col=redblue(256),# Sets the colors (can be manual, see below)
trace="none", # tells it not to add a trace
symkey=TRUE,symbreaks=TRUE, # Tells it to keep things symmetric around 0
density.info = "none"#) # can be "histogram" if you want a hist of your corr values here
#,distfun=dist, hclustfun=hclust)
,distfun =function(c) {as.dist(1-c,upper=FALSE)}, hclustfun=hclust) # new last line
I had the same error, and then I noticed that I had created a variable called dist, which is also the default value of distfun (distfun = dist). I renamed the variable and everything ran fine. You likely made the same mistake; your new code works because you no longer rely on the default value of distfun.
