Formatting Issue with barchart() of Cluster Analysis - r

I've created a segment profile plot of my cluster analysis but I'm having an issue with the formatting of a barchart() command. Here is the chart I created. The obvious issue is that my lines are too close together to read.
Here is the code I used to create this chart. Can someone tell me what to add to make the chart readable?
R code for reproducing the clustering and PCA we used:
## if not installed, install: install.packages("flexclust")
library("flexclust")
load("vacpref.RData")
cl6 <- kcca(vacpref, k=vacpref6, control=list(iter=0),
simple=FALSE, save.data=TRUE)
summary(cl6)
## hierarchical clustering of the variables
varhier <- hclust(dist(t(vacpref)), "ward")
par(mar=c(0,0,0,15))
plot(as.dendrogram(varhier), xlab="", horiz=TRUE,yaxt="n")
## principal component projection
vacpca <- prcomp(vacpref)
R code for generating the Segment Separation Plot
pairs(cl6, project=vacpca, which=1:3, asp=TRUE,points=FALSE,
hull.args=list(density=10))
R code for generating the Segment Positioning Plot:
col <- flxColors(1:6)
col[c(1,3)] <- flxColors(1:4, "light")[c(1,3)]
par(mar=rep(0,4))
plot(cl6, project=vacpca, which=2:3,
col=col,asp=TRUE,points=F,hull.args=list(density=10),axes=FALSE)
projAxes(vacpca, minradius=.5, which=2:3, lwd=2, col="darkblue")
R code for generating the Segment Profile Plot:
barchart(cl6, shade=TRUE, which=rev(varhier$order),legend=TRUE)
The last command was the one I used to create my segment profile plot but I wasn't sure if the commands before may have affected it in any way. I'm new to R.

One trick I often use is to change the width/height and resolution through exporting the image. Try this:
png("c:\\temp\\myCrazyPlot.png", res=250, height=8, width=12, units="in")
barchart(cl6, shade=TRUE, which=rev(varhier$order),legend=TRUE)
# And whatever other plot commands for the same plot
dev.off()
Then go check your .png file. By adjusting the height and width, you can change the spacing of the labels on the y-axis. You can even make the height larger than the width to let the labels spread out. (I think you currently can't do that on screen because the plot window is limited by the height of your screen?)
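To make the sizing idea concrete, here is a minimal sketch that derives the device height from the number of labelled bars. It uses base barplot() on made-up data rather than the flexclust barchart() from the question, and the 0.25 inch per label rule of thumb is only an assumption to tune. pdf() is used because it is available on every R build; the same width/height logic applies to png() with units="in".

```r
# Hypothetical stand-in data: 40 labelled bars (not the vacpref data).
n_vars <- 40
values <- setNames(runif(n_vars), paste("variable", seq_len(n_vars)))

# Assumed rule of thumb: 0.25 inch per label plus 1.5 inches for margins.
height_in <- 1.5 + 0.25 * n_vars

out_file <- file.path(tempdir(), "tall_barplot.pdf")
pdf(out_file, width = 8, height = height_in)
par(mar = c(4, 8, 1, 1))   # wide left margin so the labels fit
barplot(values, horiz = TRUE, las = 1, cex.names = 0.8)
dev.off()
```

If the labels still collide, increasing the per-label allowance or lowering cex.names are the two knobs to try first.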

Related

R igraph output vertice is not shown

I am using R igraph package to display gene networks. The plot on Rstudio is like this (I can't post image because I am new user and don't have enough reputation, sorry about that):
R igraph on preview
Now I want to draw this on file to clearly see the changes and there is always an issue on vertices near margin side like this:
part of output pdf file
My code is as follows:
pdf("graph.pdf",width = 20, height = 10)
par(mar = c(9,9,9,9))
plot(finalnet, edge.arrow.size=0.1, edge.curved=FALSE,vertex.size= 3, margin = -0.5)
dev.off()
Update: I have tried square layout and the problem persists, here is my plotting object and square plot.
square plot
rda file for my igraph object
Can anyone give me a suggestion for how to solve this issue? The whole net is about 170 vertices, but I don't know why it cannot be displayed well in the output file. I have tried different plot options in mai and mar, but none of them helped.
You are getting this behavior because you are specifying margin in your plot call. margin=-0.5 tells R to extend the plot 0.5 units past the graphics device dimensions. Below are three examples:
Your original plotting call, notice the clipping
pdf("withMargin.pdf")
par(mar=c(9,9,9,9))
plot(g, margin=-0.5)
dev.off()
Without the call to par, the problem still persists, but now you use the entire dimension of the graphics device.
png("withoutPar_Margin.png")
#par(mar=c(9,9,9,9))
plot(g, margin=-0.5)
dev.off()
Lastly, removing the margin in plot
png("withoutplotMargin.png")
par(mar=c(9,9,9,9))
plot(g)
dev.off()
You're specifying a rectangular size for what looks like a square object. Try a square size, as in
pdf("graph.pdf")
This will use the defaults, which are square.
But, it's hard to know for sure since you haven't given us the object to troubleshoot for you.

How to move the legend to outside the plotting area in Plots.jl (GR)?

I have the following plot where part of the data is being obscured by the legend:
using Plots; gr()
using StatPlots
groupedbar(rand(1:100,(10,10)),bar_position=:stack, label="item".*map(string,collect(1:10)))
I can see that using the "legend" attribute, the legend can be moved to various locations within the plotting area, for example:
groupedbar(rand(1:100,(10,10)),bar_position=:stack, label="item".*map(string,collect(1:10)),legend=:bottomright)
Is there any way of moving the plot legend completely outside the plotting area, for example to the right of the plot or below it? For these kinds of stacked bar plots there's really no good place for the legend inside the plot area. The only solution I've been able to come up with so far is to make some "fake" empty rows in the input data matrix to make space with some zeros, but that seems kind of hacky and will require some fiddling to get the right number of extra rows each time the plot is made:
groupedbar(vcat(rand(1:100,(10,10)),zeros(3,10)),bar_position=:stack, label="item".*map(string,collect(1:10)),legend=:bottomright)
I can see that there was some kind of solution proposed for pyplot; does anyone know of a similar solution for the GR backend? Another solution I could imagine: is there a way to save the legend itself to a different file so I can then put them back together in Inkscape?
This is now easily enabled with Plots.jl:
Example:
plot(rand(10), legend = :outertopleft)
Using layouts I can create a workaround making a fake plot with legend only.
using Plots
gr()
l = @layout [a{0.001h}; b c{0.13w}]
values = rand(1:100,(10,10))
p1 = groupedbar(values,bar_position=:stack, legend=:none)
p2 = groupedbar(values,bar_position=:stack, label="item".*map(string,collect(1:10)), grid=false, xlims=(20,3), showaxis=false)
p0=plot(title="Title",grid=false, showaxis=false)
plot(p0,p1,p2,layout=l)

plot igraph in a big area

Just wondering if it is possible to increase the size of the plot so that the nodes and edges can be more scattered over the plot.
Original plot:
What are expected:
I tried many parameters in the layout function such as area, niter, and so on, but all of them do not work. By the way, I am using 'igraph' package in R.
If you are referring to the actual size of the produced output (pdf, png, etc.), you can configure it with the width and height parameters. Check this link for png, bmp, etc., and this link for PDF format.
A MWE is something like this:
png("mygraph.png", height=400, width=600)
#functions to plot your graph
dev.off()
If you are referring to the size of the graphic produced by the layout function, as @MrFlick mentioned, you should check the parameters of the particular layout you are using.
Hope it helps you.
In your second graph, the graph can obviously be divided into several clusters (or sections). If I understood you correctly, you want a layout that separates your clusters more visibly.
Then you can draw this by calculating a two-level layout:
First, calculate the layout of the graph in order to find a place for each cluster.
Second, calculate the layout in each cluster according to first step and plot nodes in the corresponding place.
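A minimal sketch of the two-level idea in base R, under simplifying assumptions: the cluster membership vector is hypothetical, and plain circles are swapped in for a real layout algorithm at both levels (cluster centres on a large circle, nodes on a small circle around their centre). The radii 10 and 2 are arbitrary.

```r
# Hypothetical membership vector: node i belongs to cluster membership[i].
membership <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3)
k <- max(membership)

# Level 1: place each cluster centre on a large circle (assumed radius 10).
angle_k <- 2 * pi * (seq_len(k) - 1) / k
centres <- cbind(10 * cos(angle_k), 10 * sin(angle_k))

# Level 2: place each node on a small circle (assumed radius 2) around its centre.
coords <- matrix(NA_real_, nrow = length(membership), ncol = 2)
for (cl in seq_len(k)) {
  idx <- which(membership == cl)
  a <- 2 * pi * (seq_along(idx) - 1) / length(idx)
  coords[idx, ] <- cbind(centres[cl, 1] + 2 * cos(a),
                         centres[cl, 2] + 2 * sin(a))
}
```

The resulting two-column matrix has one row per node, so it can be passed to igraph as plot(g, layout = coords), or inspected without igraph via plot(coords, col = membership, pch = 19, asp = 1).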

Specify plot area in a multiplot

I create a row of plots in a graphic device with the par() command and run the first 2 plots:
par(mfrow = c(1, 4))
hist(mydata)
boxplot(y ~ x)
Now let's say the boxplot is wrong and I want to replace it with a new one. By default the next plot goes to the right of the previous one (or one row below, in the first column, in the case of a multi-row layout), leaving the previous plot unchanged.
Is there a way to specify the location of the next plot in the multiplot grid area?
To specify the location of the next plot in the multiplot grid area, I prefer to use the function layout.
The layout function provides an alternative to the mfrow and mfcol settings.
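For example, the equivalent of par(mfrow = c(1, 4)) is:
layout(matrix(c(1, 2, 3, 4), byrow=TRUE, ncol=4))
whereas a custom order, in which the second plot is drawn in the third column and the third plot in the second column, is:
layout(matrix(c(1, 3, 2, 4), byrow=TRUE, ncol=4))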
The function layout.show() may be helpful for visualizing the figure regions
that are created. The following code creates a figure visualizing the layout
created in the previous example:
layout.show(4)
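To see how the numbers in the layout matrix control drawing order, the following sketch labels each panel with the order in which it was drawn. With the matrix c(1, 3, 2, 4), the second plot lands in the third column. The output file name is an arbitrary choice for illustration.

```r
out_file <- file.path(tempdir(), "layout_order.pdf")
pdf(out_file, width = 12, height = 3)

# layout() returns the number of figures it defines.
n_fig <- layout(matrix(c(1, 3, 2, 4), byrow = TRUE, ncol = 4))

for (i in seq_len(n_fig)) {
  plot.new()                                  # advance to the i-th figure region
  box()
  text(0.5, 0.5, paste("plot", i), cex = 2)   # label the panel with its draw order
}
dev.off()
```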
The base graphics model is ink-on-paper and does not allow revisions. The lattice and ggplot models are based on lists that can be modified. You can "go back" and add items with lines and points, and as pointed out you can change the focus to a particular panel, but removing or replacing things is not possible. Re-running the code shouldn't be a big problem, should it? Pixels are very cheap.
You can specify the next frame to plot to using the mfg argument to par. See ?par for details. So a command like:
par(mfg=c(1,2))
will mean that the next high-level plot goes to the frame in the 1st row, 2nd column. This can be used to plot in your own custom order. However, using layout for this is probably easier and better in most cases.
When you use this to specify a frame, R assumes the frame is ready to be plotted in and will not remove anything already there. If there is an existing plot, the new one will be drawn over it; you will likely see both plots, and it won't look pretty.
You can draw a rectangle over the top of an existing plot to give yourself a blank frame to plot in using code like:
par(xpd=NA)
rect( grconvertX(0, from='nfc'), grconvertY(0,from='nfc'),
grconvertX(1,from='nfc'), grconvertY(1, from='nfc'),
col='white',border='white')
This works OK for viewing on screen, but be careful when exporting or printing: in some cases the printer or the interpreter of the graphics file will treat the white rectangle as "do nothing" and you will again see both plots.
In general it is best to do plots that take more than a line or 2 of code in a script window so that if you want to change something you can edit the script and just recreate the whole plot from scratch rather than relying on tricks like this.
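Putting the mfg and white-rectangle pieces together, a minimal sketch might look like this. The data and file name are made up, and the rectangle is drawn while the frame to be replaced is still the current figure, which is the simplest case.

```r
out_file <- file.path(tempdir(), "replace_panel.pdf")
pdf(out_file, width = 12, height = 3)
par(mfrow = c(1, 4))
x <- rnorm(100)              # hypothetical data
hist(x)
boxplot(x)                   # the plot to be replaced (row 1, column 2)

# Blank out the current figure region, which is still the boxplot's frame.
par(xpd = NA)
rect(grconvertX(0, from = "nfc"), grconvertY(0, from = "nfc"),
     grconvertX(1, from = "nfc"), grconvertY(1, from = "nfc"),
     col = "white", border = "white")
par(xpd = FALSE)

par(mfg = c(1, 2))           # next high-level plot goes to row 1, column 2
boxplot(x, horizontal = TRUE)
frame_used <- par("mfg")[1:2]  # record which frame was just drawn into
dev.off()
```

Re-running a script that draws the whole figure from scratch remains the more robust option, for the export reasons given above.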

How to plot dendrograms with large datasets?

I am using ape (Analysis of Phylogenetics and Evolution) package in R that has dendrogram drawing functionality. I use following commands to read the data in Newick format, and draw a dendrogram using the plot function:
library("ape")
gcPhylo <-read.tree(file = "gc.tree")
plot(gcPhylo, show.node.label = TRUE)
As the data set is quite large, it is impossible to see any details in the lower levels of the tree. I can see just black areas but no details. I can only see few levels from the top, and then no detail.
I was wondering if there is any zoom capability of the plot function. I tried to limit the area using xLim and yLim, however, they just limit the area, and do not zoom to make the details visible. Either zooming, or making the details visible without zooming will solve my problem.
I would also appreciate hearing about any other package, function, or tool that would help me overcome the problem.
Thanks.
It is possible to cut a dendrogram at a specified height and plot the elements:
First create a clustering using the built-in dataset USArrests. Then convert to a dendrogram:
hc <- hclust(dist(USArrests))
hcd <- as.dendrogram(hc)
Next, use cut.dendrogram to cut at a specified height, in this case h=75. This produces a list containing a dendrogram for the upper part of the cut and a list of dendrograms, one for each branch below the cut:
par(mfrow=c(3,1))
plot(hcd, main="Main")
plot(cut(hcd, h=75)$upper,
main="Upper tree of cut at h=75")
plot(cut(hcd, h=75)$lower[[2]],
main="Second branch of lower tree with cut at h=75")
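For a large tree it may be more practical to write every branch below the cut to its own page rather than picking one by index. The sketch below does this for the same USArrests example; the output file name and page size are arbitrary choices.

```r
hc <- hclust(dist(USArrests))
hcd <- as.dendrogram(hc)
parts <- cut(hcd, h = 75)    # list with $upper and $lower

out_file <- file.path(tempdir(), "dendrogram_branches.pdf")
pdf(out_file, width = 10, height = 6)   # each plot() call starts a new page
plot(parts$upper, main = "Upper tree of cut at h=75")
for (i in seq_along(parts$lower)) {
  branch <- parts$lower[[i]]
  if (is.leaf(branch)) next             # skip single-leaf branches
  plot(branch, main = paste("Branch", i, "below h=75"))
}
dev.off()

# Every observation ends up in exactly one lower branch.
n_leaves <- sum(sapply(parts$lower, function(b) attr(b, "members")))
```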
The cut function described in the other answer is a very good solution; if you would like to maintain the whole tree on one page for some interactive investigation you could also plot to a large page on a PDF.
The resulting PDF is vectorized so you can zoom in closely with your favourite PDF viewer without loss of resolution.
Here's an example of how to direct plot output to PDF:
# Open a PDF for plotting; units are inches by default
pdf("/path/to/a/pdf/file.pdf", width=40, height=15)
# Do some plotting
plot(gcPhylo)
# Close the PDF file's associated graphics device (necessary to finalize the output)
dev.off()

Resources