R: Automatically expand margins in VIM::aggr plots - r

I'm trying to visualise missing data with the R package VIM.
I'm using R version 3.4.0 with RStudio
I've used the function aggr() but the colnames of my dataframe seem to be too long. Thus, some labels of the x axis don't appear.
I would like to increase the space at the bottom of the x axis.
library(VIM)
aggr(df)
Here is my dataframe df and the plot I obtain
I've tried with par() function but it doesn't change anything.
aggr(df,mar=c(10,5,5,3))
or
par(mar=c(10,5,5,3))
g=aggr(df,plot=FALSE)
plot(g)
I can reduce the font size with cex.axis but then labels are too small.
aggr(df,cex.axis=.7)
Here is the plot with small axis labels:
I've not find a lot of examples using aggr() that's why I ask for your help.
Thank you in advance.

I think you are looking for a graphical parameter oma which will allow you to resize the main plot. The help reference states:
For plot.aggr, further graphical parameters to be passed down. par("oma") will be set appropriately unless supplied (see par).
In your case you could do something like:
aggr(df, prop = T, numbers = F, combined = F,
labels = names(df), cex.axis = .9, oma = c(10,5,5,3))
Obviously, you need to play around with cex.axis and other parameters to find out what works best for your data.

Related

How do I edit the dendrogram whilst using heatmap.2 function in r

I am trying to create a heatmap to represent the change of gene expression over a period of time. the code I have used is this:
coul <- colorRampPalette(brewer.pal(8, "Reds"))(25)
heatmap.2(dm, dendogram=c("row"),Colv=NA, xlab="Time points", ylab="Genes of interest", scale="row", col=coul, tracecol = NA)
As one cant really see the branches of the dendogram that well, I was wondering whether you can somehow stretch it out to become more visible?
I was also wondering how to remove the "colour key and histogram" label.
Many thanks!
if you don't mind the color guide being wide, you just use the lwid option, you specify a vector that decides the ratio of dendrogram to heatmap, below I use c(3,3), which means 1:1.
set.seed(100)
x = matrix(rnorm(1000),100,10)
heatmap.2(x,trace="none",Colv=NA,dendrogram=c("row"),tracecol = NA)
heatmap.2(x,trace="none",Colv=NA,
dendrogram=c("row"),lwid=c(3,3),tracecol = NA,keysize=0.75)
One way to narrow the margins of the color guide is to use key.par, where you set the margins on the right to be larger (I use 10 in example below).
heatmap.2(x,trace="none",Colv=NA,dendrogram=c("row"),
lwid=c(3,3),tracecol = NA,keysize=0.75,key.par=list(mar=c(3,3,3,10)))

Manually creating an object that looks like a heatmap color key

I'm working on trying to create a key for a heatmap, but as far as I know, I cannot use the existing tools for adding a legend since I've generated the colors myself (I manually turn a scaled variable into rgb values for a short rainbow ( [255,0,0] to [0,0,255] ).
Basically, all I want to do is use the rightmost 10th of the screen to create a rectangle with these 10 colors: "#0000FF", "#0072FF", "#00E3FF", "#00FFAA", "#00FF38", "#39FF00", "#AAFF00", "#FFE200", "#FF7100", "#FF0000"
with three numerical labels - at 0, max/2, and max
In essence, I want to manually produce an object that looks like a rudimentary heatmap color key.
As far as I know, split.screen can only split the screen in half, which isn't what I'm looking for. I want the graphic I already know how to produce to take up the leftmost 90% of the screen, and I want this colored rectangle to take up the other 10%.
Thanks.
EDIT: I greatly appreciate the advice about the best way to the the plot - that said, I still would like to know the best way to do the task originally asked - creating the legend by hand; I already am able to produce the exact heatmap graphic that I'm looking for - the false coloring wasn't the only problem with ggplot that I was having - it was just the final factor convincing me to switch. I need a non ggplot solution.
EDIT #2: This is close to the solution I am looking for, except this only goes up to 10 instead of accepting a maximum value as a parameter (I will be running this code on multiple data-sets, all with different maximum values - I want the legend to reflect this). Additionally, if I change the size of the graph, the key falls apart into disconnected squares.
Take a look at the layouts function (link). I think you want something like this:
layout(matrix(c(1,2), 1, 2, byrow = TRUE), widths=c(9,1))
## plot heatmap
## plot legend
I would also recommend the ggplot2 package and the geom_tile function which will take care of all of this for you.
Assuming your data is in a data frame with the x and y coordinates and heatmap value (e.g. gdat <- data.frame(x_coord=c(1,2,...), y_coord=c(1,1,...), val=c(6,2,...))) Then you should be able to produce your desired heat map plot with the following ggplot command:
ggplot(gdat) + geom_tile(aes(x=x_coord, y=y_coord, fill=val)) +
scale_fill_gradient(low="#0000FF", high="#FF0000")
To get your data into the following format you may want to look into the very useful reshape2 package.
Given a script no ggplot restriction on this answer here is how one could produce the plot with just base R.
colors <- c("#0000FF", "#0072FF", "#00E3FF", "#00FFAA", "#00FF38",
"#39FF00", "#AAFF00", "#FFE200", "#FF7100", "#FF0000")
layout(matrix(c(1,2), 1, 2, byrow = TRUE), widths=c(9,1))
plot(rnorm(20), rnorm(20), col=sample(colors, 20, replace=TRUE))
par(mar=c(0,0,0,0))
plot(x=rep(1,10), y=1:10, col=colors, pch=15, cex=7.1)
You may have to adjust the cex for your device.

Odd axis label behaviour after setting xlim in pyramid.plot [plotrix]

I'm trying to make an "opposing stacked bar chart" and have found pyramid.plot from the plotrix package seems to do the job. (I appreciate ggplot2 will be the go-to solution for some of you, but I'm hoping to stick with base graphics on this one.)
Unfortunately it seems to do an odd thing with the x axis, when I try to set the limits to non integer values. If I let it define the limits automatically, they are integers and in my case that just leaves too much white space. But defining them as xlim=c(1.5,1.5) produces the odd result below.
If I understand correctly from the documentation, there is no way to pass on additional graphical parameters to e.g. suppress the axis and add it on later, or let alone define the tick points etc. Is there a way to make it more flexible?
Here is a minimal working example used to produce the plot below.
require(plotrix)
set.seed(42)
pyramid.plot(cbind(runif(7,0,1),
rep(0,7),
rep(0,7)),
cbind(rep(0,7),
runif(7,0,1),
runif(7,0,1)),
top.labels=NULL,
gap=0,
labels=rep("",7),
xlim=c(1.5,1.5))
Just in case it is of interest to anyone else, I'm not doing a population pyramid, but rather attempting a stacked bar chart with some of the values negative. The code above includes a 'trick' I use to make it possible to have a different number of sets of bars on each side, namely adding empty columns to the matrix, hopefully someone will find that useful - so sorry the working example is not as minimal as it could have been!
Setting the x axis labels using laxlab and raxlab creates a continuous axis:
pyramid.plot(cbind(runif(7,0,1),
rep(0,7),
rep(0,7)),
cbind(rep(0,7),
runif(7,0,1),
runif(7,0,1)),
top.labels=NULL,
gap=0,
labels=rep("",7),
xlim=c(1.5,1.5),
laxlab = seq(from = 0, to = 1.5, by = 0.5),
raxlab=seq(from = 0, to = 1.5, by = 0.5))

Use wordlayout results for ggplot geom_text

The R package wordcloud has a very useful function which is called wordlayout. It takes initial positions of words and their respective sizes an rearranges them in a way that they do not overlap. I would like to use the results of this functions to do a geom_text plot in ggplot.
I came up with the following example but soon realized that there seems to be a big difference betweetn cex (wordlayout) and size (geom_plot) since words in graphics package appear way larger.
here is my sample code. Plot 1 is the original wordcloud plot which has no overlaps:
library(wordcloud)
library(tm)
library(ggplot2)
samplesize=100
textdf <- data.frame(label=sample(stopwords("en"),samplesize,replace=TRUE),x=sample(c(1:1000),samplesize,replace=TRUE),y=sample(c(1:1000),samplesize,replace=TRUE),size=sample(c(1:5),samplesize,replace=TRUE))
#plot1
plot.new()
pdf(file="plot1.pdf")
textplot(textdf$x,textdf$y,textdf$label,textdf$size)
dev.off()
#plot2
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot2.pdf")
#plot3
new_pos <- wordlayout(x=textdf$x,y=textdf$y,words=textdf$label,cex=textdf$size)
textdf$x <- new_pos[,1]
textdf$y <- new_pos[,2]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot3.pdf")
#plot4
textdf$x <- new_pos[,1]+0.5*new_pos[,3]#this is the way the wordcloud package rearranges the positions. I took this out of the textplot function
textdf$y <- new_pos[,2]+0.5*new_pos[,4]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot4.pdf")
is there a way to overcome this cex/size difference and reuse wordlayout for ggplots?
cex stands for character expansion and is the factor by which text is magnified relative the default, specified by cin - set on my installation to 0.15 in by 0.2 in: see ?par for more details.
#hadley explains that ggplot2 sizes are measured in mm. Therefore cex=1 would correspond to size=3.81 or size=5.08 depending on if it is being scaled by the width or height. Of course, font selection may cause differences.
In addition, to use absolute sizes, you need to have the size specification outside the aes otherwise it considers it a variable to map to and choose the scale itself, eg:
ggplot(textdf,aes(x,y))+geom_text(aes(label = label),size = textdf$size*3.81)
Sadly I think you're going to find the short answer is no! I think the package handles the text vector mapping differently from ggplot2, so you can tinker with size and font face/family, etc. but will struggle to replicate exactly what the package is doing.
I tried a few things:
1) Try to plot the grobs from textdata using annotation_custom
require(plyr)
require(grid)
# FIRST TRY PLOT INDIVIDUAL TEXT GROBS
qplot(0:1000,0:1000,geom="blank") +
alply(textdf,1,function(x){
annotation_custom(textGrob(label=x$label,0,0,c("center","center"),gp=gpar(cex=x$size)),x$x,x$x,x$y,x$y)
})
2) Run the wordlayout() function which should readjust the text, but difficult to see for what font (similarly doesn't work)
# THEN USE wordcloud() TO GET CO-ORDS
plot.new()
wordlayout(textdf$x,textdf$y,words=textdf$label,cex=textdf$size,xlim=c(min(textdf$x),max(textdf$x)),ylim=c(min(textdf$y),max(textdf$y)))
plotdata<-cbind(data.frame(rownames(w)),w)
colnames(plotdata)=c("word","x","y","w","h")
# PLOT WORDCLOUD DATA
qplot(0:1000,0:1000,geom="blank") +
alply(plotdata,1,function(x){
annotation_custom(textGrob(label=x$word,0,0,c("center","center"),gp=gpar(cex=x$h*40)),x$x,x$x,x$y,x$y)
})
Here's a cheat if you just want to overplot other ggplot functions on top of it (although the co-ords don't seem to match up exactly between the data and the plot). It basically images the wordcloud, removes the margins, and under-plots it at the same scale:
# make a png file of just the panel
plot.new()
png(filename="bgplot.png")
par(mar=c(0.01,0.01,0.01,0.01))
textplot(textdf$x,textdf$y,textdf$label,textdf$size,xaxt="n",yaxt="n",xlab="",ylab="",asp=1)
dev.off()
# library to get PNG file
require(png)
# then plot it behind the panel
qplot(0:1000,0:1000,geom="blank") +
annotation_custom(rasterGrob(readPNG("bgplot.png"),0,0,1,1,just=c("left","bottom")),0,1000,0,1000) +
coord_fixed(1,c(0,1000),c(0,1000))

Extending the scale of an axis

I have generated the following histogram in R:
I generated it using this hist() call:
hist(x[,1], xlab='t* (Transition Statistic)',
ylab='Proportion of Resamples (n = 10,000)',
main='Distribution of Resamples', col='lightblue',
prob=TRUE, ylim=c(0.00,0.05),xlim=c(1725,max(x[,1])+10))
Plus the following abline():
abline(v=1728,col=4,lty=1,lwd=2)
That vertical line indicates the actual location of a test statistic, which I am comparing to the results of permutation samples.
My question is this: as you can see, the x scale does not extend back to the vertical line. I would really like it to do so, because I think it looks odd otherwise. How can I make this happen?
I have already tried the xaxs="i" parameter, which has no effect. I have also tried making my own axis with axis() but this requires making both axes again from scratch, and the results don't look that great to me. So, I suspect there must be an easier way to do this. Is there? And, if not, can anyone suggest what axis() command might work well, assuming I want everything to look basically the same, but with the longer x scale?
The usual R plot draws a frame around the plot. To add this, do:
box()
after the plot.
If that isn't what you want, you need to suppress axis plotting and then add your own later.
hist(...., axes = FALSE) ## .... is where your other args go
axis(side = 2)
axis(side = 1, at = seq(1730, 1830, by = 20))
That won't go quite to the vertical line but may be close enough. If you want a tick at the vertical line, choose different tick marks, e.g.
axis(side = 1, at = seq(1725, 1835, by = 20))
Since R is using gaps of 20 for the x-axis here, you can get the extension you want using 1720 rather than 1725 for the lower limit , i.e. with xlim=c(1720,max(x[,1])+10) which would produce something like

Resources