Odd axis label behaviour after setting xlim in pyramid.plot [plotrix] - r

I'm trying to make an "opposing stacked bar chart" and have found pyramid.plot from the plotrix package seems to do the job. (I appreciate ggplot2 will be the go-to solution for some of you, but I'm hoping to stick with base graphics on this one.)
Unfortunately it does an odd thing with the x axis when I try to set the limits to non-integer values. If I let it define the limits automatically, they are integers, and in my case that leaves too much white space. But defining them as xlim=c(1.5,1.5) produces the odd result below.
If I understand the documentation correctly, there is no way to pass on additional graphical parameters, e.g. to suppress the axis and add it later, let alone define the tick points. Is there a way to make it more flexible?
Here is a minimal working example used to produce the plot below.
library(plotrix)
set.seed(42)
# left side: one series plus two empty columns; right side: one empty column plus two series
pyramid.plot(cbind(runif(7,0,1),
                   rep(0,7),
                   rep(0,7)),
             cbind(rep(0,7),
                   runif(7,0,1),
                   runif(7,0,1)),
             top.labels=NULL,
             gap=0,
             labels=rep("",7),
             xlim=c(1.5,1.5))
In case it is of interest to anyone else: I'm not doing a population pyramid, but rather attempting a stacked bar chart with some of the values negative. The code above includes a 'trick' that makes it possible to have a different number of sets of bars on each side, namely adding empty columns to the matrix. Hopefully someone will find that useful, so sorry the working example is not as minimal as it could have been!

Setting the x axis labels using laxlab and raxlab creates a continuous axis:
pyramid.plot(cbind(runif(7,0,1),
                   rep(0,7),
                   rep(0,7)),
             cbind(rep(0,7),
                   runif(7,0,1),
                   runif(7,0,1)),
             top.labels=NULL,
             gap=0,
             labels=rep("",7),
             xlim=c(1.5,1.5),
             laxlab=seq(from = 0, to = 1.5, by = 0.5),
             raxlab=seq(from = 0, to = 1.5, by = 0.5))
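If more control over the axis is needed than pyramid.plot offers, one alternative is to build the chart from two horizontal barplot() calls and draw the axis separately with axis(). This is only a minimal sketch in base graphics, not plotrix's method; the names left, right, lim and ticks, the colours, and the 0.5 tick spacing are assumptions chosen for illustration.

```r
set.seed(42)
left  <- runif(7, 0, 1)                          # one series on the left
right <- rbind(runif(7, 0, 1), runif(7, 0, 1))   # two stacked series on the right (rows = segments)

# Symmetric limit, rounded up to the next multiple of 0.5
lim <- ceiling(max(left, colSums(right)) * 2) / 2

# Left bars: negate the values so they extend to the left of zero
mids <- barplot(-left, horiz = TRUE, xlim = c(-lim, lim), axes = FALSE,
                col = "grey40", border = NA)

# Right bars: add the stacked series onto the same plot
barplot(right, horiz = TRUE, add = TRUE, axes = FALSE,
        col = c("steelblue", "orange"), border = NA)

# Draw the axis by hand, labelling both directions with positive numbers
ticks <- seq(-lim, lim, by = 0.5)
axis(side = 1, at = ticks, labels = abs(ticks))
```

Because the axis is drawn in a separate call, the tick positions and labels can be changed freely, and no empty padding columns are needed to get a different number of series on each side.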

Related

R: Automatically expand margins in VIM::aggr plots

I'm trying to visualise missing data with the R package VIM.
I'm using R version 3.4.0 with RStudio
I've used the function aggr() but the colnames of my dataframe seem to be too long. Thus, some labels of the x axis don't appear.
I would like to increase the space at the bottom of the x axis.
library(VIM)
aggr(df)
Here is my dataframe df and the plot I obtain
I've tried with par() function but it doesn't change anything.
aggr(df,mar=c(10,5,5,3))
or
par(mar=c(10,5,5,3))
g=aggr(df,plot=FALSE)
plot(g)
I can reduce the font size with cex.axis but then labels are too small.
aggr(df,cex.axis=.7)
Here is the plot with small axis labels:
I haven't found many examples using aggr(), which is why I'm asking for your help.
Thank you in advance.
I think you are looking for a graphical parameter oma which will allow you to resize the main plot. The help reference states:
For plot.aggr, further graphical parameters to be passed down. par("oma") will be set appropriately unless supplied (see par).
In your case you could do something like:
aggr(df, prop = TRUE, numbers = FALSE, combined = FALSE,
     labels = names(df), cex.axis = .9, oma = c(10,5,5,3))
Obviously, you need to play around with cex.axis and other parameters to find out what works best for your data.

R ggplot2: Draw diagonal lines on log-scale

I had a graph created with default R-plot functionality but now want to change to ggplot2 mainly because I want to use ggrepel to place labels correctly and non-overlapping.
My old plot contains diagonal lines which I need to keep. They are plotted like this:
for (i in -5:10) {
  abline(a = i, b = 1, lty = 5)
}
The issues I have now are:
How do I do this for-loop with ggplot2 so I don't need to add all the lines explicitly?
How do I actually create the lines correctly?
geom_abline(slope=1, intercept=10)
Does not work as expected, probably due to log10 scale. So how can I draw diagonal lines on log10 scales correctly?
It actually works fine. This issue is directly related to my other issue about x and y axis limits. By default the plot draws a bigger area than the x and y limits define (who thought this was a good idea???). Therefore the intercepts look wrong, but they actually are OK.
If I set expand = c(0, 0) for both axes, then the intercepts also look fine, because the plot is only drawn up to the limits.
The solution for multiple lines is a vector of intercepts:
geom_abline(slope=1, intercept=(-3):(5))
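As a quick check of what those intercepts mean: assuming the axes are transformed with scale_x_log10() and scale_y_log10(), geom_abline() is drawn in the transformed coordinates, so slope 1 and intercept k describe log10(y) = log10(x) + k, that is y = 10^k * x. The sketch below only does that arithmetic in base R and does not call ggplot2; the x values are arbitrary.

```r
intercepts <- -3:5            # same vector as passed to geom_abline()
x <- c(1, 10, 100)            # arbitrary positions on the original (untransformed) x scale

# Rows = intercepts, columns = x values; each entry is the y value the line passes through
y <- outer(10^intercepts, x)
dimnames(y) <- list(intercept = intercepts, x = x)

# On log10 axes every line has slope 1: log10(y) - log10(x) equals the intercept
log_gap <- log10(y) - log10(rep(x, each = length(intercepts)))

print(y["0", ])               # intercept 0 is the identity line y = x
```

Each step of 1 in the intercept therefore shifts the line by one decade, which is why consecutive integers give evenly spaced diagonals on a log-log plot.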

Manually creating an object that looks like a heatmap color key

I'm trying to create a key for a heatmap, but as far as I know, I cannot use the existing tools for adding a legend since I've generated the colors myself (I manually turn a scaled variable into rgb values for a short rainbow, [255,0,0] to [0,0,255]).
Basically, all I want to do is use the rightmost 10th of the screen to create a rectangle with these 10 colors: "#0000FF", "#0072FF", "#00E3FF", "#00FFAA", "#00FF38", "#39FF00", "#AAFF00", "#FFE200", "#FF7100", "#FF0000"
with three numerical labels - at 0, max/2, and max
In essence, I want to manually produce an object that looks like a rudimentary heatmap color key.
As far as I know, split.screen can only split the screen in half, which isn't what I'm looking for. I want the graphic I already know how to produce to take up the leftmost 90% of the screen, and I want this colored rectangle to take up the other 10%.
Thanks.
EDIT: I greatly appreciate the advice about the best way to do the plot. That said, I would still like to know the best way to do the task originally asked: creating the legend by hand. I am already able to produce the exact heatmap graphic that I'm looking for. The false coloring wasn't the only problem with ggplot that I was having; it was just the final factor convincing me to switch. I need a non-ggplot solution.
EDIT #2: This is close to the solution I am looking for, except this only goes up to 10 instead of accepting a maximum value as a parameter (I will be running this code on multiple data-sets, all with different maximum values - I want the legend to reflect this). Additionally, if I change the size of the graph, the key falls apart into disconnected squares.
Take a look at the layout function. I think you want something like this:
layout(matrix(c(1,2), 1, 2, byrow = TRUE), widths=c(9,1))
## plot heatmap
## plot legend
I would also recommend the ggplot2 package and the geom_tile function which will take care of all of this for you.
Assuming your data is in a data frame with the x and y coordinates and heatmap value (e.g. gdat <- data.frame(x_coord=c(1,2,...), y_coord=c(1,1,...), val=c(6,2,...))), you should be able to produce your desired heatmap with the following ggplot command:
ggplot(gdat) + geom_tile(aes(x=x_coord, y=y_coord, fill=val)) +
  scale_fill_gradient(low="#0000FF", high="#FF0000")
To get your data into this format you may want to look into the very useful reshape2 package.
Given the strict no-ggplot restriction on this question, here is how one could produce the plot with just base R.
colors <- c("#0000FF", "#0072FF", "#00E3FF", "#00FFAA", "#00FF38",
            "#39FF00", "#AAFF00", "#FFE200", "#FF7100", "#FF0000")
layout(matrix(c(1,2), 1, 2, byrow = TRUE), widths=c(9,1))
plot(rnorm(20), rnorm(20), col=sample(colors, 20, replace=TRUE))
par(mar=c(0,0,0,0))
plot(x=rep(1,10), y=1:10, col=colors, pch=15, cex=7.1)
You may have to adjust the cex for your device.
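To address the points raised in EDIT #2 (a maximum that varies between data sets, and a key that stays in one piece when the plot is resized), one option is to draw the key with image() instead of plotting symbols. This is a minimal sketch, not a definitive solution: draw_key and max_value are hypothetical names, and the margins assume a device roughly 7 inches wide, so they may need adjusting.

```r
colors <- c("#0000FF", "#0072FF", "#00E3FF", "#00FFAA", "#00FF38",
            "#39FF00", "#AAFF00", "#FFE200", "#FF7100", "#FF0000")

# Hypothetical helper: draws a vertical colour key scaled from 0 to max_value
draw_key <- function(colors, max_value) {
  n <- length(colors)
  edges <- seq(0, max_value, length.out = n + 1)   # cell edges in data units
  image(x = c(0, 1), y = edges, z = matrix(seq_len(n), nrow = 1),
        col = colors, axes = FALSE, xlab = "", ylab = "")
  ticks <- c(0, max_value / 2, max_value)          # labels at 0, max/2 and max
  axis(side = 4, at = ticks, las = 1, cex.axis = 0.8)
  box()
  invisible(ticks)
}

max_value <- 250                                   # stand-in for the data maximum
layout(matrix(c(1, 2), 1, 2, byrow = TRUE), widths = c(9, 1))
plot(rnorm(20), rnorm(20), col = sample(colors, 20, replace = TRUE), pch = 19)
par(mar = c(5, 0.2, 4, 2.5))                       # narrow panel, room for labels on the right
ticks <- draw_key(colors, max_value)
```

Since image() fills adjacent cells rather than drawing separate symbols, the strip should stay contiguous at any device size, and the labels follow whatever maximum is passed in.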

Extending the scale of an axis

I have generated the following histogram in R:
I generated it using this hist() call:
hist(x[,1], xlab='t* (Transition Statistic)',
     ylab='Proportion of Resamples (n = 10,000)',
     main='Distribution of Resamples', col='lightblue',
     prob=TRUE, ylim=c(0.00,0.05), xlim=c(1725,max(x[,1])+10))
Plus the following abline():
abline(v=1728,col=4,lty=1,lwd=2)
That vertical line indicates the actual location of a test statistic, which I am comparing to the results of permutation samples.
My question is this: as you can see, the x scale does not extend back to the vertical line. I would really like it to do so, because I think it looks odd otherwise. How can I make this happen?
I have already tried the xaxs="i" parameter, which has no effect. I have also tried making my own axis with axis() but this requires making both axes again from scratch, and the results don't look that great to me. So, I suspect there must be an easier way to do this. Is there? And, if not, can anyone suggest what axis() command might work well, assuming I want everything to look basically the same, but with the longer x scale?
The usual R plot draws a frame around the plot. To add this, do:
box()
after the plot.
If that isn't what you want, you need to suppress axis plotting and then add your own later.
hist(...., axes = FALSE) ## .... is where your other args go
axis(side = 2)
axis(side = 1, at = seq(1730, 1830, by = 20))
That won't go quite to the vertical line but may be close enough. If you want a tick at the vertical line, choose different tick marks, e.g.
axis(side = 1, at = seq(1725, 1835, by = 20))
Since R is using gaps of 20 for the x-axis here, you can get the extension you want by using 1720 rather than 1725 for the lower limit, i.e. with xlim=c(1720,max(x[,1])+10).
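The idea of rounding the limits to the tick spacing can be sketched with simulated data. Everything here is an assumption for illustration: stat is a stand-in for x[,1], and observed, step, lower and upper are hypothetical names; only the value 1728 and the spacing of 20 come from the question and answers.

```r
set.seed(1)
stat <- rnorm(10000, mean = 1780, sd = 15)   # simulated stand-in for x[, 1]
observed <- 1728                              # position of the vertical line
step <- 20                                    # tick spacing used on the x axis

# Round the limits outwards to multiples of the tick spacing
lower <- floor(min(observed, stat) / step) * step
upper <- ceiling(max(stat) / step) * step

hist(stat, prob = TRUE, col = "lightblue", axes = FALSE,
     xlim = c(lower, upper), main = "Distribution of Resamples",
     xlab = "t* (Transition Statistic)")
axis(side = 2)
axis(side = 1, at = seq(lower, upper, by = step))   # axis now reaches past the line
abline(v = observed, col = 4, lty = 1, lwd = 2)
```

Computing the limits from the data avoids hard-coding 1720, so the same code should keep the axis extending past the vertical line if the statistic or the resamples change.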

R barplot axis scaling

I want to plot a barplot of some data with some x-axis labels but so far I just keep running into the same problem, as the axis scaling is completely off limits and therefore my labels are wrongly positioned below the bars.
The most simple example I can think of:
x = c(1:81)
barplot(x)
axis(side=1,at=c(0,20,40,60,80),labels=c(20,40,60,80,100))
As you can see, the x-axis does not stretch along the whole plot but stops somewhere in between. It seems to me that the problem is quite simple, but somehow I am not able to fix it, and I could not find any solution so far :(
Any help is greatly appreciated.
The problem is that barplot is really designed for plotting categorical, not numeric data, and as such it pretty much does its own thing in terms of setting up the horizontal axis scale. The main way to get around this is to recover the actual x-positions of the bar midpoints by saving the results of barplot to a variable, but as you can see below I haven't come up with an elegant way of doing what you want in base graphics. Maybe someone else can do better.
x = c(1:81)
b <- barplot(x)
## axis(side=1,at=c(0,20,40,60,80),labels=c(20,40,60,80,100))
head(b)
You can see here that the actual midpoint locations are 0.7, 1.9, 3.1, ... -- not 1, 2, 3 ...
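Those midpoints follow from barplot()'s defaults for a vector: each bar has width 1 and is preceded by a gap of 0.2 bar widths, so bar i is centred at 1.2 * i - 0.5. The small helper below is a hypothetical convenience (bar_pos is not part of base R) that turns a bar index into an axis position under those defaults.

```r
# Hypothetical helper: x position of the centre of bar i for barplot()'s
# default layout (width = 1, space = 0.2, expressed as a fraction of the width)
bar_pos <- function(i, width = 1, space = 0.2) {
  i * width * (1 + space) - width / 2
}

x <- 1:81
b <- barplot(x)

# The formula reproduces the midpoints that barplot() returns
stopifnot(isTRUE(all.equal(as.vector(b), bar_pos(seq_along(x)))))

# Ticks placed by bar index rather than by raw x coordinate
axis(side = 1, at = bar_pos(c(20, 40, 60, 80)), labels = c(20, 40, 60, 80))
```

If width or space is changed in the barplot() call, the same values would need to be passed to the helper for the positions to line up.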
This is pretty quick, if you don't want to extend the axis from 0 to 100:
b <- barplot(x)
axis(side=1,at=b[c(20,40,60,80)],labels=seq(20,80,by=20))
This is my best shot at doing it in base graphics:
b <- barplot(x,xlim=c(0,120))
bdiff <- diff(b)[1]
axis(side=1,at=c(b[1]-bdiff,b[c(20,40,60,80)],b[81]+19*bdiff),
labels=seq(0,100,by=20))
You can try this, but the bars aren't as pretty:
plot(x,type="h",lwd=4,col="gray",xlim=c(0,100))
Or in ggplot:
library(ggplot2)
d <- data.frame(x=1:81)
ggplot(d,aes(x=x,y=x))+geom_bar(stat="identity",fill="lightblue",
colour="gray")+xlim(c(0,100))
Most statistical graphics nerds will tell you that graphing quantitative (x,y) data is better done with points or lines rather than bars (non-data-ink, Tufte, blah blah blah :-) )
Not sure exactly what you want, but if it is to have the labels running from one end to the other, evenly placed (but not necessarily accurately), then:
x = c(1:81)
bp <- barplot(x)
axis(side=1,at=bp[1+c(0,20,40,60,80)],labels=c(20,40,60,80,100))
The puzzle for me was why you wanted to label "20" at 0. But this is one way to do it.
I ran into the same annoying property of barplots: the x coordinates go wild. I would add another way to show the problem, and that is adding more lines to the plot.
x = c(1:81)
barplot(x)
axis(side=1,at=c(0,20,40,60,80),labels=c(20,40,60,80,100))
lines(c(81,81), c(0, 100)) # this should cross the last bar, but it does not
The best I came up with was to define a new barplot function that also takes a parameter "at" for the plotting positions of the bars.
barplot_xscaled <- function(bar_heights, at = NULL, width = 0.5, col = 'grey'){
  # 'width' is the half-width of each bar; default positions are 1, 2, 3, ...
  if (is.null(at)) {
    at <- seq_along(bar_heights)
  }
  plot(bar_heights, type="n", xlab="", ylab="",
       ylim=c(0, max(bar_heights)), xlim=range(at), bty = 'n')
  for (i in seq_along(bar_heights)) {
    rect(at[i] - width, 0, at[i] + width, bar_heights[i], col = col)
  }
}
barplot_xscaled(x)
lines(c(81, 81), c(0, 100))
The lines command crosses the last bar - the x scale works just as naively expected, but you could also now define whatever positions of the bars you would like (you could play more with the function a bit to have the same properties as other R plotting functions).
