How to set xlim in multhist?

The following code creates 3 vectors, and displays them as interlaced histograms:
library(plotrix) # provides multhist()
a <- c(1,2,3)
b <- c(1,1,2)
c <- c(1,1,1)
l <- list(a,b,c)
multhist(l, col=c("red","green","blue"),xlim=c(0,5))
However, when I specify this xlim=c(0,5), I would expect this to set the x axis range, but it does not seem to do so. The x-axis appears to only range between about 1.0 and 1.4. Is there a different way to specify the x-axis range for a multhist?

Maybe not perfect, but a start:
edit: remove x-axis labels, add box
multhist(l, col=c("red","green","blue"),
breaks=seq(0,5,by=0.2),names.arg=rep("",25))
box(bty="l") ## add box around bottom and left edges
multhist is a bit of a hack (I know, I wrote it!) -- it uses barplot internally, so the x axis is indexing the positions of the bars rather than the actual values.
See also
http://www.math.mcmaster.ca/bolker/R/misc/multhist.pdf
http://www.math.mcmaster.ca/bolker/R/misc/multhist.Rnw
for some other ideas about how to display binned data from multiple groups.
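To make the "bar positions rather than values" point concrete, here is a minimal base-R sketch of the same idea: shared breaks, per-group counts, then barplot. It is not multhist's actual source, and the bin width of 1 and the variable names are chosen purely for illustration. Since barplot returns the bar midpoints, they can be used to label each cluster of bars with the value range it represents:

```r
# Three groups, binned on shared breaks (a bin width of 1 is an arbitrary choice)
groups <- list(a = c(1, 2, 3), b = c(1, 1, 2), c = c(1, 1, 1))
breaks <- seq(0, 5, by = 1)

# One row of counts per group, one column per bin
counts <- t(sapply(groups, function(g) hist(g, breaks = breaks, plot = FALSE)$counts))

# barplot() draws in bar-position coordinates and returns the bar midpoints
mids <- barplot(counts, beside = TRUE, col = c("red", "green", "blue"))

# Centre of each cluster of bars, labelled with the bin it stands for
centers <- colMeans(mids)
axis(1, at = centers, labels = paste0(head(breaks, -1), "-", tail(breaks, -1)))
```

Because empty bins are kept, the full 0 to 5 range shows up on the axis even though the data only cover 1 to 3.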

Related

par function, how to set proper values for multiple graphs?

While par() has been very useful in defining the number of plots in a single page, I was wondering if there would be a nice and quick trick to set the proper margins for multiplot pages. For instance,
dev.off()
par(mfrow =c(3,3),mar=c(0,0,0,0),pty="s")
x <- c(1:10)
y <- c(1:10)
for (i in 1:9){
plot(x,y)
}
In this rough example, the x and y labels don't appear and the plots are packed too closely together.
In the end, getting all elements of the plot to fit visibly on one page comes down to trial and error with mar=c(). So, is there any quicker way to determine the margins depending on the size of the labels, axes, and number of plots?
Thanks!

Add data labels to spineplot in R

iFacColName <- "hireMonth"
iTargetColName <- "attrition"
iFacVector <- as.factor(c(1,1,1,1,10,1,1,1,12,9,9,1,10,12,1,9,5))
iTargetVector <- as.factor(c(1,1,0,1,1,0,0,1,1,0,1,0,1,1,1,1,1))
sp <- spineplot(iFacVector,iTargetVector,xlab=iFacColName,ylab=iTargetColName,main=paste0(iFacColName," vs. ",iTargetColName," Spineplot"))
spLabelPass <- sp[,2]/(sp[,1]+sp[,2])
spLabelFail <- 1-spLabelPass
text(seq_len(nrow(sp)),rep(.95,length(spLabelPass)),labels=as.character(spLabelPass),cex=.8)
For some reason, the text() function only plots one label far to the right of the graph. I have used this format to apply data labels to other types of graphs, so I am confused.
EDIT: added more code to make example work
You're not placing your labels inside the plotting region. It only extends to around 1.3 on the x axis. Try plotting something like
text(
cumsum(prop.table(table(iFacVector))),
rep(.95, length(spLabelPass)),
labels = as.character(round(spLabelPass, 1)),
cex = .8
)
and the labels will appear inside the plot.
These are obviously not the right positions for the labels, but you should be able to figure that out by yourself. (You're going to have to subtract half of the frequency for each bar from the cumulative frequency and account for the fact that the bars are padded with some amount of whitespace.)
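The adjustment described above could be sketched as follows. This assumes the bars are separated by a gap of 0.02 on the x axis, which corresponds to spineplot's documented default offset of 2 percent; the names gap, widths, left and centers are made up for this example.

```r
iFacVector <- as.factor(c(1,1,1,1,10,1,1,1,12,9,9,1,10,12,1,9,5))

# Assumption: gap between bars is 0.02 (the default offset of 2 percent);
# change this if you pass a different `off` to spineplot().
gap <- 0.02

widths  <- as.numeric(prop.table(table(iFacVector)))  # bar widths, summing to 1
left    <- cumsum(c(0, head(widths + gap, -1)))       # left edge of each bar
centers <- left + widths / 2                          # horizontal centre of each bar

# After calling spineplot(), the labels could then be placed with, e.g.:
# text(centers, rep(.95, length(centers)), labels = round(spLabelPass, 2), cex = .8)
```

Each centre is the cumulative width of the preceding bars plus their gaps, plus half of the bar's own width.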

How to match axes when overlaying boxplot and scatterplot in R?

I am attempting to overlay boxplots over individual points on my scatterplot. However, I am having issues matching the axes on the two plots.
Despite having the same number of elements (x axis) and value limit (y axis), the two axes of the two plots are scaled differently.
I am currently using:
plot((1:length(vec1)), vec1)
par(new=TRUE)
boxplot(mat2, names=c(1:length(vec1)))
Does anyone know of a way of ensuring that the plots are on the same scale without explicitly coercing the xlim and ylim? (the dimensions of vec1 and mat2 change on iterations).
You can use the points function rather than calling plot.
For instance:
vec1 <- rnorm(10)
mat2 <- matrix(rnorm(1000), 100, 10)
boxplot(mat2, names=seq_along(vec1))
points(vec1)
This also has the advantage that the points are drawn in front of the boxplot.
Note that you can retrieve current axis limits using par("usr"), although I can't seem to align the two plots properly even using those as xlim and ylim. I am guessing this depends on how boxplot works internally (haven't investigated that in depth though...)
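One caveat with points is that values outside the y range chosen by boxplot are silently clipped. A small sketch of one way around that, assuming it is acceptable to compute the y limits from the data on each iteration (the sd = 3 and the seed are only there to make the effect visible):

```r
set.seed(1)                          # reproducible example data
vec1 <- rnorm(10, sd = 3)            # deliberately spread wider than the boxes
mat2 <- matrix(rnorm(1000), 100, 10)

# Let the y axis cover both data sets so that no point is clipped
boxplot(mat2, names = seq_along(vec1), ylim = range(mat2, vec1))

# Box i is drawn at x = i (the default `at`), so the indices line up directly
points(seq_along(vec1), vec1, pch = 19, col = "red")
```

Since the limits are derived from vec1 and mat2 themselves, nothing has to be hard-coded when their dimensions change.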

Binary spark lines with R

I'm looking to plot a set of sparklines in R with just a 0 and 1 state that looks like this:
Does anyone know how I might create something like that ideally with no extra libraries?
I don't know of any simple way to do this, so I'm going to build up this plot from scratch. This would probably be a lot easier to design in Illustrator or something like that, but here's one way to do it in R (if you don't want to read the whole step-by-step, I provide my solution wrapped in a reusable function at the bottom of the post).
Step 1: Sparklines
You can use the pch argument of the points function to define the plotting symbol. ASCII symbols are supported, which means you can use the "pipe" symbol for vertical lines. The ASCII code for this symbol is 124, so to use it for our plotting symbol we could do something like:
plot(df, pch=124)
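As a quick, self-contained illustration of the symbol codes, here is a small sketch using a made-up 0/1 vector called state (not part of the original data):

```r
# The integer codes passed to pch are ordinary character codes
pipe_code  <- utf8ToInt("|")   # 124
blank_code <- utf8ToInt(" ")   # 32

state <- c(1, 0, 1, 1, 0, 1, 0, 0, 1, 1)   # made-up binary series

# Draw a pipe where state is 1 and a blank where it is 0
plot(seq_along(state), rep(1, length(state)),
     pch = ifelse(state == 1, pipe_code, blank_code),
     yaxt = "n", ann = FALSE)
```

Keeping a position for every observation, even the blank ones, is what keeps the spacing of the marks even.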
Step 2: labels and numbers
We can put text on the plot by using the text command:
text(x,y,char_vect)
Step 3: Alignment
This is basically just going to take a lot of trial and error to get right, but it'll help if we use values relative to our data.
Here's the sample data I'm working with:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
I'm going to start out by plotting an empty box to use as our canvas. To make my life a little easier, I'm going to set the coordinates of the box relative to values meaningful to my data. The Y positions of the 4 data series will be the same across all plotting elements, so I'm going to store that for convenience.
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
With this box in place, I can start adding elements. For each element, the X values will all be the same, so we can use rep to set that vector, and seq to set the Y vector relative to the Y range of our plot (1:n). I'm going to shift the positions by percentages of the X and Y ranges to align my values, and modify the size of the text using the cex parameter. Ultimately, I found that this works out:
ypos = rev(seq(1+.1*n,n*.9, length.out=n))
text(rep(1,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=2) # Increase the size of the text
I reversed the sequence of Y values because I built my sequence in ascending order, and the values on the Y axis in my plot increase from bottom to top. Reversing the Y values then makes it so the series in my dataframe will print from top to bottom.
I then repeated this process for the second label, shifting the X values over but keeping the Y values the same.
text(rep(.37*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=2)
Finally, we shift X over one last time and use points to build the sparklines with the pipe symbol as described earlier. I'm going to do something sort of weird here: I'm actually going to tell points to plot at as many positions as I have data points, but I'm going to use ifelse to determine whether to actually plot a pipe symbol at each one. This way everything will be properly spaced. When I don't want to plot a line, I'll use a 'space' as my plotting symbol (ASCII code 32). I will repeat this procedure, looping through all columns in my dataframe:
for(i in 1:n){
points(seq(.5*m,m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=2,
col='gray')
}
So, piecing it all together and wrapping it in a function, we have:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
BinarySparklines = function(df,
L_adj=1,
mid_L_adj=0.37,
mid_R_adj=0.5,
R_adj=1,
bottom_adj=0.1,
top_adj=0.9,
spark_col='gray',
cex1=2,
cex2=2,
cex3=2
){
# 'adj' parameters are scalar multipliers in [-1,1]. For most purposes, use [0,1].
# The exception is L_adj which is any value in the domain of the plot.
# L_adj < mid_L_adj < mid_R_adj < R_adj
# and
# bottom_adj < top_adj
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
ypos = rev(seq(1+bottom_adj*n,n*top_adj, length.out=n)) # use bottom_adj rather than a hard-coded .1
text(rep(L_adj,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=cex1) # Increase the size of the text
text(rep(mid_L_adj*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=cex2)
for(i in 1:n){
points(seq(mid_R_adj*m, R_adj*m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=cex3,
col=spark_col)
}
}
BinarySparklines(df)
Which gives us the following result:
Try playing with the alignment parameters and see what happens. For instance, to shrink the side margins, you could try decreasing the L_adj parameter and increasing the R_adj parameter like so:
BinarySparklines(df, L_adj=-1, R_adj=1.02)
It took a bit of trial and error to get the alignment right for the result I provided (which is what I used to inform the default values for BinarySparklines), but I hope I've given you some intuition about how I achieved it and how moving things using percentages of the plotting range made my life easier. In any event, I hope this serves as both a proof of concept and a template for your code. I'm sorry I don't have an easier solution for you, but I think this basically gets the job done.
I did my prototyping in RStudio so I didn't have to specify the dimensions of my plot, but for posterity I had 832 x 456 with the aspect ratio maintained.

R: mirror y-axis from a plot

I have a heatmap (though I suppose this applies to any plot) and I need to mirror its y-axis.
Here is some example code:
library(gstat)
x <- seq(1,50,length=50)
y <- seq(1,50,length=50)
z <- rnorm(1000)
df <- data.frame(x=x,y=y,z=z)
image(df,col=heat.colors(256))
This will generate the following heatmap
But I need the y-axis mirrored, starting with 0 at the top and 50 at the bottom. Does anybody have a clue as to what I must do to change this?
See the help page for ?plot.default, which specifies
xlim: the x limits (x1, x2) of the plot. Note that ‘x1 > x2’ is
allowed and leads to a ‘reversed axis’.
library(gstat)
x <- seq(1,50,length=50)
y <- seq(1,50,length=50)
z <- rnorm(1000)
df <- data.frame(x=x,y=y,z=z)
So
image(df,col=heat.colors(256), ylim = rev(range(y)))
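The same trick works with base R's image on a plain matrix, without gstat. A minimal sketch, assuming a 50 x 50 grid of made-up values (the seed and the half-cell padding of the limits are choices made for this example):

```r
x <- 1:50
y <- 1:50
set.seed(42)
# z[i, j] is the value at (x[i], y[j])
z <- matrix(rnorm(length(x) * length(y)), nrow = length(x))

# Cells are centred on y, so pad the limits by half a cell and give them in reverse
image(x, y, z, col = heat.colors(256), ylim = c(max(y) + 0.5, min(y) - 0.5))
```

With the larger limit given first, the y axis runs from top to bottom and the tick labels follow automatically.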
Does this work for you (it's a bit of a hack, though)?
df2<-df
df2$y<-50-df2$y #reverse order
image(df2,col=heat.colors(256),yaxt="n") #avoid y axis
axis(2, at=c(0,10,20,30,40,50), labels=c(50,40,30,20,10,0)) #draw y axis manually
The revaxis function in the plotrix package "reverses the sense of either or both the ‘x’ and ‘y’ axes". It doesn't solve your problem (Nick's solution is the correct one) but can be useful when you need to plot a scatterplot with reversed axes.
I would use rev like so:
df <- data.frame(x=x,y=rev(y),z=z)
In case you were not aware, notice that df is actually a function. You might want to be careful when overwriting. If you rm(df), things will go back to normal.
Don't forget to relabel the y axis as Nick suggests.
For the vertical axis increasing in the downward direction, I provided two ways (two different answers) for the following question:
R - image of a pixel matrix?
