par function, how to set proper values for multiple graphs? - r

While par() has been very useful in defining the number of plots in a single page, I was wondering if there would be a nice and quick trick to set the proper margins for multiplot pages. For instance,
dev.off()
par(mfrow =c(3,3),mar=c(0,0,0,0),pty="s")
x <- c(1:10)
y <- c(1:10)
for (i in 1:9){
plot(x,y)
}
In this rough example, the x and y labels don't appear and the plots are packed too closely together.
In the end, I fall back on trial and error with mar=c() until all elements of the plot fit on one page. Is there a quicker way to determine the margins from the size of the labels, the axes, and the number of plots?
Thanks!
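One commonly used pattern, sketched here on the assumption that all panels can share a single pair of axis labels, is to keep the per-panel margins (mar) small but non-zero so tick labels still fit, and to reserve an outer margin (oma) for the shared labels. The specific numbers below are illustrative guesses rather than computed values, and would still need adjusting for long labels:

```r
# Sketch: small inner margins per panel, shared labels in the outer margin.
op <- par(mfrow = c(3, 3),
          mar = c(2, 2, 1, 1),   # per-panel margins (bottom, left, top, right), in lines
          oma = c(3, 3, 1, 1),   # outer margins around the whole page, in lines
          mgp = c(1.5, 0.5, 0),  # pull axis titles and tick labels closer to the axes
          pty = "s")
x <- 1:10
y <- 1:10
for (i in 1:9) {
  plot(x, y, xlab = "", ylab = "")  # suppress per-panel axis titles
}
# One shared label per side, drawn in the outer margin
mtext("x", side = 1, outer = TRUE, line = 1)
mtext("y", side = 2, outer = TRUE, line = 1)
panel_layout <- par("mfrow")  # layout in effect while plotting
par(op)                       # restore the previous settings
```

Because mar and oma are measured in lines of text, they scale with the font size rather than with the number of panels, which is why some manual tuning usually remains.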

Related

R, Plotting points with Labels on a single (horizontal) numberline

For educational purposes I'm trying to plot a single horizontal "numberline" with some labelled data points in R. This is how far I got:
library(plotrix)
source("spread.labels.R")
plot(0:100,axes=FALSE,type="n",xlab="",ylab="")
axis(1,pos=0)
spread.labels(c(5,5,50,60,70,90),rep(0,6),ony=FALSE,
labels=c("5","5","50","60","70","90"),
offsets=rep(20,6))
This gave me a numberline with short lines pointing up to (and slightly into) the labels from where the data points should lie on the numberline, but without the points themselves. Can anyone give me additional or alternative R code to solve these problems:
- the data points themselves are not plotted,
- the labels may not be evenly spread over the whole numberline,
- the lines run into the labels rather than merely pointing to them.
Thanks a lot,
Benjamin Telkamp
I usually like to create plots using primitive base R graphics functions, such as points(), segments(), lines(), abline(), rect(), polygon(), text(), and mtext(). You can easily create curves (e.g. for circles) and more complex shapes using segments() and lines() across granular coordinate vectors that you define yourself. For example, see Plot angle between vectors. This provides much more control over the plot elements you create. However, it often takes more work and careful coding than more "pre-packaged" solutions, so it's a tradeoff.
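As a minimal sketch of the "granular coordinate vectors" idea, a circle can be approximated by evaluating its parametric form at many angles and joining the points with lines(). The centre, radius, and number of points here are arbitrary illustrative choices:

```r
# Approximate a circle with a fine polyline
theta <- seq(0, 2 * pi, length.out = 200)  # angles around the circle
cx <- 0; cy <- 0; r <- 1                   # centre and radius (illustrative)
circle_x <- cx + r * cos(theta)
circle_y <- cy + r * sin(theta)
# Empty canvas with a 1:1 aspect ratio so the circle is not drawn as an ellipse
plot(NA, xlim = c(-1.2, 1.2), ylim = c(-1.2, 1.2), asp = 1, xlab = "", ylab = "")
lines(circle_x, circle_y)
```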
For your case, it sounds to me like you're happy with what spread.labels() is trying to do, you just want the following changes:
- add point symbols at the labelled points.
- prevent overlap between labels and lines.
Here's how this can be done:
## define plot data
xlim <- c(0,100);
ylim <- c(0,100);
px <- c(5,5,50,60,70,90);
py <- c(0,0,0,0,0,0);
lx.buf <- 5;
lx <- seq(xlim[1]+lx.buf,xlim[2]-lx.buf,len=length(px));
ly <- 20;
## create basic plot outline
par(xaxs='i',yaxs='i',mar=c(5,1,1,1));
plot(NA,xlim=xlim,ylim=ylim,axes=F,ann=F);
axis(1);
## plot elements
segments(px,py,lx,ly);
points(px,py,pch=16,xpd=NA);
text(lx,ly,px,pos=3);

How to match axes when overlaying boxplot and scatterplot in R?

I am attempting to overlay boxplots over individual points on my scatterplot. However, I am having issues matching the axes on the two plots.
Despite having the same number of elements (x axis) and value limit (y axis), the two axes of the two plots are scaled differently.
I am currently using:
plot((1:length(vec1)), vec1)
par(new=TRUE)
boxplot(mat2, names=c(1:length(vec1)))
Does anyone know of a way of ensuring that the plots are on the same scale without explicitly coercing the xlim and ylim? (the dimensions of vec1 and mat2 change on iterations).
You can use the points function rather than calling plot.
For instance:
vec1 <- rnorm(10)
mat2 <- matrix(rnorm(1000), 100, 10)
boxplot(mat2, names=seq_along(vec1))
points(vec1)
This also has the advantage that the points are drawn in front of the boxplot.
Note that you can retrieve current axis limits using par("usr"), although I can't seem to align the two plots properly even using those as xlim and ylim. I am guessing this depends on how boxplot works internally (haven't investigated that in depth though...)
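One possible explanation, offered as a sketch rather than a confirmed diagnosis: par("usr") reports the limits of the plot region after R has already padded the data range by 4% on each side (the default xaxs="r" and yaxs="r" styles). Passing those values back as xlim and ylim pads them a second time. Setting xaxs="i" and yaxs="i" on the overlay asks R to use the limits exactly as given:

```r
set.seed(1)
vec1 <- rnorm(10)
mat2 <- matrix(rnorm(1000), 100, 10)
boxplot(mat2, names = seq_along(vec1))
usr <- par("usr")   # c(x1, x2, y1, y2) of the boxplot's plot region
par(new = TRUE)     # draw the next plot on top of the existing one
plot(seq_along(vec1), vec1,
     xlim = usr[1:2], ylim = usr[3:4],
     xaxs = "i", yaxs = "i",        # use the limits exactly, no extra 4% padding
     axes = FALSE, ann = FALSE, pch = 16, col = "red")
usr_overlay <- par("usr")           # coordinate system of the overlay
```

The points() approach above remains simpler; this variant is only useful when a second high-level plot call is genuinely needed.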

Binary spark lines with R

I'm looking to plot a set of sparklines in R with just a 0 and 1 state that looks like this:
Does anyone know how I might create something like that ideally with no extra libraries?
I don't know of any simple way to do this, so I'm going to build up this plot from scratch. This would probably be a lot easier to design in Illustrator or something like that, but here's one way to do it in R. If you don't want to read the whole step-by-step, I provide my solution wrapped in a reusable function at the bottom of the post.
Step 1: Sparklines
You can use the pch argument of the points function to define the plotting symbol. ASCII symbols are supported, which means you can use the "pipe" symbol for vertical lines. The ASCII code for this symbol is 124, so to use it for our plotting symbol we could do something like:
plot(df, pch=124)
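As a small illustration of the character-code idea, utf8ToInt() can look up the codes instead of hard-coding them, and ifelse() can choose a symbol per point. The state vector below is made-up example data:

```r
pipe_code  <- utf8ToInt("|")   # ASCII code of the pipe symbol
space_code <- utf8ToInt(" ")   # ASCII code of a space, i.e. an invisible symbol
state <- c(1, 0, 1, 1, 0)      # hypothetical binary series
# Draw a pipe where state is 1 and an invisible space where it is 0
plot(seq_along(state), rep(1, length(state)),
     pch = ifelse(state == 1, pipe_code, space_code),
     xaxt = "n", yaxt = "n", ann = FALSE)
```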
Step 2: labels and numbers
We can put text on the plot by using the text command:
text(x,y,char_vect)
Step 3: Alignment
This is basically just going to take a lot of trial and error to get right, but it'll help if we use values relative to our data.
Here's the sample data I'm working with:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
I'm going to start out by plotting an empty box to use as our canvas. To make my life a little easier, I'm going to set the coordinates of the box relative to values meaningful to my data. The Y positions of the 4 data series will be the same across all plotting elements, so I'm going to store that for convenience.
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
With this box in place, I can start adding elements. For each element, the X values will all be the same, so we can use rep to set that vector, and seq to set the Y vector relative to the Y range of our plot (1:n). I'm going to shift the positions by percentages of the X and Y ranges to align my values, and modify the size of the text using the cex parameter. Ultimately, I found that this works out:
ypos = rev(seq(1+.1*n,n*.9, length.out=n))
text(rep(1,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=2) # Increase the size of the text
I reversed the sequence of Y values because I built my sequence in ascending order, and the values on the Y axis in my plot increase from bottom to top. Reversing the Y values then makes it so the series in my dataframe will print from top to bottom.
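To make the arithmetic concrete for the four-column sample data (n = 4), the sequence runs from 1 + 0.1*4 = 1.4 up to 4*0.9 = 3.6 in four evenly spaced steps, and rev() flips it so the first column sits at the top:

```r
n <- 4                                   # number of series in the sample data
ascending <- seq(1 + 0.1 * n, n * 0.9, length.out = n)
ypos <- rev(ascending)                   # first series gets the largest Y value
round(ypos, 3)
# 3.600 2.867 2.133 1.400
```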
I then repeated this process for the second label, shifting the X values over but keeping the Y values the same.
text(rep(.37*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=2)
Finally, we shift X over one last time and use points to build the sparklines with the pipe symbol as described earlier. I'm going to do something sort of weird here: I'm going to tell points to plot at as many positions as I have data points, but use ifelse to decide whether to actually draw a pipe symbol at each one. This way everything will be properly spaced. When I don't want to plot a line, I'll use a 'space' as my plotting symbol (ASCII code 32). I will repeat this procedure, looping through all columns in my dataframe:
for(i in 1:n){
points(seq(.5*m,m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=2,
col='gray')
}
So, piecing it all together and wrapping it in a function, we have:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
BinarySparklines = function(df,
L_adj=1,
mid_L_adj=0.37,
mid_R_adj=0.5,
R_adj=1,
bottom_adj=0.1,
top_adj=0.9,
spark_col='gray',
cex1=2,
cex2=2,
cex3=2
){
# 'adj' parameters are scalar multipliers in [-1,1]. For most purposes, use [0,1].
# The exception is L_adj which is any value in the domain of the plot.
# L_adj < mid_L_adj < mid_R_adj < R_adj
# and
# bottom_adj < top_adj
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
ypos = rev(seq(1+bottom_adj*n,n*top_adj, length.out=n)) # use bottom_adj instead of a hard-coded 0.1
text(rep(L_adj,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=cex1) # Increase the size of the text
text(rep(mid_L_adj*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=cex2)
for(i in 1:n){
points(seq(mid_R_adj*m, R_adj*m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=cex3,
col=spark_col)
}
}
BinarySparklines(df)
Which gives us the following result:
Try playing with the alignment parameters and see what happens. For instance, to shrink the side margins, you could try decreasing the L_adj parameter and increasing the R_adj parameter like so:
BinarySparklines(df, L_adj=-1, R_adj=1.02)
It took a bit of trial and error to get the alignment right for the result I provided (which is what I used to inform the default values for BinarySparklines), but I hope I've given you some intuition about how I achieved it and how moving things using percentages of the plotting range made my life easier. In any event, I hope this serves as both a proof of concept and a template for your code. I'm sorry I don't have an easier solution for you, but I think this basically gets the job done.
I did my prototyping in RStudio so I didn't have to specify the dimensions of my plot, but for posterity I had 832 x 456 with the aspect ratio maintained.

Using R to make a barplot with a specific width?

I would like to use R to make a barplot of ~100,000 numerical entries. The plot will be dense, which is what I want. So far I am using the following code:
sample_var <- c(2,5,3,2,3,2,6,10,20,...) #Filled with 100,000 entries
barplot(sample_var)
The resulting plot is just what I want, but it is a square, whereas I want a long rectangle. Is there a way to set the dimensions of the barplot? I would like to specify an aspect ratio of 10:1 for length x height, or a specific pixel setting of 1000px x 10px. I tried using xlim in the barplot function statement, but get an "invalid xlim" warning.
Any help is appreciated!
Set the width and height when outputting to a file:
png(filename="figures.png", width=800, height=200, bg="white")
sample_var <- c(2,5,3,2,3,2,6,10,20)
barplot(sample_var)
dev.off()
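For a very flat 10:1 canvas, the default margins can take up most of the available height, so it may help to shrink them before plotting. The following is a sketch using the pdf() device (sizes in inches) with randomly generated stand-in data, since the real 100,000 values are not shown. The same par(mar=...) call should apply after png() with pixel dimensions:

```r
set.seed(42)
sample_var <- rpois(100000, lambda = 5)     # stand-in for the real 100,000 entries
out_file <- tempfile(fileext = ".pdf")      # temporary output path
pdf(out_file, width = 10, height = 1)       # 10:1 aspect ratio, in inches
par(mar = c(0, 0, 0, 0))                    # no margins: bars fill the whole canvas
barplot(sample_var, space = 0, border = NA, col = "black", axes = FALSE)
dev.off()
```

Dropping the bar borders (border=NA) and the gaps (space=0) keeps the individual bars visible as fill rather than outlines when this many are packed together.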

How to set xlim in multhist?

The following code creates 3 vectors, and displays them as interlaced histograms:
library(plotrix) # provides multhist()
a <- c(1,2,3)
b <- c(1,1,2)
c <- c(1,1,1)
l <- list(a,b,c)
multhist(l, col=c("red","green","blue"),xlim=c(0,5))
However, when I specify this xlim=c(0,5), I would expect this to set the x axis range, but it does not seem to do so. The x-axis appears to only range between about 1.0 and 1.4. Is there a different way to specify the x-axis range for a multhist?
Maybe not perfect, but a start:
edit: remove x-axis labels, add box
multhist(l, col=c("red","green","blue"),
breaks=seq(0,5,by=0.2),names.arg=rep("",25))
box(bty="l") ## add box around bottom and left edges
multhist is a bit of a hack (I know, I wrote it!) -- it uses barplot internally, so the x axis is indexing the positions of the bars rather than the actual values.
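To illustrate why xlim does not behave as expected, here is a rough base-R stand-in for the same idea (not multhist's actual implementation): bin each vector with hist(plot=FALSE) and pass the counts to barplot(beside=TRUE). The variable names and the bin width of 1 are choices made for this sketch:

```r
a <- c(1, 2, 3)
b <- c(1, 1, 2)
c3 <- c(1, 1, 1)                     # named c3 here to avoid masking c()
breaks <- seq(0, 5, by = 1)          # common bins covering the desired 0..5 range
# One row of bin counts per vector
counts <- rbind(a = hist(a,  breaks = breaks, plot = FALSE)$counts,
                b = hist(b,  breaks = breaks, plot = FALSE)$counts,
                c = hist(c3, breaks = breaks, plot = FALSE)$counts)
# barplot returns the x positions of the bars: these are bar indices, not data values
mids <- barplot(counts, beside = TRUE, col = c("red", "green", "blue"),
                names.arg = paste0(head(breaks, -1), "-", tail(breaks, -1)))
```

Since the x axis is laid out in bar positions, the displayed range is controlled by the breaks rather than by xlim.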
See also
http://www.math.mcmaster.ca/bolker/R/misc/multhist.pdf
http://www.math.mcmaster.ca/bolker/R/misc/multhist.Rnw
for some other ideas about how to display binned data from multiple groups.
