Adding lines to graph created using plotrix library - r

I have created a stacked histogram using the multhist function in the plotrix library, but I am unable to add a straight line to this histogram. Code that I would normally use doesn't seem to work in this setting.
Here's an example. I am trying to add the mean and standard errors of the overall distribution as simple vertical lines on the histogram, but these do not work properly. What am I doing wrong?
library(plotrix)
test1<-rnorm(30,0)
test2<-rnorm(30,0)
test3<-rnorm(30,0)
forstats<-c(test1,test2,test3)
mn<-mean(forstats)
se<-std.error(forstats)
together<-list(test1,test2,test3)
multhist(together, col=c(7,4,2), space=c(0,0), beside=FALSE,right=FALSE)
abline(v=mn)
abline(v=mn+se)
abline(v=mn-se)

multhist uses barplot, so, as @BenBolker mentions, the x-axis corresponds to bin index rather than to data values, which is why the abline calls are not drawn where you expect. It's a bit tricky to convert between native coordinates and bin index units, so I've put together another function for stacked histograms (for frequencies, anyway):
histstack <- function(x, breaks, col=rainbow(length(x)), ...) {
col <- rev(col)
if (length(breaks)==1) {
rng <- range(pretty(range(x)))
breaks <- seq(rng[1], rng[2], length.out=breaks)
}
h <- lapply(x, hist, plot=FALSE, breaks=breaks)
cumcounts <- apply(sapply(h, '[[', 'counts'), 1, cumsum)
for(i in seq_along(h)) {
h[[i]]$counts <- cumcounts[nrow(cumcounts) - i + 1, ]
}
max_cnt <- max(sapply(h, '[[', 'counts'))
plot(h[[1]], xlim=range(sapply(h, '[', 'breaks')), yaxt='n',
ylim=c(0, max(pretty(max_cnt))), col=col[1], ...)
sapply(seq_along(h)[-1], function(i) plot(h[[i]], col=col[i], add=TRUE, ...))
axis(2, at=pretty(c(0, max_cnt)), labels=pretty(c(0, max_cnt)), ...)
}
And here it is:
histstack(together, seq(-3, 3, 0.5), col=c(7, 4, 2), main='',
las=1, xlab='', ylab='')
abline(v=c(mn, mn+se, mn-se), lwd=2)
IMO the x-axis labelling is probably more appropriate than that of multhist, since multhist implies that counts relate to the mid-bin values, whereas above it's clear that the x-axis ticks delineate the bins.
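If you would rather stay with a barplot-based plot, the conversion from data values to bar positions mentioned above can be sketched as follows. This is a minimal sketch assuming equal-width bins; to_bar is a hypothetical helper (not part of plotrix), and the bins are built by hand with cut() and table() rather than taken from multhist.

```r
# Sketch: map a data value onto the x-axis of a barplot (equal-width bins assumed)
set.seed(1)
vals <- rnorm(90)
breaks <- seq(-4, 4, by = 0.5)
counts <- table(cut(vals, breaks, right = FALSE))

# barplot() returns the bar midpoints in its own (bar index) units
mids_bar <- barplot(as.vector(counts), space = 0)
# the same midpoints expressed in data units
mids_data <- breaks[-length(breaks)] + diff(breaks) / 2

# to_bar is a hypothetical helper: linear interpolation between the two scales
to_bar <- function(v) approx(mids_data, mids_bar, xout = v, rule = 2)$y

abline(v = to_bar(mean(vals)), lwd = 2)
```

Because the bins have equal width, the mapping is linear, so interpolating between bar midpoints is enough.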

Related

Plotting list of functions using for loop in R

How can you plot a list of functions in one graph using a for loop in R? The following code does the trick, but requires a separate call of plot for the first function outside of the for loop, which is very clunky. Is there a way to handle all the plotting inside the for loop without creating multiple plots?
vec <- 1:10
funcs <- lapply(vec, function(base){function(exponent){base^exponent}})
x_vals <- seq(0, 10, length.out=100)
plot(x_vals, funcs[[1]](x_vals), type="l", ylim=c(0,100))
for (i in 2:length(vec)) {
lines(x_vals, funcs[[i]](x_vals))
}
You can also do the computations first and plotting after, like this:
vec <- 1:10
funcs <- lapply(vec, function(base) function(exponent){base^exponent})
x_vals <- seq(0, 10, length.out=100)
y_vals <- sapply(funcs, \(f) f(x_vals))
plot(1, xlim=range(x_vals), ylim=range(y_vals), type='n', log='y',
xlab='x', ylab='y')
apply(y_vals, 2, lines, x=x_vals)
This way you know the range of your y values before initiating the plot and can set the y axis limits accordingly (if you would want that). Note that I chose to use logarithmic y axis here.
Based on MrFlick's comment, it looks like something like this would be one way to do what I'm looking for, but is still not great.
vec <- 1:10
funcs <- lapply(vec, function(base){function(exponent){base^exponent}})
x_vals <- seq(0, 10, length.out=100)
plot(NULL, xlim=c(0,10), ylim=c(0,100))
for (i in 1:length(vec)) {
lines(x_vals, funcs[[i]](x_vals))
}
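If the goal is only to avoid a separate first plot() call, matplot() from base graphics draws every column of a matrix in one call. This is a sketch that was not part of the answers above; it reuses the same funcs list and keeps ylim at c(0, 100) as in the question.

```r
vec <- 1:10
funcs <- lapply(vec, function(base) function(exponent) base^exponent)
x_vals <- seq(0, 10, length.out = 100)

# one column of y values per function
y_vals <- sapply(funcs, function(f) f(x_vals))

# matplot() sets up the axes and draws all columns as lines
matplot(x_vals, y_vals, type = "l", lty = 1, col = "black",
        ylim = c(0, 100), xlab = "x", ylab = "y")
```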

How to create a monomial plot in R?

I want to create a function whose result is a plot of monomials (of degree less than "n").
I wrote this simple code.
Monomial=function(m){
x=1:100
y=1:100
for(i in m) x2=x^m
plot(y,x2,type="l",col="red",xlab="Arguments",ylab="Values",
main=expression("Monomials"))
}
But, for example, Monomial(3) gives me only the plot of x^3. I also need x^1 and x^2. How can I label each line?
Here is what you need:
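Monomial <- function(m){
x <- 1:100
# rainbow(m) returns m colours; palette(rainbow(m)) would return the previous palette instead
cols <- rainbow(m)
plot(x,x,type="l",col = cols[1],xlab="Arguments",ylab="Values",
main=expression("Monomials"))
# add the higher-degree curves; seq_len(m)[-1] is empty when m is 1
for (d in seq_len(m)[-1]){
lines(x, x^d, type="l", col=cols[d])
}
legend(90, 60, legend=paste0("x^",1:m),
col=cols, lty=1, cex=0.6)
}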
You need to generate colors, which is what the cols variable does. lines adds a new curve to the existing axes. Finally, legend adds a legend to the plot.
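With the y-axis limited to the range of x, the higher-degree curves quickly leave the plotting region. A variant that computes all values first and uses a logarithmic y-axis might look like the sketch below. The values of m and x and the use of parse() for superscript labels are choices made for this illustration, not part of the answer above.

```r
m <- 3
x <- 1:100

# column d holds x^d
y <- outer(x, seq_len(m), `^`)
cols <- rainbow(m)

# empty plot whose limits cover every curve; log axis keeps x^1 visible
plot(1, type = "n", xlim = range(x), ylim = range(y), log = "y",
     xlab = "Arguments", ylab = "Values", main = "Monomials")
for (d in seq_len(m)) lines(x, y[, d], col = cols[d])

# parse() turns "x^2" into a plotmath expression, so the legend shows superscripts
labs <- parse(text = paste0("x^", seq_len(m)))
legend("topleft", legend = labs, col = cols, lty = 1, cex = 0.6)
```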

How to fix overlapping issue

plot(USArrests$Murder, USArrests$UrbanPop,
xlab="murder", ylab="% urban population", pch=20, col="grey",
ylim=c(20, 100), xlim=c(0, 20))
text(USArrests$Murder, USArrests$UrbanPop, labels=rownames(USArrests),
cex=0.7, pos=3)
I tried everything: reducing the font size with cex, changing the label positions, changing ylim and xlim to fit, and changing the margins, which didn't really help, so I dropped that. At this point I don't know how to fix this with base R tools. I know the ggplot approach, which is much easier, but I want to know whether the same task can be done with the base plot() and text() functions.
To find neighbors that are too close, you could run a kmeans() cluster analysis on the data. It's quite a hack, though!
First, subset your data.
dat <- USArrests[c("Murder", "UrbanPop")]
Set a seed. Play around with that. Different seeds => different results.
set.seed(42)
Analyze clusters with kmeans(); the centers option sets the number of clusters, so play around with that.
dat$cl <- kmeans(dat, centers=10, nstart=5)$cluster
Now split the data and assign alternating pos numbers for positioning later in the text() command.
l <- split(dat, dat$cl)
l <- lapply(l, function(x) within(x, {
if (nrow(x) == 1)
pos <- 2 # for those with just one observation in cluster
else
pos <- as.numeric(as.character(factor((1:nrow(x)) %% 2, labels=c(2, 4))))
}))
Assemble.
dat <- do.call(rbind, unname(l))
Now plot into a PNG with a somewhat high resolution; I chose 800x800.
png("plot.png", 800, 800, "px")
plot(dat$Murder, dat$UrbanPop, xlab="murder", ylab="% urban population",
pch=20, col="grey", ylim=c(20, 100), xlim=c(0, 20))
# the sapply assigns the text position according to `pos` column
sapply(c(4, 2), function(x)
with(dat[dat$pos == x, ],
text(Murder, UrbanPop, labels=rownames(dat[dat$pos == x, ]),
cex=0.7, pos=x)))
dev.off()
Which gives me:
I'm sure you can optimize this further.
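As one possible simplification of the steps above, the alternating pos values can be computed with indexing and ave(), which keeps the original row order, so no split() and rbind() round trip is needed. This is only a sketch: alt_pos is a hypothetical helper, and the plot is drawn on the current device rather than into a PNG.

```r
# alt_pos is a hypothetical helper: positions alternate between 4 (right) and 2 (left);
# a cluster with a single observation gets 2, as in the code above
alt_pos <- function(n) if (n == 1) 2 else c(2, 4)[seq_len(n) %% 2 + 1]

dat <- USArrests[c("Murder", "UrbanPop")]
set.seed(42)
cl <- kmeans(dat, centers = 10, nstart = 5)$cluster

# ave() applies alt_pos within each cluster and returns values in row order
dat$pos <- ave(seq_len(nrow(dat)), cl, FUN = function(i) alt_pos(length(i)))

plot(dat$Murder, dat$UrbanPop, xlab = "murder", ylab = "% urban population",
     pch = 20, col = "grey", ylim = c(20, 100), xlim = c(0, 20))
for (p in c(2, 4)) {
  sel <- dat$pos == p
  text(dat$Murder[sel], dat$UrbanPop[sel], labels = rownames(dat)[sel],
       cex = 0.7, pos = p)
}
```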

Colorfill boxplot in R-cran with lines, dots, or similar

I need to use black and white color for my boxplots in R. I would like to colorfill the boxplot with lines and dots. For an example:
I imagine ggplot2 could do that but I can't find any way to do it.
Thank you in advance for your help!
I thought this was a great question and pondered if it was possible to do this in base R and to obtain the checkered look. So I put together some code that relies on boxplot.stats and polygon (which can draw angled lines). Here's the solution, which is really not ready for primetime, but is a solution that could be tinkered with to make more general.
boxpattern <-
function(y, xcenter, boxwidth, angle=NULL, angle.density=10, ...) {
# draw an individual box
bstats <- boxplot.stats(y)
bxmin <- bstats$stats[1]
bxq2 <- bstats$stats[2]
bxmedian <- bstats$stats[3]
bxq4 <- bstats$stats[4]
bxmax <- bstats$stats[5]
bleft <- xcenter-(boxwidth/2)
bright <- xcenter+(boxwidth/2)
# boxplot
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[1], density=angle.density)
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[2], density=angle.density)
# lines
segments(bleft,bxmedian,bright,bxmedian,lwd=3) # median
segments(bleft,bxmin,bright,bxmin,lwd=1) # min
segments(xcenter,bxmin,xcenter,bxq2,lwd=1)
segments(bleft,bxmax,bright,bxmax,lwd=1) # max
segments(xcenter,bxq4,xcenter,bxmax,lwd=1)
# outliers
if(length(bstats$out)>0){
for(i in 1:length(bstats$out))
points(xcenter,bstats$out[i])
}
}
drawboxplots <- function(y, x, boxwidth=1, angle=NULL, ...){
# figure out all the boxes and start the plot
groups <- split(y,as.factor(x))
len <- length(groups)
bxylim <- c((min(y)-0.04*abs(min(y))),(max(y)+0.04*max(y)))
xcenters <- seq(1,max(2,(len*(1.4))),length.out=len)
if(is.null(angle)){
angle <- seq(-90,75,length.out=len)
angle <- lapply(angle,function(x) c(x,x))
}
else if(!length(angle)==len)
stop("angle must be a vector or list of two-element vectors")
else if(!is.list(angle))
angle <- lapply(angle,function(x) c(x,x))
# draw plot area
plot(0, xlim=c(.97*(min(xcenters)-1), 1.04*(max(xcenters)+1)),
ylim=bxylim,
xlab="", xaxt="n",
ylab=names(y),
col="white", las=1)
axis(1, at=xcenters, labels=names(groups))
# draw boxplots
plots <- mapply(boxpattern, y=groups, xcenter=xcenters,
boxwidth=boxwidth, angle=angle, ...)
}
Some examples in action:
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2)),
x=sort(rep(1:2,200)))
drawboxplots(mydat$y, mydat$x)
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2),
rnorm(200,3,3),rnorm(400,-2,8)),
x=sort(rep(1:5,200)))
drawboxplots(mydat$y, mydat$x)
drawboxplots(mydat$y, mydat$x, boxwidth=.5, angle.density=30)
drawboxplots(mydat$y, mydat$x, # specify list of two-element angle parameters
angle=list(c(0,0),c(90,90),c(45,45),c(45,-45),c(0,90)))
EDIT: I wanted to add that one could also obtain dots as a fill by basically drawing a pattern of dots, then covering them with a "donut"-shaped polygon, like so:
x <- rep(1:10,10)
y <- sort(x)
plot(y~x, xlim=c(0,11), ylim=c(0,11), pch=20)
outerbox.x <- c(2.5,0.5,10.5,10.5,0.5,0.5,2.5,7.5,7.5,2.5)
outerbox.y <- c(2.5,0.5,0.5,10.5,10.5,0.5,2.5,2.5,7.5,7.5)
polygon(outerbox.x,outerbox.y, col="white", border="white") # donut
polygon(c(2.5,2.5,7.5,7.5,2.5),c(2.5,2.5,2.5,7.5,7.5)) # inner box
Mixing that with angled lines in a single plotting function would be more challenging, but it starts to get you there.
I think it is hard to do this with ggplot2, since it doesn't support shaded polygons (a grid limitation). But you can use the shading-line feature of base graphics, controlled by the density and angle arguments of some plot functions (polygon, barplot, ...).
The problem is that boxplot doesn't use this feature. So I hacked it, or rather I hacked bxp, which boxplot uses internally. The hack consists of adding two arguments (angle and density) to the bxp function and passing them on in the calls to xypolygon (this occurs in two lines).
my.bxp <- function (all.bxp.argument,angle,density, ...) {
.....#### bxp code
xypolygon(xx, yy, lty = boxlty[i], lwd = boxlwd[i],
border = boxcol[i], angle = angle[i], density = density[i]) # named, so they cannot be swapped
.......## bxp code after
xypolygon(xx, yy, lty = "blank", col = boxfill[i], angle = angle[i], density = density[i])
......
}
Here is an example. Note that it is entirely the user's responsibility to ensure that the legend corresponds to the plot, so I added some code to lay out the legend next to the boxplot.
require(stats)
set.seed(753)
(bx.p <- boxplot(split(rt(100, 4), gl(5, 20))))
layout(matrix(c(1,2),nrow=1),
widths=c(4,1))
angles=c(60,30,40,50,60)
densities=c(50,30,40,50,30)
par(mar=c(5,4,4,0)) #Get rid of the margin on the right side
my.bxp(bx.p,angle=angles,density=densities)
par(mar=c(5,0,4,2)) #No margin on the left side
plot(c(0,1),type="n", axes=F, xlab="", ylab="")
legend("top", paste("region", 1:5),
angle=angles,density=densities)
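If modifying bxp is more than you need, the same density and angle shading is available through rect(). The sketch below takes the five-number summaries from boxplot(..., plot = FALSE) and draws the boxes by hand. The half-width of 0.3 and the density values are arbitrary choices for illustration, and this is not the my.bxp approach described above.

```r
set.seed(753)
groups <- split(rt(100, 4), gl(5, 20))
bx <- boxplot(groups, plot = FALSE)   # compute the statistics only, draw nothing
n <- ncol(bx$stats)

angles <- c(60, 30, 40, 50, 60)
densities <- c(10, 15, 20, 25, 30)    # shading lines per inch (illustrative values)
half <- 0.3                           # assumed half-width of each box

# empty plot sized to hold whiskers and outliers
plot(1, type = "n", xlim = c(0.5, n + 0.5), ylim = range(bx$stats, bx$out),
     xaxt = "n", xlab = "", ylab = "")
axis(1, at = seq_len(n), labels = bx$names)

for (i in seq_len(n)) {
  s <- bx$stats[, i]   # lower whisker, lower hinge, median, upper hinge, upper whisker
  rect(i - half, s[2], i + half, s[4], density = densities[i], angle = angles[i])
  segments(i - half, s[3], i + half, s[3], lwd = 3)   # median
  segments(i, c(s[1], s[4]), i, c(s[2], s[5]))        # whiskers
}
points(bx$group, bx$out)   # outliers, if any
```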

superpose a histogram and an xyplot

I'd like to superpose a histogram and an xyplot representing the cumulative distribution function using r's lattice package.
I've tried to accomplish this with custom panel functions, but can't seem to get it right--I'm getting hung up on one plot being univariate and one being bivariate I think.
Here's an example with the two plots I want stacked vertically:
library(lattice)
set.seed(1)
x <- rnorm(100, 0, 1)
discrete.cdf <- function(x, decreasing=FALSE){
x <- x[order(x,decreasing=decreasing)]
result <- data.frame(rank=1:length(x),x=x)
result$cdf <- result$rank/nrow(result)
return(result)
}
my.df <- discrete.cdf(x)
chart.hist <- histogram(~x, data=my.df, xlab="")
chart.cdf <- xyplot(100*cdf~x, data=my.df, type="s",
ylab="Cumulative Percent of Total")
graphics.off()
trellis.device(width = 6, height = 8)
print(chart.hist, split = c(1,1,1,2), more = TRUE)
print(chart.cdf, split = c(1,2,1,2))
I'd like these superposed in the same frame, rather than stacked.
The following code doesn't work, nor do any of the simple variations of it that I have tried:
xyplot(cdf~x,data=cdf,
panel=function(...){
panel.xyplot(...)
panel.histogram(~x)
})
You were on the right track with your custom panel function. The trick is passing the correct arguments to the panel.* functions. For panel.histogram, this means not passing a formula and supplying an appropriate value to the breaks argument:
EDIT: Proper percent values on the y-axis and plot types
xyplot(100*cdf~x,data=my.df,
panel=function(...){
panel.histogram(..., breaks = do.breaks(range(x), nint = 8),
type = "percent")
panel.xyplot(..., type = "s")
})
This answer is just a placeholder until a better answer comes.
The hist() function from the graphics package has an option called add. The following does what you want in the "classical" way:
plot( my.df$x, my.df$cdf * 100, type= "l" )
hist( my.df$x, add= T )
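In the base-graphics version, hist() draws counts while the curve is in percent; the two scales only coincide when there are exactly 100 observations. One way to put both on a percent scale is sketched below. The sample size of 250 is chosen purely for illustration, and rescaling the counts stored in the hist object is an approach assumed here, not taken from the answer above.

```r
set.seed(1)
x <- rnorm(250)                                # sample size chosen for illustration

h <- hist(x, plot = FALSE)                     # compute bins without drawing
h$counts <- 100 * h$counts / sum(h$counts)     # rescale bin counts to percent of total

# freq = TRUE makes plot() use the (rescaled) counts for bar heights
plot(h, freq = TRUE, ylim = c(0, 100), main = "",
     xlab = "x", ylab = "Percent of Total")

cdf <- 100 * seq_along(x) / length(x)          # empirical CDF in percent
lines(sort(x), cdf, type = "s")
```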
