Bar plot with Y-axis break [duplicate] - r

I found a lot of SO questions and answers addressing breaks and gaps in axes. But most of them are of low quality (in the SO sense) because they have no example code, no picture, or overly complex code. This is why I am asking.
I am trying to use library(plotrix). A solution without it, or with another library, would be fine for me too.
This is a normal R-barplot.
barplot(c(10,20,500))
To break the axis and add a gap I tried this.
gap.barplot(c(10,20,500),gap=c(50,400), col=FALSE)
The result is not beautiful:
There is no space between the bars. The space parameter from barplot() is not accepted by gap.barplot().
The bars have different widths.
The tick marks are not positioned at the middle of the bars.
Can I control these parameters with plotrix? I don't see anything about it in the documentation.
Is there another library or solution for my problem?

There are so many different answers because there are so many individual problems. For your problem you can try the following, though there may well be a better solution out there. And IMO it's always better to show your complete data instead of cropping it.
# Load plotrix for axis.break()
library(plotrix)
# Your data with names
d <- c(10,20,500)
names(d) <- letters[1:3]
# Specify a cutoff where the y axis should be split.
co <- 200
# Shift every value above the cutoff down by the cutoff.
d[d > co] <- d[d > co] - co
# Create new axis labels using the pretty() function
newy <- pretty(d)
newy[ newy > co] <- newy[ newy > co] + co
# Remove the label that falls exactly on the cutoff.
gr <- which(newy != co)
newy <- newy[ gr ]
# Plot the data without axes
barplot(d, axes=FALSE)
# Add the relabelled axis and the break marker
axis(2, at = pretty(d)[gr], labels = newy)
axis.break(2, co, style = "gap")
As an alternative you can try to log your axis using log="y".
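A minimal sketch of the log-axis alternative, reusing the example values from the question. The axis label text is an assumption for illustration, and note that a log axis only works when all values are strictly positive.

```r
# Example values from the question, with names for the bar labels
d <- c(a = 10, b = 20, c = 500)

# A log axis cannot represent zero or negative values
stopifnot(all(d > 0))

# log = "y" draws the value axis on a logarithmic scale, so the small
# bars stay visible next to the large one without cropping any data
mids <- barplot(d, log = "y", ylab = "value (log scale)")
```

barplot() returns the bar midpoints (here stored in mids), which can be reused to place labels above the bars.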

Related

How to adjust x labels in R boxplot

This is my code to create a boxplot in R that has 4 boxplots in one.
psnr_x265_256 <- c(39.998,39.998, 40.766, 38.507,38.224,40.666,38.329,40.218,44.746,38.222)
psnr_x264_256 <- c(39.653, 38.106,37.794,36.13,36.808,41.991,36.718,39.26,46.071,36.677)
psnr_xvid_256 <- c(33.04564,33.207269,32.715427,32.104696,30.445141,33.135261,32.669766, 31.657039,31.53103,31.585865)
psnr_mpeg2_256 <- c(32.4198,32.055051,31.424819,30.560274,30.740421,32.484694, 32.512268,32.04659,32.345848, 31)
all_errors = cbind(psnr_x265_256, psnr_x264_256, psnr_xvid_256,psnr_mpeg2_256)
modes = cbind(rep("PSNR",10))
journal_linear_data <-data.frame(psnr_x265_256, psnr_x264_256, psnr_xvid_256,psnr_mpeg2_256)
yvars <- c("psnr_x265_256","psnr_x264_256","psnr_xvid_256","psnr_mpeg2_256")
xvars <- c("x265","x264","xvid","mpeg2")
bmp(filename="boxplot_PSNR_256.bmp")
boxplot(journal_linear_data[,yvars], xlab=xvars, ylab="PSNR")
dev.off()
This is the image I get.
I want the x axis to show the corresponding label for each boxplot: "x265", "x264", "xvid", "mpeg2".
Do you have any idea how to fix this?
There are multiple ways of changing the labels for your boxplot variables. Probably the simplest way is changing the column names of your data frame:
colnames(journal_linear_data) <- c("x265","x264","xvid","mpeg2")
Even simpler: you could do this right at the creation of your data frame too:
journal_linear_data <- data.frame(x265=psnr_x265_256, x264=psnr_x264_256, xvid=psnr_xvid_256, mpeg2=psnr_mpeg2_256)
If your labels are not shown or overlap because there is too little space, try rotating the x labels using the las parameter, e.g. las=2 or las=3.
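If you would rather leave the data frame untouched, the names argument of boxplot() sets the group labels directly. A small sketch with made-up numbers (the data frame demo and its values are assumptions, not the PSNR data above):

```r
# Made-up stand-in data; column names mimic the long names in the question
demo <- data.frame(psnr_x265_256 = c(39.9, 40.7, 38.5),
                   psnr_x264_256 = c(39.6, 38.1, 37.8))

# Short labels to show under each box
xvars <- c("x265", "x264")

# names= sets the per-box labels; las = 2 rotates them perpendicular
# to the axis in case they would otherwise overlap
res <- boxplot(demo, names = xvars, ylab = "PSNR", las = 2)
```

The value returned by boxplot() carries the labels in res$names, which is a quick way to check that they were picked up.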

Gantt plot in base r - modifying plot properties

I would like to ask a follow-up question related to the answer given in this post [Gantt style time line plot (in base R) ] on Gantt plots in base R. I feel like this is worth a new question as I think these plots have a broad appeal. I'm also hoping that a new question would attract more attention. I also feel like I need more space than the comments of that question to be specific.
The following code was given by @digEmAll. It takes a dataframe with columns referring to a start time, end time, and grouping variable and turns that into a Gantt plot. I have modified @digEmAll's function very slightly to get the bars/segments in the Gantt plot to be contiguous to one another rather than having a gap. Here it is:
plotGantt <- function(data, res.col='resources',
                      start.col='start', end.col='end', res.colors=rainbow(30))
{
  #slightly enlarge Y axis margin to make space for labels
  op <- par('mar')
  par(mar = op + c(0,1.2,0,0))
  minval <- min(data[,start.col])
  maxval <- max(data[,end.col])
  res.colors <- rev(res.colors)
  resources <- sort(unique(data[,res.col]),decreasing=T)
  plot(c(minval,maxval),
       c(0.5,length(resources)+0.5),
       type='n', xlab='Duration',ylab=NA,yaxt='n' )
  axis(side=2,at=1:length(resources),labels=resources,las=1)
  for(i in 1:length(resources))
  {
    yTop <- i+0.5
    yBottom <- i-0.5
    subset <- data[data[,res.col] == resources[i],]
    for(r in 1:nrow(subset))
    {
      color <- res.colors[((i-1)%%length(res.colors))+1]
      start <- subset[r,start.col]
      end <- subset[r,end.col]
      rect(start,yBottom,end,yTop,col=color)
    }
  }
  par(op) # reset the plotting margins
}
Here are some sample data. You will notice that I have four groups, 1-4. However, not all dataframes have all four groups. Some have only two, some only three.
mydf1 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,1,1,2,2,2,1,1,1))
mydf2 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,2,2,3,4,3,2,1,1))
mydf3 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(4,4,4,4,4,4,3,2,3,3))
mydf4 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,1,2,3,3,3,2,1,1))
Here I run the above function, but specify four colors for plotting:
plotGantt(mydf1, res.col='group', start.col='startyear', end.col='endyear',
res.colors=c('red','orange','yellow','gray99'))
plotGantt(mydf2, res.col='group', start.col='startyear', end.col='endyear',
res.colors=c('red','orange','yellow','gray99'))
plotGantt(mydf3, res.col='group', start.col='startyear', end.col='endyear',
res.colors=c('red','orange','yellow','gray99'))
plotGantt(mydf4, res.col='group', start.col='startyear', end.col='endyear',
res.colors=c('red','orange','yellow','gray99'))
These are the plots:
What I would like to do is modify the function so that:
1) It plots all four groups on the y-axis regardless of whether they actually appear in the data.
2) The same color is associated with each group in every plot, regardless of how many groups there are. As you can see, mydf2 has four groups and all four colors are plotted (1-red, 2-orange, 3-yellow, 4-gray). The same colors are plotted for the same groups in mydf3, as that contains only groups 2, 3 and 4 and the colors are picked in reverse order. However, mydf1 and mydf4 have different colors plotted for each group as they do not have any group 4's. Gray is still the first color chosen but now it is used for the highest-numbered group that is present (group 2 in mydf1 and group 3 in mydf4).
It appears to me that the main thing I need to work on is the vector 'resources' inside the function, so that it contains all possible groups rather than just those present in the data. When I try manually overriding it to make sure it contains all the groups, e.g. with something as simple as resources <-as.factor(1:4), I get an error:
'Error in rect(start, yBottom, end, yTop, col = color) : cannot mix zero-length and non-zero-length coordinates'
Presumably the for loop does not know how to plot data that do not exist for groups that don't exist.
I hope that this is a replicable/readable question and it's clear what I'm trying to do.
EDIT: I realize that to solve the color problem, I could just specify the colors for the 3 groups that exist in each of these sample dfs. However, my intention is to use this plot as an output to a function whereby it wouldn't be known ahead of time if all of the groups exist for a particular df.
I slightly modified your function to account for NA in start and end dates:
plotGantt <- function(data, res.col='resources',
                      start.col='start', end.col='end', res.colors=rainbow(30))
{
  #slightly enlarge Y axis margin to make space for labels
  op <- par('mar')
  par(mar = op + c(0,1.2,0,0))
  minval <- min(data[,start.col],na.rm=T)
  maxval <- max(data[,end.col],na.rm=T)
  res.colors <- rev(res.colors)
  resources <- sort(unique(data[,res.col]),decreasing=T)
  plot(c(minval,maxval),
       c(0.5,length(resources)+0.5),
       type='n', xlab='Duration',ylab=NA,yaxt='n' )
  axis(side=2,at=1:length(resources),labels=resources,las=1)
  for(i in 1:length(resources))
  {
    yTop <- i+0.5
    yBottom <- i-0.5
    subset <- data[data[,res.col] == resources[i],]
    for(r in 1:nrow(subset))
    {
      color <- res.colors[((i-1)%%length(res.colors))+1]
      start <- subset[r,start.col]
      end <- subset[r,end.col]
      rect(start,yBottom,end,yTop,col=color)
    }
  }
  par(mar=op) # reset the plotting margins
  invisible()
}
In this way, if you simply append all your possible group values to your data, you'll get them printed on the y axis, e.g.:
mydf1 <- data.frame(startyear=2000:2009, endyear=2001:2010,
group=c(1,1,1,1,2,2,2,1,1,1))
# add all the group values you want to print with NA dates
mydf1 <- rbind(mydf1,data.frame(startyear=NA,endyear=NA,group=1:4))
plotGantt(mydf1, res.col='group', start.col='startyear', end.col='endyear',
res.colors=c('red','orange','yellow','gray99'))
About the colors, at the moment the ordered res.colors are applied to the sorted groups; so the 1st color in res.colors is applied to 1st (sorted) group and so on...
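If the groups cannot always be appended this way, another option is to look the color up by group value rather than by position. This is only a sketch: group.colors and color.for are hypothetical names, not part of the function above.

```r
# Hypothetical fixed mapping from group value to color
group.colors <- c("1" = "red", "2" = "orange", "3" = "yellow", "4" = "gray99")

# Look colors up by group value (used as a character key), not by position
color.for <- function(group) unname(group.colors[as.character(group)])

# mydf1 only contains groups 1 and 2; sorted decreasing as in plotGantt
present <- sort(unique(c(1,1,1,1,2,2,2,1,1,1)), decreasing = TRUE)
color.for(present)
# "orange" "red"
```

Inside plotGantt, the line that picks color could then be replaced by something like color <- color.for(resources[i]), in which case reversing res.colors is no longer needed.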

Stretch x-axis between two values

I have to plot several IR spectra. The x-axis of these plots has to be stretched between 2000 and 500. I've tried axis(side=1,at=c(4000,3500,2000,1500,1000,500)), but this does not produce equal distances between the labels. I've searched for nearly 2 hours but can't figure out how to achieve this.
Help would be appreciated.
Thanks in advance
I don't think that there's a particularly clean way to do this in base graphics. No doubt there's something in one of the many graphics packages that would do it, but here's my workaround for what I think you're trying to do.
#Some data to plot
x <- 0:4000
y <- sin(x/100)
#A function to do the stretching that you describe
stretcher <- function(x)
{
  lower <- 500 ##lower end of expansion
  upper <- 2000 ##upper end of expansion
  stretchfactor <- 3 ##must be greater than 1, factor of expansion
  x[x>upper] <- x[x>upper] + (stretchfactor-1) * (upper-lower)
  x[x<=upper & x>lower] <- (x[x<=upper & x>lower] - lower) * stretchfactor + lower
  x
}
#Create the plot
plot(stretcher(x),y,axes=FALSE)
labels <- c(4000,3500,3000,2500,2000,1500,1000,500)
box()
axis(2)
axis(1,labels=labels,at=stretcher(labels))
I'd also emphasise the breaks with something like:
abline(v=stretcher(2000),col='red',lty=2)
abline(v=stretcher(500),col='red',lty=2)
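To see what the transformation does to individual tick positions, here is the same piecewise mapping with the limits as arguments (stretch_axis and its argument names are made up for this illustration):

```r
# Same piecewise mapping as stretcher(), with the limits as arguments
stretch_axis <- function(x, lower = 500, upper = 2000, stretchfactor = 3) {
  out <- x
  above <- x > upper
  inside <- x <= upper & x > lower
  # values above the stretched region are shifted by the extra width
  out[above] <- x[above] + (stretchfactor - 1) * (upper - lower)
  # values inside the region are spread out by the stretch factor
  out[inside] <- (x[inside] - lower) * stretchfactor + lower
  out
}

ticks <- c(500, 1000, 1500, 2000, 3000, 4000)
stretched <- stretch_axis(ticks)
stretched
# 500 2000 3500 5000 6000 7000
diff(stretched)
# 1500 1500 1500 1000 1000
```

A step of 500 inside the 500-2000 region takes up 1500 plot units, while a step of 1000 outside it takes up 1000, so the region is drawn three times wider than the rest of the axis.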

add labels to lattice barchart

I would like to place the value for each bar in barchart (lattice) at the top of each bar. However, I cannot find any option with which I can achieve this. I can only find options for the axis.
Create a custom panel function, e.g.
library("lattice")
p <- barchart((1:10)^2~1:10, horiz=FALSE, ylim=c(0,120),
              panel=function(...) {
                args <- list(...)
                panel.text(args$x, args$y, args$y, pos=3, offset=1)
                panel.barchart(...)
              })
print(p)
I would have suggested using the new directlabels package, which can be used with both lattice and ggplot (and makes life very easy for these labeling problems), but unfortunately it doesn't work with barcharts.
Since I had to do this anyway, here's a close-enough-to-figure-it-out code sample along the lines of what @Alex Brown suggests (scores is a 2D array of some sort, which'll get turned into a grouped vector):
barchart(scores, horizontal=FALSE, stack=FALSE,
         xlab='Sample', ylab='Mean Score (max of 9)',
         auto.key=list(rectangles=TRUE, points=FALSE),
         panel=function(x, y, box.ratio, groups, errbars, ...) {
           # We need to specify groups because it's not actually the 4th
           # parameter
           panel.barchart(x, y, box.ratio, groups=groups, ...)
           x <- as.numeric(x)
           nvals <- nlevels(groups)
           groups <- as.numeric(groups)
           box.width <- box.ratio / (1 + box.ratio)
           for(i in unique(x)) {
             ok <- x == i
             width <- box.width / nvals
             locs <- i + width * (groups[ok] - (nvals + 1)/2)
             panel.arrows(locs, y[ok] + 0.5, scores.ses[,i], ...)
           }
         } )
I haven't tested this, but the important bits (the parts determining the locs etc. within the panel function) do work. That's the hard part to figure out. In my case, I was actually using panel.arrows to make errorbars (the horror!). But scores.ses is meant to be an array of the same dimension as scores.
I'll try to clean this up later - but if someone else wants to, I'm happy for it!
If you are using the groups parameter you will find the labels in @rcs's code all land on top of each other. This can be fixed by extending panel.text to work like panel.barchart, which is easy enough if you know R.
I can't post the code of the fix here for licensing reasons, sorry.
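For what it's worth, here is a rough sketch of how grouped labels could be positioned, reusing the offset arithmetic from the untested snippet above. The helper dodge.positions and the data frame df are made up for this example, and it assumes a vertical, unstacked barchart.

```r
library(lattice)

# Hypothetical helper: x position of each bar within its group cluster
dodge.positions <- function(x, groups, box.ratio) {
  nvals <- nlevels(groups)
  width <- box.ratio / (1 + box.ratio) / nvals
  as.numeric(x) + width * (as.numeric(groups) - (nvals + 1) / 2)
}

# Made-up data: two samples, two groups
df <- data.frame(sample = factor(c("s1", "s1", "s2", "s2")),
                 grp = factor(c("a", "b", "a", "b")),
                 value = c(3, 5, 4, 6))

p <- barchart(value ~ sample, groups = grp, data = df,
              horizontal = FALSE, ylim = c(0, 8),
              panel = function(x, y, groups, subscripts, box.ratio, ...) {
                panel.barchart(x, y, groups = groups, subscripts = subscripts,
                               box.ratio = box.ratio, ...)
                # place each value just above its own bar
                locs <- dodge.positions(x, groups[subscripts], box.ratio)
                panel.text(locs, y, labels = y, pos = 3)
              })
print(p)
```

Keeping the position arithmetic in a separate function makes it possible to check the offsets without drawing anything.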
