x-axis labels do not match bars - r

Hello I am trying to create a stacked barplot using the following code:
test <- as.matrix(read.csv(file="test4.csv",sep=",",head=TRUE))
test <- test[,2:ncol(test)]
pdf(file="test.pdf", height=4, width=6)
par(lwd = 0.3)
barplot(test, space=0.4, xaxt='n', ann=FALSE)
axis(1, cex.axis=0.25, las=2, at=1:ncol(test), space=0.4, labels=colnames(test))
dev.off()
And I get:
As you can see the labels in the x-axis do not match the bars in the plot. Also, the ticks are huge.
Can you guys help me beautify the x axis? Thanks so much

Try storing the returned value of the call to barplot() in a named object, and then passing that in to the at= argument of axis():
xLabLocs <- barplot(test, space=0.4, xaxt='n', ann=FALSE)
axis(1, cex.axis=0.25, las=2, at=xLabLocs,
labels=colnames(test))
This may look odd, but it is explained in the Value section of the ?barplot help file:
Value:
A numeric vector (or matrix, when ‘beside = TRUE’), say ‘mp’,
giving the coordinates of _all_ the bar midpoints drawn, useful
for adding to the graph.
You just made the (easy enough to make) mistake of assuming that the x-axis coordinates of the bar centers are at 1:n, where n is the number of bars. That's not necessarily true, so it's nice that a single call to barplot() will both: (a) plot the bar plot as its side effect; and (b) return the necessary x-axis coordinates as its return value.
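As a small self-contained check of the point above, the sketch below uses a made-up 2 x 4 matrix (m is a stand-in for the CSV data, not the poster's file) and prints the midpoints that barplot() returns. The tcl= value passed to axis() is an assumption on my part about how short the ticks should be; it is one way to tame the "huge" ticks mentioned in the question.

```r
# Hypothetical 2 x 4 matrix standing in for the data read from test4.csv
m <- matrix(1:8, nrow = 2, dimnames = list(NULL, c("a", "b", "c", "d")))

pdf(NULL)  # null device, so the sketch runs without writing a file
mids <- barplot(m, space = 0.4, xaxt = "n", ann = FALSE)
# Labels go at the returned midpoints; tcl shortens the tick marks
axis(1, at = mids, labels = colnames(m), las = 2, cex.axis = 0.6, tcl = -0.2)
invisible(dev.off())

print(mids)  # bar centres, which are not 1:4 when space = 0.4
```

With unit-width bars and space=0.4, each bar occupies 1.4 units, so the centres drift further from 1:n with every bar.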

Related

Place bars at specific x-axis values in a barplot

I would like to represent two-dimensional data as bars placed at specific x-axis values, but barplot() does not let me control the x-axis placement, and plot() does not draw bars:
x <- c(1, 2, 3, 5)
y <- 1:4
plot(x, y, type = "h")
barplot(y)
I understand that I can plot a histogram –
hist(rep(x, y), breaks = seq(min(x) - 0.5, max(x) + 0.5, 1))
– but the recreation of the original (non-frequency) data and the calculation of the breaks is not always as straightforward as in this example, so:
Is there a way to force plot() to draw bars?
Or is there a way to force barplot() to place the bars at specific values on the x-axis?
Basically, what I would like is something like:
barplot(y, at = x)
I would prefer to use base R and avoid ggplot.
While I agree with @Dave2e that a barplot may not be the best way to represent your data, you can get something like what you are describing by starting with a blank plot and drawing the relevant rectangles. I am using your y values (1:4) and the x values that you mentioned in your comment. I am not sure what you want on the x-axis, but I show labels for the x-values that you give. In order to look like a barplot, I suppress the tick marks on the x-axis.
plot(NULL, xlim=c(0,11), ylim=c(0,4.5), bty="n",
xaxt="n", xaxs="i", yaxs="i", xlab="", ylab="")
rect(x-0.5, 0, x+0.5, y, col="gray")
axis(side=1, at=x, col.ticks=NA)
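The same idea can be wrapped in a small function. This is only a sketch: bars_at() is a hypothetical helper name (not part of base R), the default bar width of 1 and the 10% headroom on the y-axis are my assumptions, and it is run here on the x and y vectors from the question.

```r
# Hypothetical helper (not a base R function): draw bars of a given width
# centred on arbitrary x positions and return the bar edges invisibly.
bars_at <- function(x, y, width = 1, col = "gray", ...) {
  left  <- x - width / 2
  right <- x + width / 2
  # Empty plot whose x-range just covers the outermost bar edges
  plot(NULL, xlim = range(left, right), ylim = c(0, max(y) * 1.1),
       bty = "n", xaxt = "n", yaxs = "i", xlab = "", ylab = "", ...)
  rect(left, 0, right, y, col = col)        # one rectangle per (x, y) pair
  axis(side = 1, at = x, col.ticks = NA)    # labels at the bar centres, no ticks
  invisible(data.frame(left = left, right = right))
}

x <- c(1, 2, 3, 5)
y <- 1:4
pdf(NULL)  # null device, so the sketch runs without writing a file
edges <- bars_at(x, y)
invisible(dev.off())
```

Because the bars are plain rectangles, the gap at x = 4 appears naturally, which barplot(y) cannot show.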

R: par(mfg) resets ylim values

I'm having a frustrating experience trying to use par(mfg) to move between subplots of a figure. It seems like changing which plot I'm working in using this command resets something about the way y axes are specified such that the ylim=c(a,b) call is useless. This thread (puzzled by xlim/ylim behavior in R) makes me believe that asp may play a role here, but I can't figure out how or how to correct the error.
Briefly, to plot results from density() for multiple datasets on two subplots of a single window, I've written a loop that increments through two lists of output from density() adding new lines to subplot 1, then subplot 2, then back to subplot 1, etc.
DATA.A<-vector("list",length=6)
DATA.B<-vector("list",length=6)
par(mfrow=c(2,1))
plot(0,0, main="title", xlab="X", ylab="Y", xlim=c(c,d), ylim=c(0,30))
plot(0,0, main="title", xlab="X", ylab="Y", xlim=c(c,d), ylim=c(-5,5))
for(i in 1:6){
DATA.A[[i]]<-density(RAWDATA.A[[i]][,"varname"], from=c, to=d, by=e)
DATA.B[[i]]<-density(RAWDATA.B[[i]][,"varname"], from=c, to=d, by=e)
par(mfg=c(1,1))
lines(DATA.A[[i]]$x,DATA.A[[i]]$y,ylim=c(0,30),col="black", lty=i)
lines(DATA.B[[i]]$x,DATA.B[[i]]$y,ylim=c(0,30),col="red", lty=i)
par(mfg=c(2,1))
lines(DATA.A[[i]]$x,DATA.B[[i]]$y-DATA.A[[i]]$y,
ylim=c(-5,5), col="red", lty=i)
abline(v=median(RAWDATA.A[[i]][,"varname"]),lty=i, col="black")
}
EDIT: I am realizing that it fails mostly for the first subplot where it is supposed to be plotting densities over the range from 0 to 30, but instead always resets the axis to the range -1 to 1. Calling plot(0,0), the y tick labels correspond to ylim values I provide, but the data is plotted on the -1 to 1 range. I'd be very grateful for any suggestions.
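A possible workaround, offered as a sketch rather than a confirmed diagnosis: as far as I understand, par(mfg=) only selects the panel, while the coordinate system in par("usr") stays whatever the most recent plot set up, and ylim= passed to lines() does not rescale anything. Saving par("usr") after each plot() and restoring it after each par(mfg=) keeps every panel on its own scale. The curves below are made-up stand-ins for the density() output, and usr_top / usr_bottom are names introduced only for this example.

```r
pdf(NULL)  # null device; replace with your real device
par(mfrow = c(2, 1))

plot(0, 0, type = "n", xlim = c(0, 10), ylim = c(0, 30), xlab = "X", ylab = "Y")
usr_top <- par("usr")       # coordinate system of panel (1,1)
plot(0, 0, type = "n", xlim = c(0, 10), ylim = c(-5, 5), xlab = "X", ylab = "Y")
usr_bottom <- par("usr")    # coordinate system of panel (2,1)

xs <- seq(0, 10, length.out = 50)  # hypothetical x grid
for (i in 1:3) {
  par(mfg = c(1, 1)); par(usr = usr_top)        # select panel, then restore its scale
  lines(xs, 5 * i + 3 * sin(xs), col = "black", lty = i)
  par(mfg = c(2, 1)); par(usr = usr_bottom)
  lines(xs, i * cos(xs), col = "red", lty = i)
}

par(mfg = c(1, 1)); par(usr = usr_top)
restored <- par("usr")      # should match usr_top again
invisible(dev.off())
```

An alternative that avoids switching panels altogether is to run two separate loops, one per panel, each immediately after its own plot() call.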

R: Two axis chart adjustments

I am trying to plot a chart with two axes. Here is the code (the plot is attached).
I need to make two adjustments to it:
1. Plot the line with dots, with each dot centered on its bar.
2. Align the tick marks of the right-hand axis (axis(4)) with those of the left-hand axis (axis(2)).
Code:
Region=c("North","South","East","West")
Sales=sample(500:1000,4)
Change=sample(1:10,4)/10
names(Sales)=Region
names(Change)=Region
barplot(Sales,ylim=c(0,1000))
par(new=T)
plot(Change,type="b",axes=F,ylim=c(0,1))
axis(4)
box()
Regards,
Sivaji
First, save the value returned by barplot() in an object; it holds the x-coordinates of the bar midpoints. Then add the line with lines(), multiplying the Change values by 1000 so that they share the scale of the bars.
Finally, give axis() the tick positions through at= and use the same values divided by 1000 as labels=.
x<-barplot(Sales,ylim=c(0,1000))
lines(x,Change*1000,type="b")
axis(4,at=seq(0,800,200),labels=seq(0,800,200)/1000)
Alternatively, give the second plot the same x-axis range as the barplot; you can read that range from par("usr").
Setting xaxs="i" makes R use xlim exactly; by default R extends the range by 4% at each end to make the plot look better.
par(mar=c(5,5,2,5)) # change margins
x = barplot(Sales, ylim=c(0,1000)) # barplot, keep middle points of bars
mtext("Sales", 2, line=3) # first y-axis label
xlim = par("usr")[1:2] # get xlim from plot
par(new=TRUE)
plot.new() # new plot
plot.window(xlim=xlim, ylim=c(0,1), xaxs="i", yaxs="i") # new plot area, same xlim
lines(x,Change,type="b") # the lines in the middle points
axis(4) # secondary y-axis
mtext("Change", 4, line=3) # secondary y-axis label
box()
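To make the tick alignment from the first answer explicit, here is a sketch that derives the right-hand labels from the left-hand tick positions. The Sales and Change values are fixed, made-up numbers (the question samples them randomly), and left_ticks and scale_factor are names introduced only for this example.

```r
# Hypothetical fixed values so the example is reproducible
Sales  <- c(North = 620, South = 810, East = 540, West = 930)
Change <- c(North = 0.3, South = 0.9, East = 0.5, West = 0.7)

left_ticks   <- seq(0, 1000, 200)   # tick positions on the Sales axis
scale_factor <- 1000                # Sales units per unit of Change

pdf(NULL)                           # null device, so no file is written
par(mar = c(5, 4, 4, 4) + 0.1)      # leave room for the right-hand axis
mids <- barplot(Sales, ylim = c(0, 1000), axes = FALSE)
axis(2, at = left_ticks)
lines(mids, Change * scale_factor, type = "b")              # dots at the bar centres
axis(4, at = left_ticks, labels = left_ticks / scale_factor)  # same positions, rescaled labels
box()
invisible(dev.off())
```

Since both axes use the same at= positions, every right-hand tick sits level with a left-hand tick by construction.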

Data points do not display when specifying axis in R

For some reason if I try to display data with the following code, I get the axis right, but the actual data does not plot. Any suggestions?
par(bg="lightgray")
adates <-as.Date(row.names(try),format="%Y-%m-%d")
plot(try[,1],x=adates,type="o",axes=FALSE, ann=FALSE)
usr <- par("usr")
rect(usr[1], usr[3], usr[2], usr[4], col="cornsilk", border="black")
lines(try[,1], col="blue")
axis(2,col.axis="blue",,at=pretty(try[,1]),las=1,labels=sprintf("$%1.0f",pretty(try[,1]/1000)),cex.axis=.75)
axis.Date(1, at=pretty(adates), label=format(pretty(adates),"%y"))
box()
title(main="This is a graph", font.main=4, col.main="red",xlab="Date",ylab="$ (in $1000s)")
Forgetting all the other bits of code, take a look at
plot(try[,1],x=adates,type="o",axes=FALSE, ann=FALSE)
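The first argument to plot() is normally the vector of x-coordinates, and here that position holds try[,1], while the dates are supplied separately as x=adates. R matches the named argument first, so try[,1] falls through to y, but the call is easy to misread; plot(adates, try[,1], ...) states both coordinates explicitly. The later lines(try[,1], col="blue") call, by contrast, really does have only one coordinate vector, so it is drawn against the index 1, 2, 3, ... rather than against the dates.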
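One way to restructure the drawing order from the question, as a sketch with made-up data: set up the coordinate system without drawing, paint the background rectangle, and only then draw the data with both dates and values given explicitly. adates and values below are hypothetical stand-ins for the poster's try object. The sketch assumes that the cornsilk rectangle painted after plot() covers the points, and that lines() without dates draws against the index.

```r
# Hypothetical data standing in for the poster's 'try' object
adates <- as.Date("2010-01-01") + seq(0, 330, by = 30)
values <- seq(1000, 12000, length.out = length(adates))

pdf(NULL)  # null device, so the sketch runs without writing a file
par(bg = "lightgray")
plot(adates, values, type = "n", axes = FALSE, ann = FALSE)  # coordinates only
usr <- par("usr")
rect(usr[1], usr[3], usr[2], usr[4], col = "cornsilk", border = "black")  # background first
lines(adates, values, type = "o", col = "blue")  # data on top, against the dates
axis(2, las = 1, cex.axis = 0.75, col.axis = "blue")
axis.Date(1, at = pretty(adates), format = "%y")
box()
title(main = "This is a graph", xlab = "Date", ylab = "Value")
invisible(dev.off())
```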

Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.
Then I've tried doing:
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")
It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.
A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies not probabilities -- add ,prob=TRUE to the call if you want probabilities.
As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:
plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)
gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
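To also get the bin ranges on the bottom axis instead of 1 to 6, the type='h' approach can be combined with a custom axis() call. This is a sketch on simulated data: v is a made-up, heavily skewed vector standing in for mydata$V3, and bin_labels is a name introduced only for this example.

```r
set.seed(1)
# Hypothetical skewed data: most values fall between 0 and 1
v <- c(runif(10000, 0, 1), runif(200, 1, 5), runif(20, 5, 25))

brks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(v, breaks = brks, plot = FALSE)

# One label per bin, built from consecutive break points
bin_labels <- paste(head(brks, -1), tail(brks, -1), sep = "-")

pdf(NULL)  # null device, so the sketch runs without writing a file
plot(h$counts, log = "y", type = "h", lwd = 10, lend = 2,
     xaxt = "n", xlab = "Bin", ylab = "Count")
axis(1, at = seq_along(h$counts), labels = bin_labels)  # bins, not 1..6
invisible(dev.off())
```

Note that a log y-axis cannot show bins with a count of zero, so this only works when every bin is populated.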
Another option would be to use the package ggplot2.
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=paste(head(buckets, -1), tail(buckets, -1), sep="-"))
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
The last line is optional, it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.
In practice I also pass main, xlab, and ylab to barplot() to provide a plot title, x-axis label, and y-axis label; they are omitted above for brevity.
Run the hist() function without making a graph, log-transform the counts, and then draw the figure.
hist.data = hist(my.data, plot=FALSE)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)
It should look just like the regular histogram, but the y-axis will be log2 Frequency.
I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.
The original problem would be solved with:
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
The function:
myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}
Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.
Here's a pretty ggplot2 solution:
library(ggplot2)
library(scales) # makes pretty labels on the x-axis
breaks=c(0,1,2,3,4,5,25)
ggplot(mydata,aes(x = V3)) +
  geom_histogram(breaks = log10(breaks)) +
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )
Note that the breaks passed to geom_histogram had to be log10-transformed to work with scale_x_log10.
