I would like to represent two-dimensional data as bars, placed over the x-axis values, but barplot() does not allow to control x-axis placement, and plot() does not draw bars:
x <- c(1, 2, 3, 5)
y <- 1:4
plot(x, y, type = "h")
barplot(y)
Click for an image illustrating the plot() and barplot() examples.
I understand that I can plot a histogram –
hist(rep(x, y), breaks = seq(min(x) - 0.5, max(x) + 0.5, 1))
Click for an image illustrating the hist() example.
– but the recreation of the original (non-frequency) data and the calculation of the breaks is not always as straightforward as in this example, so:
Is there a way to force plot() to draw bars?
Or is there a way to force barplot() to place the bars at specific values on the x-axis?
Basically, what I would like is something like:
barplot(y, at = x)
I would prefer to use base R and avoid ggplot.
While I agree with #Dave2e that a barplot may not be the best way to represent your data, you can get something like what you are describing by starting with a blank plot and drawing the relevant rectangles. I am using your y values (1:4) and the x values that you mentioned in your comment. I am not sure what you want on the x-axis, but I show labels for the x-values that you give. In order to look like a barplot, I suppress the tick marks on the x-axis.
plot(NULL, xlim=c(0,11), ylim=c(0,4.5), bty="n",
xaxt="n", xaxs="i", yaxs="i", xlab="", ylab="")
rect(x-0.5, 0, x+0.5, y, col="gray")
axis(side=1, at=x, col.ticks=NA)
Related
Purpose
Create scatter plot with third dimension and multiple colors.
First:
- 3rd dimension with another scale in contrast to y-axis
- create two colors (this is done using col, see code)
Sketch simulating the purpose:
Code
Two "containers" of points plotted in this way:
plot(1:3, c(3,3,3))
points(1:3, c(2,2,2), col="blue")
Another nice plotting is done by:
#install.packages("hexbin")
library(hexbin)
x <- 1:1000#rnorm(1000)
y <- 1500:501#rnorm(1000)
bin<-hexbin(x, y, xbins=50)
plot(bin, main="Hexagonal Binning")
But I do not know how to use hexbin (I do not understand the functionality). There are needed two colors which I do not know how to generate.
Questions
How to create the 3rd axis with other scaling than the y-axis?
Can I use ´hexbin´ to get the result?
For some reason, using points() does not work, but using plot() does work:
#Set margin on right side to be a bit larger
par(mar = c(5,4.5,4,5))
#Plot first set of data
plot(1:3, rep(3,3), ylim=c(-5,5), xlab="X-Axis", ylab="Y-Axis 1")
#Plot second set of data on different axis.
par(new=T)
plot(1:3, rep(5,3), ylim=c(-10,10), col="blue", xlab="", ylab="", axes=FALSE)
#Add numbers and labels to the second y-axis
mtext("Y-Axis 2",side=4,line=3)
axis(4, ylim=c(-10,10))
I either seem to be running in circles or there is something wrong with my data. I want to plot some data and use ?axisto modify the labels on the X axis.
I have two issues however:
The axis labels begin in the center of the x-axis instead of at the beginning
The axis labels do not match the data points in the plot
I would like to have X axis labels ranging from 10 to 90 by 5.
This is the code that I use and came up with so far:
values <- cbind(1:180,1)
l <- list(1:10,11:20,21:30,31:40,41:50,51:60,61:70,71:80,81:90,91:100,101:110,111:120,121:130,131:140,141:150,151:160,161:170,171:180)
# compute mean across the intervals in l
meanqual <- sapply(l, function(x) mean(values[x,1]))
meanqual
plot <- plot(meanqual, type="o", xlab="% Size of Wave", ylab="Values",xaxt='n', lty=1)
legend('bottomright', c("Values"),pch=21, lty=1, cex=1)
axis(side=1, at= seq(10,90,5))
If you only give one numeric vector to plot it "assumes" you meant to use the position or index of the values in that vector as the x values, so the plot call plotted meanqual against 1:length(meanqual). If you wanted to plot against the seq() argument you latter used in the axis call you should supply it (or rather something similar in scale with the same length as meanqual) to plot:
plot <- plot(x=seq(5,90,5), y=meanqual, type="o",
xlab="% Size of Wave", ylab="Values",xaxt='n', lty=1)
legend('bottomright', c("Values"),pch=21, lty=1, cex=1)
axis(side=1, at= seq(10,90,5), labels=seq(10,90,5))
Hello I am trying to create a stacked barplot using the following code:
test <- as.matrix(read.csv(file="test4.csv",sep=",",head=TRUE))
test <- test[,2:ncol(test)]
pdf(file="test.pdf", height=4, width=6)
par(lwd = 0.3)
barplot(test, space=0.4, xaxt='n', ann=FALSE)
axis(1, cex.axis=0.25, las=2, at=1:ncol(test), space=0.4, labels=colnames(test))
dev.off()
And I get:
As you can see the labels in the x-axis do not match the bars in the plot. Also, the ticks are huge.
Can you guys help me beautify the x axis? Thanks so much
Try storing the returned value of the call to barplot() in a named object, and then passing that in to the at= argument of axis():
xLabLocs <- barplot(test, space=0.4, xaxt='n', ann=FALSE)
axis(1, cex.axis=0.25, las=2, at=xLabLocs,
space=0.4, labels=colnames(test))
This may look odd, but it is explained in the Value section of the ?barplot help file:
Value:
A numeric vector (or matrix, when ‘beside = TRUE’), say ‘mp’,
giving the coordinates of _all_ the bar midpoints drawn, useful
for adding to the graph.
You just made the (easy enough to make) mistake of assuming that the x-axis coordinates of the bar centers are at 1:n, where n is the number of bars. That's not necessarily true, so it's nice that a single call to barplot() will both: (a) plot the bar plot as its side effect; and (b) return the necessary x-axis coordinates as its return value.
While plotting histogarm, scatterplots and other plots with axes scaled to logarithmic scale in R, how is it possible to use labels such as 10^-1 10^0 10^1 10^2 10^3 and so on instead of the axes showing just -1, 0, 1, 2, 3 etc. What parameters should be added to the commands such as hist(), plot() etc?
Apart from the solution of ggplot2 (see gsk3's comment), I would like to add that this happens automatically in plot() as well when using the correct arguments, eg :
x <- 1:10
y <- exp(1:10)
plot(x,y,log="y")
You can use the parameter log="x" for the X axis, or log="xy" for both.
If you want to format the numbers, or you have the data in log format, you can do a workaround using axis(). Some interesting functions :
axTicks(x) gives you the location of the ticks on the X-axis (x=1) or Y-axis (x=2)
bquote() converts expressions to language, but can replace a variable with its value. More information on bquote() in the question Latex and variables in plot label in R? .
as.expression() makes the language object coming from bquote() an expression. This allows axis() to do the formatting as explained in ?plotmath. It can't do so with language objects.
An example for nice formatting :
x <- y <- 1:10
plot(x,y,yaxt="n")
aty <- axTicks(2)
labels <- sapply(aty,function(i)
as.expression(bquote(10^ .(i)))
)
axis(2,at=aty,labels=labels)
Which gives
Here is a different way to draw this type of axis:
plot(NA, xlim=c(0,10), ylim=c(1, 10^4), xlab="x", ylab="y", log="y", yaxt="n")
at.y <- outer(1:9, 10^(0:4))
lab.y <- ifelse(log10(at.y) %% 1 == 0, at.y, NA)
axis(2, at=at.y, labels=lab.y, las=1)
EDIT: This is also solved in latticeExtra with scale.components
In ggplot2 you just can add a
... +
scale_x_log10() +
scale_y_log10(limits = c(1e-4,1), breaks=c(1e-4,1e-3,1e-2,0.1,1)) + ...
to scale your axis, Label them and add custom breaks.
I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.
Then I've tried doing:
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(rpd_hist$counts, log="xy", pch=20, col="blue")
It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.
A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies not probabilities -- add ,prob=TRUE to the call if you want probabilities.
As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:
plot(mydata_hist$count, log="y", type='h', lwd=10, lend=2)
gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
Another option would be to use the package ggplot2.
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
bp <- barplot(mydata_hist$count, log="y", col="white", names.arg=buckets)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
The last line is optional, it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.
I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
Run the hist() function without making a graph, log-transform the counts, and then draw the figure.
hist.data = hist(my.data, plot=F)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)
It should look just like the regular histogram, but the y-axis will be log2 Frequency.
I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.
The original problem would be solved with:
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
The function:
myhist <- function(x, ..., breaks="Sturges",
main = paste("Histogram of", xname),
xlab = xname,
ylab = "Frequency") {
xname = paste(deparse(substitute(x), 500), collapse="\n")
h = hist(x, breaks=breaks, plot=FALSE)
plot(h$breaks, c(NA,h$counts), type='S', main=main,
xlab=xlab, ylab=ylab, axes=FALSE, ...)
axis(1)
axis(2)
lines(h$breaks, c(h$counts,NA), type='s')
lines(h$breaks, c(NA,h$counts), type='h')
lines(h$breaks, c(h$counts,NA), type='h')
lines(h$breaks, rep(0,length(h$breaks)), type='S')
invisible(h)
}
Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.
Here's a pretty ggplot2 solution:
library(ggplot2)
library(scales) # makes pretty labels on the x-axis
breaks=c(0,1,2,3,4,5,25)
ggplot(mydata,aes(x = V3)) +
geom_histogram(breaks = log10(breaks)) +
scale_x_log10(
breaks = breaks,
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
Note that to set the breaks in geom_histogram, they had to be transformed to work with scale_x_log10