Segmenting X-axis label in an R plot - r

I've been working extensively with R lately and I have a nitpicky plotting question.
I've attached an image of my current plot as reference. As you can see, I've added vertical lines to segment parts of my data inputs. I have 200 'agents' and each of them comes from different categorical subsets which make them all a little different. So, my goal is to keep the bottom axis as the index of my 'agents' vector, but I'd like to add a label to each of my subdivisions at the bottom to make it a little clearer as to why I'm segmenting them with the vertical lines.
Any suggestions?
http://i.imgur.com/YGNdBhg.png?1?1971

You just need to call axis like this:
x = sin(1:100) + rnorm(100, 0,.125)
breaks = c(10,33,85, 96)
plot(x)
sapply(breaks, function(x){abline(v=x, lty=2)})
axis(1, breaks, as.character(breaks))
If you don't want the default ticks plotted at all (i.e. just the ticks in the "breaks" vector) you just need to modify this slightly:
plot(x, axes=F)
sapply(breaks, function(x){abline(v=x, lty=2)})
axis(1, breaks, as.character(breaks))
axis(2)
box()

You don't provide any example data or code, so the code I am sending is untested. I am calling the vector of vertical lines vertlines and the vector of labels labels. I define the midpoints of each category using the vertlines and the range of the agent values. Then I add them to the plot using the mtext() function. Give it a try.
vertlines <- c(40, 80, 120, 140, 160, 180)
labels <- letters[1:7]
labelx <- diff(c(1, vertlines, 200))/2 + c(1, vertlines)
mtext(labels, at=labelx, side=1, line=4)

Related

Plotting a function with points()

I am using R to plot a function, and want to add lines describing multiple functions to the same plot. To plot a function, I write:
plot(function(x){x},
xlab="Celsius", xlim=c(-100, 100),
ylab="Degrees", ylim=c(-100, 100))
This would plot a 1:1 line. If I want to plot a different function on the same graph, I can use the points() function but this requires data values for x to be provided such that it plots length(x) points (joined by lines) as:
points(x=seq(-100, 100, by=0.1),
y=c(seq(-100, 100, by=0.1)-32)*5/9,
typ="l", col="red")
Is it possible to add lines to a plot when plotting a function rather than having to calculate data points using points() or another function? Essentially, it would be something like this:
plot(function(x){x},
xlab="Celsius", xlim=c(-100, 100),
ylab="Degrees", ylim=c(-100, 100))
points(function(x){(x-32)*5/9},
typ="l", col="red")
This is just an example, it shows the relationship between degrees Celsius on the X axis, and degrees on the Y axis in Celsius (black) and Fahrenheit (red). In reality I want to plot multiple complex functions but that would just add noise to the question.
One solution I found is
plot(function(x){x},
xlab="Celsius", xlim=c(-100, 100),
ylab="Degrees", ylim=c(-100, 100))
par(new=TRUE)
plot(function(x){(x-32)*5/9},
xlab="", xlim=c(-100, 100),
ylab="", ylim=c(-100, 100),
axes=FALSE, col="red")
But it seems cumbersome having to define limits and labels and AXES=FALSE each time.
You can use the plot function twice and add add = TRUE for the second plot.
With plot, you can also use from and to parameters to avoid repeating the y-axis limits, although it will keep the y-axis limits defined in the first plot (so it might not be optimal).
plot(function(x){x},
xlab="Celsius", xlim=c(-100, 100),
ylab="Degrees", ylim=c(-100, 100))
plot(function(x) {(x-32)*5/9}, from = -100, to = 100, typ="l", col="red", add=T)
As mentioned by #Roland and #user2554330, you can also use curves if you want to plot multiple lines from the same function, and use () to avoid assigning the function beforehand, with add = i!=1 standing for add = T at every iteration except the first one.
for(y in 1:10) {
curve((x + 10*y), from=-100, to=100, add=i!=1)
}

uneven spacing on axis with R

I have a problem in plotting uneven scale plot on x axis with R
Here is an example:
plot(1:100,1:100)
will give the equal tick space on x axis.
However, I want to show the graph with first half of space showing 1 to 10, and the left half space showing 10 to 100, so the points in the 10 to 100 more dense, and points in 1:10 are easier to see. How to do it with R?
Like this:
This is not an easy one-off task to complete. You'll actually need to transform to the scaled data and supply custom tick marked axes. Any reason you haven't considered simply logging the x-axis instead? (supplying the option plot(x, y, log='x') will do that).
What I think you've described is this:
xnew <- ifelse(x<10, x, x/10)
plot(xnew, y, axes=FALSE, xlab='x')
axis(1, at=c(0, 10, 20), labels=c(0, 10, 100))
axis(2)
box()
You could log the x axis:
x<-1:100
y<-1:100
plot(log(x,base=10),y,axes=F)
axis(2)
axis(1,at=0:2,labels=10^(0:2))
For a logarithmic axis, use:
plot(x,y,log="x") ## specifies which axis to put on log scale
For determining how many "tick marks" to use, check
par()$lab
Default is 5,5,7. To put more x axis labels, do
par(lab=c(10,5,7))
And for y:
par(lab=c(5,10,7))

How to specify the actual x axis values to plot as x axis ticks in R

I am creating a plot in R and I dont like the x axis values being plotted by R.
For example:
x <- seq(10,200,10)
y <- runif(x)
plot(x,y)
This plots a graph with the following values on the X axis:
50, 100, 150, 200
However, I want to plot the 20 values 10,20, 30 ... 200 stored in variable x, as the X axis values. I have scoured through countless blogs and the terse manual - after hours of searching, the closest I've come to finding anything useful is the following (summarized) instructions:
call plot() or par(), specifying argument xaxt='n'
call axis() e.g. axis(side = 1, at = seq(0, 10, by = 0.1), labels = FALSE, tcl = -0.2)
I tried it and the resulting plot had no x axis values at all. Is it possible that someone out there knows how to do this? I can't believe that no one has ever tried to do this before.
You'll find the answer to your question in the help page for ?axis.
Here is one of the help page examples, modified with your data:
Option 1: use xaxp to define the axis labels
plot(x,y, xaxt="n")
axis(1, xaxp=c(10, 200, 19), las=2)
Option 2: Use at and seq() to define the labels:
plot(x,y, xaxt="n")
axis(1, at = seq(10, 200, by = 10), las=2)
Both these options yield the same graphic:
PS. Since you have a large number of labels, you'll have to use additional arguments to get the text to fit in the plot. I use las to rotate the labels.
Take a closer look at the ?axis documentation. If you look at the description of the labels argument, you'll see that it is:
"a logical value specifying whether (numerical) annotations are
to be made at the tickmarks,"
So, just change it to true, and you'll get your tick labels.
x <- seq(10,200,10)
y <- runif(x)
plot(x,y,xaxt='n')
axis(side = 1, at = x,labels = T)
# Since TRUE is the default for labels, you can just use axis(side=1,at=x)
Be careful that if you don't stretch your window width, then R might not be able to write all your labels in. Play with the window width and you'll see what I mean.
It's too bad that you had such trouble finding documentation! What were your search terms? Try typing r axis into Google, and the first link you will get is that Quick R page that I mentioned earlier. Scroll down to "Axes", and you'll get a very nice little guide on how to do it. You should probably check there first for any plotting questions, it will be faster than waiting for a SO reply.
Hope this coding will helps you :)
plot(x,y,xaxt = 'n')
axis(side=1,at=c(1,20,30,50),labels=c("1975","1980","1985","1990"))
In case of plotting time series, the command ts.plot requires a different argument than xaxt="n"
require(graphics)
ts.plot(ldeaths, mdeaths, xlab="year", ylab="deaths", lty=c(1:2), gpars=list(xaxt="n"))
axis(1, at = seq(1974, 1980, by = 2))

Plot percentages on y-axis

I'm plotting a graph using this
plot(dates,returns)
I would like to have the returns expressed as percentages instead of numbers. 0.1 would become 10%. Also, the numbers on the y-axis appear tilted 90 degrees on the left. Is it possible to make them appear horizontally?
Here is one way using las=TRUE to turn the labels on the y-axis and axis() for the new y-axis with adjusted labels.
dates <- 1:10
returns <- runif(10)
plot(dates, returns, yaxt="n")
axis(2, at=pretty(returns), lab=pretty(returns) * 100, las=TRUE)
If you use ggplot you can use the scales package.
library(scales)
plot + scale_y_continuous(labels = percent)
library(scales)
dates <- 1:100
returns <- runif(100)
yticks_val <- pretty_breaks(n=5)(returns)
plot(dates, returns, yaxt="n")
axis(2, at=yticks_val, lab=percent(yticks_val))
Highlights:
No need to explicitly add "%"
Manually fix the number of y-ticks to be consistent with further plots. Here I chose 5.
Combining two answers together #rengis #vladiim

Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.
Then I've tried doing:
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(rpd_hist$counts, log="xy", pch=20, col="blue")
It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.
A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies not probabilities -- add ,prob=TRUE to the call if you want probabilities.
As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:
plot(mydata_hist$count, log="y", type='h', lwd=10, lend=2)
gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
Another option would be to use the package ggplot2.
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
bp <- barplot(mydata_hist$count, log="y", col="white", names.arg=buckets)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
The last line is optional, it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.
I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
Run the hist() function without making a graph, log-transform the counts, and then draw the figure.
hist.data = hist(my.data, plot=F)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)
It should look just like the regular histogram, but the y-axis will be log2 Frequency.
I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.
The original problem would be solved with:
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
The function:
myhist <- function(x, ..., breaks="Sturges",
main = paste("Histogram of", xname),
xlab = xname,
ylab = "Frequency") {
xname = paste(deparse(substitute(x), 500), collapse="\n")
h = hist(x, breaks=breaks, plot=FALSE)
plot(h$breaks, c(NA,h$counts), type='S', main=main,
xlab=xlab, ylab=ylab, axes=FALSE, ...)
axis(1)
axis(2)
lines(h$breaks, c(h$counts,NA), type='s')
lines(h$breaks, c(NA,h$counts), type='h')
lines(h$breaks, c(h$counts,NA), type='h')
lines(h$breaks, rep(0,length(h$breaks)), type='S')
invisible(h)
}
Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.
Here's a pretty ggplot2 solution:
library(ggplot2)
library(scales) # makes pretty labels on the x-axis
breaks=c(0,1,2,3,4,5,25)
ggplot(mydata,aes(x = V3)) +
geom_histogram(breaks = log10(breaks)) +
scale_x_log10(
breaks = breaks,
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
Note that to set the breaks in geom_histogram, they had to be transformed to work with scale_x_log10

Resources