Plot with two Y axes : confidence intervals - r

I am trying to plot several points with error bars, with two y axes.
However, every call to the plotCI or errbar functions starts a new plot, with or without a preceding par(new=TRUE) call.
require(plotrix)
x <- 1:10
y1 <- x + rnorm(10)
y2<-x+rnorm(10)
delta <- runif(10)
plotCI(x,y=y1,uiw=delta,xaxt="n",gap=0)
axis(side=1,at=c(1:10),labels=rep("a",10),cex=0.7)
par(new=TRUE)
axis(4)
plotCI(x,y=y2,uiw=delta,xaxt="n",gap=0)
I have also tried the twoord.plot function from plotrix, but I don't think it's possible to add the error bars.
With ggplot2 I have only managed to plot in two different panels with the same Y axis.
Is there a way to do this?

Use add=TRUE. From the function's help page:
add: If FALSE (default), create a new plot; if TRUE, add error bars to an existing plot.
For example, the last line becomes:
plotCI(x,y=y2,uiw=delta,xaxt="n",gap=0,add=TRUE)
PS: This is hard to do with ggplot2. Take a look at this code by Hadley.
EDIT
To give the second series its own scale, redefine the user coordinate system with par(usr=...) before adding it. Here I set the new y limits manually.
plotCI(x,y=y1,uiw=delta,xaxt="n",gap=0)
axis(side=1,at=c(1:10),labels=rep("a",10),cex=0.7)
usr <- par("usr")
par(usr=c(usr[1:2], -1, 20))
plotCI(x,y=y2,uiw=delta,xaxt="n",gap=0,add=TRUE,col='red')
axis(4,col.ticks ='red')
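If plotrix is not available, the same idea can be sketched in base graphics alone. This is only an illustration under assumptions: error_bars is a hypothetical helper (not part of any package), y2 is deliberately put on a different scale, and par(new=TRUE) is followed by a full plot() call with axes=FALSE so that the second series gets its own coordinate system.

```r
# Sketch: two series with error bars, each on its own y axis (base R only).
set.seed(1)
x <- 1:10
y1 <- x + rnorm(10)
y2 <- 10 * x + rnorm(10)      # made-up second series on a different scale
delta <- runif(10)

# Hypothetical helper: vertical error bars drawn with arrows()
error_bars <- function(x, y, w, ...) {
  arrows(x, y - w, x, y + w, angle = 90, code = 3, length = 0.03, ...)
}

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # leave room for the right-hand axis
plot(x, y1, ylim = range(y1 - delta, y1 + delta), pch = 19,
     xlab = "x", ylab = "y1")
error_bars(x, y1, delta)

par(new = TRUE)                        # next plot() draws over the current one
plot(x, y2, ylim = range(y2 - delta, y2 + delta), pch = 19, col = "red",
     axes = FALSE, xlab = "", ylab = "")
error_bars(x, y2, delta, col = "red")
axis(4, col = "red", col.axis = "red") # right-hand axis for the second series
mtext("y2", side = 4, line = 3, col = "red")

usr_right <- par("usr")                # user coordinates of the second plot
par(op)
```

The ylim of each plot is computed from the bar ends rather than the points, so no error bar is clipped.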

Related

Multiple Pen's Parade Graphs on the same Plot

I'm doing stochastic dominance analysis with different income distributions using Pen's Parade. I can plot a single Pen's Parade using the Pen function from the ineq package, but I need a visual comparison and I want multiple lines in the same image. I don't know how to extract the values from the function, so I can't do this.
I have the following reproducible example:
set.seed(123)
x <- rnorm(100)
y <- rnorm(100, mean = 0.2)
library(ineq)
Pen(x)
Pen(y)
I obtain the following plots:
I want to obtain something like the following:
You can use add = TRUE:
set.seed(123)
x <- rnorm(100)
y <- rnorm(100, mean = 0.2)
library(ineq)
Pen(x); Pen(y, add = TRUE)
From help("Pen"):
add logical. Should the plot be added to an existing plot?
While the solution mentioned by M-M in the comments is a more general solution, in this specific case it produces a busy Y axis:
Pen(x)
par(new = TRUE)
Pen(y)
I would generalize the advice for plotting functions in this way:
Check the plotting function's help file. If it has an add argument, use that.
Otherwise, use the par(new = TRUE) technique
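The first step of that advice can also be checked programmatically. The sketch below assumes a hypothetical helper name, has_add, and only inspects the formal arguments of the function it is given; for an S3 generic you would need to pass the relevant method instead, since the generic itself often has only x and ... as arguments.

```r
# Hypothetical helper: does a plotting function declare an `add` argument?
has_add <- function(f) "add" %in% names(formals(f))

curve_has_add <- has_add(curve)          # curve() declares add
plot_has_add  <- has_add(plot.default)   # plot.default() does not

curve_has_add
plot_has_add
```

When the check fails, the par(new = TRUE) technique is the fallback.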
Update
As M-M helpfully mentions in the comments, their more general solution will not produce a busy Y axis if you manually suppress the Y axis on the second plot:
Pen(x)
par(new = TRUE)
Pen(y, yaxt = "n")
Looking at ?ineq::Pen(), it seems to work like plot(); therefore, the following works for you.
Pen(x)
Pen(y, add=T)
Note: however, add=T cuts off part of your data, since the second plot has points that fall outside the limits of the first.
Update on using par(new=T):
Using par(new=T) basically means overlaying two plots on top of each other; hence, it is important to draw them on the same scale. We can achieve that by setting the same axis limits. Likewise, when using the add=T argument, it is best to set the axis limits on the first plot so that no part of the data is lost. This is the best practice for overlaying two plots.
Pen(x, ylim=c(0,38), xlim=c(0,1))
par(new=T)
Pen(y, col="red", ylim=c(0,38), xlim=c(0,1), yaxt='n', xaxt='n')
Essentially, you can do the same with add=T.
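The hard-coded limits above can also be derived from the data. The following is a minimal base-R sketch, not ineq's implementation: it uses sorted values plotted against ppoints() as a stand-in for a parade-style curve, and computes one shared ylim with range() so that both overlaid plots use the same scale.

```r
set.seed(123)
x <- rnorm(100)
y <- rnorm(100, mean = 0.2)

xlim <- c(0, 1)
ylim <- range(x, y)          # one y range that covers both samples
p <- ppoints(100)            # plotting positions in (0, 1)

plot(p, sort(x), type = "l", xlim = xlim, ylim = ylim,
     xlab = "quantile", ylab = "value")
par(new = TRUE)
plot(p, sort(y), type = "l", col = "red", xlim = xlim, ylim = ylim,
     axes = FALSE, xlab = "", ylab = "")   # no second set of axes
```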

log="y" convert only y-axis label or also y-coordinates of my data?

I'm building a plot in R and I have used the plot() function, with log="y" parameter.
Does that mean that ONLY the y-axis labels will be converted in log scale OR that also the y-coordinates of my data will be converted in log-scale?
Thank you
When using log = "y" it plots the log-transformed y-values with the labels on the original scale -- the opposite of what you seem to suggest.
Compare these three plots:
x <- rnorm(50)
y <- 2*exp(x) + rexp(50)
plot(x, y) # y-scale, y-scale-labels
plot(x, y, log = "y") # log-y-scale, y-scale-labels
plot(x, log(y)) # log-y-scale, log-y-scale labels
Notice that the last two plots differ only in the y-axis labels. Both are still correct, as the axis titles are also different.
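One way to confirm this from inside R is to inspect the graphics parameters after plotting. A small sketch with made-up data: par("ylog") reports whether the y axis is logarithmic, and par("usr") then holds the y limits in log10 units, while the data you pass in stay untransformed.

```r
x <- 1:10
y <- 10^(x / 2)               # made-up data spanning several decades

plot(x, y, log = "y")

ylog_flag <- par("ylog")      # is the y axis logarithmic?
usr <- par("usr")             # plot limits; entries 3:4 are in log10 units here

ylog_flag
usr[3:4]
range(log10(y))               # compare with the y limits above
```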

plot lines instead of points R

This is probably a simple question, but I'm not able to find the solution for it.
I have the following plot (I'm using plotCI since I'm not able to fill the points with plot()).
leg<-c("1","2","3","4","5","6","7","8")
Col.rar1<-c(rgb(1,0,0,0.7), rgb(0,0,1,0.7), rgb(0,1,1,0.7),rgb(0.6,0,0.8,0.7),rgb(1,0.8,0,0.7),rgb(0.4,0.5,0.6,0.7),rgb(0.2,0.3,0.2,0.7),rgb(1,0.3,0,0.7))
library(plotrix)
plotCI(test$size,test$Mean,
pch=c(21), pt.bg=Col.rar1,xlab="",ylab="", ui=test$Mean,li= test$Mean)
legend(4200,400,legend=leg,pch=c(21),pt.bg=Col.rar1, bty="n", cex=1)
I want to create the same effect but with lines instead of points (a continuous line).
Any suggestions?
You have two options:
Use the lines() function, which draws lines between (x, y) locations.
Use plot() with type = "l" (for "line").
It is hard to show without a reproducible example, but you can do, for example:
Col.rar1<-c(rgb(1,0,0,0.7), rgb(0,0,1,0.7), rgb(0,1,1,0.7),rgb(0.6,0,0.8,0.7),rgb(1,0.8,0,0.7),rgb(0.4,0.5,0.6,0.7),rgb(0.2,0.3,0.2,0.7),rgb(1,0.3,0,0.7))
x <- seq(0, 5000, length.out=10)
y <- matrix(sort(rnorm(10*length(Col.rar1))), ncol=length(Col.rar1))
plot(x, y[,1], ylim=range(y), ann=FALSE, axes=T,type="l", col=Col.rar1[1])
lapply(seq_along(Col.rar1),function(i){
lines(x, y[,i], col=Col.rar1[i])
points(x, y[,i]) # this is optional
})
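When each group is a column of a matrix, as in the example above, base R's matplot() can draw all the lines in one call. This is a sketch on made-up data with a shortened colour vector; the axis labels are placeholders, not taken from the asker's test data frame.

```r
cols <- c(rgb(1, 0, 0, 0.7), rgb(0, 0, 1, 0.7),
          rgb(0, 1, 1, 0.7), rgb(0.6, 0, 0.8, 0.7))

set.seed(1)
x <- seq(0, 5000, length.out = 10)
y <- matrix(sort(rnorm(10 * length(cols))), ncol = length(cols))

# One solid line per column of y, coloured by column
matplot(x, y, type = "l", lty = 1, col = cols, xlab = "size", ylab = "Mean")
legend("topleft", legend = seq_along(cols), col = cols, lty = 1, bty = "n")
```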
When it comes to generating plots where you want lines connected according to some grouping variable, you want to get away from base-R plots and check out lattice and ggplot2. Base-R plots don't have a simple concept of 'groups' in an xy plot.
A simple lattice example:
library( lattice )
dat <- data.frame( x=rep(1:5, times=4), y=rnorm(20), gp=rep(1:4,each=5) )
xyplot( y ~ x, dat, group=gp, type='b' )
You should be able to use something like this if you have a variable in test similar to the color vector you define.

Labelling logarithmic scale display in R

When plotting histograms, scatterplots and other plots with axes on a logarithmic scale in R, how can I use labels such as 10^-1, 10^0, 10^1, 10^2, 10^3 and so on, instead of the axes showing just -1, 0, 1, 2, 3, etc.? What parameters should be added to commands such as hist(), plot(), etc.?
Apart from the solution of ggplot2 (see gsk3's comment), I would like to add that this happens automatically in plot() as well when using the correct arguments, eg :
x <- 1:10
y <- exp(1:10)
plot(x,y,log="y")
You can use the parameter log="x" for the X axis, or log="xy" for both.
If you want to format the numbers, or you have the data in log format, you can do a workaround using axis(). Some interesting functions :
axTicks(x) gives you the location of the ticks on the X-axis (x=1) or Y-axis (x=2)
bquote() converts expressions to language, but can replace a variable with its value. More information on bquote() in the question Latex and variables in plot label in R? .
as.expression() makes the language object coming from bquote() an expression. This allows axis() to do the formatting as explained in ?plotmath. It can't do so with language objects.
An example for nice formatting :
x <- y <- 1:10
plot(x,y,yaxt="n")
aty <- axTicks(2)
labels <- sapply(aty,function(i)
as.expression(bquote(10^ .(i)))
)
axis(2,at=aty,labels=labels)
Here is a different way to draw this type of axis:
plot(NA, xlim=c(0,10), ylim=c(1, 10^4), xlab="x", ylab="y", log="y", yaxt="n")
at.y <- outer(1:9, 10^(0:4))
lab.y <- ifelse(log10(at.y) %% 1 == 0, at.y, NA)
axis(2, at=at.y, labels=lab.y, las=1)
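The two approaches can be combined: keep log="y" so the data are positioned on a log axis, and supply plotmath labels for the powers of ten. The sketch below uses made-up data and builds the labels with parse(text=...), which is one of several ways to obtain an expression vector for axis().

```r
y <- 10^seq(-1, 3, length.out = 20)      # made-up data from 10^-1 to 10^3

plot(seq_along(y), y, log = "y", yaxt = "n", xlab = "index", ylab = "y")

pow  <- -1:3
labs <- parse(text = paste0("10^", pow)) # expression vector: 10^-1, 10^0, ...
axis(2, at = 10^pow, labels = labs, las = 1)
```

Because the tick positions are given on the original scale (10^pow), the same labels work for log="x" with axis(1, ...).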
EDIT: This is also solved in latticeExtra with scale.components
In ggplot2 you can just add
... +
scale_x_log10() +
scale_y_log10(limits = c(1e-4,1), breaks=c(1e-4,1e-3,1e-2,0.1,1)) + ...
to scale your axes, label them, and add custom breaks.

Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.
Then I've tried doing:
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")
It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.
A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies, not probabilities; add prob=TRUE to the call if you want probabilities.
As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:
plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)
gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
Another option would be to use the package ggplot2.
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
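The frequency-polygon workaround can be sketched in base R as well. The data here are simulated (the asker's mydata is not available), and empty bins are dropped before plotting because a count of zero has no position on a log axis.

```r
set.seed(42)
v <- rexp(1000, rate = 2)                 # simulated stand-in for mydata$V3
buckets <- c(0, 1, 2, 3, 4, 5, 25)

h <- hist(v, breaks = buckets, plot = FALSE)
keep <- h$counts > 0                      # log(0) is -Inf, so skip empty bins

# Points joined by lines at the bin midpoints, on a log-scaled y axis
plot(h$mids[keep], h$counts[keep], log = "y", type = "b", pch = 20,
     xlab = "V3 (bin midpoint)", ylab = "Frequency (log scale)")
```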
Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
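# names.arg needs one label per bar, so label each bin by its two break points
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=paste(head(buckets, -1), tail(buckets, -1), sep="-"))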
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
The last line is optional, it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.
I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
Run the hist() function without making a graph, log-transform the counts, and then draw the figure.
hist.data = hist(my.data, plot=F)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)
It should look just like the regular histogram, but the y-axis will be log2 Frequency.
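One caveat with transforming the counts directly is that an empty bin gives log2(0) = -Inf. A possible variation, shown here on simulated data, is to plot log2(count + 1) instead; this is an assumption of this sketch rather than part of the answer above, and the axis label is changed to say so.

```r
set.seed(7)
my.data <- rexp(500)                       # simulated data

hist.data <- hist(my.data, plot = FALSE)
hist.data$counts <- log2(hist.data$counts + 1)   # finite even for empty bins

plot(hist.data, ylab = "log2(Frequency + 1)")
```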
I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.
The original problem would be solved with:
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
The function:
myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}
Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.
Here's a pretty ggplot2 solution:
library(ggplot2)
library(scales) # makes pretty labels on the x-axis
breaks=c(0,1,2,3,4,5,25)
ggplot(mydata, aes(x = V3)) +
  geom_histogram(breaks = log10(breaks)) +
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )
Note that to set the breaks in geom_histogram, they had to be transformed to work with scale_x_log10

Resources