Add jittered data points to lattice xYplot with error bars - r

I'm trying to plot error bars overlaid on jittered raw data points using xYplot from the Hmisc package. It seemed straightforward to call panel.stripplot from within the xYplot panel function. It works, but there is a strange glitch: I can't 'jitter' the data plotted with panel.stripplot. Let me illustrate my point:
library(reshape2)
library(Hmisc)
data(iris)
#get error bars
d <- melt(iris, id=c("Species"), measure=c("Sepal.Length"))
X <- dcast(d, Species ~ variable, mean)
SD <- dcast(d, Species ~ variable, sd)
SE = SD[,2]/1#this is wrong on purpose, to plot larger error bars
Lo = X[,2]-SE
Hi = X[,2]+SE
fin <- data.frame(X,Lo=Lo,Hi=Hi)
#plot the error bars combined with raw data points
quartz(width=5,height=7)
xYplot(Cbind(Sepal.Length, Lo, Hi) ~ numericScale(Species), fin,
type=c("p"), ylim=c(4,8),lwd=3, col=1,
scales = list(x = list(at=1:3, labels=levels(d$Species))),
panel = function(x, y, ...) {
panel.xYplot(x, y, ...)
panel.stripplot(d$Species, d$value, jitter.data = TRUE, cex=0.2, ...)
}
)
Which results in:
As you can see, the points are lined up vertically with the error bars, whereas I would like them to be slightly offset in the horizontal plane. I tried to tweak the factor and amount parameters of panel.stripplot, but that doesn't change anything. Any suggestions? Lattice-only solutions please, preferably using xYplot.

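Use horizontal=FALSE. By default panel.stripplot assumes horizontal=TRUE and applies the jitter along the y axis, which is why the points do not spread sideways:
panel.stripplot(d$Species, d$value,
                jitter.data = TRUE, cex = 0.2, horizontal = FALSE, ...)
Internally, this is just a call to:
panel.xyplot(d$Species, d$value, cex = 0.2, jitter.x = TRUE, ...)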
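For comparison, here is a minimal lattice-only sketch of the same idea without xYplot. It is not the asker's code: it assumes that stripplot() passes horizontal = FALSE on to the panel function when the factor is on the x axis, and it draws mean plus or minus one SD with panel.segments. The names m, s and p are introduced only for this illustration.

```r
library(lattice)

# Per-species mean and standard deviation of Sepal.Length
m <- tapply(iris$Sepal.Length, iris$Species, mean)
s <- tapply(iris$Sepal.Length, iris$Species, sd)

p <- stripplot(Sepal.Length ~ Species, data = iris,
               jitter.data = TRUE, factor = 1, cex = 0.4, col = "grey40",
               panel = function(x, y, ...) {
                 # Raw points; the jitter settings arrive through `...`
                 panel.stripplot(x, y, ...)
                 # Mean +/- SD drawn at the integer positions of the factor levels
                 panel.segments(seq_along(m), m - s, seq_along(m), m + s, lwd = 3)
                 panel.points(seq_along(m), m, pch = 16, col = 1)
               })
print(p)
```

The factor and amount arguments control the width of the jitter here, because they are forwarded to the underlying jitter call once the jitter is applied along x.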

Related

How to set x limits on varImpPlot

How can I change the x limits of a plot produced by varImpPlot from the randomForest package?
If I try
library(randomForest)
set.seed(4543)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars, ntree=1000, keep.forest=FALSE,
                          importance=TRUE)
varImpPlot(mtcars.rf, scale=FALSE, type=1, xlim=c(0,15))
I get the following error:
Error in dotchart(imp[ord, i], xlab = colnames(imp)[i], ylab = "", main = if (nmeas == : formal argument "xlim" matched by multiple actual arguments
This is because varImpPlot defines its own x limits, I think, but how could I get around this if I wanted to set the x limits myself (perhaps for consistency across plots)?
First I extracted the values using importance() (thanks to the suggestion from @dww):
impToPlot <- importance(mtcars.rf, scale=FALSE)
Then I plotted them using dotchart(), which allowed me to set the x limits manually (along with any other plot features I'd like):
dotchart(sort(impToPlot[,1]), xlim=c(0,15), xlab="%IncMSE")
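To get consistent x limits across several plots, the dotchart() call can be wrapped in a small helper. The sketch below is only an illustration: plot_importance is a hypothetical name, and the values plotted are absolute correlations with mpg used as a stand-in, not randomForest output, so that the example runs with base R alone.

```r
# Hypothetical helper: draw a sorted dot chart with caller-supplied x limits
plot_importance <- function(imp, xlim, xlab = "importance") {
  imp <- sort(imp)
  dotchart(imp, xlim = xlim, xlab = xlab)
  invisible(imp)
}

# Stand-in importance measures (NOT randomForest output): |correlation| with mpg
imp_a <- abs(cor(mtcars[, -1], mtcars$mpg))[, 1]
imp_b <- abs(cor(mtcars[, -1], mtcars$mpg, method = "spearman"))[, 1]

shared_xlim <- c(0, 1)            # same limits for both panels
op <- par(mfrow = c(1, 2))
a <- plot_importance(imp_a, shared_xlim, "|Pearson r| with mpg")
b <- plot_importance(imp_b, shared_xlim, "|Spearman rho| with mpg")
par(op)
```

With a real forest, the first argument would be a column of the matrix returned by importance(), as in the answer above.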

plot the first point of each group of panel data in lattice xyplot

I'm trying to create a line plot using groups and panels that superposes a symbol on the first value of each group. This attempt plots the first point of the first group only and does nothing for the other two groups.
library(lattice)
foo <- data.frame(x=-100:100, y=sin(c(-100:100)*pi/4))
xyplot( y~x | y>0, foo, groups=cut(x,3),
panel = function(x, y, subscripts, ...) {
panel.xyplot(x, y, subscripts=subscripts, type='l', ...)
# almost but not quite:
panel.superpose(x[1], y[1], subscripts=subscripts, cex=c(1,0), ...)
} )
General solutions would be appreciated, to allow plotting of specific points within each group and panel (e.g., first, middle, and endpoints).
(Thanks for this good lattice question.) You should use subscripts, because it is the mechanism for picking the data points that belong to each panel. Here you want the group membership of the points in the current panel: groups[subscripts]. Once you have the right grouping variable, you can use it to split your data and pick the first element of each group:
## first points
xx = tapply(x,groups[subscripts],'[',1)
yy = tapply(y,groups[subscripts],'[',1)
## end points
xx = tapply(x,groups[subscripts],tail,1)
yy = tapply(y,groups[subscripts],tail,1)
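The same tapply pattern extends to other positions, such as the middle point asked about in the question. A minimal sketch, where pick is a hypothetical helper and g stands in for groups[subscripts] inside a single panel:

```r
# Hypothetical helper: pick the first, middle, or last element of a vector
pick <- function(v, where = c("first", "middle", "last")) {
  where <- match.arg(where)
  i <- switch(where,
              first  = 1L,
              middle = (length(v) + 1L) %/% 2L,
              last   = length(v))
  v[i]
}

x <- -100:100
g <- cut(x, 3)                    # stands in for groups[subscripts]

first <- tapply(x, g, pick, where = "first")
mid   <- tapply(x, g, pick, where = "middle")
last  <- tapply(x, g, pick, where = "last")
```

Inside a panel function, the same calls would be applied to both x and y to obtain the coordinates to plot.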
Then you plot the points using panel.points (higher level than the basic panel.superpose).
library(lattice)
foo <- data.frame(x=-100:100, y=sin(c(-100:100)*pi/4))
xyplot( y~x | y>0, foo, groups=cut(x,3),
panel = function(x, y, subscripts, groups,...) {
panel.xyplot(x, y, subscripts=subscripts, type='l',groups=groups, ...)
# Note the use of `panel.points` and the grouping via groups[subscripts]
panel.points(tapply(x,groups[subscripts],'[',1),
tapply(y,groups[subscripts],'[',1),
cex=2,pch=20, ...)
} )
If you want to color the points according to their groups, you should add a groups argument to panel.points. (I leave this to you as an exercise.)
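One possible way to do the coloring is sketched below. It passes an explicit vector of colors to panel.points rather than a groups argument, and it assumes the lines are drawn with the colors from the "superpose.line" settings, so that reusing them makes the points match.

```r
library(lattice)
foo <- data.frame(x = -100:100, y = sin((-100:100) * pi / 4))

p <- xyplot(y ~ x | y > 0, foo, groups = cut(x, 3),
            panel = function(x, y, subscripts, groups, ...) {
              panel.xyplot(x, y, subscripts = subscripts, groups = groups,
                           type = "l", ...)
              g  <- groups[subscripts]      # group of each point in this panel
              xx <- tapply(x, g, "[", 1)    # first x per group (NA if a group is empty)
              yy <- tapply(y, g, "[", 1)    # first y per group
              # One color per group level, taken from the line settings
              cols <- trellis.par.get("superpose.line")$col
              cols <- rep(cols, length.out = nlevels(g))
              panel.points(xx, yy, col = cols, pch = 20, cex = 2)
            })
print(p)
```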

Visualize data using histogram in R

I am trying to visualize some data and in order to do it I am using R's hist.
Below are my data:
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now, as the more cunning of you might have guessed, these are the absolute values of some model's coefficients.
What I need is a histogram with the following axes:
x will show the coefficients (19 in total), along with their names.
y will show the value of each coefficient (as breaks?), with ylim set according to the min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, after consulting ?hist, I have been playing with the code below without success, so I am starting again from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the data. I also find it hard to use the breaks argument effectively.
Any suggestion will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot. For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
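As an illustration of the error-bar suggestion, here is a minimal base R sketch. It assumes a small linear model on the built-in mtcars data (not the asker's model) and draws one standard error either side of each estimate with arrows().

```r
# Example model on built-in data; the asker's model is not available
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
est <- coef(fit)
se  <- summary(fit)$coefficients[, "Std. Error"]
idx <- seq_along(est)

plot(idx, est, axes = FALSE, pch = 19, xlab = "", ylab = "estimate",
     ylim = range(est - se, est + se))
axis(2)
axis(1, at = idx, labels = names(est), las = 2)
# Flat-ended bars: angle = 90 with heads at both ends (code = 3)
arrows(idx, est - se, idx, est + se, angle = 90, code = 3, length = 0.05)
abline(h = 0, lty = 2)
```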
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
coef <- lm(mpg ~ ., data = mtcars)$coef
library(lattice)
barchart(coef, col = 'lightblue', horizontal = FALSE,
         ylim = range(coef), xlab = '',
         scales = list(y = list(labels = coef),
                       x = list(labels = names(coef))))
A base R dotchart might be good too,
dotchart(coef, pch = 19, xlab = 'value')
text(coef, seq(coef), labels = round(coef, 3), pos = 2)
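Because the coefficients in the question span several orders of magnitude, a bar plot on a log scale may be easier to read than a linear one. The sketch below uses the values copied from the question's output. The log axis and the coloring by sign are illustrative choices, not something the asker requested.

```r
# Coefficient values copied from the question's Janmodelnorm$coef output
coefs <- c("(Intercept)" = 1.165610e+00, LON = -1.277929e-01, LAT = -4.349831e-01,
           ME = -3.602961e-01, RAT = -7.189458e+00, DS = -1.856908e-04,
           DSA = 1.352052e-05, DSI = 4.811291e-05, DRNS = -1.055744e-02,
           DREW = -2.756525e-04, ASPNS = -2.202706e-01, ASPEW = -4.199914e-02,
           SI = 4.684091e-02, CUR = -8.634340e-01, W_180_270 = -2.479175e-02,
           W_0_360 = 2.409628e-01, W_90_180 = 5.459076e-03,
           W_0_180 = -9.892580e-03, NDVI = -5.378456e-02)

op <- par(mar = c(7, 4, 1, 1))    # extra bottom margin for rotated labels
barplot(abs(coefs), log = "y", las = 2,
        col = ifelse(coefs < 0, "pink", "lightblue"),
        ylab = "|coefficient| (log scale)")
par(op)
```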

superpose a histogram and an xyplot

I'd like to superpose a histogram and an xyplot representing the cumulative distribution function using R's lattice package.
I've tried to accomplish this with custom panel functions, but can't seem to get it right--I'm getting hung up on one plot being univariate and one being bivariate I think.
Here's an example with the two plots I want stacked vertically:
set.seed(1)
x <- rnorm(100, 0, 1)
discrete.cdf <- function(x, decreasing=FALSE){
  # sort x, honoring the decreasing argument
  x <- x[order(x, decreasing=decreasing)]
  result <- data.frame(rank=1:length(x), x=x)
  result$cdf <- result$rank/nrow(result)
  return(result)
}
my.df <- discrete.cdf(x)
chart.hist <- histogram(~x, data=my.df, xlab="")
chart.cdf <- xyplot(100*cdf~x, data=my.df, type="s",
ylab="Cumulative Percent of Total")
graphics.off()
trellis.device(width = 6, height = 8)
print(chart.hist, split = c(1,1,1,2), more = TRUE)
print(chart.cdf, split = c(1,2,1,2))
I'd like these superposed in the same frame, rather than stacked.
The following code doesn't work, nor do any of the simple variations of it that I have tried:
xyplot(cdf~x,data=cdf,
panel=function(...){
panel.xyplot(...)
panel.histogram(~x)
})
You were on the right track with your custom panel function. The trick is passing the correct arguments to the panel.* functions. For panel.histogram, this means not passing a formula and supplying an appropriate value for the breaks argument:
EDIT: proper percent values on the y-axis and the right plot types
xyplot(100*cdf~x,data=my.df,
panel=function(...){
panel.histogram(..., breaks = do.breaks(range(x), nint = 8),
type = "percent")
panel.xyplot(..., type = "s")
})
This answer is just a placeholder until a better answer comes.
The hist() function from the graphics package has an option called add. The following does what you want in the "classical" way:
plot(my.df$x, my.df$cdf * 100, type = "l")
hist(my.df$x, add = TRUE)
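In the base graphics approach, the histogram counts and the cumulative percentages live on different scales. One way to handle this, sketched here under the assumption that a second axis on the right is acceptable, is to draw the histogram first and then overlay the empirical CDF with par(new = TRUE).

```r
set.seed(1)
x  <- rnorm(100)
Fn <- ecdf(x)                              # empirical CDF as a function

op <- par(mar = c(5, 4, 2, 4) + 0.1)       # room for the right-hand axis
h  <- hist(x, col = "lightblue", main = "", xlab = "x")

# Overlay the CDF on the same x range, with its own 0-100 y scale
par(new = TRUE)
xs <- sort(x)
plot(xs, 100 * Fn(xs), type = "s", axes = FALSE, xlab = "", ylab = "",
     xlim = range(h$breaks), ylim = c(0, 100))
axis(4)
mtext("Cumulative percent of total", side = 4, line = 2.5)
par(op)
```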

Draw vertical ending of error bar line in dotplot

I am drawing dotplot() using lattice or Dotplot() using Hmisc. When I use default parameters, I can plot error bars without small vertical endings
--o--
but I would like to get
|--o--|
I know I can get
|--o--|
when I use centipede.plot() from plotrix or segplot() from latticeExtra, but those solutions don't give me such nice conditioning options as Dotplot(). I was trying to play with par.settings of plot.line, which works well for changing error bar line color, width, etc., but so far I've been unsuccessful in adding the vertical endings:
require(Hmisc)
mean = c(1:5)
lo = mean-0.2
up = mean+0.2
d = data.frame (name = c("a","b","c","d","e"), mean, lo, up)
Dotplot(name ~ Cbind(mean,lo,up),data=d,ylab="",xlab="",col=1,cex=1,
par.settings = list(plot.line=list(col=1),
layout.heights=list(bottom.padding=20,top.padding=20)))
Please, don't give me solutions that use ggplot2...
I've had this same need in the past, with barchart() instead of with Dotplot().
My solution then was to create a customized panel function that (1) first executes the original panel function, and (2) then uses panel.arrows() to add the error bars (using a two-headed arrow in which the edges of the head form a 90 degree angle with the shaft).
Here's what that might look like with Dotplot():
# Create the customized panel function
mypanel.Dotplot <- function(x, y, ...) {
panel.Dotplot(x,y,...)
tips <- attr(x, "other")
panel.arrows(x0 = tips[,1], y0 = y,
x1 = tips[,2], y1 = y,
length = 0.15, unit = "native",
angle = 90, code = 3)
}
# Use almost the same call as before, replacing the default panel function
# with your customized function.
Dotplot(name ~ Cbind(mean,lo,up),data=d,ylab="",xlab="",col=1,cex=1,
panel = mypanel.Dotplot,
par.settings = list(plot.line=list(col=1),
layout.heights=list(bottom.padding=20,top.padding=20)))
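The same panel.arrows() technique also works with plain lattice dotplot(), without Hmisc. The sketch below is only an illustration: the column names est, lower and upper are introduced here, and the panel function reads the bounds from the data frame d through subscripts.

```r
library(lattice)

d <- data.frame(name = factor(c("a", "b", "c", "d", "e")), est = 1:5)
d$lower <- d$est - 0.2
d$upper <- d$est + 0.2

p <- dotplot(name ~ est, data = d, xlab = "", ylab = "",
             xlim = range(d$lower, d$upper) + c(-0.3, 0.3),
             panel = function(x, y, subscripts, ...) {
               yy <- as.numeric(y)
               # Flat-ended bars: angle = 90 with heads at both ends (code = 3)
               panel.arrows(d$lower[subscripts], yy, d$upper[subscripts], yy,
                            angle = 90, code = 3, length = 0.05, col = 1)
               panel.points(x, yy, pch = 16, col = 1)
             })
print(p)
```

Because the bounds are indexed by subscripts, the same panel function should keep working when a conditioning variable is added to the formula.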
