Inconsistent spacing in lattice dotplot between y values - r

I'm trying to use dotplot() from lattice to plot a data set where each category is only present in a subset of the panels, and I'm setting scales = list(y = list(relation = "free")) to avoid unnecessary vertical spacing. However, doing this seems to mess up the vertical spacing between items. What is more, it seems to be related to whether or not the categories overlap, since the error only occurs then.
library(lattice)
variables <- c(rep("Age", 4), rep("Sex", 2), rep("Children", 3))
levels <- c(1, 5, 100, 101, "Females", "Males", 2, 3, 90)
values <- rnorm(9)
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")))
You can clearly see that the spacing between, for example, 90 and 3 is off, whereas there is no issue with Males and Females. Now if I change the numeric categories so that they don't overlap, I get correct spacing.
levels <- c(1:4, "Females", "Males", 5:7)
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")))
Does anybody know what is going on and what I can do to fix this?

You can use a function by the author of lattice (see dotplot, dropping unused levels of 'y').
Quoting Deepayan Sarkar from that post:
"It's a bit problematic. Basically, you can use relation="free"/"sliced", but y behaves as as.numeric(y) would. So, if the small subset in each panel are always more or less contiguous (in terms of the levels being close to each other) then you would be fine. Otherwise you would not. In that case, you can still write your own prepanel and panel functions,"
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")),
prepanel = function(x, y, ...) {
yy <- y[, drop = TRUE]
list(ylim = levels(yy),
yat = sort(unique(as.numeric(yy))))
},
panel = function(x, y, ...) {
yy <- y[, drop = TRUE]
panel.dotplot(x, yy, ...)
})
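To see why the spacing breaks, it can help to look at the integer codes behind the factor. The following is a minimal sketch, assuming the levels and variables vectors from the question; the names f, codes and contiguous are only for illustration, and the level order assumes a typical locale where digits sort before letters.

```r
# Same data as in the question; c() coerces everything to character
variables <- c(rep("Age", 4), rep("Sex", 2), rep("Children", 3))
levels <- c(1, 5, 100, 101, "Females", "Males", 2, 3, 90)

# As a factor, the levels are sorted as strings,
# so "100" and "101" come before "2"
f <- factor(levels)

# Integer codes per panel: these are the positions that
# relation = "free" actually spaces out
codes <- split(as.numeric(f), variables)
codes

# A panel is evenly spaced only if its sorted codes are consecutive
contiguous <- sapply(codes, function(z) all(diff(sort(z)) == 1))
contiguous
```

With the question's data only the Sex panel should come out as contiguous, which matches the observation that Males and Females are spaced correctly while 3 and 90 are not.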

Related

How to color background in each panel of an xyplot according to a factor variable?

I'm trying to construct an xyplot with a different background color according to the values of an additional categorical variable. It is no problem to get repeating background coloring with the panel.xblocks function (package: latticeExtra), but up to now I have found no way to get different coloring for the different subplots of the xyplot.
JD <- c(seq(0,19, 1), seq(0,19, 1))
VAR <- c(rnorm(20, mean=10, sd=1), rnorm(20, mean=10, sd=1))
CATEG <- c(rep("A", 5), rep("B", 15), rep("A", 10), rep("B", 10))
YEAR <- c(rep(2001, 20), rep(2002, 20))
myd <- data.frame(JD, VAR, CATEG, YEAR)
xyplot((VAR) ~ JD | factor(YEAR), type="l",
xlab="", ylab="", col=1, data=myd)+
layer_(panel.xblocks(x, CATEG,
col = c("lightgray")))
Running the above code, the background coloring from the first xyplot subplot (year 2001) is repeated in the second subplot (year 2002). My aim is to get different background coloring according to the variable "CATEG" for the two subplots. Any suggestions welcome.
I think the panel.xblocks function is a good approach. The use of subscripts and groups is handy too but always requires some re-learning for me.
The conditioning request (|) generates subscripts. The groups argument is used to pass the CATEG values to the panel function. It isn't actually used for any grouping here. The ... in the panel function is not actually used either, but it's a good practice in case the code is changed and other functions need arguments passed down.
# Starting with data in 'myd' from above
# Load non-standard packages
library(lattice)
library(latticeExtra)
# Old school colors
myCol <- c("salmon", "lightgray")
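# Since R 4.0, data.frame() keeps strings as character, so make CATEG a factor
myd$CATEG <- factor(myd$CATEG)
names(myCol) <- levels(myd$CATEG)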
# To use a different color for each level of 'CATEG' in each panel:
obj1 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
groups = CATEG, xlab = "", ylab = "",
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[groups][subscripts])
panel.lines(x, y, col = 1)
})
# Here's a solution to a different problem (second plot):
# How to use a different color for the first level of 'CATEG' in each panel
obj2 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
xlab = "", ylab = "", groups = CATEG,
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[panel.number()][groups][subscripts])
panel.lines(x, y, col = 1)
})
# Plot in one device
plot(obj1, position = c(0, 0.45, 1, 1), more = TRUE)
plot(obj2, position = c(0, 0, 1, 0.55))
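The role of subscripts can be checked without drawing anything. This is a rough sketch, assuming the CATEG and YEAR vectors from the question; the subscripts are emulated by hand with which(), whereas lattice would supply them to the panel function itself, and the names subs_2001, col_2001 and so on are made up for illustration.

```r
# Data from the question
CATEG <- c(rep("A", 5), rep("B", 15), rep("A", 10), rep("B", 10))
YEAR  <- c(rep(2001, 20), rep(2002, 20))

# Colour lookup and grouping factor, as in the answer
myCol  <- c("salmon", "lightgray")
groups <- factor(CATEG)
names(myCol) <- levels(groups)

# lattice hands every panel the full 'groups' vector plus 'subscripts',
# the row numbers belonging to that panel (emulated here)
subs_2001 <- which(YEAR == 2001)
subs_2002 <- which(YEAR == 2002)

# Indexing by a factor uses its integer codes; subscripts then pick
# out the rows of the current panel
col_2001 <- unname(myCol[groups][subs_2001])
col_2002 <- unname(myCol[groups][subs_2002])

table(col_2001)
table(col_2002)
```

The two tables differ, which is why each panel ends up with its own background pattern instead of a repeat of the first one.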

R levelplot: color green-white-red (white on 0) according to one variable, but show the values of another variable

The title is pretty much self-descriptive. I want to do a heatmap-like plot with lattice, showing the data values as well, something like in here
However, in my case, I want to color the plot according to one variable (fold.change), but show the values of another variable (p.value).
It would be optimal that the color range is green-white-red, and the white is on 0 (negative fold.change values in green, and positive ones in red).
My last question would be how to change the text size of the title and axis text, remove axis title, and rotate x axis text 45 degrees; I don't find this information in the documentation. Thanks!
This is my MWE so far:
library(lattice)
library(latticeExtra)
library(RColorBrewer)
pv.df <- data.frame(compound = rep(LETTERS[1:8], each = 3),
comparison = rep(c("a/b", "b/c", "a/c"), 8),
p.value = runif(24, 0, 1),
fold.change = runif(24, -2, 6))
myPanel <- function(x, y, z, ...) {
panel.levelplot(x,y,z,...)
panel.text(x, y, round(z,1))
}
cols <- rev(colorRampPalette(brewer.pal(6, "RdYlGn"))(20))
png(filename = "test.png", height = 1000, width = 600)
print(
levelplot(fold.change ~ comparison*compound,
pv.df,
panel = myPanel,
col.regions = cols,
colorkey = list(col = cols,
at = do.breaks(range(pv.df$fold.change), 20)),
scales = list(x = list(rot = 90)),
main = "Total FAME abundance - TREATMENT",
type = "g")
)
dev.off()
Which produces this plot:
Thanks!
There are several parts to your question. Let's address them one by one:
1: Change labels. This can be done by varying the 3rd argument for panel.text():
myPanel <- function(x, y, z, ...) {
panel.levelplot(x, y, z, ...)
panel.text(x, y, round(pv.df$p.value, 2))
}
2: Change color scale with white positioned at 0. Calculate how long each segment of the color scale should be, then define each segment separately:
color.ramp.length <- 20
negative.length <- round(abs(range(pv.df$fold.change)[1]) /
diff(range(pv.df$fold.change)) *
color.ramp.length)
positive.length <- color.ramp.length - negative.length
cols <- c(colorRampPalette(c("seagreen", "white"))(negative.length),
colorRampPalette(c("white", "firebrick"))(positive.length))
(Note: you can use other color options from here. I just find the colors associated with "red" / "green" an eye sore.)
3: Modify axis titles / labels. Specify the relevant arguments in levelplot().
levelplot(fold.change ~ comparison*compound,
pv.df,
panel = myPanel,
col.regions = cols,
colorkey = list(col = cols,
at = do.breaks(range(pv.df$fold.change),
color.ramp.length)),
xlab = "", ylab = "", # remove axis titles
scales = list(x = list(rot = 45), # change rotation for x-axis text
cex = 0.8), # change font size for x- & y-axis text
main = list(label = "Total FAME abundance - TREATMENT",
cex = 1.5)) # change font size for plot title
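As a worked check of the arithmetic in step 2, here is a small sketch with a fixed, hypothetical range of -2 to 6 (the limits of the runif() call in the question, not the actual data range). It assumes the minimum is negative and the maximum positive, and uses seq() as a stand-in for lattice's do.breaks() so that it runs without lattice.

```r
# Hypothetical range, matching runif(24, -2, 6) in the question
rng <- c(-2, 6)
color.ramp.length <- 20

# Share of the range below zero: 2 / 8 = 0.25, so 5 of the 20 colours
negative.length <- round(abs(rng[1]) / diff(rng) * color.ramp.length)
positive.length <- color.ramp.length - negative.length

# Two separate ramps that meet at white
cols <- c(colorRampPalette(c("seagreen", "white"))(negative.length),
          colorRampPalette(c("white", "firebrick"))(positive.length))

# Equally spaced break points for the colour key
breaks <- seq(rng[1], rng[2], length.out = color.ramp.length + 1)

# The colours on either side of the break at 0 are both white
cols[negative.length + 0:1]
breaks[negative.length + 1]
```

With real data the zero point will usually not fall exactly on a break, so white ends up near 0 rather than exactly on it; a larger color.ramp.length reduces the offset.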

How to plot multiple datasets with errbar?

This is what I have done so far
library(Hmisc)
m1 <- read.table("mt7.1r1.rp", header = FALSE)
m2 <- read.table("mt7.1r2.rp", header = FALSE)
m3 <- read.table("mt7.2r1.rp", header = FALSE)
m4 <- read.table("mt7.2r2.rp", header = FALSE)
p1=m1[1]
per1=log10(p1)
ixxr=m1[3]
ixxi=m1[4]
p2=m2[1]
per2=log10(p2)
ixyr=m2[3]
ixyi=m2[4]
p3=m3[1]
per3=log10(p3)
iyxr=m3[3]
iyxi=m3[4]
p4=m4[1]
per4=log10(p4)
iyyr=m4[3]
iyyi=m4[4]
erxx=m1[5]
erxy=m2[5]
eryx=m3[5]
eryy=m4[5]
xmin <- floor(min(per1,per2,per3,per4))
xmax <- ceiling(max(per1,per2,per3,per4))
ymin <- floor(min(ixxr,ixxi))
ymax <- ceiling(max(ixxr,ixxi))
per1=unname(per1)
ixxr=unname(ixxr)
ixxi=unname(ixxi)
erxx=unname(erxx)
per1=unlist(per1)
ixxr=unlist(ixxr)
ixxi=unlist(ixxi)
erxx=unlist(erxx)
errbar(per1,ixxr,ixxr+erxx,ixxr-erxx,col='red',xlabel='Per (s)',ylabel='Zxx/Zxy')
par(new = T)
errbar(per1,ixxi,ixxi+erxx,ixxi-erxx,col='green')
But I got this image:
Y-axis from two datasets are overlapping. How to prevent this?
I want to have a unique axis in min,max range with one single label.
Should I group the data before the plotting or...?
Adding yaxt = 'n' to one of the two plots (I did it for the first one) suppresses its y axis. To get just one y label, set ylab = NA in one plot and give the label in the other (or vice versa).
errbar(per1,ixxr,ixxr+erxx,ixxr-erxx,col='red', xlab='Per (s)',
yaxt = 'n', ylab = NA)
par(new = TRUE) # overlay the second plot on the first, as in the question
errbar(per1,ixxi,ixxi+erxx,ixxi-erxx,col='green', xlab = NA, ylab = 'ixxr and ixxi')
It would be good practice to compute the common range of the y values and set it through ylim in both calls, so that both series share one scale and everything is shown on the plot.
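The common range mentioned above could be computed along these lines. This is only a sketch: the vectors are made-up stand-ins for per1, ixxr, ixxi and erxx from the question, since the original data files are not available.

```r
# Made-up stand-ins for the vectors in the question
set.seed(1)
per1 <- seq(-2, 2, length.out = 10)
ixxr <- rnorm(10, mean = 5)
ixxi <- rnorm(10, mean = -3)
erxx <- runif(10, 0.1, 0.5)

# One range that covers both series including their error bars
ylim <- range(ixxr + erxx, ixxr - erxx, ixxi + erxx, ixxi - erxx)
ylim

# The same limits would then go into both errbar() calls, e.g.
# errbar(per1, ixxr, ixxr + erxx, ixxr - erxx, ylim = ylim, ...)
```

Passing identical limits to both calls keeps the single visible y axis valid for both data sets.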

Break axis in R

Recently I had some problems representing some data with xyplot. Everything looks fine, but my boss asked me to break the axis (I'm not a big fan of broken axes). So far I have been able to do this with the function shingle, but the order of the panels is such a mess that reading the information is impossible. In addition, I would like the strips to show only the variable cs, not st (both are in my sample data.frame). The real challenge is to fit all these requirements into lattice, which is the internal standard for graphics in the group.
Here is my example data.frame (http://1drv.ms/1JOKSPU) and a sample of my code:
AA<-read.csv("~/example.csv",header = T,sep = ";",dec = ",")
AA$st <-
shingle(Indl.Data$t,
intervals = rbind(c(0, 14),
c(23, 26)))
AA<-AA[with(AA, order(t)), ]
my.panel.1 <- function(x, y, subscripts, col, pch,cex,sd,...) {
low95 <- y-sd[subscripts]
up95 <- y+sd[subscripts]
panel.xyplot(x, y, col=col, pch=pch,cex=cex, ...)
panel.arrows(x, low95, x, up95, angle=90, code=3,lwd=3,
length=0.05, alpha=0.5,col=col)
}
xyplot( logOD~t|cs+st,
data=AA,
strip = T,
sd=0,
groups=cs,
xlab = list("Time ", cex=1.5),
ylab = list("growth", cex=1.5),
type="p",
col=c("red","black"),
scales = list(x = "free"), between = list(x = 0.5),
panel.groups="my.panel.1",
panel="panel.superpose",
par.settings = list(layout.widths = list(panel = c(6, 2))))
This is what I get with this:
Sorry in advance if there is any mistake in the formulation of my question; I do not have a programming background and this is my third question.
Cheers;
You want to go through st and then cs. So try xyplot( logOD~t|st+cs,.
Maybe use layout=c(4,3) if needed. And as.table=TRUE if you want it to start in upper left.

How to overlay density plots in R?

I would like to overlay 2 density plots on the same device with R. How can I do that? I searched the web but I didn't find any obvious solution.
My idea would be to read data from a text file (columns) and then use
plot(density(MyData$Column1))
plot(density(MyData$Column2), add=T)
Or something in this spirit.
use lines for the second one:
plot(density(MyData$Column1))
lines(density(MyData$Column2))
make sure the limits of the first plot are suitable, though.
ggplot2 is another graphics package that handles things like the range issue Gavin mentions in a pretty slick way. It also handles auto generating appropriate legends and just generally has a more polished feel in my opinion out of the box with less manual manipulation.
library(ggplot2)
#Sample data
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
, lines = rep(c("a", "b"), each = 100))
#Plot.
ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5)
Adding base graphics version that takes care of y-axis limits, add colors and works for any number of columns:
If we have a data set:
myData <- data.frame(std.normal=rnorm(1000, mean=0, sd=1),
wide.normal=rnorm(1000, mean=0, sd=2),
exponent=rexp(1000, rate=1),
uniform=runif(1000, min=-3, max=3)
)
Then to plot the densities:
dens <- apply(myData, 2, density)
plot(NA, xlim=range(sapply(dens, "[", "x")), ylim=range(sapply(dens, "[", "y")))
mapply(lines, dens, col=1:length(dens))
legend("topright", legend=names(dens), fill=1:length(dens))
Which gives:
Just to provide a complete set, here's a version of Chase's answer using lattice:
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
, lines = rep(c("a", "b"), each = 100))
densityplot(~dens,data=dat,groups = lines,
plot.points = FALSE, ref = TRUE,
auto.key = list(space = "right"))
which produces a plot like this:
That's how I do it in base (it's actually mentioned in the comments on the first answer, but I'll show the full code here, including the legend, as I cannot comment yet...)
First you need to get the info on the max values for the y axis from the density plots. So you need to actually compute the densities separately first
dta_A <- density(VarA, na.rm = TRUE)
dta_B <- density(VarB, na.rm = TRUE)
Then plot them according to the first answer and define min and max values for the y axis that you just got. (I set the min value to 0)
plot(dta_A, col = "blue", main = "2 densities on one plot",
ylim = c(0, max(dta_A$y,dta_B$y)))
lines(dta_B, col = "red")
Then add a legend to the top right corner
legend("topright", c("VarA","VarB"), lty = c(1,1), col = c("blue","red"))
I took the above lattice example and made a nifty function. There is probably a better way to do this with reshape via melt/cast. (Comment or edit if you see an improvement.)
multi.density.plot=function(data,main=paste(names(data),collapse = ' vs '),...){
##combines multiple density plots together when given a list
df=data.frame();
for(n in names(data)){
idf=data.frame(x=data[[n]],label=rep(n,length(data[[n]])))
df=rbind(df,idf)
}
densityplot(~x,data=df,groups = label,plot.points = F, ref = T, auto.key = list(space = "right"),main=main,...)
}
Example usage:
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1),main='BN1 vs BN2')
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1))
You can use the ggjoy package. Let's say that we have three different beta distributions such as:
set.seed(5)
b1<-data.frame(Variant= "Variant 1", Values = rbeta(1000, 101, 1001))
b2<-data.frame(Variant= "Variant 2", Values = rbeta(1000, 111, 1011))
b3<-data.frame(Variant= "Variant 3", Values = rbeta(1000, 11, 101))
df<-rbind(b1,b2,b3)
You can get the three different distributions as follows:
library(tidyverse)
library(ggjoy)
ggplot(df, aes(x=Values, y=Variant))+
geom_joy(scale = 2, alpha=0.5) +
scale_y_discrete(expand=c(0.01, 0)) +
scale_x_continuous(expand=c(0.01, 0)) +
theme_joy()
Whenever there are issues of mismatched axis limits, the right tool in base graphics is to use matplot. The key is to leverage the from and to arguments to density.default. It's a bit hackish, but fairly straightforward to roll yourself:
set.seed(102349)
x1 = rnorm(1000, mean = 5, sd = 3)
x2 = rnorm(5000, mean = 2, sd = 8)
xrng = range(x1, x2)
#force the x values at which density is
# evaluated to be the same between 'density'
# calls by specifying 'from' and 'to'
# (and possibly 'n', if you'd like)
kde1 = density(x1, from = xrng[1L], to = xrng[2L])
kde2 = density(x2, from = xrng[1L], to = xrng[2L])
matplot(kde1$x, cbind(kde1$y, kde2$y))
Add bells and whistles as desired (matplot accepts all the standard plot/par arguments, e.g. lty, type, col, lwd, ...).
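A quick way to confirm what from and to buy you is to compare the evaluation grids directly. This sketch reuses the simulated x1 and x2 from above; the type = "l" and axis labels in the final call are optional additions, not part of the original answer.

```r
set.seed(102349)
x1 <- rnorm(1000, mean = 5, sd = 3)
x2 <- rnorm(5000, mean = 2, sd = 8)
xrng <- range(x1, x2)

# Without from/to, each estimate is evaluated on its own grid
same_default <- identical(density(x1)$x, density(x2)$x)

# With shared from/to (and the default n = 512) the grids coincide,
# so the y values can be bound column-wise
kde1 <- density(x1, from = xrng[1], to = xrng[2])
kde2 <- density(x2, from = xrng[1], to = xrng[2])
same_shared <- identical(kde1$x, kde2$x)

c(default = same_default, shared = same_shared)

# type = "l" draws curves instead of matplot's default plotting symbols
matplot(kde1$x, cbind(kde1$y, kde2$y), type = "l", lty = 1,
        xlab = "x", ylab = "Density")
```

Because matplot computes its limits from the whole matrix, both curves fit on the plot without any manual ylim.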
