Break axis in R

Recently I had some problems representing some data with xyplot. Everything looks fine, but my boss asked me to break the axis (I'm not a big fan of broken axes). So far I have been able to do this with the function shingle, yet the order of the panels is such a mess that reading the information is impossible. In addition, I would like the strips on the graphic to show only the variable cs, not st (both are specified in my sample data.frame). The real challenge is to fit all these requirements into lattice, which is the internal standard for graphics in the group.
Here is my example data.frame (http://1drv.ms/1JOKSPU) and a sample of the code:
library(lattice)
AA <- read.csv("~/example.csv", header = TRUE, sep = ";", dec = ",")
# Shingle with two intervals of t, used to split the x axis into two panels
AA$st <-
  shingle(AA$t,
          intervals = rbind(c(0, 14),
                            c(23, 26)))
AA<-AA[with(AA, order(t)), ]
my.panel.1 <- function(x, y, subscripts, col, pch,cex,sd,...) {
low95 <- y-sd[subscripts]
up95 <- y+sd[subscripts]
panel.xyplot(x, y, col=col, pch=pch,cex=cex, ...)
panel.arrows(x, low95, x, up95, angle=90, code=3,lwd=3,
length=0.05, alpha=0.5,col=col)
}
xyplot( logOD~t|cs+st,
data=AA,
strip = T,
sd=0,
groups=cs,
xlab = list("Time ", cex=1.5),
ylab = list("growth", cex=1.5),
type="p",
col=c("red","black"),
scales = list(x = "free"), between = list(x = 0.5),
panel.groups="my.panel.1",
panel="panel.superpose",
par.settings = list(layout.widths = list(panel = c(6, 2))))
This is what I get with this:
Apologies in advance for any mistakes in how I have phrased the question; I do not have a programming background and this is my third question.
Cheers;

You want to go through st and then cs, so try xyplot(logOD ~ t | st + cs, ...).
Use layout = c(4, 3) if needed, and as.table = TRUE if you want the panels to start in the upper left.
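A minimal sketch of how the order of the conditioning variables drives the panel order. The data frame toy and its values are made up here; only the column names t, logOD, cs and st follow the question, and the layout assumes three levels of cs:

```r
library(lattice)

# Made-up data: 3 levels of cs, observed in two time windows (hypothetical values)
set.seed(1)
toy <- expand.grid(t = c(0:14, 23:26), cs = c("a", "b", "c"))
toy$logOD <- log(1 + toy$t) + rnorm(nrow(toy), sd = 0.1)
toy$st <- shingle(toy$t, intervals = rbind(c(0, 14), c(23, 26)))

# The first conditioning variable varies fastest, so "st + cs" puts the
# two time windows of each cs side by side, with one cs per row.
p <- xyplot(logOD ~ t | st + cs, data = toy,
            layout = c(2, 3),                         # 2 columns (st) by 3 rows (cs)
            as.table = TRUE,                          # fill from the upper left
            scales = list(x = list(relation = "free")),
            between = list(x = 0.5))
print(p)
```

With cs + st instead, the levels of cs would run across each row and the two time windows would end up in different rows.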

Related

How to color background in each panel of an xyplot according to a factor variable?

I'm trying to construct an xyplot whose background color differs according to the values of an additional categorical variable. Getting a repeating background color with the panel.xblocks function (package latticeExtra) is no problem, but so far I have found no way to color the individual subplots of the xyplot differently.
JD <- c(seq(0,19, 1), seq(0,19, 1))
VAR <- c(rnorm(20, mean=10, sd=1), rnorm(20, mean=10, sd=1))
CATEG <- c(rep("A", 5), rep("B", 15), rep("A", 10), rep("B", 10))
YEAR <- c(rep(2001, 20), rep(2002, 20))
myd <- data.frame(JD, VAR, CATEG, YEAR)
xyplot((VAR) ~ JD | factor(YEAR), type="l",
xlab="", ylab="", col=1, data=myd)+
layer_(panel.xblocks(x, CATEG,
col = c("lightgray")))
Running the above code, the background coloring from the first xyplot subplot (year 2001) is repeated in the second subplot (year 2002). My aim is to get a different background coloring in the two subplots, according to the variable "CATEG". Any suggestions welcome.
I think the panel.xblocks function is a good approach. The use of subscripts and groups is handy too but always requires some re-learning for me.
The conditioning request (|) generates subscripts. The groups argument is used to pass the CATEG values to the panel function. It isn't actually used for any grouping here. The ... in the panel function is not actually used either, but it's a good practice in case the code is changed and other functions need arguments passed down.
# Starting with data in 'myd' from above
# Load non-standard packages
library(lattice)
library(latticeExtra)
# Old school colors
myCol <- c("salmon", "lightgray")
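# factor() keeps this working when CATEG is a character column
# (the data.frame() default since R 4.0.0)
names(myCol) <- levels(factor(myd$CATEG))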
# To use a different color for each level of 'CATEG' in each panel:
obj1 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
groups = CATEG, xlab = "", ylab = "",
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[groups][subscripts])
panel.lines(x, y, col = 1)
})
# Here's a solution to a different problem (second plot):
# How to use a different color for the first level of 'CATEG' in each panel
obj2 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
xlab = "", ylab = "", groups = CATEG,
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[panel.number()][groups][subscripts])
panel.lines(x, y, col = 1)
})
# Plot in one device
plot(obj1, position = c(0, 0.45, 1, 1), more = TRUE)
plot(obj2, position = c(0, 0, 1, 0.55))

Inconsistent spacing in lattice dotplot between y values

I'm trying to use dotplot() from lattice to plot a data set where each category is only present for a subset of the data, and I'm calling scales = list(y = list(relation = "free")) to avoid unnecessary vertical spacing. However, doing this seems to mess up the vertical spacing between items. What is more, it seems to be related to whether or not the categories overlap, since the error only occurs when they do.
library(lattice)
variables <- c(rep("Age", 4), rep("Sex", 2), rep("Children", 3))
levels <- c(1, 5, 100, 101, "Females", "Males", 2, 3, 90)
values <- rnorm(9)
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")))
You can clearly see that the spacing between, for example, 90 and 3 is off, whereas there is no issue with Males and Females. Now if I change the categories that have numerical values so that they don't overlap, I get correct spacing.
levels <- c(1:4, "Females", "Males", 5:7)
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")))
Does anybody know what is going on and what I can do to fix this?
You can use a function by the author of lattice (see dotplot, dropping unused levels of 'y').
Quoting Deepayan Sarkar from that post:
"It's a bit problematic. Basically, you can use relation="free"/"sliced", but y behaves as as.numeric(y) would. So, if the small subset in each panel are always more or less contiguous (in terms of the levels being close to each other) then you would be fine. Otherwise you would not. In that case, you can still write your own prepanel and panel functions,"
dotplot(levels ~ values | variables, layout = c(1,3),
scales = list(y = list(relation = "free")),
prepanel = function(x, y, ...) {
yy <- y[, drop = TRUE]
list(ylim = levels(yy),
yat = sort(unique(as.numeric(yy))))
},
panel = function(x, y, ...) {
yy <- y[, drop = TRUE]
panel.dotplot(x, yy, ...)
})
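To see why the spacing goes wrong, it may help to look at the integer codes behind the factor. The sketch below uses the data from the question and only base R; it assumes that dotplot() effectively treats y as one factor shared by all panels, which is what the quote describes. The names levels_chr, codes_shared and codes_dropped are made up for the illustration:

```r
# Data from the question
variables <- c(rep("Age", 4), rep("Sex", 2), rep("Children", 3))
levels_chr <- c(1, 5, 100, 101, "Females", "Males", 2, 3, 90)

# One factor shared by all panels: the levels are sorted as text
y <- factor(levels_chr)
levels(y)

# Codes used as y positions in the "Children" panel. Levels that belong
# to other panels sit in between, so the codes have a gap.
children <- y[variables == "Children"]
codes_shared <- sort(as.numeric(children))
codes_shared

# After dropping the unused levels the codes are contiguous
codes_dropped <- sort(as.numeric(droplevels(children)))
codes_dropped
```

droplevels(children) does the same job here as y[, drop = TRUE] in the prepanel and panel functions above.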

Using Bxp function in R with varwidth

I am quite new to R programming and have been given the task of representing some data in a boxplot. We were only provided with the five-number summary of the data, i.e. the lowest value, lower quartile, median, upper quartile and highest value. We are also told the number of samples (n).
I read that bxp is a function similar to boxplot, but one that draws the boxplot from this five-number summary.
However, I know varwidth can be used to make the width of the boxes scale with n, yet it does not seem to work here, as all boxes have the same width. This is what I need help with.
MORSEYear1 <- c(18.2,58.5,64.4,73.4,91.1)
MORSEYear2 <- c(22.3,56.4,64.3,75.7,97.4)
MORSEYear3 <- c(29.1,57.9,66.6,73.4,86.0)
MathStatYear1 <- c(46.8,54.8,66.1,71.4,84.1)
MathStatYear2 <- c(35.1,47.8,57.8,65.7,82.8)
MathStatYear3 <- c(32.6,56.3,61.1,75.6,89.4)
# stats must be a matrix with 5 rows and one column per box
MORSE1 <- list(stats = matrix(MORSEYear1, ncol = 1), n = 139)
MORSE2 <- list(stats = matrix(MORSEYear2, ncol = 1), n = 132)
MORSE3 <- list(stats = matrix(MORSEYear3, ncol = 1), n = 131)
MS1 <- list(stats = matrix(MathStatYear1, ncol = 1), n = 21)
MS2 <- list(stats = matrix(MathStatYear2, ncol = 1), n = 20)
MS3 <- list(stats = matrix(MathStatYear3, ncol = 1), n = 14)
bxp(MORSE1, xlim = c(0.5,6.5),ylim = c(0,100),varwidth= TRUE, main = "Graph comparing distribution of marks across different years of MORSE and MathStat",ylab = "Marks", xlab = "Course and year of study (Course,Year)", axes = FALSE)
par(new=T)
bxp(MORSE2, xlim = c(-0.5,5.5), ylim = c(0,100),axes= TRUE, varwidth=TRUE)
par(new=T)
bxp(MORSE3, xlim = c(-1.5,4.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS1, xlim = c(-2.5,3.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS2, xlim = c(-3.5,2.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS3, xlim = c(-4.5,1.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
NOTE: My supervisor said to use par(new=T) and change the xlim to plot multiple graphs using bxp(), if someone could verify if this is the best method or not that would be great!
Thanks
Stumbled upon the same problem, without much experience with R.
The varwidth argument of the bxp() function requires multiple boxplots being plotted at once. Adding to an initial plot does not count, as no readjustment is possible after the fact.
The question is how to construct a multidimensional z argument for bxp(). To answer this, a look at the result of something like boxplot(c(c(1,1),c(2,2))~c(c(11,11),c(22,22))) helps.
First, a generic example with made-up data to aid anyone that lands here:
# data
d1 <- c(1,2,3,4,5)
d2 <- c(1,2,3,5,8,13,21,34)
# summaries (generated with quantile and structured accordingly)
z1 <- list(
stats=matrix(quantile(d1, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d1)
)
z2 <- list(
stats=matrix(quantile(d2, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d2)
)
# merging the summaries appropriately
z <- list(
stats=cbind(z1$stats,z2$stats),
n=c(z1$n,z2$n)
)
# check result
print(z)
# call bxp with needed parameters ("at" can/should also be used here)
bxp(z=z,varwidth=TRUE)
In the case of the original question, one should merge MORSE# and MS#. The code is far from optimal - there might be a better way to merge and a function for this can be written, but the aim is ugly clarity and simplicity:
z <- list(
stats=cbind(MORSE1$stats, MORSE2$stats, MORSE3$stats, MS1$stats, MS2$stats, MS3$stats),
n=c(MORSE1$n, MORSE2$n, MORSE3$n, MS1$n, MS2$n, MS3$n)
)
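As a sketch of the same idea without the intermediate lists, the five-number summaries from the question can be bound column-wise in one step and passed to a single bxp() call. The column labels (M1 to MS3) and the names component are choices made here for illustration, not something the question specifies:

```r
# Five-number summaries from the question, one column per box
stats <- cbind(
  M1  = c(18.2, 58.5, 64.4, 73.4, 91.1),
  M2  = c(22.3, 56.4, 64.3, 75.7, 97.4),
  M3  = c(29.1, 57.9, 66.6, 73.4, 86.0),
  MS1 = c(46.8, 54.8, 66.1, 71.4, 84.1),
  MS2 = c(35.1, 47.8, 57.8, 65.7, 82.8),
  MS3 = c(32.6, 56.3, 61.1, 75.6, 89.4)
)

# Sample sizes from the question, in the same order as the columns
n <- c(139, 132, 131, 21, 20, 14)

z <- list(stats = stats, n = n, names = colnames(stats))

# One call draws all six boxes, so varwidth can compare the n values
bxp(z, varwidth = TRUE, ylim = c(0, 100),
    ylab = "Marks", xlab = "Course and year of study")
```

Because all boxes are drawn in one call, there is no need for par(new = TRUE) or shifted xlim values.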

Colour gradient on a map in R (using image)

I am trying to create a map just to get a concept across, not actually display real data. So far I have the following code:
library(maps)
image(x=-100:10, y = -10:80, z = outer(-360:-250, -10:80), xlab = "lon", ylab = "lat")
map("world", col="gray", fill=TRUE, add=TRUE)
box()
Which in part I pulled together from some other forum posts. It creates this.
The bit I am struggling with is that I want the graded red-yellow-white colours to run N to S (it is just to demonstrate the direction of a trend). They are nearly there, but I can't seem to get the configuration of 'z' right, and I have a feeling I am bodging it when there is a proper solution. For info, I also want to create the same map with the gradient running E to W, ideally in a different colour palette.
Many thanks in advance.
This seems to work for making the color more even.
x <- -100:10
y <- -10:80
r <- outer(x, y^3, "+")
image(x, y, z = r, col = rev(heat.colors(30)), xlab = "lon", ylab = "lat")
map("world", col = "grey", fill = TRUE, add = TRUE)
And to change the direction of the color, adjust r,
x <- -100:10
y <- -10:80
r <- outer(x^3, y, "+")
image(x, y, z = r, col = heat.colors(30), xlab = "lon", ylab = "lat")
map("world", col = "grey", fill = TRUE, add = TRUE)
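A simpler variant, offered as a sketch rather than as what the code above does: since image() draws z[i, j] at (x[i], y[j]), a z that depends on only one coordinate gives a gradient in exactly one direction. The names z_ns and z_ew and the choice of terrain.colors for the second palette are assumptions made here:

```r
x <- -100:10
y <- -10:80

# z ignores longitude, so the colour changes only from south to north
z_ns <- outer(x, y, function(lon, lat) lat)
image(x, y, z_ns, col = heat.colors(30), xlab = "lon", ylab = "lat")

# z ignores latitude, so the colour changes only from west to east
z_ew <- outer(x, y, function(lon, lat) lon)
image(x, y, z_ew, col = terrain.colors(30), xlab = "lon", ylab = "lat")
```

The map() call from above can be added after each image() call with add = TRUE, and wrapping the palette in rev() flips the direction of the gradient.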

How to overlay density plots in R?

I would like to overlay 2 density plots on the same device with R. How can I do that? I searched the web but I didn't find any obvious solution.
My idea would be to read data from a text file (columns) and then use
plot(density(MyData$Column1))
plot(density(MyData$Column2), add=T)
Or something in this spirit.
use lines for the second one:
plot(density(MyData$Column1))
lines(density(MyData$Column2))
make sure the limits of the first plot are suitable, though.
ggplot2 is another graphics package that handles things like the range issue Gavin mentions in a pretty slick way. It also handles auto generating appropriate legends and just generally has a more polished feel in my opinion out of the box with less manual manipulation.
library(ggplot2)
#Sample data
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
, lines = rep(c("a", "b"), each = 100))
#Plot.
ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5)
Adding a base graphics version that takes care of the y-axis limits, adds colors and works for any number of columns:
If we have a data set:
myData <- data.frame(std.normal=rnorm(1000, mean=0, sd=1),
                     wide.normal=rnorm(1000, mean=0, sd=2),
                     exponent=rexp(1000, rate=1),
                     uniform=runif(1000, min=-3, max=3)
                     )
Then to plot the densities:
dens <- apply(myData, 2, density)
plot(NA, xlim=range(sapply(dens, "[", "x")), ylim=range(sapply(dens, "[", "y")))
mapply(lines, dens, col=1:length(dens))
legend("topright", legend=names(dens), fill=1:length(dens))
Which gives:
Just to provide a complete set, here's a version of Chase's answer using lattice:
library(lattice)
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
                  , lines = rep(c("a", "b"), each = 100))
densityplot(~dens,data=dat,groups = lines,
            plot.points = FALSE, ref = TRUE,
            auto.key = list(space = "right"))
which produces a plot like this:
That's how I do it in base (it's actually mentioned in the comments on the first answer, but I'll show the full code here, including the legend, as I cannot comment yet...)
First you need to get the info on the max values for the y axis from the density plots. So you need to actually compute the densities separately first
dta_A <- density(VarA, na.rm = TRUE)
dta_B <- density(VarB, na.rm = TRUE)
Then plot them according to the first answer and define min and max values for the y axis that you just got. (I set the min value to 0)
plot(dta_A, col = "blue", main = "2 densities on one plot",
     ylim = c(0, max(dta_A$y, dta_B$y)))
lines(dta_B, col = "red")
Then add a legend to the top right corner
legend("topright", c("VarA","VarB"), lty = c(1,1), col = c("blue","red"))
I took the above lattice example and made a nifty function. There is probably a better way to do this with reshape via melt/cast. (Comment or edit if you see an improvement.)
multi.density.plot=function(data,main=paste(names(data),collapse = ' vs '),...){
##combines multiple density plots together when given a list
df=data.frame();
for(n in names(data)){
idf=data.frame(x=data[[n]],label=rep(n,length(data[[n]])))
df=rbind(df,idf)
}
densityplot(~x,data=df,groups = label,plot.points = F, ref = T, auto.key = list(space = "right"),main=main,...)
}
Example usage:
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1),main='BN1 vs BN2')
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1))
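One possible simplification, sketched here with base R's stack() instead of reshape: stack() turns a named list into a long data frame with the columns values and ind, which replaces the rbind loop. The list samples is made-up data standing in for bn1$V1 and bn2$V1:

```r
library(lattice)

# Made-up samples standing in for bn1$V1 and bn2$V1 (hypothetical data)
set.seed(42)
samples <- list(BN1 = rnorm(100), BN2 = rnorm(150, mean = 3, sd = 2))

# Long format: 'values' holds the data, 'ind' the name of the list element
long <- stack(samples)

p <- densityplot(~ values, data = long, groups = ind,
                 plot.points = FALSE, ref = TRUE,
                 auto.key = list(space = "right"),
                 main = paste(names(samples), collapse = " vs "))
print(p)
```

The list elements may have different lengths, as they do here.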
You can use the ggjoy package. Let's say that we have three different beta distributions such as:
set.seed(5)
b1<-data.frame(Variant= "Variant 1", Values = rbeta(1000, 101, 1001))
b2<-data.frame(Variant= "Variant 2", Values = rbeta(1000, 111, 1011))
b3<-data.frame(Variant= "Variant 3", Values = rbeta(1000, 11, 101))
df<-rbind(b1,b2,b3)
You can get the three different distributions as follows:
library(tidyverse)
library(ggjoy)
ggplot(df, aes(x=Values, y=Variant))+
geom_joy(scale = 2, alpha=0.5) +
scale_y_discrete(expand=c(0.01, 0)) +
scale_x_continuous(expand=c(0.01, 0)) +
theme_joy()
Whenever there are issues of mismatched axis limits, the right tool in base graphics is to use matplot. The key is to leverage the from and to arguments to density.default. It's a bit hackish, but fairly straightforward to roll yourself:
set.seed(102349)
x1 = rnorm(1000, mean = 5, sd = 3)
x2 = rnorm(5000, mean = 2, sd = 8)
xrng = range(x1, x2)
#force the x values at which density is
# evaluated to be the same between 'density'
# calls by specifying 'from' and 'to'
# (and possibly 'n', if you'd like)
kde1 = density(x1, from = xrng[1L], to = xrng[2L])
kde2 = density(x2, from = xrng[1L], to = xrng[2L])
matplot(kde1$x, cbind(kde1$y, kde2$y))
Add bells and whistles as desired (matplot accepts all the standard plot/par arguments, e.g. lty, type, col, lwd, ...).

Resources