How do I plot this data using R?

> aggregate(dat[, 3:7], by=list(dat$TRT), FUN=mean)
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
1 A 116.55 113.5 110.70 106.25 101.35
2 B 116.75 115.2 114.05 112.45 111.95
I wish to create a line plot where the x-axis shows the column names (DBP1, DBP2, ..., DBP5).
It takes two seconds in Excel (I admit) and gives exactly what I want:
To be clear, the question is about getting the two rows of data into the plot, not about how they are displayed (i.e. with what line/point/color combination).

With dplyr, tidyr and ggplot2
Data
zz <- "Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95"
df <- read.table(text = zz, header = TRUE)
Load Required Packages
library(dplyr)
library(tidyr)
library(ggplot2)
Tidy
df_tidy <- df %>%
gather(key, value, -Group.1)
Plot
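# x is discrete here, so the group aesthetic is needed to connect points within each treatment
ggplot(data = df_tidy, aes(x = key, y = value, group = Group.1)) +
  geom_line(aes(color = Group.1)) +
  ylim(90, 120)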
Output

First step: use melt from the reshape2 package:
library(reshape2)
library(lattice)
d <- aggregate(
  dat[, 3:7],
  by=list(TRT=dat$TRT),  # name the grouping column TRT instead of Group.1
  FUN=mean
)
m <- melt(d,
  id.vars="TRT",
  measure.vars=c("DBP1","DBP2","DBP3","DBP4","DBP5")
)
Then use xyplot from the lattice package:
xyplot(value~variable, data=m, type="o", groups=TRT, auto.key=TRUE)
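To see what the long format looks like without installing reshape2, here is a rough base-R sketch using stack(). The data frame d is typed in by hand from the output in the question, and the column names differ from melt's: stack() produces values and ind rather than value and variable.

```r
# Aggregated means as shown in the question (typed in by hand here).
d <- data.frame(
  TRT  = c("A", "B"),
  DBP1 = c(116.55, 116.75),
  DBP2 = c(113.50, 115.20),
  DBP3 = c(110.70, 114.05),
  DBP4 = c(106.25, 112.45),
  DBP5 = c(101.35, 111.95)
)

# stack() puts all measurement columns into one 'values' column and
# records the source column name in 'ind'.
long <- stack(d[, -1])

# Rows come out column by column (A, B, A, B, ...), so repeat the group labels.
long$TRT <- rep(d$TRT, times = ncol(d) - 1)

print(long)
```

Each row of long is then one (group, measurement) pair, which is the shape that lattice and ggplot2 expect.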

Simplest possible (??) base-R answer:
dd <- read.table(header=TRUE,text="
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95")
matplot() is the basic function for plotting multiple parallel sequences, but (1) it requires that the series be in columns of a matrix; (2) it can't handle character variables, so you have to drop the first column; (3) if you want the column names (DBP1, ..., DBP5) as axis labels, you have to add them with a separate axis() command. One way to do that is to suppress both axes (axes=FALSE) and then add them both manually; passing xaxt="n" to suppress only the x-axis should also work, since matplot() forwards extra arguments to plot().
par(las=1) ## horizontal y-axis labels (cosmetic)
matplot(t(dd[,-1]),type="b",axes=FALSE,
ylab="",ylim=c(90,120),
col=c("red","blue"),pch=16,lty=1)
axis(side=2) ## y-axis (default labels)
axis(side=1,at=1:5,label=names(dd)[-1]) ## x-axis
box() ## bounding box
legend("bottomleft",legend=dd$Group.1,
col=c("red","blue"),lty=1,pch=16)
If you want to dispense with legend, nice tick-marks, etc., then just matplot(t(dd[,-1]),...) will do it.
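As a small sketch of the reshaping step on its own: the matrix handed to matplot() needs one column per series, so the data frame is stripped of its label column and transposed. The name m is just for illustration, and the xaxt="n" call relies on my understanding that matplot() passes it through to plot().

```r
dd <- read.table(header = TRUE, text = "
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95")

# Drop the label column, transpose, and name the columns after the groups.
m <- t(as.matrix(dd[, -1]))
colnames(m) <- as.character(dd$Group.1)

print(m)  # rows are DBP1..DBP5, columns are the groups A and B

# Suppress only the x-axis, then label it with the row names.
matplot(m, type = "b", pch = 16, lty = 1, xaxt = "n", ylab = "")
axis(side = 1, at = seq_len(nrow(m)), labels = rownames(m))
```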

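A simple base-R version:
A <- c(116.55, 113.5, 110.70, 106.25, 101.35)
B <- c(116.75, 115.2, 114.05, 112.45, 111.95)
# xaxt="n" avoids drawing the default 1..5 axis under the custom labels;
# ylim=range(A, B) keeps both series inside the plot region
plot(A, type="n", xaxt="n", ylim=range(A, B))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(A, col="blue")
lines(B, col="red")
Alternate way:
plot(A, type="l", col="blue", xaxt="n", ylim=range(A, B))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(B, col="red")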

A simple approach is to customize your plot step by step.
First, plot the first line and specify that you don't want an x-axis to be drawn. Add the second line.
Add your custom x-axis with the labels you want.
Add points at the values you just plotted.
Translated into R:
# byrow=TRUE fills the matrix row by row, so each row holds one group
data <- matrix(c(116.55,113.5,110.7,106.25,101.35,116.75,115.2,114.05,112.45,111.95), nrow=2, byrow=TRUE)
plot(data[1,], type="l", xaxt="n", ylim=range(data))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(data[2,])
points(data[1,])
points(data[2,])
Here xaxt="n" suppresses the default x-axis, and ylim=range(data) keeps both lines inside the plot region.
Here is a good reference : http://www.statmethods.net/advgraphs/axes.html
Then, make it beautiful!
If you want a simpler approach for the future, here is a basic function you can improve
plot.Custom <- function(yy, xLabels, ...){
plot(yy[1,], type="l", xaxt="n", ...)
axis(1, at=1:dim(yy)[2], labels=xLabels)
for(i in 1:dim(yy)[1]){
lines(yy[i,])
points(yy[i,])
}
}
plot.Custom(data, c("DBP1","DBP2","DBP3","DBP4","DBP5"))
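One possible improvement, sketched under my own assumptions: compute the y-limits from all rows so that no line is clipped, and give each row its own colour. The name plotRows and its col argument are hypothetical, not part of the answer above.

```r
# Hypothetical variant of plot.Custom(): one line per row of 'yy'.
plotRows <- function(yy, xLabels, col = seq_len(nrow(yy)), ...) {
  # Empty plot whose y-range covers every row, without the default x-axis.
  plot(NA, xlim = c(1, ncol(yy)), ylim = range(yy),
       xaxt = "n", xlab = "", ylab = "", ...)
  axis(1, at = seq_len(ncol(yy)), labels = xLabels)
  for (i in seq_len(nrow(yy))) {
    lines(yy[i, ], type = "o", col = col[i])  # "o" draws both line and points
  }
  invisible(range(yy))  # return the y-range, mainly so it can be inspected
}

yy <- rbind(A = c(116.55, 113.5, 110.70, 106.25, 101.35),
            B = c(116.75, 115.2, 114.05, 112.45, 111.95))
r <- plotRows(yy, paste0("DBP", 1:5), col = c("blue", "red"))
```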

Related

Visualize data using histogram in R

I am trying to visualize some data and in order to do it I am using R's hist.
Below are my data
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now as the more cunning of you might have guessed these are the absolute values of some model's coefficients.
What I need is a histogram with the following axes:
x will be the number (count or length) of coefficients which is 19 in total, along with their names.
y will show values of each column (as breaks?) having a ylim="" set, according to min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, consulting ?hist, I have been trying to play with the code below without success. Therefore I am starting from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the set. I also find it hard to use the breaks argument efficiently.
Any suggestion will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot. For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
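Applied to the actual coefficients, the barplot suggestion might look roughly like the sketch below. The values and names are copied from the question; the margin and las settings are just my guess at something readable.

```r
# Absolute coefficient values and names copied from the question.
jancoefabs <- c(1.165610e+00, 1.277929e-01, 4.349831e-01, 3.602961e-01,
                7.189458e+00, 1.856908e-04, 1.352052e-05, 4.811291e-05,
                1.055744e-02, 2.756525e-04, 2.202706e-01, 4.199914e-02,
                4.684091e-02, 8.634340e-01, 2.479175e-02, 2.409628e-01,
                5.459076e-03, 9.892580e-03, 5.378456e-02)
names(jancoefabs) <- c("(Intercept)", "LON", "LAT", "ME", "RAT", "DS", "DSA",
                       "DSI", "DRNS", "DREW", "ASPNS", "ASPEW", "SI", "CUR",
                       "W_180_270", "W_0_360", "W_90_180", "W_0_180", "NDVI")

par(mar = c(7, 4, 1, 1))   # extra bottom margin for the rotated names
# las = 2 turns the bar labels perpendicular to the axis so all 19 names fit.
mids <- barplot(jancoefabs, las = 2, col = "lightblue",
                ylim = c(0, max(jancoefabs)), ylab = "|coefficient|")
```

barplot() returns the bar midpoints (stored in mids here), which is handy if you later want to add text or error bars at the bar positions. Since the values span several orders of magnitude, many of the bars will be barely visible on a linear scale.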
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
> coef <- lm(mpg ~ ., data = mtcars)$coef
> library(lattice)
> barchart(coef, col = 'lightblue', horizontal = FALSE,
ylim = range(coef), xlab = '',
scales = list(y = list(labels = coef),
x = list(labels = names(coef))))
A base R dotchart might be good too,
> dotchart(coef, pch = 19, xlab = 'value')
> text(coef, seq(coef), labels = round(coef, 3), pos = 2)

Colorfill boxplot in R-cran with lines, dots, or similar

I need to use black and white color for my boxplots in R. I would like to colorfill the boxplot with lines and dots. For an example:
I imagine ggplot2 could do that but I can't find any way to do it.
Thank you in advance for your help!
I thought this was a great question and pondered if it was possible to do this in base R and to obtain the checkered look. So I put together some code that relies on boxplot.stats and polygon (which can draw angled lines). Here's the solution, which is really not ready for primetime, but is a solution that could be tinkered with to make more general.
boxpattern <-
function(y, xcenter, boxwidth, angle=NULL, angle.density=10, ...) {
# draw an individual box
bstats <- boxplot.stats(y)
bxmin <- bstats$stats[1]
bxq2 <- bstats$stats[2]
bxmedian <- bstats$stats[3]
bxq4 <- bstats$stats[4]
bxmax <- bstats$stats[5]
bleft <- xcenter-(boxwidth/2)
bright <- xcenter+(boxwidth/2)
# boxplot
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[1], density=angle.density)
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[2], density=angle.density)
# lines
segments(bleft,bxmedian,bright,bxmedian,lwd=3) # median
segments(bleft,bxmin,bright,bxmin,lwd=1) # min
segments(xcenter,bxmin,xcenter,bxq2,lwd=1)
segments(bleft,bxmax,bright,bxmax,lwd=1) # max
segments(xcenter,bxq4,xcenter,bxmax,lwd=1)
# outliers
if(length(bstats$out)>0){
for(i in 1:length(bstats$out))
points(xcenter,bstats$out[i])
}
}
drawboxplots <- function(y, x, boxwidth=1, angle=NULL, ...){
# figure out all the boxes and start the plot
groups <- split(y,as.factor(x))
len <- length(groups)
bxylim <- c((min(y)-0.04*abs(min(y))),(max(y)+0.04*max(y)))
xcenters <- seq(1,max(2,(len*(1.4))),length.out=len)
if(is.null(angle)){
angle <- seq(-90,75,length.out=len)
angle <- lapply(angle,function(x) c(x,x))
}
else if(!length(angle)==len)
stop("angle must be a vector or list of two-element vectors")
else if(!is.list(angle))
angle <- lapply(angle,function(x) c(x,x))
# draw plot area
plot(0, xlim=c(.97*(min(xcenters)-1), 1.04*(max(xcenters)+1)),
ylim=bxylim,
xlab="", xaxt="n",
ylab=names(y),
col="white", las=1)
axis(1, at=xcenters, labels=names(groups))
# draw boxplots
plots <- mapply(boxpattern, y=groups, xcenter=xcenters,
boxwidth=boxwidth, angle=angle, ...)
}
Some examples in action:
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2)),
x=sort(rep(1:2,200)))
drawboxplots(mydat$y, mydat$x)
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2),
rnorm(200,3,3),rnorm(400,-2,8)),
x=sort(rep(1:5,200)))
drawboxplots(mydat$y, mydat$x)
drawboxplots(mydat$y, mydat$x, boxwidth=.5, angle.density=30)
drawboxplots(mydat$y, mydat$x, # specify list of two-element angle parameters
angle=list(c(0,0),c(90,90),c(45,45),c(45,-45),c(0,90)))
EDIT: I wanted to add that one could also obtain dots as a fill by basically drawing a pattern of dots, then covering them with a "donut"-shaped polygon, like so:
x <- rep(1:10,10)
y <- sort(x)
plot(y~x, xlim=c(0,11), ylim=c(0,11), pch=20)
outerbox.x <- c(2.5,0.5,10.5,10.5,0.5,0.5,2.5,7.5,7.5,2.5)
outerbox.y <- c(2.5,0.5,0.5,10.5,10.5,0.5,2.5,2.5,7.5,7.5)
polygon(outerbox.x,outerbox.y, col="white", border="white") # donut
polygon(c(2.5,2.5,7.5,7.5,2.5),c(2.5,2.5,2.5,7.5,7.5)) # inner box
But mixing that with angled lines in a single plotting function would be a bit difficult, and is generally a bit more challenging, but it starts to get you there.
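Stripped down to a single box, the hatching idea can be sketched as follows. This is only an illustration with made-up data; it uses rect() rather than polygon(), on the assumption that its density and angle arguments behave the same way.

```r
set.seed(42)
y <- rnorm(100)
# Five numbers: lower whisker, lower hinge, median, upper hinge, upper whisker.
s <- boxplot.stats(y)$stats

plot(NA, xlim = c(0, 2), ylim = range(y), xaxt = "n", xlab = "", ylab = "y")
# Two passes at opposite angles give a cross-hatched box.
rect(0.6, s[2], 1.4, s[4], density = 10, angle = 45)
rect(0.6, s[2], 1.4, s[4], density = 10, angle = -45)
segments(0.6, s[3], 1.4, s[3], lwd = 3)   # median
segments(1, s[1], 1, s[2])                # lower whisker
segments(1, s[4], 1, s[5])                # upper whisker
```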
I think it is hard to do this with ggplot2 since it doesn't support shaded polygons (a grid limitation). But you can use the shading-line feature of base graphics, controlled by the density and angle arguments in some plot functions (polygon, barplot, ...).
The problem is that boxplot doesn't use this feature. So I hacked it, or rather I hacked bxp, which boxplot uses internally. The hack consists of adding two arguments (angle and density) to the bxp function and passing them on in the calls to xypolygon (this occurs in 2 lines).
my.bxp <- function (all.bxp.argument,angle,density, ...) {
.....#### bxp code
xypolygon(xx, yy, lty = boxlty[i], lwd = boxlwd[i],
border = boxcol[i], angle = angle[i], density = density[i])
.......## bxp code after
xypolygon(xx, yy, lty = "blank", col = boxfill[i], angle = angle[i], density = density[i])
......
}
Here is an example. It should be noted that it is entirely the responsibility of the user to ensure
that the legend corresponds to the plot. So I added some code to arrange the legend and the boxplot.
require(stats)
set.seed(753)
(bx.p <- boxplot(split(rt(100, 4), gl(5, 20))))
layout(matrix(c(1,2),nrow=1),
width=c(4,1))
angles=c(60,30,40,50,60)
densities=c(50,30,40,50,30)
par(mar=c(5,4,4,0)) #Get rid of the margin on the right side
my.bxp(bx.p,angle=angles,density=densities)
par(mar=c(5,0,4,2)) #No margin on the left side
plot(c(0,1),type="n", axes=F, xlab="", ylab="")
legend("top", paste("region", 1:5),
angle=angles,density=densities)

R arrowed labelling of data points on a plot

I am looking to label data points with indices -- to identify the index number easily by visual examination.
So for instance,
x<-ts.plot(rnorm(10,0,1)) # would like to visually identify the data point indices easily through arrow labelling
Of course, if there's a better way of achieving this, please suggest
You can use the arrows function:
set.seed(1); ts.plot(x <-rnorm(10,0,1), ylim=c(-1.6,1.6)) # some random data
arrows(x0=1:length(x), y0=0, y1=x, code=2, col=2, length=.1) # adding arrows
text(x=1:10, y=x+.1, 0, labels=round(x,2), cex=0.65) # adding text
abline(h=0) # adding a horizontal line at y=0
Use my.symbols from package TeachingDemos to get arrows pointing to the locations you want:
require(TeachingDemos)
d <- rnorm(10,0,1)
plot(d, type="l", ylim=c(min(d)-1, max(d)+1))
my.symbols(x=1:10, y=d, ms.arrows, angle=pi/2, add=T, symb.plots=TRUE, adj=1.5)
You can use text() for this
n <- 10
d <- rnorm(n)
plot(d, type="l", ylim=c(min(d)-1, max(d)+1))
text(1:n, d+par("cxy")[2]/2,col=2) # Upside
text(1:n, d-par("cxy")[2]/2,col=3) # Downside
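If you want the index labels together with small arrows pointing at each point, the two ideas above can be combined. This is a rough sketch; the vertical offsets (0.5 and 0.08) are arbitrary choices that suit this particular data range.

```r
set.seed(1)
d   <- rnorm(10)
idx <- seq_along(d)   # the indices to display
top <- d + 0.5        # where each arrow starts, above its point

plot(d, type = "o", ylim = range(d) + c(-0.2, 1))
# Short arrows pointing down at each data point.
arrows(x0 = idx, y0 = top, x1 = idx, y1 = d + 0.08, length = 0.05, col = 2)
# pos = 3 places the index label just above the arrow's starting point.
text(idx, top, labels = idx, pos = 3, col = 2)
```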
Here is a lattice version, to show the lattice analogues of the base functions.
library(lattice)
set.seed(1234)
dat = data.frame(x=1:10, y = rnorm(10,0,1))
xyplot(y~x,data=dat, type =c('l','p'),
panel = function(x,y,...){
panel.fill(col=rgb(1,1,0,0.5))
panel.xyplot(x,y,...)
panel.arrows(x, y0=0,x1=x, y1=y, code=2, col=2, length=.1)
panel.text(x,y,label=round(y,2),adj=1.2,cex=1.5)
panel.abline(a=0)
})

Is it possible to create a 3d contour plot without continuous data in R?

I want to create a contour of variable z with the x,y,z data. However, it seems like we need to provide the data in increasing order.
I tried some code, but it gave me an error.
I tried the following code: Trial 1:
age2100 <- read.table("temp.csv",header=TRUE,sep=",")
x <- age2100$x
y <- age2100$y
z <- age2100$z
contour(x,y,z,add=TRUE,col="black")
I got the following error
Error in contour.default(x, y, z, add = TRUE, col = "black") : increasing 'x' and 'y' values expected
I then tried to use ggplot2 to create the contour. I used the following code:
library("ggplot2")
library("MASS")
library("rgdal")
library("gpclib")
library("maptools")
age2100 <- read.table("temp.csv",header=TRUE,sep=",")
v <- ggplot(age2100, aes(age2100$x, age2100$y,z=age2100$z))+geom_contour()
v
I got the following error:
Warning message:
Not possible to generate contour data
Please find the data on the following location https://www.dropbox.com/s/mg2bo4rcr6n3dks/temp.csv
Can anybody tell me how to create the contour data from the third variable (z) from the temp.csv ? I need to do these many times so I am trying to do on R instead of Arcgis.
Here is an example of how one interpolates using interp from the akima package:
age2100 <- read.table("temp.csv",header=TRUE,sep=",")
x <- age2100$x
y <- age2100$y
z <- age2100$z
require(akima)
fld <- interp(x,y,z)
par(mar=c(5,5,1,1))
filled.contour(fld)
Here is an alternate plot using the image function, which allows some flexibility for adding lower-level plotting functions (it requires the image.scale function; see the URL in the code comment below):
source("image.scale.R") # http://menugget.blogspot.de/2011/08/adding-scale-to-image-plot.html
x11(width=5, height=6)
layout(matrix(c(1,2), nrow=1, ncol=2), widths=c(4,1), height=6, respect=TRUE)
layout.show(2)
par(mar=c(4,4,1,1))
image(fld)
contour(fld, add=TRUE)
points(age2100$x,age2100$y, pch=".", cex=2)
par(mar=c(4,0,1,4))
image.scale(fld$z, xlab="", ylab="", xaxt="n", yaxt="n", horiz=FALSE)
box()
axis(4)
mtext("text", side=4, line=2.5)
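For background on the original error: contour() wants sorted x and y vectors plus a z matrix with one row per x value and one column per y value, not three parallel columns. If the points happen to lie on a complete regular grid (an assumption; I have not checked temp.csv, and scattered points need interpolation as above), the matrix can be built without akima. The data below is synthetic.

```r
# Synthetic stand-in for temp.csv: a complete 4 x 5 grid in shuffled row order.
set.seed(1)
grid <- expand.grid(x = c(10, 20, 30, 40), y = c(1, 2, 3, 4, 5))
grid$z <- grid$x + 10 * grid$y
grid <- grid[sample(nrow(grid)), ]

xs <- sort(unique(grid$x))
ys <- sort(unique(grid$y))
# tapply() arranges z into a matrix with rows ordered by x and columns by y.
zmat <- tapply(grid$z, list(grid$x, grid$y), mean)

contour(xs, ys, zmat)
```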

Plotting two histograms of a continuous variable, with bars next to each other instead of overlapping

I am trying to plot two histograms in one plot, but the way these two groups are distributed makes the histogram a little hard to interpret. My histogram now looks like this:
This is my code:
hist(GROUP1, col=rgb(0,0,1,1/2), breaks=100, freq=FALSE,xlab="X",main="") # first histogram
hist(GROUP2, col=rgb(1,0,0,1/2), breaks=100, freq=FALSE, add=TRUE) # second histogram
legend(0.025,600,legend=c("group 1","group 2"),col=c(rgb(0,0,1,1/2),rgb(1,0,0,1/2)),pch=20,bty="n",cex=1.5)
Is it possible to plot these histograms with the bars of the two groups right next to each other, instead of overlapping? I realize that this might add some confusion, since the x-axis represents a continuous variable... Other suggestions for making this plot clearer are of course also welcome!
Rather than messing about with overlapping histograms, what about:
Have two histograms in separate panels, i.e.
par(mfrow=c(1,2))
d1 = rnorm(100);d2 = rnorm(100);
hist(d1);hist(d2)
Or, use density plots
plot(density(d1))
lines(density(d2), col=2)
Or use a combination of density plots and histograms
hist(d1, freq=FALSE)
lines(density(d2), col=2)
You could misuse barplot for it:
multipleHist <- function(l, col=rainbow(length(l))) {
## create hist for each list element
l <- lapply(l, hist, plot=FALSE);
## get mids
mids <- unique(unlist(lapply(l, function(x)x$mids)))
## get densities
densities <- lapply(l, function(x)x$density[match(x=mids, table=x$mids, nomatch=NA)]);
## create names
names <- unique(unlist(lapply(l, function(x)x$breaks)))
a <- head(names, -1)
b <- names[-1]
names <- paste("(", a, ", ", b, "]", sep="");
## create barplot list
h <- do.call(rbind, densities);
## set names
colnames(h) <- names;
## draw barplot
barplot(h, beside=TRUE, col=col);
invisible(l);
}
Example:
x <- lapply(c(1, 1.1, 4), rnorm, n=1000)
multipleHist(x)
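A simpler variant of the same idea, as a sketch: if every histogram is computed with one shared set of breaks, the bins line up automatically and no matching on mids is needed. The names d1, d2 and brks are illustrative, and the data is simulated.

```r
set.seed(1)
d1 <- rnorm(1000, mean = 0)
d2 <- rnorm(1000, mean = 1)

# One set of breaks covering both samples, so bin i is the same interval in each.
brks <- pretty(range(d1, d2), n = 20)
h1 <- hist(d1, breaks = brks, plot = FALSE)
h2 <- hist(d2, breaks = brks, plot = FALSE)

# One row per group, one column per bin; beside = TRUE draws the rows side by side.
dens <- rbind(group1 = h1$density, group2 = h2$density)
colnames(dens) <- format(h1$mids)
barplot(dens, beside = TRUE, col = c("blue", "red"), las = 2,
        legend.text = TRUE)
```

The bar labels here are the bin midpoints; as with multipleHist, the x-axis is categorical rather than continuous.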
EDIT:
Here is an example that draws an x-axis like the OP suggested. IMHO this is very misleading (because the bins of a barplot are not continuous values) and should not be used.
multipleHist <- function(l, col=rainbow(length(l))) {
## create hist for each list element
l <- lapply(l, hist, plot=FALSE);
## get mids
mids <- unique(unlist(lapply(l, function(x)x$mids)))
## get densities
densities <- lapply(l, function(x)x$density[match(x=mids, table=x$mids, nomatch=NA)]);
## create names
breaks <- unique(unlist(lapply(l, function(x)x$breaks)))
a <- head(breaks, -1)
b <- breaks[-1]
names <- paste("(", a, ", ", b, "]", sep="");
## create barplot list
h <- do.call(rbind, densities);
## set names
colnames(h) <- names;
## draw barplot
barplot(h, beside=TRUE, col=col, xaxt="n");
## draw x-axis
at <- axTicks(side=1, axp=c(par("xaxp")[1:2], length(breaks)-1))
labels <- seq(min(breaks), max(breaks), length.out=1+par("xaxp")[3])
labels <- round(labels, digits=1)
axis(side=1, at=at, labels=breaks)
invisible(l);
}
Please find the complete source code on github.

Resources