Axis break with gap.plot while data contains NAs - r

I want to make an x-axis break with the gap.plot() function from the plotrix package while my data contains NAs.
My code works fine if there aren't any NAs, but with NAs it tells me:
Error in if (lostones) warning("some values of x will not be
displayed") : argument is not interpretable as logical
and it doesn't plot anything at all.
dt is just an example dataset:
dt <- data.frame(c(1.2,NA,5,6,4.3,1),c(22,33,22,25,NA,27))
names(dt) <- c("a","b")
library(plotrix)
gap.plot(dt$a, dt$b, gap=c(1.5,3.5), gap.axis="x",col="blue", ylim=range(c(dt$b)),xtics=c(0:1.5,3.5:6), xticlab=c(0:1.5,3.5:6))
abline(v=1.5, col="white")
abline(v=1.56, col="white", lwd=4)
axis.break(1,breakpos=1.55,style="slash", brw=0.03)
axis.break(3,breakpos=1.55,style="slash", brw=0.03)
What do I have to change? By the way, I don't want to use ggplot.

Since you are trying to produce a scatterplot, you have to omit the rows that contain NA.
For example:
dt <- data.frame(c(1.2,NA,5,6,4.3,1),c(22,33,22,25,NA,27))
names(dt) <- c("a","b")
Now remove the rows containing NAs:
library(dplyr)
dt <- dt %>%
na.omit()
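If you would rather not load dplyr for this one step, the same filtering can be sketched in base R with complete.cases(). This is only an illustration on the example data; dt_clean is a name introduced here and does not appear in the original code.

```r
# Example data from the question, with one NA in each column
dt <- data.frame(a = c(1.2, NA, 5, 6, 4.3, 1),
                 b = c(22, 33, 22, 25, NA, 27))

# Keep only the rows where both plotted columns are present
dt_clean <- dt[complete.cases(dt[, c("a", "b")]), ]
```

dt_clean can then be passed to gap.plot() in place of dt.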
Plot:
library(plotrix)
gap.plot(dt$a, dt$b, gap=c(1.5,3.5), gap.axis="x",col="blue", ylim=range(dt$b) ,xtics=c(0:1.5,3.5:6), xticlab=c(0:1.5,3.5:6))
abline(v=1.5, col="white")
abline(v=1.56, col="white", lwd=4)
axis.break(1,breakpos=1.55,style="slash", brw=0.03)
axis.break(3,breakpos=1.55,style="slash", brw=0.03)

Related

arrange R plots: how can I arrange plots of the VIM package?

I would like to generate multiple plots using marginplot() (VIM package) and then arrange them into one big figure. I tried to use grid.arrange (grid/gridExtra package) and it did not work: the error said that a grob was expected as input. So I tried to first convert the marginplot into a ggplot (as.ggplot) or a grob (as.grob), but this did not work either. Does anyone have an idea how to arrange the plots?
library(VIM)
library(ggplotify)
library(grid)
library(gridExtra)
x <- cars[, c("speed", "dist")]
marginplot(x)
y <- cars[, c("speed", "dist")]
marginplot(y)
p <- qplot(1,1)
#p2 <- as.ggplot(marginplot(x))
r <- rectGrob(gp=gpar(fill="grey90"))
grid.arrange( r, p,p, r, ncol=2)
I created a small example with cars, where I managed to arrange grey squares and qplots, but I cannot add the marginplot.
This error occurs with base plots, as I learned here: grid.arrange from gridExtras exiting with "only 'grobs' allowed in 'gList'" after update
grid.arrange is intended to be used with "grid graphical objects" (grobs), such as the plots produced by ggplot2.
You could either find an equivalent grid-based plot or use a base-graphics approach to arranging the plots.
Try this:
library(VIM)
x <- cars[, c("speed", "dist")]
y <- cars[, c("speed", "dist")]
par(mfrow = c(2,2))
marginplot(x)
marginplot(y)
plot(rnorm(100))
hist(rnorm(100))
par(mfrow = c(1, 1)) #reset this parameter
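If the panels should not all be the same size, base graphics also offers layout(). The following is a minimal sketch using only base plots on the cars data; whether marginplot() behaves well inside a layout() grid is an assumption you would need to check, and n_panels is a name introduced here.

```r
# Panel 1 spans the whole top row; panels 2 and 3 share the bottom row.
# layout() returns the number of panels it defined.
n_panels <- layout(matrix(c(1, 1,
                            2, 3), nrow = 2, byrow = TRUE))

plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")  # panel 1
hist(cars$speed, main = "speed")                            # panel 2
hist(cars$dist, main = "dist")                              # panel 3

layout(1)  # reset to a single panel
```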

Retrieve facet labels from a ggplot or a gtable/gTree/grob/gDesc object

I have data I'm plotting using ggplot's facet_grid:
My data:
species <- c("species1","species2")
conditions <- c("cond1","cond2","cond3")
batches <- 1:6
df <- expand.grid(species=species,condition=conditions,batch=batches)
set.seed(1)
df$y <- rnorm(nrow(df))
df$replicate <- 1
df$col.fill <- paste(df$species,df$condition,df$batch,sep=".")
My plot:
integerBreaks <- function(n = 5, ...) {
  library(scales)
  breaker <- pretty_breaks(n, ...)
  function(x) {
    breaks <- breaker(x)
    breaks[breaks == floor(breaks)]
  }
}
library(ggplot2)
p <- ggplot(df,aes(x=replicate,y=y,color=col.fill))+
geom_point(size=3)+facet_grid(~col.fill,scales="free_x")+
scale_x_continuous(breaks=integerBreaks())+
theme_minimal()+theme(legend.position="none",axis.title=element_text(size=8))
In the resulting figure the labels are long and come out pretty messed up, so I was wondering if there's a way to edit these labels in the ggplot object (p) or the gtable/gTree/grob/gDesc object (ggplotGrob(p)).
I am aware that one way of getting better labels is to use the labeller function when the ggplot object is created but in my case I'm specifically looking for a way to edit the facet labels after the ggplot object has been created.
As I mentioned in the comments, the facet names are nested quite deeply within the gtable that ggplotGrob() gives you. However, this is still possible and since the OP explicitly wants to edit them after being plotted, you can do this with:
library(grid)
gg <- ggplotGrob(p)
edited_grobs <- mapply(FUN = function(x, y) {
    x[["grobs"]][[1]][["children"]][[2]][["children"]][[1]][["label"]] <- y
    return(x)
  },
  gg$grobs[which(grepl("strip-t",gg$layout$name))],
  unique(gsub("cond","c", df$condition)),
  SIMPLIFY = FALSE)
gg$grobs[which(grepl("strip-t",gg$layout$name))] <- edited_grobs
grid.draw(gg)
Note that this extracts all the strips using gg$grobs[which(grepl("strip-t",gg$layout$name))] and passes them to the mapply to be reset with the gsub(...) that OP specified in their comment.
In general, if you want to access just one of the text labels, there is a very similar structure which I made use of in my mapply:
num_to_access <- 1
gg$grobs[which(grepl("strip-t",gg$layout$name))][[num_to_access]][["grobs"]][[1]][["children"]][[2]][["children"]][[1]]$label
So to access the 4th label, for example, all you would need to do is change num_to_access to 4. Hope this helps!

How do I plot this data using R?

> aggregate(dat[, 3:7], by=list(dat$TRT), FUN=mean)
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
1 A 116.55 113.5 110.70 106.25 101.35
2 B 116.75 115.2 114.05 112.45 111.95
I wish to create a line plot where the x-axis labels are the names (DBP1, DBP2, ..., DBP5).
It takes two seconds in Excel (I admit) and gives exactly what I want.
To be clear, the question is about getting the two rows of data into the plot, not about how they are displayed (i.e. with what line/point/color combination).
With dplyr, tidyr and ggplot2
Data
zz <- "Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95"
df <- read.table(text = zz, header = TRUE)
Load Required Packages
library(dplyr)
library(tidyr)
library(ggplot2)
Tidy
df_tidy <- df %>%
gather(key, value, -Group.1)
Plot
ggplot(data = df_tidy, aes(x = key, y = value, group = Group.1)) +
geom_line(aes(color = Group.1)) +
ylim(90, 120)
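For comparison, the "Tidy" step can be sketched without tidyr using base R's stack(). This is only an illustration: the column names values and ind are what stack() produces, and df_long is a name introduced here.

```r
zz <- "Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95"
df <- read.table(text = zz, header = TRUE)

# stack() concatenates the measurement columns one after another,
# so the group labels have to be repeated once per stacked column.
df_long <- cbind(Group.1 = rep(df$Group.1, times = ncol(df) - 1),
                 stack(df[-1]))
```

df_long has one row per group and measurement, with the value in values and the original column name in ind.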
First step: use melt from the reshape2 package:
library(reshape2)
d <- aggregate(
dat[, 3:7],
by=list(TRT=dat$TRT),
FUN=mean
)
m <- melt(d,
id.vars="TRT",
measure.vars=c("DBP1","DBP2","DBP3","DBP4","DBP5")
)
Then use xyplot from the lattice package:
library(lattice)
xyplot(value~variable, data=m, type="o", groups=TRT, auto.key=TRUE)
Simplest possible (??) base-R answer:
dd <- read.table(header=TRUE,text="
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95")
matplot() is the basic function for plotting multiple parallel sequences, but (1) it requires that the series be in columns of a matrix; (2) it can't handle character variables, so you have to drop the first column; (3) if you want the group names as axis labels, you have to add that with a separate axis() command. Unfortunately it's not (that I know of) possible to suppress just one of the axes, so you have to suppress them both (axes=FALSE), then add them both manually.
par(las=1) ## horizontal y-axis labels (cosmetic)
matplot(t(dd[,-1]),type="b",axes=FALSE,
ylab="",ylim=c(90,120),
col=c("red","blue"),pch=16,lty=1)
axis(side=2) ## y-axis (default labels)
axis(side=1,at=1:5,label=names(dd)[-1]) ## x-axis
box() ## bounding box
legend("bottomleft",legend=dd$Group.1,
col=c("red","blue"),lty=1,pch=16)
If you want to dispense with legend, nice tick-marks, etc., then just matplot(t(dd[,-1]),...) will do it.
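As far as I can tell, matplot() forwards extra arguments such as xaxt to plot(), so it may be possible to suppress only the x-axis and keep the default y-axis and box. The following is a sketch under that assumption; series is a name introduced here.

```r
dd <- read.table(header = TRUE, text = "
Group.1 DBP1 DBP2 DBP3 DBP4 DBP5
A 116.55 113.5 110.70 106.25 101.35
B 116.75 115.2 114.05 112.45 111.95")

# matplot() draws one series per column, so transpose:
# rows become DBP1..DBP5, columns become the groups A and B.
series <- t(as.matrix(dd[, -1]))
colnames(series) <- as.character(dd$Group.1)

# xaxt = "n" suppresses only the x-axis; the y-axis and box are drawn as usual
matplot(series, type = "b", xaxt = "n", ylab = "", ylim = c(90, 120),
        col = c("red", "blue"), pch = 16, lty = 1)
axis(side = 1, at = seq_len(nrow(series)), labels = rownames(series))
legend("bottomleft", legend = colnames(series),
       col = c("red", "blue"), lty = 1, pch = 16)
```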
A simple R code can be:
A <- c(116.55, 113.5, 110.70, 106.25, 101.35)
B <- c(116.75, 115.2, 114.05, 112.45, 111.95)
plot(A, type="n", xaxt="n", ylim=range(A, B))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(A, col="blue")
lines(B, col="red")
Alternate way:
plot(A, type="l", col="blue", xaxt="n", ylim=range(A, B))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(B, col="red")
A simple approach is to customize your plot step by step.
First, plot the first line and specify that you don't want an x-axis to be drawn. Add the second line.
Add your custom x-axis with the labels you want.
Add points at the values you just plotted.
Translated into R:
data <- matrix(c(116.55,113.5,110.7,106.25,101.35,116.75,115.2,114.05,112.45,111.95), nrow=2, byrow=TRUE)
plot(data[1,], type="l", xaxt="n", ylim=range(data))
axis(1, at=1:5, labels=c("DBP1","DBP2","DBP3","DBP4","DBP5"))
lines(data[2,])
points(data[1,])
points(data[2,])
The xaxt="n" argument specifies that no x-axis is drawn.
Here is a good reference : http://www.statmethods.net/advgraphs/axes.html
Then, make it beautiful!
If you want a simpler approach for the future, here is a basic function you can improve
plot.Custom <- function(yy, xLabels, ...){
  plot(yy[1,], type="l", xaxt="n", ...)
  axis(1, at=1:dim(yy)[2], labels=xLabels)
  for(i in 1:dim(yy)[1]){
    lines(yy[i,])
    points(yy[i,])
  }
}
plot.Custom(data, c("DBP1","DBP2","DBP3","DBP4","DBP5"))

Plotting number of times a name occurs in a column (histogram)

I have a list of names sorted, like below:
ACVR2B
ADAM19
ADAM29
ADAM29
ADAMTS1
ADAMTS1
ADAMTS1
ADAMTS12
ADAMTS16
ADAMTS16
ADAMTS16
ADAMTS17
ADAMTS17
ADAMTSL1
ADCY10
I would like to plot them as a histogram. This is very easy when these are numeric values, but how can I do it with character data in R or in OpenOffice?
Thank you
Try plotting the result of table(). The function table() computes the cross-tabulation frequency, which is exactly what you want.
set.seed(42)
x <- sample(letters, 100, replace = TRUE)
plot(table(x))
To plot the sorted values, try this:
z <- sort(table(x))
plot(z, xaxt="n", type="h")
axis(1, at=seq_along(z), names(z))
Following what Andrie suggested, I did this:
Letter<-read.table("letters", header=T)
x <- sample(Letters, replace = F)
plot(sort(table(x)))
but the thing is, when I want to plot in descending order with only the top 10, I lose the labels.
Can anyone suggest how to fix this and get only the top 10?
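One way to keep the labels is to subset the sorted table before plotting, so the names travel with the counts. This is a minimal sketch on the random-letters example from above rather than on the poster's own file; counts and top10 are names introduced here.

```r
set.seed(42)
x <- sample(letters, 100, replace = TRUE)

# Count each name, sort in descending order, keep at most the first 10
counts <- sort(table(x), decreasing = TRUE)
top10 <- counts[seq_len(min(10, length(counts)))]

# barplot() uses the names of the table as bar labels
barplot(top10, las = 2, ylab = "Count")
```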

Labeling outliers on boxplot in R

I would like to plot each column of a matrix as a boxplot and then label the outliers in each boxplot as the row name they belong to in the matrix. To use an example:
vv=matrix(c(1,2,3,4,8,15,30),nrow=7,ncol=4,byrow=F)
rownames(vv)=c("one","two","three","four","five","six","seven")
boxplot(vv)
I would like to label the outlier in each plot (in this case 30) as the row name it belongs to, so in this case 30 belongs to row 7. Is there an easy way to do this? I have seen similar questions to this asked but none seemed to have worked the way I want it to.
There is a simple way. Note that the B in Boxplot in the following lines is a capital letter.
library(car)
Boxplot(y ~ x, id.method="y")
Alternatively, you could use the Boxplot function from the car package, which labels outliers for you.
See the following link: https://CRAN.R-project.org/package=car
In the example given it's a bit boring because the outliers are all in the same row, but here is the code:
bxpdat <- boxplot(vv)
text(bxpdat$group, # the x locations
bxpdat$out, # the y values
rownames(vv)[which(vv == bxpdat$out, arr.ind=TRUE)[, 1]], # the labels
pos = 4)
This picks the row names whose values are equal to the "out" list (i.e., the outliers) in the result of boxplot. boxplot() calls boxplot.stats() and returns its values. Take a look at:
str(bxpdat)
@DWin's solution works very well for a single boxplot, but will fail for anything with duplicate values, like the dataset I have created:
#Create data
set.seed(1)
basenums <- c(1,2,3,4,8,15,30)
vv=matrix(c(basenums, sample(basenums), 1-basenums,
c(0, 29, 30, 31, 32, 33, 60)),nrow=7,ncol=4,byrow=F)
dimnames(vv)=list(c("one","two","three","four","five","six","seven"), 1:4)
On this dataset, @DWin's solution gives a wrong result: in the 4th boxplot, it is not possible for the minimum and maximum to be in the same row.
This solution is monstrous (and I hope can be simplified), but effective.
#Reshape data
vv_dat <- as.data.frame(vv)
vv_dat$row <- row.names(vv_dat)
library(reshape2)
new_vv <- melt(vv_dat, id.vars="row")
#Get boxplot data
bxpdat <- as.data.frame(boxplot(value~variable, data=new_vv)[c("out", "group")])
#Get matches with boxplot data
text_guide <- do.call(rbind, apply(bxpdat, 1,
function(x) new_vv[new_vv$value==x[1]&new_vv$variable==x[2], ]))
#Add labels
with(text_guide, text(x=as.numeric(variable)+0.2, y=value, labels=row))
Or you can simply run the code from this blog post:
source("https://raw.githubusercontent.com/talgalili/R-code-snippets/master/boxplot.with.outlier.label.r") # Load the function
set.seed(6484)
y <- rnorm(20)
x1 <- sample(letters[1:2], 20,T)
lab_y <- sample(letters, 20)
# plot a boxplot with interactions:
boxplot.with.outlier.label(y~x1, lab_y)
(which handles multiple outliers which are close to one another)
@sebastian-c
This is a slight modification of DWin's solution that seems to work more generally:
bx1 <- boxplot(vv, las=2, cex.axis=.8)
if (length(bx1$out) != 0) {
  ## get the row of each outlier
  out.rows <- sapply(seq_along(bx1$out), function(i) which(vv[, bx1$group[i]] == bx1$out[i]))
  text(bx1$group, bx1$out,
       rownames(vv)[out.rows],
       pos=4)
}
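The matching can also be done column by column before plotting, which avoids comparing an outlier against values from other columns. The following is a sketch, not a tested replacement for the answers above: find_outliers is a hypothetical helper introduced here, and it relies on boxplot.stats() using the same default coef = 1.5 as boxplot().

```r
# Hypothetical helper: for each column, report which rows hold outliers
find_outliers <- function(m, coef = 1.5) {
  per_column <- lapply(seq_len(ncol(m)), function(j) {
    out_values <- boxplot.stats(m[, j], coef = coef)$out
    rows <- which(m[, j] %in% out_values)
    data.frame(column = rep(j, length(rows)),
               row = rownames(m)[rows],
               value = unname(m[rows, j]),
               row.names = NULL,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_column)
}

# Example matrix from the question
vv <- matrix(c(1, 2, 3, 4, 8, 15, 30), nrow = 7, ncol = 4)
rownames(vv) <- c("one", "two", "three", "four", "five", "six", "seven")

outliers <- find_outliers(vv)
boxplot(vv)
text(outliers$column, outliers$value, outliers$row, pos = 4)
```

Because each outlier is looked up only within its own column, a value that appears in several columns is labelled with the row it occupies in the column being drawn.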
