Produce single plot and conditioning plot in trellis graphics - r

Hi there: I'm playing around with the "tips" data set for a course that I'm teaching. I'd like to produce one .png file that has the plot of tip as a function of size on the top row of the plotting device (ideally centered at the top of the window), with the bottom row showing the conditioning plot of tip as a function of size, grouped by a categorical variable recoded from the total_bill variable in the data set. I'm much more familiar with the ggplot2 environment, although I can't quite figure out how to do this there, either.
Thanks!
library(reshape2)
library(grid)
library(lattice)
data(tips)
tips$bill2<-cut(tips$total_bill, breaks=3, labels=c('low', 'medium', 'high'))
#Create one plot window with this plot on the top row, ideally in the center
xyplot(tip~size, data=tips,type=c('r', 'p'))
#With this plot in the second row
xyplot(tip~size|bill2, data=tips, type=c('r', 'p'))

You can use the split argument to print():
p1 <- xyplot(tip~size, data=tips,type=c('r', 'p'))
p2 <- xyplot(tip~size|bill2, data=tips, type=c('r', 'p'))
print(p1,split=c(1,1,1,2),more=TRUE)
print(p2,split=c(1,2,1,2),more=FALSE)
See ?print.trellis.
Update: to adjust the size, use the position argument, which is also documented in ?print.trellis.
print(p1,split=c(1,1,1,2),more=TRUE,position=c(.3,0,.7,1))
print(p2,split=c(1,2,1,2),more=FALSE)
Tweak the position if you like.
By the way, lattice might not always arrange your second plot as 3 panels in one row, depending on the shape of your graphics window. You can force this with the layout argument:
p2 <- xyplot(tip~size|bill2, data=tips, type=c('r', 'p'),layout=c(3,1))
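Since the question asks for a single .png file, the same print() calls can be wrapped in a graphics device. This is only a minimal sketch: the output file name and the pixel dimensions are assumptions chosen for illustration, and the tips data is assumed to come from reshape2 as in the question.

```r
library(lattice)
library(reshape2)  # assumed source of the tips data, as in the question

data(tips)
tips$bill2 <- cut(tips$total_bill, breaks = 3,
                  labels = c("low", "medium", "high"))

p1 <- xyplot(tip ~ size, data = tips, type = c("r", "p"))
p2 <- xyplot(tip ~ size | bill2, data = tips, type = c("r", "p"),
             layout = c(3, 1))

# Hypothetical output file; width and height are in pixels
out_file <- file.path(tempdir(), "tips_plots.png")
png(out_file, width = 900, height = 700)
print(p1, split = c(1, 1, 1, 2), more = TRUE, position = c(.3, 0, .7, 1))
print(p2, split = c(1, 2, 1, 2), more = FALSE)
dev.off()  # close the device so the file is written
```

The position values may need tweaking for a different width and height.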

Related

How to increase the interval of labels in geom_text?

I am trying to put labels beside some points that are very close to each other in geographic coordinates. Of course, the problem is overlapping labels. I have used the following posts for reference:
geom_text() with overlapping labels
avoid overlapping labels in ggplot2 charts
Relative positioning of geom_text in ggplot2?
The problem is that I do not want to relocate the labels but to label fewer points (for example, only every 10th point).
I tried adding an alpha column to my data frame to make the labels of the unwanted points transparent:
combined_df_c$alpha=rep(c(1,rep(0,times=11)),
times=length(combined_df_c$time)/
length(rep(c(1,rep(0,times=11)))))
I do not know why it does not affect the plot; all labels are still plotted.
The expected output is fewer labels on my plot.
You can do this by passing only a subset of your data frame to geom_text.
I used the built-in dataset mtcars for this, since you did not provide any data. With df[seq(1,nrow(df),6),] I take every 6th row of the data. These rows supply the labels that are shown in the graph. You can use any step size you want. The sliced data frame is given to geom_text, so that layer no longer uses the original dataset, just the sliced one. This way the number of rows passed to geom_text equals the number of labels drawn.
library(ggplot2)
df <- mtcars
df$idx <- seq_len(nrow(df)) # row index, kept so labels stay at the same y as their points
labdf<- df[seq(1,nrow(df),6),]
ggplot()+
geom_point(data=df, aes(x=drat, y=idx))+
geom_text(data=labdf, aes(x=drat, y=idx, label=drat))
The output is as expected: of the 32 rows, just 6 get labeled.
You can easily adjust the code for your case.
Also, you can put the aes in ggplot(), which may be more useful if you use more than just geom_point. I wrote it this way to make it clear that a different dataset is used in geom_text().
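The variant with aes() inside ggplot() could look like the sketch below. The idx column and the vjust offset are additions made for illustration and are not part of the answer above.

```r
library(ggplot2)

df <- mtcars
df$idx <- seq_len(nrow(df))              # row index, used as the y value
labdf <- df[seq(1, nrow(df), by = 6), ]  # every 6th row

# aes() is set once in ggplot() and inherited by both layers;
# geom_text() only swaps in the smaller data frame and adds the label mapping
p <- ggplot(df, aes(x = drat, y = idx)) +
  geom_point() +
  geom_text(data = labdf, aes(label = drat), vjust = -0.5)
print(p)
```

Changing the step in seq() changes how many points get a label.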

panel.text xyplot R

I am adding text to different panels of an xyplot in lattice and was wondering if anyone knows a way to avoid specifying x and y coordinates. Is there something similar to legend, where you can say upper left, upper right, etc.?
I ask because I want to use scales=free in the plotting code, but when I do the text in the mytext code ends up covering up parts of the graph and doesn't make for a good plot. I would like to have a way to plot the graphs without making individual plots because in my real dataset I have up to 10 grouping factor levels (sams in the code). The example provided is not as extreme as the real data.
Example data
d_exp<-data.frame(sams=c(rep("A",6),rep("B",6),rep("C",6)),
gear=c(rep(1:2,9)),
fraction=c(.12,.61,.23,.05,.13,.45,0.3,.5,.45,.20,.35,.10,.8,.60,.10,.01,.23,.03),
interval=c(rep(c(0,10,20),6)))
d_exp<-d_exp[order(d_exp$sams,d_exp$gear,d_exp$interval),]
Plot with scales=same. mytext x and y coordinates are specified.
mytext<-c("N=3","N=35","N=6")
panel.my <- function(...) {
panel.superpose(col=c("red","blue"),lwd=1.5,...)
panel.text(x=2.5,y=0.5,labels=mytext[panel.number()],cex=.8)
}
xyplot(fraction~interval | sams, data=d_exp,groups=gear,type="l",
scales=list(relation="same",y=list(alternating=1,cex=0.8),x=list(alternating=1,cex=.8,abbreviate=F)),
strip = strip.custom(bg="white", strip.levels = T),drop.unused.levels=T,as.table=T,
par.strip.text=list(cex=0.8),panel=panel.my)
Same thing with scales=free. Text is in odd places because all text has the same coordinates.
xyplot(fraction~interval | sams, data=d_exp,groups=gear,type="l",
scales=list(relation="free",y=list(alternating=1,cex=0.8),x=list(alternating=1,cex=.8,abbreviate=F)),
strip = strip.custom(bg="white", strip.levels = T),drop.unused.levels=T,as.table=T,
par.strip.text=list(cex=0.8),panel=panel.my)
Thanks for any help.
You can use grid.text() to specify units in a range-independent way. For example:
library(grid)
panel.my <- function(...) {
panel.superpose(col=c("red","blue"),lwd=1.5,...)
grid.text(x=.5,y=.8,label=mytext[panel.number()])
}
With grid.text, the x and y values use npc units by default, which range from 0 to 1. So x=.5 means horizontally centered and y=.8 means 80% of the way to the top.
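To get something closer to the legend-style "upper left" placement asked about, the npc coordinates can be combined with the just argument of grid.text(). This is a sketch only: the panel function name panel.corner and the 3% inset are assumptions made for illustration.

```r
library(lattice)
library(grid)

# Example data and labels from the question
d_exp <- data.frame(
  sams = c(rep("A", 6), rep("B", 6), rep("C", 6)),
  gear = rep(1:2, 9),
  fraction = c(.12, .61, .23, .05, .13, .45, 0.3, .5, .45,
               .20, .35, .10, .8, .60, .10, .01, .23, .03),
  interval = rep(c(0, 10, 20), 6)
)
d_exp <- d_exp[order(d_exp$sams, d_exp$gear, d_exp$interval), ]
mytext <- c("N=3", "N=35", "N=6")

# Hypothetical panel function: text anchored in the upper-left corner
panel.corner <- function(...) {
  panel.superpose(col = c("red", "blue"), lwd = 1.5, ...)
  grid.text(label = mytext[panel.number()],
            x = unit(0.03, "npc"), y = unit(0.97, "npc"),
            just = c("left", "top"),   # align the text's top-left corner to (x, y)
            gp = gpar(cex = 0.8))
}

p <- xyplot(fraction ~ interval | sams, data = d_exp, groups = gear,
            type = "l", scales = list(relation = "free"),
            as.table = TRUE, panel = panel.corner)
print(p)
```

Because the position is relative to each panel, it stays the same whatever axis ranges relation="free" picks.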

Multiple plot in the same figure

I have several data sets and I need to plot them compactly in one picture like this:
I already tried par(), layout() and ggplot(), but the plots are displayed too far from each other.
I need them to be very close, as if they were in the same plot with a different y (e.g. plot1 y=0, plot2 y=1, plot3 y=3 and so on..)
Can someone help me?
That can also be achieved using layout(), but an easier approach may be to set the graphical parameters in a suitable way.
The function par() lets you specify the number of panels in a single figure using the argument mfrow. It takes a vector of two numbers that specify the number of sub-figure rows and columns. For example, c(2,1) would create two rows of figures but only a single column. That's what is in your example figure. You can change the number of figure rows to the number of sub-figures you would like to plot vertically.
In addition, the margins around each sub-figure can be set using the argument mar. The margins are specified in the order bottom, left, top, right. Making the bottom and top margins smaller draws your sub-figures closer together.
In R this could look something like the following:
# Simulate some random data
a<-runif(10000)
b<-runif(10000)
# Open a new plot window
# width: 7 inches, height: 1 inch
x11(width=7, height=1)
# Specify the number of sub-figures
# Specify the margins (top and bottom are 0.1, left and right are 2)
# Needs some experimenting with to get these right
par(mfrow=c(2,1), mar=c(0.1,2,0.1,2))
# Plot the figures
barplot(a)
barplot(b)
The resulting figure should roughly resemble this:
Here is a ggplot2 version using facet_grid:
library(ggplot2)
df <- data.frame(a=runif(3e3), b=rep(letters[1:3], 1e3), c=rep(1:1e3, 3))
ggplot(df, aes(y=a, x=c)) + geom_bar(stat="identity") + facet_grid(b ~ .)

positioning plots and table

I would like to plot two histograms and add a table to a pdf file. With the layout function I managed to place the histograms (plotted using the hist function) where I want them, but when I used the grid.table function from the gridExtra package to add the table, it was drawn on top of the histograms and I am not able to position it properly. I have tried the addtable2plot function, but I don't find the result visually appealing.
Any thoughts on how I can get around this?
I want my pdf to look like this:
histogram1 histogram2
t a b l e
Essentially, one row with two columns and another row with just one column. This is what I did.
require(gridExtra)
layout(matrix(c(1,2,3,3),2,2,byrow=T),heights=c(1,1))
count_table=table(cut(tab$Longest_OHR,breaks=c(0,0.05,0.10,0.15,0.20,0.25,0.30,0.35,0.40,0.45,0.50,0.55,0.60,0.65,0.70,0.75,0.80,0.85,0.90,0.95,1.00)))
ysize=max(count_table)+1000
hist(tab$Longest_OHR,xlab="OHR longest",ylim=c(0,ysize))
count_table=table(cut(tab$Sum_of_OHR.s,breaks=c(0,0.05,0.10,0.15,0.20,0.25,0.30,0.35,0.40,0.45,0.50,0.55,0.60,0.65,0.70,0.75,0.80,0.85,0.90,0.95,1.00)))
ysize=max(count_table)+1000
hist(tab$Sum_of_OHR.s,xlab="OHR Sum",ylim=c(0,ysize))
tmp <- table(cut(tab$Length_of_Gene.Protein, breaks = c(0,100,200,500,1000,2000,5000,10000,1000000000)), cut(tab$Sum_of_OHR.s, breaks = (0:10)/10))
grid.table(tmp)
dev.off()
Any help will be appreciated.
Ram
Here's an example of how to combine two base plots and a grid.table in the same figure.
library(grid) # provides pushViewport() and viewport()
library(gridExtra)
layout(matrix(c(1,0,2,0), 2))
hist(iris$Sepal.Length, col="lightblue")
hist(iris$Sepal.Width, col="lightblue")
pushViewport(viewport(y=.25,height=.5))
# The original call also passed h.even.alpha, h.odd.alpha, v.even.alpha and v.odd.alpha,
# which gridExtra releases before 2.0 accepted; newer releases style tables through theme=
grid.table(head(iris))
The coordinates sent to viewport are the center of the panel. To see exactly where its boundaries are, you can call grid.rect().
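A short sketch of the grid.rect() check, assuming the same layout as in the answer; the dashed red outline is a styling choice made for illustration.

```r
library(grid)
library(gridExtra)

# Two histograms in the top row, bottom row left empty for the table
layout(matrix(c(1, 0, 2, 0), 2))
hist(iris$Sepal.Length, col = "lightblue")
hist(iris$Sepal.Width, col = "lightblue")

# Viewport centred at y = 0.25 with height 0.5, i.e. the bottom half of the page
vp <- viewport(y = .25, height = .5)
pushViewport(vp)
grid.rect(gp = gpar(col = "red", fill = NA, lty = "dashed"))  # outline the viewport
grid.table(head(iris))
popViewport()
```

Once the table sits where you want it, the grid.rect() line can be dropped.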

Plot With Blocks

I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like the one in the image?
Let's say I have two vectors, x1 = c(2,13,4) and y2 = c(5,23,43). I want to create 3 blocks with heights from 2 to 5, 13 to 23, and so on.
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at.
I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the ranges you showed us overlap, which means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is commonly used to show the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I draw an empty plot first and then add three layers of rectangles: one beige, one gray and one red.
#Create data.frame
n=100
df <-data.frame(1:n,runif(n)*10,60+runif(n)*10,25+runif(n)*10,40+runif(n)*10,35-runif(n)*10,35+runif(n)*10)
colnames(df) <-c("id","beige.min","beige.max","gray.min","gray.max","red.min","red.max")
#Create chart
plot(x=df$id,y=NULL,ylim=range(df[,-1]), type="n") #blank chart, ylim is the range of the data
rect(df$id-0.5,df[,2],df$id+0.5,df[,3],col="beige", border=FALSE) #first layer
rect(df$id-0.5,df[,4],df$id+0.5,df[,5],col="gray", border=FALSE) #second layer
rect(df$id-0.5,df[,6],df$id+0.5,df[,7],col="darkred", border=FALSE) #third layer
It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))
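For comparison, the same three blocks can be drawn with base graphics and rect(). This is a sketch using the two vectors from the question; the 0.4 half-width and the gray fill are arbitrary choices.

```r
x1 <- c(2, 13, 4)    # lower edge of each block
y2 <- c(5, 23, 43)   # upper edge of each block
id <- seq_along(x1)  # one x position per block

# Empty plot with axis limits wide enough for all blocks
plot(NULL, xlim = c(0.5, length(id) + 0.5), ylim = range(x1, y2),
     xlab = "id", ylab = "value", xaxt = "n")
axis(1, at = id)

# rect() is vectorised: xleft, ybottom, xright, ytop
rect(id - 0.4, x1, id + 0.4, y2, col = "gray", border = NA)
```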
