How to only change parameters for "lower" plots in the ggpairs function from GGally package - r

I have the following example
data(diamonds, package="ggplot2")
diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]
ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
axisLabels='show'
)
Resulting in a really nice figure:
But my problem is that in the real dataset I have to many points whereby I would like to change the parameters for the point geom. I want to reduce the dot size and use a lower alpha value. I can however not doe this with the "param" option it applies to all plot - not just the lower one:
ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
params=c(alpha=1/10),
axisLabels='show'
)
resulting in this plot:
Is there a way to apply parameters to only "lower" plots - or do I have to use the ability to create custom plots as suggested in the topic How to adjust figure settings in plotmatrix?
In advance - thanks!

There doesn't seem to be any elegant way to do it, but you can bodge it by writing a function to get back the existing subchart calls from the ggally_pairs() object and then squeezing the params in before the last bracket. [not very robust, it'll only work for if the graphs are already valid]
diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]
g<-ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
axisLabels='show'
)
add_p<-function(g,i,params){
side=length(g$columns) # get number of cells per side
lapply(i,function(i){
s<-as.character(g$plots[i]) # get existing call as a template
l<-nchar(s)
p<-paste0(substr(s,1,l-1),",",params,")") # append params before last bracket
r<-i%/%side+1 # work out the position on the grid
c<-i%%side
array(c(p,r,c)) # return the sub-plot and position data
})
}
rep_cells<-c(4,7,8)
add_params<-"alpha=0.3, size=0.1, color='red'"
ggally_data<-g$data # makes sure that the internal parameter picks up your data (it always calls it's data 'ggally_data'
calls<-add_p(g,rep_cells,params=add_params) #call the function
for(i in 1:length(calls)){g<-putPlot(g,calls[[i]][1],as.numeric(calls[[i]][2]),as.numeric(calls[[i]][3]))}
g # call the plot

Related

How to process all text elements of ggplot through a function before output?

I'm creating a set of functions for enciphering data plots as a hook for data literacy lessons.
For example, this function performs an alphabetic shift of +/- X for any string input
enciphR<-function(x,alg,key=F){
x.vec<-tolower(unlist(strsplit(x,fixed=T,split="")))#all lower case as vector
#define new alphabet
alphabet<-1:26+alg
alphabet.shifted.idx<-sapply(alphabet,function(x) {if(x>26){x-26}else{ if(x<1){x+26}else{x}}})
alphabet.shifted<-letters[alphabet.shifted.idx]
keyMat=cbind(IN=letters,OUT=alphabet.shifted)
#encipher
x1.1<-as.vector(sapply(x.vec,function(s) {
if(!s%in%letters){s}else{#If nonletter, leave it alone, else...
keyMat[match(s,keyMat[,"IN"]),"OUT"]
}},USE.NAMES = F))
x2<-paste0(x1.1,collapse="")
if(key){
out<-list(IN=x,OUT=x2,KEY=keyMat)}
else{
out<-x2
}
return(out)
}
enciphR(letters,+1) yields "bcdefghijklmnopqrstuvwxyza"
I want to feed a ggplot object into another function ggEnciphR(), which uses the enciphR() helper to encode all text elements of that object, according to the supplied cipher algorithm.
So, start with a figure:
data("AirPassengers")
require(ggplot2);require(reshape2)
df <- data.frame(Passengers=as.numeric(AirPassengers), year = as.factor(trunc(time(AirPassengers))), mnth = month.abb[cycle(AirPassengers)])
(gg<-qplot(mnth,Passengers,aes(col=year),data=df)+geom_line(aes(group=year,col=year)))
now I define the ggCiphR function:
ggCiphR<-function(ggGraph,alg){
g<-ggGraph
pass2enciphR=function(x){enciphR(x,alg=alg)}
#process all labels
g$labels<-lapply(g$labels,pass2enciphR)
#process custom legend if present (sometimes works, depending on ggplot object)
try(if(length(g$scales$scales)>0){
for(i in 1:length(g$scales$scales)){
g$scales$scales[[i]]$name=enciphR(g$scales$scales[[i]],alg=alg)
}
})
g
}
ggCiphR(gg,+1)
Yields:
This is getting there. Axis labels and legend title have been enciphered, but not tick values. Also, if my factor was Airline, instead of Year, I'd like that to be sent through enciphR, as well.
I've searched high and wide and cannot for the life of me figure out how to modify x- and y-axis values and factor levels from the gg object. I don't want to have to recast the dataframe and regenerate the ggplot... ideally I can do this programmatically using the gg object. And also, it would (ideally) be flexible enough to work with different classes of data in X & Y. Thoughts?

Controlling margins in a genoPlotR plot_gene_map

I'm producing a plot_gene_map figure by the genoPlotR R package, which gives a horizontal phylogenetic tree where aligned with each leaf is a genomic segment.
Here's a simple example that illustrates my usage and problem:
The plot_gene_map function requires an ade4s' package phylog object which represents the phylogenetic tree:
tree <- ade4::newick2phylog("(((A:0.08,B:0.075):0.028,(C:0.06,D:0.06):0.05):0.0055,E:0.1);")
A list of genoPlotR's dna_seg objects (which are essentially data.frames with specific columns), where the names of the list elements have to match the names of the leaves of tree:
dna.segs.list <- list(A=genoPlotR::as.dna_seg(data.frame(name=paste0("VERY.LONG.NAME.A.",1:10),start=seq(1,91,10),end=seq(5,95,10),strand=1,col="black",ly=1,lwd=1,pch=1,cex=1,gene_type="blocks",fill="red")),
B=genoPlotR::as.dna_seg(data.frame(name=paste0("VERY.LONG.NAME.B.",1:10),start=seq(1,91,10),end=seq(5,95,10),strand=1,col="black",ly=1,lwd=1,pch=1,cex=1,gene_type="blocks",fill="blue")),
C=genoPlotR::as.dna_seg(data.frame(name=paste0("VERY.LONG.NAME.C.",1:10),start=seq(1,91,10),end=seq(5,95,10),strand=1,col="black",ly=1,lwd=1,pch=1,cex=1,gene_type="blocks",fill="green")),
D=genoPlotR::as.dna_seg(data.frame(name=paste0("VERY.LONG.NAME.D.",1:10),start=seq(1,91,10),end=seq(5,95,10),strand=1,col="black",ly=1,lwd=1,pch=1,cex=1,gene_type="blocks",fill="yellow")),
E=genoPlotR::as.dna_seg(data.frame(name=paste0("VERY.LONG.NAME.E.",1:10),start=seq(1,91,10),end=seq(5,95,10),strand=1,col="black",ly=1,lwd=1,pch=1,cex=1,gene_type="blocks",fill="orange")))
And a list of genoPlotR's annotation objects, which give coordinate information, also named according to the tree leaves:
annotation.list <- lapply(1:5,function(s){
mids <- genoPlotR::middle(dna.segs.list[[s]])
return(genoPlotR::annotation(x1=mids,x2=NA,text=dna.segs.list[[s]]$name,rot=30,col="black"))
})
names(annotation.list) <- names(dna.segs.list)
And the call to the function is:
genoPlotR::plot_gene_map(dna_segs=dna.segs.list,tree=tree,tree_width=2,annotations=annotation.list,annotation_height=1.3,annotation_cex=0.9,scale=F,dna_seg_scale=F)
Which gives:
As you can see the top and right box (gene) names get cut off.
I tried playing with pdf's width and height, when saving the figure to a file, and with the margins through par's mar, but they have no effect.
Any idea how to display this plot without getting the names cut off?
Currently genoPlotR's plot_gene_map does not have a legend option implemented. Any idea how can I add a legend, let's say which shows these colors in squares aside these labels:
data.frame(label = c("A","B","C","D","E"), color = c("red","blue","green","yellow","orange"))
Glad that you like genoPlotR.
There are no real elegant solution to your problem, but here are a few things you can attempt:
- increase annotation_height and reduce annotation_cex
- increase rotation (“rot”) in the annotation function
- use xlims to artificially increase the length of the dna_seg (but that’s a bad hack)
For the rest (including the legend), you’ll have to use grid and its viewports.
A blend of the first 3 solutions:
annotation.list <- lapply(1:5,function(s){
mids <- genoPlotR::middle(dna.segs.list[[s]])
return(genoPlotR::annotation(x1=mids, x2=NA, text=dna.segs.list[[s]]$name,rot=75,col="black"))
})
names(annotation.list) <- names(dna.segs.list)
genoPlotR::plot_gene_map(dna_segs=dna.segs.list,tree=tree,tree_width=2,annotations=annotation.list,annotation_height=5,annotation_cex=0.4,scale=F,dna_seg_scale=F, xlims=rep(list(c(0,110)),5))
For the better solution with grid: (note the "plot_new=FALSE" in the call to plot_gene_map)
# changing rot to 30
annotation.list <- lapply(1:5,function(s){
mids <- genoPlotR::middle(dna.segs.list[[s]])
return(genoPlotR::annotation(x1=mids,x2=NA,text=dna.segs.list[[s]]$name,rot=30,col="black"))
})
names(annotation.list) <- names(dna.segs.list)
# main viewport: two columns, relative widths 1 and 0.3
pushViewport(viewport(layout=grid.layout(1,2, widths=unit(c(1, 0.3), rep("null", 2))), name="overall_vp"))
# viewport with gene_map
pushViewport(viewport(layout.pos.col=1, name="geneMap"))
genoPlotR::plot_gene_map(dna_segs=dna.segs.list,tree=tree,tree_width=2,annotations=annotation.list,annotation_height=3,annotation_cex=0.5,scale=F,dna_seg_scale=F, plot_new=FALSE)
upViewport()
# another viewport for the margin/legend
pushViewport(viewport(layout.pos.col=2, name="legend"))
plotLegend(…)
upViewport()
Hope that helps!
Lionel
Which function or package could I use to add the legend? The R base functions did not seem to work for me. The following message is displayed:
Error in strheight(legend, units = "user", cex = cex) :
plot.new has not been called yet"

Issue: ggplot2 replicates last plot of a list in grid

I have some 16 plots. I want to plot all of these in grid manner with ggplot2. But, whenever I plot, I get a grid with all the plots same, i.e, last plot saved in a list gets plotted at all the 16 places of grid. To replicate the same issue, here I am providing a simple example with two files. Although data are entirely different, but plots drawn are similar.
library(ggplot2)
library(grid)
library(gridExtra)
library(scales)
set.seed(1006)
date1<- as.POSIXct(seq(from=1443709107,by=3600,to=1446214707),origin="1970-01-01")
power <- rnorm(length(date1),100,5)#with normal distribution
write.csv(data.frame(date1,power),"file1.csv",row.names = FALSE,quote = FALSE)
# Now another dataset with uniform distribution
write.csv(data.frame(date1,power=runif(length(date1))),"file2.csv",row.names = FALSE,quote = FALSE)
path=getwd()
files=list.files(path,pattern="*.csv")
plist<-list()# for saving intermediate ggplots
for(i in 1:length(files))
{
dframe<-read.csv(paste(path,"/",files[i],sep = ""),head=TRUE,sep=",")
dframe$date1= as.POSIXct(dframe$date1)
plist[[i]]<- ggplot(dframe)+aes(dframe$date1,dframe$power)+geom_line()
}
grid.arrange(plist[[1]],plist[[2]],ncol = 1,nrow=2)
You need to remove the dframe from your call to aes. You should do that anyway because you have provided a data-argument. In this case it's even more important because while you save the ggplot-object, things don't get evaluated until the call to plot/grid.arrange. When you do that, it looks at the current value of dframe, which is the last dataset in your iteration.
You need to plot with:
ggplot(dframe)+aes(date1,power)+geom_line()

Change of colors in compare.matrix command in r

I'm trying to change the colors for the compare.matrix command in r, but the error is always the same:
Error in image.default(x = mids, y = mids, z = mdata, col = c(heat.colors(10)[10:1]), :
formal argument "col" matched by multiple actual arguments
My code is very simple:
compare.matrix(current,ech_b1,nbins=40)
and some of my attempts are:
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(5)))
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(10)[10:1]))
Assuming you're using compare.matrix() from the SDMTools package, the color arguments appear to be hard-coded into the function, so you'll need to redefine the function in order to make them flexible:
# this shows you the code in the console
SDMTools::compare.matrix
function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata, col=c(heat.colors(10)[10:1]),...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE,...)
}
So you can make a new one like so, but bummer, there are two functions using the ellipsis that have a col argument predefined. If you'll only be using extra args to image() and not to contour(), this is cheap and easy.
my.compare.matrix <- function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata,...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE)
}
If, however, you want to use ... for both internal calls, then the only way I know of to avoid confusion about redundant argument names is to do something like:
my.compare.matrix <- function(x,y,nbins,
image.args = list(col=c(heat.colors(10)[10:1])),
contour.args = list(col="black", lty="solid")){
#---- preceding code snipped ----#
contour.args[[x]] <- contour.args[[y]] <- image.args[[x]] <- image.args[[y]] <- mids
contour.args[[z]] <- image.args[[z]] <- mdata
suppressWarnings(do.call(image, image.args))
#overlay contours
do.call(contour, contour.args)
}
Decomposing this change: instead of ... make a named list of arguments, where the previous hard codes are now defaults. You can then change these items by renaming them in the list or adding to the list. This could be more elegant on the user side, but it gets the job done. Both of the above modifications are untested, but should get you there, and this is all prefaced by my above comment. There may be some other problem that cannot be detected by SO Samaritans because you didn't specify the package or the data.

R, graph of binomial distribution

I have to write own function to draw the density function of binomial distribution and hence draw
appropriate graph when n = 20 and p = 0.1,0.2,...,0.9. Also i need to comments on the graphs.
I tried this ;
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
return(barplot(x,names.arg=0:n))
}
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
#OR
graph(20,scan())
My first question : is there any way so that i don't need to write down the line graph(20,p) several times except using scan()?
My second question :
I want to see the graph in one device or want to hit ENTER to see the next graph. I wrote
par(mfcol=c(2,5))
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
but the graph is too tiny. How can i present the graphs nicely with giving head line n=20 and p=the value which i used to draw the graph?[though it can be done by writing mtext() after calling the function graphbut doing so i have to write a similar line few times. So i want to do this including in function graph. ]
My last question :
About comment. The graphs are showing that as the probability of success ,p is increasing the graph is tending to right, that is , the graph is right skewed.
Is there any way to comment on the graph using program?
Here a job of mapply since you loop over 2 variables.
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
barplot(x,names.arg=0:n,
main=sprintf(paste('bin. dist. ',n,p,sep=':')))
}
par(mfcol=c(2,5))
mapply(graph,20,seq(0.1,1,0.1))
Plotting base graphics is one of the times you often want to use a for loop. The reason is because most of the plotting functions return an object invisibly, but you're not interested in these; all you want is the side-effect of plotting. A loop ignores the returned obects, whereas the *apply family will waste effort collecting and returning them.
par(mfrow=c(2, 5))
for(p in seq(0.1, 1, len=10))
{
x <- dbinom(0:20, size=20, p=p)
barplot(x, names.arg=0:20, space=0)
}

Resources