ggplot2 fix geom_points position - r

Im developing a function that uses ggplot2 to create multiple graphs that compares two set of points representing shapes.
ref = matrix(c(1,3,1,3,2,2,4,4),nrow=4, ncol=2)
ref<-data.frame(x=ref[,1], y=ref[,2])
shapes<-list()
shapes[[1]]<-matrix(c(1.5,2.9,1.4,3.1,2.2,2.3,4.5,3.5),nrow=4, ncol=2)
shapes[[2]]<-matrix(c(0.5,3.9,1.1,3.1,1.8,2,4.5,3.5),nrow=4, ncol=2)
shapes[[3]]<-matrix(c(1.8,3.2,1,3.5,2.2,2.3,4.5,3.5),nrow=4, ncol=2)
newplots<-list()
for(i in 1:length(shapes)){
target<-shapes[[i]]
vari<-data.frame(x=target[,1], y=target[,2])
newplots[[i]]<-ggplot(ref,aes(x = x,y = y)) + geom_point(size=2,color="red")+coord_fixed()+
geom_point(data=vari,aes(x=x,y=y),color="black",size=3)
}
newplots[[1]]
newplots[[2]]
newplots[[3]]
newplots[[4]]
The problem is that the reference points seem to "move" from plot to plot when they suppose to stay in the same place.

When you say 'move' are you referring to the graph scaling changing? I ran the code and that's the only change I noticed. If so- you can simply add something like:
+ xlim(1, 4) + ylim(2,5)
in the the loop after the ggplot.

Related

how to mimic histogram plot from flowjo in R using flowCore?

I'm new to flowCore + R. I would like to mimic a histogram plot after gating that can be manually done in FlowJo software. I got something similar but it doesn't look quite right because it is a "density" plot and is shifted. How can I get the x axis to shift over and look similar to how FlowJo outputs the plot? I tried reading this document but couldn't find a plot similar to the one in FlowJo: howtoflowcore Appreciate any guidance. Thanks.
code snippet:
library(flowCore)
parentpath <- "/parent/path"
subfolder <- "Sample 1"
fcs_files <- list.files(paste0(parentpath, subfolder), pattern = ".fcs")
fs <- read.flowSet(fcs_files)
rect.g <- rectangleGate(filterId = "main",list("FSC-A" = c(1e5, 2e5), "SSC-A" = c(3e4,1e5)))
fs_sub <- Subset(fs, rect.g)
p <- ggcyto(fs_sub[[15]], aes(x= `UV-379-A`)) +
geom_density(fill='black', alpha = 0.4) +
ggcyto_par_set(limits = list(x = c(-1e3, 5e4), y = c(0, 6e-5)))
p
FlowJo output:
R FlowCore output:
The reason that for the "shift" is that the x axis is logarithmic (base 10) in the flowJo graph. To achieve the same result in R, add
+ scale_x_log10()
after the existing code. This might interact weirdly with the axis limits you've set, so bare that in mind.
To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:
aes(x= `UV-379-A`, y = after_stat(count))
Let me know if that works - I don't have your data to hand so that's all from memory!
For any purely aesthetic changes, they are relatively easy to look up.

Plot a table with box size changing

Does anyone have an idea how is this kind of chart plotted? It seems like heat map. However, instead of using color, size of each cell is used to indicate the magnitude. I want to plot a figure like this but I don't know how to realize it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (must be a vector of the same length as x and y), and c is a 3xlength(x) matrix with the color value for each entry. The labels for the plot can be input with set(gcf,properties) or xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA';etc}
xticks(1:X)
xticklabels(xlab)
set(get(gca,'XLabel'),'Rotation',90);
ylab={'RAPGEB6';etc}
yticks(1:Y)
yticklabels(ylab)
EDIT: yticks & co are only available for >R2016b, if you don't have a newer version you should use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available for >R2014b
set(gca,'YTick',1:Y,'YTickLabel',ylab)
in R, you should use ggplot2 that allows you to map your values (gene expression in your case?) onto the size variable. Here, I did a simulation that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide=FALSE) + ##hide scale
scale_size_continuous(guide=FALSE) ##hide another scale
In R, corrplotpackage can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')

Ggplot does not show plots in sourced function

I've been trying to draw two plots using R's ggplot library in RStudio. Problem is, when I draw two within one function, only the last one displays (in RStudio's "plots" view) and the first one disappears. Even worse, when I run ggsave() after each plot - which saves them to a file - neither of them appear (but the files save as expected). However, I want to view what I've saved in the plots as I was able to before.
Is there a way I can both display what I'll be plotting in RStudio's plots view and also save them? Moreover, when the plots are not being saved, why does the display problem happen when there's more than one plot? (i.e. why does it show the last one but not the ones before?)
The code with the plotting parts are below. I've removed some parts because they seem unnecessary (but can add them if they are indeed relevant).
HHIplot = ggplot(pergame)
# some ggplot geoms and misc. here
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
HHIAvePlot = ggplot(AveHHI, aes(x = AveHHI$n_brokers))
# some ggplot geoms and misc. here
ggsave(paste("Average HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I've already taken a look here and here but neither have helped. Adding a print(HHIplot) or print(HHIAvePlot) after the ggsave() lines has not displayed the plot.
Many thanks in advance.
Update 1: The solution suggested below didn't work, although it works for the answer's sample code. I passed the ggplot objects to .Globalenv and print() gives me an empty gray box on the plot area (which I imagine is an empty ggplot object with no layers). I think the issue might lie in some of the layers or manipulators I have used, so I've brought the full code for one ggplot object below. Any thoughts? (Note: I've tried putting the assign() line in all possible locations in relation to ggsave() and ggplot().)
HHIplot = ggplot(pergame)
HHIplot +
geom_point(aes(x = pergame$n_brokers, y = pergame$HHI)) +
scale_y_continuous(limits = c(0,10000)) +
scale_x_discrete(breaks = gameSizes) +
labs(title = paste("HHI Index of all games,",year,"Finals"),
x = "Game Size", y = "Herfindahl-Hirschman Index") +
theme(text = element_text(size=15),axis.text.x = element_text(angle = 0, hjust = 1))
assign("HHIplot",HHIplot, envir = .GlobalEnv)
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I'll preface this by saying that the following is bad practice. It's considered bad practice to break a programming language's scoping rules for something as trivial as this, but here's how it's done anyway.
So within the body of your function you'll create both plots and put them into variables. Then you'll use ggsave() to write them out. Finally, you'll use assign() to push the variables to the global scope.
library(ggplot2)
myFun <- function() {
#some sample data that you should be passing into the function via arguments
df <- data.frame(x=1:10, y1=1:10, y2=10:1)
p1 <- ggplot(df, aes(x=x, y=y1))+geom_point()
p2 <- ggplot(df, aes(x=x, y=y2))+geom_point()
ggsave('p1.jpg', p1)
ggsave('p2.jpg', p2)
assign('p1', p1, envir=.GlobalEnv)
assign('p2', p2, envir=.GlobalEnv)
return()
}
Now, when you run myFun() it will write out your two plots to .jpg files, and also drop the plots into your global environment so that you can just run p1 or p2 on the console and they'll appear in RStudio's Plot pane.
ONCE AGAIN, THIS IS BAD PRACTICE
Good practice would be to not worry about the fact that they're not popping up in RStudio. They wrote out to files, and you know they did, so go look at them there.

Concentric Circles like a grid, centered at origin

I would like to include a sequence of concentric circles as a grid in a plot of points. The goal is to give the viewer an idea of which points in the plot have approximately the same magnitude.
I created a hack to do this:
add_circle_grid <- function(g,ncirc = 10){
gb <- ggplot_build(g)
xl <- gb$panel$ranges[[1]]$x.range
yl <- gb$panel$ranges[[1]]$y.range
rmax = sqrt(max(xl)^2+max(yl)^2)
theta=seq(from=0,by=.01,to=2*pi)
for(n in 1:ncirc){
r <- n*rmax/ncirc
circle <- data.frame(x=r*sin(theta),y=r*cos(theta))
g<- g+geom_path(data=circle,aes(x=x,y=y),alpha=.2)
}
return(g+xlim(xl)+ylim(yl))
}
xy<-data.frame(x=rnorm(100),y=rnorm(100))
ggplot(xy,aes(x,y))+geom_point()
ggg<-add_circle_grid(ggplot(xy,aes(x,y))+geom_point())
print(ggg)
But I was wondering if there is a more ggplot way to do this. I also considered using polar coordinates but it does not allow me to set x- and y-limits in the same way.
Finally, I wouldn't mind little text labels indicating the radius of each circle.
EDIT
Perhaps this is asking too much but there are two other things that I would like.
The axis limits should stay the same (which can be done via ggplot_build)
Can this work with facets? As far as I can tell you would need to somehow figure out the facets if I want to add the circles dynamically.
set.seed(1)
xy <- data.frame(x=rnorm(100),y=rnorm(100))
rmax = sqrt(max(xy$x)^2+max(xy$y)^2)
theta=seq(from=0,by=.01,to=2*pi)
ncirc=10
dat.circ = do.call(rbind,
lapply(seq_len(ncirc),function(n){
r <- n*rmax/ncirc
data.frame(x=r*sin(theta),y=r*cos(theta),r=round(r,2))
}))
rr <- unique(dat.circ$r)
dat.text=data.frame(x=rr*cos(30),y=rr*sin(30),label=rr)
library(ggplot2)
ggplot(xy,aes(x,y))+
geom_point() +
geom_path(data=dat.circ,alpha=.2,aes(group=factor(r))) +
geom_text(data=dat.text,aes(label=rr),vjust=-1)
How about this with ggplot2 and grid:
require(ggplot2)
require(grid)
x<-(runif(100)-0.5)*4
y<-(runif(100)-0.5)*4
circ_rads<-seq(0.25,2,0.25)
qplot(x,y)+
lapply(circ_rads,FUN=function(x)annotation_custom(circleGrob(gp=gpar(fill="transparent",color="black")),-x,x,-x,x))+
geom_text(aes(x=0,y=circ_rads+0.1,label=circ_rads)) + coord_fixed(ratio = 1)

How can I overlay two dense scatter plots so that I can see the outlines of each in R or Matlab?

See this example
This was created in matlab by making two scatter plots independently, creating images of each, then using the imagesc to draw them into the same figure and then finally setting the alpha of the top image to 0.5.
I would like to do this in R or matlab without using images, since creating an image does not preserve the axis scale information, nor can I overlay a grid (e.g. using 'grid on' in matlab). Ideally I wold like to do this properly in matlab, but would also be happy with a solution in R. It seems like it should be possible but I can't for the life of me figure it out.
So generally, I would like to be able to set the alpha of an entire plotted object (i.e. of a matlab plot handle in matlab parlance...)
Thanks,
Ben.
EDIT: The data in the above example is actually 2D. The plotted points are from a computer simulation. Each point represents 'amplitude' (y-axis) (an emergent property specific to the simulation I'm running), plotted against 'performance' (x-axis).
EDIT 2: There are 1796400 points in each data set.
Using ggplot2 you can add together two geom_point's and make them transparent using the alpha parameter. ggplot2 als adds up transparency, and I think this is what you want. This should work, although I haven't run this.
dat = data.frame(x = runif(1000), y = runif(1000), cat = rep(c("A","B"), each = 500))
ggplot(aes(x = x, y = y, color = cat), data = dat) + geom_point(alpha = 0.3)
ggplot2 is awesome!
This is an example of calculating and drawing a convex hull:
library(automap)
library(ggplot2)
library(plyr)
loadMeuse()
theme_set(theme_bw())
meuse = as.data.frame(meuse)
chull_per_soil = ddply(meuse, .(soil),
function(sub) sub[chull(sub$x, sub$y),c("x","y")])
ggplot(aes(x = x, y = y), data = meuse) +
geom_point(aes(size = log(zinc), color = ffreq)) +
geom_polygon(aes(color = soil), data = chull_per_soil, fill = NA) +
coord_equal()
which leads to the following illustration:
You could first export the two data sets as bitmap images, re-import them, add transparency:
library(grid)
N <- 1e7 # Warning: slow
d <- data.frame(x1=rnorm(N),
x2=rnorm(N, 0.8, 0.9),
y=rnorm(N, 0.8, 0.2),
z=rnorm(N, 0.2, 0.4))
v <- with(d, dataViewport(c(x1,x2),c(y, z)))
png("layer1.png", bg="transparent")
with(d, grid.points(x1,y, vp=v,default="native",pch=".",gp=gpar(col="blue")))
dev.off()
png("layer2.png", bg="transparent")
with(d, grid.points(x2,z, vp=v,default="native",pch=".",gp=gpar(col="red")))
dev.off()
library(png)
i1 <- readPNG("layer1.png", native=FALSE)
i2 <- readPNG("layer2.png", native=FALSE)
ghostize <- function(r, alpha=0.5)
matrix(adjustcolor(rgb(r[,,1],r[,,2],r[,,3],r[,,4]), alpha.f=alpha), nrow=dim(r)[1])
grid.newpage()
grid.rect(gp=gpar(fill="white"))
grid.raster(ghostize(i1))
grid.raster(ghostize(i2))
you can add these as layers in, say, ggplot2.
Use the transparency capability of color descriptions. You can define a color as a sequence of four 2-byte words: muddy <- "#888888FF" . The first three pairs set the RGB colors (00 to FF); the final pair sets the transparency level.
AFAIK, your best option with Matlab is to just make your own plot function. The scatter plot points unfortunately do not yet have a transparency attribute so you cannot affect it. However, if you create, say, most crudely, a bunch of loops which draw many tiny circles, you can then easily give them an alpha value and obtain a transparent set of data points.

Resources