I'm experimenting with the outstanding ggraph library to depict some interrelationships that are really hard to depict for a scientific work. Specifically, I want to show SNP-SNP interactions in a genetic locus. It would be very nice if I could plot the interactions as curved edges of a graph, where the SNPs are positioned linearly according to their genetic positions. The geom_edge_arc() aesthetics from the ggraph library would be ideal. However, I cannot put the nodes in order according to their positions.
Here is an example
library(igraph)
library(tidyverse)
library(ggraph)
set.seed(10)
nodes <- tibble(nodes = paste("SNP", 1:10), pos = sample(10000:20000, 10))
edges <- expand.grid(nodes$nodes, nodes$nodes) %>%
  mutate(interaction = rnorm(100)) %>%
  filter(abs(interaction) > 1)
gr <- graph_from_data_frame(edges, vertices = nodes)
ggraph(gr, 'linear', circular = FALSE) +
  geom_edge_arc(aes(edge_width = interaction))
The nodes are evenly spaced here, as if they were factors. However, I want to place them on the x axis at the coordinates given by the pos variable (which becomes a node attribute). Adding + geom_node_point(aes(x=pos)) to the ggplot object doesn't render correctly. I could probably do the plot with "basic" igraph too, but I like ggraph and ggplot2, and this would be an elegant and easy way to make the plot.
Kind regards, and thanks in advance,
Robert
Not sure if this is still relevant, but there are two ways to solve this.
As noted by @axeman, you can use the manual layout and pass the x and y coordinates to it:
ggraph(gr,
       layout = 'manual',
       node.position = data_frame(y = rep(0, length(nodes$pos)), x = nodes$pos)) +
  geom_edge_arc(aes(edge_width = interaction))
The other way is to override the x aesthetic inside geom_edge_arc. To be able to pass a node attribute to an aesthetic, we need to use geom_edge_arc2:
ggraph(gr, 'linear', circular = FALSE) +
  geom_edge_arc2(aes(edge_width = interaction, x = node.pos))
Created on 2018-05-30 by the reprex package (v0.2.0).
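A note for newer ggraph releases: as far as I know, since ggraph 2.0 the manual layout takes the coordinates through x and y arguments rather than node.position. The following is only a sketch under that assumption, rebuilding the question's data with base R so it is self-contained (the from/to column names are my own choice):

```r
library(igraph)
library(ggraph)

set.seed(10)
nodes <- data.frame(nodes = paste("SNP", 1:10),
                    pos   = sample(10000:20000, 10))
edges <- expand.grid(from = nodes$nodes, to = nodes$nodes,
                     stringsAsFactors = FALSE)
edges$interaction <- rnorm(nrow(edges))
edges <- edges[abs(edges$interaction) > 1, ]
gr <- graph_from_data_frame(edges, vertices = nodes)

# Assumption: ggraph >= 2.0, where the manual layout accepts x and y directly.
x_pos <- V(gr)$pos              # genetic position of each SNP
y_pos <- rep(0, vcount(gr))     # all nodes on one horizontal line

p <- ggraph(gr, layout = "manual", x = x_pos, y = y_pos) +
  geom_edge_arc(aes(edge_width = interaction)) +
  geom_node_point()
p
```

Taking the positions from V(gr)$pos rather than from the nodes table keeps the coordinates in the same order as the graph's vertices.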
Related
I am trying to visualize some networks using the ggraph package. My network has two different types of edges, A and B, which have different scales. I'd like to color the edges by type (which I've done) and also modulate their opacity by the value. However, since all the edges are displayed together and A and B have different scales, aes(alpha=value) applies a single scale across both A and B, so all the edges with the smaller scale (here A) are practically invisible. How can I separate the alpha scales for A and B so that the alpha corresponds to their internal scales (i.e., alpha=1 when an A edge is at max A and a B edge is at max B)?
I've included a small example below:
library(ggplot2)
library(igraph)
library(ggraph)
nodes <- data.frame(id=seq(1,5),label=c('a','b','c','d','e'))
edges <- data.frame(from=c(3,3,4,1,5,3,4,5),
to= c(2,4,5,5,3,4,5,1),
type=c('A','A','A','A','A','B','B','B'),
value=c(1,.2,.5,.3,1,5,12,8))
net <- graph_from_data_frame(d=edges,vertices=nodes,directed=T)
ggraph(net, layout = 'stress') +
  geom_edge_fan(aes(color = type, alpha = value)) +
  geom_node_label(aes(label = label), size = 5)
This is what the graph currently looks like:
And I want something that looks like this:
Ideally I'd be able to do this in R and not do a convoluted editing process in GIMP.
I was hoping this would be possible with set_scale_edge_alpha, but I can't find the solution anywhere. I saw from here that this can be done with ggnewscale, but this seems to require drawing two separate objects, and it also doesn't seem like there is a function for specifically changing edge aesthetics. Is there a simple way to do this without drawing two overlapping graphs?
Thanks!
It would probably be better just to rescale the values yourself before plotting. You can rescale the values to a 0-1 range within each group (this needs dplyr, which the question's code does not load):
library(dplyr)
edges <- edges %>%
  group_by(type) %>%
  mutate(value = scales::rescale(value))
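Note that scales::rescale() does min-max scaling, so the smallest edge in each group maps to 0. If the goal is only that the largest edge of each type reaches 1, a variant is to divide by the group maximum instead. This is a base R sketch of that alternative, not the method above; the column name value_scaled is made up for illustration:

```r
edges <- data.frame(from  = c(3, 3, 4, 1, 5, 3, 4, 5),
                    to    = c(2, 4, 5, 5, 3, 4, 5, 1),
                    type  = c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'),
                    value = c(1, .2, .5, .3, 1, 5, 12, 8))

# Divide each value by the maximum of its own type, so every type peaks at 1.
# value_scaled is a hypothetical column name, not part of the original data.
edges$value_scaled <- ave(edges$value, edges$type,
                          FUN = function(v) v / max(v))
edges
```

The new column can then be mapped with aes(alpha = value_scaled) in geom_edge_fan().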
I have an igraph network object with a number of edge attributes. One edge attribute classifies the type of network by number (e.g. 1 = friendship, 2 = advice). I want to plot each type of network separately. However, I do not want to create separate sub-graphs. I want to maintain the same layout for all network types, using just one igraph network object. How to do this is not obvious to me from the igraph documentation. Can somebody help me here?
If you plot the graph twice and call set.seed() with the same seed before each plot, the layout should be the same. Then you could make edges transparent or visible depending on which edges you want to show.
I believe there is no direct way to do that in igraph. And that makes sense because those are attributes; i.e., something additional rather than a standard way to specify an edge type. Hence, I think one good option is simply altering the set of edges while plotting a graph as in the following example:
library(igraph)
g <- make_ring(10) %>%
  set_edge_attr("weight", value = 1:10) %>%
  set_edge_attr("color", value = "red")
plot(g %>% delete_edges(which(edge_attr(g)$weight > 5)))
plot(g %>% delete_edges(which(edge_attr(g)$weight <= 5)))
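To keep the node positions identical across the plots, one option is to compute the layout once on the full graph and pass it to every plot() call. The sketch below assumes an edge attribute called type (a hypothetical name, coded 1 and 2 as in the question) and uses layout_with_fr, though any layout function should work the same way:

```r
library(igraph)

g <- make_ring(10)
# Hypothetical edge attribute: 1 = friendship, 2 = advice
E(g)$type <- rep(c(1, 2), each = 5)

# Compute one layout on the complete graph and reuse it everywhere.
set.seed(42)
lay <- layout_with_fr(g)

# delete_edges() keeps all vertices, so the layout rows still match.
g_friend <- delete_edges(g, which(E(g)$type != 1))
g_advice <- delete_edges(g, which(E(g)$type != 2))

plot(g_friend, layout = lay, main = "type 1")
plot(g_advice, layout = lay, main = "type 2")
```

Because only edges are removed, the vertex count and order stay the same and the stored coordinates apply to both plots.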
I have 23 different groups, each consisting of 7 to 20 individual samples (approximately 350-400 observations in total), with their own x, y and z coordinates. I'd like to produce a 3D plot of these data with the plot3d function of the rgl R package, which is not a big deal in general. The problem is that I'd like each of the 23 groups to be easily distinguishable on the 3D plot. I tried using a different color for each group, but unfortunately it's not possible to find 23 colors that the human eye can tell apart well. I thought about a pch parameter like the one in the plot function of base R, but as far as I can see there is no such option in plot3d. Besides, there are too many points in my data set, so adding a label to each point (e.g. with the text3d rgl function) is not a good idea: the labels would overlap and make a mess of the 3D plot. Is there a way to solve this (I guess it's a very common problem)? Thank you in advance!
Below is code of some toy example for better explanation:
# generate data
prefix=rep("ID",69)
suffix=rep(1:23,3)
suffix_2=as.character(suffix[order(suffix)])
titles_1=paste(prefix,suffix,sep="_")
titles_2=titles_1[order(titles_1)]
x=1:69
y=x+20
z=x+50
df=data.frame(titles_2,x,y,z)
# load rgl library
library('rgl')
# make 3D plot
plot3d(x,y,z)
If you like living on the bleeding edge, there's a new function rgl::pch3d() that draws symbols using the same codes as points() does in base graphics. It's in rgl 0.95.1475, available on R-forge (and within a few hours on Github; see How do I install the latest version of rgl?). It's not completely working with rglwidget() yet.
The example code
open3d()
i <- 0:25; x <- i %% 5; y <- rep(0, 26); z <- i %/% 5
pch3d(x, y, z, pch = i, bg = "green")
text3d(x, y, z + 0.3, i)
pch3d(x + 5, y, z, pch = LETTERS[i+1])
text3d(x + 5, y, z + 0.3, i+65)
produces this display (after some resizing and rotation):
It's not perfect, but how about using letters a-w to distinguish the groups?
with(df,plot3d(x,y,z))
with(df,text3d(x,y,z,texts=letters[as.integer(factor(titles_2))]))
Because I'm going to use the 3D plot for publication, I used this solution for now. I don't claim it is the best one.
# generate data
prefix=rep("ID",69)
suffix=rep(1:23,3)
suffix_2=as.character(suffix[order(suffix)])
titles_1=paste(prefix,suffix,sep="_")
titles_2=titles_1[order(titles_1)]
x=1:69
y=x+20
z=x+50
df=data.frame(titles_2,x,y,z)
# load rgl library
library('rgl')
# load randomcoloR library
library(randomcoloR)
# create a custom palette
palette <- distinctColorPalette(23)
palette(palette)
# make 3D plot
plot3d(x,y,z,size = 10,col=suffix[order(suffix)])
To create a parallel coordinate plot I wanted to use the ggparcoord() function in the GGally package. The following code shows a reproducible example.
set.seed(3674)
k <- rep(1:3, each=30)
x <- k + rnorm(mean=10, sd=.2,n=90)
y <- -2*k + rnorm(mean=10, sd=.4,n=90)
z <- 3*k + rnorm(mean=10, sd=.6,n=90)
dat <- data.frame(group=factor(k),x,y,z)
library(GGally)
ggparcoord(dat,columns=1:4,groupColumn = 1)
Notice in the picture that the color for group is continuous even though the group variable is a factor. Is there any way I can display the plot with three discrete colors instead?
I have looked at some other posts that discuss various other ways of doing parallel coordinate plots. But I really want to do this with the ggparcoord() function of the GGally package. I appreciate your time in thinking about this problem.
Your code was almost correct. The problem is columns=1:4: you need to leave the groupColumn column out of columns.
ggparcoord(dat, columns = 2:4, groupColumn = 1)
I know dendrograms are quite popular. However, if there is a large number of observations and classes, they are hard to follow, and I sometimes feel there should be a better way to present the same thing. I have an idea but do not know how to implement it.
Consider the following dendrogram.
data(mtcars)
plot(hclust(dist(mtcars)))
Can I plot it like a scatter plot, in which the distance between two points is drawn as a line, separate clusters (at an assumed threshold) are colored, and circle size is determined by the value of some variable?
You are describing a fairly typical way of going about cluster analysis:
Use a clustering algorithm (in this case hierarchical clustering)
Decide on the number of clusters
Project the data onto a two-dimensional plane using some form of principal component analysis
The code:
hc <- hclust(dist(mtcars))
cluster <- cutree(hc, k=3)
xy <- data.frame(cmdscale(dist(mtcars)), factor(cluster))
names(xy) <- c("x", "y", "cluster")
xy$model <- rownames(xy)
library(ggplot2)
ggplot(xy, aes(x, y)) + geom_point(aes(colour=cluster), size=3)
What happens next is that you get a skilled statistician to help explain what the x and y axes mean. This usually involves projecting the data to the axes and extracting the factor loadings.
The plot:
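As a starting point for interpreting the axes, here is a hedged sketch of how the loadings might be extracted. It relies on the fact that classical MDS on Euclidean distances gives the same coordinates as the scores of a PCA on the centered, unscaled data, up to the sign of each axis. Using prcomp here is my own choice and not part of the code above:

```r
# PCA on the raw (centered, unscaled) data, matching dist(mtcars) above
pca <- prcomp(mtcars)

# Loadings: how much each original variable contributes to the first two axes
loadings <- pca$rotation[, 1:2]
round(loadings, 3)

# Compare the PCA scores with the MDS coordinates, ignoring axis sign
mds <- cmdscale(dist(mtcars))
same_axes <- all.equal(abs(unname(pca$x[, 1:2])), abs(unname(mds)))
same_axes
```

Since the variables are not scaled, variables measured in large units are likely to dominate the first axes; whether to scale them first is a modelling decision for the statistician.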