Match vertex and edge color in igraph - r

I have a large data set that I want to represent as a network graph using igraph. I just don't understand how to get the colors right. Is it possible to get an igraph plot in which each edge has the same color as its vertex? In my example below, I would like to color vertices and edges according to the status 'sampled' or 'unsampled'. Another problem is that not all the edges appear in the plot, and I don't understand why.
My code so far is:
d <- data.frame(individual=c(1:10), mother_id = c(0,0,0,0,0,1,3,7,6,7), father_id = c(0,0,0,0,0,4,1,6,7,6) , generation = c(0,0,0,0,0,1,1,2,2,2), status=c("sampled","unsampled","unsampled","sampled",'sampled',"sampled","unsampled","unsampled","sampled",'sampled'))
#Just some settings for layout plot
g <- d$generation
n <- nrow(d)
pos <- matrix(data = NA, nrow = n, ncol = 2)
pos[, 2] <- max(g) - g
pos[, 1] <- order(g, partial = order(d$individual, decreasing = TRUE)) - cumsum(c(0, table(g)))[g + 1]
#Plotting the igraph
G <- graph_from_data_frame(d)
plot(G, rescale = T, vertex.label = d$individual, layout = pos,
edge.arrow.mode = "-",
vertex.color = d$status,
edge.color = d$status,
asp = 0.35)
My question is somewhat similar to the question "Ggraph node color to match edge color", but I would like to do it with the igraph package.
Thanks for your help

If you plot(G), you will see that the graph built from the data frame is most likely not what you expect. That is why you don't see all the edges: the column father_id is never used as an edge endpoint.
By default, igraph takes the first column as "from" and the second as "to" (any further columns become edge attributes). That is why you see edges 1 to 0, 2 to 0, and so on.
You can fix this by passing in two data frames: one with the edges and their attributes, and one with the nodes and their attributes.
It is not entirely clear to me where the edges should be, but your code should look something like this:
library(igraph)
dd <- read.table(text = "
from to type
1 6 A
3 7 B
7 8 A
6 9 B
7 10 A
4 6 B
1 7 A
6 8 B
7 9 B
6 10 A ", header = TRUE)
nodes <- data.frame(id = unique(c(dd$from, dd$to)))
nodes$type <- sample(LETTERS[1:2], 8, replace = TRUE)
nodes$x <- c(8, 3, 5, 7, 1, 2, 4, 10) # x and y are used for the layout
nodes$y <- c(1, 2, 4, 5, 6, 8, 5, 7)
nodes
id type x y
1 1 B 8 1
2 3 A 3 2
3 7 B 5 4
4 6 A 7 5
5 4 A 1 6
6 8 B 2 8
7 9 A 4 5
8 10 A 10 7
G <- graph_from_data_frame(dd, vertices = nodes ) # directed T or F?
V(G)$color <- ifelse( V(G)$type == "A", "pink", "skyblue")
E(G)$color <- ifelse( E(G)$type == "A", "pink", "skyblue")
edge_attr(G)
vertex_attr(G)
plot(G)
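As a minimal sketch of how this could be applied to the data frame from the question: the edge list below is derived from mother_id and father_id, and both vertices and edges are colored by status. The treatment of 0 as "parent unknown", the two color names, and the rule that an edge takes the color of the child are assumptions made for illustration, not something the question specifies.

```r
library(igraph)

d <- data.frame(
  individual = 1:10,
  mother_id  = c(0, 0, 0, 0, 0, 1, 3, 7, 6, 7),
  father_id  = c(0, 0, 0, 0, 0, 4, 1, 6, 7, 6),
  status     = c("sampled", "unsampled", "unsampled", "sampled", "sampled",
                 "sampled", "unsampled", "unsampled", "sampled", "sampled")
)

# One edge per known parent; a parent id of 0 is assumed to mean "unknown".
edges <- rbind(
  data.frame(from = d$mother_id, to = d$individual),
  data.frame(from = d$father_id, to = d$individual)
)
edges <- edges[edges$from != 0, ]

# Assumed color scheme, one color per status.
status_cols <- c(sampled = "tomato", unsampled = "grey70")

G <- graph_from_data_frame(edges, directed = FALSE, vertices = d)
V(G)$color <- unname(status_cols[as.character(V(G)$status)])

# Assumption: each edge takes the color of the child (the "to" individual).
child_status <- as.character(d$status[match(edges$to, d$individual)])
E(G)$color <- unname(status_cols[child_status])

plot(G)
```

Because the colors are stored as the vertex and edge attribute color, plot(G) picks them up without any vertex.color or edge.color argument.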

Related

how to convert a node list to an edge list in igraph?

I have an empty graph and need to plot the graph based on the convex hull, with the inner vertices.
My attempt is:
library(igraph)
set.seed(45)
n = 10
g <- graph.empty(n)
xy <- cbind(runif(n), runif(n))
vp <- convex_hull(xy)$resverts + 1
#[1] 8 10 7 2 1
## convert node_list to edge_list
plot(g, layout=xy)
Expected result in right figure.
Question. How to convert a node list to an edge list in igraph??
You can use add_edges along with embed
g2 <- g %>%
add_edges(c(t(embed(vp, 2)), vp[1], vp[length(vp)])) %>%
as.undirected()
and plot(g2, layout = xy) in turn gives
convex_hull does not output a node list in the same sense that an igraph object has a node list. In this case, vp is the sequence of indices so in order to create an edge list, you just need to have the from vertex be going to the next vertex in the sequence. This can be accomplished with dplyr::lead using the first vertex as the default to create a circuit.
data.frame(
from = vp,
to = dplyr::lead(vp, 1, default = vp[1])
)
#> from to
#> 1 8 10
#> 2 10 7
#> 3 7 2
#> 4 2 1
#> 5 1 8
Try this.
## create graph.
vids <- as.character(c(8, 10, 7, 2, 1))
g <- make_graph(c(), length(vids))
V(g)$name <- vids
## and connect the dots.
g2 <- g + path(c(vids, vids[1]))
g2
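For comparison, the same "each vertex goes to the next one, and the last goes back to the first" idea can be sketched in base R without dplyr. The vector vp is copied from the output shown in the question; the name edge_list is only illustrative.

```r
# Hull vertices in order, as printed in the question.
vp <- c(8, 10, 7, 2, 1)

# Shift the sequence by one and wrap the first vertex to the end,
# so the last hull vertex connects back to the first.
edge_list <- cbind(from = vp, to = c(vp[-1], vp[1]))
edge_list
```

A two-column matrix like this should be usable with igraph's graph_from_edgelist(), although that step is not shown here.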

Changing edge color based on attribute

I'm working on a visual representation of a network in R, using the igraph package.
I have a data set with the links between all the nodes and, for each link/edge, the district it is assigned to.
I would like to set the color of each edge based on the district it is assigned to. The table below shows the structure of the data.
nodei  nodej  depot1  depot2
4      5      1       0
In this case the link (4-5) is assigned to depot1, so the edge color of the edge should be green, for example.
Here is an example in which edges incident to vertex 1 are colored "red" and all others "green":
g <- make_ring(5)
g %>%
set_edge_attr(
name = "color",
value = c("green", "red")[1 + (rowSums(ends(., E(.)) == "1") > 0)]
) %>%
plot()
In your case, you could replace "1" by "depot1" and give it a try.
I tried to understand your question and created a type attribute to color the edges and nodes.
library(igraph)
data <- read.table(text = "
N D type
1 6 A
3 7 B
7 8 A
4 5 B
7 10 A
4 6 B
1 7 A
6 8 B
7 9 B
6 10 A ", header = TRUE)
nodes <- data.frame(id = unique(c(data$N, data$D)))
nodes$type <- rep_len(c("A", "B"), nrow(nodes)) # alternate the two types across all nodes
# optional: nodes$x and nodes$y (one value per node) would fix the layout
nodes
G <- graph_from_data_frame(data, vertices = nodes)
V(G)$color <- ifelse( V(G)$type == "A", "red", "green")
E(G)$color <- ifelse( E(G)$type == "A", "red","green")
edge_attr(G)
vertex_attr(G)
plot(G)
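Working directly from the table layout in the question, a minimal sketch could derive the edge color from the depot columns before building the graph. Only the first row (4, 5, 1, 0) and the green color for depot1 come from the question; the second row and the color "blue" for depot2 are made up for illustration.

```r
library(igraph)

# First row is from the question; the second row is a hypothetical link.
links <- data.frame(
  nodei  = c(4, 5),
  nodej  = c(5, 6),
  depot1 = c(1, 0),
  depot2 = c(0, 1)
)

# Edges assigned to depot1 are green; all others get an assumed color.
links$color <- ifelse(links$depot1 == 1, "green", "blue")

# The first two columns are the endpoints; the rest become edge attributes,
# and plot() uses the edge attribute "color" when it is present.
g <- graph_from_data_frame(links, directed = FALSE)
plot(g)
```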

Creating a dataframe of edges from node information in R?

I am trying to create an edge list based on binary splits.
If I have a data frame that only contains the node number and some other metric, then I can manually create an edge list for the nodes. For example, if my data frame looks like this:
dfTest <- data.frame(
node = c(1,2,3,4,5),
var = c("milk", NA, "coffee", NA, NA),
isLeaf = c(F, T, F, T, T)
)
> dfTest
node var isLeaf
1 1 milk FALSE
2 2 <NA> TRUE
3 3 coffee FALSE
4 4 <NA> TRUE
5 5 <NA> TRUE
Then, based on the var or isLeaf column, I can manually create an edge list to connect the nodes. For example, as node 2 is a leaf, I know that node 1 must go to node 2. Then (as these are binary splits) I know node 1 must also connect to node 3. And as nodes 4 and 5 are leaf nodes, I know that they must split on node 3.
Manually creating the edge list would look like this:
edges <- data.frame(
from = c(1, 1, 3, 3),
to = c(2, 3, 4, 5)
)
The to column is easy to find: it will always be 2:nrow(dfTest), in this case 2, 3, 4, 5. But the from column is proving difficult to find.
Just for a visual aid, the resulting tree would look like this:
Is there any way to do this without having to manually work out the edges?
EDIT:
In response to an answer, I'm adding a slightly larger dataset to use:
dfTest <- data.frame(
node = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
var = c("milk", "milk", NA, NA, "coffee", "sugar", NA, NA, "milk", NA, NA),
isLeaf = c(F, F, T, T, F, F, T, T, F, T, T)
)
A little explanation:
From the var column I know that milk (the root, node 1) splits to another milk (node 2). I can then see that node 2 splits to NA (node 3) and NA (node 4). As these are binary splits, I know that node 2 can't split any further. So I must go back to the previous node that has only one split so far, in this case node 1 (i.e., milk), which then splits to the right on coffee (node 5). Again, as these are binary splits, I now know that coffee (node 5) must split to sugar (node 6). Sugar (node 6) is followed by two NAs (nodes 7 and 8). Now I must go back to coffee (node 5) and split to the right to get milk (node 9), which splits to two NAs (nodes 10 and 11).
The desired node/edge list should look like this:
edges <- data.frame(
from = c(1,2,2,1,5,6,6,5,9,9),
to = c(2,3,4,5,6,7,8,9,10,11)
)
Which would produce a tree like this:
As per your update, perhaps you can try the code below
grps <- split(dfTest, ~ cumsum(!isLeaf))
edges <- do.call(
rbind,
lapply(
grps,
function(x) {
with(x, expand.grid(from = node[!isLeaf], to = node[isLeaf]))
}
)
)
for (k in seq_along(grps)) {
if (nrow(grps[[k]]) == 1) {
lleaf <- with(grps[[k + 1]], node[!isLeaf])
rleaf <- with(grps[[k + 2]], node[!isLeaf])
edges <- rbind(edges, data.frame(from = grps[[k]]$node, to = c(lleaf, rleaf)))
}
}
edges <- `row.names<-`(edges[with(edges, order(from, to)), ], NULL)
which gives
> edges
from to
1 1 2
2 1 5
3 2 3
4 2 4
5 5 6
6 5 9
7 6 7
8 6 8
9 9 10
10 9 11
Visualization
library(igraph)
graph_from_data_frame(edges) %>%
plot(layout = layout_as_tree)
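The "go back to the previous node that still needs a split" reasoning from the question can also be sketched with an explicit stack. This is an alternative illustration, not the method used in the answer above; it assumes the rows are listed in pre-order (parent, then left subtree, then right subtree) and that every non-leaf node has exactly two children. The function name preorder_edges is hypothetical.

```r
# Hypothetical helper: rebuild parent links from a pre-order node listing,
# assuming every internal (non-leaf) node has exactly two children.
preorder_edges <- function(node, is_leaf) {
  from <- numeric(0)
  to <- numeric(0)
  stack <- numeric(0)      # internal nodes still waiting for children
  remaining <- numeric(0)  # number of children each stacked node still needs
  for (i in seq_along(node)) {
    if (length(stack) > 0) {
      top <- length(stack)
      # The current node is a child of the most recent unfinished node.
      from <- c(from, stack[top])
      to <- c(to, node[i])
      remaining[top] <- remaining[top] - 1
      if (remaining[top] == 0) {  # parent has both children, drop it
        stack <- stack[-top]
        remaining <- remaining[-top]
      }
    }
    if (!is_leaf[i]) {
      # A new internal node needs two children of its own.
      stack <- c(stack, node[i])
      remaining <- c(remaining, 2)
    }
  }
  data.frame(from = from, to = to)
}

dfTest <- data.frame(
  node = 1:11,
  isLeaf = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
)
edges <- preorder_edges(dfTest$node, dfTest$isLeaf)
edges
```

On the larger example from the question, this yields the from and to columns in the same order as the desired edge list given there.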

Control which nodes to change size igraph

This may look simple but I am not able to do this. I want to plot two types of nodes, small and big using a cutoff. The values are in the attribute degree.
Here is a small toy example
g1 <- graph(edges=c(1,2, 2,3, 3, 1, 4,2), n=4, directed=F) %>%
set_vertex_attr("names", value = LETTERS[1:4])
g1_degree <- degree(g1, mode = "total")
g1_degree
[1] 2 3 2 1
g1 <- set_vertex_attr(g1, "degree", value = g1_degree)
plot(g1, vertex.size=V(g1)$degree)
This gives me every node according to the degree, but I want nodes of degree 2 and 3 big and 1 small.
So I tried to edit the values within V(g1)$degree
ifelse(V(g1)$degree < 2, yes = V(g1)$degree==1, no = V(g1)$degree==5)
FALSE FALSE FALSE TRUE
Ok, I checked my degree values, but how can I overwrite the TRUE or FALSE using the cutoffs I need?
Here are two solutions.
One with ifelse, like in the question.
g1 <- set_vertex_attr(g1, "degree", value = ifelse(V(g1)$degree < 2, 1, 5))
V(g1)$degree
#[1] 5 5 5 1
And another with findInterval. This has better performance than ifelse, which can be important in large data sets.
i <- findInterval(V(g1)$degree, c(0, 2, Inf))
g1 <- set_vertex_attr(g1, "degree", value = c(1, 5)[i])
V(g1)$degree
#[1] 5 5 5 1
With different new sizes set with the findInterval index, c(10, 50)[i], the graph would look like below.
g1 <- set_vertex_attr(g1, "degree", value = c(10, 50)[i])
plot(g1, vertex.size = V(g1)$degree)
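Both solutions overwrite the degree attribute with plotting sizes. If the real degree values are still needed later, one option is to store the sizes in a separate vertex attribute instead. The sketch below assumes the same toy graph and the sizes 10 and 50; it relies on plot() using the vertex attribute size when no vertex.size argument is given.

```r
library(igraph)

# Same toy graph as in the question.
g1 <- make_graph(edges = c(1, 2, 2, 3, 3, 1, 4, 2), n = 4, directed = FALSE)

# Keep the true degree, and put the plotting size in its own attribute.
V(g1)$degree <- degree(g1)
V(g1)$size <- ifelse(V(g1)$degree < 2, 10, 50)

plot(g1)
```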

Plot conditional colors based on rows

I have two data.frames called outlier and data.
outlier just holds the row numbers that need to be colored.
data has 1000 rows.
It has two columns called x and y.
If a row number exists in outlier, I want its dot in the plot to be red, otherwise black.
plot(data$x, data$y, col=ifelse(??,"red","black"))
Something should be in ?? .
Hi this way works for me using ifelse, let me know what you think:
outlier <- sample(1:100, 50)
data <- data.frame(x = 1:100, y = rnorm(n = 100))
plot(
data[ ,1], data[ ,2]
,col = ifelse(row.names(data) %in% outlier, "red", "blue")
,type = "h"
)
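To fill in the ?? from the question with exactly the red/black scheme asked for, a test on row position is one possibility. The sketch assumes outlier is a plain vector of row numbers (if it is a data frame, the relevant column would be used instead) and uses small made-up data; the names is_outlier and point_col are illustrative.

```r
# Small made-up data; outlier is assumed to be a vector of row numbers.
outlier <- c(1, 2, 7, 9)
data <- data.frame(x = 1:10, y = 1:10)

# TRUE for rows whose position is listed in outlier.
is_outlier <- seq_len(nrow(data)) %in% outlier
point_col <- ifelse(is_outlier, "red", "black")

plot(data$x, data$y, col = point_col, pch = 19)
```

Using seq_len(nrow(data)) rather than row.names(data) keeps the comparison numeric and does not depend on the row names still matching the row positions.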
I think this can be accomplished by creating a new color column in your data frame:
data$color <- "black"
Then set the outliers to a different value:
data[outlier,"color"] <- "red"
I don't have your exact data, but I think I got something similar to what you wanted using the following:
outlier <- c(1, 2, 7, 9)
data <- data.frame(x=c(1,2,3,4,5,6,7,8,9,10),
y=c(1,2,3,4,5,6,7,8,9,10))
data$color <- "black"
data[outlier,"color"] <- "red"
data
x y color
1 1 1 red
2 2 2 red
3 3 3 black
4 4 4 black
5 5 5 black
6 6 6 black
7 7 7 red
8 8 8 black
9 9 9 red
10 10 10 black
Finally plot using the new value in data:
plot(data$x, data$y, col=data$color)
Results in:
