Labels on only root and terminal vertices in igraph (R)? - r

inst2 = c(2, 3, 4, 5, 6)
motherinst2 = c(7, 8, 2, 10, 11)
km = c(20, 30, 40, 25, 60)
df2 = data.frame(inst2, motherinst2)
df2 = cbind(df2, km)
g2 = graph_from_data_frame(df2)
tkplot(g2)
how would I approach adding labels to exclusively my root and terminal vertices in a graph? I know it would involve this function, but how would you set it up? Assuming the graph object is just called 'g', or something obvious.
vertex.label =

The solution from #eipi1o is good, but the OP says "I'm finding it difficult to apply to my large data set effectively." I suspect that the issue is finding which are the intermediate nodes whose name should be blanked out. I will continue the example of #eipi10. Since my answer is based on his, if you upvote my answer, please upvote his as well.
You can use the neighbors function to determine which points are sources and sinks. Everything else is an intermediate node.
## original graph from eipi10
g = graph_from_edgelist(cbind(c(rep(1,10),2:11), c(2:21)))
## Identify which nodes are intermediate
SOURCES = which(sapply(V(g), function(x) length(neighbors(g, x, mode="in"))) == 0)
SINKS = which(sapply(V(g), function(x) length(neighbors(g, x, mode="out"))) == 0)
INTERMED = setdiff(V(g), c(SINKS, SOURCES))
## Fix up the node names and plot
V(g)$name = V(g)
V(g)$name[INTERMED] = ""
plot(g)

Using your example graph, we'll identify the root and terminal vertices and remove the labels for other vertices. Here's what the initial graph looks like:
set.seed(2)
plot(g2)
Now let's identify and remove the names of the intermediate vertices
# Get all edges
e = get.edgelist(g2)
# Root vertices are in first column but not in second column
root = setdiff(e[,1],e[,2])
# Terminal vertices are in second column but not in first column
terminal = setdiff(e[,2], e[,1])
# Vertices to remove are not in root or terminal vertices
remove = setdiff(unique(c(e)), c(root, terminal))
# Remove names of intermediate vertices
V(g2)$name[V(g2)$name %in% remove] = ""
set.seed(2)
plot(g2)
Original Answer
You can use set.vertex.attribute to change the label names. Here's an example:
library(igraph)
# Create a graph to work with
g = graph_from_edgelist(cbind(c(rep(1,10),2:11), c(2:21)))
plot(g)
Now we can remove the labels from the intermediate vertices:
g = set.vertex.attribute(g, "name", value=c(1,rep("", length(2:11)),12:21))
plot(g)

Related

How to remove vertices by condition in igraph object, color edges by group, and size arrows based on attribute?

I am trying to do something similar to this and this post. I have an igraph object and want to remove vertices(arrows) based on an values in a column of the edges dataframe, color the edges(circles) by a group, and change the line/arrow size based on the same column in the edges dataframe. Here is some reproducible code that looks exactly like my data:
# Data
edges <- data.frame(
"agency.from" = c(rep("a",4),rep("b",4),rep("c",4),rep("d",4)),
"agency.to" = c(rep(c("a","b","c","d"),4)),
"comm.freq" = sample(0:5,16, replace=TRUE))
nodes <- data.frame(
"agency" = c("a","b","c","d"),
"group" = c("x", "y", "x", "y"),
"state" = c("i", "j", "j", "i"))
# make igraph object
net <- graph_from_data_frame(d=edges, vertices=nodes, directed=T)
plot(net)
# remove loops
net2 <- simplify(net, remove.multiple = T, remove.loops = T)
plot(net2)
Which gives me:
this
# remove vertices where communication frequency is 1 and 0
net3 <- delete.vertices(net2, which(E(net2)$comm.freq == 1))
net4 <- delete.vertices(net3, which(E(net2)$comm.freq == 0))
plot(net4)
Which does not change the plot at all
Then I try to change the colors and sizes:
# color edges by group
colrs <- c("gray50", "tomato")
V(net4)$color <- colrs[V(net4)$group]
plot(net4)
# make size of arrow based on communication frequency
plot(net4, edge.width = E(net4)$comm.freq * 5, edge.arrow.size = E(net4)$comm.freq)
And still nothing changes
I followed the code provided in the other posts and I'm just really confused why nothing will work.
Any help is much appreciated!
The simplify() function removed your edge attributes. You need to specify how you want those values to be preserved when simplifying your graph. If you just want to keep the first possible value, you can do
net2 <- simplify(net, remove.multiple = T, remove.loops = T, edge.attr.comb=list("first"))
And then you use delete.vertices but you are passing indexes for edges, not vertices. If you want to drop both vertices that are adjacent to an edge with that given property, it should look more like
net3 <- delete_vertices(net2, V(net2)[.inc(E(net2)[comm.freq==1])])
net4 <- delete_vertices(net3, V(net3)[.inc(E(net3)[comm.freq==0])])
And then for the colors you have values like "x" and "y" for group, but you are indexing into the colrs vector which has no idea what "x" and "y" correspond to. It would be better to use a named vector. For example
colrs <- c(x="gray50", y="tomato")
V(net4)$color <- colrs[V(net4)$group]

delete igraph vertices but retain all edges in R

Is there a way to delete (or selectively display) vertices but retain edges in an igraph plot? For example, in the code below, we delete vertices but that deletes edges between them. My goal is to highlight a specific node but keep all edges.
g <- make_ring(10) %>%
set_vertex_attr("name", value = LETTERS[1:10])
g
V(g)
g2 <- delete_vertices(g, c(1,5)) %>%
delete_vertices("B")
g2
V(g2)
If you delete the vertices, the edges no longer make any sense. However, if all you want is to not display the vertices, you can just use vertex.size=0.
plot(g, vertex.size=0)
If you do not want to even see the node names, add vertex.label=NA
You can show just one node by making a vector of vertex sizes and labels
VS = rep(0, vcount(g))
VS[2] = 14
VL = rep(NA, vcount(g))
VL[2] = V(g)$name[2]
VFC = rep(NA, vcount(g))
VFC[2] = "black"
VC = rep(NA, vcount(g))
VC[2] = 1
plot(g, vertex.size=VS, vertex.label=VL, vertex.color=VC,
vertex.frame.color=VFC)

Shortest Paths based on edge attribute with igraph

I'm trying to get the shortest paths of a graph but based on its edge ids.
So having the following graph:
library(igraph)
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
The shortest_paths(g, 1, V(g)) function finds all the shortest paths from node 1 to all the other nodes. However, I would like to calculate this, not just by following the geodesic distance, but a mix between the geodesic distance, and the minimum of edge id changes.
For example if this would be a train network, and the edge ids would represent trains. I would like to calculate how to get from node A to all the other nodes using the shortest path, but while changing the least amount of time of trains.
OK I think I have a working solution, although the code is a little ugly. The basic algorithm (lets call it gs(i, j)) goes like this: If we want to find the shortest train journey from i to j (gs(i, j)) we:
find the shortest path from i to j considering all trains. if this path is length 0 or 1 return it (there is either no path or a path on 1 train)
split the graph up by 'trains' (subset graph by edges) so as to consider each train network separately, and find the shortest path between i and j in each individual train network
if a single train will get you from i to j, return the train route with the fewest stops between i and j, else
if no single train runs from i to j then call gs(i, j-1) where (j-1) is the stop before j in the shortest path between i and j on the full network.
So basically, we look to see if a single train can do it, and if it can't we call the function recursively looking if a single train can get you to the stop before the last stop, etc. etc.
library(igraph)
# First your data
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
plot(g, edge.color = E(g)$id)
# The function takes as arguments the graph, and the id of the vertex
# you want to go from/to. It should work for a vector of
# destinations but I have not rigorously tested it so proceed with
# caution!
get.shortest.routes <- function(g, from, to){
train.routes <- lapply(unique(E(g)$id), function(id){subgraph.edges(g, eids = which(E(g)$id==id), delete.vertices = F)})
target.sp <- shortest_paths(g, from = from, to = to, output = 'vpath')$vpath
single.train.paths <- lapply(train.routes, function(gs){shortest_paths(gs, from = from, to = to, output = 'vpath')$vpath})
for (i in length(target.sp)){
if (length(target.sp[[i]]>1)) {
cands <- lapply(single.train.paths, function(l){l[[i]]})
if (sum(unlist(lapply(cands, length)))!=0) {
cands <- cands[lapply(cands, length)!=0]
cands <- cands[lapply(cands, length)==min(unlist(lapply(cands, length)))]
target.sp[[i]] <- cands[[1]]
} else {
target.sp[[i]] <- c(get.shortest.routes(g, from = as.numeric(target.sp[[i]][1]),
to = as.numeric(target.sp[[i]][(length(target.sp[[i]]) - 1)]))[[1]],
get.shortest.routes(g, from = as.numeric(target.sp[[i]][(length(target.sp[[i]]) - 1)]),
to = as.numeric(target.sp[[i]][length(target.sp[[i]])]))[[1]][-1])
}
}
}
target.sp
}
OK now lets run some tests. If you squint at the graph above you can see that the path from vertex 5 to vertex 21 is length-2 if you take two trains, but that you can get there on 1 train if you pass through an extra station. Our new function should return the longer path:
shortest_paths(g, 5, 21)$vpath
#> [[1]]
#> + 3/25 vertices, from b014eb9:
#> [1] 5 13 21
get.shortest.routes(g, 5, 21)
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> [[1]]
#> + 4/25 vertices, from c22246c:
#> [1] 5 13 15 21
Lets make a really easy graph where we are sure what we want to see: here we should get 1-2-4-5 instead of 1-3-5:
df <- data.frame(from = c(1, 1, 2, 3, 4), to = c(2, 3, 4, 5, 5))
g1 <- graph_from_data_frame(df)
E(g1)$id <- c(1, 2, 1, 3, 1)
plot(g1, edge.color = E(g1)$id)
get.shortest.routes(g1, 1, 5)
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> [[1]]
#> + 4/5 vertices, named, from c406649:
#> [1] 1 2 4 5
I'm sure there is a more rigorous solution, and you'll probably want to optimize the code a bit. For instance, I just realized that I don't stop the function immediately if the shortest path on the full graph has only two nodes -- doing so would avoid some needless computations! This was a fun problem, I hope some other answers gets posted.
Created on 2018-05-11 by the reprex package (v0.2.0).
Here is my take on the problem. A few notes:
1) all_simple_paths will not scale well with large or highly connected graphs
2) I favored fewest changes above all else, which means a path with two changes and a dist of 40 will beat a path with three changes and a dist of 3.
4) I can imagine an even faster approach if # of changes and distance change priority if there is no path on one id
library(igraph)
# First your data
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
plot(g, edge.color = E(g)$id)
##Option 1:
rst <- all_simple_paths(g, from = 1, to = 18, mode = "out")
rst <- lapply(rst, as_ids)
rst1 <- lapply(rst, function(x) c(x[1], rep(x[2:(length(x)-1)],
each=2), x[length(x)]))
rst2 <- lapply(rst1, function(x) data.frame(eid = get.edge.ids(graph=g, vp = x),
train=E(g)$id[get.edge.ids(graph=g, vp = x)]))
rst3 <- data.frame(pathID=seq_along(rst),
changes=sapply(rst2, function(x) length(rle(x$train)$lengths)),
dist=sapply(rst2, nrow))
spath <- rst3[order(rst3$changes, rst3$dist), ][1,1]
#Vertex IDs
rst[[spath]]
#[1] 1 23 8 18
plot(g, edge.color = E(g)$id, vertex.color=ifelse(V(g) %in% rst[[spath]], "firebrick", "gray80"),
edge.arrow.size=0.5)

How to calculate the edge attributes as the path length in igraph?

Pretend the dataframe below is an edgelist (relation between inst2 and motherinst2), and that km is an attribute I want to calculate as a path that's been assigned to the edges. I'm too new at coding to make a reproducible edge list.
inst2 = c(2, 3, 4, 5, 6)
motherinst2 = c(7, 8, 9, 10, 11)
km = c(20, 30, 40, 25, 60)
df2 = data.frame(inst2, motherinst2)
edgelist = cbind(df2, km)
g = graph_from_data_frame(edgelist)
I know how to calculate the path length of vertices in a graph, but I have some attributes attached to the edges that I want to sum up as path lengths. They are simple attributes (distance in km, time in days, and speed as km/day).
This is how I was calculating the path of vertices (between roots and terminals/leaves):
roots = which(sapply(sapply(V(g),
function(x) neighbors(g, x, mode = 'in')), length) == 0)
#slight tweaking this piece of code will also calculate 'terminal' nodes (or leaves). (11):
terminals = which(sapply(sapply(V(g),
function(x) neighbors(g, x, mode = 'out')), length) == 0)
paths= lapply(roots, function(x) get.all.shortest.paths(g, from = x, to = terminals, mode = "out")$res)
named_paths= lapply(unlist(paths, recursive=FALSE), function(x) V(g)[x])
I just want to do essentially exactly as I did above, but summing up the distance, time, and rate (which I will compute the mean of) incurred between each of those paths. If it helps to know how the edges have been added as attributes, I've used cbind like so:
edgelist_df = cbind(edgelist_df, time, dist, speed)
and my graph object (g) is set up like this:
g <- graph_from_data_frame(edgelist_df, vertices = vattrib_df)
vattrib_df is the attributes of the vertices, which is not of interest to us here.

Define Edge labels based on matching vertex type in R

I have a graph net with two different types (1 and 2) of vertices, appearing n1 and n2 times, respectively:
net %v% "type" <- c(rep("1", n1), rep("2", n2))
We have some edges which were generated randomly with probabilities ps and pd, where ps is the edge probability with a same type (1-1 or 2-2) and pd with a different type (1-2).
I would like to plot this graph such that the edges between same types (i.e. 1-1 or 2-2) have a different color than edges between different types (1-2).
How do I do this?
I tried playing around with the %e% operator of the network package, but I'm confused about how to grab the type of the end node of each edge.
Thank you!
Do you want that?
from <- sample(1:2, 10, replace = T)
to <- sample(1:2, 10, replace = T)
node <- cbind(from, to)
library(igraph)
net <- graph_from_edgelist(node, directed = F)
edge_color <- function(from_to){
from_node <- from_to[1]
to_node <- from_to[2]
ifelse(from_node == to_node, return("red"), return("blue"))
}
color<- apply(node, 1, edge_color)
plot(net, edge.color=color)

Resources