Graph visualization in R based on a symmetric matrix with values on the diagonal

I have a symmetric matrix which I modified a bit:
The matrix above is symmetric, except that I have also added values on the diagonal (I explain why below).
The matrix records how many times each person (A, B, C, D, E) worked with another person on a publication. For example, B and C worked together 3 times, and A and E worked together 4 times. The diagonal values record how many publications a person worked on in total (either alone or with someone else). For example, B worked on 4 publications and C worked on 3.
Now I want to draw a network graph in R that shows the relations between the people through edge thickness and node size. The graph should look like this:
In the graph, the size of a node's circle depends on the number of publications that person worked on: circle B is the largest because its diagonal value is the maximum, and A and E are the smallest because they have the lowest diagonal values. The edge thickness between two nodes depends on how many times they worked together: the edge between A and E is the thickest because they worked together 4 times, and the edge between B and C is thinner because they worked together 3 times.
I can represent the relation between two people through edge thickness, but including the diagonal values is causing problems for me. Is it possible to do this in R? Any leads would be highly appreciated.

You can do this with the igraph package. Because the diagonal means something different from the other entries in the matrix, I have separated the matrix into two pieces, the diagonal and the rest.
Your data
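# Co-authorship counts; the diagonal holds each person's publication total
SM = as.matrix(read.table(text="A B C D E
1 2 1 1 4
2 4 3 2 1
1 3 3 1 2
1 2 1 2 1
4 1 2 1 1",
    header=TRUE))
rownames(SM) = colnames(SM)

library(igraph)
# Off-diagonal part: edge weights (zero the diagonal so no self-loops are drawn)
AM = SM
diag(AM) = 0
# Diagonal part: per-person publication counts, used for the node sizes
D = diag(SM)
g = graph_from_adjacency_matrix(AM,
    mode = "undirected",
    weighted = TRUE)
plot(g,
    edge.width = E(g)$weight,
    vertex.size = 10 + 3*D)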
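The constants in 10+3*D suit small counts like these. For larger counts, one option is to rescale the values to a fixed range before plotting. The following is a minimal sketch in base R, not part of the original answer: rescale_to is a hypothetical helper (it is not an igraph function), and the output range 10 to 30 is an arbitrary choice.

```r
# Hypothetical helper: linearly map the values in x onto the interval [lo, hi].
rescale_to <- function(x, lo, hi) {
  rng <- range(x)
  # If all values are equal there is nothing to spread out; use the midpoint.
  if (rng[1] == rng[2]) return(rep((lo + hi) / 2, length(x)))
  lo + (x - rng[1]) / (rng[2] - rng[1]) * (hi - lo)
}

# Same matrix as in the question, built without read.table
SM <- matrix(c(1, 2, 1, 1, 4,
               2, 4, 3, 2, 1,
               1, 3, 3, 1, 2,
               1, 2, 1, 2, 1,
               4, 1, 2, 1, 1),
             nrow = 5, byrow = TRUE,
             dimnames = list(LETTERS[1:5], LETTERS[1:5]))

D <- diag(SM)                        # publication counts per person
node_size <- rescale_to(D, 10, 30)   # smallest count -> 10, largest -> 30
print(node_size)
```

With the igraph code above, the same helper could be passed straight into the plot call, for example vertex.size = rescale_to(D, 10, 30) and edge.width = rescale_to(E(g)$weight, 1, 6).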


How to randomly change ties in a bipartite network without creating ties within the same level

I have an undirected bipartite network where I want to randomly switch ties around while still maintaining the same degree distribution.
Working in igraph, I've used the rewire() and keeping_degseq() functions to randomly switch edges between nodes as shown in the code below. However, this results in edges being created between nodes that are in the same mode. For this sample network, nodes 1 through 10 are in one mode, and nodes 11-14 are in another mode.
library(igraph)
set.seed(1)
b <- sample_bipartite(10,4,type = "gnp", p = 0.5)
b2 <- rewire(b, keeping_degseq(niter = 100))
b2
The resulting edges are:
[1] 1--13 3--11 5--11 3-- 6 7--12 1--11 9--12 9--11 7--14 8--13 12--13 4-- 6 13--14 11--14 10--11 1--12 2--10 3--14 5--14 6--14 12--14
8--14 6-- 9 10--14
Edges such as 12--13, 3--6, and 13--14 connect nodes in the same mode. The original network has no such edges, and that is a property I want to preserve in this simulation.
Is there a different function that better serves my purpose? Is there a way to tell the rewire() function not to create edges between nodes in the same mode?
Instead of doing something with edges, we may permute the vertices:
set.seed(1)
b1 <- sample_bipartite(10,4,type = "gnp", p = 0.5)
table(degree(b1))
# 1 2 3 4 5 6
# 4 3 2 2 2 1
(b2 <- permute(b1, c(sample(1:10), sample(11:14))))
# IGRAPH cf35948 U--B 14 20 -- Bipartite Gnp random graph
# + attr: name (g/c), p (g/n), type (v/l)
# + edges from cf35948:
# [1] 5--14 4--14 9--14 3--14 5--11 8--11 4--11 7--11 10--11 5--13 8--13 6--13
# [13] 7--13 3--13 10--13 5--12 4--12 2--12 1--12 10--12
table(degree(b2))
# 1 2 3 4 5 6
# 4 3 2 2 2 1
This approach is also easily generalizable to, say, multiple communities. It all depends on the blocks of shuffled vertex indices in the second argument of permute. Instead of thinking about shuffling, an easy way to see why this works is to think that we only switch around the names of the vertices.
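The relabeling idea can also be seen without igraph. The following is a minimal base R sketch under the assumption that the network is stored as a biadjacency matrix (rows for one mode, columns for the other); the matrix B is made-up illustration data, not the graph from the question.

```r
set.seed(1)

# Hypothetical 10 x 4 biadjacency matrix: B[i, j] == 1 means that node i of
# the first mode is tied to node j of the second mode.
B <- matrix(rbinom(10 * 4, size = 1, prob = 0.5), nrow = 10, ncol = 4)

# Relabel within each mode: rows are shuffled among rows, columns among columns.
# A biadjacency matrix only stores cross-mode ties, so no same-mode tie can appear.
B2 <- B[sample(nrow(B)), sample(ncol(B)), drop = FALSE]

# The degree multiset of each mode is unchanged by the relabeling.
same_row_degrees <- identical(sort(rowSums(B)), sort(rowSums(B2)))
same_col_degrees <- identical(sort(colSums(B)), sort(colSums(B2)))
print(c(same_row_degrees, same_col_degrees))
```

As with permute, the result has the same structure as the original network; only the assignment of labels to nodes within each mode changes.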

Discrete Math - Vertex Coloring

I was given a homework assignment about a week ago. The thing is, I don't understand what my teacher taught, but he gave us homework anyway...
A = {a,b,s}, B = {b,h,t}, C = {a,t,s}, D = {h,t,s}, E = {a,b}, F = {b,t,s}
How do I create a minimal vertex coloring of the graph whose vertices are A, B, C, D, E and F?
I do know how to color vertices, but I don't know how to create the graph from the given sets. Any help? I tried looking on the internet, but I haven't come across a question like this.
If the graph is to be interpreted in such a way that two of the vertices A, B, C, D, E, F are connected if and only if the corresponding sets intersect, an optimal coloring has 5 colors.
The resulting graph is almost the complete graph on 6 vertices: {D,E} is the only missing edge, because D = {h,t,s} and E = {a,b} are the only pair of sets with no element in common (E and F, for instance, share b). The graph also contains the complete graph on 5 vertices as the subgraph induced by {A,B,C,D,F}. Consequently, no vertex coloring can use fewer than 5 colors. The coloring
F : 1
A : 2
B : 3
C : 4
D : 5
E : 5
is a 5-coloring of the graph and therefore optimal. D and E can share a color because they are not adjacent.
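One way to check an intersection graph like this mechanically is to build its adjacency matrix from the sets and test a candidate coloring against it. The following is a minimal base R sketch; is_proper is a hypothetical helper name, and the candidate coloring is one in which D and E share a color.

```r
sets <- list(A = c("a", "b", "s"), B = c("b", "h", "t"), C = c("a", "t", "s"),
             D = c("h", "t", "s"), E = c("a", "b"),      F = c("b", "t", "s"))

# Two distinct vertices are adjacent if and only if their sets intersect.
n <- length(sets)
adj <- matrix(FALSE, n, n, dimnames = list(names(sets), names(sets)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) adj[i, j] <- length(intersect(sets[[i]], sets[[j]])) > 0
  }
}

# List the non-adjacent pairs (each pair once, from the upper triangle).
idx <- which(!adj & upper.tri(adj), arr.ind = TRUE)
missing_pairs <- paste(rownames(adj)[idx[, "row"]],
                       colnames(adj)[idx[, "col"]], sep = "-")

# Hypothetical helper: a coloring is proper if no edge joins two equal colors.
is_proper <- function(adj, colors) {
  same_color <- outer(colors, colors, "==")
  !any(adj & same_color)
}

# Candidate coloring, given in the same vertex order as the matrix (A to F).
coloring <- c(A = 2, B = 3, C = 4, D = 5, E = 5, F = 1)

print(missing_pairs)
print(is_proper(adj, coloring))
```

The lower bound of 5 colors still comes from the argument about the complete subgraph; the code only confirms which edges exist and that the candidate coloring is proper.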

Finding 2 vectors knowing their cross product and one constraint

I have to find two vectors in 3D space, a and b, such that a x b = c, where c is known. I also know one constraint, let's say a_y = 0.
So I have to look for these two vectors in the plane (c_x)x+(c_y)y+(c_z)z=0; for vector a, I can simplify this to (c_x)x+(c_z)z=0.
Vector b is perpendicular to a as well, so it has to lie in the intersection of the plane (a_x)x+(a_z)z=0 and the plane (c_x)x+(c_y)y+(c_z)z=0. After adding the cross product equation to that, I have 4 equations and 5 unknowns (a_x,a_z,b_x,b_y,b_z). How can I solve this?
Thanks in advance.
EDIT: Maybe explaining what I need these for will help.
I have a camera direction vector, and I need one vector that points to the right of the screen and a second one that points up.
problem definition
A,B=?
C=!
Ay=!
A x B = C
implied properties of the dot and cross products due to perpendicularity:
(A.C) = 0
(B.C) = 0
|A|.|B| = |C|
set the length of one of the vectors to any known constant, such as 1
|A|=1
|B|=|C|
This is also mentioned by John Moeller in his comment
Compute A
The length is |A|=1 and the dot product is (A.C)=0 because A and C are perpendicular, so:
Ax^2 + Ay^2 + Az^2 = 1
Ax.Cx + Ay.Cy + Az.Cz = 0
This is a system of 2 equations in 2 unknowns (Ax and Az, since Ay is given), so solve it. It leads to 2 solutions; choose one that is nonzero.
Compute B
We know that B is perpendicular to C, so (B.C)=0. Put the constraints together:
A x B = C
Bx.Cx + By.Cy + Bz.Cz = 0
Bx^2 + By^2 + Bz^2 = Cx^2 + Cy^2 + Cz^2
If you expand the cross product you will get 5 equations and 3 unknowns. So solve the system (choose any 3 of the nontrivial equations).
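For the special case from the question (Ay = 0), the two systems above have a closed-form solution. The following is a minimal base R sketch under the same assumptions as the steps above (|A| = 1, and A, B, C mutually perpendicular); cross and solve_ab are hypothetical helper names. It uses the identity A x (C x A) = C(A.A) - A(A.C), which reduces to C when |A| = 1 and (A.C) = 0, so B = C x A.

```r
# Cross product of two 3D vectors.
cross <- function(u, v) {
  c(u[2] * v[3] - u[3] * v[2],
    u[3] * v[1] - u[1] * v[3],
    u[1] * v[2] - u[2] * v[1])
}

# Hypothetical helper: find A and B with A x B = C, Ay = 0, |A| = 1.
solve_ab <- function(C) {
  h <- sqrt(C[1]^2 + C[3]^2)
  # Degenerate case (C parallel to the y axis): every unit vector in the
  # xz-plane would work as A, so no single solution is picked here.
  stopifnot(h > 0)
  A <- c(C[3], 0, -C[1]) / h   # unit length, Ay = 0, perpendicular to C
  B <- cross(C, A)             # then A x B = C by the identity above
  list(A = A, B = B)
}

C <- c(1, 2, 3)
res <- solve_ab(C)
print(cross(res$A, res$B))     # should reproduce C up to rounding
```

The other solution of the quadratic system is obtained by negating both A and B, which leaves A x B unchanged.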
PS: It seems this is meant to generate something analogous to your NEH matrix.
If that is the case, all 3 vectors are perpendicular to each other, one of them points in a specific direction (Up or North ...), and all vectors usually have length 1.
So let us assume the D vector is the known aligning vector:
A'= C x D
B = C x A'
A = C x B
You can change the order of the operands to obtain the directions you need. If D is not known, you can use (1,0,0), (0,1,0) or (0,0,1) instead; choose one that is not parallel to C, ideally the one with the smallest |(C.D)|. Also take a look at:
How to find view point coordinates?
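For the camera use case from the question, the cross-product construction from the PS can be sketched as follows. This is a minimal base R sketch, not the answer's exact code: camera_basis, normalize and the default up_hint of (0,1,0) are assumptions made for illustration, and the operand order (right = forward x up_hint, up = right x forward) is one convention among several.

```r
# Cross product of two 3D vectors.
cross <- function(u, v) {
  c(u[2] * v[3] - u[3] * v[2],
    u[3] * v[1] - u[1] * v[3],
    u[1] * v[2] - u[2] * v[1])
}

# Scale a vector to unit length.
normalize <- function(v) v / sqrt(sum(v^2))

# Hypothetical helper: build unit right/up vectors for a camera direction.
# up_hint plays the role of the aligning vector D in the text.
camera_basis <- function(forward, up_hint = c(0, 1, 0)) {
  f <- normalize(forward)
  r <- cross(f, up_hint)
  # up_hint must not be parallel to the camera direction
  stopifnot(sum(r^2) > 1e-12)
  r <- normalize(r)
  u <- cross(r, f)             # already unit length, since r and f are
  list(forward = f, right = r, up = u)
}

basis <- camera_basis(c(0, 0, -1))
print(basis$right)
print(basis$up)
```

Swapping the operands of either cross product flips the corresponding direction, which is how the handedness can be adapted to a particular screen convention.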
[Notes]
dot product: (A.B)=Ax.Bx+Ay.By+Az.Bz
cross product: A x B
length: |A| = sqrt (Ax^2 + Ay^2 + Az^2)

assign colour to edge when reciprocated

I have a standard edgelist:
[1] 1 -> 2
[2] 1 -> 3
[3] 1 -> 6
[4] 2 -> 1
[5] 2 -> 3
[6] 2 -> 4
[7] 2 -> 5
I would like to colour the edges according to three rules:
i) if the first value is less than the second -> blue
ii) if the first value is greater than the second ->red
This seems to be quite straightforward:
y.ed<-as.edgelist(am.ed)
z<-as.data.frame(y.ed)
m<-z[,1]
n<-z[,2]
o<-ifelse(m<n, "blue", "red")  # blue when first < second (rule i), red otherwise
However I would like to iii) colour reciprocal ties i.e. 1->2, 2->1 (as above) a further colour e.g. orange.
I cannot work out how to do this.
After thinking about this a little, it occurred to me that this could be tricky and that someone had probably already done this, since networks are an area of pretty heavy application in R. In package igraph I found an approach that will work if you can get your edge list reformatted slightly. Here is an example using made up data:
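library("igraph")
set.seed(11)
# 30 random vertex ids; consecutive pairs form 15 directed edges
ed <- sample(1:10, 30, replace = TRUE)
g <- make_graph(edges = ed)  # graph() in older igraph versions
str(g)
plot(g)
# TRUE for each edge whose reverse edge is also present
m <- which_mutual(g)         # is.mutual() in older igraph versions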
The T/F vector in m can then be used to color your edges as desired.
Note that in this example an edge starting and ending on the same node/vertex and duplicated edges are counted as "mutual" by the function. If you have those as well as reciprocal edges you might have to do some additional fiddling. This example has one reciprocal edge (1 <-> 3).
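The three colouring rules can also be applied directly to an edge list without igraph. The following is a minimal base R sketch using the seven edges listed in the question; the column names from and to are assumptions, and self-loops are deliberately not treated as reciprocal here, which is a design choice rather than what the igraph function does.

```r
# Edge list from the question, as a data frame with assumed column names
el <- data.frame(from = c(1, 1, 1, 2, 2, 2, 2),
                 to   = c(2, 3, 6, 1, 3, 4, 5))

# An edge is reciprocal if its reversed pair also occurs in the list.
key     <- paste(el$from, el$to)   # "from to" label of each edge
rev_key <- paste(el$to, el$from)   # label of the same edge reversed
reciprocal <- rev_key %in% key & el$from != el$to

# Rule iii first (orange), then rule i (blue) and rule ii (red)
edge_col <- ifelse(reciprocal, "orange",
                   ifelse(el$from < el$to, "blue", "red"))
print(data.frame(el, reciprocal, edge_col))
```

The resulting vector can then be passed as the edge colour to whichever plotting function is used, as long as it is in the same order as the plotted edges.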

How to turn data from R data frames into a network

Suppose I have the following data frames
df <- data.frame(dev = c("A","A","B","B","C","C","C"),
proj = c("W","X","Y","X","W","X","Z"))
types <- data.frame(proj = c("W","X","Y","Z"),
type = c("blue","orange","orange","blue"))
> df
  dev proj
1   A    W
2   A    X
3   B    Y
4   B    X
5   C    W
6   C    X
7   C    Z
> types
  proj   type
1    W   blue
2    X orange
3    Y orange
4    Z   blue
I would like to turn these into the following network
The nodes are the unique entries in proj. For nodes u,v, there is an arc from u to v if u and v share an element from dev. The data is a list of developers and projects that each developer has worked on, and I would like to form a network which connects projects that have a developer in common. Each project is of a particular type, and that information would need to be encoded in the graph (I did this in this toy example via colour).
From this graph what I need is the degree of each node, as well as one or more measures of centrality. In particular I need the closeness centrality of each node, as well as a modified version of closeness centrality which measures the centrality within each type. So my end goal is to obtain a table like this:
proj  degree  closeness_centrality  type_centrality
W     2       0.75                  1
X     3       1                     1
Y     1       0.60                  1
Z     2       0.75                  1
For reference, the closeness centrality of a node u is defined as C(u)=(N-1)/(sum over all nodes v of the distance from u to v), where N is the number of nodes in the graph and the distance from u to v is the length of the shortest u-v-path. The type centrality is C(T,u)=|T-u|/(sum over all nodes v in T of the distance from u to v) where T is the set of all nodes of a given type, and |T-u| is the size of T with u excluded (so either |T| or |T|-1 depending on the type of u).
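The definitions above can be sketched on the toy data as follows. This is a minimal base R sketch, not a recommendation for the full data set: it uses a dense matrix and an all-pairs shortest-path loop (Floyd-Warshall), which only makes sense for small graphs, and it assumes that the type centrality of a node is taken with respect to the node's own type.

```r
df <- data.frame(dev  = c("A", "A", "B", "B", "C", "C", "C"),
                 proj = c("W", "X", "Y", "X", "W", "X", "Z"))
types <- data.frame(proj = c("W", "X", "Y", "Z"),
                    type = c("blue", "orange", "orange", "blue"))

# Developer-by-project incidence matrix; two projects are adjacent when
# at least one developer worked on both.
inc <- unclass(table(df$dev, df$proj))
adj <- crossprod(inc) > 0
diag(adj) <- FALSE
n <- nrow(adj)

# All-pairs shortest path lengths (Floyd-Warshall on unit edge lengths)
d <- ifelse(adj, 1, Inf)
diag(d) <- 0
for (k in seq_len(n)) d <- pmin(d, outer(d[, k], d[k, ], "+"))

degree    <- rowSums(adj)
closeness <- (n - 1) / rowSums(d)

# Assumption: T is the set of nodes that have the same type as u.
type_of <- types$type[match(rownames(adj), types$proj)]
type_centrality <- sapply(seq_len(n), function(u) {
  others <- setdiff(which(type_of == type_of[u]), u)
  length(others) / sum(d[u, others])
})

result <- data.frame(proj = rownames(adj), degree, closeness,
                     type = type_of, type_centrality, row.names = NULL)
print(result)
```

A dense matrix for roughly 155,000 vertices would need about 2.4e10 entries, so for the real data a sparse representation or a dedicated graph package would be needed instead of this approach.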
One of the big challenges is that my actual df has almost 300,000 rows and this graph will have around 155,000 vertices. The average degree will be very low though so I think that it is doable.
My questions are:
Is R the best tool to be using for this? Are there good packages for performing these types of calculations on graphs?
What is the best way to store this kind of data? Should I form an adjacency matrix, or something else?
Any insight or tips at all would be well appreciated; as an economics major I'm kind of in over my head comp-sci-wise here.
Thanks!
