Convert igraph into visNetwork in R

I found a way to convert an igraph object into a visNetwork diagram (see "Interactive arules with arulesViz and visNetwork"). The diagram should show the same information before and after the conversion, but my visNetwork result differs from the igraph one.
I'll demonstrate the issue using the sample data data("Groceries") from library(arules).
#Pre-defined library
library(arules)
library(arulesViz)
library(visNetwork)
library(igraph)
#Get sample data & get association rules
data("Groceries")
rules <- apriori(Groceries, parameter=list(support=0.01, confidence=0.4))
rules <- head(sort(rules, by="lift"), 10)
#Convert rules to data.table
library(data.table)
rules_dt <- data.table( lhs = labels( lhs(rules) ),
rhs = labels( rhs(rules) ),
quality(rules) )[ order(-lift), ]
Print all rules in table format (sorted by lift)
Plot the top 10 association rules using igraph
ig <- plot(rules, method="graph", control=list(type="items"))
Note: Based on the association rules, I plotted a network diagram using igraph, and everything looks correct. Next, I'll convert the existing igraph object to visNetwork and compare the results.
tf <- tempfile( )
saveAsGraph(rules, file = tf, format = "dot" )
# clean up temp file if desired
#unlink(tf)
# Convert igraph to dataframe
ig_df <- as_data_frame(ig, what = "both")
# Plot visNetwork
visNetwork(
nodes = data.frame(
id = ig_df$vertices$name
,value = ig_df$vertices$lift # could change to lift or confidence
,title = ifelse(ig_df$vertices$label == "",ig_df$vertices$name,
ig_df$vertices$label)
,ig_df$vertices
),
edges = ig_df$edges
) %>%
visEdges(arrows ="to") %>%
visOptions( highlightNearest = TRUE )
Plot of the top 10 association rules using visNetwork
Note: In the visNetwork diagram, the size of an intercept node indicates "lift": the higher the lift, the larger the node. In the igraph diagram, by contrast, the size of an intercept node indicates "support", while its colour indicates "lift".
Let's compare the igraph and visNetwork diagrams.
According to the table of association rules, rule no. 10 (the rule with the smallest "lift") should have the smallest intercept node, but it does not.
Problems
I drilled down into ig_df <- get.data.frame( ig, what = "both" ) and found something odd in the ig_df$vertices table, which is generated by the as_data_frame function from library(igraph).
I found that assoc10 (the intercept node for association rule no. 10) has NA for all variables (i.e. "support", "confidence", "lift" and "count"). More precisely, the values in the columns "support", "confidence", "lift" and "count" in ig_df$vertices appear to be shifted up by one row. Kindly correct me if I'm wrong.
Conclusion
The key to converting an igraph object into visNetwork is to use as_data_frame to extract all the data from the igraph object into data frames, and then plot the visNetwork diagram from those data frames. Because of this extraction issue in as_data_frame, the visNetwork result differs from the igraph one.
Question: Is this a bug, or did I make a mistake in my code? Any suggestion is welcome. Thanks!

A year later... but I already typed everything so might as well.
If I understand it correctly, this is the source of your troubles -
value = ig_df$vertices$lift # could change to lift or confidence
The easiest solution is to assign your lift values to size. From the visNetwork documentation: "size: Number. Default to 25. The size is used to determine the size of node shapes that do not have the label inside of them. These shapes are: image, circularImage, diamond, dot, star, triangle, triangleDown, square and icon." I'm not sure how value works here, but I think you could use value if you also provide appropriate scaling.
https://www.rdocumentation.org/packages/visNetwork/versions/2.0.9/topics/visNodes
So the solution is:
size = ig_df$vertices$lift # could change to lift or confidence
Note that nodes with an NA lift get the default size of 25, and that with a lift of 2 or 3 used directly as the size you can barely see the nodes. You could raise the lift to a power, which will increase the size of the nodes and exaggerate the differences in lift.
ig_df <- as_data_frame(ig, what = "both")
ig_df$vertices["lift"] <- ig_df$vertices["lift"] ^ 3
I couldn't reproduce the shifted-rows problem in your table; mine looked fine, so hopefully it resolves itself!
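As an alternative to cubing the lift, the values could be rescaled linearly onto a size range that is easy to see. This is only a sketch: rescale_size is a hypothetical helper (not part of visNetwork), the lift values are made up, the size range 10 to 50 is an arbitrary choice, and it assumes at least one lift value is not NA.

```r
# Hypothetical helper (not part of visNetwork): map a numeric vector
# linearly onto [min_size, max_size]; NA inputs get na_size instead.
rescale_size <- function(x, min_size = 10, max_size = 50, na_size = 25) {
  rng <- range(x, na.rm = TRUE)
  if (diff(rng) == 0) {
    # All non-NA values are equal: use the midpoint of the size range
    out <- rep((min_size + max_size) / 2, length(x))
  } else {
    out <- min_size + (x - rng[1]) / diff(rng) * (max_size - min_size)
  }
  out[is.na(x)] <- na_size
  out
}

lift <- c(NA, 2.1, 2.4, 3.0, NA)   # made-up values; item nodes carry NA
sizes <- rescale_size(lift)
print(sizes)
```

In the nodes data frame above, this would presumably be used as size = rescale_size(ig_df$vertices$lift).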

Related

Defining Node shape in a Network plot with an additional attribute table in R

I am working on plotting a Network and it contains two different types of Nodes which I want to visualise with different shapes. For that I made an additional table in which I specified which structure is which type using a binary system. Now I want to specify in my plot function that the structures with 1 are to be triangles and the ones with 0 as circles.
My data for the Network is in the format of an adjacency matrix (I use igraph) and I am using ggnet2 for the plotting of it.
this is how I imported the data:
am <- as.matrix(read.csv2("mydata.csv", header = T, row.names = 1))
g <- graph_from_adjacency_matrix(am, mode = "undirected")
attr <- read.csv2("myattributes.csv", header = T, row.names = 1)
This is how I would plot it, but I don't know how to specify the shape argument:
ggnet2(g, size = "degree", node.color = "darkgreen", shape = ??????)
Thanks in advance for your help!
Note that the package-requirements for plotting igraphs with ggnet2 include ggplot2, sna and network as well as intergraph as a bridge.
ggnet2 is prettier, sure, but the igraph-way is this:
g <- erdos.renyi.game(100,100,'gnm')
V(g)$shape <- sample(c('csquare','circle'), 100, replace=T)
plot(g, vertex.label = NA)
Note that I added two igraph-style shapes as vertex attributes to g above. In ggnet2 you can provide a vector of shapes, but they can be any values (even a factor) or numbers (the usual gray circle is 19). Try this out to plot in ggnet2:
ggnet2(g, shape=19)
ggnet2(g, shape=10+round(1:100/10))
ggnet2(g, shape=factor(V(g)$shape))
V(g)$shape <- sample(c('One shape','Another shape'), 100, replace=T)
ggnet2(g, shape=V(g)$shape, size = "degree", node.color = "darkgreen")
Note that, if you add attributes to your vertices after separately loading attribute data (as you do above), it may be so that the very order of your data matters. Make sure your table import actually works as intended with the correct attribute being assigned to the correct vertex. I find it a good practice to tie all values as attributes on the igraph-object (edge- and vertex attributes alike) rather than letting the network data live in different dataframes or loose vectors to be combined in order to correctly visualise a network.
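To guard against the ordering problem just described, the attribute table can be looked up by vertex name rather than by row position. A minimal base-R sketch, with made-up vertex names and a made-up attribute table standing in for the real data (in the real case the names would presumably come from V(g)$name):

```r
# Made-up stand-ins for the vertex names and the attribute table
vertex_names <- c("B", "A", "C")               # order as stored in the graph
attr_tbl <- data.frame(type = c(1, 0, 1),
                       row.names = c("A", "B", "C"))

# Look up each vertex by name instead of relying on row order
type_by_vertex <- attr_tbl[match(vertex_names, rownames(attr_tbl)), "type"]

# 1 -> triangle, 0 -> circle (labels chosen for illustration only)
shape_by_vertex <- ifelse(type_by_vertex == 1, "triangle", "circle")
print(shape_by_vertex)
```

The resulting vector could then be stored as a vertex attribute and passed to ggnet2 as a factor, in the same way as factor(V(g)$shape) in the calls above.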

Plot a subset of igraph data, subset by attribute

I have the fblog data set, and PolParty is one attribute of my data. I want to keep just 2 political parties (say P1 and P2) and plot the network of their blogs.
I wrote the code below, but I think it is wrong. Can someone help me?
library(statnet)
library(igraph)
library(sand)
data(fblog)
fblog = upgrade_graph(fblog)
class(fblog)
summary(fblog)
V(fblog)$PolParty
table(V(fblog)$PolParty)
p1<-V(fblog)[PolParty=="PS"] # by their labels/names
p2<-V(fblog)[PolParty=="UDF"]
class(p1)
The objects p1 and p2 that you created are of class igraph.vs (not igraph). Such an object only lists vertices; it is not a full graph. Hence when you try to plot it you don't get anything.
Based on the following post: Subset igraph graph by label
g=subgraph.edges(graph=fblog, eids=which(V(fblog)$PolParty==" PS"), delete.vertices = TRUE)
plot(g)
The above works. NOTE: in the output of V(fblog)$PolParty, every value is preceded by a space, hence you need to use V(fblog)$PolParty==" PS".
UPDATE: to subset based on 2 conditions, modify the which() command:
g=subgraph.edges(graph=fblog, eids=which(V(fblog)$PolParty==" PS"| V(fblog)$PolParty==" UDF"), delete.vertices = TRUE)
plot(g)
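A variation that may be worth trying: trimming the labels avoids having to remember the leading space, and %in% handles several parties at once. Also, since which() on a vertex attribute returns vertex positions, a vertex-based subset such as igraph's induced_subgraph() may be a closer fit than subgraph.edges(), which expects edge ids. The sketch below uses made-up labels and a toy ring graph instead of fblog, and the igraph part only runs if the package is installed.

```r
# Made-up party labels with the leading space described above
party <- c(" PS", " UDF", " UMP", " PS", " Verts")

party_clean <- trimws(party)                       # drop surrounding whitespace
keep <- which(party_clean %in% c("PS", "UDF"))     # vertex positions to keep
print(keep)

# Optional part: only runs when igraph is installed
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::make_ring(5)                        # toy 5-vertex graph
  sub <- igraph::induced_subgraph(g, vids = keep)  # keeps vertices 1, 2 and 4
  print(igraph::vcount(sub))
}
```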

Node labels on circular phylogenetic tree

I am trying to create circular phylogenetic tree. I have this part of code:
fit<- hclust(dist(Data[,-4]), method = "complete", members = NULL)
nclus= 3
color=c('red','blue','green')
color_list=rep(color,nclus/length(color))
clus=cutree(fit,nclus)
plot(as.phylo(fit),type='fan',tip.color=color_list[clus],label.offset=0.2,no.margin=TRUE, cex=0.70, show.node.label = TRUE)
And this is the result:
I am also trying to show a label for each node and to color the branches. Any suggestions on how to do that?
Thanks!
When you say "color branches" I assume you mean color the edges. This seems to work, but I have to think there's a better way.
Using the built-in mtcars dataset here, since you did not provide your data.
plot.fan <- function(hc, nclus=3) {
palette <- c('red','blue','green','orange','black')[1:nclus]
clus <-cutree(hc,nclus)
X <- as.phylo(hc)
edge.clus <- sapply(1:nclus,function(i)max(which(X$edge[,2] %in% which(clus==i))))
order <- order(edge.clus)
edge.clus <- c(min(edge.clus),diff(sort(edge.clus)))
edge.clus <- rep(order,edge.clus)
plot(X,type='fan',
tip.color=palette[clus],edge.color=palette[edge.clus],
label.offset=0.2,no.margin=TRUE, cex=0.70)
}
fit <- hclust(dist(mtcars[,c("mpg","hp","wt","disp")]))
plot.fan(fit,3); plot.fan(fit,5)
Regarding "label the nodes", if you mean label the tips, it looks like you've already done that. If you want different labels, unfortunately, unlike plot.hclust(...) the labels=... argument is rejected. You could experiment with the tiplabels(....) function, but it does not seem to work very well with type="fan". The labels come from the row names of Data, so your best bet IMO is to change the row names prior to clustering.
If you actually mean label the nodes (the connection points between the edges), have a look at nodelabels(...). I don't provide a working example because I can't imagine what labels you would put there.

How to plot a large ctree() to avoid overlapping nodes

When I plotted the decision tree result from ctree() from party package, the font was too big and the box was also too big. They are overlapping other nodes.
Is there a way to customize the output from plot() so that the box and the font would be smaller ?
The short answer seems to be no, you cannot change the font size, but there are some other good options.
I know of three possible solutions. First, you can change other parameters in the plot to make it more compact. Second, you can write it to a graphic file and view that file. Third, you can use an alternative implementation of ctree() in the partykit package, which is a newer package by some of the same authors.
Default Plot Example
library(party)
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
controls = ctree_control(maxsurrogate = 3))
plot(airct) #default plot, some crowding with N hidden on leafs
Simplified plot
# simpler version of plot
plot(airct, type="simple", # no terminal plots
inner_panel=node_inner(airct,
abbreviate = TRUE, # short variable names
pval = FALSE, # no p-values
id = FALSE), # no id of node
terminal_panel=node_terminal(airct,
abbreviate = TRUE,
digits = 1, # few digits on numbers
fill = c("white"), # make box white not grey
id = FALSE)
)
This is somewhat better and one might be able to improve it further. To figure out these details, I originally did class(airct) which returned "BinaryTree". Armed with this info, I started reading ?plot.BinaryTree
Write to a file
A second simple solution is to write the plot to a file and then view the file. You may need to play with the settings to find the best fit.
png("airct.png", res=80, height=800, width=1600)
plot(airct)
dev.off()
Plot with partykit package instead
Finally, you can use a newer and not-yet-finished re-implementation of the party package by some of the same authors. At this point (Dec 2012), the only function they have re-done is ctree(). This version allows you to change font size.
library(partykit)
airct <- ctree(Ozone ~ ., data = airq)
class(airct) # different class from before
# "constparty" "party"
plot(airct, gp = gpar(fontsize = 6), # font size changed to 6
inner_panel=node_inner,
ip_args=list(
abbreviate = TRUE,
id = FALSE)
)
Here I have left the leafs in their default setting because I have frankly never figured out how to get it to work the way I want. I suspect this has to do with the fact that the package is incomplete (as of Dec 2012). You can read about the plot method starting with ?plot.party
Another option (that doesn't change what you want but does potentially solve the underlying problem) is to change the size of the figure itself, as I learned in my class for my assignment.
Replace the r in the below:
{r}
with:
{r, fig.width=X, fig.height=Y}
where the X and Y need to be replaced by numbers chosen by you depending on what size you think works better.
This website talks about doing this in more detail, including how to set it for the whole document.
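For the document-wide variant, one common approach is to set the chunk defaults once in a setup chunk via knitr. A minimal sketch, assuming the document is rendered with knitr; the 10 x 6 inch values are placeholders, not a recommendation:

```r
# Placed in a setup chunk near the top of the R Markdown file.
# Every later chunk then uses these figure dimensions unless it
# overrides them in its own chunk header.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(fig.width = 10, fig.height = 6)
}
```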

Trying to determine why my heatmap made using heatmap.2 and using breaks in R is not symmetrical

I am trying to cluster a protein dna interaction dataset, and draw a heatmap using heatmap.2 from the R package gplots. My matrix is symmetrical.
Here is a copy of the data set I am using after it is run through Pearson: DataSet
Here is the complete process that I am following to generate these graphs: generate a distance matrix using some correlation (in my case Pearson), then pass that matrix to R and run the following code on it:
library(RColorBrewer);
library(gplots);
library(MASS);
args <- commandArgs(TRUE);
matrix_a <- read.table(args[1], sep='\t', header=T, row.names=1);
mtscaled <- as.matrix(scale(matrix_a))
# location <- args[2];
# setwd(args[2]);
pdf("result.pdf", pointsize = 15, width = 18, height = 18)
mycol <- c("blue","white","red")
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
#colors <- colorpanel(75,"midnightblue","mediumseagreen","yellow")
result <- heatmap.2(mtscaled, Rowv=T, scale='none', dendrogram="row", symm = T, col=bluered(16), breaks=my.breaks)
dev.off()
The issue I am having is that once I use breaks to control the color separation, the heatmap no longer looks symmetrical.
Here is the heatmap before I use breaks, as you can see the heatmap looks symmetrical:
Here is the heatmap when breaks are used:
I have played with the cutoffs for the sequences to make sure, for instance, that one sequence does not end exactly where the next begins, but I am not able to solve this problem. I would like to use the breaks to help bring out the clusters more.
Here is an example of what it should look like, this image was made using cluster maker:
I don't expect it to look identical to that, but I would like it if my heatmap is more symmetrical and I had better definition in terms of the clusters. The image was created using the same data.
After some investigating I noticed that the values were changing after running my matrix through heatmap or heatmap.2. For example, the interaction between Pacdh-2 and pegg-2 from the provided data set had a value of 0.0250313 before the matrix was sent to heatmap.
After that I looked at the matrix values using result$carpet, and the values were then -0.224333135 and -1.09805379 for the two interactions.
So I decided to reorder the original matrix based on the dendrogram from the clustered matrix, so that I was sure the values would be the same. I used the following Stack Overflow question for help:
Order of rows in heatmap?
Here is the code used for that:
rowInd <- rev(order.dendrogram(result$rowDendrogram))
colInd <- rowInd
data_ordered <- matrix_a[rowInd, colInd]
I then used another program "matrix2png" to draw the heatmap:
I still have to play around with the colors but at least now the heatmap is symmetrical and clustered.
Looking into it further, the issue seems to be that I was running scale(matrix_a). When I change my code to just mtscaled <- as.matrix(matrix_a), the result looks symmetrical.
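The effect of scale() on symmetry can be checked on a small example. scale() centers and scales each column separately, so entry [i, j] and entry [j, i] are transformed with different column means and standard deviations. The 3 x 3 matrix below is made up for illustration and is not taken from the data set in the question.

```r
# A small symmetric matrix (made-up values)
m <- matrix(c(1.0, 0.2, 0.5,
              0.2, 1.0, 0.1,
              0.5, 0.1, 1.0), nrow = 3, byrow = TRUE)
print(isSymmetric(m))

# scale() works column by column, so mirrored entries no longer match
scaled <- scale(m)
print(c(scaled[1, 2], scaled[2, 1]))
scaled_symmetric <- isTRUE(all.equal(scaled[1, 2], scaled[2, 1]))
print(scaled_symmetric)
```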
I'm certainly not the person to attempt reproducing and testing this from that strange data object without code that would read it properly, but here's an idea:
..., col=bluered(20)[4:20], ...
Here's another thought, which should return the full range of red, which the above strategy would not:
shift.BR<- colorRamp(c("blue","white", "red"), bias=0.5 )((1:16)/16)
heatmap.2( ...., col=rgb(shift.BR, maxColorValue=255), .... )
Or you can use this vector:
> rgb(shift.BR, maxColorValue=255)
[1] "#1616FF" "#2D2DFF" "#4343FF" "#5A5AFF" "#7070FF" "#8787FF" "#9D9DFF" "#B4B4FF" "#CACAFF" "#E1E1FF" "#F7F7FF"
[12] "#FFD9D9" "#FFA3A3" "#FF6C6C" "#FF3636" "#FF0000"
There was a somewhat similar question (also today) that asked for a blue-to-red solution for a set of values from -1 to 3 with white at the center. This is the code and output for that question:
test <- seq(-1,3, len=20)
shift.BR <- colorRamp(c("blue","white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test,col = tpal)
(But that would seem to be the wrong direction for the bias in your situation.)
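Whichever palette is chosen, heatmap.2 expects one more break than colours, so it can help to derive the number of colours from the breaks vector instead of hard-coding it. A small sketch using the breaks from the question; my_breaks and my_cols are illustrative names, and colorRampPalette is just one way to build the palette:

```r
# Breaks copied from the question
my_breaks <- c(seq(-5, -0.6, length.out = 6),
               seq(-0.5999999, 0.1, length.out = 4),
               seq(0.100009, 5, length.out = 7))

# One colour per interval between consecutive breaks
my_cols <- colorRampPalette(c("blue", "white", "red"))(length(my_breaks) - 1)

print(length(my_breaks))   # 17
print(length(my_cols))     # 16
```

These could then be passed as col = my_cols, breaks = my_breaks in the heatmap.2 call.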
