RandomLinkSplit not working with HeteroData - graph

I am having some serious trouble with torch-geometric when dealing with my own data.
I am trying to build a graph that has 4 different node entities (of which only 1 bears some node features, the others are simple nodes), and 5 different edge types (of which only one bears a weight).
I have managed to do so by building a HeteroData() object and loading the different matrices with labels, attributes and so on.
The problem arises when I try to call RandomLinkSplit. Here's what my call looks like:
import torch_geometric.transforms as T
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[
        ('Patient', 'suffers_from', 'Diagnosis'),
        ('bla', 'bla', 'bla'),  # I copy all the edge types here
    ],
)
but I get an empty AssertionError on the condition:
assert isinstance(rev_edge_types, list)
So I thought that I needed to transform the graph to undirected (for some weird reason) like the tutorial does, and then to sample also reverse edges (even though I don't need them):
import torch_geometric.transforms as T
data = T.ToUndirected()(data)
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[
        ('Patient', 'suffers_from', 'Diagnosis'),
        ('bla', 'bla', 'bla'),  # I copy all the edge types here
    ],
    rev_edge_types=[
        ('Diagnosis', 'rev_suffers_from', 'Patient'),
        ...
    ],
)
but this time I get the error unsupported operand type(s) for *: 'Tensor' and 'NoneType'.
Does any expert have any idea why this is happening? I am simply trying to do a train/test split, and from the docs I read that heterogeneous graphs should be well supported, but I don't understand why this is not working, and I have been trying different things for quite a long time.
Any help would be appreciated!
Thanks

Related

Generating random configuration model graphs in Julia using iGraph

Recently I started to use iGraph in Julia to generate random configuration models, since LightGraphs takes too long to realize these objects (link to a previous question related to this: random_configuration_model(N,E) takes to long on LightGraphs.jl). To generate such graphs, I generate a vector E (1-based indexed) and from it I generate an iGraph object g as follows:
using PyCall, Distributions
ig = pyimport("igraph")
α = 0.625; N = 1000; c = 0.01 * N; p = α / (α + c)
E = zeros(Int64, N)
test = false
while !test
    s = 0
    for i in 1:N
        E[i] = rand(NegativeBinomial(α, p))
        s += E[i]
    end
    # Keep the sequence only if the degree sum is even
    if iseven(s)
        test = true
    end
end
g = ig.Graph.Realize_Degree_Sequence(E)
My first question is related to the fact that Python is 0-based indexed. Comparing the components of E to the degrees of g, it seems that ig.Graph.Realize_Degree_Sequence(E) automatically converts the index base, generating a 0-based object g from a 1-based object E. Is this correct?
Secondly, I would like to enforce the random configuration graph g to be simple, with no self-loops or multi-edges. iGraph's documentation (https://igraph.org/c/doc/igraph-Generators.html#igraph_realize_degree_sequence) says that the flag allowed_edge_types:IGRAPH_SIMPLE_SW does the job, but I am not able to find the syntax to use it in Julia. Is it possible at all to use this flag in Julia?
Be careful with LightGraphs' random_configuration_model. Last time I looked, it was broken, and it did not sample uniformly, yet the authors outright refused to fix it. I don't know if anything has changed since then.
C/igraph's degree_sequence_game() has a correctly implemented method that samples uniformly, called IGRAPH_DEGSEQ_SIMPLE_NO_MULTIPLE_UNIFORM, but for some reason it is not yet exposed in Python ... we'll look into exposing it soon.
Then you have two options:
1. Use python-igraph's "simple" method, and keep generating graphs until you get a simple one (test it with Graph.is_simple()). This uses the stub-matching method, and will sample exactly uniformly. For large degrees, it will take a long time due to many rejections. Note that this rejection method is exactly what IGRAPH_DEGSEQ_SIMPLE_NO_MULTIPLE_UNIFORM implements (albeit a bit faster).
2. Use igraph's Graph.Realize_Degree_Sequence() to create one graph with the given degree sequence, then rewire it using Graph.rewire() with a sufficiently large number of rewiring steps (at least several times the edge count). This method uses degree-preserving edge switches and can be shown to sample uniformly in the limit of a large number of switches.
The "no_multiple" method in python-igraph will again not sample uniformly.
Take a look at section 2.1 of this paper for a gentle explanation of what techniques are available for uniform sampling.
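To make the two options above concrete, here is a minimal pure-Python sketch of both ideas: stub matching with rejection, and degree-preserving edge switches. It does not call igraph at all; the function names stub_matching_simple and edge_switch, the max_tries limit and the 0-based vertex labels are assumptions made for illustration, not part of any igraph API.

```python
import random

def stub_matching_simple(degrees, rng, max_tries=10_000):
    """Pair up stubs at random; reject any pairing with loops or multi-edges."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2:
        raise ValueError("degree sum must be even")
    for _ in range(max_tries):
        rng.shuffle(stubs)
        edges = set()
        for a, b in zip(stubs[::2], stubs[1::2]):
            e = (min(a, b), max(a, b))
            if a == b or e in edges:
                break  # self-loop or multi-edge: reject the whole pairing
            edges.add(e)
        else:
            return sorted(edges)
    raise RuntimeError("no simple graph found within max_tries")

def edge_switch(edges, n_steps, rng):
    """Attempt n_steps switches (a,b),(c,d) -> (a,d),(c,b), keeping the graph simple."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible orientations
        new1 = (min(a, d), max(a, d))
        new2 = (min(c, b), max(c, b))
        if a == d or c == b or new1 in present or new2 in present:
            continue  # the switch would create a loop or a multi-edge
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return sorted(edges)

rng = random.Random(0)
degrees = [2, 2, 2, 1, 1]
g = stub_matching_simple(degrees, rng)
print(g)
print(edge_switch(g, 100, rng))
```

Both functions preserve the degree sequence by construction. The rejection loop gets slow for large degrees, which is the trade-off the first option mentions.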
You are reading the C docs of igraph. You need to read the Python documentation: https://igraph.org/python/api/latest/igraph._igraph.GraphBase.html#Degree_Sequence. So:
julia> ige = [collect(e .+ 1) for e in ig.Graph.Degree_Sequence(collect(Any, E), method="no_multiple").get_edgelist()];
julia> extrema(reduce(vcat, ige)) # OK check
(1, 1000)
julia> sg = SimpleGraph(1000)
{1000, 0} undirected simple Int64 graph
julia> for (a, b) in ige
           add_edge!(sg, a, b)
       end
julia> sg # OK check
{1000, 5192} undirected simple Int64 graph
julia> length(ige) # OK check
5192
julia> sort(degree(sg)) == sort(E) # OK check
true
I used the "no_multiple" algorithm, as the "vl" algorithm assumes a connected graph and some of the node degrees in your graph can be 0.

Julia image feature extraction using EfficientNet.jl

I am trying to use EfficientNet.jl as a feature extractor, meaning I want to extract all features after a given block in the Flux chain.
There is the built-in function
features = model(x, Val(:stages))
which returns all features after each block, which is very memory inefficient, since I only need to store values after exactly 3 blocks.
My thought was to only use a subset of the layers this way:
transparent_model = model.blocks[1:model.stages[3]]
features = transparent_model(x)
Unfortunately I get the following Error:
DimensionMismatch("Input channels must match! (3 vs. 1)")
which is in my opinion just due to a bad error message.
size(x) -> (1280,720,3,1)

Z3Py solver producing different results in Jupyter

I'm learning how to use Z3Py through the Jupyter notebooks provided here, starting with guide.ipynb. I noticed something odd when running the below example code included in the Boolean Logic section.
p = Bool('p')
q = Bool('q')
r = Bool('r')
solve(Implies(p, q), r == Not(q), Or(Not(p), r))
The first time I run this in the Jupyter notebook it produces the result [p = False, q = True, r = False]. But if I run this code again (or outside of Jupyter) I instead get the result [q = False, p = False, r = True]
Am I doing something wrong to get these different results? Also, since the notebook doesn't say it, which solution is actually correct?
If you take both obtained results, i.e. assignments to your boolean variables, you'll see that each assignment set satisfies your constraints. Hence, both results are correct.
The fact that you obtain different results on different platforms/environments might be odd, but can be explained: SMT solvers typically use heuristics during their solving process; these are often randomised, and different environments may yield different random seeds.
Bottom line: it's all good :-)
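As a quick check of the claim that both results are valid, the three constraints can be evaluated by brute force over all eight assignments. This sketch uses plain Python booleans rather than Z3; the helper name satisfies is made up for illustration.

```python
from itertools import product

def satisfies(p, q, r):
    implies_pq = (not p) or q      # Implies(p, q)
    r_is_not_q = (r == (not q))    # r == Not(q)
    not_p_or_r = (not p) or r      # Or(Not(p), r)
    return implies_pq and r_is_not_q and not_p_or_r

# Enumerate every assignment of (p, q, r) and keep the satisfying ones.
models = [m for m in product([False, True], repeat=3) if satisfies(*m)]
for p, q, r in models:
    print(f"p = {p}, q = {q}, r = {r}")
# prints:
# p = False, q = False, r = True
# p = False, q = True, r = False
```

These are exactly the two assignments reported in the question, so the solver is free to return either one.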

Recursion in FP-Growth Algorithm

I am trying to implement FP-Growth (frequent pattern mining) algorithm in Java. I have built the tree, but have difficulties with conditional FP tree construction; I do not understand what recursive function should do. Given a list of frequent items (in increasing order of frequency counts) - a header, and a tree (list of Node class instances) what steps should the function take?
I have a hard time understanding the pseudocode above. Are alpha and beta nodes in the tree, and what do the generate and construct functions do? I can do FP-Growth by hand, but find the implementation extremely confusing. If that could help, I can share my code for FP-Tree generation. Thanks in advance.
alpha is the prefix that led to this specific prefix tree
beta is the new prefix (of the tree to be constructed)
the generate line means something like: add to result set the pattern beta with support anItem.support
the construct function creates the new patterns from which the new tree is created
an example of the construct function (bottom up way) would be something like:
function construct(Tree, anItem)
    conditional_pattern_base = empty list
    in Tree find all nodes with tag = anItem
    for each node found:
        support = node.support
        conditional_pattern = empty list
        while node.parent != root_node
            conditional_pattern.append(node.parent)
            node = node.parent
        conditional_pattern_base.append( (conditional_pattern, support))
    return conditional_pattern_base
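The pseudocode above could be turned into runnable code roughly as follows. This is a sketch in Python rather than Java, under assumptions that are not in the original post: a hypothetical Node class with item, support, parent and children fields, a helper build_tree that inserts transactions already sorted by descending frequency, and a plain tree traversal instead of header-table links to find the nodes for an item.

```python
class Node:
    """One FP-tree node (hypothetical layout, for illustration only)."""
    def __init__(self, item, parent=None):
        self.item = item
        self.support = 0
        self.parent = parent
        self.children = {}

def build_tree(transactions):
    """Insert each (already frequency-ordered) transaction as a path."""
    root = Node(None)
    for transaction in transactions:
        node = root
        for item in transaction:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.support += 1
    return root

def iter_nodes(node):
    """Yield every node below `node` in depth-first order."""
    for child in node.children.values():
        yield child
        yield from iter_nodes(child)

def construct(root, an_item):
    """Collect (prefix path, support) for every node tagged an_item."""
    conditional_pattern_base = []
    for found in iter_nodes(root):
        if found.item != an_item:
            continue
        path = []
        node = found
        while node.parent is not root:   # walk up, stopping below the root
            node = node.parent
            path.append(node.item)
        path.reverse()                   # report the path root-to-leaf
        conditional_pattern_base.append((path, found.support))
    return conditional_pattern_base

root = build_tree([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]])
print(construct(root, "c"))
# prints: [(['a', 'b'], 1), (['a'], 1), (['b'], 1)]
```

The returned list is the conditional pattern base for the item. The recursive step of FP-Growth would build a new, smaller tree from these weighted paths and repeat the procedure on it.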

"Social Network Analysis Labs in R" (Stanford tutorials): Confusion over graph object / network class

I apologize if this question seems redundant, but I am beginning to play around with R and its SNA tools for a class and have been running a couple of different tutorials/labs to get accustomed. A resource that always gets recommended are the SNA labs over at Stanford, but even just running the introductory lab returns a number of errors that leave me confused. The full R code with annotations is available here:
http://sna.stanford.edu/lab.php?l=1
The first parts are fairly straightforward and I understand most of what's going on. But once I try adding vertex attributes to the graph (line 236 onwards), I encounter problems with the graph object "krack_full" that we just created. Running this... :
for (i in V(krack_full)) {
    for (j in names(attributes)) {
        krack_full <- set.vertex.attribute(krack_full,
                                           j,
                                           index = i,
                                           attributes[i + 1, j])
    }
}
... returns this:
Error in set.vertex.attribute(krack_full, j, index = i, attributes[i + :
unused argument (index = i)
So I think, fine, use the second method they outlined and just follow through:
attributes = cbind(1:length(attributes[,1]), attributes)
krack_full <- graph.data.frame(d = krack_full_nonzero_edges,
                               vertices = attributes)
Which seems to work fine - except that it literally creates an attribute called "1:length(attributes[, 1])"...
> summary(krack_full)
IGRAPH DN-- 21 232 --
attr: name (v/c), 1:length(attributes[, 1]) (v/n), AGE (v/n), TENURE (v/n), LEVEL (v/n), DEPT
(v/n), advice_tie (e/n), friendship_tie (e/n), reports_to_tie (e/n)
So, everything is acting weird already. And finally, when I try to get the vertex attributes in the next step, I encounter some errors regarding the object's class:
> get.vertex.attribute(krack_full, 'AGE')
Error in get.vertex.attribute(krack_full, "AGE") :
get.vertex.attribute requires an argument of class network.
> get.vertex.attribute(krack_full, 'TENURE')
Error in get.vertex.attribute(krack_full, "TENURE") :
get.vertex.attribute requires an argument of class network.
> get.vertex.attribute(krack_full, 'LEVEL')
Error in get.vertex.attribute(krack_full, "LEVEL") :
get.vertex.attribute requires an argument of class network.
> get.vertex.attribute(krack_full, 'DEPT')
Error in get.vertex.attribute(krack_full, "DEPT") :
get.vertex.attribute requires an argument of class network.
... From here on out pretty much nothing works the way I expected. So I suspect the graph object "krack_full" that the data was imported to is somehow not what it's supposed to be...?
Again, I'm sorry if this is a complete rookie mistake I'm making, but I would greatly appreciate it if you could point me in the right direction. I'd like to get a better grasp of what's going on here.
Thank you very much.
I strongly suspect that the tutorial you are trying to follow was developed for igraph version 0.5.4 or earlier. At that time, vertices and edges in an igraph object were indexed from zero instead of one, and the tutorial seems to account for this, judging from the following comment in the tutorial:
# IMPORTANT NOTE: Unlike in most languages, R objects are numbered
# from 1 instead of 0, so if you want the first element in a
# vector, you would reference it by vector_name[1]. HOWEVER,
# igraph objects are numbered starting from 0. This can lead to
# lots of confusion, since it's not always obvious at first which
# objects are native to R and which belong to igraph.
Since igraph 0.6, this is not true anymore; vertices and edges in the R interface of igraph are indexed from 1 just like every other well-behaved R object. You have two options here (besides asking the authors of the tutorial to update it for igraph 0.6):
You can modify the commands in the tutorial to make sure that every vertex and edge index is 1-based; i.e., if they subtracted 1 from the indices somewhere for some reason, just omit the subtraction, and similarly, if they added 1 to the indices somewhere, omit the addition. This would also be a good way to check whether you really understand what you are doing :)
Use the igraph0 package instead of igraph. The igraph0 package is identical to igraph but uses zero-based indexing to ensure that old igraph code is still functional during the transition period. However, you should keep on using igraph for new analysis projects.
For the function get.vertex.attribute, try the new function vertex_attr instead.
