JanusGraph BulkLoad reports NoSuchElement Error, nodes loaded but no edges - gremlin

I am exploring ways of loading CSV data into JanusGraph. I tried the grateful-dead example from the official documentation and it worked just fine. The approach is as follows:
hadoop-load-csv.properties
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.scriptInputFormat.script=./data/script-input-grateful-dead.groovy
gremlin.hadoop.inputLocation=./data/grateful-dead.txt
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
janusgraph-grateful.properties
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.hbase.table=grateful
storage.hostname=
run.groovy
outputGraphConfig = [path to janusgraph-grateful.properties]
:load ./data/grateful-dead-janusgraph-schema.groovy
graph = JanusGraphFactory.open(outputGraphConfig)
defineGratefulDeadSchema(graph)
graph.close()
readGraph = GraphFactory.open([path to hadoop-load-csv.properties])
blvp = BulkLoaderVertexProgram.build().writeGraph(outputGraphConfig).create(readGraph)
readGraph.compute(SparkGraphComputer).program(blvp).submit().get()
g = GraphFactory.open(outputGraphConfig).traversal()
g.V().count()
g.E().count()
After that I dropped the whole graph, subsampled the data and loaded it again, and it failed.
1,song,HEY BO DIDDLEY,cover,5 followedBy,2,1|followedBy,3,2|followedBy,4,1|followedBy,5,1|followedBy,6,1
2,song,IM A MAN,cover,1 followedBy,1,1
3,song,NOT FADE AWAY,cover,531 followedBy,5,572 followedBy,5,40|followedBy,1,2
4,song,BERTHA,original,394 followedBy,10,4 followedBy,1,1
5,song,GOING DOWN THE ROAD FEELING BAD,cover,293
6,song,MONA,cover,1 sungBy,3|writtenBy,5 followedBy,1,1|followedBy,2,1
7,song,WHERE HAVE THE HEROES GONE,,0 followedBy,8,1 followedBy,9,1
8,song,OH BOY,cover,2 followedBy,9,1|followedBy,3,1|followedBy,7,1|sungBy,5|writtenBy,4 followedBy,1,1|followedBy,7,1|followedBy,6,1
800,song,WINING BOY BLUES,cover,1 sungBy,5|writtenBy,4
9,song,HERE COMES SUNSHINE,original,65 followedBy,10,1 followedBy,6,2
10,song,HERE COMES SUNSHINE,original,65
I got a NoSuchElement Error and when I looked into the graph, g.V().count() returns 10 while g.E().count() returns 0.
Does anyone know what is happening? It would be very kind of you to give me some advice.
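One way to narrow this down is to sanity-check the subsampled file before loading it. The following is a minimal sketch, not a confirmed diagnosis. It assumes the file keeps the layout of the stock grateful-dead.txt, that is, three tab-separated fields per line (vertex, out-edges, in-edges) with edges separated by "|" and the other vertex id in the second comma-separated position. The function name check_adjacency_file is made up for illustration.

```python
def check_adjacency_file(lines):
    """Return a list of human-readable problems found in the input lines."""
    problems = []
    vertex_ids = set()
    references = []  # (line_number, referenced_vertex_id)

    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumed layout: vertex <TAB> out-edges <TAB> in-edges
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(
                f"line {number}: expected 3 tab-separated fields, got {len(fields)}"
            )
        # The vertex id is the first comma-separated value of the first field.
        vertex_ids.add(fields[0].split(",", 1)[0])
        for edge_field in fields[1:3]:
            for edge in filter(None, edge_field.split("|")):
                parts = edge.split(",")
                if len(parts) < 2:
                    problems.append(f"line {number}: malformed edge {edge!r}")
                    continue
                references.append((number, parts[1]))

    # Report edges whose other endpoint never appears as a vertex line.
    for number, target in references:
        if target not in vertex_ids:
            problems.append(
                f"line {number}: edge points at unknown vertex id {target}"
            )
    return problems


if __name__ == "__main__":
    sample = [
        "1,song,A,cover,5\tfollowedBy,2,1\t",
        "2,song,B,cover,1\t\tfollowedBy,1,1|sungBy,99",
    ]
    for problem in check_adjacency_file(sample):
        print(problem)
```

If a check like this reports missing fields or dangling ids after subsampling, that is a reasonable place to start looking, although it does not by itself prove what raised the error.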

Related

Neo4j - Project graph from Neo4j with GraphDataScience

So I have a graph of nodes -- "Papers" and relationships -- "Citations".
Nodes have properties: "x", a list with 0/1 entries corresponding to whether a word is present in the paper or not, and "y" an integer label (one of the classes from 0-6).
I want to project the graph from Neo4j using GraphDataScience.
I've been following the documentation and did manage to project the nodes and relationships of the graph:
Code
from graphdatascience import GraphDataScience
AURA_CONNECTION_URI = "neo4j+s://xxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = "my_code:)"
# Client instantiation
gds = GraphDataScience(
AURA_CONNECTION_URI,
auth=(AURA_USERNAME, AURA_PASSWORD),
aura_ds=True
)
#Shorthand projection --works
shorthand_graph, result = gds.graph.project(
"short-example-graph",
["Paper"],
["Citation"]
)
When I do print(result) it shows
nodeProjection {'Paper': {'label': 'Paper', 'properties': {}}}
relationshipProjection {'Citation': {'orientation': 'NATURAL', 'aggre...
graphName short-example-graph
nodeCount 2708
relationshipCount 10556
projectMillis 34
Name: 0, dtype: object
However, no properties of the nodes are projected. I then use the extended syntax as described in the documentation:
# Project a graph using the extended syntax
extended_form_graph, result = gds.graph.project(
"Long-form-example-graph",
{'Paper': {properties: "x"}},
"Citation"
)
print(result)
#Errors
I get the error:
NameError: name 'properties' is not defined
I tried several variations of this, with and without quotation marks, but none have worked so far (the documentation is also confusing, because one page always uses quotation marks while another does not).
Also, note that all my properties are integers in the Neo4j db (in AuraDS), as I used to have the error that String properties are not supported.
Some clarification on the correct way of projecting node features (aka properties) would be very useful.
thank you,
Dina
The keys in the Python dictionaries that you use with the GraphDataScience library should be enclosed in quotation marks. This is different from Cypher syntax, where map keys are not enclosed with quotation marks.
This should work for you.
extended_form_graph, result = gds.graph.project(
"Long-form-example-graph",
{'Paper': {"properties": "x"}},
"Citation"
)
Best wishes,
Nathan
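To illustrate the point about quoted keys without needing a database connection, here is a small sketch in plain Python. The helper node_projection is hypothetical (it is not part of the graphdatascience package); it only builds the dictionary that would be passed as the second argument to gds.graph.project. Listing both "x" and "y" is an assumption based on the properties described in the question.

```python
# Hypothetical helper: builds the node-projection argument as a plain dict.
def node_projection(label, properties):
    # Every key is a string literal, so Python never looks it up as a variable.
    return {label: {"properties": list(properties)}}


spec = node_projection("Paper", ["x", "y"])
print(spec)  # {'Paper': {'properties': ['x', 'y']}}

# With an unquoted key, Python tries to resolve `properties` as a variable
# and fails before the library is ever called.
try:
    eval("{'Paper': {properties: 'x'}}")
except NameError as err:
    print(err)  # name 'properties' is not defined
```

The resulting dict could then be passed in place of the literal, for example gds.graph.project("Long-form-example-graph", spec, "Citation").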

How to Speed up read_html runtime in r?

I have a character vector of 400 URLs called URLs.
I have a loop that has been working for a while, but now it takes far too long. It used to just report a failing URL as an error, which I would then omit, but now it gets hung up.
dput(URLs)
c("http://www.chinadaily.com.cn/a/202102/04/WS601b5bd7a31024ad0baa736d.html",
"http://www.xinhuanet.com/english/2021-02/02/c_139716479.htm",
"http://www.china.org.cn/world/Off_the_Wire/2021-02/02/content_77181645.htm",
"http://english.sina.com/world/af/2021-02-02/detail-ikftssap2511288.shtml",
"https://www.beijingnews.net/news/267750643/fox-takes-clubhouse-lead-as-johnson-makes-move-in-saudi-arabia",
"https://www.beijingnews.net/news/267768819/johnson-excited-for-season-after-second-saudi-title",
"https://en.wtcf.org.cn/GlobalNews/2021020320227.html", "https://www.ladepeche.fr/2021/02/08/golf-un-top-4-royal-pour-victor-perez-9360378.php",
"https://sport24.lefigaro.fr/golf/tour-europeen/actualites/victor-perez-dans-les-pas-de-dustin-johnson-en-arabie-saoudite-1032163",
"https://sport24.lefigaro.fr/golf/tour-europeen/actualites/european-tour-victor-perez-a-longtemps-tenu-tete-a-dustin-johnson-en-arabie-saoudite-1032273",
"https://www.france24.com/en/live-news/20210206-johnson-seizes-two-shot-lead-in-saudi-international",
"https://www.france24.com/en/live-news/20210205-fox-takes-clubhouse-lead-as-johnson-makes-move-in-saudi-arabia",
"https://www.france24.com/en/live-news/20210203-big-hitting-dechambeau-happy-to-take-longer-clubs-out-of-rivals-hands",
"https://www.france24.com/en/live-news/20210203-as-bubble-life-drags-on-psychologists-say-cricketers-need-more-support",
"https://www.sports.fr/golf/circuit-europeen/golf-perez-gratin-arabie-saoudite-426859.html",
"https://www.sport.fr/golf/lopen-de-france-est-sauve-758291.shtm",
"https://www.ffgolf.org/Actus/Pro/European-Tour/Saudi-International-ET-Perez-n-est-pas-passe-loin",
"https://www.ffgolf.org/Actus/Pro/European-Tour/Saudi-International-ET-Perez-a-rendez-vous-avec-DJ-dimanche",
"https://www.ffgolf.org/Actus/Pro/European-Tour/Saudi-International-ET-Rozner-au-sec-a-6-Perez-a-7",
"https://www.ffgolf.org/Actus/Pro/European-Tour/Saudi-International-ET-Rozner-et-Perez-demarrent-bien",
"https://www.ffgolf.org/Actus/Pro/LPGA-Tour/Franck-Riboud-On-va-pouvoir-continuer-a-travailler-sereinement",
"https://www.ffgolf.org/Actus/Pro/Feuilletons/Paroles-de-coach/Paroles-de-coach-6-Gwladys-Nocera",
"https://franceracing.fr/other/porsche-et-tag-heuer-scellent-un-partenariat-strategique/",
"https://www.rfi.fr/en/sports/20210206-johnson-seizes-two-shot-lead-in-saudi-international",
"https://www.rfi.fr/en/sports/20210205-fox-takes-clubhouse-lead-as-johnson-makes-move-in-saudi-arabia",
"https://www.rfi.fr/en/sports/20210203-big-hitting-dechambeau-happy-to-take-longer-clubs-out-of-rivals-hands",
"https://www.rfi.fr/en/sports/20210203-as-bubble-life-drags-on-psychologists-say-cricketers-need-more-support",
"https://www.jeudegolf.org/EasyBlog/Agathe-sauzon.html", "http://topactu.net/2021/02/viktor-hovland-vaults-into-farmers-lead-at-wet-torrey-pines/",
"https://www.sueddeutsche.de/sport/golf-kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-dpa.urn-newsml-dpa-com-20090101-210207-99-337940",
"https://www1.wdr.de/sport/golf-martin-kaymer-saudi-arabien-100.html",
"https://www.augsburger-allgemeine.de/sport/sonstige-sportarten/Kaymer-18-bei-Golf-Turnier-in-Saudi-Arabien-Johnson-siegt-id59059886.html",
"https://www.schwaebische.de/sport/ueberregionaler-sport_artikel,-kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-_arid,11325827.html",
"https://www.sport.de/news/ne4341625/golf--kaymer-beendet-turnier-in-saudi-arabien-als-18/",
"https://www.mz-web.de/sport/golf/kaymer-18--bei-golf-turnier-in-saudi-arabien---johnson-siegt-38027428",
"https://www.nwzonline.de/sport-meldungen/european-tour-kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_a_50,12,475833623.html",
"https://www.volksstimme.de/golf/news/kaymer-18.-bei-golf-turnier-in-saudi-arabien---johnson-siegt/1612702615000",
"https://www.wn.de/Sport/Weltsport/Golf/4360897-European-Tour-Kaymer-18.-bei-Golf-Turnier-in-Saudi-Arabien-Johnson-siegt",
"https://www.mainpost.de/sport/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-art-10562664",
"https://www.moz.de/nachrichten/sport/news/european-tour-kaymer-18.-bei-golf-turnier-in-saudi-arabien-johnson-siegt-54931493.html",
"https://www.svz.de/sport/weitere-sportarten/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-id31187247.html?nojs=true",
"https://www.rhein-zeitung.de/sport/aus-aller-welt/aus-aller-welt-golf_artikel,-kaymer-18-bei-golfturnier-in-saudiarabien-johnson-siegt-_arid,2220135.html",
"https://www.rhein-zeitung.de/sport/aus-aller-welt/aus-aller-welt-golf_artikel,-martin-kaymer-sagt-olympiastart-in-tokio-ab-_arid,2274019.html",
"https://www.allgemeine-zeitung.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.echo-online.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.mittelhessen.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.muensterschezeitung.de/Sport/Sportarten/Golf/4360897-European-Tour-Kaymer-18.-bei-Golf-Turnier-in-Saudi-Arabien-Johnson-siegt",
"https://www.wiesbadener-kurier.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.giessener-anzeiger.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://newsroom.porsche.com/de/2021/unternehmen/porsche-sportwagenhersteller-tag-heuer-luxusuhren-schmiede-zusammenarbeit-videostream-23558.html",
"https://www.azonline.de/Sport/Weitere-Sportarten/Golf/4360897-European-Tour-Kaymer-18.-bei-Golf-Turnier-in-Saudi-Arabien-Johnson-siegt",
"https://www.borkenerzeitung.de/welt/sport/Kaymer-18-bei-Golf-Turnier-in-Saudi-Arabien-Johnson-siegt-327224.html",
"https://www.golfpost.de/european-tour-saudi-international-2021-ergebnisse-runde-2-7777396527/",
"https://www.golfpost.de/396354-7777396354/", "https://www.golfpost.de/german-challenge-powerd-by-vcg-golf-challenge-tour-kehrt-nach-deutschland-zurueck-7777396396/",
"https://www.golfpost.de/die-macht-der-moneten-saudi-arabien-auf-dem-weg-zum-big-player-im-golf-7777396387/",
"https://www.kreis-anzeiger.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.wormser-zeitung.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://m.azonline.de/Sport/Weitere-Sportarten/Golf/4361712-PGA-Turnier-US-Golfstar-Koepka-triumphiert-bei-Phoenix-Open",
"https://www.mv-online.de/sport/sportmix/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-409658.html",
"https://www.golf.de/publish/dgv-sport/golf-team-germany/news/60228375/sophia-popov-nach-major-sieg-in-elite-team-germany",
"https://www.golf.de/publish/tournews/nachrichten-tour/60228372/einmal-saudi-einmal-etwas-gaudi",
"https://www.golf.de/publish/tournews/nachrichten-tour/60228387/koepka-comeback-und-eine-wuestenbilanz",
"https://www.ev-online.de/sport/sportmix/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt-409655.html",
"https://www.nach-welt.com/dustin-johnson-setzt-masstabe-aber-jordan-spieth-justin-rose-und-brooks-koepka-kehren-zur-form-zuruck/",
"https://www.nach-welt.com/ryan-fox-wird-sechster-wahrend-dustin-johnson-saudi-international-gewinnt/",
"https://www.usinger-anzeiger.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.gaeubote.de/Nachrichten/Golf-Turnier-in-Muenchen-Kaymer-faellt-zurueck-86604.html",
"https://www.gaeubote.de/Nachrichten/Kaymer-nach-Traumrunde-Zweiter-bei-Golf-Turnier-in-Muenchen-86664.html",
"https://www.main-spitze.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.lauterbacher-anzeiger.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.oberhessische-zeitung.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://de.advfn.com/p.php?pid=nmona&article=84265497", "https://www.buerstaedter-zeitung.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.golftime.de/golf-nachrichten/challenge-tour-in-deutschland-neues-profi-turnier/",
"https://www.golftime.de/golf-nachrichten/martin-kaymer-saudi-international-tour-news/",
"https://www.golftime.de/magazin/distanz-usga-ra-elite-spieler-regel-anpassung/",
"https://www.dmm.travel/nc/news/porsche-und-tag-heuer-arbeiten-zusammen/",
"https://www.lampertheimer-zeitung.de/sport/golf/kaymer-18-bei-golf-turnier-in-saudi-arabien-johnson-siegt_23109750",
"https://www.hongkongherald.com/news/267768819/johnson-excited-for-season-after-second-saudi-title",
"https://www.hongkongherald.com/news/267750643/fox-takes-clubhouse-lead-as-johnson-makes-move-in-saudi-arabia",
"http://hongkongcityportal.com/saudi-international-englands-david-horsey-leads-from-scotlands-stephen-gallacher/",
"http://hongkongcityportal.com/bryson-dechambeau-flattered-and-welcomes-proposed-rule-changes/",
"http://hongkongcityportal.com/paul-casey-englishman-defends-saudi-international-u-turn/",
"https://as.com/masdeporte/2021/02/03/golf/1612378989_020231.html",
"https://www.marca.com/golf/2021/02/07/601fd7c122601d860c8b45dc.html",
"https://www.marca.com/golf/2021/05/02/608ece1b22601d9d5d8b45f0.html",
"https://www.marca.com/golf/2021/02/03/601ad5d7268e3ef01e8b4670.html",
"https://www.republicworld.com/sports-news/other-sports/johnson-eases-to-another-victory-at-saudi-international.html",
"https://www.republicworld.com/sports-news/other-sports/dustin-johnson-within-1-shot-of-lead-at-saudi-international.html",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/dustin-johnson-excited-for-season-after-second-saudi-title/articleshow/80737390.cms",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/johnson-eases-to-another-victory-at-saudi-international/articleshow/80736264.cms",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/ryan-fox-takes-surprise-lead-at-saudi-international/articleshow/80711869.cms",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/horsey-goes-on-birdie-blitz-for-saudi-international-lead/articleshow/80691513.cms",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/shubhankar-shoots-69-in-opening-round-at-saudi-international/articleshow/80691501.cms",
"https://timesofindia.indiatimes.com/sports/golf/top-stories/big-hitting-dechambeau-happy-to-take-longer-clubs-out-of-rivals-hands/articleshow/80672723.cms",
"https://timesofindia.indiatimes.com/sports/cricket/news/as-bubble-life-drags-on-psychologists-say-cricketers-need-more-support/articleshow/80662353.cms",
"https://www.abc.es/deportes/abci-sergio-garcia-apunta-ryder-202102070038_noticia.html",
"https://www.abc.es/deportes/abci-golfistas-golpe-gimnasio-202102050031_noticia.html",
"https://www.investing.com/news/general/golf-johnson-holds-on-to-clinch-second-saudi-international-title-2411514"
)
####I have tried this:
html_reader<- function(x){return( tryCatch(xml2::read_html(URLs[k]), error = function(e) NULL))}
for (k in seq_along(URLs)) parsed_pages[k] <-lapply(as.list(URLs), html_reader)
I haven't run into issues with runtime until now, for some reason. The function will not complete even with the try() error handling.
My current working code is the following:
pp <- replicate(list(), n = length(ESPN))
for (k in seq_along(ESPN)) pp[[k]] <- try(xml2::read_html(ESPN[k]), silent = TRUE)
It used to just take a while but now it never finishes.
I think the issue I am running into is due to open connections. The script got progressively slower, and I suspect the old connections were the cause. Calling closeAllConnections() closes them all; it takes no arguments, so it does not need to be wrapped in a loop over the URLs. I will know when I run a particular report again whether this is the solution, but it has seemed to help so far.
closeAllConnections()

Users who mentioned each other in Gremlin

We have a small example Twitter database:
user -[TWEETED]-> tweet -[MENTIONED]-> user2
and I would like to find out how to write a query in Gremlin, that shows who were the users who mentioned each other. I have already read the docs but I don't know how to do it.
Given this sample data, which assumes that marko and stephen mention each other and that marko and daniel mention each other:
g = new TinkerGraph()
vMarko = g.addVertex("marko", [type:"user"])
vStephen = g.addVertex("stephen", [type:"user"])
vDaniel = g.addVertex("daniel", [type:"user"])
vTweetm1s = g.addVertex("m1s", [type:"tweet"])
vTweetm2d = g.addVertex("m2d", [type:"tweet"])
vTweets1m = g.addVertex("s1m", [type:"tweet"])
vTweetd1m = g.addVertex("d1m", [type:"tweet"])
vMarko.addEdge("tweeted",vTweetm1s)
vMarko.addEdge("tweeted",vTweetm2d)
vStephen.addEdge("tweeted",vTweets1m)
vDaniel.addEdge("tweeted",vTweetd1m)
vTweetm1s.addEdge("mentioned", vStephen)
vTweetm2d.addEdge("mentioned", vDaniel)
vTweets1m.addEdge("mentioned", vMarko)
vTweetd1m.addEdge("mentioned", vMarko)
you could handle it with the following:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m').out("tweeted")
.out("mentioned").as('e').select.filter{it[0]==it[2]}
==>[s:v[daniel], m:v[marko], e:v[daniel]]
==>[s:v[stephen], m:v[marko], e:v[stephen]]
==>[s:v[marko], m:v[stephen], e:v[marko]]
==>[s:v[marko], m:v[daniel], e:v[marko]]
This approach uses select to extract the data from the labelled steps, then a final filter to keep the rows where "s" (the vertex in the first position) equals "e" (the vertex in the final position). A match means a cycle pattern was detected: one user mentioned another, and the other mentioned that user back at some point.
If you follow that much then we can clean up the result a little bit so as to get the unique set of pairs:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m')
.out("tweeted").out("mentioned").as('e')
.select.filter{it[0]==it[2]}
.transform{[it[0].id,it[1].id] as Set}.toList() as Set
==>[daniel, marko]
==>[stephen, marko]
By adding a transform to the previous code, we can convert the result to "id" (the user's name in this case) and flip everything to Set so as to get unique pairs of results.
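For readers who want to see the same logic outside of Gremlin, here is a plain-Python restatement over the sample data. It is only a sketch of the idea (follow tweeted, then mentioned, twice, and keep the paths that return to the starting user); it is not a Gremlin client, and the dictionaries and function names are made up for illustration.

```python
# Sample data from above as adjacency maps (illustrative names).
tweeted = {
    "marko": ["m1s", "m2d"],
    "stephen": ["s1m"],
    "daniel": ["d1m"],
}
mentioned = {
    "m1s": ["stephen"],
    "m2d": ["daniel"],
    "s1m": ["marko"],
    "d1m": ["marko"],
}


def mentions_of(user, tweeted, mentioned):
    """Users reached by user -[tweeted]-> tweet -[mentioned]-> other."""
    return {
        target
        for tweet in tweeted.get(user, [])
        for target in mentioned.get(tweet, [])
    }


def mutual_mentions(tweeted, mentioned):
    """Unordered pairs of users who each mentioned the other."""
    pairs = set()
    for user in tweeted:
        for other in mentions_of(user, tweeted, mentioned):
            # Same test as the filter step: the second hop must return to `user`.
            if user in mentions_of(other, tweeted, mentioned):
                pairs.add(frozenset((user, other)))
    return pairs


for pair in sorted(sorted(p) for p in mutual_mentions(tweeted, mentioned)):
    print(pair)
# ['daniel', 'marko']
# ['marko', 'stephen']
```

Using frozenset plays the same role as the "as Set" conversion: (marko, stephen) and (stephen, marko) collapse into one pair.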

Updating edge attributes of a large dense graph

I have a large, dense graph whose edge attributes are updated using the following code. Briefly, I set the edge attributes based on calculations on values fetched from other dictionaries (degdict, pifeadict, nodeneidict, etc.). My smallest graph has 15 million edges. When execution reaches this stage, CPU usage dips as low as 10% and memory usage climbs to 69%. For larger graphs, the process is killed once memory usage reaches 90%. I am not sure where things are going wrong.
In addition to fixing this memory problem, I also need to speed up this loop, if possible - perhaps, a parallel solution to update the edge attributes. Please suggest solutions.
for fauth, sauth in Gcparam.edges_iter():
    first_deg = degdict[fauth]
    sec_deg = degdict[sauth]
    paval = float(first_deg*sec_deg)/float(currmaxdeg * currmaxdeg)
    try:
        f2 = dmpdict[first_deg][sec_deg]
    except KeyError:
        f2 = 0.0
    try:
        pival = pifeadict[first_deg][sec_deg]
    except KeyError:
        pival = 0.0
    delDval = float(abs(first_deg - sec_deg))/(float(currmaxdeg)*delT)
    f5 = calc_comm_kws(fauth, sauth, kwsdict)
    avg_ndeg = getAvgNeiDeg(fauth, sauth, nodeneidict, currmaxdeg)/delT
    prop = getPropensity(fauth, sauth, nodeneidict, currmaxdeg, Gparam)/delT
    tempdict = {'years': [year], 'pa': [paval],
                'dmp': [f2], 'pi': [pival], 'deld': [delDval],
                'delndeg': [avg_ndeg], 'delprop': [prop],
                'ck': [f5]
                }
    Gcparam[fauth][sauth].update(tempdict)
You can estimate the amount of storage you need for the data on each edge like this:
In [1]: from pympler.asizeof import asizeof
In [2]: tempdict = {'years':[1900], 'pa':[1.0],\
'dmp':[2.0], 'pi':[3.0], 'deld':[7],\
'delndeg':[3.4], 'delprop' :[7.5],\
'ck' :[22.0]
}
In [3]: asizeof(tempdict)
Out[3]: 1000
So it looks like 1000 bytes is a lower bound for what you are doing. Multiply that by the number of edges for the total.
NetworkX also has some overhead for the node and edge data structures which depends on what type of object you use for nodes. Integers are smallest.
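To make the multiplication concrete, here is a rough back-of-the-envelope sketch using the 1000-byte figure measured above and the 15 million edges mentioned in the question. The function name is made up for illustration, and the result ignores NetworkX's own per-node and per-edge overhead, so treat it as a floor rather than a prediction.

```python
BYTES_PER_EDGE = 1000      # asizeof(tempdict) from the session above
EDGE_COUNT = 15_000_000    # smallest graph mentioned in the question


def estimate_edge_attr_bytes(bytes_per_edge, edge_count):
    """Lower bound on memory used by the per-edge attribute dicts."""
    return bytes_per_edge * edge_count


total = estimate_edge_attr_bytes(BYTES_PER_EDGE, EDGE_COUNT)
print(f"{total / 1e9:.1f} GB")     # 15.0 GB
print(f"{total / 2**30:.1f} GiB")  # 14.0 GiB
```

So even the smallest graph needs on the order of 15 GB for the edge attributes alone, which is consistent with the process being killed on larger graphs.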

OrientDB GraphDatabase: OSQLSynchQuery for #RID to get graph.getVertex(rid) ... the fastest way to load a vertex from index key?

Given a basic Blueprints-compatible OrientGraph with an index 'name' (unique or notunique), any suggestions for how the following could be improved, if need be?
Note: I can't find a definitive guide to load a [blueprints] vertex using index. I have a large graph and using has('name','bob') (in console) takes 2 minutes! On the other hand, an index-based search returns in milliseconds.
The best I've come up with so far:
OrientGraph graph = new OrientGraph("local:/graph1/databases/test", "admin", "admin");
List<ODocument> resultlist = graph.getRawGraph().query(new OSQLSynchQuery<ODocument>("SELECT FROM INDEX:name WHERE KEY = 'bob'"));
ODocument resultodoc = resultlist.get(0).field("rid");
String rid = resultodoc.getIdentity().toString(); // would return something like #6:1500000
Vertex v1 = graph.getVertex(rid);
System.out.println(v1.getProperty("name"));
OrientDB supports the IndexableGraph interface. To use it take a look at:
https://github.com/tinkerpop/blueprints/wiki/Graph-Indices

Resources