Gremlin - TypeError: Object of type GraphTraversal is not JSON serializable - gremlin

The piece of Gremlin code below runs perfectly well in the Gremlin console (it finds the unique start and end points of a k-step ego network, along with the minimum distance to each endpoint):
g.V(42062000).as("from")
.repeat(both().as("to")).emit().times(3).path()
.count(local).as("pathlen")
.select("from", "to", "pathlen")
.dedup("from", "to").toList()
And gives an output similar to the following, which is as expected:
==>{from=v[42062000], to=v[83607800], pathlen=2}
==>{from=v[42062000], to=v[23683248], pathlen=3}
==>{from=v[42062000], to=v[41762840], pathlen=3}
==>{from=v[42062000], to=v[42062000], pathlen=3}
==>{from=v[42062000], to=v[83599456], pathlen=3}
However, when converting the code to conform to the gremlinpython wrapper
(i.e. after substituting as_ for as), I'm given the error TypeError: Object of type GraphTraversal is not JSON serializable, even though it's the same query.
Has anyone faced similar issues?
I am using gremlinpython 3.4.2, but was originally using 3.3.3. My version of Python is 3.7.3.

Import the static classes using
from gremlin_python import statics
statics.load_statics(globals())
or
from gremlin_python.process.graph_traversal import elementMap, range_, local, count
Steps whose names clash with Python reserved words must end with _.
Example:
as_, range_
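For reference, here is a minimal sketch of the converted query in gremlinpython, assuming a local Gremlin Server endpoint (the URL is an assumption) and using Scope.local as the explicit form of the console's local keyword:
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import Scope
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# assumed endpoint; replace with your own Gremlin Server URL
g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin', 'g'))

result = (g.V(42062000).as_('from')
          .repeat(__.both().as_('to')).emit().times(3)
          .path().count(Scope.local).as_('pathlen')
          .select('from', 'to', 'pathlen')
          .dedup('from', 'to')
          .toList())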

Related

How to avoid a RuntimeError when calling __dict__ on a module?

This appears with some big modules like matplotlib. For example, the expression:
import importlib
obj = importlib.import_module('matplotlib')
obj_entries = obj.__dict__
Between runs, the length of obj_entries can vary from 108 up to the expected 157 entries; in particular, pyplot and some other submodules can be missing.
It works reliably when stepping through manually in the debugger with a len() statement after the dict extraction, but when run normally it does not work well.
The following error occurs:
RuntimeError: dictionary changed size during iteration
python-BaseException
I am using plain Python 3.10 on Windows; swapping versions changes nothing at all.
During some attempts I found a few interesting things: calling repr() on the module before reading its dict helps. But if the module is passed around between classes as a variable, is lazy importing more likely to happen? For now the evidence is that not all names show up, while the command-line interpreter does the opposite and returns what I expect. So this chunk of code helps bypass the behaviour...
Note: I use pkgutil.iter_modules(some_path) to discover modules in pkgutil's internal ModuleInfo form.
import pkgutil
import importlib.util

for module_info in pkgutil.iter_modules(some_path):  # module_info is a pkgutil.ModuleInfo
    name = module_info.name
    finder = module_info.module_finder
    spec = finder.find_spec(name)
    module_obj = importlib.util.module_from_spec(spec)
    loader = module_obj.__loader__
    loader.exec_module(module_obj)
I'm still unfamiliar with the internals of the import machinery, so links to a more detailed (spot-on) explanation would be helpful.
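One way to sidestep that specific RuntimeError, offered only as a sketch rather than something from the original post, is to iterate over a snapshot of the module dict instead of the live one, so that names added by lazy imports while the loop runs cannot change its size mid-iteration:
import importlib

obj = importlib.import_module('matplotlib')

# Copy the live module dict first; iterating the copy is safe even if
# lazy imports add attributes to the module while the loop is running.
obj_entries = dict(vars(obj))
for name, value in obj_entries.items():
    print(name)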

Reticulate AWS Cognito

This is my Python code (that I've checked and it works):
import boto3
from warrant.aws_srp import AWSSRP

# region_name, POOL_ID and CLIENT_ID are defined elsewhere in the script
def auth(USERNAME, PASSWORD):
    client = boto3.client('cognito-idp', region_name=region_name)
    aws = AWSSRP(username=USERNAME, password=PASSWORD, pool_id=POOL_ID,
                 client_id=CLIENT_ID, client=client)
    try:
        tokens = aws.authenticate_user()
        return tokens
    except Exception as e:
        return e
I'm working with R to create a visual interface for doing some operations (including this one), and that is a requirement.
I use the reticulate R package to execute Python code. I tested it with some dummy code to check that it works correctly (and it does).
When I execute the above function by running:
reticulate::source_python(FILE_PATH)
py$auth(USERNAME,PASSWORD)
I get the following error:
An error occurred (InvalidParameterException) when calling the RespondToAuthChallenge operation: TIMESTAMP format should be EEE MMM d HH:mm:ss z yyyy in english.
I have searched a lot but found nothing; I suppose some sort of wrapper or formatter might exist. Maybe someone has already faced this problem...
Thanks a lot for any help.
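One thing that may be worth checking, purely as an assumption on my part rather than something confirmed in this thread: the SRP challenge timestamp is built with strftime, whose weekday and month names follow the process locale, so an R session running under a non-English locale could trigger exactly this InvalidParameterException. Forcing an English time locale in the Python code before authenticating is a possible workaround:
import locale

# Assumption: the TIMESTAMP sent during the SRP challenge uses locale-dependent
# strftime fields (%a, %b); force an English time locale so Cognito accepts it.
locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')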

reusable traversal components not always working with gremlin

I'm trying to create reusable components for my traversals with gremlinpython by putting traversal components into functions, and I'm running into a problem where some of them aren't working correctly.
As setup, I'm running Gremlin Server using the Docker container, with the configuration file that loads the modern graph from the GitHub repo:
docker run -p 8182:8182 tinkerpop/gremlin-server:3.4.6 conf/gremlin-server-modern.yaml
My test python code looks like the following:
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

def connect_gremlin(endpoint='ws://localhost:8182/gremlin'):
    return traversal().withRemote(DriverRemoteConnection(endpoint, 'g'))

def n():
    return __.values('name')

def r():
    return __.range(2, 4)

g = connect_gremlin()

# works as expected
g.V().map(n()).toList()

# returns an empty list
g.V().map(n()).filter(r()).toList()

# but using range step directly works as expected
g.V().map(n()).range(2, 4).toList()
I can successfully move the values step into a function, but when I try to do the same thing with the range step it returns an empty list rather than the 2nd through 4th items. Does anyone know what I'm doing wrong?
The map step is intended to map the state of each traverser to a new state. In the context of a single traverser, a range starting anywhere but zero is not going to do what you expect.
Here are some examples using Python:
>>> g.V().map(__.range(0,1)).limit(5).toList()
[v[1400], v[1401], v[1402], v[1403], v[1404]]
>>> g.V().map(__.range(0,2)).limit(5).toList()
[v[1400], v[1401], v[1402], v[1403], v[1404]]
>>> g.V().map(__.range(1,2)).limit(5).toList()
[]
This is why the values step works inside a map step and range does not.
Rather than injecting code using a map step, why not just incrementally add to the traversal and then iterate it when complete?
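As a sketch of that suggestion (the helper names below are illustrative, not from the original post), the reusable pieces can take the traversal built so far and append steps to it, rather than being injected through map() and filter():
def names(t):
    # append the values('name') step to the traversal built so far
    return t.values('name')

def window(t, low, high):
    # append a range step to the traversal built so far
    return t.range(low, high)

g = connect_gremlin()
result = window(names(g.V()), 2, 4).toList()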

PyParsing returns null token inconsistently

I have a pyparsing pattern that I'm applying to the same text string in two different environments (my local machine vs a Google Dataproc environment; both are using pyparsing 2.3.1), and I'm getting two slightly different outputs. Basically, when I run
MY_PATTERN.searchString(text)
on my local I get the expected result
[[['MY RESULT']]]
and when I run it on Dataproc (using the same exact pattern and text string input), I get
[[<pyparsing._NullToken object at 0x7ff663d20208>, ['MY RESULT']]]
Any explanation for the difference?
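A quick sanity check worth running in both environments, offered only as a suggestion: print the version and file path that each interpreter actually imports, since a second pyparsing installation shadowing the expected 2.3.1 on the Dataproc image is a common cause of this kind of discrepancy:
import pyparsing

# Confirm both environments resolve to the same pyparsing build and location.
print(pyparsing.__version__)
print(pyparsing.__file__)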

Issue while ingesting a Titan graph into Faunus

I have installed both Titan and Faunus and each seems to be working properly (titan-0.4.4 & faunus-0.4.4).
However, after ingesting a sizable graph in Titan and trying to import it into Faunus via
FaunusFactory.open()
I am experiencing issues. To be more precise, I do seem to get a Faunus graph from the call FaunusFactory.open(),
faunusgraph[titanhbaseinputformat->titanhbaseoutputformat]
but then, even asking a simple
g.v(10)
I do get this error:
Task Id : attempt_201407181049_0009_m_000000_0, Status : FAILED
com.thinkaurelius.titan.core.TitanException: Exception in Titan
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.getAdminInterface(HBaseStoreManager.java:380)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.ensureColumnFamilyExists(HBaseStoreManager.java:275)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.openDatabase(HBaseStoreManager.java:228)
My property file is taken straight from the Faunus page for Titan-HBase input, except of course for changing the URL of the Hadoop cluster:
faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseInputFormat
faunus.graph.input.titan.storage.backend=hbase
faunus.graph.input.titan.storage.hostname= my IP
faunus.graph.input.titan.storage.port=2181
faunus.graph.input.titan.storage.tablename=titan
faunus.graph.output.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseOutputFormat
faunus.graph.output.titan.storage.backend=hbase
faunus.graph.output.titan.storage.hostname= IP of my host
faunus.graph.output.titan.storage.port=2181
faunus.graph.output.titan.storage.tablename=titan
faunus.graph.output.titan.storage.batch-loading=true
faunus.output.location=output1
zookeeper.znode.parent=/hbase-unsecure
titan.graph.output.ids.block-size=100000
Can anyone help?
ADDENDUM:
To address the comment below, here is some context: as I have mentioned, I have a graph in Titan and can perform basic gremlin queries on it.
However, I do need to run a global Gremlin query which, due to the size of the graph, needs Faunus and its underlying MapReduce capabilities; hence the need to import it. The error I get doesn't look to me as if it points to some inconsistency in the graph itself.
I'm not sure that you have your "flow" of Faunus right. If your end result is to do a global query of the graph, then consider this approach:
pull your graph to a sequence file
issue your global query over the sequence file
More specifically, create hbase-seq.properties:
# input graph parameters
faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseInputFormat
faunus.graph.input.titan.storage.backend=hbase
faunus.graph.input.titan.storage.hostname=localhost
faunus.graph.input.titan.storage.port=2181
faunus.graph.input.titan.storage.tablename=titan
# hbase.mapreduce.scan.cachedrows=1000
# output data (graph or statistic) parameters
faunus.graph.output.format=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
faunus.output.location=snapshot
faunus.output.location.overwrite=true
In Faunus, do:
g = FaunusFactory.open('hbase-seq.properties')
g._()
That will read the graph from HBase and write it to a sequence file in HDFS. Next, create seq-noop.properties with these contents:
# input graph parameters
faunus.graph.input.format=org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
faunus.input.location=snapshot/job-0
# output data parameters
faunus.graph.output.format=com.thinkaurelius.faunus.formats.noop.NoOpOutputFormat
faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
faunus.output.location=analysis
faunus.output.location.overwrite=true
The above configuration will read your sequence file from the previous step and process it without re-writing the graph (that's what NoOpOutputFormat is for). Now in Faunus do:
g = FaunusFactory.open('seq-noop.properties')
g.V.sideEffect('{it.degree=it.bothE.count()}').degree.groupCount()
This will execute a degree distribution, writing the results in HDFS to the 'analysis' directory. Obviously you can do whatever Faunus-flavored Gremlin you want here - I just wanted to provide an example. I think this is a pretty standard "flow" or pattern for using Faunus from a graph analysis perspective.
