Gremlin graph traversal backtracking - graph

I have some code that generates gremlin traversals dynamically based on a query graph that a user supplies. It generates gremlin that relies heavily on backtracking. I've found some edge cases that are generating gremlin that doesn't do what I expected it to, but I also can't find anything online about the rules surrounding the usage of some of these pipes (like 'as' and 'back' in this case). An example of one of these edge cases:
g.V("id", "some id").as('1').inE("edgetype").outV.has("#class", "vertextype").as('2').inE("edgetype").outV.has("#class", "vertextype").filter{(it.getProperty("name") == "Bob")}.outE("edgetype").as('target').inV.has("id", "some id").back('2').outE("edgetype").inV.has("id", "some other id").back('target')
The goal of the traversal is to return the edges that were 'as'd as 'target'. When I run this against my orientdb database it encounters a null exception. I narrowed down the exception to the final pipe in the traversal: back('target'). I'm guessing that the order of the 'as's and 'back's matters and that the as('target') went 'out of scope' as soon as back('2') was executed. So a few questions:
Is my understanding (that 'target' goes out of scope because I backtracked to before it was 'as'd) correct?
Is there anywhere I can learn the rules surrounding backtracking without having to reverse engineer the gremlin source?
edit:
I found the relevant source code:
public static List<Pipe> removePreviousPipes(final Pipeline pipeline, final String namedStep) {
final List<Pipe> previousPipes = new ArrayList<Pipe>();
for (int i = pipeline.size() - 1; i >= 0; i--) {
final Pipe pipe = pipeline.get(i);
if (pipe instanceof AsPipe && ((AsPipe) pipe).getName().equals(namedStep)) {
break;
} else {
previousPipes.add(0, pipe);
}
}
for (int i = 0; i < previousPipes.size(); i++) {
pipeline.remove(pipeline.size() - 1);
}
if (pipeline.size() == 1)
pipeline.setStarts(pipeline.getStarts());
return previousPipes;
}
It looks like the BackFilterPipe does not remove any of the aspipes between the back pipe and the named as pipe, but they must not be visible to pipes outside the BackFilterPipe. I guess my algorithm for generating the code is going to have to be smarter about detecting when the target is within a meta pipe.

Turns out the way I was using 'as' and 'back' made no sense. When you consider that everything between the 'back('2')' and the 'as('2')' is contained within the BackFilterPipe's inner pipeline, it becomes clear that meta pipes can't be overlapped like this: as('2'), as('target'), back('2'), back('target').

Related

So, a mutant escaped. Now what?

I've just managed to get mutation testing working for the first time. My usual testing framework is Codeception but as of writing, it is not compatible with mutation testing (although I believe work is being done on it and it's not far off). I'm using PHPUnit and Infection, neither of which seem easy to work out how to use.
My test suite generated ten mutants. Nine were killed and one escaped. However, I don't know what part of the code or the tests needs to be improved to kill the final mutant.
How do you get information about what code allowed the mutant to escape?
I found in this blog what I couldn't find in Infection's documentation: the results are saved in infection.log.
The log file looks like this:
Escaped mutants:
================
1) <full-path-to-source-file>.php:7 [M] ProtectedVisibility
--- Original
+++ New
## ##
use stdClass;
trait HiddenValue
{
- protected function hidden_value($name = null, $value = null)
+ private function hidden_value($name = null, $value = null)
{
static $data = [];
$keys = array_map(function ($item) {
Timed Out mutants:
==================
Not Covered mutants:
====================
It says that the mutation changed the protected visibility to private and that no tests failed as a result. If this is important, I can now either change the code or write another test to cover this case.
Now that I've found this, I've searched on the Infection website for infection.log and found --show-mutations or -s which will output escaped mutants to the console while running.

.Net Core 3 Preview SequenceReader Length Delimited Parsing

I'm trying to use SequenceReader<T> in .Net Core Preview 8 to parse Guacamole Protocol network traffic.
The traffic might look as follows:
5.error,14.some text here,1.0;
This is a single error instruction. There are 3 fields:
OpCode = error
Reason = some text here
Status = 0 (see Status Codes)
The fields are comma delimited (semi-colon terminated), but they also have the length prefixed on each field. I presume that's so that you could parse something like:
5.error,24.some, text, with, commas,1.0;
To produce Reason = some, text, with, commas.
Simple comma delimited parsing is simple enough to do (with or without SequenceReader). However, to utilise the length I've tried the following:
public static bool TryGetNextElement(this ref SerializationContext context, out ReadOnlySequence<byte> element)
{
element = default;
var start = context.Reader.Position;
if (!context.Reader.TryReadTo(out ReadOnlySequence<byte> lengthSlice, Utf8Bytes.Period, advancePastDelimiter: true))
return false;
if (!lengthSlice.TryGetInt(out var length))
return false;
context.Reader.Advance(length);
element = context.Reader.Sequence.Slice(start, context.Reader.Position);
return true;
}
Based on my understanding of the initial proposal, this should work, though also could be simplified I think because some of the methods in the proposal make life a bit easier than that which is available in .Net Core Preview 8.
However, the problem with this code is that the SequenceReader does not seem to Advance as I would expect. It's Position and Consumed properties remain unchanged when advancing, so the element I slice at the end is always an empty sequence.
What do I need to do in order to parse this protocol correctly?
I'm guessing that .Reader here is a property; this is important because SequenceReader<T> is a mutable struct, but every time you access .SomeProperty you are working with an isolated copy of the reader. It is fine to hide it behind a property, but you'd need to make sure you work with a local and then push back when complete, i.e.
var reader = context.Reader;
var start = reader.Position;
if (!reader.TryReadTo(out ReadOnlySequence<byte> lengthSlice,
Utf8Bytes.Period, advancePastDelimiter: true))
return false;
if (!lengthSlice.TryGetInt(out var length))
return false;
reader.Advance(length);
element = reader.Sequence.Slice(start, reader.Position);
context.Reader = reader; // update position
return true;
Note that a nice feature of this is that in the failure cases (return false), you won't have changed the state yet, because you've only been mutating your local standalone clone.
You could also consider a ref-return property for .Reader.

Selectively removing node labels in D3 force directed diagram

Overall context: I have a db of cross-references among pages in a wiki space, and want an incrementally-growing visualization of links.
I have working code that shows clusters of labels as you mouseover. But when you move away, rather than hiding all the labels, I want to keep certain key labels (e.g. the centers of clusters).
I forked an existing example and got it roughly working.
info is at http://webseitz.fluxent.com/wiki/WikiGraphBrowser
near the bottom of that or any other page in that space, in the block that starts with "BackLinks:", at the end you'll find "Click here for WikiGraphBrowser" which will launch a window with the interface
equivalent static subset example visible at http://www.wikigraph.net/static/d3/cgmartin/WikiGraphBrowser/:
code for that example is at https://github.com/BillSeitz/WikiGraphBrowser/blob/master/js/wiki_graph.js
Code that works at removing all labels:
i = j = 0;
if (!bo) { //bo=False - from mouseout
//labels.select('text.label').remove();
labels.filter(function(o) {
return !(o.name in clicked_names);
})
.text(function(o) { return ""; });
j++;
}
Code attempting to leave behind some labels, which does not work:
labels.forEach(function(o) {
if (!(d.name in clicked_names)) {
d.text.label.remove();
}
I know I'm just not grokking the d3 model at all....
thx
The problem comes down to your use of in to search for a name in an array. The Javascript in keyword searches object keys not object values. For an array, the keys are the index values. So testing (d.name in clicked_names) will always return false.
Try
i = j = 0;
if (!bo) { //bo=False - from mouseout
//labels.select('text.label').remove();
labels.filter(function(o) {
return (clicked_names.indexOf(o.name) < 0);
})
.text(function(o) { return ""; });
j++;
}
The array .indexOf(object) method returns -1 if none of the elements in the array are equal (by triple-equals standards) to the parameter. Alternatively, if you are trying to support IE8 (I'm assuming not, since you're using SVG), you could use a .some(function) test.
By the way, there's a difference between removing a label and just setting it's text content to the empty string. Which one to use will depend on whether you want to show the text again later. Either way, just be sure you don't end up with a proliferation of empty labels clogging up your browser.

Lasso 9 Hangs on Inserting Pair with Map Value into Array?

EDIT: I accidentally misrepresented the problem when trying to pare-down the example code. A key part of my code is that I am attempting to sort the array after adding elements to it. The hang appears on sort, not insert. The following abstracted code will consistently hang:
<?=
local('a' = array)
#a->insert('test1' = map('a'='1'))
#a->insert('test2' = map('b'='2')) // comment-out to make work
#a->sort
#a
?>
I have a result set for which I want to insert a pair of values into an array for each unique key, as follows:
resultset(2) => {
records => {
if(!$logTypeClasses->contains(field('logTypeClass'))) => {
local(i) = pair(field('logTypeClass'), map('title' = field('logType'), 'class' = field('logTypeClass')))
log_critical(#i)
$logTypeClasses->insert(#i) // Lasso hangs on this line, will return if commented-out
}
}
}
Strangely, I cannot insert the #i local variable into thread variable without Lasso hanging. I never receive an error, and the page never returns. It just hangs indefinitely.
I do see the pairs logged correctly, which leads me to believe that the pair-generating syntax is correct.
I can make the code work as long as the value side of the pair is not a map with values. In other words, it works when the value side of the pair is a string, or even an empty map. As soon as I add key=value parameters to the map, it fails.
I must be missing something obvious. Any pointers? Thanks in advance for your time and consideration.
I can verify the bug with the basic code you sent with sorting. The question does arise how exactly one sorts pairs. I'm betting you want them sorted by the first element in the pair, but I could also see the claim that they should be sorted by last element in the pair (by values instead of by keys)
One thing that might work better is to keep it as a map of maps. If you need the sorted data for some reason, you could do map->keys->asArray->sort
Ex:
local(data) = map('test1' = map('a'=2,'b'=3))
#data->insert('test2' = map('c'=33, 'd'=42))
local(keys) = #data->keys->asArray
#keys->sort
#keys
Even better, if you're going to just iterate through a sorted set, you can just use a query expression:
local(data) = map('test1' = map('a'=2,'b'=3))
#data->insert('test2' = map('c'=33, 'd'=42))
with elm in #data->eachPair
let key = #elm->first
let value = #elm->second
order by #key
do { ... }
I doubt you problem is the pair with map construct per se.
This test code works as expected:
var(testcontainer = array)
inline(-database = 'mysql', -table = 'help_topic', -findall) => {
resultset(1) => {
records => {
if(!$testcontainer->contains(field('name'))) => {
local(i) = pair(field('name'), map('description' = field('description'), 'name' = field('name')))
$testcontainer->insert(#i)
}
}
}
}
$testcontainer
When Lasso hangs like that with no feedback and no immediate crash it is usually trapped in some kind of infinite loop. I'm speculating that it might have to do with Lasso using references whenever possible. Maybe some part of your code is using a reference that references itself. Or something.

Full text search on a mobile device?

We'll soon be embarking on the development of a new mobile application. This particular app will be used for heavy searching of text based fields. Any suggestions from the group at large for what sort of database engine is best suited to allowing these types of searches on a mobile platform?
Specifics include Windows Mobile 6 and we'll be using the .Net CF. Also some of the text based fields will be anywhere between 35 and 500 characters. The device will operate in two different methods, batch and WiFi. Of course for WiFi we can just submit requests to a full blown DB engine and just fetch results back. This question centres around the "batch" version which will house a database loaded with information on the devices flash/removable storage card.
At any rate, I know SQLCE has some basic indexing but you don't get into the real fancy "full text" style indexes until you've got the full blown version which of course isn't available on a mobile platform.
An example of what the data would look like:
"apron carpenter adjustable leather container pocket waist hardware belt" etc. etc.
I haven't gotten into the evaluation of any other specific options yet as I figure I'd leverage the experience of this group in order to first point me down some specific avenues.
Any suggestions/tips?
Just recently I had the same issue. Here is what I did:
I created a class to hold just an id and the text for each object (in my case I called it a sku (item number) and a description). This creates a smaller object that uses less memory since it is only used for searching. I'll still grab the full-blown objects from the database after I find matches.
public class SmallItem
{
private int _sku;
public int Sku
{
get { return _sku; }
set { _sku = value; }
}
// Size of max description size + 1 for null terminator.
private char[] _description = new char[36];
public char[] Description
{
get { return _description; }
set { _description = value; }
}
public SmallItem()
{
}
}
After this class is created, you can then create an array (I actually used a List in my case) of these objects and use it for searching throughout your application. The initialization of this list takes a bit of time, but you only need to worry about this at start up. Basically just run a query on your database and grab the data you need to create this list.
Once you have a list, you can quickly go through it searching for any words you want. Since it's a contains, it must also find words within words (e.g. drill would return drill, drillbit, drills etc.). To do this, we wrote a home-grown, unmanaged c# contains function. It takes in a string array of words (so you can search for more than one word... we use it for "AND" searches... the description must contain all words passed in... "OR" is not currently supported in this example). As it searches through the list of words it builds a list of IDs, which are then passed back to the calling function. Once you have a list of IDs, you can easily run a fast query in your database to return the full-blown objects based on a fast indexed ID number. I should mention that we also limit the maximum number of results returned. This could be taken out. It's just handy if someone types in something like "e" as their search term. That's going to return a lot of results.
Here's the example of custom Contains function:
public static int[] Contains(string[] descriptionTerms, int maxResults, List<SmallItem> itemList)
{
// Don't allow more than the maximum allowable results constant.
int[] matchingSkus = new int[maxResults];
// Indexes and counters.
int matchNumber = 0;
int currentWord = 0;
int totalWords = descriptionTerms.Count() - 1; // - 1 because it will be used with 0 based array indexes
bool matchedWord;
try
{
/* Character array of character arrays. Each array is a word we want to match.
* We need the + 1 because totalWords had - 1 (We are setting a size/length here,
* so it is not 0 based... we used - 1 on totalWords because it is used for 0
* based index referencing.)
* */
char[][] allWordsToMatch = new char[totalWords + 1][];
// Character array to hold the current word to match.
char[] wordToMatch = new char[36]; // Max allowable word size + null terminator... I just picked 36 to be consistent with max description size.
// Loop through the original string array or words to match and create the character arrays.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
char[] desc = new char[descriptionTerms[currentWord].Length + 1];
Array.Copy(descriptionTerms[currentWord].ToUpper().ToCharArray(), desc, descriptionTerms[currentWord].Length);
allWordsToMatch[currentWord] = desc;
}
// Offsets for description and filter(word to match) pointers.
int descriptionOffset = 0, filterOffset = 0;
// Loop through the list of items trying to find matching words.
foreach (SmallItem i in itemList)
{
// If we have reached our maximum allowable matches, we should stop searching and just return the results.
if (matchNumber == maxResults)
break;
// Loop through the "words to match" filter list.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
// Reset our match flag and current word to match.
matchedWord = false;
wordToMatch = allWordsToMatch[currentWord];
// Delving into unmanaged code for SCREAMING performance ;)
unsafe
{
// Pointer to the description of the current item on the list (starting at first char).
fixed (char* pdesc = &i.Description[0])
{
// Pointer to the current word we are trying to match (starting at first char).
fixed (char* pfilter = &wordToMatch[0])
{
// Reset the description offset.
descriptionOffset = 0;
// Continue our search on the current word until we hit a null terminator for the char array.
while (*(pdesc + descriptionOffset) != '\0')
{
// We've matched the first character of the word we're trying to match.
if (*(pdesc + descriptionOffset) == *pfilter)
{
// Reset the filter offset.
filterOffset = 0;
/* Keep moving the offsets together while we have consecutive character matches. Once we hit a non-match
* or a null terminator, we need to jump out of this loop.
* */
while (*(pfilter + filterOffset) != '\0' && *(pfilter + filterOffset) == *(pdesc + descriptionOffset))
{
// Increase the offsets together to the next character.
++filterOffset;
++descriptionOffset;
}
// We hit matches all the way to the null terminator. The entire word was a match.
if (*(pfilter + filterOffset) == '\0')
{
// If our current word matched is the last word on the match list, we have matched all words.
if (currentWord == totalWords)
{
// Add the sku as a match.
matchingSkus[matchNumber] = i.Sku.ToString();
matchNumber++;
/* Break out of this item description. We have matched all needed words and can move to
* the next item.
* */
break;
}
/* We've matched a word, but still have more words left in our list of words to match.
* Set our match flag to true, which will mean we continue continue to search for the
* next word on the list.
* */
matchedWord = true;
}
}
// No match on the current character. Move to next one.
descriptionOffset++;
}
/* The current word had no match, so no sense in looking for the rest of the words. Break to the
* next item description.
* */
if (!matchedWord)
break;
}
}
}
}
};
// We have our list of matching skus. We'll resize the array and pass it back.
Array.Resize(ref matchingSkus, matchNumber);
return matchingSkus;
}
catch (Exception ex)
{
// Handle the exception
}
}
Once you have the list of matching skus, you can iterate through the array and build a query command that only returns the matching skus.
For an idea of performance, here's what we have found (doing the following steps):
Search ~171,000 items
Create list of all matching items
Query the database, returning only the matching items
Build full-blown items (similar to SmallItem class, but a lot more fields)
Populate a datagrid with the full-blow item objects.
On our mobile units, the entire process takes 2-4 seconds (takes 2 if we hit our match limit before we have searched all items... takes 4 seconds if we have to scan every item).
I've also tried doing this without unmanaged code and using String.IndexOf (and tried String.Contains... had same performance as IndexOf as it should). That way was much slower... about 25 seconds.
I've also tried using a StreamReader and a file containing lines of [Sku Number]|[Description]. The code was similar to the unmanaged code example. This way took about 15 seconds for an entire scan. Not too bad for speed, but not great. The file and StreamReader method has one advantage over the way I showed you though. The file can be created ahead of time. The way I showed you requires the memory and the initial time to load the List when the application starts up. For our 171,000 items, this takes about 2 minutes. If you can afford to wait for that initial load each time the app starts up (which can be done on a separate thread of course), then searching this way is the fastest way (that I've found at least).
Hope that helps.
PS - Thanks to Dolch for helping with some of the unmanaged code.
You could try Lucene.Net. I'm not sure how well it's suited to mobile devices, but it is billed as a "high-performance, full-featured text search engine library".
http://incubator.apache.org/lucene.net/
http://lucene.apache.org/java/docs/

Resources