I've been working on learning backtracking and I know how the general template goes, but I'm struggling to fully understand how the algorithm backtracks, specifically how it knows when to pop from a current solution and how often to.
I know it should be that we have a base case, and when we hit this base case, we then return from this current iteration. But then I'm not fully sure on why we pop from a solution many times until we start exploring again.
For example, I've been working on the classic "Generate Parentheses" problem:
Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.
E.g.
Input: n = 3
Output: ["((()))","(()())","(())()","()(())","()()()"]
Here's my working solution after applying my existing knowledge of how the template should be, but I just can't work out how currCombo.pop() falls into the thinking and visualising how it works.
function generateParenthesis(n) {
  const result = [];
  backtrack(result, n, 0, 0, []);
  return result;
};
function backtrack(result, n, open, close, currCombo) {
  if (currCombo.length === 2 * n) {
    result.push(currCombo.join(''));
    return;
  }
  if (open < n) {
    currCombo.push('(');
    backtrack(result, n, open + 1, close, currCombo);
    currCombo.pop();
  }
  if (close < open) {
    currCombo.push(')');
    backtrack(result, n, open, close + 1, currCombo);
    currCombo.pop();
  }
}
So for example, the algorithm first outputs:
"((()))"
And the second result is then:
"(()())"
But how does the algorithm know it needs to pop off 3 close brackets and then 1 open bracket, and then to continue adding brackets from there? I debugged the code line by line and just couldn't see why it would do a certain number of pop operations and then continue.
I've tried checking out Youtube videos, articles, blogs, but I just can't visualise what the algorithm is doing and how it's making the decisions that it is when it is.
Any help much appreciated. Thanks
There's nothing for the algorithm to "know" exactly; each pop is an undo of the push that set up the call frame for the recursion. That's it. Most of the logic has to do with your if statements that protect the recursion, ensuring balance.
Think of it as a depth-first graph exploration where each node is a possible arrangement of up to n ( parentheses and n ) parentheses. The open and close variables don't encode any unique information. These variables are conveniences to avoid each frame having to count the already-chosen parentheses, and having these counts enables you to avoid exploring pointless subtrees which can never produce a result, like (((( on n=3.
Each recursive call frame begins by asking whether it's at a result point. If so, save the result and return. If not, if it'd be possible to reach a result by adding a ( to the end of the string so far, explore that subtree, then undo the move by popping, resetting to a clean state. Next, try adding ) if that might lead to a result, explore the subtree, and undo the move. After undoing any modifications made and exploring one or both subtrees, return to the caller since all possibilities have been explored rooted at this node in the graph.
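One way to convince yourself that each pop is only an undo: here is a minimal sketch of the same search that hands a fresh string to every recursive call instead of mutating a shared array. The names generateParenthesisImmutable and explore are made up for this illustration. Since no frame modifies shared state, there is nothing to undo, yet the results come out in the same order.

```javascript
// Hypothetical variant for illustration: same search as the original,
// but each call receives its own string, so no pop() is needed.
function generateParenthesisImmutable(n) {
  const result = [];

  function explore(open, close, curr) {
    if (curr.length === 2 * n) {
      result.push(curr);
      return;
    }
    if (open < n) {
      // curr + '(' builds a new string; curr itself is unchanged
      explore(open + 1, close, curr + '(');
    }
    if (close < open) {
      // curr is still what this frame started with, so ')' is appended to it
      explore(open, close + 1, curr + ')');
    }
  }

  explore(0, 0, '');
  return result;
}

console.log(generateParenthesisImmutable(3));
```

In the array version, push followed by pop around the recursive call is what restores curr to "what this frame started with" before the second branch is tried.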
Let's trace how we get to the first result "((()))" and then to "(()())".
On the first call, open < n ("the first branch") is true, so we push ( and spawn one recursive call with open = 1. This initial ( stays in the result for the duration of the algorithm; close < open ("the second branch") is false for the first call frame.
On the second call, both branches are true, so we'll eventually make two recursive calls, but for starters we try pushing ( again to give the string (( and recursing with open = 2.
On the third call, both branches are true, so we'll eventually make two recursive calls, but for starters we try pushing ( again to give the string ((( and recursing with open = 3.
On the fourth call, the first branch is false, so we only perform one recursion from the second branch close < open with the string ((() and open = 3, close = 1.
On the fifth call, the first branch is false, so we only perform one recursion from the second branch close < open with the string ((()) and open = 3, close = 2.
On the sixth call, the first branch is false, so we only perform one recursion from the second branch close < open with the string ((())) and open = 3, close = 3.
On the seventh call, currCombo.length === 2 * n is true so we add ((())) to the result and return back up one call frame.
Now the sixth call resumes executing and there's nothing left to explore; recall that for this frame, the first branch was false, so we skipped it, and we've already explored the second branch recursively. All that remains is the pop after the recursive call: pop the string to ((()) and return to the caller.
Now the fifth call resumes executing and there's nothing left to explore; recall that for this frame, the first branch was false, so we skipped it, and we've already explored the second branch recursively. Pop the string to ((() and return to the caller.
Now the fourth call resumes executing and there's nothing left to explore; recall that for this frame, the first branch was false, so we skipped it, and we've already explored the second branch recursively. Pop the string to ((( and return to the caller.
Now the third call resumes executing but we haven't explored the second branch yet, so pop ( again to give the string ((, then push ) and recurse on (() and open = 2, close = 1.
A new call, the eighth call, begins, and both branches are true, so we'll eventually make two recursive calls, but for starters we try pushing ( again to give the string (()( and recursing with open = 3, close = 1.
A new call, the ninth call, begins. The first branch is false, so we only perform one recursion from the second branch close < open with the string (()() and open = 3, close = 2.
A new call, the tenth call, begins. The first branch is false, so we only perform one recursion from the second branch close < open with the string (()()) and open = 3, close = 3.
A new call, the eleventh call, begins.
On this call, currCombo.length === 2 * n is true so we add (()()) to the result and return back up one call frame.
Now the tenth call resumes executing and there's nothing left to explore; recall that for this frame, the first branch was false, so we skipped it, and we've already explored the second branch recursively. Pop the string to (()() and return to the caller.
Now the ninth call resumes executing and there's nothing left to explore; recall that for this frame, the first branch was false, so we skipped it, and we've already explored the second branch recursively. Pop the string to (()( and return to the caller.
Now the eighth call resumes executing but we haven't explored the second branch yet, so pop ( again to give the string ((), then push ) and recurse on (()) and open = 2, close = 2.
... I'll stop here; finish tracing the execution to the next result (())() which you can see we're well on our way to building.
But how does the algorithm know it needs to pop off 3 close brackets and then 1 open bracket, and then to continue adding brackets from there?
It popped off 3 close parens because there was nothing left to explore on those call frames. The first branch was false and the second branch had already been explored.
The reason the 1 open paren was popped next is that the recursive subtree of possibilities starting with the open paren had already been explored fully, but the subtree rooted with ) hadn't been explored yet. When we left off our algorithm trace, we were just launching into exploring that subtree. Once that subtree is exhausted and any results it yields have been stored, that paren will also pop off and the frame will be completely explored.
At that point, its caller (the second call in the above example) would still need to explore the subtree starting with (), open = 1, close = 1, since it had only tried ((, open = 2, close = 0 up to that point. In other words, it'll have explored the ( branch but not the ) branch that will ultimately lead to the last result ()()().
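To see the exact sequence of undo steps between the first two results, here's a small sketch that records every push, pop and saved result in a list. The function traceParenthesis and the events array are hypothetical names added for this illustration; the branching logic is copied from the question's code.

```javascript
// Hypothetical instrumented copy of the solution: every push, pop and
// saved result is appended to `events` so the order can be inspected.
function traceParenthesis(n) {
  const events = [];
  const currCombo = [];

  function backtrack(open, close) {
    if (currCombo.length === 2 * n) {
      events.push(`result ${currCombo.join('')}`);
      return;
    }
    if (open < n) {
      currCombo.push('(');
      events.push('push (');
      backtrack(open + 1, close);
      currCombo.pop();
      events.push('pop (');
    }
    if (close < open) {
      currCombo.push(')');
      events.push('push )');
      backtrack(open, close + 1);
      currCombo.pop();
      events.push('pop )');
    }
  }

  backtrack(0, 0);
  return events;
}

const events = traceParenthesis(3);
const first = events.indexOf('result ((()))');
const second = events.indexOf('result (()())');
// Everything that happened between the first and second results
const between = events.slice(first + 1, second);
console.log(between);
```

For n = 3 the slice is pop ), pop ), pop ), pop (, push ), push (, push ), push ). Nothing decided on "three pops and then one"; four frames in a row simply reached the pop line after their recursive call returned, and the fourth of them still had its second branch to try.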
Here's a visualization of the recursive call tree:
function generateParenthesis(n) {
  const result = [];
  backtrack(result, n, 0, 0, []);
  return result;
};
function backtrack(result, n, open, close, currCombo, depth=0) {
  console.log(`${" ".repeat(depth * 2)}enter '${currCombo.join("")}'`);
  if (currCombo.length === 2 * n) {
    result.push(currCombo.join(''));
    console.log(`${" ".repeat(depth * 2)}result '${currCombo.join("")}'`);
    console.log(`${" ".repeat(depth * 2)}exit '${currCombo.join("")}'`);
    return;
  }
  if (open < n) {
    currCombo.push('(');
    backtrack(result, n, open + 1, close, currCombo, depth + 1);
    currCombo.pop();
  }
  if (close < open) {
    currCombo.push(')');
    backtrack(result, n, open, close + 1, currCombo, depth + 1);
    currCombo.pop();
  }
  console.log(`${" ".repeat(depth * 2)}exit '${currCombo.join("")}'`);
}
generateParenthesis(3);
If that's too complex, you can dial it back to n = 2, trace that, then do n = 3.
I faced this issue during a migration of gremlin queries from v2 to v3.
V2-way: inE().has(some condition).outV().map().toList()[0] will return an object. This is wrapped in transform{label: it./etc/} step.
V3-way, still WIP: inE().has(some condition).outV().fold() will return an array. This is wrapped in project(...).by(...) step.
V3 works fine, I just have to unwrap an item from the array manually. I wonder if there is a more sane approach (anyway, this feels like a non-graph-friendly step).
Environment: JanusGraph, TinkerPop3+. For v2: Titan graph db and TinkerPop2+.
Update: V3 query sample
inE('edge1').
  has('cond1').outV(). // one vertex left
  project('items', 'count'). // pagination
    by(
      order().
        by('field1', decr).
      project('vertex_itself', 'vertex2', 'vertices3').
        by(identity()).
        by(outE('edge2').has('type', 'type1').limit(1).inV().fold()). // now this is empty array or single-element array, can we return element itself?
        by(inE('edge2').has('type', 'type2').outV().fold()).
      fold()).
    by(count())
Desired result shape:
[{
  items: [
    {vertex_itself: Object, vertex2: Object/null/empty, vertices3: Array},
    {}...
  ],
  count: Number,
}]
Problem: vertex2 property is always an array, empty or single-element.
Expected: vertex2 to be object or null/empty.
Update 2: it turns out my query is not finished yet. It returns many objects if the has('cond1').outV() step doesn't yield exactly one element, e.g. [{items, count}, {items, count}...]
It looks like your main issue is getting a single item from the traversal.
You can do this with next(), which retrieves the next element from the current traversal:
inE().has(some condition).outV().next()
The structure of the returned value is, I think, implementation specific. E.g. in JavaScript, you can access the item through the value property:
const result = await inE().has(some condition).outV().next();
const item = result.value;
I may not fully understand, but it sounds like from this:
inE().has(some condition).outV().fold()
you want to just grab the first vertex you come across. If that's right, then is there a reason to fold() at all? Maybe just do:
inE().has(some condition).outV().limit(1)
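If the fold() has to stay inside the by() modulator, the unwrapping can at least be kept in one place on the client. A minimal sketch in JavaScript, assuming the result arrives as plain objects in the shape described in the question (a driver may hand back Map instances instead, in which case the property access would need adapting). unwrapSingle and normalizeItems are hypothetical helper names, not part of any Gremlin driver, and the sample data is made up.

```javascript
// Hypothetical helpers; not part of gremlin-javascript or TinkerPop.
// Turns [] into null and [x] into x.
function unwrapSingle(list) {
  return list.length > 0 ? list[0] : null;
}

// Assumes each item looks like {vertex_itself, vertex2: Array, vertices3: Array}
function normalizeItems(page) {
  return {
    ...page,
    items: page.items.map(item => ({
      ...item,
      vertex2: unwrapSingle(item.vertex2),
    })),
  };
}

// Made-up sample data in the shape from the question
const page = {
  items: [
    { vertex_itself: { id: 1 }, vertex2: [{ id: 2 }], vertices3: [] },
    { vertex_itself: { id: 3 }, vertex2: [], vertices3: [{ id: 4 }] },
  ],
  count: 2,
};

const normalized = normalizeItems(page);
console.log(JSON.stringify(normalized));
```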
Just for learning purposes, I tried to use a dictionary as a global variable in an accumulator. The add function works well, but when I run the code and update the dictionary inside the map function, it always comes back empty.
But similar code with a list as the global variable works.
class DictParam(AccumulatorParam):
    def zero(self, value = ""):
        return dict()
    def addInPlace(self, acc1, acc2):
        acc1.update(acc2)
if __name__== "__main__":
    sc, sqlContext = init_spark("generate_score_summary", 40)
    rdd = sc.textFile('input')
    #print(rdd.take(5))
    dict1 = sc.accumulator({}, DictParam())
    def file_read(line):
        global dict1
        ls = re.split(',', line)
        dict1+={ls[0]:ls[1]}
        return line
    rdd = rdd.map(lambda x: file_read(x)).cache()
    print(dict1)
For anyone who arrives at this thread looking for a Dict accumulator for pyspark: the accepted solution does not solve the posed problem.
The issue is actually in the DictParam defined, it does not update the original dictionary. This works:
class DictParam(AccumulatorParam):
    def zero(self, value = ""):
        return dict()
    def addInPlace(self, value1, value2):
        value1.update(value2)
        return value1
The original code was missing the return value.
I believe that print(dict1) simply gets executed before the rdd.map() does.
In Spark, there are 2 types of operations:
transformations, that describe the future computation
and actions, that call for action, and actually trigger the execution
Accumulators are updated only when some action is executed:
Accumulators do not change the lazy evaluation model of Spark. If they
are being updated within an operation on an RDD, their value is only
updated once that RDD is computed as part of an action.
If you check out the end of this section of the docs, there is an example exactly like yours:
accum = sc.accumulator(0)
def g(x):
    accum.add(x)
    return f(x)
data.map(g)
# Here, accum is still 0 because no actions have caused the `map` to be computed.
So you would need to add some action, for instance:
rdd = rdd.map(lambda x: file_read(x)).cache() # transformation
foo = rdd.count() # action
print(dict1)
Please make sure to check on the details of various RDD functions and accumulator peculiarities because this might affect the correctness of your result. (For instance, rdd.take(n) will by default only scan one partition, not the entire dataset.)
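For accumulator updates performed inside actions only, Spark guarantees that each task's update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware that each task's update may be applied more than once if tasks or job stages are re-executed.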
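The same effect can be reproduced without Spark at all. Here is a rough analogy in plain JavaScript, where a generator plays the role of a lazy transformation and consuming it plays the role of an action. lazyMap and the accum counter are made up for this illustration and are not Spark APIs; this only mimics the ordering, not how Spark distributes work.

```javascript
// Analogy only: a generator is lazy in roughly the way a Spark
// transformation is. Nothing runs until something consumes it.
function* lazyMap(items, fn) {
  for (const item of items) {
    yield fn(item);
  }
}

let accum = 0;

// "Transformation": the callback has not run yet, so accum is untouched
const mapped = lazyMap([1, 2, 3], x => {
  accum += x;
  return x * 2;
});
const before = accum; // still 0

// "Action": consuming the generator finally runs the callback
const doubled = Array.from(mapped);
const after = accum; // 1 + 2 + 3

console.log(before, after, doubled);
```

Reading accum right after lazyMap is the equivalent of calling print(dict1) right after rdd.map(...): the side effects haven't happened yet.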
I'm trying to get the number of records with QML LocalStorage, which uses SQLite. Consider this snippet:
function f() {
  var db = LocalStorage.openDatabaseSync(...)
  db.transaction (
    function(tx) {
      var b = tx.executeSql("SELECT * FROM t")
      console.log(b.rows.length)
      var c = tx.executeSql("SELECT COUNT(*) FROM t")
      console.log(JSON.stringify(c))
    }
  )
}
The output is:
qml: 3
qml: {"rowsAffected":0,"insertId":"","rows":{}}
What am I doing wrong that the SELECT COUNT(*) doesn't output anything?
EDIT: rows only seems empty in the second command. Calling
console.log(JSON.stringify(c.rows.item(0)))
gives
qml: {"COUNT(*)":3}
Two questions now:
Why is rows shown as empty?
How can I access the property inside c.rows.item(0)?
In order to visit the items, you have to use:
b.rows.item(i)
Where i is the index of the item you want to get (in your first example, i ranges over [0, 1, 2] since you have 3 items; in the second one it is 0 and you can query it as c.rows.item(0)).
The rows field appears empty, and that is a valid result: the items are not stored as plain properties of the rows object itself (you have to call a method to get them, and as far as I know that method could just as well be a memento that completely encloses the response data), and the item method is probably defined as non-enumerable, so JSON.stringify skips it. (I cannot verify this, I'm on the beach and it's quite difficult to explore the Qt code now :-)). You can safely rely on the length property to know whether any rows were returned, and then iterate over them to print them out. I did something like that in a project of mine and it works fine.
The properties inside item(0) have the same names as the columns in the query. I suggest rewriting that query as:
select count(*) as cnt from t
Then, you can get the count as:
c.rows.item(0).cnt
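Here's a small sketch of both points that runs outside QML. makeMockRows builds a hypothetical stand-in for the rows object, with length and item defined as non-enumerable; that is only my guess at why it prints as {} and Qt's real implementation may differ. rowsToArray is a made-up helper that shows the length plus item(i) iteration.

```javascript
// Hypothetical stand-in for the `rows` object; Qt's real implementation
// may differ. `length` and `item` are defined as non-enumerable here.
function makeMockRows(data) {
  const rows = {};
  Object.defineProperty(rows, 'length', { value: data.length, enumerable: false });
  Object.defineProperty(rows, 'item', { value: i => data[i], enumerable: false });
  return rows;
}

// Collects every row by walking 0..length-1 and calling item(i)
function rowsToArray(rows) {
  const out = [];
  for (let i = 0; i < rows.length; i++) {
    out.push(rows.item(i));
  }
  return out;
}

// Mimics the result of: select count(*) as cnt from t
const mockRows = makeMockRows([{ cnt: 3 }]);
console.log(JSON.stringify(mockRows));   // {}
console.log(rowsToArray(mockRows).length);
console.log(mockRows.item(0).cnt);       // 3
```

Inside a real transaction callback, rowsToArray(c.rows) should work the same way, since it only relies on length and item(i).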
Help me understand the output of this program:
int n;
void rec() {
    n = n + 1;
    if (n < 3) {
        rec();
        System.out.println(n); // (*)
    }
}
Output is "3 3". Why is line (*) even executed?
Assuming n is initialized to 0 at the beginning:
The first time the function is called n gets incremented to 1. 1 < 3 therefore rec() is called a second time.
The second time through n gets incremented to 2. 2 < 3 therefore rec() is called a third time.
Now the third time through n gets incremented to 3. 3 is not less than 3, therefore the body of the if statement doesn't execute. So now you exit the current call of the function (third time) and return to the previous call, which is the second call.
Now that the call to rec() has finished in your second call, System.out.println is called and the value of n (3) is displayed. Now the second call finishes, so you exit the current call of the function (second time) and return to the previous call, which is the first call.
Now you're in the first call of the function and since the call to rec() has finished, you call System.out.println again, which again displays the value 3.
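If it helps to watch this happen, here is a JavaScript port of the snippet with some tracing added. The depth parameter and the printed array are additions for this illustration only; the logic around n is unchanged.

```javascript
// JavaScript port of the Java snippet. `depth` and `printed` are
// hypothetical additions used only for tracing.
let n = 0;
const printed = [];

function rec(depth = 1) {
  n = n + 1;
  console.log(`call ${depth}: n is now ${n}`);
  if (n < 3) {
    rec(depth + 1);
    // Runs only after the nested call has returned. n is shared by all
    // calls, so by now it is already 3.
    console.log(`call ${depth}: prints ${n}`);
    printed.push(n);
  }
}

rec();
console.log(printed); // [ 3, 3 ]
```

Calls 2 and 1 each resume after their nested rec() returns, which is why the print line runs twice, and both times it reads the same shared n.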