Fibonacci recursion: return address?

Usually, when a function calls itself recursively, the return address saved on the stack points to the instruction after the call. But in the Fibonacci code below, where does the return address point: to the next line, or to the rest of the expression after the '+' operator?
int Fibonacci(int x) {
    if (x == 0) return 0; // Stopping conditions
    if (x == 1) return 1;
    return Fibonacci(x - 1)/*cond1*/ + Fibonacci(x - 2);/*cond2*/
}
As far as I understand recursion, cond1 is executed until a 0 or 1 is returned (i.e. depth first along the leftmost branch of the recursion tree), and only then is cond2 executed, and so on. Is that right? What return address is saved on the stack (i.e. EBP) while control is executing Fibonacci(x-1)?
                 Fibonacci(3)
                 /          \
        Fibonacci(2)    +    Fibonacci(1)
        /          \              |
Fibonacci(1)  +  Fibonacci(0)     1
     |                |
     1                0

Yes, your understanding of recursion is right. It will work just like a DFS.
About the return address: recall that one line of code is not one instruction. In fact, a line of code can result in lots of instructions.
In this case, the compiler will generate code along these lines:
a = call Fibonacci(x-1)
b = call Fibonacci(x-2)
c = add a, b
return c
So the return address for Fibonacci(x-1) is the address of the next instruction: in this case, the instruction that calls Fibonacci(x-2). Note, however, that many languages, C and C++ among them, don't guarantee the order in which operands are evaluated. All you know is that both operands of + are fully evaluated before the addition is performed. In effect, you could have this instead:
a = call Fibonacci(x-2)
b = call Fibonacci(x-1)
c = add a, b
return c
The point is, the return address of whichever recursive call runs first is the address of the instruction that makes the second call, and the return address of the second call is the address of the ADD instruction.
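To watch the depth-first order for yourself, here is a minimal sketch in C. It assumes x >= 0, and the names fib_traced, trace and trace_len are made up for illustration. The two recursive calls are written as separate statements, which pins down the order that the single expression leaves open, and each call records its argument on entry.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 64

static int trace[TRACE_MAX];   /* argument of each call, in call order */
static size_t trace_len = 0;

/* Same recursion as Fibonacci(), but the two calls are separate statements,
 * so the left call always finishes before the right one starts. */
int fib_traced(int x)
{
    assert(x >= 0);                    /* negative input would never terminate */
    if (trace_len < TRACE_MAX)
        trace[trace_len++] = x;        /* record entry into this call */
    if (x == 0) return 0;
    if (x == 1) return 1;
    int a = fib_traced(x - 1);         /* returns to the statement below */
    int b = fib_traced(x - 2);         /* returns to the addition below */
    return a + b;
}
```

Calling fib_traced(3) leaves 3, 2, 1, 0, 1 in trace: the left branch is followed all the way down before any right-hand call is made.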

Related

bjam - cannot assign a literal to a variable?

Well, this must be the most stupid and idiotic behavior I've seen from a programming language.
https://www.bfgroup.xyz/b2/manual/release/index.html says:
Syntactically, a Boost.Jam program consists of two kinds of
elements—keywords (which have a special meaning to Boost.Jam) and
literals. Consider this code:
a = b ;
which assigns the value b to the variable a. Here, = and ; are
keywords, while a and b are literals.
⚠ All syntax elements, even
keywords, must be separated by spaces. For example, omitting the space
character before ; will lead to a syntax error.
If you want to use a literal value that is the same as some keyword,
the value can be quoted:
a = "=" ;
OK, so far so good. So I have this in my Jamroot:
import path : basename ;
actions make_mytest_install
{
echo "make_mytest_install: MY_ROOT_PATH $(MY_ROOT_PATH) PWD $(PWD:E=not_set)" ;
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
ename = basename ( $(epath) ) ;
echo "epath $(epath) ename $(ename)" ;
}
explicit install-gettext ;
make install-mytest : : @make_mytest_install ;
... and I try this:
bjam install-mytest
...updating 1 target...
Jamfile</home/USER/src/myproject>.make_mytest_install bin/install-mytest
make_mytest_install: MY_ROOT_PATH /home/USER/src/myproject PWD not_set
[ SHELL pstree -s -p 2720269 && echo PID 2720269 PWD /home/USER/src/myproject ]
/bin/sh: 13: epath: not found
/bin/sh: 14: Syntax error: "(" unexpected
.....
...failed Jamfile</home/USER/src/myproject>.make_mytest_install bin/install-mytest...
...failed updating 1 target...
Now - how come that the SIMPLEST assignment to a string, EXACTLY AS in the manual:
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
... fails, and this variable cannot be found anymore?
What is the logic in this? How the hell is this supposed to work? I would get it if MY_ROOT_PATH was undefined - but the echo before it, shows that it is not? What is this lunacy?
So I cannot believe I'm asking something this trivial, but:
How do you assign a string to a variable in bjam language?
Well, the error gives somewhat of a hint: /bin/sh: -> so apparently inside actions, it is sh that runs - then again, if it was really sh I could have assigned variables, but I can't. So best I could do, was to remove the assignments OUT of actions:
import path : basename ;
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
# ename = basename ( $(epath) ) ; # nope, causes target install-mytest to not be found :(
# calling a shell for basename works - but adds a damn NEWLINE at end!?!?!?!
ename = [ SHELL "basename $(epath)" ] ;
actions make_mytest_install
{
echo "make_mytest_install: MY_ROOT_PATH $(MY_ROOT_PATH) PWD $(PWD:E=not_set)" ;
echo "epath $(epath) ename $(ename)" ;
}
explicit install-mytest ;
make install-mytest : : @make_mytest_install ;
So, assignment kind of passes, but you still can't get the basename ?!
I still don't understand, who thought this kind of variable management is a good idea ... I don't even understand, how people managed to build stuff with this system

How does null move up the call stack in recursive functions, if a return value of null is not specified?

Hi I'm currently learning about recursive Inorder Binary Tree Traversal using C#. There's one main aspect I cannot understand, in particular with this code below.
public void InOrder(BinaryTreeNode node)
{
    if (node != null)
    {
        InOrder(node.Left);
        Console.WriteLine(node.Value);
        InOrder(node.Right);
    }
}
If I had a Binary tree that looked like this...
        9
      /   \
     4     20
    / \   /  \
   1   6 15  170
I know that eventually by recursively calling Inorder(node.left) I will get to the left leaf of the binary tree i.e. the very end of the tree, where node.left will equal null as there are no more nodes.
The tree would look like this...
        9
      /   \
     4     20
    / \   /  \
   1   6 15  170
  /
null
Because node.Left = null, the first recursive call InOrder(node.Left) will terminate, and Console.WriteLine(node.Value) will execute, printing a value of 1.
Eventually these null values move up the call stack after each node is analysed, and all nodes are printed, the tree starts to look like this, as null value moves up the tree..
          9
        /   \
       4     20
      / \   /  \
  null   6 15  170
  /  \  / \
null null null
Eventually all the nodes in the tree are equal to null, and all nodes are printed in order to an output of ...
1, 4, 6, 9, 15, 20, 170
What I don't understand is how this null value is moving up the tree, and changing all the nodes that have been analysed to null when there is no return value. Normally there would be a base case like...
if (node == null)
{
return null;
}
For this, I understand that null is being returned, so it will persist/return up the call stack. But in the first block of code above, there is no return statement.
I also find it just as confusing when there is only a return statement without a return value like...
if (node == null)
{
return;
}
Again there is no return of null specified, so how does this null value move up the tree as each node is evaluated?
There isn't a problem with any of this code, it works as expected, and prints all the nodes of the Binary Tree InOrder. This is more about understanding Recursion, and why the first block of code still works even though a return null value is not specified.
Thanks in Advance for the help.
"there is no return of null specified, so how does this null value move up the tree as each node is evaluated?"
Nothing moves up the tree: no null is returned and no node is changed. A void method still returns when it is done executing, even though it has no value to return; control is simply passed back to the caller.
if (node != null) // skipped entirely when the node is null
{
    InOrder(node.Left);
    Console.WriteLine(node.Value);
    InOrder(node.Right);
}
For the tree you gave, this is what happens at the node with value=1:
It's not null, so we go into the if block.
We evaluate InOrder(node.Left) which is just InOrder(null):
    It's null, so the if block is skipped.
    We return to the caller, InOrder(node with value=1)
Console.WriteLine(node.Value) prints 1.
etc...
Although you can't 'see' the base case in the code, it's still there :) just implicitly.
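The same traversal can be sketched in C to make the implicit base case visible. This is only an illustration under assumptions: the TreeNode type and in_order function are hypothetical names, and values are appended to a caller-supplied array instead of being printed. The function returns void, exactly like the C# method, and never writes to any node.

```c
#include <assert.h>
#include <stddef.h>

typedef struct TreeNode {
    int value;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

/* Appends the values below `node` to out[] in in-order sequence.
 * `*count` is the number of values written so far. */
void in_order(const TreeNode *node, int *out, size_t *count)
{
    assert(out != NULL && count != NULL);
    if (node != NULL) {
        in_order(node->left, out, count);   /* may well be a call on NULL */
        out[(*count)++] = node->value;      /* stands in for the print step */
        in_order(node->right, out, count);
    }
    /* node == NULL: nothing to do, so the call simply ends here and
     * control goes back to the caller. The tree itself is never modified. */
}
```

Run on the tree from the question, this fills the array with 1, 4, 6, 9, 15, 20, 170, and every node still holds its original links afterwards.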

Frama-c: Function calls and static variables

I'm currently discovering frama-c's features, especially the WP and Value analysis tools. My final goal is to be able to use frama-c on larger code bases that involve several layers with:
a lot of function calls
use of complex data structures
static and/or global variables
So far I've been trying to apply a bottom-up method, i.e. starting by specifying functions that do not include any function calls and analyzing their behavior by isolating them with the -lib-entry and -main kernel options. By doing that I make sure that if the preconditions are assumed to be true, then the whole function contract is verified. As soon as I tried to specify upper layers that invoke those functions, things got complicated. First, I often have to specify the behavior of the called functions, which is not always easy because those functions may deal with variables/functions out of the scope of the current function.
Let me give you an easy example:
Let's say that in file1.h we define a data structure "my_struct" that contains a field number and a field parity.
In file1.c I have two functions:
A first function "check_parity" that just tests if the parity field of the static variable _sVar is correct.
A second function "correct_parity" that calls the first function, and corrects the parity if the field was not correct.
In file2.c, I have a function "outside_caller" that just calls correct_parity(). My objective is to be able to specify outside_caller the same way I'm specifying correct_parity. Below is the corresponding source code:
file1.h
/* parity = 0 => even ; 1 => odd */
typedef unsigned char TYP_U08;
typedef unsigned short TYP_U16;
typedef unsigned int TYP_U32;
typedef unsigned long TYP_U64;
typedef struct {
unsigned char parity;
unsigned int number;
} my_stuct;
typedef enum
{
S_ERROR = -1
,S_OK = 0
,S_WARNING = 1
} TYPE_STATUS;
/*@ ghost my_stuct* g_sVar; */
/*@ predicate fc_pre_is_parity_ok{Labl}(my_stuct* i_sVar) =
(
\at(i_sVar->parity, Labl) == ((TYP_U08) (\at(i_sVar->number,Labl) % 2u))
);
@ predicate fc_pre_valid_parity{Labl}(my_stuct* i_sVar) =
(
(\at(i_sVar->parity,Labl) == 0) ||
(\at(i_sVar->parity, Labl) == 1)
);
@ predicate fc_pre_is_parity_readable(my_stuct* i_sVar) =
(
\valid_read(&i_sVar->parity)
);
@ predicate fc_pre_is_parity_writeable(my_stuct* i_sVar) =
(
\valid(&i_sVar->parity)
);
@ predicate fc_pre_is_number_readable(my_stuct* i_sVar) =
(
\valid_read(&i_sVar->number)
);
@ predicate fc_pre_is_number_writeable(my_stuct* i_sVar) =
(
\valid(&i_sVar->number)
);
*/
TYPE_STATUS check_parity(void);
TYPE_STATUS correct_parity(void);
file1.c
static my_stuct* _sVar;
/*@ requires check_req_parity_readable:
fc_pre_is_parity_readable(_sVar);
@ requires check_req_number_readable:
fc_pre_is_number_readable(_sVar);
@ assigns check_assigns:
g_sVar;
@ ensures check_ensures_error:
!fc_pre_valid_parity{Post}(g_sVar) ==> \result == S_ERROR;
@ ensures check_ensures_ok:
(
fc_pre_valid_parity{Post}(g_sVar) &&
fc_pre_is_parity_ok{Post}(g_sVar)
) ==> \result == S_OK;
@ ensures check_ensures_warning:
(
fc_pre_valid_parity{Post}(g_sVar) &&
!fc_pre_is_parity_ok{Post}(g_sVar)
) ==> \result == S_WARNING;
@ ensures check_ensures_ghost_consistency:
\at(g_sVar, Post) == _sVar;
*/
TYPE_STATUS check_parity(void)
{
//@ ghost g_sVar = _sVar;
TYPE_STATUS status = S_OK;
if(!(_sVar->parity == 0 || _sVar->parity == 1)) {
status = S_ERROR;
} else if ( _sVar->parity == (TYP_U08)(_sVar->number % 2u) ){
status = S_OK;
} else {
status = S_WARNING;
}
return status;
}
/*@ requires correct_req_is_parity_writeable:
fc_pre_is_parity_writeable(_sVar);
@ requires correct_req_is_number_readable:
fc_pre_is_number_readable(_sVar);
@ assigns correct_assigns:
_sVar->parity,
g_sVar,
g_sVar->parity;
@ ensures correct_ensures_error:
!fc_pre_valid_parity{Pre}(g_sVar) ==> \result == S_ERROR;
@ ensures correct_ensures_ok:
(
fc_pre_valid_parity{Pre}(g_sVar) &&
fc_pre_is_parity_ok{Pre}(g_sVar)
) ==> \result == S_OK;
@ ensures correct_ensures_warning:
(
fc_pre_valid_parity{Pre}(g_sVar) &&
!fc_pre_is_parity_ok{Pre}(g_sVar)
) ==> \result == S_WARNING;
@ ensures correct_ensures_consistency:
fc_pre_is_parity_ok{Post}(g_sVar);
@ ensures correct_ensures_validity :
fc_pre_valid_parity{Post}(g_sVar);
@ ensures correct_ensures_ghost_consistency:
\at(g_sVar, Post) == _sVar;
*/
TYPE_STATUS correct_parity(void)
{
//@ ghost g_sVar = _sVar;
TYPE_STATUS parity_status = check_parity();
if(parity_status == S_ERROR || parity_status == S_WARNING) {
_sVar->parity = (TYP_U08)(_sVar->number % 2u);
/*@ assert (\at(g_sVar->parity,Here) == 0) ||
(\at(g_sVar->parity, Here) == 1);
*/
//@ assert \at(g_sVar->parity, Here) == (TYP_U08)(\at(g_sVar->number,Here) % 2u);
}
return parity_status;
}
file2.c
/*@ requires out_req_parity_writable:
fc_pre_is_parity_writeable(g_sVar);
@ requires out_req_number_writeable:
fc_pre_is_number_readable(g_sVar);
@ assigns out_assigns:
g_sVar,
g_sVar->parity;
@ ensures out_ensures_error:
!fc_pre_valid_parity{Pre}(g_sVar) ==> \result == S_ERROR;
@ ensures out_ensures_ok:
(
fc_pre_valid_parity{Pre}(g_sVar) &&
fc_pre_is_parity_ok{Pre}(g_sVar)
) ==> \result == S_OK;
@ ensures out_ensures_warning:
(
fc_pre_valid_parity{Pre}(g_sVar) &&
!fc_pre_is_parity_ok{Pre}(g_sVar)
) ==> \result == S_WARNING;
@ ensures out_ensures_consistency:
fc_pre_is_parity_ok{Post}(g_sVar);
@ ensures out_ensures_validity:
fc_pre_valid_parity{Post}(g_sVar);
*/
TYPE_STATUS outside_caller(void)
{
TYPE_STATUS status = correct_parity();
//@ assert fc_pre_is_parity_ok{Here}(g_sVar) ==> status == S_OK;
/*@ assert !fc_pre_is_parity_ok{Here}(g_sVar) &&
fc_pre_valid_parity{Here}(g_sVar) ==> status == S_WARNING; */
//@ assert !fc_pre_valid_parity{Here}(g_sVar) ==> status == S_ERROR;
return status;
}
Here the main issue is that in order to specify outside_caller(), I need to access _sVar, which is out of scope in file2.c. That means dealing with a ghost variable (g_sVar) that is declared in file1.h and updated in the correct_parity function. To make the caller (correct_parity) able to use the callee's contracts, the ghost variable g_sVar must be used inside the contracts of the callees.
Here are the results of WP analysis:
(1) check_parity()
frama-c -wp src/main.c src/test.c -cpp-command 'gcc -C -E -Isrc/' -main 'check_parity' -lib-entry -wp-timeout 1 -wp-fct check_parity -wp-rte -wp-fct check_parity -then -report
[rte] annotating function check_parity
[wp] 14 goals scheduled
[wp] Proved goals: 14 / 14
Qed: 9 (4ms)
Alt-Ergo: 5 (8ms-12ms-20ms) (30)
(2) correct_parity()
frama-c -wp src/main.c src/test.c -cpp-command 'gcc -C -E -Isrc/' -main 'correct_parity' -lib-entry -wp-timeout 1 -wp-fct correct_parity -wp-rte -wp-fct correct_parity -then -report
[rte] annotating function correct_parity
[wp] 18 goals scheduled
[wp] Proved goals: 18 / 18
Qed: 12 (4ms)
Alt-Ergo: 6 (4ms-37ms-120ms) (108)
(3) outside_caller()
frama-c -wp src/main.c src/test.c -cpp-command 'gcc -C -E -Isrc/' -main 'outside_caller' -lib-entry -wp-timeout 1 -wp-fct outside_caller -wp-rte -wp-fct outside_caller -then -report
[rte] annotating function outside_caller
[wp] 14 goals scheduled
[wp] [Alt-Ergo] Goal typed_outside_caller_assign_exit : Unknown (Qed:4ms) (515ms)
[wp] [Alt-Ergo] Goal typed_outside_caller_call_correct_parity_pre_correct_req_is_par___ : Unknown (636ms)
[wp] [Alt-Ergo] Goal typed_outside_caller_assert : Timeout
[wp] [Alt-Ergo] Goal typed_outside_caller_assign_normal_part1 : Timeout
[wp] [Alt-Ergo] Goal typed_outside_caller_call_correct_parity_pre_correct_req_is_num___ : Unknown (205ms)
[wp] Proved goals: 9 / 14
Qed: 9 (4ms)
Alt-Ergo: 0 (interrupted: 2) (unknown: 3)
In this configuration, the callees are specified with the g_sVar ghost variable, except for the requires and assigns clauses, for 2 reasons:
I need to check _sVar R/W accesses with \valid & \valid_read since it's a pointer
When I tried to specify the assigns clauses of the callees with g_sVar, I was not able to verify the corresponding clause.
But by doing so, I somehow made the specification of the caller invalid, as you can see in WP's output.
Why does it seem that the more function calls I have, the more complicated it becomes to prove the behavior of the functions? Is there a proper way to deal with multiple function calls and static variables?
Thank you a lot in advance!
PS: I'm working with the Magnesium-20151002 version, on a VM running Ubuntu 14.04 on a 64-bit machine. I know that getting started with WhyML and Why3 could help me a lot, but so far I haven't been able to install the Why3 IDE on either Windows or Ubuntu by following each step of this tutorial.
First of all, please note that -main and -lib-entry aren't that useful for WP (you mentioned that you are also interested in EVA/Value Analysis, but your question is directed towards WP).
Your issue with static variables is a known one, and the easiest way to deal with it is indeed to declare a ghost variable in the header. But then you must express your contracts in terms of the ghost variable and not the static one.
Otherwise, the callers will not be able to make use of these contracts, since they do not know anything about _sVar. As a rule of thumb, it is better to put the contract in the header: this way, you're bound to only use identifiers that are visible outside of the translation unit.
Regarding function calls, the main point is that any function called by the function you're trying to prove with WP must come with a contract that contains at least an assigns clause (and possibly more precise specifications, depending on how relevant the effects of the callee are to the property you want to prove on the caller). The important thing to remember here is that, from WP's point of view, after the call, only what is explicitly stated in the callee's contract through ensures is true, plus the fact that any location not in the assigns clause has been left unchanged.

Backtracking in Standard ML

I have seen in my SML manual the following function, which computes how many coins of a particular kind are needed for a particular change.
For example change [5,2] 16 =[5,5,2,2,2] because with two 5-coins and three 2-coins one gets 16.
the following code is a backtracking approach:
exception Change;
fun change _ 0 = nil
  | change nil _ = raise Change
  | change (coin::coins) amt =
      if coin > amt then change coins amt
      else (coin :: change (coin::coins) (amt-coin))
           handle Change => change coins amt;
It works, but I don't understand how exactly.
I know what backtracking is, I just don't understand this particular function.
What I understood so far: If amt is 0, it means our change is computed, and there is nothing to be cons'd onto the final list.
If there are no more coins in our 'coin-list', we need to go back one step.
This is where I get lost: how exactly does raising an exception help us go back?
As I see it, the handler tries to make a call to the change function, but shouldn't the "coins" parameter be nil, and therefore enter an infinite loop? Why does it "go back"?
The last clause is pretty obvious to me: if the coin-value is greater than the amount left to change, we use the remaining coins to build the change. If it is smaller than the amount left, we cons it onto the result list.
This is best seen by writing out how evaluation proceeds for a simple example. In each step, I just replace a call to change by the respective right-hand side (I added extra parentheses for extra clarity):
change [3, 2] 4
= if 3 > 4 then ... else ((3 :: change [3, 2] (4 - 3)) handle Change => change [2] 4)
= (3 :: change [3, 2] 1) handle Change => change [2] 4
= (3 :: (if 3 > 1 then change [2] 1 else ...)) handle Change => change [2] 4
= (3 :: change [2] 1) handle Change => change [2] 4
= (3 :: (if 2 > 1 then change [] 1 else ...)) handle Change => change [2] 4
= (3 :: (raise Change)) handle Change => change [2] 4
At this point an exception has been raised. It bubbles up to the current handler so that evaluation proceeds as follows:
= change [2] 4
= if 2 > 4 then ... else ((2 :: change [2] (4 - 2)) handle Change => change [] 4)
= (2 :: change [2] 2) handle Change => change [] 4
= (2 :: (if 2 > 2 then ... else ((2 :: change [2] (2 - 2)) handle Change => change [] 2))) handle Change => change [] 4
= (2 :: ((2 :: change [2] 0) handle Change => change [] 2)) handle Change => change [] 4
= (2 :: ((2 :: []) handle Change => change [] 2)) handle Change => change [] 4
= (2 :: (2 :: [])) handle Change => change [] 4
= 2 :: 2 :: []
No more failures up to here, so we terminate successfully.
In short, every handler is a backtracking point. At each failure (i.e., raise) you proceed at the innermost handler, which is the last backtracking point. Each handler itself is set up such that it contains the respective call to try instead.
You can rewrite this use of exceptions into using the 'a option type instead. The original function:
exception Change;
fun change _ 0 = []
  | change [] _ = raise Change
  | change (coin::coins) amt =
      if coin > amt
      then change coins amt
      else coin :: change (coin::coins) (amt-coin)
           handle Change => change coins amt;
In the modified function below, instead of the exception bubbling up, it becomes a NONE. One thing that becomes slightly more apparent here is that coin only occurs in one of the two cases (where in the code above it always occurs but is reverted in case of backtracking).
fun change' _ 0 = SOME []
  | change' [] _ = NONE
  | change' (coin::coins) amt =
      if coin > amt
      then change' coins amt
      else case change' (coin::coins) (amt-coin) of
             SOME result => SOME (coin :: result)
           | NONE => change' coins amt
Another way to demonstrate what happens is by drawing a call tree. This does not gather the result the way Andreas Rossberg's evaluation by hand does, but it does show that backtracking is only possible where change takes an else-branch, and that when a backtrack occurs (i.e. NONE is returned or an exception is thrown), coin is not included in the result.
change [2,5] 7                          (original call)
`- (else) change [2,5] 5
   |- (else) change [2,5] 3
   |  |- (else) change [2,5] 1
   |  |  `- (then) change [5] 1
   |  |     `- (then) change [] 1
   |  |        `- (base) NONE
   |  `- (backtrack) change [5] 3
   |     `- (then) change [] 3
   |        `- NONE
   `- (backtrack) change [5] 5
      `- (else) change [5] 0
         `- SOME []
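For readers more at home in C, the same backtracking can be sketched with a plain failure return value in place of NONE or the exception. This is an illustration only, not a translation of anyone's posted code: the signature, the `used` parameter and the -1 failure convention are assumptions. Coins are assumed positive, and out[] needs room for at least `amt` entries.

```c
#include <assert.h>
#include <stddef.h>

/* Tries to pay `amt` with coins[0..ncoins-1], preferring earlier coins.
 * Returns the number of coins written to out[], or -1 if it cannot be done.
 * Call with used == 0. */
int change(const int *coins, size_t ncoins, int amt, int *out, size_t used)
{
    if (amt == 0) return (int)used;        /* change _ 0: success */
    if (ncoins == 0) return -1;            /* change [] _: failure */
    assert(coins[0] > 0);
    if (coins[0] <= amt) {
        out[used] = coins[0];              /* tentatively take the coin */
        int n = change(coins, ncoins, amt - coins[0], out, used + 1);
        if (n >= 0) return n;              /* it worked, keep the coin */
        /* failure below: fall through, which plays the handler's role */
    }
    /* Drop the first coin kind; out[used] is overwritten on the next take. */
    return change(coins + 1, ncoins - 1, amt, out, used);
}
```

With coins {5, 2} and amount 16 this returns 5 and leaves 5, 5, 2, 2, 2 in out, after first trying and abandoning a third 5. A failed branch needs no explicit undo, because the slot it wrote is simply reused.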
Source: https://www.cs.cmu.edu/~rwh/introsml/core/exceptions.htm
The expression exp handle match is an exception handler. It is
evaluated by attempting to evaluate exp. If it returns a value, then
that is the value of the entire expression; the handler plays no role
in this case. If, however, exp raises an exception exn, then the
exception value is matched against the clauses of the match (exactly
as in the application of a clausal function to an argument) to
determine how to proceed. If the pattern of a clause matches the
exception exn, then evaluation resumes with the expression part of
that clause. If no pattern matches, the exception exn is re-raised so
that outer exception handlers may dispatch on it. If no handler
handles the exception, then the uncaught exception is signaled as the
final result of evaluation. That is, computation is aborted with the
uncaught exception exn.
In more operational terms, evaluation of exp handle match proceeds by
installing an exception handler determined by match, then evaluating
exp. The previous binding of the exception handler is preserved so
that it may be restored once the given handler is no longer needed.
Raising an exception consists of passing a value of type exn to the
current exception handler. Passing an exception to a handler
de-installs that handler, and re-installs the previously active
handler. This ensures that if the handler itself raises an exception,
or fails to handle the given exception, then the exception is
propagated to the handler active prior to evaluation of the handle
expression. If the expression does not raise an exception, the
previous handler is restored as part of completing the evaluation of
the handle expression.

trie reg exp parse step over char and continue

Setup: 1) a string trie database formed from linked nodes and a vector array linking to the next node terminating in a leaf, 2) a recursive regular expression function that if A) char '*' continues down all paths until string length limit is reached, then continues down remaining string paths if valid, and B) char '?' continues down all paths for 1 char and then continues down remaining string paths if valid. 3) after reg expression the candidate strings are measured for edit distance against the 'try' string.
Problem: the reg expression works fine for adding chars or swapping ? for a char, but if the remaining string has an error then there is no valid path to a terminating leaf, making the matching function redundant. I tried adding a 'step-over' ? char if the end of the node vector was reached and then following every path of that node, allowing this step-over only once. This resulted in a memory exception; I cannot work out why it is accessing the vector out of range. Backtracking?
Questions: 1) how can the regular expression step over an invalid char and continue with the path? 2) why is swapping the 'sticking' char for '?' resulting in an overflow?
Function:
void Ontology::matchRegExpHelper(nodeT *w, string inWild, Set<string> &matchSet, string out, int level, int pos, int stepover)
{
    if (inWild=="") {
        matchSet.add(out);
    } else {
        if (w->alpha.size() == pos) {
            int testLength = out.length() + inWild.length();
            if (stepover == 0 && matchSet.size() == 0 && out.length() > 8 && testLength == tokenLength) {//candidate generator
                inWild[0] = '?';
                matchRegExpHelper(w, inWild, matchSet, out, level, 0, stepover+1);
            } else
                return; //giveup on this path
        }
        if (inWild[0] == '?' || (inWild[0] == '*' && (out.length() + inWild.length() ) == level ) ) { //wild
            matchRegExpHelper(w->alpha[pos].next, inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover);//follow path -> if ontology is full, treat '*' like a '?'
        } else if (inWild[0] == '*')
            matchRegExpHelper(w->alpha[pos].next, '*'+inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover); //keep adding chars
        if (inWild[0] == w->alpha[pos].letter) //follow self
            matchRegExpHelper(w->alpha[pos].next, inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover); //follow char
        matchRegExpHelper(w, inWild, matchSet, out, level, pos+1, stepover);//check next path
    }
}
Error Message:
+str "Attempt to access index 1 in a vector of size 1." std::basic_string<char,std::char_traits<char>,std::allocator<char> >
+err {msg="Attempt to access index 1 in a vector of size 1." } ErrorException
Note: this function works fine for hundreds of test strings with '*' wilds if the extra stepover gate is not used
Semi-Solved: I placed a pos < w->alpha.size() condition on each path that calls w->alpha[pos]... This prevented the backtracking calls from accessing the vector with an out-of-bounds index. There are still other issues to work out: it loops infinitely, adding the ? and backtracking to remove it, then repeating. But it is moving forward now.
Revised question: why, during backtracking, is the position index accumulating and/or not decrementing, so that at some point it calls w->alpha[pos]... with an invalid position that is either left over from the next node or somehow incremented to pos+1 when passing upward?
SOLVED: I combined the regular-expression wildcard handling into the matching function as loops.
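One hedged reading of the posted function: when w->alpha.size() == pos and the step-over branch is taken, the recursive call is made but nothing returns afterwards, so execution appears to carry on to the w->alpha[pos] accesses with pos equal to the vector size. That would fit the reported "index 1 in a vector of size 1". The sketch below, in C rather than the original C++, shows a loop-based shape of the kind the SOLVED note describes. Everything in it is an assumption for illustration (TrieNode, trie_new, trie_insert, trie_match, a fixed 26-slot child array for lowercase ASCII words instead of the alpha vector), and it only reports whether any stored word matches, with no step-over or edit-distance stage.

```c
#include <assert.h>
#include <stdlib.h>

#define ALPHABET 26

typedef struct TrieNode {
    struct TrieNode *child[ALPHABET];   /* NULL where no edge exists */
    int is_word;                        /* 1 if a stored word ends here */
} TrieNode;

TrieNode *trie_new(void)
{
    TrieNode *n = malloc(sizeof *n);
    if (n != NULL) {
        for (int i = 0; i < ALPHABET; ++i)
            n->child[i] = NULL;
        n->is_word = 0;
    }
    return n;
}

/* Inserts a lowercase word; returns 0 if an allocation fails, 1 otherwise. */
int trie_insert(TrieNode *root, const char *word)
{
    assert(root != NULL && word != NULL);
    for (; *word != '\0'; ++word) {
        assert(*word >= 'a' && *word <= 'z');
        TrieNode **slot = &root->child[*word - 'a'];
        if (*slot == NULL && (*slot = trie_new()) == NULL)
            return 0;
        root = *slot;
    }
    root->is_word = 1;
    return 1;
}

/* Returns 1 if some stored word matches `pat`, where '?' stands for exactly
 * one letter and '*' for any run of letters (possibly empty). Children are
 * visited by a bounded loop, so no index can run past the child array. */
int trie_match(const TrieNode *n, const char *pat)
{
    if (n == NULL) return 0;
    if (*pat == '\0') return n->is_word;
    if (*pat == '*') {
        if (trie_match(n, pat + 1)) return 1;            /* '*' stops here */
        for (int i = 0; i < ALPHABET; ++i)               /* '*' takes one more letter */
            if (trie_match(n->child[i], pat)) return 1;
        return 0;
    }
    if (*pat == '?') {
        for (int i = 0; i < ALPHABET; ++i)               /* any single letter */
            if (trie_match(n->child[i], pat + 1)) return 1;
        return 0;
    }
    if (*pat < 'a' || *pat > 'z') return 0;              /* unsupported character */
    return trie_match(n->child[*pat - 'a'], pat + 1);    /* literal letter */
}
```

Because the position within a node lives only in a local loop variable, there is no pos argument to carry between recursive calls, which is where the original index trouble seemed to arise.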
