I am struggling to understand recursion in the context of decision trees. I have looked at two different implementations from two different websites: Example 1 and Example 2. This is part of the code from the first example:
class DecisionTreeClassifier():
    def __init__(self, min_samples_split=2, max_depth=2):
        ''' constructor '''
        # initialize the root of the tree
        self.root = None
        # stopping conditions
        self.min_samples_split = min_samples_split
        self.max_depth = max_depth

    def build_tree(self, dataset, curr_depth=0):
        ''' recursive function to build the tree '''
        X, Y = dataset[:,:-1], dataset[:,-1]
        num_samples, num_features = np.shape(X)
        # split until stopping conditions are met
        if num_samples>=self.min_samples_split and curr_depth<=self.max_depth:
            # find the best split
            best_split = self.get_best_split(dataset, num_samples, num_features)
            # check if information gain is positive
            if best_split["info_gain"]>0:
                # recur left
                left_subtree = self.build_tree(best_split["dataset_left"], curr_depth+1)
                # recur right
                right_subtree = self.build_tree(best_split["dataset_right"], curr_depth+1)
                # return decision node
                return Node(best_split["feature_index"], best_split["threshold"],
                            left_subtree, right_subtree, best_split["info_gain"])
        # compute leaf node
        leaf_value = self.calculate_leaf_value(Y)
        # return leaf node
        return Node(value=leaf_value)
I can't wrap my head around how the information about the split decisions and thresholds is "saved" while performing the recursion. Both implementations return only one node after the recursion is done, but how can this node store the whole tree when it is only one node? Shouldn't a tree be able to contain many nodes?
Since the recursion is performed before the node is created, how is the information about the different split decisions and thresholds "stored" in the meantime? These might be stupid questions, but I am really struggling with this concept and was hoping someone had a good explanation that could help me visualize what happens during the recursion.
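One way to picture it: each returned node holds references to the nodes returned by the two recursive calls, so the single node that comes back from the outermost call is the root, and the rest of the tree is reachable through it. Below is a minimal sketch of that idea. The Node class, the build function, and count_nodes are simplified, hypothetical stand-ins (not the code from either website), and the split values are dummies.

```python
class Node:
    """Hypothetical, simplified node: decision nodes have children, leaves have a value."""
    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index = feature_index
        self.threshold = threshold
        self.left = left      # reference to the left subtree (another Node)
        self.right = right    # reference to the right subtree (another Node)
        self.value = value    # only set for leaf nodes


def build(depth, max_depth=2):
    """Toy recursion with the same shape as build_tree: the children are built
    first, and the local variables keep them alive until the parent is created."""
    if depth < max_depth:
        left = build(depth + 1, max_depth)    # whole left subtree, already finished
        right = build(depth + 1, max_depth)   # whole right subtree, already finished
        # dummy split: feature_index and threshold are placeholders
        return Node(feature_index=depth, threshold=0.5, left=left, right=right)
    return Node(value=depth)                  # leaf


def count_nodes(node):
    """Walk the tree starting from a single node and count everything reachable."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)


root = build(0)
print(count_nodes(root))  # the one returned node gives access to all nodes
```

While the recursion is running, the finished subtrees live in the local variables of the calls that are still waiting on the call stack; once a call creates its Node, the subtrees are stored in that node's left and right attributes.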
The task is to find the power set of an array, that is, the set of all its subsets. For example, for [1,2] it is [[],[1],[2],[1,2]]. For an array of length n, the power set has 2^n elements. This is evident when I use recursion with two calls, as below:
from typing import List

class Solution:
    def helperSubsets(self, output_li, nums, klist):
        if len(nums)==0:
            output_li.append(klist)
            return
        self.helperSubsets(output_li,nums[1:],klist+[nums[0]])
        self.helperSubsets(output_li,nums[1:],klist)

    def subsets(self, nums: List[int]) -> List[List[int]]:
        output_li = []
        self.helperSubsets(output_li,nums,[])
        return output_li
The recurrence relation is: T(n) = 2T(n-1) + O(1)
But when I do it with a for loop inside the recursive call, how do I find or prove the time complexity of the recursion, and what is the recurrence relation? I am not able to figure it out. For reference, here is the version with a for loop and recursion:
class Solution:
    def helperSubsets(self, output_li, nums, klist):
        output_li.append(klist)
        for i in range(len(nums)):
            self.helperSubsets(output_li,nums[i+1:],klist+[nums[i]])

    def subsets(self, nums: List[int]) -> List[List[int]]:
        output_li = []
        self.helperSubsets(output_li,nums,[])
        return output_li
In the first example, each element is either picked or not, so every call branches into two recursive calls and the complexity is 2^n. But for the second code, which does the same thing, how do I find the complexity (apart from knowing that it does the same thing as the first, so it will be similar)? Can you also give me the recurrence relation for it?
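As a sketch of one way to approach it: if T(n) counts the calls made on a list of length n, the for-loop version makes one call itself plus, for each i, a call on the suffix of length n-1-i, so T(n) = 1 + T(0) + T(1) + ... + T(n-1) with T(0) = 1. Subtracting the same equation for n-1 gives T(n) - T(n-1) = T(n-1), that is, T(n) = 2T(n-1), so T(n) = 2^n. The snippet below checks this numerically; the function names are made up for illustration, and it counts only calls, ignoring the extra O(n) work each call spends on slicing and list concatenation.

```python
def calls_pick_or_skip(n):
    """Number of calls the first helperSubsets makes on a list of length n."""
    if n == 0:
        return 1                       # base case: one call, no recursion
    return 1 + 2 * calls_pick_or_skip(n - 1)


def calls_for_loop(n):
    """Number of calls the for-loop helperSubsets makes on a list of length n."""
    # iteration i recurses on nums[i+1:], which has length n - 1 - i
    return 1 + sum(calls_for_loop(n - 1 - i) for i in range(n))


for n in range(6):
    print(n, calls_pick_or_skip(n), calls_for_loop(n))
```

Under this counting, the two-call version makes 2^(n+1) - 1 calls (only the 2^n base-case calls append a subset), while the for-loop version makes exactly 2^n calls, one per subset.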
After many hours spent googling, I still have not come across an in depth, intuitive as well as solidly proven treatment of this question. The closest article I have found, linked to on some obscure discussion forum, is this: https://11011110.github.io/blog/2013/12/17/stack-based-graph-traversal.html. I have also seen this Stack Overflow question DFS vs BFS .2 differences, but the responses do not arrive at a clear consensus.
So here is the question:
I have seen it stated (in Wikipedia, as well as Algorithms Illuminated by Tim Roughgarden) that, to transform a BFS implementation into an iterative DFS one, the following two changes are made:
The non-recursive implementation is similar to breadth-first search but differs from it in two ways:
it uses a stack instead of a queue, and
it delays checking whether a vertex has been discovered until the vertex is popped from the stack rather than making this check before adding the vertex.
Can anyone help explain, via intuition or example, the reason for the second distinction here? Specifically: what is the differentiating factor between BFS, iterative DFS, and recursive DFS that necessitates postponing the check until after popping off the stack only for iterative DFS?
Here is a basic implementation of BFS:
from collections import deque

def bfs(adjacency_list, source):
    explored = [False] * len(adjacency_list)
    queue = deque()
    queue.append(source)
    explored[source] = True
    while queue:
        node = queue.popleft()
        print(node)
        for n in adjacency_list[node]:
            if explored[n] == False:
                explored[n] = True
                queue.append(n)
If we simply swap the queue for a stack, we get this implementation of DFS:
def dfs_stack_only(adjacency_list, source):
    explored = [False] * len(adjacency_list)
    stack = deque()
    stack.append(source)
    explored[source] = True
    while stack:
        node = stack.pop()
        print(node)
        for n in adjacency_list[node]:
            if explored[n] == False:
                explored[n] = True
                stack.append(n)
The only difference between these two algorithms here is that we swapped the queue from BFS for a stack in DFS. This implementation of DFS actually produces incorrect traversals (in a non simplistic graph; possibly for a very simple graph it might anyway produce a correct traversal).
I believe that this is the 'error' referenced in the article linked above.
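A small example of the kind of graph where this shows up, as a sketch: the 4-vertex graph below is an illustrative choice, and the two functions are variants written for this example that return the visit order instead of printing it. The recursive version iterates neighbors in reverse so that it explores them in the same order a stack would pop them.

```python
def dfs_stack_only_order(adjacency_list, source):
    """Same logic as dfs_stack_only (mark when pushed), but returns the order."""
    explored = [False] * len(adjacency_list)
    stack = [source]
    explored[source] = True
    order = []
    while stack:
        node = stack.pop()
        order.append(node)
        for n in adjacency_list[node]:
            if not explored[n]:
                explored[n] = True
                stack.append(n)
    return order


def dfs_recursive_order(adjacency_list, source):
    """Plain recursive DFS, used here as the reference traversal."""
    explored = [False] * len(adjacency_list)
    order = []

    def visit(node):
        explored[node] = True
        order.append(node)
        for n in reversed(adjacency_list[node]):  # mimic stack pop order
            if not explored[n]:
                visit(n)

    visit(source)
    return order


# Undirected edges: 0-1, 0-2, 0-3, 1-3
graph = [[1, 2, 3], [0, 3], [0], [0, 1]]
print(dfs_stack_only_order(graph, 0))  # [0, 3, 2, 1]
print(dfs_recursive_order(graph, 0))   # [0, 3, 1, 2]
```

In the first result, vertex 3 still has an unvisited neighbor (1) when the traversal jumps to 2, which is not adjacent to 3. A depth-first search would have to go deeper from 3 to 1 first, so 0, 3, 2, 1 cannot be produced by any DFS of this graph. Vertex 1 was marked when 0 pushed it, and that mark is what stops 3 from reaching it.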
However, this can be fixed in one of two ways.
Either of these two implementations produces a correct traversal:
First, the implementation suggested in the sources above, with the check delayed until after popping the node from the stack. This implementation results in many duplicates on the stack.
def dfs_iterative_correct(adjacency_list, source):
    explored = [False] * len(adjacency_list)
    stack = deque()
    stack.append(source)
    while stack:
        node = stack.pop()
        if explored[node] == False:
            explored[node] = True
            print(node)
            for n in adjacency_list[node]:
                stack.append(n)
Alternatively, this is a popular online implementation (this one taken from Geeks for Geeks) which also produces the correct traversal. There are some duplicates on the stack, but hardly as many as the previous implementation.
def dfs_geeks_for_geeks(adjacency_list, source):
    explored = [False] * len(adjacency_list)
    stack = deque()
    stack.append(source)
    while len(stack):
        node = stack.pop()
        if not explored[node]:
            explored[node] = True
            print(node)
            for n in adjacency_list[node]:
                if not explored[n]:
                    stack.append(n)
So in summary, it seems that the difference is not solely about when you check the visited status of a node, but more about when you actually mark it as visited. Furthermore, why does marking it as visited immediately work just fine for BFS, but not for DFS? Any insight is greatly appreciated!
Thank you!
I don't see a difference in that respect between BFS and DFS.
I see two requirements to "marking nodes as visited":
It should not prevent pushing a node's neighbors into the stack or queue.
It should prevent pushing the node again into the stack or queue.
Those requirements apply to DFS as well as BFS, so the sequence for both can be:
fetch node from stack or queue
mark node as visited
get node's neighbors
put any unvisited neighbor into the stack or queue
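A minimal sketch of that shared sequence, with one function serving both traversals. The name traverse and the depth_first flag are made up for this example. It also skips a node that is already visited when it is fetched; the steps above do not spell that out, so treat it as an added assumption needed because a node can be put into the container more than once before it is fetched.

```python
from collections import deque


def traverse(adjacency_list, source, depth_first):
    """Fetch, mark, expand: the same loop for DFS (stack) and BFS (queue)."""
    visited = [False] * len(adjacency_list)
    frontier = deque([source])
    order = []
    while frontier:
        # fetch node from stack (right end) or queue (left end)
        node = frontier.pop() if depth_first else frontier.popleft()
        if visited[node]:
            continue  # assumption: ignore duplicates left in the container
        visited[node] = True            # mark node as visited
        order.append(node)
        for n in adjacency_list[node]:  # get the node's neighbors
            if not visited[n]:
                frontier.append(n)      # put unvisited neighbors in the container
    return order


# Undirected edges: 0-1, 0-2, 0-3, 1-3 (illustrative graph)
graph = [[1, 2, 3], [0, 3], [0], [0, 1]]
print(traverse(graph, 0, depth_first=True))   # [0, 3, 1, 2]
print(traverse(graph, 0, depth_first=False))  # [0, 1, 2, 3]
```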
I am trying to implement the FP-Growth (frequent pattern mining) algorithm in Java. I have built the tree, but have difficulties with conditional FP-tree construction; I do not understand what the recursive function should do. Given a list of frequent items (in increasing order of frequency counts), i.e. a header, and a tree (a list of Node class instances), what steps should the function take?
I have a hard time understanding this pseudocode above. Are alpha and beta nodes in the tree, and what do the generate and construct functions do? I can do FP-Growth by hand, but find the implementation extremely confusing. If it would help, I can share my code for FP-tree generation. Thanks in advance.
alpha is the prefix that led to this specific prefix tree
beta is the new prefix (of the tree to be constructed)
the generate line means something like: add to result set the pattern beta with support anItem.support
the construct function creates the new patterns from which the new tree is created
an example of the construct function (bottom up way) would be something like:
function construct(Tree, anItem)
    conditional_pattern_base = empty list
    in Tree find all nodes with tag = anItem
    for each node found:
        support = node.support
        conditional_pattern = empty list
        while node.parent != root_node
            conditional_pattern.append(node.parent)
            node = node.parent
        conditional_pattern_base.append( (conditional_pattern, support))
    return conditional_pattern_base
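The pseudocode above could look roughly like this in Python (shown in Python rather than Java only to keep it short). The FPNode class, its field names, and the flat list of nodes are assumptions made for this sketch, since the question's own Node class is not shown; the paths store item tags rather than node objects.

```python
class FPNode:
    """Hypothetical FP-tree node: an item tag, a support count, and a parent link."""
    def __init__(self, tag, support, parent=None):
        self.tag = tag
        self.support = support
        self.parent = parent


def construct(nodes, root, an_item):
    """Collect the conditional pattern base of an_item by walking up to the root."""
    conditional_pattern_base = []
    for node in nodes:
        if node.tag != an_item:
            continue
        conditional_pattern = []
        current = node.parent
        # climb from the node's parent up to (but not including) the root
        while current is not None and current is not root:
            conditional_pattern.append(current.tag)
            current = current.parent
        conditional_pattern_base.append((conditional_pattern, node.support))
    return conditional_pattern_base


# Tiny example tree: root -> a -> b -> c  and  root -> b -> c
root = FPNode(None, 0)
a = FPNode("a", 2, root)
b1 = FPNode("b", 2, a)
c1 = FPNode("c", 1, b1)
b2 = FPNode("b", 1, root)
c2 = FPNode("c", 1, b2)
nodes = [a, b1, c1, b2, c2]

print(construct(nodes, root, "c"))  # [(['b', 'a'], 1), (['b'], 1)]
```

Each entry pairs a prefix path (listed from the item upward) with the support of the node it was collected from; the conditional tree for the new prefix is then built from these weighted paths.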
I am doing my programming homework, and I don't know how to solve this problem:
We have a set of n weights, and we put them on a scale one by one until all weights are used. We also have a string of n letters "R" or "L", where each letter says which pan is heavier at that moment; the pans can't be in balance. No two weights have the same mass. Compute in what order we have to put the weights on the scale, and on which pan.
The goal is to find an order of putting the weights on the scale such that the input string is respected.
Input: a number 0 < n < 51, the number of weights, then the weights and the string.
Output: n lines, each with a weight and "R" or "L", the side where you put that weight. If there are many solutions, output any of them.
Example 1:
Input:
3
10 20 30
LRL
Output:
10 L
20 R
30 L
Example 2:
Input:
3
10 20 30
LLR
Output:
20 L
10 R
30 R
Example 3:
Input:
5
10 20 30 40 50
LLLLR
Output:
50 L
10 L
20 R
30 R
40 R
I already tried to solve it with recursion, but without success. Can someone please help me with this problem, or just give me hints on how to solve it?
Since you do not show any code of your own, I'll give you some ideas without code. If you need more help, show more of your work then I can show you Python code that solves your problem.
Your problem is suitable for backtracking. Wikipedia's definition of this algorithm is
Backtracking is a general algorithm for finding all (or some) solutions to some computational problems, notably constraint satisfaction problems, that incrementally builds candidates to the solutions, and abandons a candidate ("backtracks") as soon as it determines that the candidate cannot possibly be completed to a valid solution.
and
Backtracking can be applied only for problems which admit the concept of a "partial candidate solution" and a relatively quick test of whether it can possibly be completed to a valid solution.
Your problem satisfies those requirements. At each stage you need to choose one of the remaining weights and one of the two pans of the scale. When you place the chosen weight on the chosen pan, you determine if the corresponding letter from the input string is satisfied. If not, you reject the choice of weight and pan. If so, you continue by choosing another weight and pan.
Your overall routine first inputs and prepares the data. It then calls a recursive routine that chooses one weight and one pan at each level. Some of the information needed by each level could be put into mutable global variables, but it would be more clear if you pass all needed information as parameters. Each call to the recursive routine needs to pass:
the weights not yet used
the input L/R string not yet used
the current state of the weights on the pans, in a format that can easily be printed when finalized (perhaps an array of ordered pairs of a weight and a pan)
the current weight imbalance of the pans. This could be calculated from the previous parameter, but time would be saved by passing this separately. This would be total of the weights on the right pan minus the total of the weights on the left pan (or vice versa).
Your base case for the recursion is when the unused-weights and unused-letters are empty. You then have finished the search and can print the solution and quit the program. Otherwise you loop over all combinations of one of the unused weights and one of the pans. For each combination, calculate what the new imbalance would be if you placed that weight on that pan. If that new imbalance agrees with the corresponding letter, call the routine recursively with appropriately-modified parameters. If not, do nothing for this weight and pan.
You still have a few choices to make before coding, such as the data structure for the unused weights. Show me some of your own coding efforts then I'll give you my Python code.
Be aware that this could be slow for a large number of weights. For n weights and two pans, the total number of ways to place the weights on the pans is n! * 2**n (that is a factorial and an exponentiation). For n = 50 that is over 3e79, much too large to do. The backtracking avoids most groups of choices, since choices are rejected as soon as possible, but my algorithm could still be slow. There may be a better algorithm than backtracking, but I do not see it. Your problem seems to be designed to be handled by backtracking.
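For a sense of scale, the count n! * 2**n can be computed directly. The helper name placements is made up for this sketch.

```python
import math


def placements(n):
    """Orderings of n weights (n!) times pan choices per weight (2**n)."""
    return math.factorial(n) * 2 ** n


print(placements(3))            # 48
print(f"{placements(50):.2e}")  # roughly 3.4e79
```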
Now that you have shown more effort of your own, here is my un-optimized Python 3 code. This works for all the examples you gave, though I got a different valid solution for your third example.
def weights_on_pans():
    def solve(unused_weights, unused_tilts, placement, imbalance):
        """Place the weights on the scales using recursive
        backtracking. Return True if successful, False otherwise."""
        if not unused_weights:
            # Done: print the placement and note that we succeeded
            for weight, pan in placement:
                print(weight, 'L' if pan < 0 else 'R')
            return True  # success right now
        tilt, *later_tilts = unused_tilts
        for weight in unused_weights:
            for pan in (-1, 1):  # -1 means left, 1 means right
                new_imbalance = imbalance + pan * weight
                if new_imbalance * tilt > 0:  # both negative or both positive
                    # Continue searching since imbalance in proper direction
                    if solve(unused_weights - {weight},
                             later_tilts,
                             placement + [(weight, pan)],
                             new_imbalance):
                        return True  # success at a lower level
        return False  # not yet successful

    # Get the inputs from standard input. (This version has no validity checks)
    cnt_weights = int(input())
    weights = {int(item) for item in input().split()}
    letters = input()
    # Call the recursive routine with appropriate starting parameters.
    tilts = [(-1 if letter == 'L' else 1) for letter in letters]
    solve(weights, tilts, [], 0)

weights_on_pans()
The main way I can see to speed up that code is to avoid the O(n) operations in the call to solve in the inner loop. That means perhaps changing the data structure of unused_weights and changing how it, placement, and perhaps unused_tilts/later_tilts are modified to use O(1) operations. Those changes would complicate the code, which is why I did not do them.
This is another question from my past midterm, and I am supposed to give a formal formulation, describe the algorithm used, and justify its correctness. Here is the problem:
The University is trying to schedule n different classes. Each class has a start and finish time. All classes have to be taught on Friday. There are only two classrooms available.
Help the university decide whether it is possible to schedule these classes without causing any time conflict (i.e. two classes with overlapping class times are scheduled in the same classroom).
Sort the classes by starting time (O(nlogn)), then go through them in order (O(n)), noting starting and ending times and looking for the case of more than two classes going on at the same time.
This isn't a problem with a bipartite graph solution. Who told you it was?
@Beta is nearly correct. Create a list of pairs <START, time> and <END, time>. Each class has two pairs in the list, one START and one END.
Now sort the list by time. Or if you like, put them in a min heap, which amounts to heapsort. For equal times, put the END pairs before the START pairs. Then execute the following loop:
set N = 0
while sorted list not empty
    pop <tag, time> from the head of the list
    if tag == START
        N = N + 1
        if N > 2 return "can't schedule"
    else // tag == END
        N = N - 1
    end
end
return "can schedule"
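The loop above could be written in Python roughly as follows. This is a sketch under a few assumptions: classes are given as (start, end) pairs of comparable times, a class ending at time t does not conflict with one starting at t (which is what putting END before START implies), and the function name can_schedule and the rooms parameter are made up for illustration.

```python
def can_schedule(classes, rooms=2):
    """Sweep over start/end events and track how many classes are in session."""
    events = []
    for start, end in classes:
        events.append((start, 1))  # 1 marks a START event
        events.append((end, 0))    # 0 marks an END event
    # Tuples sort by time first, then by tag, so END (0) precedes START (1) on ties
    events.sort()

    in_session = 0
    for _, is_start in events:
        if is_start:
            in_session += 1
            if in_session > rooms:
                return False       # more simultaneous classes than rooms
        else:
            in_session -= 1
    return True


print(can_schedule([(9, 10), (9, 11), (10, 12)]))  # True
print(can_schedule([(9, 11), (9, 11), (10, 12)]))  # False
```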
You can easily enrich the algorithm a bit to return the time periods where more than 2 classes are in session at the same time, return those classes, and other useful information.
This indeed IS a bipartite/bicoloring problem.
Imagine each class as a node of a graph, and create an edge between two nodes if their times overlap. If the resulting graph can be bicolored, then it is possible to schedule all the classes; otherwise it is not.
If the graph can be bicolored, then each "black" node belongs to room 1 and each "white" node belongs to room 2.
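A sketch of this bicoloring approach, assuming classes are (start, end) pairs and that a class ending at time t does not overlap one starting at t. The function name assign_rooms is made up for illustration, and the conflict graph is built by comparing every pair, so this simple version takes O(n^2) time.

```python
from collections import deque


def assign_rooms(classes):
    """Return a list with room 0 or 1 per class, or None if two rooms are not enough."""
    n = len(classes)
    # Two classes conflict when each one starts before the other ends
    conflicts = [
        [j for j in range(n)
         if j != i and classes[i][0] < classes[j][1] and classes[j][0] < classes[i][1]]
        for i in range(n)
    ]

    room = [None] * n
    for first in range(n):
        if room[first] is not None:
            continue               # already colored as part of an earlier component
        room[first] = 0
        queue = deque([first])
        while queue:
            u = queue.popleft()
            for v in conflicts[u]:
                if room[v] is None:
                    room[v] = 1 - room[u]   # conflicting class gets the other room
                    queue.append(v)
                elif room[v] == room[u]:
                    return None             # odd cycle: not bicolorable
    return room


print(assign_rooms([(9, 10), (9, 11), (10, 12)]))  # [0, 1, 0]
print(assign_rooms([(9, 11), (9, 11), (10, 12)]))  # None
```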