What is the definition for the height of a tree?

I can't seem to find a definitive answer for this. I'm trying to do some elementary proofs on heaps, but here's what's throwing me off a little bit:
Is an empty tree valid? If so, what is its height?
I would think this would be 0.
What is the height of a tree with a single node?
I would think this would be 1 but I have seen definitions where it is 0 (and if this is the case then I don't know how to account for an empty tree).

The height of a tree is the length of the path from the root of that tree to its farthest node (i.e. the leaf node farthest from the root).
A tree with only a root node has height 0, and a tree with zero nodes would be considered empty. An empty tree has a height of -1. Please check this.
I hope this helps.

I think you should take a look at the Dictionary of Algorithms and Data Structures at the NIST website. Their definition of height says a single node has height 0.
The definition of a valid tree does include an empty structure. The site doesn't mention the height of such a tree, but based on the definition of the height, it should also be 0.

I have seen it used in both ways (counting a single node as 0 or 1), but the majority of sources would define a root-only tree as a tree of height 0, and would not consider a 0-node tree valid.

If your tree is a recursively defined data structure which may be either empty or a node with a left and right subtree (for example search trees, or your heap), then the natural definition is to assign 0 to the empty tree and 1 + the height of the highest subtree to a nonempty tree.
If your tree is a graph, then the natural definition is the length of the longest path from the root to a leaf, so a root-only tree has height 0. You normally wouldn't even consider empty trees in this case.

The height of a tree is the length of the longest path to a terminal node in either of its children.
Wikipedia says the height of an empty tree is -1. I disagree. An empty tree is literally just a tree containing one terminal node (a null or special value which represents an empty tree). Since the node has no children, the length of its longest path must be the empty sum = 0, not -1.
Likewise, a non-empty tree has two children, so by definition there is at least one path of length >= 1 to a terminal node.
We might define our tree as follows:
type 'a tree =
  | Node of 'a tree * 'a * 'a tree
  | Nil

let rec height = function
  | Node (left, _, right) -> 1 + max (height left) (height right)
  | Nil -> 0
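As a quick sanity check of that definition (an illustrative snippet using the type above, not part of the original answer):

let () =
  assert (height Nil = 0);                              (* empty tree *)
  assert (height (Node (Nil, 42, Nil)) = 1);            (* single node *)
  assert (height (Node (Node (Nil, 1, Nil), 2, Nil)) = 2)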

According to Wikipedia, the height of a (sub-)tree with one single node is 0, and the height of a tree with no nodes would be -1. But I think it's up to you how you define the height, and your proofs should work with either definition.

The definition of the height of a rooted tree is the length of the longest path from the root to a leaf, expressed in the number of edges. This definition does not cover the case of an empty tree, as there is no path at all.
However, for practical reasons, it is convenient to define the height of an empty tree as −1. Here are some of those reasons:
For non-empty trees we have this rule: the height of the tree is equal to the number of levels in that tree, minus 1. If we extrapolate this rule to an empty tree, then we have 0 levels, and thus a height of −1.
A tree with height h has at least h+1 nodes. If the tree is binary, then it has at most 2^(h+1) − 1 nodes. If we substitute −1 for h we get 0 for both expressions, and indeed an empty tree has zero nodes.
The height of a tree is one more than the maximum among the heights of the root's subtrees. If the root happens to have no children, we could say it only has "empty" subtrees. And if we consider the height of those empty subtrees to be −1, then we come to the (correct) conclusion that this tree's height is 0.
It would be impractical to define the height of an empty tree as 0, as you would need to define exceptions to the points raised above.
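Here is a minimal OCaml sketch of that convention (the tree type mirrors the one in the earlier answer; it is an illustration, not taken from this answer):

type 'a tree = Node of 'a tree * 'a * 'a tree | Nil

(* Height counted in edges: a single node has height 0 and the empty tree -1. *)
let rec height_edges = function
  | Nil -> -1
  | Node (left, _, right) -> 1 + max (height_edges left) (height_edges right)

let () =
  assert (height_edges Nil = -1);                  (* 2^(-1+1) - 1 = 0 nodes *)
  assert (height_edges (Node (Nil, 'a', Nil)) = 0) (* single node, height 0 *)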

Actually, a good definition for the height of a tree is: the level of the leaf at the end of the longest path from the root, plus 1. According to this definition, an empty tree has no levels at all, and we can't say it has level zero, because the level of a root is already zero. So the level of an empty tree is -1, and by this definition its height is -1 + 1 = 0. So ZERO is the height of an empty tree. But many books give -1, and no explanation is given.

Related

How do I solve this data structure

I need help with this data structure question.
Each node in the tree has these properties:
right
left
color
key
here are the details:
picture of example tree
https://i.gyazo.com/85a59c69301c214ddc03f2d54324ca6f.png
A good path is a path where parent and child don't have the same color (for example, a good path is red-white-red-white or white-red-white-red).
You need to find the longest good path inside a given tree and print its length (in this example tree the output would be 5).
for example in this tree the longest path is
17->13->32->18->22
rules:
you can have other functions to assist you.
you may use a fixed number of variables, like x, y, z.
you cannot use additional data structures.
I'm not even sure whether it needs recursion or not.
To get you moving, let's look at the basic cases that can end or extend a good path at a given node:
leaf node (dead end; the path is just that node, length 1)
both children have the same color as the parent (dead end at the parent node, again length 1)
one child has a color different from the parent (follow that branch; return its length + 1)
both children have colors different from the parent (follow both; the path through this node has length left + 1 + right, a candidate for the overall answer, while the value passed up to the parent is the longer branch + 1)
That should get you to detailed pseudo-code for a solution; a sketch follows below.
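For concreteness, here is a minimal OCaml sketch of that recursion (the tree type, the color names and the helper names are my own assumptions, not from the question; the exercise forbids extra data structures, so treat this purely as an illustration of the idea):

type color = Red | White
type tree = Nil | Node of tree * color * int * tree   (* left, color, key, right *)

(* Returns (longest good path starting here and going down,
            longest good path found anywhere in this subtree), counted in nodes. *)
let rec good_path = function
  | Nil -> (0, 0)
  | Node (l, c, _, r) ->
      let (ld, lbest) = good_path l in
      let (rd, rbest) = good_path r in
      (* a branch only extends the path if the child's color differs from ours *)
      let extend child d =
        match child with Node (_, cc, _, _) when cc <> c -> d | _ -> 0
      in
      let ld = extend l ld and rd = extend r rd in
      let down = 1 + max ld rd in       (* value handed up to the parent *)
      let through = ld + 1 + rd in      (* path bending through this node *)
      (down, max through (max lbest rbest))

(* On the question's example tree this should evaluate to 5. *)
let longest_good_path t = snd (good_path t)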

sum graph from bottom to top

given the following graph modeled in neo4j
goal:
calculate the sum of all nodes multiplied by the edge percentage from the bottom up.
e.g.
(((30*.6)+(50*.1)+100)*.5)+10 = 71.5
status:
I found the REDUCE function (http://neo4j.com/docs/stable/query-functions-collection.html#functions-reduce)
but in my opinion it sums from the top to the bottom, instead of bottom up.
Is this a common problem with a well-known name that I don't know?
Is there any solution in neo4j or in another (graph)database/language?
This was a really interesting one:
I assumed two things: first, all nodes have the :A label; second, the property on nodes and relationships has the key p.
Here is a working query:
MATCH p=(:A)-[r]->(pike)
WITH pike, collect(p) as paths
OPTIONAL MATCH (pike)-[r]->()
WITH
CASE r WHEN null THEN 1 ELSE r.p END as multiplier,
CASE r WHEN null THEN last(nodes(paths[0])).p
ELSE reduce(x=0, path in paths | x + (head(nodes(path)).p * head(rels(path)).p)) + last(nodes(paths[0])).p END as total
RETURN sum(total*multiplier) as total
The logic behind it:
Find one-depth paths and aggregate the children by the pike (first WITH).
In case the OPTIONAL MATCH doesn't pass, the multiplier will be 1 instead of a possible float value from the relationship property.
The second CASE does the math logic: if this is the top of the pikes (here node A) it will just add the value of the top node, otherwise it will take the values of the children.
Then it sums the totals multiplied by the multiplier.
You can test it here: http://console.neo4j.org/r/ih8obf

How is this Huffman Table created?

I have a table that shows the probability of an event happening.
I'm fine with part 1, but part 2 is not clicking with me. I'm trying to get my head around how the binary numbers are derived in part 2.
I understand 0 is assigned to the largest probability and we work back from there, but how do we work out what the next set of binary numbers is? And what do the circles around the numbers mean, and what do the two shades of grey differentiate?
It's just not clicking. Maybe someone can explain it in a way that will make me understand?
To build Huffman codes, one approach is to build a binary tree using a priority queue, in which the data to be assigned codes are inserted, sorted by frequency.
To start with, you have a queue with only leaf nodes, one representing each of your data items.
At each step you take the two lowest-priority nodes from the queue, make a new node with a frequency equal to the sum of the two removed nodes, and then attach those two nodes as the left and right children. This new node is reinserted into the queue according to its frequency.
You repeat this until you only have one node in the queue, which will be the root.
Now you can traverse the tree from the root to any leaf node, and the path you take (whether you go left or right) at each level gives you either a 0 or a 1, and the length of the path (how far down the tree the node is) gives you the length of the code.
In practice you can just build this code as you build the tree, by appending 0 or 1 to the code at each node, according to whether the sub-tree it is part of is being added to the left or the right of some new parent.
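Here is a minimal OCaml sketch of that procedure (the symbols and probabilities are made up for illustration, and a re-sorted list stands in for the priority queue):

type huff = Leaf of string * float | Internal of huff * huff * float

let freq = function Leaf (_, f) | Internal (_, _, f) -> f

(* Repeatedly combine the two lowest-frequency nodes until one tree remains. *)
let rec build = function
  | [] -> failwith "no symbols"
  | [ t ] -> t
  | nodes ->
      (match List.sort (fun a b -> compare (freq a) (freq b)) nodes with
       | a :: b :: rest -> build (Internal (a, b, freq a +. freq b) :: rest)
       | _ -> assert false)

(* Walking left appends a 0, walking right appends a 1; the depth of a leaf
   is the length of its code. *)
let rec codes prefix = function
  | Leaf (sym, _) -> [ (sym, prefix) ]
  | Internal (l, r, _) -> codes (prefix ^ "0") l @ codes (prefix ^ "1") r

let () =
  let tree = build [ Leaf ("A", 0.4); Leaf ("B", 0.3); Leaf ("C", 0.2); Leaf ("D", 0.1) ] in
  List.iter (fun (s, c) -> Printf.printf "%s -> %s\n" s c) (codes "" tree)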
In your diagram, the numbers in the circles are indicating the sum of the frequency of the two nodes which have been combined at each stage of building the tree.
You should also see that the two being combined have been assigned different bits (one a 0, the other a 1).
A diagram may help. Apologies for my hand-writing:

Gaining information from nodes of tree

I am working with the tree data structure and trying to come up with a way to calculate the information I can gain from the nodes of the tree.
I am wondering if there are any existing techniques which can assign higher numerical importance to a node label which appears less frequently but closer to the root (i.e. at a smaller level, where level is the distance from the root) than to the same label appearing more frequently farther from the root.
To give an example, I want to give more significance to the node Book appearing once at level 2
than to it appearing thrice at level 3.
Will appreciate any suggestions/pointers to techniques which achieve something similar.
Thanks,
Prateek
One metric I just thought of is this: for a label k, let its "value" be the sum of the levels it appears at. So, if it appears at the root and at the root's left child, its value is 0 + 1 = 1.
Then, your most "important" labels are those with the lowest value.
EDIT: This will make the root more important than the label of its children, even if they are both the same. So, some scaling by occurrence count might be in order.
It depends how much significance you want to give to it at each level.
Just multiply by a weight that decreases as you move down the levels of the tree: something like n_nodes * 1/(b^n), where n is the level and b is a base you choose. With b = 4, a single node on level 2 gets 1/16 ≈ 0.063, and 3 nodes on level 3 get 3/64 ≈ 0.047, so the single node on level 2 is more significant. (Note that with b = 3 the two would tie at 1/9, since the occurrence count grows exactly as fast as the weight shrinks.)
Adjust the base to your liking. As long as the per-level weight shrinks faster than the occurrence counts grow, it will give more significance to nodes higher in the tree.
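A small sketch of that decaying-weight idea in OCaml (the tree type, the base of 4 and the label names are illustrative assumptions, not from the question):

type tree = Node of string * tree list   (* label and children *)

(* Importance of a label: sum 1 / base^level over its occurrences,
   so a rare occurrence near the root outweighs frequent deep ones. *)
let importance ?(base = 4.0) label tree =
  let rec go level (Node (l, children)) =
    let here = if l = label then 1.0 /. (base ** float_of_int level) else 0.0 in
    here +. List.fold_left (fun acc c -> acc +. go (level + 1) c) 0.0 children
  in
  go 0 tree

let () =
  let shallow = Node ("root", [ Node ("Book", []) ]) in
  let deep = Node ("root", [ Node ("x", [ Node ("Book", []); Node ("Book", []); Node ("Book", []) ]) ]) in
  (* 1/4 = 0.25 for one shallow occurrence vs 3/16 = 0.1875 for three deeper ones *)
  Printf.printf "%.4f vs %.4f\n" (importance "Book" shallow) (importance "Book" deep)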

Given a spanning tree and an edge not on the spanning tree, how to form a cycle base?

I have a graph with edges E and vertices V. I can find a spanning tree using Kruskal's algorithm (or any other traverse-backtrack-traverse-again kind of algorithm); now I want to find all the cycles of the cycle basis created from that spanning tree and the edges that are not on the tree. Is there any algorithm that allows me to do that, besides brute-force search?
I can, of course, start from one vertex of the non-spanning-tree edge, follow the edges, exploring them all and backtracking when I hit a dead end, until I come back to the other vertex of the edge. But this is a bit, err... brutal. Any other ideas?
After constructing the spanning tree, iterate over every edge (A, B) which is not in the tree and find the Lowest Common Ancestor (LCA) of the edge's endpoints; your cycle is the path
A -> LCA -> B -> A
You can use this link for an efficient Lowest Common Ancestor implementation:
http://www.topcoder.com/tc?module=Static&d1=tutorials&d2=lowestCommonAncestor
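As a rough OCaml sketch of that idea (the parent-array representation, the naive LCA and the function names are my own assumptions, not part of the answer; the linked tutorial covers faster LCA algorithms):

(* The spanning tree is given as a parent array, with parent.(root) = root. *)
let rec depth parent v = if parent.(v) = v then 0 else 1 + depth parent parent.(v)

(* Naive LCA: lift the deeper vertex, then walk both up in lock step. *)
let lca parent a b =
  let rec lift v k = if k = 0 then v else lift parent.(v) (k - 1) in
  let da = depth parent a and db = depth parent b in
  let a = if da > db then lift a (da - db) else a in
  let b = if db > da then lift b (db - da) else b in
  let rec meet a b = if a = b then a else meet parent.(a) parent.(b) in
  meet a b

(* The fundamental cycle induced by a non-tree edge (a, b):
   vertices a .. lca .. b; the non-tree edge (b, a) closes the cycle. *)
let fundamental_cycle parent a b =
  let l = lca parent a b in
  let rec up v acc = if v = l then List.rev (v :: acc) else up parent.(v) (v :: acc) in
  up a [] @ List.tl (List.rev (up b []))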
A simple algorithm we use for finding cycles in graphs:
Create a depth-first spanning tree where each node has a parent.
In traversing the tree, create a record of used nodes.
When an edge points to a previously used node, store it as a cyclic edge.
When the spanning tree is complete, the count of the cyclic edges gives the count of cycles.
For each cyclic edge, recurse through the ancestors of the two nodes until a common ancestor is found. That will give exactly the cycles.
It may be useful to have an index (hashtable) of all the ancestors of a cyclic edge node so that it is quick to find the common ancestor.
I doubt this is the best algorithm but it is fairly quick.
EDIT in response to a comment:
Each node in the spanning tree has a parent. When a node of a cyclic edge is reached, it calculates its list of ancestors (List<Node>); this list could be indexed for speed (i.e. contains() in less than O(n)). When a cyclic edge with two nodes (n1, n2) is found, iterate through the ancestors of n1 (n1.ancestorList, which is rapid since the list has already been created) and test whether each ancestor is in n2.ancestorList. If it is (call it commonAncestor), then exactly the ancestors traversed correspond to the cycle. Then iterate through n2's ancestors until you reach commonAncestor (also rapid). The time should depend on the number of cyclic edges, combined with the lookup in lists (probably O(log N), but cheap). There is no need to re-explore the graph and there is no backtracking.
