Lectures 19 - School of Computer Science and Statistics

cs2010: algorithms and data structures
Lectures 19: Binary Search Trees
.
Vasileios Koutavas
12 Nov 2014
School of Computer Science and Statistics
Trinity College Dublin
1
assignment 3
https://www.scss.tcd.ie/Vasileios.Koutavas/teaching/cs2010/michaelmas1415/assignment-3
→ Due in 2 weeks
→ 25% of MT assignment marks
→ Critical thinking required: what is the most efficient algorithm/data structure to
solve the problem?
2
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
F O U R T H
E D I T I O N
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
Binary search trees
Definition. A BST is a binary tree in symmetric order.
root
a left link
A binary tree is either:
a subtree
・Empty.
・Two disjoint binary trees (left and right).
right child
of root
null links
Anatomy of a binary tree
Symmetric order. Each node has a key,
and every node’s key is:
・
・Smaller than all keys in its right subtree.
Larger than all keys in its left subtree.
parent of A and R
left link
of E
S
E
X
A
R
C
keys smaller than E
key
H
9
value
associated
with R
keys larger than E
Anatomy of a binary search tree
3
binary trees: more definitions
→ leaves of tree: the nodes with no child nodes
→ height of tree: the maximum number of links from the root to a leaf
→ levels of tree: the maximum number of nodes from the root to a leaf (inl. root
and leaf)
→ size of tree: the number of nodes in the tree
→ depth of a node: the number of links from the root to this node.
3
binary trees: more definitions
→ leaves of tree: the nodes with no child nodes
→ height of tree: the maximum number of links from the root to a leaf
→ levels of tree: the maximum number of nodes from the root to a leaf (inl. root
and leaf)
→ size of tree: the number of nodes in the tree
→ depth of a node: the number of links from the root to this node.
Q: how many leafs in this tree?
Q: what is the height of this tree?
Q: how many levels in this tree?
Q: what is the size of this tree?
Q: what is the depth of ‘H’?
3
binary trees: more definitions
→ leaves of tree: the nodes with no child nodes
→ height of tree: the maximum number of links from the root to a leaf
→ levels of tree: the maximum number of nodes from the root to a leaf (inl. root
and leaf)
→ size of tree: the number of nodes in the tree
→ depth of a node: the number of links from the root to this node.
Q: how many leafs in this tree? 4
Q: what is the height of this tree? 4
Q: how many levels in this tree? 5
Q: what is the size of this tree? 9
Q: what is the depth of ‘H’? 3
3
Binary search tree demo
Search. If less, go left; if greater, go right; if equal, search hit.
successful search for H
S
E
X
A
R
C
H
M
4
Binary search tree demo
Insert. If less, go left; if greater, go right; if null, insert.
insert G
S
E
X
A
R
C
H
G
M
5
BST representation in Java
Java definition. A BST is a reference to a root Node.
A Node is composed of four fields:
・A Key and a Value.
・A reference to the left and right subtree.
smaller keys
larger keys
private class Node
{
private Key key;
private Value val;
private Node left, right;
public Node(Key key, Value val)
{
this.key = key;
this.val = val;
}
}
BST
Node
key
left
BST with smaller keys
val
right
BST with larger keys
Binary search tree
Key and Value are generic types; Key is Comparable
6
BST implementation (skeleton)
public class BST<Key extends Comparable<Key>, Value>
{
private Node root;
private class Node
{ /* see previous slide */
root of BST
}
public void put(Key key, Value val)
{ /* see next slides */ }
public Value get(Key key)
{ /* see next slides */ }
public void delete(Key key)
{ /* see next slides */ }
public Iterable<Key> iterator()
{ /* see next slides */ }
}
7
BST search: Java implementation
Get. Return value corresponding to given key, or null if no such key.
public Value get(Key key)
{
Node x = root;
while (x != null)
{
int cmp = key.compareTo(x.key);
if
(cmp < 0) x = x.left;
else if (cmp > 0) x = x.right;
else if (cmp == 0) return x.val;
}
return null;
}
Cost. Number of compares is equal to 1 + depth of node.
8
BST insert
Put. Associate value with key.
inserting L
S
E
X
A
H
C
Search for key, then two cases:
・Key in tree ⇒ reset value.
・Key not in tree ⇒ add new node.
R
M
P
search for L ends
at this null link
S
E
X
A
R
H
C
M
create new node
P
L
S
E
X
A
R
C
H
M
reset links
on the way up
L
P
Insertion into a BST
9
BST insert: Java implementation
Put. Associate value with key.
public void put(Key key, Value val)
{ root = put(root, key, val); }
concise, but tricky,
recursive code;
read carefully!
private Node put(Node x, Key key, Value val)
{
if (x == null) return new Node(key, val);
int cmp = key.compareTo(x.key);
if
(cmp < 0)
x.left = put(x.left, key, val);
else if (cmp > 0)
x.right = put(x.right, key, val);
else if (cmp == 0)
x.val = val;
return x;
}
Cost. Number of compares is equal to 1 + depth of node.
10
S
C
Tree shape
R
E
A
X
・Many BSTs correspond to same set of keys. typical case
S
case
search/insert
is equal to 1 + depth of node.
・Number of compares for best
H
E
X
S
C
best case
R
E
X
typical case
E
X
R
H
C
H
R
H
S
X
S
A
E
R
C
worst case
C
X
A
R
H
C
A
S
E
S
C
A
X
typical case
H
A
R
E
A
A
worst case
C
BST possibilities
E
H
R
S
BottomAline. Tree
depends on order of X
insertion.
worstshape
case
C
E
BST possibilities
H
11
BST insertion: random order visualization
Ex. Insert keys in random order.
12
BSTs: mathematical analysis
Proposition. If N distinct keys are inserted into a BST in random order,
the expected number of compares for a search/insert is ~ 2 ln N.
Pf. 1–1 correspondence with quicksort partitioning.
Proposition. [Reed, 2003] If N distinct keys are inserted in random order,
expected height of tree is ~ 4.311 ln N.
How Tall is a Tree?
How Tall is a Tree?
Bruce Reed
Bruce Reed
CNRS, Paris, France
CNRS, Paris, France
[email protected]
[email protected]
ABSTRACT
ABSTRACT
purpose of this note
have:
purpose
this note
that
for /3 -- ½ + ~ 3,
Let H~ be the height of a random
binaryofsearch
tree toonprove
n
nodes.
Wesearch
show that
there
constants a = 4.31107...
Let H~ be the height of a random
binary
tree on
n existshave:
THEOREM 1. E(H
n d / 3 = 1.95...
such that E(H~) = c~logn - / 3 1 o g l o g n +
nodes. We show that there existsa constants
a = 4.31107...
.
We also
O(1).
1. E(H~) = ~ l o g nVar(Hn)
- / 3 1 o g l=o gO(1)
n + O(1
a n d / 3 = 1.95... such that E(H~)O(1),
= c~logn
- / 3show
1 o g l othat
g n +Var(H~) =THEOREM
Var(Hn) = O(1) .
O(1), We also show that Var(H~) = O(1).
R e m a r k By the def
Categories and Subject Descriptors
nition
given
is
more
R e m a r k By the definition of a, /3 = 7"g~"
3~ The firs
we this
will see.
Categories and Subject Descriptors
E.2 [ D a t a S t r u c t u r e s ] : Trees nition given is more suggestive ofaswhy
value is co
For more informatio
as we will see.
E.2 [ D a t a S t r u c t u r e s ] : Trees
may
consult
[6],[7],
For
more
information
on
random
binary
search
trees[
1. THE RESULTS
R
e
m
a
r
k
After
I ann
may consult [6],[7], [1], [2], [9], [4], and [8].
A binary search tree is a binary tree to each node of which
1. THE RESULTS
an alterna
Rkeys
e m aaxe
r k drawn
After I from
announced
these developed
results, Drmota(unp
we
have
associated
a
key;
these
some
using
complet
A binary search tree is a binary tree to each node of which
proof ofO(1)
the fact
that
Var(H
totally
set and
the key developed
at v cannotanbealternative
larger than
illuminate
we have associated a key; these keys
axeordered
drawn from
some
O(1) than
using
different proofs
techniques.
As
ou
15dif
at its
child
thecompletely
key at its left
submit we
th
totally ordered set and the key atthev key
cannot
beright
larger
thannor smaller
proofs illuminate different aspectsdecided
of the to
problem,
But… Worst-case height is N.
[ exponentially small chance when keys are inserted in random order ]
ST implementations: summary
guarantee
average case
operations
on keys
implementation
search
insert
search hit
insert
sequential search
(unordered list)
N
N
½N
N
equals()
binary search
(ordered array)
lg N
N
lg N
½N
compareTo()
BST
N
N
1.39 lg N
1.39 lg N
compareTo()
Why not shuffle to ensure a (probabilistic) guarantee of 4.311 ln N?
16
quiz
Q: Which of the following are Binary Search Trees? Why?
4