VERSION: A
CSE 100 Midterm #1
Summer 2014: July 15
Problem  Topic                       Points Possible  Points Earned  Grader
1        Data Structure comparisons  15
2        BSTs                        10
3        Running Time Analysis       30
4        Huffman Coding              30
5        C++                         15
Total                                100
This exam is closed book, closed notes. Write your name on every page, including reference
and scratch paper. Scratch paper must be turned in at the end of the exam.
You have 80 minutes to complete this exam. Work to maximize points. If you don’t know the
answer to a problem, move on and come back later. Most importantly, stay calm and don’t
panic. You can do this.
Name:________________________________________
ID:___________________________________________
Exam versions of adjacent students MUST BE DIFFERENT. If your version is the same as your
neighbor’s version, raise your hand.
Name of student to your LEFT:
Name of student to your RIGHT:
Exam version of student to your LEFT:
Exam version of student to your RIGHT:
(Write “N/A” if seat immediately to your left or right is not occupied, or a wall or aisle, etc.)
DO NOT OPEN THIS EXAM UNTIL YOU ARE INSTRUCTED
TO DO SO.
Name__________________________________
1. Data Structure Comparison
[15 points]
Assume you have the choice of the following data structures: sorted array, sorted linked list, unsorted
linked list, binary search tree and heap.
Choose the appropriate data structure if your algorithm repeatedly performs each of the following
functions. Briefly justify your answer.
a. Searches for elements in a dynamic data set (insertions and deletions are frequent).
Use binary search tree.
Reason: Both a sorted array and a (reasonably balanced) binary search tree support search in
O(log n) time, while a sorted linked list, an unsorted linked list and a heap need O(n). For a
dynamic data set the binary search tree is the better of the two, because its insertions and
deletions also take O(log n) on average, whereas a sorted array needs O(n) per insertion or
deletion to shift elements.
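The trade-off in part (a) can be sketched in C++ as follows. This is a minimal illustration, not part of the exam: the function names are made up for this example, and `std::set` stands in for a balanced binary search tree (it is typically implemented as a red-black tree).

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Sorted array: finding the slot is O(log n), but the insert itself
// shifts every later element, so the whole operation is O(n).
void sortedInsert(std::vector<int>& v, int key) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition, checked in debug builds
    std::vector<int>::iterator pos = std::lower_bound(v.begin(), v.end(), key);
    v.insert(pos, key);
}

// Sorted array: binary search, O(log n).
bool sortedContains(const std::vector<int>& v, int key) {
    return std::binary_search(v.begin(), v.end(), key);
}

// Balanced BST stand-in: search, insert and erase are all O(log n).
bool treeContains(const std::set<int>& s, int key) {
    return s.find(key) != s.end();
}
```

Both containers answer membership queries in logarithmic time; they differ only in what an update costs, which is why the tree is preferred when the data set changes often.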
b. Searches for elements in a static data set (insertions and deletions are rare).
Use sorted array or binary search tree.
Reason: Both a sorted array and a binary search tree support search in O(log n) time (for the
tree, provided it is reasonably balanced), while a sorted linked list, an unsorted linked list and
a heap need O(n). Because insertions and deletions are rare in a static data set, their running
time does not need to be considered.
c. Extracts the element with the minimum key value.
Choose sorted linked list.
Reason: Extracting the element with the minimum key value from a sorted linked list takes O(1)
time, because the minimum is always at the head (assuming ascending order). A heap needs
O(log n) per extraction and an unsorted linked list needs O(n).
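A minimal sketch of the O(1) extraction in part (c), assuming the list is kept in ascending order. `std::list` is used here as the linked list and `extractMin` is a hypothetical helper name.

```cpp
#include <cassert>
#include <list>

// Removes and returns the smallest key of an ascending sorted linked list.
// Both front() and pop_front() are constant time, so the call is O(1).
int extractMin(std::list<int>& sorted) {
    assert(!sorted.empty());
    int smallest = sorted.front();
    sorted.pop_front();
    return smallest;
}
```

The constant-time extraction comes at the price of O(n) insertion, since each new key has to be walked to its sorted position.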
2. Binary Search Trees
[10 points]
For each of the following trees, state whether or not it is a legal Binary Search Tree (BST). If the tree is
not a legal BST, state why not, annotating the tree where appropriate. For all the given trees, indicate if
they are balanced or not. Justify your answer.
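[Figure: two binary trees labeled A and B. Node values visible in this copy: 52, 52, 75, 30, 12,
40, 55, 55. The tree structure is not recoverable.]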
a) Is tree (A) a legal BST (circle one)?   Yes   No   [3 points]
If not, why not?
Node with 55
b) Is tree (A) balanced (circle one)?   Yes   No   [2 points]
Justify your answer.
c) Is tree (B) a legal BST (circle one)?   Yes   No   [2 points]
If not, why not?
Node with 55
d) Is tree (B) balanced (circle one)?   Yes   No   [3 points]
Justify your answer.
Node with 52
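The legality check asked for in this problem can be sketched as a recursive range test. This is an illustrative sketch only: the `Node` struct and function names are hypothetical, duplicate keys are assumed to be illegal, and the trees in the test are made up rather than taken from the exam figure.

```cpp
#include <cassert>
#include <climits>

// Hypothetical minimal tree node for this example.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// True if every key in the subtree lies strictly between lo and hi
// and the BST ordering holds at every node below n.
bool isBstInRange(const Node* n, long long lo, long long hi) {
    if (n == nullptr) return true;
    if (n->key <= lo || n->key >= hi) return false;
    return isBstInRange(n->left, lo, n->key) &&
           isBstInRange(n->right, n->key, hi);
}

// A tree is a legal BST if the whole tree fits the unbounded range.
bool isBst(const Node* root) {
    return isBstInRange(root, LLONG_MIN, LLONG_MAX);
}
```

Passing the allowed range down the recursion is what catches a node that is correctly ordered relative to its parent but violates an ancestor, for example a 55 placed somewhere in the left subtree of 52.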
3. Running Time Analysis
[30 points]
a. Write the most general equation for the average number of comparisons needed to find an
element in a particular binary search tree with N nodes, where d(x_i) is the depth of node x_i in
the tree and p_i is the probability of searching for node x_i. [3 points]
See version B
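As a hedged illustration of the quantities named in part (a), the sketch below computes a probability-weighted sum of node depths. It assumes the root has depth 1, so that a node's depth equals the number of comparisons needed to reach it; the function name and the sample tree in the usage note are hypothetical, and the official answer remains the one in version B.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum over all nodes of p_i * d(x_i): the expected number of comparisons
// for a successful search, given per-node depths and search probabilities.
double averageComparisons(const std::vector<int>& depth,
                          const std::vector<double>& prob) {
    assert(depth.size() == prob.size());
    double total = 0.0;
    for (std::size_t i = 0; i < depth.size(); ++i) {
        total += prob[i] * depth[i];
    }
    return total;
}
```

For a three-node tree with a root (depth 1, probability 0.5) and two children (depth 2, probability 0.25 each), the sum is 0.5*1 + 0.25*2 + 0.25*2 = 1.5 comparisons.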
b. Which of the following assumptions did you rely on in writing the above equation? [2 points]
i. The tree is approximately balanced.
ii. All nodes in the tree are equally likely to be searched for
iii. All orders of insertions are equally likely to occur
iv. All priorities are drawn from a uniform probability distribution
v. None of the above
See version B
c. Construct all possible binary search trees with the keys 1, 2, 7, 9 under the restriction that the
second key inserted into the tree is even. [10 points]
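One way to check an answer to part (c) is to enumerate every insertion order, keep those whose second key is even, build the tree, and collect the distinct shapes. The sketch below does this by brute force; all names in it are hypothetical and it is meant as a checking aid, not as the expected exam solution.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Standard BST insertion without rebalancing.
void insert(std::unique_ptr<Node>& root, int key) {
    if (!root) { root.reset(new Node(key)); return; }
    insert(key < root->key ? root->left : root->right, key);
}

// Preorder text form with '.' for an empty subtree; equal strings mean equal trees.
std::string serialize(const std::unique_ptr<Node>& n) {
    if (!n) return ".";
    return "(" + std::to_string(n->key) + " " + serialize(n->left) + " " +
           serialize(n->right) + ")";
}

// Distinct trees over all insertion orders whose second key is even.
std::set<std::string> treesWithEvenSecondKey(std::vector<int> keys) {
    assert(keys.size() >= 2);
    std::sort(keys.begin(), keys.end());  // next_permutation needs a sorted start
    std::set<std::string> shapes;
    do {
        if (keys[1] % 2 != 0) continue;   // second inserted key must be even
        std::unique_ptr<Node> root;
        for (int k : keys) insert(root, k);
        shapes.insert(serialize(root));
    } while (std::next_permutation(keys.begin(), keys.end()));
    return shapes;
}
```

For the keys 1, 2, 7, 9 the second key must be 2, so the root is 1, 7 or 9. Root 1 yields two trees (7 and 9 can arrive in either order below 2), while roots 7 and 9 yield one tree each, for four distinct trees in total.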
d. Compute the average total depth over all trees that can be constructed with the keys
(5, 2, 7, 9, 15, 21) under the restriction that the first key inserted into the tree is even. You
are given the following recursive relationship:
N*D(N)=(N+1)*D(N-1)+2N-1, where D(N) is the expected total depth of all trees with N
keys, under the assumption that all keys are equally likely to be inserted into the tree.
[15 points]
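Since 2 is the only even key, it must be the root. Every other key is larger than 2, so the left
subtree is empty and the right subtree s contains 5, 7, 9, 15, 21.
D(1) = 1;
D(2) = 1+2 = 3;
D(3) = 17/3;
D(4) = 53/6;
D(5) = 62/5
Average total depth = 62/5 + 5 + 1 = 92/5
(each of the 5 nodes in s sits one level deeper than it would in s alone, which adds 5, and the
root 2 contributes depth 1).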
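The recurrence can be evaluated exactly with a small fraction type, which makes it easy to check the values above. This is a sketch under the assumption that D(1) = 1 (the root has depth 1); the `Fraction` struct and the function names are introduced only for this example.

```cpp
#include <cassert>

// Exact fraction num/den kept in lowest terms, with den > 0.
struct Fraction {
    long long num;
    long long den;
};

long long gcdOf(long long a, long long b) {
    while (b != 0) { long long t = a % b; a = b; b = t; }
    return a < 0 ? -a : a;
}

Fraction makeFraction(long long num, long long den) {
    assert(den > 0);
    long long g = gcdOf(num, den);
    return Fraction{num / g, den / g};
}

// D(n) from n*D(n) = (n+1)*D(n-1) + 2n - 1, starting at D(1) = 1.
Fraction expectedTotalDepth(int n) {
    assert(n >= 1);
    Fraction d = makeFraction(1, 1);
    for (int k = 2; k <= n; ++k) {
        // D(k) = ((k+1)*D(k-1) + (2k-1)) / k
        long long num = (k + 1) * d.num + (2LL * k - 1) * d.den;
        long long den = d.den * k;
        d = makeFraction(num, den);
    }
    return d;
}

// A fixed root above a random subtree of m keys: each of the m nodes
// drops one level (+m) and the root itself adds depth 1.
Fraction depthWithFixedRoot(int m) {
    Fraction s = expectedTotalDepth(m);
    return makeFraction(s.num + (m + 1) * s.den, s.den);
}
```

With these definitions `expectedTotalDepth(5)` gives 62/5 and `depthWithFixedRoot(5)` gives 92/5, matching the hand computation.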
4. Huffman coding [30 points]
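i) Which of the binary trees is a better encoding scheme over the symbols {h, u, f, m, a, n}
if all symbols had a non-zero frequency? Justify your answer. [5 points]
[Figure: two binary code trees labeled A and B over the symbols h, u, f, m, a, n. The tree
structure is not recoverable in this copy.]
B is better because B is prefix free: no codeword is a prefix of another, so an encoded string
can be decoded unambiguously.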
ii) Consider the following symbols with the given frequency distribution:
Symbol  Frequency  Code (see part a)  Frequency that would yield a better lower bound
                                      on expected codeword length (see part d)
A       0.15       000                1
H       0.2        01                 0
M       0.1        0010               0
F       0.5        1                  0
N       0.01       00111              0
U       0.04       00110              0
a. Draw the Huffman code tree below using the following conventions, and then use that tree to
fill in the code table above: [10 points]
- The subtree with the lower frequency is always the right child when two trees are merged.
- The left child is always the 0 child, the right child is always the 1 child.
- Ties are broken using alphabetical ordering. In the case of a tie in frequency between two
  trees, the tree with the symbol that is earlier in the alphabet is the tree that is picked first to
  be merged. E.g., if trees with the symbols A and E had the same frequency, then the tree with A
  would be picked first.
- When merging two trees, the symbol that is alphabetically earlier is propagated up to the new
  root. If the trees have the same frequency, the tree with the symbol earlier in the alphabet is
  chosen as the left child of the new root.
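The conventions above can be turned into a short program to check the code table. The sketch below is one possible reading of the rules, not the course's reference implementation: the `Tree` struct and function names are hypothetical, and frequencies are given in hundredths (15 for 0.15) so that ties such as 0.05 + 0.10 versus 0.15 compare exactly.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Tree {
    int freq;                         // frequency in hundredths
    char symb;                        // alphabetically earliest symbol in the subtree
    std::shared_ptr<Tree> zero, one;  // left ("0") and right ("1") children
};

// Merge order: lower frequency first, ties broken by earlier symbol.
bool picksBefore(const std::shared_ptr<Tree>& a, const std::shared_ptr<Tree>& b) {
    if (a->freq != b->freq) return a->freq < b->freq;
    return a->symb < b->symb;
}

// Walks the finished tree and records the bit string of every leaf.
void collect(const std::shared_ptr<Tree>& t, const std::string& prefix,
             std::map<char, std::string>& out) {
    if (!t->zero) { out[t->symb] = prefix; return; }
    collect(t->zero, prefix + "0", out);
    collect(t->one, prefix + "1", out);
}

std::map<char, std::string> buildCodes(const std::map<char, int>& freqs) {
    assert(freqs.size() >= 2);
    std::vector<std::shared_ptr<Tree>> forest;
    for (const auto& kv : freqs)
        forest.push_back(std::make_shared<Tree>(Tree{kv.second, kv.first, nullptr, nullptr}));
    while (forest.size() > 1) {
        std::sort(forest.begin(), forest.end(), picksBefore);
        std::shared_ptr<Tree> first = forest[0], second = forest[1];
        forest.erase(forest.begin(), forest.begin() + 2);
        // Lower frequency becomes the right child; on a tie the tree with
        // the earlier symbol (which is `first`) becomes the left child.
        bool tie = first->freq == second->freq;
        std::shared_ptr<Tree> left = tie ? first : second;
        std::shared_ptr<Tree> right = tie ? second : first;
        char up = std::min(first->symb, second->symb);
        forest.push_back(std::make_shared<Tree>(
            Tree{first->freq + second->freq, up, left, right}));
    }
    std::map<char, std::string> codes;
    collect(forest[0], "", codes);
    return codes;
}
```

Called with the frequencies A=15, H=20, M=10, F=50, N=1, U=4, this yields the codes listed in the table (F=1, H=01, A=000, M=0010, U=00110, N=00111). Re-sorting the forest on every round is quadratic but keeps the tie-breaking logic easy to follow for six symbols.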
b. Using your tree, encode the following string. If you find extra bits at the end of the string just
ignore them. [5 points]
HUFFMAN = 010011011001000000111
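The encoding in part (b) is a table lookup per symbol followed by concatenation. A minimal sketch, using the code table from this problem; the function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Concatenates the codeword of every character in `text`.
std::string encode(const std::string& text,
                   const std::map<char, std::string>& codes) {
    std::string bits;
    for (char c : text) {
        std::map<char, std::string>::const_iterator it = codes.find(c);
        assert(it != codes.end());  // every symbol must have a codeword
        bits += it->second;
    }
    return bits;
}

// Code table copied from the table in part ii.
std::map<char, std::string> examCodeTable() {
    std::map<char, std::string> codes;
    codes['A'] = "000";
    codes['H'] = "01";
    codes['M'] = "0010";
    codes['F'] = "1";
    codes['N'] = "00111";
    codes['U'] = "00110";
    return codes;
}
```

Encoding "HUFFMAN" gives 01 00110 1 1 0010 000 00111, which is the 21-bit string 010011011001000000111.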
c. What is the average code length of your Huffman code? [5 points]
0.15*3 + 0.2*2 + 0.1*4 + 0.5*1 + 0.01*5 + 0.04*5 = 2 bits per symbol
d. Fill in the final column table above with a frequency distribution which would lead to a better
theoretical minimum expected length per coded symbol (i.e., has a lower entropy) than the
current frequency distribution. Hint: The theoretical minimum expected length per coded symbol
was referred to as Lave in your book and in class, but you don’t necessarily need to remember the
exact formula to get the right answer here. You can assume any distribution over the frequency
distribution. [5 points]
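Part (d) can be checked numerically by computing the entropy of a distribution, which is the theoretical minimum expected length the question refers to. The sketch below assumes the usual Shannon entropy in bits, with the convention that symbols of probability 0 contribute nothing; `entropyBits` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Shannon entropy in bits: sum of -p * log2(p) over all p > 0.
double entropyBits(const std::vector<double>& probs) {
    double h = 0.0;
    for (double p : probs) {
        assert(p >= 0.0 && p <= 1.0);
        if (p > 0.0) h -= p * std::log2(p);
    }
    return h;
}
```

The distribution in the final column (1, 0, 0, 0, 0, 0) has entropy 0, the lowest possible value, while the original distribution (0.15, 0.2, 0.1, 0.5, 0.01, 0.04) has an entropy a little below the 2 bits per symbol achieved by the Huffman code in part (c).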
5. C++ Concepts [15 points]
a. Consider the following implementation of a node in the Huffman Tree
#ifndef HCNODE_HPP
#define HCNODE_HPP
class HCNode {
public:
  HCNode* parent;     // pointer to parent; null if root
  HCNode* child0;     // pointer to "0" child; null if leaf
  HCNode* child1;     // pointer to "1" child; null if leaf
  unsigned char symb; // symbol
  int count;          // count/frequency of symbols in subtree
  bool operator<(HCNode const &) const;
};
#endif
bool HCNode::operator<(HCNode const & other) const {
  if(count != other.count)
    return count > other.count;
  return symb < other.symb;
}
Now consider the following code snippet:
HCNode n1, n2, n3, n4;
n1.count = 200; n1.symb = 'A';
n2.count = 100; n2.symb = 'B';
n3.count = 100; n3.symb = 'C';
For the above code snippet, what does each of the expressions given below evaluate to? Choose
TRUE or FALSE: [4 points]
i) n1 < n2 TRUE (the counts differ and 200 > 100)
ii) n3 < n1 FALSE (the counts differ and 100 > 200 is false)
II) Explain why the less-than operator was overloaded in the HCNode class. [1 point]
See version B
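The effect of the reversed comparison can be seen by placing nodes in a `std::priority_queue`, which always exposes the element that compares greatest under `operator<`. The sketch below is an illustration under assumptions: `Node` is a simplified stand-in for `HCNode`, `firstPopped` is a hypothetical helper, and the use of a priority queue is a common design for Huffman tree construction rather than something this exam states.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Simplified stand-in for HCNode with the same comparison logic.
struct Node {
    int count;
    unsigned char symb;
    bool operator<(Node const& other) const {
        if (count != other.count) return count > other.count;  // reversed on purpose
        return symb < other.symb;
    }
};

// Returns the symbol at the top of a priority queue holding `nodes`.
unsigned char firstPopped(const std::vector<Node>& nodes) {
    assert(!nodes.empty());
    std::priority_queue<Node> pq;
    for (std::size_t i = 0; i < nodes.size(); ++i) pq.push(nodes[i]);
    return pq.top().symb;
}
```

Because a higher count compares as "less", the queue surfaces a node with the lowest count first. With A (200), B (100) and C (100) the top is C: both B and C beat A on count, and between the two equal counts C compares greater on symbol.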
b. Show the contents of the array ‘arr’ before and after line 4 of the given code is
executed. [10 points]
int arr[6]={1,5,7,9,2,21}; //line 1
int* p = arr+3; // line 2
int &ra = *(p+1); //line 3
ra = 50; //line 4
p = arr;
Before: 1 5 7 9 2 21
After: 1 5 7 9 50 21
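The answer can be confirmed by wrapping the snippet in a function that returns the final array contents. `runSnippet` is a hypothetical wrapper; the five statements inside it are the ones from the question.

```cpp
#include <array>
#include <cassert>

std::array<int, 6> runSnippet() {
    int arr[6] = {1, 5, 7, 9, 2, 21};  // line 1
    int* p = arr + 3;                  // line 2: p points at arr[3] (value 9)
    int& ra = *(p + 1);                // line 3: ra is an alias for arr[4] (value 2)
    ra = 50;                           // line 4: writes 50 into arr[4]
    p = arr;                           // moving p afterwards does not affect ra or arr
    assert(p == arr);
    std::array<int, 6> out{};
    for (int i = 0; i < 6; ++i) out[i] = arr[i];
    return out;
}
```

Only index 4 changes, because a reference stays bound to the object it was initialized with even after the pointer used to form it is reassigned.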