Daily Coding Problem
Alex Miller and Lawrence Wu
Copyright © 2019, Alex Miller and Lawrence Wu
All rights reserved. No part of this book may be reproduced in any form or by any
electronic or mechanical means, including information storage and retrieval systems,
without permission in writing from the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted
by copyright law. For more information, email the authors at founders@dailycodingproblem.com.
SECOND EDITION
ISBN 978-1-7932966-3-4
Printed in the United States of America
Cover Design by Irene Zou
10 9 8 7 6 5 4 3 2 1
Alex: For Mom, Dad, Jordan, Hannah, and Zaddy
Lawrence: For Mama and Papa
Contents

I   Data Structures

1   Arrays
    1.1  Get product of all other elements
    1.2  Locate smallest window to be sorted
    1.3  Calculate maximum subarray sum
    1.4  Find number of smaller elements to the right

2   Strings
    2.1  Find anagram indices
    2.2  Generate palindrome pairs
    2.3  Print zigzag form
    2.4  Determine smallest rotated string

3   Linked Lists
    3.1  Reverse linked list
    3.2  Add two linked lists that represent numbers
    3.3  Rearrange a linked list to alternate high-low
    3.4  Find intersecting nodes of linked lists

4   Stacks and Queues
    4.1  Implement a max stack
    4.2  Determine whether brackets are balanced
    4.3  Compute maximum of k-length subarrays
    4.4  Reconstruct array using +/- signs

5   Hash Tables
    5.1  Implement an LRU cache
    5.2  Cut brick wall
    5.3  Implement a sparse array

6   Trees
    6.1  Count unival trees
    6.2  Reconstruct tree from pre-order and in-order traversals
    6.3  Evaluate arithmetic tree
    6.4  Get tree level with minimum sum

7   Binary Search Trees
    7.1  Find floor and ceiling
    7.2  Convert sorted array to BST
    7.3  Construct all BSTs with n nodes

8   Tries
    8.1  Implement autocomplete system
    8.2  Create PrefixMapSum class
    8.3  Find Maximum XOR of element pairs

9   Heaps
    9.1  Compute the running median
    9.2  Find most similar websites
    9.3  Generate regular numbers
    9.4  Build a Huffman tree

10  Graphs
    10.1  Determine if a cycle exists
    10.2  Remove edges to create even trees
    10.3  Create stepword chain
    10.4  Beat Snakes and Ladders
    10.5  Topological sort

11  Advanced Data Structures
    11.1  Fenwick tree
    11.2  Disjoint-set data structure
    11.3  Bloom filter

II  Algorithms

12  Recursion
    12.1  Tower of Hanoi
    12.2  Implement regular expressions
    12.3  Find array extremes efficiently
    12.4  Play Nim

13  Dynamic Programming
    13.1  Number of ways to climb a staircase
    13.2  Number of ways to decode a string
    13.3  Painting houses

14  Backtracking
    14.1  Compute flight itinerary
    14.2  Solve Sudoku
    14.3  Count Android unlock combinations

15  Sorting and Searching
    15.1  Dutch flag problem
    15.2  Pancake sort
    15.3  Efficiently sort a million integers
    15.4  Find minimum element in rotated sorted array

16  Pathfinding
    16.1  Dijkstra's algorithm
    16.2  Bellman-Ford
    16.3  Floyd-Warshall

17  Bit Manipulation
    17.1  Find element that appears once in list
    17.2  Implement division without / or * operators
    17.3  Compute longest consecutive string of ones in binary
    17.4  Find nth sevenish number

18  Randomized Algorithms
    18.1  Pick random element from infinite stream
    18.2  Shuffle deck of cards
    18.3  Markov chain

19  Advanced Algorithms
    19.1  Rabin-Karp
    19.2  Hierholzer's algorithm
    19.3  A* search

III Applications

20  Applications
    20.1  Ghost
    20.2  Connect 4
    20.3  Cryptarithmetic
    20.4  Cheapest itinerary
    20.5  Alien dictionary
    20.6  Prime numbers
    20.7  Crossword puzzles
    20.8  UTF-8 encodings
    20.9  Blackjack

IV  Design

21  Data Structure Design
    21.1  Dictionary with time key
    21.2  Queue with fixed-length array
    21.3  Quack

22  System Design
    22.1  Crawl Wikipedia
    22.2  Design a hit counter
    22.3  What happens when you visit a URL?

Glossary
About the authors
Alex Miller is a software engineer who has interviewed hundreds of candidates on
behalf of Yelp, Pinterest, and Intuit. He currently works as an expert interviewer
at Karat. He has extensive experience teaching algorithms, data structures, and
mathematics. Alex holds a degree in mathematics from Wesleyan University.
Lawrence Wu is a software engineer who has worked at Google, Twitter, and Yelp
and has gained deep insight into each company's hiring practices. Most recently,
he has worked at Lyft on their self-driving division. Lawrence holds a degree in
computer science from the University of Toronto.
About this book
Hello, and thanks for purchasing this book!
You may have bought this book because you are studying for an upcoming interview.
Or possibly you anticipate having a coding interview in the future. Or perhaps you
are simply interested in improving your knowledge of algorithms and data structures
by working through a set of problems! Whatever your use case, we hope you enjoy it.
The questions in this book have been chosen with practicality, clarity, and self-improvement in mind. Each one is based on a real question that was asked recently
by top tech companies. The problems and explanations were then carefully edited so
that each one communicates a key idea that you can apply more generally. Finally,
we have organized these problems into chapters by topic, to ensure that you can
methodically build up your skills in specific areas.
At the beginning of each chapter we provide a crash course on the topic that follows,
but this book is no substitute for years of coding experience or a course in computer
science. As such, we assume that readers have experience with at least one programming language, and have a basic familiarity with concepts such as graphs, recursion,
and hash tables.
The structure of this book is as follows. First, we introduce you to the most essential
data structures that pop up in coding interviews, such as linked lists, arrays, strings,
and hash tables. For each data structure, we offer a refresher on its advantages and
disadvantages, the time and space complexities of its operations, its implementation,
and what themes and key words to look for in order to recognize it.
Next, we take a tour through a series of must-know algorithms, including dynamic
programming, backtracking, sorting, and searching. At the start of each chapter,
we discuss when it is a good idea to use each algorithm, and walk through a simple
example to describe step by step how it is performed. We examine patterns one can
identify to figure out which algorithm to apply in a given problem, and finally we
look at a few specialized algorithms that require combining multiple approaches.
Third, we present a set of more advanced problems that require you to use the preceding data structures and algorithms in novel ways in order to solve real-world
applications. From deriving a perfect blackjack strategy to deciphering an alien dictionary, these questions are designed to challenge you and widen your understanding
of what can be achieved with the right concepts and implementation.
Lastly, we address the topic of design. Interviewers like to gauge the ability of
candidates to understand tradeoffs between different approaches. As a result, it is not
uncommon to see problems with time and space constraints that require formulating
novel data structures. It has also become increasingly frequent for candidates to be
asked to design a high-level system that meets a particular need. Our final chapters
on data structure and system design walk through each of these question types,
respectively, and provide a general strategy for approaching similar problems in the
future.
Before you jump in, we offer some general advice for working through the problems.
1. First, really try to solve each problem, even if it seems challenging, and even
if you aren't sure where to begin! Look for key words in the question, and
write out some pseudocode to get things started. Use the topic groupings to
your advantage: review the chapter introduction to see how a particular data
structure or algorithm may be relevant. This process of brainstorming and
engaging with the question will help to build your problem solving muscle
memory.
2. After giving the problem your best shot, read through the solution, looking
for not just how the algorithm works but why. What are the core concepts,
and to what other problems might they apply? About an hour later, and then
a week after, try to implement it again, this time without the solution.
3. Finally, stay positive! Many of these concepts are challenging, and without
practice these data structures and algorithms may not seem intuitive. For example, dynamic programming, a now-common technique, was first developed
by top mathematicians during the Cold War to optimize military operations!
Rather than trying to understand it all at once, put aside some time each day
to work through one or two problems, and let the material sink in gradually.
Good luck!
Part I
Data Structures
Arrays
Arrays are without a doubt the most fundamental data structure in computer science.
Under the hood, an array is represented as a fixed-size, contiguous block of memory
with O(1) time to store and access an element. Because of this efficiency, many other
data structures frequently use arrays for their implementation, such as strings, stacks,
queues, and hash tables.
You can picture an array as a bounded row of labeled containers, starting at 0, where
you can quickly put items in, take them out, or look up a value from an index (or
label).
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
  0   1   2   3   4   5   6   7
For example, in the diagram above, we have an array of length 8. We can set and
get the value at index 2 (the third container) in constant time using the following
operations:
array[2] = 'foo'
x = array[2]  # 'foo'
However, arrays do have a few limitations. Looking up an element by value
typically requires traversing the entire array, unless it is sorted in some way.
Deleting an element from an array means that all subsequent elements have to be
shifted left by one, leading to an O(n) time operation. If possible, it is better to
overwrite the value. Similarly, inserting an element early in the array requires the
rest of the elements to be shifted right, so this should be done sparingly.
Finally, arrays have a fixed bound, which means they may not be suitable for applications where the size of the collection of elements is not known ahead of time. In an
interview setting, you should be careful of off-by-one errors that lead to trying to
access an element outside the range of the array.
Python does not have native support for arrays; typically, you'll use the list data
structure, which dynamically resizes under the hood. What this means is that to you,
the developer, it seems like the list is unbounded. In reality, as the list grows, the
data structure may allocate a larger (typically twice the current size) array, copy all
its elements to the larger one, and then use that as the underlying array.
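The doubling scheme described above can be sketched as a minimal class. This is a simplified illustration of the idea, not how CPython actually implements list (the name DynamicArray and its methods are our own):

```python
class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, item):
        if self.size == self.capacity:
            # Allocate a larger array (twice the current size)
            # and copy the existing elements over.
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.size] = item
        self.size += 1

    def get(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self.data[index]
```

Because the expensive copy happens only when the capacity is exhausted, each append is O(1) amortized even though an individual append can take O(n).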
In this chapter, we'll look at some common interview questions involving arrays and
strategies for solving them. Let's get started!
1.1 Get product of all other elements
Given an array of integers, return a new array such that each element at index i of
the new array is the product of all the numbers in the original array except the one
at i.
For example, if our input was [1, 2, 3, 4, 5], the expected output would be [120,
60, 40, 30, 24]. If our input was [3, 2, 1], the expected output would be [2, 3, 6].
Follow-up: What if you can't use division?
Solution
This problem would be easy with division: an optimal solution could just find the
product of all numbers in the array and then divide by each of the numbers. To
solve this without division, we will rely on a common technique in array problems:
precomputing results from subarrays, and building up a solution from these results.
First, note that to find the value associated with the ith element, we must compute
the product of all numbers before i and the product of all numbers after i. If we
could efficiently calculate these two, we could then simply multiply them to get our
desired product.

In order to find the product of numbers before i, we can generate a list of prefix
products. Specifically, the ith element of this list will be the product of all numbers
up to and including i. Similarly, we can generate a list of suffix products. Finally, for each index
we can multiply the appropriate prefix and suffix values to obtain our solution.
def products(nums):
    # Generate prefix products.
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products.
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    suffix_products = list(reversed(suffix_products))

    # Generate result from the product of prefixes and suffixes.
    result = []
    for i in range(len(nums)):
        if i == 0:
            result.append(suffix_products[i + 1])
        elif i == len(nums) - 1:
            result.append(prefix_products[i - 1])
        else:
            result.append(prefix_products[i - 1] * suffix_products[i + 1])
    return result
This runs in O(n) time and space, since iterating over the input array takes O(n)
time and the prefix and suffix arrays take up O(n) space.
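For reference, the division-based approach mentioned at the start of the solution might look like the sketch below (the function name is ours). Note that it breaks if the array contains a zero, which is one practical reason to prefer the prefix/suffix method:

```python
def products_with_division(nums):
    # Multiply everything together, then divide out each element.
    # Assumes no element is zero.
    total = 1
    for num in nums:
        total *= num
    return [total // num for num in nums]

# products_with_division([1, 2, 3, 4, 5]) returns [120, 60, 40, 30, 24]
```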
1.2 Locate smallest window to be sorted
Given an array of integers that are out of order, determine the bounds of the smallest
window that must be sorted in order for the entire array to be sorted. For example,
given [3, 7, 5, 6, 9], you should return (1, 3).
Solution
One method we can try is to first find out what the array elements would look like
when sorted. For example, [3, 7, 5, 6, 9], after sorting, becomes [3, 5, 6, 7,
9]. We can see that the first and last elements remain unchanged, whereas the middle
elements are altered. Therefore, it suffices to take the first and last altered elements
as our window.
def window(array):
    left, right = None, None
    s = sorted(array)
    for i in range(len(array)):
        if array[i] != s[i] and left is None:
            left = i
        elif array[i] != s[i]:
            right = i
    return left, right
This solution takes O(n log n) time and O(n) space, since we create a sorted copy of
the original array.
Often when dealing with arrays, a more efficient algorithm can be found by looping
through the elements and computing a running minimum, maximum, or count. Let's
see how we can apply this here.
Suppose instead that we traversed the array, from left to right, and took note of
whether each element was less than the maximum seen up to that point. This
element would have to be part of the sorting window, since we would have to move
the maximum element past it.
As a result, we can take the last element that is less than the running maximum, and
use it as our right bound. Similarly, for our left bound, we can traverse the array from
right to left, and find the last element that exceeds the running minimum.
This will take two passes over the array, operating in O(n) time and O(1) space.
def window(array):
    left, right = None, None
    n = len(array)
    max_seen, min_seen = -float("inf"), float("inf")

    for i in range(n):
        max_seen = max(max_seen, array[i])
        if array[i] < max_seen:
            right = i

    for i in range(n - 1, -1, -1):
        min_seen = min(min_seen, array[i])
        if array[i] > min_seen:
            left = i

    return left, right
1.3 Calculate maximum subarray sum
Given an array of numbers, find the maximum sum of any contiguous subarray of
the array. For example, given the array [34, -50, 42, 14, -5, 86], the maximum
sum would be 137, since we would take elements 42, 14, -5, and 86. Given the array
[-5, -1, -8, -9], the maximum sum would be 0, since we would choose not to
take any elements.
Do this in O (n) time.
Follow-up: What if the elements can wrap around? For example, given [8, -1, 3,
4], return 15, as we choose the numbers 3, 4, and 8, where the 8 is obtained by
wrapping around.
Solution
The brute force approach here would be to iterate over every contiguous subarray
and calculate its sum, keeping track of the largest one seen.
def max_subarray_sum(arr):
    current_max = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr) + 1):
            current_max = max(current_max, sum(arr[i:j]))
    return current_max
This would run in O(n³) time. How can we make this faster?
We can work backwards from our desired solution by iterating over the array and
looking at the maximum possible subarray that can be made ending at each index.
For each index, we can either include the corresponding element in our sum or
exclude it.
As we iterate over our array, we can keep track of the maximum subarray we've seen
so far in a variable called max_so_far. Whenever we find a larger subarray ending at
a given index, we update this variable.
def max_subarray_sum(arr):
    max_ending_here = max_so_far = 0
    for x in arr:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far
This algorithm is known as Kadane's algorithm, and it runs in O(n) time and O(1)
space.
We split the follow-up problem into two parts. The first part is the same as before:
finding the maximum subarray sum that doesn't wrap around. Next, we compute the
maximum subarray sum that does wrap around, and take the maximum of the two.
To get the largest wrap-around sum, we can use a little trick. For any subarray that
wraps around, there must be some contiguous elements that are excluded, and these
elements actually form the minimum possible subarray! Therefore, we can first find
the minimum subarray sum using exactly the method above, and subtract this from
the array's total.
For example, in the example above, the minimum subarray is [-1], with a total of
-1. We then subtract this from the array total, 14, to get 15.
def maximum_circular_subarray(arr):
    max_subarray_sum_wraparound = sum(arr) - min_subarray_sum(arr)
    return max(max_subarray_sum(arr), max_subarray_sum_wraparound)

def max_subarray_sum(arr):
    max_ending_here = max_so_far = 0
    for x in arr:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far

def min_subarray_sum(arr):
    min_ending_here = min_so_far = 0
    for x in arr:
        min_ending_here = min(x, min_ending_here + x)
        min_so_far = min(min_so_far, min_ending_here)
    return min_so_far
This takes O(n) time and O(1) space.
1.4 Find number of smaller elements to the right
Given an array of integers, return a new array where each element in the new array
is the number of smaller elements to the right of that element in the original input
array.
For example, given the array [3, 4, 9, 6, 1], return [1, 1, 2, 1, 0], since:
• There is 1 smaller element to the right of 3
• There is 1 smaller element to the right of 4
• There are 2 smaller elements to the right of 9
• There is 1 smaller element to the right of 6
• There are no smaller elements to the right of 1
Solution
A naive solution for this problem would simply be to create a new array, and for each
element count all the smaller elements to the right of it.
def smaller_counts_naive(lst):
    result = []
    for i, num in enumerate(lst):
        count = sum(val < num for val in lst[i + 1:])
        result.append(count)
    return result
This takes O(n²) time. Can we do this any faster?
To speed this up, we can try the following idea:
• Iterate backwards over the input list
• Maintain a sorted list seen of the elements we've seen so far
• Look at seen to see where the current element would fit in
The index will be how many elements on the right are smaller.
import bisect

def smaller_counts(lst):
    result = []
    seen = []
    for num in reversed(lst):
        i = bisect.bisect_left(seen, num)
        result.append(i)
        bisect.insort(seen, num)
    return list(reversed(result))
Now this only takes O(n log n) time and O(n) space.
Strings
Strings are an unavoidable part of programming. Every word in this sentence, and
indeed this whole book, can be considered a string! As a result, there's a good chance
you'll be asked a question involving strings in an interview.
Behind the scenes, the contents of a string are typically stored in a read-only sequential
array in memory, meaning that strings are immutable. In other words, you can reassign
a string variable to a new value, but you cannot change a particular character in the
underlying array.
The most common operations performed on strings are indexing to get a particular
character or substring, joining two strings together by concatenation, and splitting
by a delimiter.
Expect to be asked about string rotations, reversals, prefixes and suffixes, and sorting.
We'll explore these topics in the following questions.
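As a quick refresher, here is how these operations look in Python (the example values are arbitrary):

```python
s = "interview"

# Strings are immutable: s[0] = "I" would raise a TypeError.
# Instead, build a new string and rebind the variable.
s = "I" + s[1:]              # "Interview"

reversed_s = s[::-1]         # "weivretnI"
rotated = s[3:] + s[:3]      # rotate left by 3: "erviewInt"
parts = "a,b,c".split(",")   # ["a", "b", "c"]
joined = "-".join(parts)     # "a-b-c"
```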
2.1 Find anagram indices
Given a word w and a string s, find all indices in s which are the starting locations
of anagrams of w. For example, given w is ab and s is abxaba, return [0, 3, 4].
Solution
The brute force solution here would be to go over each word-sized window in s and
check if it forms an anagram, like so:
from collections import Counter

def is_anagram(s1, s2):
    return Counter(s1) == Counter(s2)

def anagram_indices(word, s):
    result = []
    for i in range(len(s) - len(word) + 1):
        window = s[i:i + len(word)]
        if is_anagram(window, word):
            result.append(i)
    return result
In the above code, we use Python's built-in Counter collection, which when applied
to a word forms a dictionary whose keys are characters and whose values are their
respective counts.
This would take O(w × s) time, where w is the length of the word and s is the length
of the input string. Can we make this any faster?
When approaching any string question, using a hash table should be at the tip of
your fingers as a potential strategy. Notice that at each window we are recomputing
the frequency counts of the entire window, when only a small part of it actually is
updated. If we could efficiently update these frequency counts for each substring,
our algorithm would be much quicker.
This insight leads us to the following strategy. First, we make a frequency dictionary
of both the initial window and the target word. As we move along the string, we
increment the count of each new character and decrement the count of the old. If at
any point there is no difference between the frequencies of the target word and the
current window, we add the corresponding starting index to our result.
from collections import defaultdict

def del_if_zero(dict, char):
    if dict[char] == 0:
        del dict[char]

def anagram_indices(word, s):
    result = []
    freq = defaultdict(int)

    for char in word:
        freq[char] += 1

    for char in s[:len(word)]:
        freq[char] -= 1
        del_if_zero(freq, char)

    if not freq:
        result.append(0)

    for i in range(len(word), len(s)):
        start_char, end_char = s[i - len(word)], s[i]
        freq[start_char] += 1
        del_if_zero(freq, start_char)
        freq[end_char] -= 1
        del_if_zero(freq, end_char)
        if not freq:
            beginning_index = i - len(word) + 1
            result.append(beginning_index)

    return result
This runs in O(s) time and space.
2.2 Generate palindrome pairs
Given a list of words, find all pairs of unique indices such that the concatenation of
the two words is a palindrome.
For example, given the list ["code", "edoc", "da", "d"], return [(0, 1), (1,
0), (2, 3)].
Solution
Here we see an example where taking a closer look at substrings yields a significant
improvement in efficiency.
One algorithm we can try is to check each possible word pair for palindromicity and
add their indices to the result:
def is_palindrome(word):
    return word == word[::-1]

def palindrome_pairs(words):
    result = []
    for i, word1 in enumerate(words):
        for j, word2 in enumerate(words):
            if i == j:
                continue
            if is_palindrome(word1 + word2):
                result.append((i, j))
    return result
This takes O(n² × c) time, where n is the number of words and c is the length of
the longest word.
To speed this up, we can insert all words into a dictionary and then check the
dictionary for each word's prefixes and suffixes.
Our dictionary will map each word to its index in the list. If the reverse of a word's
prefix/suffix is in the dictionary and its corresponding suffix/prefix is palindromic,
we add it to our list of results. For example, say we're looking at the word aabc. We
check the following prefixes:
• Since "" is a palindrome, we look for cbaa in the dictionary. If we find it, then
we can make cbaaaabc.
• Since a is a palindrome, we look for cba in the dictionary. If we find it, then
we can make cbaaabc.
• Since aa is a palindrome, we look for cb in the dictionary. If we find it, then
we can make cbaabc.
• Since aab and aabc are not palindromes, we don't do anything.
We do the same thing for the suffixes.
def is_palindrome(word):
    return word == word[::-1]

def palindrome_pairs(words):
    d = {}
    for i, word in enumerate(words):
        d[word] = i

    result = []
    for i, word in enumerate(words):
        for char_i in range(len(word)):
            prefix, suffix = word[:char_i], word[char_i:]
            reversed_prefix = prefix[::-1]
            reversed_suffix = suffix[::-1]
            if (is_palindrome(suffix) and
                    reversed_prefix in d):
                if i != d[reversed_prefix]:
                    result.append((i, d[reversed_prefix]))
            if (is_palindrome(prefix) and
                    reversed_suffix in d):
                if i != d[reversed_suffix]:
                    result.append((d[reversed_suffix], i))
    return result
This changes the time complexity to O(n × c²). Since we will likely be constrained
more by the number of words than the number of characters, this seems like a
significant improvement.
2.3 Print zigzag form
Given a string and a number of lines k, print the string in zigzag form. In zigzag,
characters are printed out diagonally from top left to bottom right until reaching
the kth line, then back up to top right, and so on.

For example, given the sentence "thisisazigzag" and k = 4, you should print:

t     a     g
 h   s z   a
  i i   i z
   s     g
Solution
One of the most natural things to do with a string is to print it, so you should be
prepared for questions that ask you to print strings in odd ways, such as this one.
One way to solve this would be to go one line at a time, figuring out what that line
should be, and printing it. An advantage of this method would be that we would
only need O(n) space at any given time, where n is the length of a line. Let's see
how this would work.
For the zigzag pattern above, we can see that the letters in the top and bottom lines
are each separated by 5 spaces. What about the middle lines? Here it is trickier:
it depends on whether the pattern is descending or ascending. When we are descending,
we should put 3 spaces after a letter in the second line and 1 space after a letter in
the third line, but if we are ascending the reverse is true.
Let's try to clear up this confusion by looking at what happens with 5 lines, using
the sentence "thisisanotherzigzag":

t       o       z
 h     n t     g a
  i   a   h   i   g
   s s     e z
    i       r
Here as we move from top to bottom there are 7, 5, 3, and 1 spaces added after each
letter, and the same is true when we go from bottom to top.
So if row is the current row we're on, desc represents whether or not we are descending, and k is the number of lines, we can predict the number of tacked-on spaces
using the following formula:
def get_spaces(row, desc, k):
    max_spaces = (k - 1) * 2 - 1
    if desc:
        spaces = max_spaces - row * 2
    else:
        spaces = max_spaces - (k - 1 - row) * 2
    return spaces
This presents us with another challenge: how do we know whether or not the pattern
is descending? Note that if we have five rows, we will be descending for the first four
indices, ascending for the next four, and so on. This can be represented mathematically
like so:
def is_descending(index, k):
    # Check whether the index is more or less than halfway
    # through its oscillation back to the starting point.
    return index % (2 * (k - 1)) < k - 1
Putting these together, our algorithm will create a list of spaces for each row. After
placing the row's first character in this list at the appropriate index, it will
check whether the pattern is ascending or descending, find out how many spaces are
needed, and move to the next index. When we get to the end of the row, we print it
out, and repeat this process for subsequent rows.
def zigzag(sentence, k):
    n = len(sentence)
    for row in range(k):
        i = row
        line = [" " for _ in range(n)]
        while i < n:
            line[i] = sentence[i]
            desc = is_descending(i, k)
            spaces = get_spaces(row, desc, k)
            i += spaces + 1
        print("".join(line))
Even though is_descending and get_spaces are constant-time operations, we still
need to join and print each line of the string, which will take O(n) time, so the
whole algorithm will be O(k × n).
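As a sanity check, the same pattern can be produced with different bookkeeping: scan the sentence once and bounce a row pointer up and down between 0 and k - 1. This is an alternative formulation of our own, not the approach above, and it uses O(k × n) space rather than O(n) per line:

```python
def zigzag_rows(sentence, k):
    # Place each character in its row at its original column,
    # bouncing the row pointer between 0 and k - 1.
    lines = [[" "] * len(sentence) for _ in range(k)]
    row, step = 0, 1
    for col, char in enumerate(sentence):
        lines[row][col] = char
        if row == 0:
            step = 1
        elif row == k - 1:
            step = -1
        row += step
    return ["".join(line) for line in lines]

# zigzag_rows("thisisazigzag", 4)[0] is "t     a     g"
```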
2.4 Determine smallest rotated string
You are given a string of length n and an integer k. The string can be manipulated
by taking one of the first k letters and moving it to the end of the string.

Write a program to determine the lexicographically smallest string that can be created
after an unlimited number of moves.

For example, suppose we are given the string daily and k = 1. The best we can
create in this case is ailyd.
Solution
Sorting strings is something we will revisit in the chapter on sorting, but this question
gives us a glimpse into the type of operations that are helpful.
We can break this problem down into two cases.
First, consider the case where k = 1. Here we are only allowed to rotate the string,
so we can simply choose the alphabetically earliest rotation.

Now suppose k > 1. This situation is a bit trickier, as it seems we must figure out
which of the first k items to move at each step. However, it turns out that there is a
series of moves that allows us to effectively swap any two letters.
We can understand these moves by looking at the general example of converting
xxabxx to xxbaxx. In the table below, each string represents the newly formed result
of the preceding transformation.

String    Transformation
xxabxx    Move all x to end, one at a time
abxxxx    Move b to end
axxxxb    Move a to end
xxxxba    Move x to end, one at a time, until reaching initial position
xxbaxx    Swapped
In code, this would look like the following:
def bubble_swap(string, i, j):
    string = list(string)

    # Rotate so that i is at the beginning.
    while i > 0:
        string = string[1:] + string[:1]
        i -= 1

    # Move the first two letters to the end in reversed order.
    string = string[:1] + string[2:] + string[1:2]
    string = string[1:] + string[:1]

    # Rotate back to the initial position.
    while len(string) > j + 1:
        string = string[1:] + string[:1]
        j += 1

    return ''.join(string)
As indicated by the name, this operation is essentially the same as that used for
bubble sort. Therefore, so long as we are allowed to move either the first or second
letter, we can always obtain a fully sorted string.
Our full solution, then, will be to return the alphabetically earliest rotation if k = 1,
and otherwise the sorted string.
def get_best_word(string, k):
    string = list(string)
    if k == 1:
        best = string
        for i in range(1, len(string)):
            if string[i:] + string[:i] < best:
                best = string[i:] + string[:i]
        return ''.join(best)
    else:
        return ''.join(sorted(string))
In the first case, our algorithm loops through n rotations and compares two strings
of length n, for a time complexity of O(n²). The space required will be O(n), the
size of our two string variables.

For the latter, sorting our string will take O(n log n) time, and building the new
string will require O(n) space.
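As a sanity check, here is a compact, self-contained restatement of the same logic, using min over all rotations rather than an explicit loop, exercised on a hypothetical input of our own choosing:

```python
def get_best_word(string, k):
    # With k == 1 we may only rotate; otherwise we can fully sort.
    if k == 1:
        return min(string[i:] + string[:i] for i in range(len(string)))
    return ''.join(sorted(string))

print(get_best_word("daily", 1))  # ailyd: the earliest of the five rotations
print(get_best_word("daily", 2))  # adily: the fully sorted string
```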
Linked Lists
One way you can think of a linked list is as a music playlist, where each item contains
the song to be played and a "next song" button. In this abstract playlist, you cannot
play any song you want; to play a given song you must play through all the songs
before it first.
There are two main kinds of linked lists. Singly linked lists only contain a pointer to
the next node, typically called next, and are implemented as follows:
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next
Linked lists are a recursive data structure: the type of next is another linked list
node. Because of this, linked lists have no fixed size like arrays do: a new node can
be initialized and appended to a linked list on the fly.
Doubly linked lists, meanwhile, have pointers to the previous and next nodes. They
take up more space, but allow you to traverse backwards. The implementation for a
doubly linked list looks like this:
class Node:
    def __init__(self, data, next=None, prev=None):
        self.data = data
        self.next = next
        self.prev = prev
Returning to the analogy above, a doubly linked list would mean that each song has
both a "previous song" and "next song" button.
Common operations on linked lists include searching, appending, prepending, and
removing nodes. You should be able to quickly write an implementation of each of
these in an interview.
In an interview setting, you should be prepared to answer questions about traversing
and reversing linked lists, as well as rearranging their nodes. Let's try out a few
problems!
3.1
Reverse linked list
Given the head of a singly linked list, reverse it in-place.
Solution
Reversing a linked list is a classic interview question that can be surprisingly tricky
if you've never tried it before, so this problem merits close attention.
First, let's consider a few cases. What if the linked list has just one element, such
as 15? Then we only need to return that node, since it is already reversed. Now how
about if the linked list has two elements? Here, we need to rearrange the pointers so
that the tail becomes the head, and the head becomes the tail.
To solve this problem more generally, we must think recursively. We will explain
recursion more fully in a later chapter, but since the concept pops up frequently in
linked list questions it will be useful to work through an example here.
For any recursive solution we require a base case and an inductive case. What are the
base cases here? For an empty linked list, we should return null, since it is trivially
already reversed. Similarly, for a linked list with one element, we need only return that
one element.
Now let's consider a linked list with an arbitrary number of elements. How can we
reverse it, assuming we can reverse smaller linked lists in place? Suppose we take the
head node and store it in a variable somewhere. Then, we can take the head's next,
which is a smaller linked list, and recursively reverse that; let's call this smaller
reversed linked list s. Finally, the head must be reattached to the end of s, after which
we return the head of s.

In other words: let s be the tail of the linked list (the list without the head),
recursively call reverse on the tail, and then reattach the old head to the end.

Note that we have created a helper function to return both the head and the tail of
the linked list, to simplify our logic.
def reverse(node):
    # _reverse() reverses and returns both head and tail.
    # Conventionally, an underscore denotes an unused variable.
    head, _ = _reverse(node)
    return head

def _reverse(node):
    if node is None:
        return None, None
    if node.next is None:
        return node, node

    # Reverse rest of linked list and move node to after tail.
    head, tail = _reverse(node.next)
    node.next = None
    tail.next = node
    return head, node
This runs in O(n) time, which is optimal: we cannot reverse a list without traversing
all its elements at least once. However, it also runs in O(n) space, since each
call to _reverse adds to our call stack.
Ideally, what we would like to do is update the list as we traverse it. As we do so,
we will need to change each node so that it points to the node that came before it,
instead of the one after. To help us implement this, we will use a technique that is
very common in linked list problems: iterating over the list with two pointers.
In particular, we will maintain two pointers, current and prev. As the current
pointer traverses the list, prev follows one node behind. At each step, we will set
the current node's next to point to the previous node, and then move both pointers
forward. Finally, we return the last node.
def reverse(head):
    prev, current = None, head
    while current is not None:
        # Make current node point to prev and move both forward one.
        tmp = current.next
        current.next = prev
        prev = current
        current = tmp
    return prev
Our new and improved solution now only uses constant space!
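To convince ourselves, here is a self-contained run of the iterative version, with the Node class repeated from earlier in the chapter so the snippet stands alone:

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def reverse(head):
    prev, current = None, head
    while current is not None:
        tmp = current.next
        current.next = prev
        prev = current
        current = tmp
    return prev

# Build 1 -> 2 -> 3, reverse it, and collect the values.
head = reverse(Node(1, Node(2, Node(3))))
values = []
while head:
    values.append(head.data)
    head = head.next
print(values)  # [3, 2, 1]
```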
3.2 Add two linked lists that represent numbers
We can represent an integer in a linked list format by having each node represent a
digit in the number. The nodes are connected in reverse order, such that the number
54321 is represented by the linked list 1 → 2 → 3 → 4 → 5.

Given two linked lists in this format, return their sum.

For example, given 9 → 9 (representing 99) and 5 → 2 (representing 25), you should
return 124 (99 + 25) as 4 → 2 → 1.
Solution
In a problem such as this one, knowing how to iterate through a linked list gets you
halfway to a solution. More concretely, we can add two numbers using the same
process as elementary grade school addition: adding the least significant digits with
a carry.
We'll start at the head of the two nodes, and compute the sum of both values modulo
10. We write this down, move both nodes forward, and add a carry if the sum was
10 or greater. A tricky part here is finding the terminating condition. We can see
that this happens when there is no more carry and the two linked lists have reached
the end. Once this happens, we extend the nodes one more step until they are both None
and carry is 0, and then return None.
def add(node0, node1, carry=0):
    if not node0 and not node1 and not carry:
        return None

    node0_val = node0.data if node0 else 0
    node1_val = node1.data if node1 else 0
    total = node0_val + node1_val + carry

    node0_next = node0.next if node0 else None
    node1_next = node1.next if node1 else None
    carry_next = 1 if total >= 10 else 0

    return Node(total % 10, add(node0_next, node1_next, carry_next))
This will run in O(m + n) time, where m and n are the lengths of the two linked
lists.
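Here is the function exercised end to end on the 99 + 25 example, with the Node class repeated so the snippet runs standalone:

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def add(node0, node1, carry=0):
    if not node0 and not node1 and not carry:
        return None
    node0_val = node0.data if node0 else 0
    node1_val = node1.data if node1 else 0
    total = node0_val + node1_val + carry
    node0_next = node0.next if node0 else None
    node1_next = node1.next if node1 else None
    carry_next = 1 if total >= 10 else 0
    return Node(total % 10, add(node0_next, node1_next, carry_next))

result = add(Node(9, Node(9)), Node(5, Node(2)))  # 99 + 25
digits = []
while result:
    digits.append(result.data)
    result = result.next
print(digits)  # [4, 2, 1], i.e. 124 with least significant digit first
```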
3.3
Rearrange a linked list to alternate high-low
Given a linked list, rearrange the node values such that they appear in alternating
low → high → low → high → ... form.

For example, given 1 → 2 → 3 → 4 → 5, you should return 1 → 3 → 2 → 5 → 4.
Solution
Let's take a look at the example input and see if we can derive an algorithm. One
straightforward method is to examine each consecutive pair of nodes, and perform a
swap if they do not alternate as required. For the example above, we would carry out
the following steps:
• 1 < 2? Yes, proceed with 1 → 2
• 2 > 3? No, swap these values to end up with 1 → 3 → 2
• 2 < 4? Yes, proceed with 1 → 3 → 2 → 4
• 4 > 5? No, swap these values to end up with 1 → 3 → 2 → 5 → 4
In order to implement this, we must know at any given time whether a node's value
should be less than or greater than that of its successor. To do this we can use a
variable that is True at even nodes, and False at odd ones.
def alternate(ll):
    even = True
    cur = ll

    while cur.next:
        if cur.data > cur.next.data and even:
            cur.data, cur.next.data = cur.next.data, cur.data
        elif cur.data < cur.next.data and not even:
            cur.data, cur.next.data = cur.next.data, cur.data
        even = not even
        cur = cur.next

    return ll
While this works, the use of even is somewhat inelegant. Note that in order for the
node values to alternate in this way, it must be true that every odd node's value is
greater than its preceding and succeeding values. So an alternative algorithm would
be to check every other node, and perform the following swaps:
• If the previous node's value is greater, swap the current and previous values.
• If the next node's value is greater, swap the current and next values.
Instead of using a variable for parity, we can use two pointers that jump forward two
steps after each check.
def alternate(ll):
    prev = ll
    cur = ll.next

    while cur:
        if prev.data > cur.data:
            prev.data, cur.data = cur.data, prev.data
        if not cur.next:
            break
        if cur.next.data > cur.data:
            cur.next.data, cur.data = cur.data, cur.next.data
        prev = cur.next
        cur = cur.next.next

    return ll
Both of these algorithms use O(n) time and O(1) space, since we must traverse the
entire linked list, and we are only tracking one or two nodes at a time.
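A quick self-contained check of the two-pointer version on the example from the problem statement (Node as defined earlier in the chapter):

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def alternate(ll):
    prev = ll
    cur = ll.next
    while cur:
        if prev.data > cur.data:
            prev.data, cur.data = cur.data, prev.data
        if not cur.next:
            break
        if cur.next.data > cur.data:
            cur.next.data, cur.data = cur.data, cur.next.data
        prev = cur.next
        cur = cur.next.next
    return ll

head = alternate(Node(1, Node(2, Node(3, Node(4, Node(5))))))
values = []
while head:
    values.append(head.data)
    head = head.next
print(values)  # [1, 3, 2, 5, 4]
```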
3.4
Find intersecting nodes of linked lists
Given two singly linked lists that intersect at some point, find the intersecting node.
Assume the lists are non-cyclical.
For example, given A = 3 → 7 → 8 → 10 and B = 99 → 1 → 8 → 10, return the
node with value 8. In this example, assume nodes with the same value are the exact
same node objects.
Do this in O(m + n) time (where m and n are the lengths of the lists) and constant
space.
Solution
To handle this problem, we will again use our favorite linked list tactic: iterating
with two pointers. Let's start by first ignoring the time and space constraints, in
order to get a better grasp of the problem.
Naively, we could iterate through one of the lists and add each node to a set. Then
we could iterate over the other list and check each node to see if it is in the set, and
return the first node present in the set. This takes O (m + n) time and also O (m + n)
space (since we don't know initially which list is longer). How can we reduce the
amount of space we need?
We can get around the space constraint with the following trick: first, get the length
of both lists. Find the difference between the two, and then keep two pointers at the
head of each list. Move the pointer of the larger list up by the difference, and then
move the pointers forward in conjunction until they match.
def length(head):
    if not head:
        return 0
    return 1 + length(head.next)

def intersection(a, b):
    m, n = length(a), length(b)
    cur_a, cur_b = a, b

    if m > n:
        for _ in range(m - n):
            cur_a = cur_a.next
    else:
        for _ in range(n - m):
            cur_b = cur_b.next

    while cur_a != cur_b:
        cur_a = cur_a.next
        cur_b = cur_b.next

    return cur_a
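Exercised on the example above, where the 8 → 10 tail is a single chain of nodes shared by both lists:

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def length(head):
    if not head:
        return 0
    return 1 + length(head.next)

def intersection(a, b):
    m, n = length(a), length(b)
    cur_a, cur_b = a, b
    if m > n:
        for _ in range(m - n):
            cur_a = cur_a.next
    else:
        for _ in range(n - m):
            cur_b = cur_b.next
    while cur_a != cur_b:
        cur_a = cur_a.next
        cur_b = cur_b.next
    return cur_a

shared = Node(8, Node(10))
a = Node(3, Node(7, shared))
b = Node(99, Node(1, shared))
print(intersection(a, b).data)  # 8
```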
Stacks and Queues
When you find yourself needing to frequently add and remove items from a list,
stacks and queues are two data structures that you should consider.
To understand how a stack works, imagine a literal stack of cafeteria trays. Adding
a new tray to the top and removing the top one can be done quickly, whereas it is
difficult (read: not allowed) to take trays from the middle. This property is known
by the shorthand "last in, first out", or LIFO.
The traditional names for these operations, as well as a method for checking the value
of the top "tray", are given in the following implementation, in which all methods
are O(1):
class Stack:
    def __init__(self):
        self.stack = []

    def push(self, x):
        # Add an item to the stack.
        self.stack.append(x)

    def pop(self):
        # Remove and return the top element.
        return self.stack.pop()

    def peek(self):
        # Look at the top element without removing it.
        return self.stack[-1]
Note that a pop operation on an empty stack will result in an exception, unless there
is proper error handling.
In the above implementation we have used a Python list as the underlying data
structure, meaning the size of the stack will grow and shrink as necessary.
Alternatively, we could have used a linked list, so that new elements would be added
to, and removed from, the tail of the existing chain.
A queue, on the other hand, can be thought of as a group of people standing in line,
perhaps waiting to buy this book. Each person enters the line from the back, and
leaves in exactly the order that they entered it, a property known as "first in, first
out", or FIFO.
Queues are commonly implemented as linked lists, where we enqueue an item by
adding a tail node, and dequeue an item by removing the head node and moving our
head pointer forward.
In a double-ended queue, one can efficiently append and remove items to either side
of the list. Python provides the built-in collections. deque library for this purpose,
which uses the following API:
from collections import deque

queue = deque()
queue.append(4)
queue.append(5)
queue.appendleft(6)
print(queue)  # deque([6, 4, 5])

queue.popleft()  # 6
queue.pop()      # 5
print(queue)     # deque([4])
The append and popleft operations above are more traditionally called enqueue and
dequeue, so in the following questions we will frequently use the latter terminology.
Along with pop and appendleft, these operations run in O(1) time.
When the most recent item examined is the most important, a stack is frequently a
good choice. For this reason stacks often feature in depth-first search, backtracking,
and syntax parsing applications.
When the order of the items you are dealing with needs to be preserved, on the
other hand, a queue is preferable. Queues can be found, for example, in breadth-first
search, buffers, and scheduling applications.
4.1
Implement a max stack
Implement a stack that has the following methods:
• push(val): push val onto the stack.
• pop(): pop off and return the topmost element of the stack. If there are no
elements in the stack, throw an error.
• max(): return the maximum value currently in the stack. If there are no elements
in the stack, throw an error.
Each method should run in constant time.
Solution
Implementing the stack part (push and pop) of this problem is easy - we can just
use a typical list to implement the stack with append and pop. However, getting the
max in constant time is a little trickier. We could do this in linear time if we popped
off everything on the stack while keeping track of the maximum value, and then put
everything back on.
To accomplish this in constant time, we can use a secondary stack that only keeps
track of the max values at any time. It will have the exact same number of elements
as our primary stack at any point in time, but the top of the stack will always contain
the maximum value of the stack.
We can then, when pushing, check whether the element we're pushing is greater than the
max value of the secondary stack (by just looking at the top), and if it is, push
that instead. If not, we push the previous maximum again.
class MaxStack:
    def __init__(self):
        self.stack = []
        self.maxes = []

    def push(self, val):
        self.stack.append(val)
        if self.maxes:
            self.maxes.append(max(val, self.maxes[-1]))
        else:
            self.maxes.append(val)

    def pop(self):
        if self.maxes:
            self.maxes.pop()
        return self.stack.pop()

    def max(self):
        return self.maxes[-1]
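In use, with some hypothetical values of our own:

```python
class MaxStack:
    def __init__(self):
        self.stack = []
        self.maxes = []

    def push(self, val):
        self.stack.append(val)
        if self.maxes:
            self.maxes.append(max(val, self.maxes[-1]))
        else:
            self.maxes.append(val)

    def pop(self):
        if self.maxes:
            self.maxes.pop()
        return self.stack.pop()

    def max(self):
        return self.maxes[-1]

s = MaxStack()
s.push(3)
s.push(5)
s.push(2)
print(s.max())  # 5: the max survives pushing a smaller value
s.pop()
print(s.max())  # 5
s.pop()
print(s.max())  # 3: popping 5 restores the earlier max
```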
4.2 Determine whether brackets are balanced
Given a string of round, curly, and square opening and closing brackets, return
whether the brackets are balanced (well-formed).
For example, given the string "([])[]({})", you should return true.

Given the string "([)]" or "((()", you should return false.
Solution
Let's start with a simplified case of the problem: dealing with only round brackets.
Notice that in this case, we only need to keep track of the current number of opening
brackets - each closing bracket should be matched with the rightmost opening
bracket. So we can keep a counter and increment it for every opening bracket we
see and decrement it on every closing bracket. If we get to the end of the string and
our counter is non-zero, our brackets will be unbalanced. A negative number would
indicate more closing brackets than opening ones, and a positive number would
indicate the opposite.
In the case of round, curly, and square brackets, we need to also keep track of what
kind of brackets they are, because we can't match a round opening bracket with a
curly or square closing bracket. A stack is an ideal data structure for this type of
problem, since it can keep track of which depth level we're at. We'll use a stack to
keep track of the actual characters. We push onto it when we encounter an opening
bracket, and pop whenever we encounter a matching closing bracket. If the stack is
empty or it's not the correct matching bracket, then we will return False. If at the
end of the iteration we have something left over in the stack, our input is sure to be
unbalanced. As a result we can return a boolean value representing whether or not
our stack is empty.
def balance(s):
    stack = []
    for char in s:
        if char in ["(", "[", "{"]:
            stack.append(char)
        else:
            # Check character is not unmatched.
            if not stack:
                return False
            # Char is a closing bracket; check whether the top of the
            # stack matches.
            if (char == ")" and stack[-1] != "(") or \
               (char == "]" and stack[-1] != "[") or \
               (char == "}" and stack[-1] != "{"):
                return False
            stack.pop()
    return len(stack) == 0
This takes O(n) time and space, where n is the number of characters in s.
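Running the function on the examples from the problem statement:

```python
def balance(s):
    stack = []
    for char in s:
        if char in ["(", "[", "{"]:
            stack.append(char)
        else:
            # A closing bracket with nothing to match is unbalanced.
            if not stack:
                return False
            # The top of the stack must be the matching opening bracket.
            if (char == ")" and stack[-1] != "(") or \
               (char == "]" and stack[-1] != "[") or \
               (char == "}" and stack[-1] != "{"):
                return False
            stack.pop()
    return len(stack) == 0

print(balance("([])[]({})"))  # True
print(balance("([)]"))        # False
print(balance("((()"))        # False
```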
Fun fact: "(())" is not a palindrome, nor is "()()". "())(" is a palindrome, though.

4.3 Compute maximum of k-length subarrays
Given an array of integers and a number k, where 1 ≤ k ≤ the length of the array,
compute the maximum values of each subarray of length k.
For example, let's say the array is [10, 5, 2, 7, 8, 7] and k = 3. We should get
[10, 7, 8, 8], since:

• 10 = max(10, 5, 2)
• 7 = max(5, 2, 7)
• 8 = max(2, 7, 8)
• 8 = max(7, 8, 7)
Do this in O(n) time and O(k) space. You can modify the input array in-place and
you do not need to store the results. You can simply print them out as you compute
them.
Solution
Let's first write out a naive solution: we can simply take each subarray of length k
and compute its maximum.

def max_of_subarrays(lst, k):
    for i in range(len(lst) - k + 1):
        print(max(lst[i:i + k]))
This takes O(n x k) time, which doesn't quite get us to where we want. How can
we make this faster?
Notice that, for example, for the input [1, 2, 3, 4, 5, 6, 7, 8, 9] and k = 3,
after evaluating the max of the first range, since 3 is at the end, we only need to check
whether 4 is greater than 3. If it is, then we can print 4 immediately, and if it isn't,
we can stick with 3.

On the other hand, for the input [9, 8, 7, 6, 5, 4, 3, 2, 1] and k = 3, after
evaluating the max of the first range, we can't do the same thing, since we can't use 9
again. We have to look at 8 instead, and then once we move on to the next range, we
have to look at 7.
These two data points suggest an idea: we can keep a double-ended queue with max
size k and only keep what we need to evaluate in it. That is, if we see [ 1, 3, 5],
then we only need to keep [ 5], since we know that 1 and 3 cannot possibly be the
maxes.
So what we can do is maintain an ordered list of indices, where we only keep the
elements we care about. That is, we will maintain the loop invariant that our queue
is always ordered so that we only keep the maximal indices.
It will help to go over an example. Consider our test input: [10, 5, 2, 7, 8, 7]
and k = 3. Our queue at each step would look like this (recall that these are really
indices):

First, we preprocess the first k elements to get our window to size k:

Process 10. We add it to the queue.
Process 5. 5 is smaller than 10, so we keep 10 at the front.
Process 2. 2 is smaller than 5, so we keep 10 and 5 in the list.

Then we begin our main loop:

Print first element in queue and dequeue it: 10
Process 7. 7 is bigger than 2 and 5, so we pop them off.
Print first element in queue and dequeue: 7
Process 8. 8 is bigger than 7, so we pop it off.
Print first element in queue and dequeue: 8
Process 7. 7 is smaller than 8, so we keep 8 in the queue.
Print first element in queue and dequeue: 8
Done!
Here is how we could implement this in Python.
from collections import deque

def max_of_subarrays(lst, k):
    q = deque()
    for i in range(k):
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)

    # Loop invariant: q is a list of indices whose
    # corresponding values are in descending order.
    for i in range(k, len(lst)):
        print(lst[q[0]])
        while q and q[0] <= i - k:
            q.popleft()
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)
    print(lst[q[0]])
We've now achieved our desired O(n) time and O(k) space!
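For testing, it can be handy to return the maxima instead of printing them. The following variant (our own small modification, not part of the original solution) does just that:

```python
from collections import deque

def max_of_subarrays(lst, k):
    q = deque()
    result = []
    # Preprocess the first k elements, keeping only useful indices.
    for i in range(k):
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)

    for i in range(k, len(lst)):
        result.append(lst[q[0]])
        # Drop indices that have fallen out of the current window.
        while q and q[0] <= i - k:
            q.popleft()
        # Drop indices whose values can never be a window max again.
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)
    result.append(lst[q[0]])
    return result

print(max_of_subarrays([10, 5, 2, 7, 8, 7], 3))  # [10, 7, 8, 8]
```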
4.4
Reconstruct array using +/- signs
The sequence [0, 1, ..., N] has been jumbled, and the only clue you have for its
order is an array representing whether each number is larger or smaller than the last.
Given this information, reconstruct an array that is consistent with it. For example,
given [None, +, +, -, +], you could return [0, 1, 3, 2, 4].
Solution
Notice that if there are no negative signs in our input, we can return the original
sequence [0, 1, ..., N]. Furthermore, if we have just one run of consecutive
negatives, we can reverse the corresponding entries in the original sequence to
produce a decreasing run of numbers. For example, given [None, +, -, -, -] we
can reverse the last four entries of [0, 1, 2, 3, 4] to get [0, 4, 3, 2, 1].
We can extend this trick to more complicated input, matching plus signs with
elements of our original sequence and reversing subsequences whenever there is a
run of minus signs.
To keep track of which numbers are being reversed, we can use a stack. As we traverse
the input array, we keep track of the corresponding elements of the original sequence.
For a run of positive signs, we keep the elements from the original sequence. For
a run of negative signs, we push those elements onto the stack. When the run
of negatives ends, we can pop those elements off one by one to get a decreasing
subsequence in our answer.
Since there is one fewer + or - sign than there are elements in our answer, the last
element must be generated separately. Additionally, a run of negative signs can end
with either a positive sign or the end of the input array, so we must make sure to
empty the stack at the end.
def reconstruct(array):
    answer = []
    n = len(array) - 1
    stack = []

    for i in range(n):
        if array[i + 1] == '-':
            stack.append(i)
        else:
            answer.append(i)
            while stack:
                answer.append(stack.pop())

    stack.append(n)
    while stack:
        answer.append(stack.pop())

    return answer
This algorithm runs in O(n) time and space, since in the worst case we are filling up
a stack the same size as our original array. It may seem as if there is a higher time
complexity because of our inner loop over the stack, but the total number of items
we pop off is bounded by the size of the input array.
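We can check the solution against the two sign patterns discussed above:

```python
def reconstruct(array):
    answer = []
    n = len(array) - 1
    stack = []
    for i in range(n):
        if array[i + 1] == '-':
            # Defer this element; it belongs to a decreasing run.
            stack.append(i)
        else:
            answer.append(i)
            while stack:
                answer.append(stack.pop())
    # The last element has no sign of its own.
    stack.append(n)
    while stack:
        answer.append(stack.pop())
    return answer

print(reconstruct([None, '+', '+', '-', '+']))  # [0, 1, 3, 2, 4]
print(reconstruct([None, '+', '-', '-', '-']))  # [0, 4, 3, 2, 1]
```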
Hash Tables
A hash table is a crucial tool to keep in your data structure arsenal. Simply put, hash
tables associate keys with values using a hash function, allowing for O(1) lookup,
insert, and delete times.
You may be wondering, what's the catch? For one, not everything can be hashed. It
is necessary that keys be immutable, so for example Python lists cannot be used as
keys. Additionally, under the hood there may be a lot of work needed to implement
a rigorous hash function. However, in an interview setting there aren't too many
downsides: if you see an opportunity to use a hash table, use it.
In Python, hash tables are closely tied to dictionaries, and use the following syntax:
d = {}
d['key'] = 'value'
print(d['key'])  # 'value'

del d['key']
print(d['key'])  # KeyError: 'key'

if 'key' in d:
    print(d['key'])
else:
    print("key doesn't exist")
Note from above that if a key does not exist in a dictionary, simply trying to get the
value will cause a KeyError.
The last few lines show one way of getting around this. In the solutions that follow,
we will instead use defaultdict from the collections library, which allows you to
pass in a callable when declaring a dictionary to set the default value for each key.
A common motivating example for using hash tables is the two-sum problem, stated
as follows:
Given a list of numbers and a number k, return whether any two numbers from the
list add up to k. For example, given [10, 15, 3, 7] and k = 17, we should return
True, since 10 + 7 = 17.
Instead of a brute force solution which checks all pairs of integers to search for this
total, we can use the following strategy. For each value we come across, we store it in
a hash table with the value True. We then check if the key k - value exists in the
table, and if so we can return True.
def two_sum(lst, k):
    seen = {}
    for num in lst:
        if k - num in seen:
            return True
        seen[num] = True
    return False
This implementation cuts our time complexity down from O(n²) to O(n), since
each lookup is O(1).
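In action on the example input, plus an edge case worth noting: because we check the table before inserting the current number, a single element is never paired with itself.

```python
def two_sum(lst, k):
    seen = {}
    for num in lst:
        if k - num in seen:
            return True
        seen[num] = True
    return False

print(two_sum([10, 15, 3, 7], 17))  # True, since 10 + 7 = 17
print(two_sum([10, 15, 3, 7], 19))  # False: no pair sums to 19
print(two_sum([5], 10))             # False: a lone 5 cannot be used twice
```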
As the problem above demonstrates, if an interviewer asks you to make a solution
more efficient, a dictionary should be the first tool you look for.
Let's now look at some other uses of hash tables.
5.1 Implement an LRU cache
Implement an LRU (Least Recently Used) cache. The cache should be able to be
initialized with cache size n, and provide the following methods:
• set(key, value): set key to value. If there are already n items in the cache
and we are adding a new item, also remove the least recently used item.
• get(key): get the value at key. If no such key exists, return null.
Each operation should run in O (1) time.
Solution
Hash tables are commonly used as the underlying data structure for a cache. This
problem gives us a taste of how this can be done.
To implement both these methods in constant time, we'll need to combine our hash
table with a linked list. The hash table will map keys to nodes in the linked list, and
the linked list will be ordered from least recently used to most recently used.
The logic behind each function will be as follows.
For set:
• First look at our current capacity. If it's less than n, create a node with the
given value, set it as the head, and add it as an entry in the dictionary.
• If it's equal to n, add our node as usual, but also evict the least recently used
node. We can achieve this by deleting the head of our linked list and removing
the entry from our dictionary.
We'll need to keep track of the key in each node so that we know which entry to
evict.
For get:
• If the key doesn't exist in our dictionary, return null.
• Otherwise, look up the relevant node in the dictionary. Before returning it,
update the linked list by moving the node to the tail of the list.
To keep our logic clean, we will implement a LinkedList helper class that allows us
to reuse methods when adding and removing nodes. In particular, we will use the
add and remove methods of this class when bumping a node to the back of the list
after fetching it.
In the end, the code would look like this:
class Node:
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.prev = None
        self.next = None

class LinkedList:
    def __init__(self):
        # Create dummy nodes and set up head <-> tail.
        self.head = Node(None, 'head')
        self.tail = Node(None, 'tail')
        self.head.next = self.tail
        self.tail.prev = self.head

    def get_head(self):
        return self.head.next

    def get_tail(self):
        return self.tail.prev

    def add(self, node):
        prev = self.tail.prev
        prev.next = node
        node.prev = prev
        node.next = self.tail
        self.tail.prev = node

    def remove(self, node):
        prev = node.prev
        nxt = node.next
        prev.next = nxt
        nxt.prev = prev

class LRUCache:
    def __init__(self, n):
        self.n = n
        self.dict = {}
        self.list = LinkedList()

    def set(self, key, val):
        if key in self.dict:
            self.list.remove(self.dict[key])
        n = Node(key, val)
        self.list.add(n)
        self.dict[key] = n
        if len(self.dict) > self.n:
            head = self.list.get_head()
            self.list.remove(head)
            del self.dict[head.key]

    def get(self, key):
        if key in self.dict:
            n = self.dict[key]
            # Bump to the back of the list by removing and adding the node.
            self.list.remove(n)
            self.list.add(n)
            return n.val
All of these operations run in 0(1) time.
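As a cross-check (not the book's linked-list approach), Python's collections.OrderedDict, which remembers insertion order and supports O(1) moves to either end, lets us sketch the same behavior much more compactly:

```python
from collections import OrderedDict

class LRUCache:
    # A compact alternative built on OrderedDict.
    def __init__(self, n):
        self.n = n
        self.d = OrderedDict()

    def set(self, key, value):
        if key in self.d:
            del self.d[key]
        self.d[key] = value
        if len(self.d) > self.n:
            # Evict the least recently used entry (the front of the dict).
            self.d.popitem(last=False)

    def get(self, key):
        if key not in self.d:
            return None
        # Bump to most recently used.
        self.d.move_to_end(key)
        return self.d[key]

cache = LRUCache(2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')         # Bump 'a', so 'b' is now least recently used.
cache.set('c', 3)      # Evicts 'b'.
print(cache.get('b'))  # None
print(cache.get('a'))  # 1
```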
5.2 Cut brick wall
A wall consists of several rows of bricks of various integer lengths and uniform height.
Your goal is to find a vertical line going from the top to the bottom of the wall that
cuts through the fewest number of bricks. If the line goes through the edge between
two bricks, this does not count as a cut.
For example, suppose the input is as follows, where values in each row represent the
lengths of bricks in that row:
[[3, 5, 1, 1],
 [2, 3, 3, 2],
 [5, 5],
 [4, 4, 2],
 [1, 3, 3, 3],
 [1, 1, 6, 1, 1]]
The wall would then look like this, with | marking the edges between bricks:

|_____|_________|_|_|
|___|_____|_____|___|
|_________|_________|
|_______|_______|___|
|_|_____|_____|_____|
|_|_|___________|_|_|
The best we can do here is to draw a line at a horizontal distance of eight from the
left edge, which will only require cutting through the bricks in the third and fifth
rows.
Given an input consisting of brick lengths for each row such as the one above, return
the fewest number of bricks that must be cut to create a vertical line.
Solution
At first glance we might consider testing each vertical line to see how many bricks it
would have to cut through. However, given the structure of our input, each line will
require us to accumulate the values of each row of bricks, which will be both messy
and time-consuming. If the length of the wall is m and there are n total bricks, this
will take O(m x n).
Let's reframe this with a little help from our friend the hash table. Instead of thinking
about how to minimize the number of cuts, we can try to maximize the number of
times a line can pass through an edge between two bricks.
To do this, we will examine each row and increment a counter in a hash table for the
accumulated distance covered after each brick, except for the last one. For example,
after the first row in the input above, our map would contain {3: 1, 8: 1, 9: 1}.
The key in the dictionary with the largest value represents the vertical line with the
fewest bricks cut. Finally, to find the actual number of bricks, we can subtract this
value from the total number of rows.
from collections import defaultdict

def fewest_cuts(wall):
    cuts = defaultdict(int)

    for row in wall:
        length = 0
        for brick in row[:-1]:
            length += brick
            cuts[length] += 1

    return len(wall) - max(cuts.values())
For each brick, we only need to update our dictionary once, so this algorithm will
take O(n) time. Our map will require O(m) space to store a value for each possible
edge.
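Verifying against the example wall (taking the first row as [3, 5, 1, 1] so that every row sums to the same width):

```python
from collections import defaultdict

def fewest_cuts(wall):
    cuts = defaultdict(int)
    for row in wall:
        length = 0
        # Skip the last brick: the wall's right edge is not a cut.
        for brick in row[:-1]:
            length += brick
            cuts[length] += 1
    return len(wall) - max(cuts.values())

wall = [[3, 5, 1, 1],
        [2, 3, 3, 2],
        [5, 5],
        [4, 4, 2],
        [1, 3, 3, 3],
        [1, 1, 6, 1, 1]]
print(fewest_cuts(wall))  # 2: the line at distance 8 misses four rows
```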
5.3 Implement a sparse array
You have a large array, most of whose elements are zero.
Create a more space-efficient data structure, SparseArray, that implements the
following interface:
• init(arr, size): initialize with the original large array and size.
• set(i, val): update index at i to be val.
• get(i): get the value at index i.
Solution
One advantage hash tables have over other data structures is that data need not be
sequential. If we wanted to set the 1000th value in a 1024-bit array, we would still
need to store zeroes for all the other bits.
With a hash table, however, we can cut down on space tremendously by only keeping
track of the non-zero values and their indices. We will use a dictionary that stores only
these values and defaults to zero if any key is not found.

We must also remember to check the bounds when setting or getting i, and to clean
up any indices if we're setting an index to zero again, to save space.
class SparseArray:
def __ init __ (self, arr, n):
self.n
=
n
self._dict
= {}
for i, e in enumerate(arr):
if e
!= 0:
self._dict[i]
=
e
def _check_bounds(self, i):
if i < 0 or i >= self.n:
raise IndexError('Out of bounds')
    def set(self, i, val):
        self._check_bounds(i)
        if val != 0:
            self._dict[i] = val
        elif i in self._dict:
            del self._dict[i]

    def get(self, i):
        self._check_bounds(i)
        return self._dict.get(i, 0)
Thanks to the magic of hash tables, our implementation uses space proportional only
to the number of non-zero elements in the array.
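A quick usage sketch (the sample array below is our own illustration), repeating the class so the snippet runs on its own:

```python
class SparseArray:
    def __init__(self, arr, n):
        self.n = n
        self._dict = {}
        for i, e in enumerate(arr):
            if e != 0:
                self._dict[i] = e

    def _check_bounds(self, i):
        if i < 0 or i >= self.n:
            raise IndexError('Out of bounds')

    def set(self, i, val):
        self._check_bounds(i)
        if val != 0:
            self._dict[i] = val
        elif i in self._dict:
            del self._dict[i]

    def get(self, i):
        self._check_bounds(i)
        return self._dict.get(i, 0)

# Only two of the eight slots are non-zero, so only two entries are stored.
arr = SparseArray([0, 0, 7, 0, 0, 0, 0, 3], 8)
print(arr.get(2))      # 7
print(arr.get(5))      # 0 (missing keys default to zero)
arr.set(2, 0)          # writing a zero deletes the entry entirely
print(len(arr._dict))  # 1
```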
Trees
Trees are among the most common topics asked about in software interviews. Fortunately, with a few key concepts under your belt you should be well prepared to
answer any of these questions.
You probably have an intuitive understanding of trees, as they pop up frequently
enough outside of computer science: family tree diagrams, grammar parsing, and
flow charts are some contexts in which they may be found. Here we present a slightly
more restricted definition: a tree is a recursive data structure consisting of a root
node (typically shown at the top) with zero or more child nodes, where each child
node acts as the root of a new tree.
For example, below is a binary tree rooted at 7. Binary here means simply that each
node is allowed to have at most two child nodes.
Note that we make no restriction at the moment as to the values of the tree. In the
next chapter, we will explore binary search trees, which impose a specific ordering on
the values of child and parent nodes.
Trees are directed and acyclic: the connections between parents and children always
flow downward, so that it is impossible to form a loop. Further, in contrast to a
typical family tree, two parents can never have the same child.
Common operations in tree questions involve:
• inserting, searching for, and deleting a particular node
• finding subtrees, or a subset of nodes that form their own tree
• determining the distance or relationship between two nodes
Typically to answer these questions you will need to perform a recursive tree traversal,
which comes in three flavors:
• in-order: Traverse left node, then current node, then right
• pre-order: Traverse current node, then left node, then right
• post-order: Traverse left node, then right node, then current
For the tree above, for example, the three traversals would generate the following
orders, respectively:
• [4, 10, 40, 7, -1, 5]
• [7, 10, 4, 40, 5, -1]
• [4, 40, 10, -1, 5, 7]
In the problems in this chapter we will explore the tradeoffs between these more
deeply.
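As a concrete sketch, each traversal can be implemented in a few lines. The tree below is the example rooted at 7, rebuilt from the orders listed above:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)

def preorder(node):
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)

def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]

# The example tree rooted at 7.
root = Node(7, Node(10, Node(4), Node(40)), Node(5, Node(-1)))

print(inorder(root))    # [4, 10, 40, 7, -1, 5]
print(preorder(root))   # [7, 10, 4, 40, 5, -1]
print(postorder(root))  # [4, 40, 10, -1, 5, 7]
```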
There are a few additional pieces of terminology you should be familiar with when
dealing with trees:
• A node A is called an ancestor of a node B if it can be found on the path
from the root to B.
• The height or depth of a tree is the length of the longest path from the root
to any leaf.
• A full binary tree is a binary tree in which every non-leaf node has exactly
two children.
• A complete binary tree is one in which all levels except for the bottom one
are full, and all nodes on the bottom level are filled in left to right.
To implement a tree, we typically begin by defining a Node class and then using it to
build a Tree class.
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
The implementation of a given tree will often depend on the tree's application, and
the particular traversal algorithm chosen. For an example, take a look at the binary
search tree definition in the following chapter.
As mentioned above, trees can represent a wide variety of objects: animal classification
schemas, an HTML document object model, moves in a chess game, or a Linux file
system are a few. In general when you are faced with hierarchical data, trees are a
great data structure to choose.
6.1 Count unival trees
A unival tree (which stands for "universal value") is a tree where all nodes under it
have the same value.
Given the root to a binary tree, count the number of unival subtrees.
For example, the following tree has 5 unival subtrees:
Solution
To start off, we should go through some examples.
This tree has 3 unival subtrees: the two a leaves, and the one b leaf. The b leaf
causes all its parents to not be counted as unival trees.
This tree has 5 unival subtrees: the leaf at c, and every b.
Let's begin by first writing a function that checks whether a tree is unival or not.
Then, perhaps we could use this to count up all the nodes in the tree.
To check whether a tree is a unival tree, we must check that every node in the tree
has the same value. To start off, we could define an is_unival function that takes in
a root to a tree. We would do this recursively with a helper function. Recall that a
leaf qualifies as a unival tree.
def is_unival(root):
    return unival_helper(root, root.value)

def unival_helper(root, value):
    if root is None:
        return True
    if root.value == value:
        return unival_helper(root.left, value) and \
               unival_helper(root.right, value)
    return False
And then our function that counts the number of subtrees could simply use that
function:
def count_unival_subtrees(root):
    if root is None:
        return 0
    left = count_unival_subtrees(root.left)
    right = count_unival_subtrees(root.right)
    return 1 + left + right if is_unival(root) else left + right
However, this runs in O(n^2) time. For each node of the tree, we're evaluating each
node in its subtree again as well. We can improve the runtime by starting at the
leaves of the tree, and keeping track of the unival subtree count as we percolate back
up. In this way we evaluate each node only once, making our algorithm run in O(n)
time.
def count_unival_subtrees(root):
    count, _ = helper(root)
    return count

# Return the number of unival subtrees, and a Boolean for
# whether the root is itself a unival subtree.
def helper(root):
    if root is None:
        return 0, True

    left_count, is_left_unival = helper(root.left)
    right_count, is_right_unival = helper(root.right)
    total_count = left_count + right_count

    if is_left_unival and is_right_unival:
        if root.left is not None and root.value != root.left.value:
            return total_count, False
        if root.right is not None and root.value != root.right.value:
            return total_count, False
        return total_count + 1, True

    return total_count, False
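To see the one-pass version in action, here is a runnable sketch; the tree below is our own example, chosen so that it contains exactly 5 unival subtrees.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def count_unival_subtrees(root):
    count, _ = helper(root)
    return count

def helper(root):
    if root is None:
        return 0, True
    left_count, is_left_unival = helper(root.left)
    right_count, is_right_unival = helper(root.right)
    total_count = left_count + right_count
    if is_left_unival and is_right_unival:
        if root.left is not None and root.value != root.left.value:
            return total_count, False
        if root.right is not None and root.value != root.right.value:
            return total_count, False
        return total_count + 1, True
    return total_count, False

# Four unival leaves, plus the subtree of three 1s, for 5 in total.
root = Node(0, Node(1), Node(0, Node(1, Node(1), Node(1)), Node(0)))
print(count_unival_subtrees(root))  # 5
```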
6.2 Reconstruct tree from pre-order and in-order traversals
Given pre-order and in-order traversals of a binary tree, write a function to reconstruct
the tree.
For example, given the following pre-order traversal:
[a, b, d, e, c, f, g]
And the following in-order traversal:
[d, b, e, a, f, c, g]
You should return the following tree:
Solution
Recall the definitions of pre-order and in-order traversals:
For pre-order:
• Evaluate root node
• Evaluate left child node recursively
• Evaluate right child node recursively
For in-order:
• Evaluate left child node recursively
• Evaluate root node
• Evaluate right child node recursively
Let's consider the given example again.
Notice that because we always evaluate the root node first in a pre-order traversal, the
first element in the pre-order traversal will always be the root. The second element
is then either the root of the left subtree, if there is one, or the root of the right
subtree. But how do we know which?
We can look at the in-order traversal.
Because we look at the left child node first in an in-order traversal, all the elements
up until the root will be part of the left subtree. All elements after the root will be
the right subtree.
Pre-order:

    a | b d e | c f g
    (root) (left subtree) (right subtree)

In-order:

    d b e | a | f c g
    (left subtree) (root) (right subtree)
This gives us an idea for how to solve the problem:
• Find the root by looking at the first element in the pre-order traversal
• Find out how many elements are in the left subtree and right subtree by
searching for the index of the root in the in-order traversal
• Recursively reconstruct the left subtree and right subtree
The code for this problem would look like this:
def reconstruct(preorder, inorder):
    if not preorder and not inorder:
        return None
    if len(preorder) == len(inorder) == 1:
        return preorder[0]

    # We assume that elements of the input lists are tree nodes.
    root = preorder[0]
    root_i = inorder.index(root)

    root.left = reconstruct(preorder[1:1 + root_i],
                            inorder[0:root_i])
    root.right = reconstruct(preorder[1 + root_i:],
                             inorder[root_i + 1:])
    return root
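Running this on the example traversals (using the Node class defined earlier in the chapter, with one node per letter) rebuilds the expected tree:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def reconstruct(preorder, inorder):
    if not preorder and not inorder:
        return None
    if len(preorder) == len(inorder) == 1:
        return preorder[0]

    root = preorder[0]
    root_i = inorder.index(root)
    root.left = reconstruct(preorder[1:1 + root_i], inorder[0:root_i])
    root.right = reconstruct(preorder[1 + root_i:], inorder[root_i + 1:])
    return root

# The example traversals, with one Node object per letter.
nodes = {c: Node(c) for c in 'abcdefg'}
tree = reconstruct([nodes[c] for c in 'abdecfg'],
                   [nodes[c] for c in 'dbeafcg'])

print(tree.data)                        # a
print(tree.left.data, tree.right.data)  # b c
```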
6.3 Evaluate arithmetic tree
Suppose an arithmetic expression is given as a binary tree. Each leaf is an integer
and each internal node is one of +, -, *, or /.
Given the root to such a tree, write a function to evaluate it.
For example, given the following tree:
You should return 45, as it is (3 + 2) * (4 + 5).
Solution
This type of tree is more formally known as an expression tree, and we can use a
form of post-order traversal to evaluate it. We start by checking the value of the root
node. If it is one of the four operators described above, we recursively find the value
of the node's left and right children and apply the operator to them.
If it is not an arithmetic operator, the node must contain a number, which we can
simply return.
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

PLUS = "+"
MINUS = "-"
TIMES = "*"
DIVIDE = "/"

def evaluate(root):
    if root.data == PLUS:
        return evaluate(root.left) + evaluate(root.right)
    elif root.data == MINUS:
        return evaluate(root.left) - evaluate(root.right)
    elif root.data == TIMES:
        return evaluate(root.left) * evaluate(root.right)
    elif root.data == DIVIDE:
        return evaluate(root.left) / evaluate(root.right)
    else:
        return root.data
This algorithm runs in O(n) time and O(h) space, since we must traverse all nodes
and our stack will hold at most a number of elements equal to the tree's height.
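Here is the example expression (3 + 2) * (4 + 5) built and evaluated end to end, repeating the definitions above so the snippet runs on its own:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

PLUS, MINUS, TIMES, DIVIDE = "+", "-", "*", "/"

def evaluate(root):
    if root.data == PLUS:
        return evaluate(root.left) + evaluate(root.right)
    elif root.data == MINUS:
        return evaluate(root.left) - evaluate(root.right)
    elif root.data == TIMES:
        return evaluate(root.left) * evaluate(root.right)
    elif root.data == DIVIDE:
        return evaluate(root.left) / evaluate(root.right)
    else:
        return root.data

# (3 + 2) * (4 + 5)
tree = Node(TIMES,
            Node(PLUS, Node(3), Node(2)),
            Node(PLUS, Node(4), Node(5)))
print(evaluate(tree))  # 45
```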
6.4 Get tree level with minimum sum
Given a binary tree, return the level of the tree that has the minimum sum. The level
of a node is defined as the number of connections required to get to the root, with
the root having level zero.
For example:
In this tree, level 0 has sum 1, level 1 has sum 5, and level 2 has sum 9, so the level
with the minimum sum is 0.
Solution
In order to calculate the sum of each level, we would like to iterate over this tree level
by level. Unfortunately, none of the traversal methods we discuss in the introduction
seem to achieve this. An in-order traversal might work for the example described
above, but it would fail if Node(2) had a child.
Instead, we can use a queue to ensure that each node is processed in the
correct order. More concretely, items in our queue will be tuples containing the value
and level for a given node. Each time we pop an element from the queue, we add
to the total for the corresponding level and add any child nodes to the end of the
queue, with an incremented level.
We will keep track of the sum in each level using a dictionary, allowing us to store
both positive and negative values.
from collections import defaultdict, deque

class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def smallest_level(root):
    queue = deque([])
    queue.append((root, 0))

    # Create a map to accumulate the sum for each level.
    level_to_sum = defaultdict(int)

    while queue:
        node, level = queue.popleft()
        level_to_sum[level] += node.data

        if node.right:
            queue.append((node.right, level + 1))
        if node.left:
            queue.append((node.left, level + 1))

    return min(level_to_sum, key=level_to_sum.get)
The time complexity for this function is O(n).
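For a runnable check, the tree below reproduces the example's level sums of 1, 5, and 9 (repeating the definitions above so the snippet is self-contained):

```python
from collections import defaultdict, deque

class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def smallest_level(root):
    queue = deque([(root, 0)])
    level_to_sum = defaultdict(int)
    while queue:
        node, level = queue.popleft()
        level_to_sum[level] += node.data
        if node.right:
            queue.append((node.right, level + 1))
        if node.left:
            queue.append((node.left, level + 1))
    return min(level_to_sum, key=level_to_sum.get)

# Level sums: 1 (root), then 2 + 3 = 5, then 4 + 5 = 9, so level 0 wins.
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(smallest_level(root))  # 0
```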
Binary Search Trees
A binary search tree, or BST, is a binary tree whose node values are guaranteed to
stay in sorted order; that is, an in-order traversal of its nodes will create a sorted list.
For example, here is a BST of integers rooted at 7:
Similar to how a sorted array offers more efficient search times over unsorted arrays,
BSTs provide several improvements over standard binary trees.
In particular, insert, find, and delete operations all run in O(h) time, where h is
the height of the tree. If an efficient implementation is used to maintain the height
of the tree around O(log n), where n is the number of nodes, then these operations
will all be logarithmic in n.
A simple Python BST can be written as follows:
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

class BST:
    def __init__(self):
        self.root = None

    def insert(self, x):
        if not self.root:
            self.root = Node(x)
        else:
            self._insert(x, self.root)

    def _insert(self, x, root):
        if x < root.data:
            if not root.left:
                root.left = Node(x)
            else:
                self._insert(x, root.left)
        else:
            if not root.right:
                root.right = Node(x)
            else:
                self._insert(x, root.right)

    def find(self, x):
        if not self.root:
            return False
        else:
            return self._find(x, self.root)

    def _find(self, x, root):
        if not root:
            return False
        elif x == root.data:
            return True
        elif x < root.data:
            return self._find(x, root.left)
        else:
            return self._find(x, root.right)
Note that, as is common in recursive implementations, we use a helper function to
properly define our insert and find methods.
The most common questions on binary search trees will ask you to search for elements,
add and remove elements, and determine whether a tree is indeed a BST. We'll cover
most of these in the problems to come.
7.1 Find floor and ceiling
Given a binary search tree, find the floor and ceiling of a given integer. The floor is
the highest element in the tree less than or equal to an integer, while the ceiling is
the lowest element in the tree greater than or equal to an integer.
If either value does not exist, return None.
Solution
Under the hood, this problem is closely related to that of checking whether an
element exists in a binary search tree.
In that case, we would proceed recursively, starting from the root and comparing
each node we hit to the value we are searching for. If the value is less than the node
data, we search the left child; if the value is greater, we search the right child. Finally,
if we reach a leaf node and still have not found the element, we return None.
This problem is not too different, in fact. Our recursive function will have two extra
parameters, floor and ceil, which will initially be defined as None. At each node,
we can update these parameters as follows:
• If value < node.data, we know the ceiling can be no greater than node.data
• If value > node.data, we know the floor can be no less than node.data
Once updated, we continue as before, calling our function recursively on the appropriate child node. At the end, when we reach a leaf node, we return the latest and
most accurate values for these parameters.
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def get_bounds(root, x, floor=None, ceil=None):
    if not root:
        return floor, ceil
    if x == root.data:
        return x, x
    elif x < root.data:
        floor, ceil = get_bounds(root.left, x, floor, root.data)
    elif x > root.data:
        floor, ceil = get_bounds(root.right, x, root.data, ceil)
    return floor, ceil
This algorithm requires a single traversal from the top to the bottom of the tree.
Therefore, the time complexity will be O(h), where h is the height of the tree. If
the tree is balanced, this is equal to O(log n). Similarly, the space complexity will be
O(h), since we will need to make space on the stack for each recursive call.
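A runnable sketch with a small BST of our own choosing:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def get_bounds(root, x, floor=None, ceil=None):
    if not root:
        return floor, ceil
    if x == root.data:
        return x, x
    elif x < root.data:
        floor, ceil = get_bounds(root.left, x, floor, root.data)
    elif x > root.data:
        floor, ceil = get_bounds(root.right, x, root.data, ceil)
    return floor, ceil

#        7
#       / \
#      3   10
#     / \
#    1   5
root = Node(7, Node(3, Node(1), Node(5)), Node(10))
print(get_bounds(root, 6))   # (5, 7)
print(get_bounds(root, 10))  # (10, 10): an exact match is its own bounds
print(get_bounds(root, 0))   # (None, 1): no element is <= 0
```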
7.2 Convert sorted array to BST
Given a sorted array, convert it into a height-balanced binary search tree.
Solution
If the tree did not have to be balanced, we could initialize the first element as the
root of the tree, and add each subsequent element as a right child.
However, as mentioned in the introduction, keeping a binary search tree balanced is
crucial to ensuring the efficiency of its operations.
Instead, since the list is sorted, we know that the root should be the element in
the middle of the list, which we can call M. Furthermore, the left subtree will be
equivalent to a balanced binary search tree created from the first M - 1 elements in
the list. Analogously, the right subtree can be constructed from the elements after
M in our input.
Therefore, we can create this tree recursively by calling our function on successively
smaller input ranges to create each subtree.
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def make_bst(array):
    if not array:
        return None

    mid = len(array) // 2

    root = Node(array[mid])
    root.left = make_bst(array[:mid])
    root.right = make_bst(array[mid + 1:])

    return root
This algorithm will take O(n) time and space, since for each element of the list,
we must construct a node and add it as a child to the tree, each of which can be
considered O(1) operations.
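Running this on a small sorted list (our own example) shows the middle element becoming the root, the in-order traversal round-tripping, and the result being perfectly balanced:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def make_bst(array):
    if not array:
        return None
    mid = len(array) // 2
    root = Node(array[mid])
    root.left = make_bst(array[:mid])
    root.right = make_bst(array[mid + 1:])
    return root

def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)

def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

tree = make_bst([1, 2, 3, 4, 5, 6, 7])
print(tree.data)      # 4
print(inorder(tree))  # [1, 2, 3, 4, 5, 6, 7]
print(height(tree))   # 3: the minimum possible height for 7 nodes
```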
7.3 Construct all BSTs with n nodes
Given an integer n, construct all possible binary search trees with n nodes where all
values from [ 1 , ... , n ] are used.
For example, given n = 3, return the following trees:
Solution
As with many tree problems, we can formulate this recursively. Let the range of
values that nodes in a tree can take be bounded by low and high. If the root of the
tree is i, the left subtree will hold data between low and i - 1, and the right subtree
will hold data between i + 1 and high.
For each possible value in the left subtree, we can choose one, say j, to be the root,
which will again determine what data its left and right children can hold. This process
continues until there are no more values to choose.
Here is what the left subtrees would look like if n = 5 and we start by choosing
i = 3 as our root node.
At the same time, an analogous process can be carried out to find all the possible
right subtrees.
Finally, for every possible root value, and every possible left and right subtree, we
create a node with this value and the corresponding left and right children, and add
this node to our list of possible trees.
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def make_trees(low, high):
    trees = []

    if low > high:
        trees.append(None)
        return trees

    for i in range(low, high + 1):
        left = make_trees(low, i - 1)
        right = make_trees(i + 1, high)
        for l in left:
            for r in right:
                node = Node(i, left=l, right=r)
                trees.append(node)

    return trees
To print out the tree, we can perform a preorder traversal.
def preorder(root):
    result = []
    if root:
        result.append(root.data)
        result += preorder(root.left)
        result += preorder(root.right)
    return result
Putting it all together, for a given input n, we first construct all trees that use values
between 1 and n, and then iterate over each tree to print the nodes.
def construct_trees(N):
    trees = make_trees(1, N)
    for tree in trees:
        print(preorder(tree))
The number of possible binary search trees grows exponentially with the size of n, as
can be seen by inspecting the first few values: 1, 2, 5, 14, 42, 132, 429, 1430,
. . . . In fact, this sequence is defined by the Catalan numbers, which appear in a
variety of combinatorial problems.

In any case, make_trees takes O(n) time to build each possible tree, so with an
exponential number of trees, this algorithm will run in O(n x 2^n) time. Since each
tree takes O(n) space to store its nodes, the space complexity is O(n x 2^n) as well.
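The Catalan growth can be checked directly by counting the trees make_trees produces, repeating the definitions above so the snippet runs on its own:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def make_trees(low, high):
    trees = []
    if low > high:
        trees.append(None)
        return trees
    for i in range(low, high + 1):
        left = make_trees(low, i - 1)
        right = make_trees(i + 1, high)
        for l in left:
            for r in right:
                trees.append(Node(i, left=l, right=r))
    return trees

# Tree counts for n = 1..5 follow the Catalan sequence.
print([len(make_trees(1, n)) for n in range(1, 6)])  # [1, 2, 5, 14, 42]
```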
Tries
The first thing to know about a trie is that it is pronounced "try", not "tree".
With that out of the way, a trie is a kind of tree whose nodes typically represent
strings, where every descendant of a node shares a common prefix. For this reason
tries are often referred to as prefix trees. Here is an example:
Following all paths from the root to each leaf spells out all the words that this trie
contains, in this case "bear", "bingo", "cat", "coat", and "dog".
There are two main methods used with tries:
• insert(word): add a word to the trie
• find(word): check if a word or prefix exists in the trie
Each of these methods will run in O(k), where k is the length of the word.
Tries can be implemented in several ways, but in an interview setting the simplest
way is to use a nested dictionary, where each key maps to a dictionary whose keys
are successive letters in a given word.
Printing out the underlying trie in the image above, we would obtain the following
dictionary:
{
    'b': {
        'e': {'a': {'r': {'#': True}}},
        'i': {'n': {'g': {'o': {'#': True}}}},
    },
    'c': {
        'a': {'t': {'#': True}},
        'o': {'a': {'t': {'#': True}}},
    },
    'd': {
        'o': {'g': {'#': True}},
    },
}
We will see several modifications of tries in the coming problems, but you would do
well to understand and be ready with the following basic implementation:
ENDS_HERE = '#'

class Trie:
    def __init__(self):
        self._trie = {}

    def insert(self, text):
        trie = self._trie
        for char in text:
            if char not in trie:
                trie[char] = {}
            trie = trie[char]
        trie[ENDS_HERE] = True

    def find(self, prefix):
        trie = self._trie
        for char in prefix:
            if char in trie:
                trie = trie[char]
            else:
                return None
        return trie
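As a quick illustration, here is the implementation exercised on a few of the words from the diagram:

```python
ENDS_HERE = '#'

class Trie:
    def __init__(self):
        self._trie = {}

    def insert(self, text):
        trie = self._trie
        for char in text:
            if char not in trie:
                trie[char] = {}
            trie = trie[char]
        trie[ENDS_HERE] = True

    def find(self, prefix):
        trie = self._trie
        for char in prefix:
            if char in trie:
                trie = trie[char]
            else:
                return None
        return trie

trie = Trie()
for word in ['bear', 'cat', 'coat', 'dog']:
    trie.insert(word)

print(trie.find('co'))                # {'a': {'t': {'#': True}}}
print(trie.find('zebra'))             # None
print(ENDS_HERE in trie.find('cat'))  # True: "cat" is a complete word
print(ENDS_HERE in trie.find('ca'))   # False: "ca" is only a prefix
```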
Let's now see some tries in action.
8.1 Implement autocomplete system
Implement an autocomplete system. That is, given a query string s and a set of all
possible query strings, return all strings in the set that have s as a prefix.
For example, given the query string de and the set of strings [dog, deer, deal],
return [deer, deal].
Solution
Autocomplete can be considered a canonical trie-based question: whenever text
completion arises in an interview setting, tries should be the first tool you reach for.
For comparison's sake, though, we will explore a more straightforward solution. We
can iterate over the dictionary and check if each word starts with our prefix. If so, we
add it to our set of results, and then return it once we're done.
WORDS = ['dog', 'deer', 'deal']

def autocomplete(s):
    results = set()
    for word in WORDS:
        if word.startswith(s):
            results.add(word)
    return results
This runs in O(n) time, where n is the number of words in the dictionary.
Fortunately, we can improve on this with a trie. The first step is to insert each word
in our dictionary into our trie, using the insert method outlined above. For the
given words, this would result in the following diagram:
To find all words beginning with "de", we would search from the root along the path
from "d" to "e", and then collect all the words under this prefix node. We also make
use of the '#' terminal value to mark whether "de" is itself a word in our dictionary.
While the worst-case runtime would still be O(n) if all the search results have that
prefix, if the words are uniformly distributed across the alphabet, it should be much
faster on average since we no longer have to evaluate words that don't start with our
prefix.
ENDS_HERE = '#'

class Trie:
    def __init__(self):
        self._trie = {}

    def insert(self, text):
        trie = self._trie
        for char in text:
            if char not in trie:
                trie[char] = {}
            trie = trie[char]
        trie[ENDS_HERE] = True

    def find(self, prefix):
        trie = self._trie
        for char in prefix:
            if char in trie:
                trie = trie[char]
            else:
                return []
        return self._elements(trie)

    def _elements(self, d):
        result = []
        for c, v in d.items():
            if c == ENDS_HERE:
                subresult = ['']
            else:
                subresult = [c + s for s in self._elements(v)]
            result.extend(subresult)
        return result

trie = Trie()
for word in WORDS:
    trie.insert(word)

def autocomplete(s):
    suffixes = trie.find(s)
    return [s + w for w in suffixes]

8.2 Create PrefixMapSum class
Implement a PrefixMapSum class with the following methods:

• insert(key: str, value: int): Set a given key's value in the map. If the
key already exists, overwrite the value.
• sum(prefix: str): Return the sum of all values of keys that begin with a
given prefix.
For example, you should be able to run the following code:
mapsum.insert("columnar", 3)
assert mapsum.sum("col") == 3

mapsum.insert("column", 2)
assert mapsum.sum("col") == 5
Solution
Depending on how efficient we want our insert and sum operations to be, there can
be several solutions.
If we care about making insert as fast as possible, we can use a simple dictionary to
store the value of each key inserted. As a result, insertion will be O(1). Then, if we
want to find the sum for a given key, we would need to add up the values for every
word that begins with that prefix. If n is the number of words inserted so far, and k
is the length of the prefix, this will be O(n x k).
This could be implemented as follows:
class PrefixMapSum:
    def __init__(self):
        self.map = {}

    def insert(self, key: str, value: int):
        self.map[key] = value

    def sum(self, prefix):
        return sum(value for key, value in self.map.items()
                   if key.startswith(prefix))
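Running the example from the problem statement against this version:

```python
class PrefixMapSum:
    def __init__(self):
        self.map = {}

    def insert(self, key, value):
        self.map[key] = value

    def sum(self, prefix):
        return sum(value for key, value in self.map.items()
                   if key.startswith(prefix))

mapsum = PrefixMapSum()
mapsum.insert("columnar", 3)
print(mapsum.sum("col"))    # 3
mapsum.insert("column", 2)
print(mapsum.sum("col"))    # 5
mapsum.insert("column", 6)  # overwriting replaces the old value, not adds
print(mapsum.sum("col"))    # 9
```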
On the other hand, perhaps we will rarely be inserting new words, and need our
sum retrieval to be very efficient. In this case, every time we insert a new key, we
can recompute all its prefixes, so that finding a given sum will be O(1). However,
insertion will take O(k^2), since slicing is O(k) and we must do this k times.
from collections import defaultdict

class PrefixMapSum:
    def __init__(self):
        self.map = defaultdict(int)
        self.words = set()

    def insert(self, key: str, value: int):
        # If the key already exists, increment prefix totals
        # by the difference of old and new values.
        if key in self.words:
            value -= self.map[key]
        self.words.add(key)

        for i in range(1, len(key) + 1):
            self.map[key[:i]] += value

    def sum(self, prefix):
        return self.map[prefix]
A solution with the best of both options is to use a trie data structure. In fact,
whenever a problem involves prefixes, a trie should be one of your go-to options.
When we insert a word into the trie, we associate each letter with a dictionary that
stores the letters that come after it in various words, as well as the total at any given
time.
For example, suppose that we wanted to perform the following operations:
mapsum.insert("bag", 4)
mapsum.insert("bath", 5)
For the first insert call, since there is nothing already in the trie, we would create
the following:
{"b": {"total": 4,
       "a": {"total": 4,
             "g": {"total": 4}}}}
When we next insert bath, we will step letter by letter through the trie, incrementing
the total for the prefixes already in the trie, and adding new totals for prefixes that
do not exist. The resulting dictionary would look like this:
{"b": {"total": 9,
       "a": {"total": 9,
             "g": {"total": 4},
             "t": {"total": 5,
                   "h": {"total": 5}}}}}
As a result, insert and sum will involve stepping through the dictionary a number
of times equal to the length of the prefix. Since finding the dictionary values at each
level is constant time, this algorithm is O(k) for both methods.
class TrieNode:
    def __init__(self):
        self.letters = {}
        self.total = 0

class PrefixMapSum:
    def __init__(self):
        self._trie = TrieNode()
        self.map = {}

    def insert(self, key, value):
        # If the key already exists, increment prefix totals by
        # the difference of old and new values.
        value -= self.map.get(key, 0)
        self.map[key] = value

        trie = self._trie
        for char in key:
            if char not in trie.letters:
                trie.letters[char] = TrieNode()
            trie = trie.letters[char]
            trie.total += value

    def sum(self, prefix):
        d = self._trie
        for char in prefix:
            if char in d.letters:
                d = d.letters[char]
            else:
                return 0
        return d.total
8.3 Find maximum XOR of element pairs
Given an array of integers, find the maximum XOR of any two elements.
Solution
One solution here would be to loop over each pair of integers and XOR them,
keeping track of the maximum found so far. If there are n numbers, this would take
O(n^2) time.
We can improve on this by using a trie data structure. If we represent each integer as
a binary number with k bits, we can insert it into a trie with its most significant bit
at the top and each successive bit one level down. For example, 4, 6, and 7 could be
represented in a three-level trie as follows:
Why would we want to do this? Well, once we have constructed such a trie, we can
find the maximum XOR product for any given element by going down the trie and
always trying to take the path with an opposite bit.
For example, suppose we wanted to find the maximum XOR for 4, represented as
100 in the trie above. We would use the following procedure:
• The first bit is 1, so we look for a node on the top level with the value 0. Since
this does not exist, we continue without incrementing our XOR counter.
• Next, since the second bit is 0, we want to find a child node with the value 1.
This time, a node does exist, so we move down to 1 and increment our XOR
value by 1 << 1.
• Finally, the last bit is 0, so we look for a child node with a value of 1. Again
the right node exists, so we increment our count by 1 << 0.

After traversing the trie, we would find the maximum XOR to be (1 << 1) + (1 << 0),
or 3.
These trie operations can be implemented as shown below:
class Trie:
    def __init__(self, k):
        self._trie = {}
        self.size = k

    def insert(self, item):
        trie = self._trie
        for i in range(self.size, -1, -1):
            bit = bool(item & (1 << i))
            if bit not in trie:
                trie[bit] = {}
            trie = trie[bit]

    def find_max_xor(self, item):
        trie = self._trie
        xor = 0
        for i in range(self.size, -1, -1):
            bit = bool(item & (1 << i))
            if (1 - bit) in trie:
                xor |= (1 << i)
                trie = trie[1 - bit]
            else:
                trie = trie[bit]
        return xor
Putting it all together, our solution is to first instantiate a Trie, using the maximum
bit length to determine the size. Then, we insert the binary representation of each
element into the trie. Finally, we loop over each integer to find the maximum XOR
that can be generated, updating an XOR counter if the result is the greatest seen so
far.
def find_max_xor(array):
    k = max(array).bit_length()

    trie = Trie(k)
    for i in array:
        trie.insert(i)

    xor = 0
    for i in array:
        xor = max(xor, trie.find_max_xor(i))

    return xor
The complexity of each insert and find_max_xor operation is O(k), where k is the
number of bits in the maximum element of the array. Since we must perform these
operations for every element, this algorithm takes O(n x k) time overall. Similarly,
because our trie holds n words of size k, it uses O(n x k) space.
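Putting the two pieces together on a small input of our own (4, 6, and 7 are the values from the trie diagram, and 4 ^ 7 = 3):

```python
class Trie:
    def __init__(self, k):
        self._trie = {}
        self.size = k

    def insert(self, item):
        trie = self._trie
        for i in range(self.size, -1, -1):
            bit = bool(item & (1 << i))
            if bit not in trie:
                trie[bit] = {}
            trie = trie[bit]

    def find_max_xor(self, item):
        trie = self._trie
        xor = 0
        for i in range(self.size, -1, -1):
            bit = bool(item & (1 << i))
            # Greedily take the opposite bit whenever the trie offers one.
            if (1 - bit) in trie:
                xor |= (1 << i)
                trie = trie[1 - bit]
            else:
                trie = trie[bit]
        return xor

def find_max_xor(array):
    k = max(array).bit_length()
    trie = Trie(k)
    for i in array:
        trie.insert(i)
    xor = 0
    for i in array:
        xor = max(xor, trie.find_max_xor(i))
    return xor

print(find_max_xor([4, 6, 7]))  # 3, from the pair (4, 7)
```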
Heaps
We now build on our foundation of tree knowledge by exploring heaps. A heap is a
tree that satisfies the (aptly named) heap property, which comes in two flavors:
• In a max-heap, the parent node's value is always greater than or equal to its
child node(s)
• In a min-heap, the parent node's value is always smaller than or equal to its
child node(s)
Note that, unlike with BSTs, it is possible for a left child to have a greater value (in
the case of a min-heap) or for a right child to have a smaller value (in the case of a
max-heap).
While it is possible for parent nodes to have more than two children, almost all
interview questions will deal with binary heaps, so we will make that assumption
throughout the following problems. In the following explanation we will also assume
that we are dealing with a min-heap, but the same principles apply for max-heaps.
For example, here is a heap of integers:
We can also represent a heap in a more space-efficient way by using an array. In this
style, the two child nodes of a parent node located at index i can be found at indices
2i + 1 and 2i + 2, like so:

| 10 | 14 | 19 | 26 | 31 | 42 |
When using an array to represent a heap, the heap must be filled in level by level,
left to right.
Whenever you are asked to find the top k or minimum k values, a heap should be
the first thing that comes to mind. Heaps are closely tied to the heapsort sorting
algorithm, priority queue implementations, and graph algorithms such as Dijkstra's
algorithm, which we will explore in later chapters.
You should be familiar with the following heap operations:
• insert(heap, x): add an element x to the heap, O(log n)
• delete-min(heap): remove the lowest node, O(log n)
• heapify(array): convert an array into a heap by repeated insertions, O(n log n)
In the solutions that follow we will make use of Python's heapq module to implement
the methods above. The corresponding operations are as follows:
• heapq.heappush(heap, x)
• heapq.heappop(heap)
• heapq.heapify(array)
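As a brief illustration of these operations (note that Python's heapify actually runs in O(n) using sift-down, slightly better than repeated insertion):

```python
import heapq

heap = [19, 10, 26, 14]
heapq.heapify(heap)             # convert a plain list into a valid min-heap

heapq.heappush(heap, 12)        # insert
smallest = heapq.heappop(heap)  # delete-min: removes and returns the root

assert smallest == 10
assert heap[0] == 12            # the root is always the minimum
```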
Let's jump in.
9.1 Compute the running median
Compute the running median of a sequence of numbers. That is, given a stream of
numbers, print out the median of the list so far after each new element.
Recall that the median of an even-numbered list is the average of the two middle
numbers.
For example, given the sequence [ 2 , 1 , 5 , 7 , 2 , 0, 5], your algorithm should
print out:
2
1.5
2
3.5
2
2
2
Solution
In the introduction we learned that finding minimal and maximal values is a great reason to use heaps. But how do they apply to finding medians?
For this problem, we need to think outside the box and use two heaps: a min-heap
and a max-heap. We will keep all elements smaller than the median in the max-heap
and all elements larger than the median in the min-heap. As long as we keep these
heaps the same size, we can guarantee that the median is either the root of the
min-heap or the max-heap (or both).
Whenever we encounter a new element from the stream, we will first add it to one of
our heaps: the max-heap if the element is smaller than the median, or the min-heap
if it is bigger. If the element equals the median, we will arbitrarily choose to add it
to the min-heap.
Then we re-balance if necessary by moving the root of the larger heap to the smaller
one. This is only necessary if one heap is larger than the other by more than 1
element.
Finally, we can print out our median. This will either be the root of the larger heap, or the average of the two roots if the heaps are of equal size. We can save ourselves some trouble here by noting that since the root of a heap occupies the initial index of the underlying array, getting the median simply involves accessing heap[0].
Unfortunately, Python's heapq library only supports min-heaps. We can get around
this limitation, though, by negating values before pushing them to the max heap,
and then negating again when popping.
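The negation trick can be tried in isolation first:

```python
import heapq

values = [3, 1, 4, 1, 5]

# Simulate a max-heap by pushing negated values onto a min-heap.
max_heap = []
for v in values:
    heapq.heappush(max_heap, -v)

# Popping and negating again yields the values in descending order.
assert -heapq.heappop(max_heap) == 5
assert -heapq.heappop(max_heap) == 4
```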
import heapq

def get_median(min_heap, max_heap):
    if len(min_heap) > len(max_heap):
        return min_heap[0]
    elif len(min_heap) < len(max_heap):
        return -1 * max_heap[0]
    else:
        return (min_heap[0] + -1 * max_heap[0]) / 2.0

def add(num, min_heap, max_heap):
    # If empty, then just add it to the min heap.
    if len(min_heap) + len(max_heap) < 1:
        heapq.heappush(min_heap, num)
        return

    median = get_median(min_heap, max_heap)
    if num > median:
        heapq.heappush(min_heap, num)
    else:
        heapq.heappush(max_heap, -1 * num)

def rebalance(min_heap, max_heap):
    if len(min_heap) > len(max_heap) + 1:
        root = heapq.heappop(min_heap)
        heapq.heappush(max_heap, -1 * root)
    elif len(max_heap) > len(min_heap) + 1:
        root = -1 * heapq.heappop(max_heap)
        heapq.heappush(min_heap, root)

def print_median(min_heap, max_heap):
    print(get_median(min_heap, max_heap))

def running_median(stream):
    min_heap = []
    max_heap = []
    for num in stream:
        add(num, min_heap, max_heap)
        rebalance(min_heap, max_heap)
        print_median(min_heap, max_heap)
The running time for this algorithm is O(n log n), since for each element we perform a constant number of heappush and heappop operations, each of which takes O(log n) in the worst case.
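To test the logic without capturing printed output, here is a compact variant of the same two-heap approach that returns the medians as a list rather than printing them:

```python
import heapq

def get_median(min_heap, max_heap):
    if len(min_heap) > len(max_heap):
        return min_heap[0]
    if len(max_heap) > len(min_heap):
        return -max_heap[0]
    return (min_heap[0] - max_heap[0]) / 2.0

def running_medians(stream):
    # min_heap holds the larger half; max_heap holds the smaller half, negated.
    min_heap, max_heap = [], []
    medians = []
    for num in stream:
        if not min_heap and not max_heap:
            heapq.heappush(min_heap, num)
        elif num > get_median(min_heap, max_heap):
            heapq.heappush(min_heap, num)
        else:
            heapq.heappush(max_heap, -num)

        # Rebalance if one heap outgrows the other by more than one element.
        if len(min_heap) > len(max_heap) + 1:
            heapq.heappush(max_heap, -heapq.heappop(min_heap))
        elif len(max_heap) > len(min_heap) + 1:
            heapq.heappush(min_heap, -heapq.heappop(max_heap))

        medians.append(get_median(min_heap, max_heap))
    return medians

assert running_medians([2, 1, 5, 7, 2, 0, 5]) == [2, 1.5, 2, 3.5, 2, 2, 2]
```

The assertion matches the expected output from the problem statement.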
9.2 Find most similar websites
You are given a list of (website, user) pairs that represent users visiting websites.
Come up with a program that identifies the top k pairs of websites with the greatest
similarity.
For example, suppose k = 1, and the list of tuples is:

[('google.com', 1), ('google.com', 3), ('google.com', 5),
 ('pets.com', 1), ('pets.com', 2), ('yahoo.com', 6),
 ('yahoo.com', 2), ('yahoo.com', 3), ('yahoo.com', 4), ('yahoo.com', 5),
 ('wikipedia.org', 4), ('wikipedia.org', 5), ('wikipedia.org', 6),
 ('wikipedia.org', 7), ('bing.com', 1), ('bing.com', 3), ('bing.com', 5),
 ('bing.com', 6)]
To compute the similarity between two websites you should compute the number of
users they have in common divided by the number of users who have visited either
site in total. (This is known as the Jaccard index.)
For example, in this case, we would conclude that google. com and bing. com are the
most similar, with a score of 3/4, or 0.75.
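The score for this pair can be verified directly with Python's set operators, using the visitor sets from the example list:

```python
google = {1, 3, 5}
bing = {1, 3, 5, 6}

# Jaccard index: |intersection| / |union|
similarity = len(google & bing) / len(google | bing)
assert similarity == 0.75
```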
Solution
First, let's implement the similarity metric defined above as a helper function.
def compute_similarity(a, b, visitors):
    return len(visitors[a] & visitors[b]) / len(visitors[a] | visitors[b])
This function relies on our ability to quickly find all the visitors for a given website.
Therefore, we should first iterate through our input and build a hash table that maps
websites to sets of users.
Following this, we can consider each pair of websites and compute its similarity
measure. Since we will need to know the top scores, we can store each value and the
pair that generated it in a heap. To keep our memory footprint down, we can ensure
that at any given time only the k pairs with the highest scores remain in our heap by
popping lower-valued pairs.
Finally, we can return the values that are left in our heap after processing each pair
of websites.
import heapq
from collections import defaultdict

def top_pairs(log, k):
    visitors = defaultdict(set)
    for site, user in log:
        visitors[site].add(user)

    pairs = []
    sites = list(visitors.keys())
    for _ in range(k):
        heapq.heappush(pairs, (0, ('', '')))

    for i in range(len(sites) - 1):
        for j in range(i + 1, len(sites)):
            score = compute_similarity(sites[i], sites[j], visitors)
            heapq.heappushpop(pairs, (score, (sites[i], sites[j])))

    return [pair[1] for pair in pairs]
For each pair of websites, we must compute the union of its users. As a result, this part of our algorithm will take O(n² × m), where n is the number of sites and m is the number of users. Inserting into and deleting from the heap is logarithmic in the size of the heap, so if we assume k < m, our heap operations will be dominated by the calculation above. Therefore, our time complexity will be O(n² × m).

As for space complexity, our hash table requires n keys, each mapping to a set of up to m users, and our heap will have at most k elements. Assuming k < n × m, then, the space required by this program will be O(n × m).
9.3 Generate regular numbers
A regular number in mathematics is defined as one which evenly divides some power
of 60. Equivalently, we can say that a regular number is one whose only prime divisors
are 2, 3, and 5.
These numbers have had many applications, from helping ancient Babylonians keep
time to tuning instruments according to the diatonic scale.
Given an integer n, write a program that generates, in order, the first n regular
numbers.
Solution
A naive solution would be to first generate all powers of 2, 3, and 5 up to some
stopping point, and then find every product we can obtain from multiplying one
power from each group. We can then sort these products and take the first n to find
our solution.
def regular_numbers(n):
    twos = [2 ** i for i in range(n)]
    threes = [3 ** i for i in range(n)]
    fives = [5 ** i for i in range(n)]

    solution = set()
    for two in twos:
        for three in threes:
            for five in fives:
                solution.add(two * three * five)

    return sorted(solution)[:n]
Since there are n integers in each prime group, our solution set will contain O(n³) numbers. As a result, the sorting process will take O(n³ log n) time.
Note that in the above solution, we had to sort all the multiples at the end. If we were able to just keep track of the smallest n multiples at any given point, we could make this solution significantly more efficient. It sounds like this is a perfect use case for a heap!
For any regular number x, we can generate three additional regular numbers by
calculating 2x, 3x, and 5x. Conversely, it must be the case that any regular number
must be twice, three times, or five times some other regular number.
To take advantage of this, we can initialize a min-heap starting with the value 1. Each time we pop a value x from the heap, we yield it, then push 2x, 3x, and 5x onto the heap. We can continue this process until we have yielded n integers.

One point to consider is that, for example, the number 6 will be pushed to the heap twice, once for being a multiple of two and once for being a multiple of three. To avoid yielding this value twice, we maintain a variable for the last number popped, and only process a value if it is greater than this variable.
import heapq

def regular_numbers(n):
    solution = [1]
    last = 0; count = 0

    while count < n:
        x = heapq.heappop(solution)
        if x > last:
            yield x
            last = x; count += 1
        heapq.heappush(solution, 2 * x)
        heapq.heappush(solution, 3 * x)
        heapq.heappush(solution, 5 * x)
Each pop and push operation will take O(log n) time. Since we will consider at most the first n multiples of 2, 3, and 5, there will be O(n) of these operations, leading to an O(n log n) runtime.
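Restating the generator so the snippet runs on its own, we can materialize the first ten regular numbers:

```python
import heapq

def regular_numbers(n):
    # Yield the first n regular (5-smooth) numbers in increasing order.
    solution = [1]
    last, count = 0, 0
    while count < n:
        x = heapq.heappop(solution)
        if x > last:
            yield x
            last, count = x, count + 1
        for factor in (2, 3, 5):
            heapq.heappush(solution, factor * x)

first_ten = list(regular_numbers(10))
assert first_ten == [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```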
Now let's try something a little more complicated.
9.4 Build a Huffman tree
Huffman coding is a method of encoding characters based on their frequency. Each
letter is assigned a variable-length binary string, such as 0101 or 111110, where
shorter lengths correspond to more common letters. To accomplish this, a binary
tree is built such that the path from the root to any leaf uniquely maps to a character.
When traversing the path, descending to a left child corresponds to a 0 in the prefix,
while descending right corresponds to 1.
Here is an example tree (note that only the leaf nodes have letters):
[Figure: example Huffman tree, with left edges labeled 0 and right edges labeled 1]
With this encoding, "cats" would be represented as 0000110111.
Given a dictionary of character frequencies, build a Huffman tree, and use it to
determine a mapping between characters and their encoded binary strings.
Solution
First note that regardless of how we build the tree, we would like each leaf node to
represent a character.
class Node:
    def __init__(self, char, left=None, right=None):
        self.char = char
        self.left = left
        self.right = right
When building the tree, we should try to ensure that less frequent characters end up further away from the root. We can accomplish this as follows:
• Start by initializing one node for each letter.
• Create a new node whose children are the two least common letters, and whose
value is the sum of their frequencies.
• Continuing in this way, take each node, in order of increasing letter frequency,
and combine it with another node.
• When there is a path from the root to each character, stop.
For example, suppose our letter frequencies were {"a": 3, "c": 6, "e": 8, "f": 2}.
The stages to create our tree would be as follows:
In order to efficiently keep track of node values, we can use a priority queue. We will
repeatedly pop the two least common letters, create a combined node, and push that
node back onto the queue.
import heapq

def build_tree(frequencies):
    nodes = []
    for char, frequency in frequencies.items():
        heapq.heappush(nodes, (frequency, Node(char)))

    while len(nodes) > 1:
        f1, n1 = heapq.heappop(nodes)
        f2, n2 = heapq.heappop(nodes)
        node = Node('*', left=n1, right=n2)
        heapq.heappush(nodes, (f1 + f2, node))

    root = nodes[0][1]
    return root
Each pop and push operation takes O(log n) time, so building this tree will be O(n log n), where n is the number of characters.
Finally, we must use the tree to create our encoding. This can be done recursively: starting with the root, we traverse each path of the tree, while keeping track of a running string. Each time we descend left, we add 0 to this string, and each time we descend right, we add 1. Whenever we reach a leaf node, we assign the current value of the string to the character at that node.
def encode(root, string='', mapping={}):
    if not root:
        return
    if not root.left and not root.right:
        mapping[root.char] = string
    encode(root.left, string + '0', mapping)
    encode(root.right, string + '1', mapping)
    return mapping
As a result, the encoding for the tree above will be {"f": "000", "a": "001", "c": "01", "e": "1"}.
It will take, on average, O(log n) time to traverse the path to any character, so encoding a string of length m using this tree will take O(m log n).
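Putting the two steps together on the example frequencies (restated compactly here, with an insertion counter added as a tie-breaker so equal frequencies never compare Node objects; the exact codes depend on pop order, so they differ from the figure, but they form an equally valid prefix code):

```python
import heapq

class Node:
    def __init__(self, char, left=None, right=None):
        self.char, self.left, self.right = char, left, right

def build_tree(frequencies):
    # Tuples are (frequency, counter, node): the counter breaks frequency ties.
    nodes = [(f, i, Node(c)) for i, (c, f) in enumerate(frequencies.items())]
    heapq.heapify(nodes)
    counter = len(nodes)
    while len(nodes) > 1:
        f1, _, n1 = heapq.heappop(nodes)
        f2, _, n2 = heapq.heappop(nodes)
        heapq.heappush(nodes, (f1 + f2, counter, Node('*', left=n1, right=n2)))
        counter += 1
    return nodes[0][2]

def encode(root, prefix='', mapping=None):
    mapping = {} if mapping is None else mapping
    if root is None:
        return mapping
    if not root.left and not root.right:
        mapping[root.char] = prefix
    encode(root.left, prefix + '0', mapping)
    encode(root.right, prefix + '1', mapping)
    return mapping

codes = encode(build_tree({'a': 3, 'c': 6, 'e': 8, 'f': 2}))
assert codes == {'e': '0', 'f': '100', 'a': '101', 'c': '11'}
```

As expected, the most frequent character ('e') receives the shortest code and the least frequent ('f') one of the longest.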
Graphs
Graphs are one of the most important and widely used data structures. Website
links, friend connections, and map routes all rely on graph representations, along
with countless other applications.
Formally, graphs are defined as a set of vertices connected by edges. If these edges
go in one direction, the graph is said to be directed; otherwise it is undirected.
Each edge can additionally be associated with a number that represents its "cost" or
"benefit".
An example of a directed graph would be followers on Twitter. Just because you
follow Elon Musk does not mean he follows you back. On the other hand, friend
connections on Facebook are undirected.
Mathematicians working in graph theory have names for many different graph
concepts. We don't need to know all of them, but a few will be useful for the
explanations that follow:
• neighbor of X: any vertex connected to X by an edge
• path: a route of edges that connects two vertices
• cycle: a path that begins and ends on the same vertex
CHAPTER 10. GRAPHS
• directed acyclic graph (DAG): a directed graph that does not contain any
cycles
• connected graph: a graph in which there is always a path between any two
vertices
Another classic example of a (directed) graph is airline routes. In the following
diagram, we see that there are flights between JFK and SFO, ORL and LAX, and so
on, and each one has an associated plane ticket cost. This graph has several cycles,
since it is indeed possible to start and end at JFK after following several edges.
Graphs can be represented in two main ways: adjacency lists and adjacency matrices.
An adjacency list is essentially a dictionary mapping each vertex to the other vertices
between which there is an edge. For the airline diagram above this would be as
follows:
{
    'JFK': ['SFO', 'LAX'],
    'SFO': ['ORL'],
    'ORL': ['JFK', 'LAX', 'DFW'],
    'LAX': ['DFW']
}
On the other hand, in an adjacency matrix, each vertex is associated with a row and column of an N × N matrix, and matrix[i][j] will be 1 if there is an edge from i to j, else 0.
This would look like the following:
indices = {
    'JFK': 0,
    'SFO': 1,
    'ORL': 2,
    'LAX': 3,
    'DFW': 4
}

graph = [
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0]
]
In general, the adjacency list representation is more space efficient if there are not
that many edges (also known as a sparse graph), whereas an adjacency matrix has
faster lookup times to check if a given edge exists but uses more space.
You should know the two main traversal methods for graphs: depth-first search
(DFS) and breadth-first search (BFS).
Below is a typical DFS implementation. Note the recursive aspect: for each vertex
we visit, we call our function again on each of its neighbors.
def DFS(graph, start, visited=set()):
    visited.add(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            DFS(graph, neighbor, visited)
    return visited
BFS, on the other hand, relies on a queue. For each item that we pop off the queue,
we find its unvisited neighbors and add them to the end of the queue.
from collections import deque
def BFS(graph, start, visited=set()):
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        visited.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                queue.append(neighbor)
    return visited
Both of these algorithms run in O(V + E) time and O(V) space in the worst case.
In this chapter we will explore the advantages and disadvantages of each of these
traversal methods, as well as a technique for ordering vertices known as topological
sort.
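Both traversals can be exercised on the airline adjacency list from earlier (with a 'DFW': [] entry added so every vertex has a key, and a fresh visited set passed explicitly to sidestep Python's shared-default-argument pitfall):

```python
from collections import deque

graph = {
    'JFK': ['SFO', 'LAX'],
    'SFO': ['ORL'],
    'ORL': ['JFK', 'LAX', 'DFW'],
    'LAX': ['DFW'],
    'DFW': []
}

def dfs(graph, start, visited):
    visited.add(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

def bfs(graph, start, visited):
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        visited.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                queue.append(neighbor)
    return visited

# Every airport is reachable from JFK, so both traversals visit all five.
assert dfs(graph, 'JFK', set()) == {'JFK', 'SFO', 'ORL', 'LAX', 'DFW'}
assert bfs(graph, 'JFK', set()) == {'JFK', 'SFO', 'ORL', 'LAX', 'DFW'}
```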
10.1 Determine if a cycle exists
Given an undirected graph, determine if it contains a cycle.
Solution
One way to think about this problem is as follows: suppose we are traversing the
graph's edges, starting from a given vertex. If, for some vertex, we find that one of its
neighbors has already been visited, then we know that there are two ways to reach
that neighbor from the same starting point, which indicates a cycle.
We can implement this solution using depth-first search. For each vertex in the graph, if it has not already been visited, we call our search function on it. This function will recursively traverse unvisited neighbors of the vertex, and return True if we come across the situation described above.

If we are able to visit all vertices without finding a duplicate path, we return False.
def search(graph, vertex, visited, parent):
    visited[vertex] = True
    for neighbor in graph[vertex]:
        if not visited[neighbor]:
            if search(graph, neighbor, visited, vertex):
                return True
        elif parent != neighbor:
            return True
    return False

def has_cycle(graph):
    visited = {v: False for v in graph.keys()}
    for vertex in graph.keys():
        if not visited[vertex]:
            if search(graph, vertex, visited, None):
                return True
    return False
The time complexity of this solution will be O(V + E), since in the worst case we will have to traverse all edges of the graph. Our search will take O(V) space in the worst case to store the vertices of a given traversal on the stack.
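Restating these functions so the example runs on its own, we can confirm the behavior on two tiny graphs:

```python
def search(graph, vertex, visited, parent):
    visited[vertex] = True
    for neighbor in graph[vertex]:
        if not visited[neighbor]:
            if search(graph, neighbor, visited, vertex):
                return True
        elif parent != neighbor:
            return True
    return False

def has_cycle(graph):
    visited = {v: False for v in graph}
    for vertex in graph:
        if not visited[vertex]:
            if search(graph, vertex, visited, None):
                return True
    return False

# A triangle contains a cycle; a simple path does not.
assert has_cycle({0: [1, 2], 1: [0, 2], 2: [0, 1]}) is True
assert has_cycle({0: [1], 1: [0, 2], 2: [1]}) is False
```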
10.2 Remove edges to create even trees
You are given a tree with an even number of nodes. Consider each connection
between a parent and child node to be an "edge". You would like to remove some
of these edges, such that the disconnected subtrees that remain each have an even
number of nodes.
For example, suppose your input is the following tree:
In this case, if we remove the edge (3, 4), both resulting subtrees will be even.
Write a function that returns the maximum number of edges you can remove while
still satisfying this requirement.
Solution
First note that if a node has an odd number of descendants, we can cut off the link
between that node and its parent in order to create an even-sized subtree. Each time
we do this, we are left with another even-sized group, to which we can apply the
same procedure.
For example, let's take the example tree above. The lowest edge we can cut is that
connecting nodes 3 and 4. Once this is cut, we are left with the following tree:
We now see that 3 still has an odd number of descendants, so we can cut the link
between 1 and 3. In total, then, we are able to remove two edges.
It is not necessary, however, to remove the edges precisely in this order. Instead, it is
sufficient to know that this greedy approach works, so that we can identify all the
nodes with an odd number of descendants (except for the root, which cannot be cut
off in this way), and increment a counter for each.
Let's assume our input is presented in the form of a graph, like so:
graph = {
    1: [2, 3],
    2: [],
    3: [4, 5],
    4: [6, 7, 8],
    5: [],
    6: [],
    7: [],
    8: []
}
We will first perform a depth-first search traversal through this graph to populate a
dictionary which stores the number of descendants per node. Once this is done, we
simply count up how many of these values are odd and return this total.
from collections import defaultdict

def traverse(graph, curr, result):
    descendants = 0
    for child in graph[curr]:
        num_nodes, result = traverse(graph, child, result)
        result[child] += num_nodes - 1
        descendants += num_nodes
    return descendants + 1, result

def max_edges(graph):
    start = list(graph)[0]
    vertices = defaultdict(int)
    _, descendants = traverse(graph, start, vertices)
    return len([val for val in descendants.values() if val % 2 == 1])
Our tree will have n nodes, so the depth-first search will take O(n) time. The space complexity is likewise O(n), since we populate a dictionary with n keys, one for each node.
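Run on the example graph, the solution indeed finds two removable edges (functions restated so the snippet is self-contained):

```python
from collections import defaultdict

def traverse(graph, curr, result):
    descendants = 0
    for child in graph[curr]:
        num_nodes, result = traverse(graph, child, result)
        result[child] += num_nodes - 1
        descendants += num_nodes
    return descendants + 1, result

def max_edges(graph):
    start = list(graph)[0]
    _, descendants = traverse(graph, start, defaultdict(int))
    return len([v for v in descendants.values() if v % 2 == 1])

tree = {1: [2, 3], 2: [], 3: [4, 5], 4: [6, 7, 8], 5: [], 6: [], 7: [], 8: []}
# Nodes 3 (five descendants) and 4 (three descendants) have odd counts.
assert max_edges(tree) == 2
```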
10.3 Create stepword chain
Given a start word, an end word, and a dictionary of valid words, find the shortest
transformation sequence from start to end such that only one letter is changed at
each step of the sequence, and each transformed word exists in the dictionary. If
there is no possible transformation, return null. Each word in the dictionary has the
same length as start and end and is lowercase.
For example, given start = "dog", end = "cat", and dictionary = {"dot", "dop", "dat", "cat"}, return ["dog", "dot", "dat", "cat"].
Given start = "dog", end = "cat", and dictionary = {"dot", "tod", "dat", "dar"}, return null as there is no possible transformation from "dog" to "cat".
Solution
We can model this problem as a graph: the nodes will be the words in the dictionary,
and we can form an edge between two nodes if and only if one character can be
modified in one word to get to the other.
Then we can do a typical breadth-first search starting from start and finishing once
we encounter end:
from collections import deque
from string import ascii_lowercase
def word_ladder(start, end, words):
queue= deque([(start, [start])])
while queue:
word, path= queue.popleft()
if word== end:
return path
for i in range(len(word)):
for char in ascii_lowercase:
next_word = word[:i] +char+ word[i + 1:]
if next_word in words:
words.remove(next_word)
queue.append([next_word, path+ [next_word]])
return None
This takes O(n²) time and O(n) space.
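Both examples from the problem statement can be checked directly (the function is restated so the snippet is self-contained; note that it mutates the dictionary set passed in):

```python
from collections import deque
from string import ascii_lowercase

def word_ladder(start, end, words):
    queue = deque([(start, [start])])
    while queue:
        word, path = queue.popleft()
        if word == end:
            return path
        for i in range(len(word)):
            for char in ascii_lowercase:
                next_word = word[:i] + char + word[i + 1:]
                if next_word in words:
                    words.remove(next_word)
                    queue.append((next_word, path + [next_word]))
    return None

assert word_ladder('dog', 'cat', {'dot', 'dop', 'dat', 'cat'}) == \
    ['dog', 'dot', 'dat', 'cat']
assert word_ladder('dog', 'cat', {'dot', 'tod', 'dat', 'dar'}) is None
```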
10.4 Beat Snakes and Ladders
Snakes and Ladders is a game played on a 10 × 10 board, the goal of which is to get from square 1 to square 100. On each turn players will roll a six-sided die and move forward a number of spaces equal to the result. If they land on a square that represents a snake or ladder, they will be transported ahead or behind, respectively, to a new square.
A typical board for Snakes and Ladders looks like this:
Find the smallest number of turns it takes to play Snakes and Ladders.
For convenience, here are the squares representing snakes and ladders, and their
outcomes:
snakes= {17: 13, 52: 29, 57: 40, 62: 22, 88: 18, 95: 51, 97: 79}
ladders= {3: 21, 8: 30, 28: 84, 58: 77, 75: 86, 80: 100, 90: 91}
Solution
We know that during each turn a player has six possible moves, advancing from one
to six squares. So our first thought might be to recursively try each move, exploring
all possible paths. Then, we can return the length of the shortest path.
However, because there are snakes, we would eventually enter a never-ending loop,
repeatedly advancing to a snake square and being sent back. Even without this issue,
this solution would be exponentially slow, since we have six potential moves at each
square.
A more efficient method is to use a version of breadth-first search.
We can maintain a queue of tuples representing the current square and the number
of turns taken so far, starting with (0, 0). For each item popped from the queue,
we examine the moves that can be made from it. If a move crosses the finish line,
we've found a solution. Otherwise, if a move takes us to a square we have not already
visited, we add that square to the queue.
The key point here is that squares will only be put in the queue on the earliest turn
they can be reached. For example, even though it is possible to reach square 5 by
moving [1, 1, 1, 1, 1], the initial move 5 will get there first. So we can guarantee that
we will only examine each square once, and that the number of turns associated with
each square will be minimal.
from collections import deque

def minimum_turns(snakes, ladders):
    # Create a board with the given snakes and ladders.
    board = {square: square for square in range(1, 101)}
    for start, end in snakes.items():
        board[start] = end
    for start, end in ladders.items():
        board[start] = end

    # Perform BFS to reach the last square as quickly as possible.
    start, end = 0, 100
    turns = 0
    path = deque([(start, turns)])
    visited = set()

    while path:
        square, turns = path.popleft()
        for move in range(square + 1, square + 7):
            if move >= end:
                return turns + 1
            if move not in visited:
                visited.add(move)
                path.append((board[move], turns + 1))
Since each square is only placed in the queue once, and our queue operations take
constant time, this algorithm is linear in the number of squares.
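A compact restatement lets us sanity-check the BFS on a board with no snakes or ladders at all, where the answer must be ceil(100 / 6) = 17 rolls:

```python
from collections import deque

def minimum_turns(snakes, ladders):
    # Squares map to themselves unless a snake or ladder moves the player.
    board = {square: square for square in range(1, 101)}
    for start, end in {**snakes, **ladders}.items():
        board[start] = end

    path = deque([(0, 0)])  # (current square, turns taken)
    visited = set()
    while path:
        square, turns = path.popleft()
        for move in range(square + 1, square + 7):
            if move >= 100:
                return turns + 1
            if move not in visited:
                visited.add(move)
                path.append((board[move], turns + 1))

# With no snakes or ladders, we need ceil(100 / 6) = 17 rolls of six.
assert minimum_turns({}, {}) == 17
# A single ladder from 1 straight to 100 finishes on the second turn.
assert minimum_turns({}, {1: 100}) == 2
```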
10.5 Topological sort
We are given a hashmap associating each courseid key with a list of courseids
values, which tells us that the prerequisites of courseid are courseids. Return a
sorted ordering of courses such that we can complete the curriculum.
Return null if there is no such ordering.
For example, given the following prerequisites:
{
    'CSC300': ['CSC100', 'CSC200'],
    'CSC200': ['CSC100'],
    'CSC100': []
}

You should return ['CSC100', 'CSC200', 'CSC300'].
Solution
First, let's understand how the input we are given can be represented in the form of
a graph.
Our courses are related to each other by their order, so one promising way would be
to make each course a vertex and draw an edge from course A to course B if A is a
prerequisite for B.
Once this transformation is complete, our problem becomes one of traversing this
directed graph in order to efficiently find out which vertices come before other
ones. One technique specially designed to deal with questions like this is known as
topological sort.
To gain some context on how topological sort works, let's think about how we would
solve this problem manually.
Imagine that we have a to-do list and a result list, and to start our to-do list is
populated with all courses that do not have any prerequisites. We can start by taking
the first course in our to-do list and moving it to our result list. Next, we remove
it as a prerequisite from all its successor courses. If, while we are doing this, some
other course finds itself without any prerequisites, we can safely add it to the end of
our to-do list.
We can continue this process for each item at the top of our to-do list, and (if it is
possible) eventually make our way through the entire order. If in the end we are
unable to reach some courses, there must be a circular dependency.
from collections import defaultdict, deque

def find_order(course_to_prereqs):
    # Copy list values into a set for faster removal.
    course_to_prereqs = {c: set(p) for c, p in course_to_prereqs.items()}

    # Start off our to-do list with all courses without prerequisites.
    todo = deque([c for c, p in course_to_prereqs.items() if not p])

    # Create a new data structure to map prereqs to successor courses.
    prereq_to_courses = defaultdict(list)
    for course, prereqs in course_to_prereqs.items():
        for prereq in prereqs:
            prereq_to_courses[prereq].append(course)

    result = []
    while todo:
        prereq = todo.popleft()
        result.append(prereq)

        # Remove this prereq from all successor courses.
        # If any course now does not have any prereqs, add it to todo.
        for c in prereq_to_courses[prereq]:
            course_to_prereqs[c].remove(prereq)
            if not course_to_prereqs[c]:
                todo.append(c)

    # Circular dependency
    if len(result) < len(course_to_prereqs):
        return None

    return result
Topological sort takes O(V + E) time and space, where V is the number of vertices and E is the number of edges in our graph.
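On the example input, the algorithm produces the expected ordering, and a circular dependency correctly yields None (restated so the snippet runs standalone):

```python
from collections import defaultdict, deque

def find_order(course_to_prereqs):
    course_to_prereqs = {c: set(p) for c, p in course_to_prereqs.items()}
    todo = deque([c for c, p in course_to_prereqs.items() if not p])

    prereq_to_courses = defaultdict(list)
    for course, prereqs in course_to_prereqs.items():
        for prereq in prereqs:
            prereq_to_courses[prereq].append(course)

    result = []
    while todo:
        prereq = todo.popleft()
        result.append(prereq)
        for c in prereq_to_courses[prereq]:
            course_to_prereqs[c].remove(prereq)
            if not course_to_prereqs[c]:
                todo.append(c)

    # If some course was never freed of prerequisites, there is a cycle.
    return result if len(result) == len(course_to_prereqs) else None

prereqs = {'CSC300': ['CSC100', 'CSC200'], 'CSC200': ['CSC100'], 'CSC100': []}
assert find_order(prereqs) == ['CSC100', 'CSC200', 'CSC300']
assert find_order({'A': ['B'], 'B': ['A']}) is None
```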
Advanced Data Structures
In the preceding chapters we introduced the data structures that appear most frequently in coding interviews. These are both foundational building blocks of more
complicated structures and key ingredients of many of the algorithms that will be
considered in Part II.
Of course, we cannot cover the full breadth of this subject, as computer scientists and
those in industry have spent years developing data structures optimized for particular
applications. Depending on the company, role, and interviewer you may come across
a problem that would benefit from a more specialized approach.
In this chapter, we review a few advanced topics that are worth knowing. We
recommend becoming familiar with these, at least to the extent of knowing how
they work and when they are applicable. Besides allowing you to impress your
interviewer, these problems should also give you a glimpse of the wide landscape of
data structures.
We first review the Fenwick tree, designed to optimize both updates and range
sums in an array. Following this, we take a look at the disjoint-set data structure,
which efficiently allows you to partition and merge elements into groups. Finally,
we examine the Bloom filter, a probabilistic data structure which quickly checks
membership in a set.
CHAPTER 11. ADVANCED DATA STRUCTURES

11.1 Fenwick tree
You are given an array of length 24, where each element represents the number of new subscribers during the corresponding hour. Implement a data structure that efficiently supports the following:
• update( hour, value): Increment the element at index hour by value.
• query(start, end): Retrieve the number of subscribers that have signed up
between start and end (inclusive).
You can assume that all values get cleared at the end of the day, and that you will not
be asked for start and end values that wrap around midnight.
Solution
If we look beyond the details, the data structure required here is one that efficiently
supports finding the sum of a subarray, and updating individual values in the array.
One data structure that fits the bill is a binary indexed tree, or Fenwick tree.
To see how this works, suppose the subscribers for an 8-hour range are as follows: [4, 8, 1, 9, 3, 5, 5, 3], and we wanted to sum up the number of subscribers from index 0 to index n - 1. A naive solution would require us to go through each element and add it to a running total, which would be O(n). Instead, if we knew in advance some of the subarray sums, we could break our problem apart into precomputed subproblems. In particular, we can use a technique that relies on binary indexing.
We will create a new array of the same size of our subscriber array, and store values
in it as follows:
• If the index is even, simply store the value of subscribers[i].
• If the index is odd, store the sum of a range of values up to i whose length is a power of two.
This is demonstrated in the diagram below, with x representing values of the original
array, and arrows representing range sums.
[Figure: indices 1 through 8, with x marking entries that store a single value and arrows marking power-of-two range sums]
How does this help us? For any range between 0 and n - 1, we can break it apart into these binary ranges, in such a way that we only require O(log n) parts.
To make this more concrete, let's look again at our subscriber array, [4, 8, 1, 9, 3, 5, 5, 3]. For this array, the binary indexed tree would be [4, 12, 1, 22, 3, 8, 5, 38]. As a result, we can calculate query(0, 6) using the following steps:

query(0, 6) = query(0, 3) + query(4, 5) + query(6, 6)
            = tree[4] + tree[6] + tree[7]
            = 22 + 8 + 5
            = 35.
Note that if our start index is not 0, we can transform our problem from query(a, b) to query(0, b) - query(0, a - 1), so this is applicable for any range.
To find the indices of our tree to sum up, we can use a clever bit manipulation trick. We can find the lowest set bit of a number x by performing x & -x. Using this, we can keep decrementing the index by the lowest set bit of the current index, until the index gets to zero. We can implement this as follows:
def query(self, index):
    total = 0
    while index > 0:
        total += self.tree[index]
        index -= index & -index
    return total
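The lowest-set-bit identity itself is easy to check on its own:

```python
def lowest_set_bit(x):
    # In two's complement, -x flips every bit above the lowest set bit,
    # so x & -x isolates exactly that bit.
    return x & -x

assert lowest_set_bit(6) == 2    # 0b110  -> 0b010
assert lowest_set_bit(12) == 4   # 0b1100 -> 0b0100
assert lowest_set_bit(7) == 1
assert lowest_set_bit(8) == 8
```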
Now let's take a look at the update operation. Changing the value of the 3rd item in the subscriber array from 1 to 2 would change the values of tree[3], tree[4], and tree[8]. Again, we can use the "lowest set bit" trick to increment the appropriate indices:
def update(self, index, value):
    while index < len(self.tree):
        self.tree[index] += value
        index += index & -index
Note that in order for this "trick" to work, we must prepend a zero to our tree array. Otherwise, our update operation would never work for the first index, since 0 & -0 = 0! Essentially, we will change the array to start with an index of one, and modify our function parameters accordingly.
Putting it all together, the code would look like this:
class BIT:
    def __init__(self, nums):
        # Prepend a zero to our array to use the lowest set bit trick.
        self.tree = [0 for _ in range(len(nums) + 1)]
        for i, num in enumerate(nums):
            self.update(i + 1, num)

    def update(self, index, value):
        while index < len(self.tree):
            self.tree[index] += value
            index += index & -index

    def query(self, index):
        total = 0
        while index > 0:
            total += self.tree[index]
            index -= index & -index
        return total
class Subscribers:
    def __init__(self, nums):
        self.bit = BIT(nums)
        self.nums = nums

    def update(self, hour, value):
        # Shift to the BIT's 1-based indexing, and apply the delta.
        self.bit.update(hour + 1, value - self.nums[hour])
        self.nums[hour] = value

    def query(self, start, end):
        # Shift start and end indices forward as our array is 1-based.
        return self.bit.query(end + 1) - self.bit.query(start)
Because we have decomposed each operation into binary ranges, both update and query are O(log n).
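As a sanity check on the walkthrough above, here is a condensed, self-contained version of the structure (with the 1-based shift applied consistently in both update and query); its internal tree and the example query reproduce the numbers worked out earlier:

```python
class BIT:
    def __init__(self, nums):
        # Index 0 is unused so the lowest-set-bit trick works.
        self.tree = [0] * (len(nums) + 1)
        for i, num in enumerate(nums):
            self.update(i + 1, num)

    def update(self, index, value):
        while index < len(self.tree):
            self.tree[index] += value
            index += index & -index

    def query(self, index):
        total = 0
        while index > 0:
            total += self.tree[index]
            index -= index & -index
        return total

class Subscribers:
    def __init__(self, nums):
        self.bit = BIT(nums)
        self.nums = nums

    def update(self, hour, value):
        # Shift to 1-based indexing and apply the delta.
        self.bit.update(hour + 1, value - self.nums[hour])
        self.nums[hour] = value

    def query(self, start, end):
        return self.bit.query(end + 1) - self.bit.query(start)

subs = Subscribers([4, 8, 1, 9, 3, 5, 5, 3])
```

With this in place, subs.bit.tree is [0, 4, 12, 1, 22, 3, 8, 5, 38] and subs.query(0, 6) returns 35, matching the hand computation.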
11.2
Disjoint-set data structure
A classroom consists of n students, whose friendships can be represented in an adjacency list. For example, the following describes a situation where 0 is friends with 1 and 2, 3 is friends with 6, and so on.
{0: [1, 2],
 1: [0, 5],
 2: [0],
 3: [6],
 4: [],
 5: [1],
 6: [3]}
Each student can be placed in a friend group, which can be defined as the transitive closure of that student's friendship relations. In other words, this is the smallest set such that no student in the group has any friends outside this group. For the example above, the friend groups would be {0, 1, 2, 5}, {3, 6}, {4}.
Given a friendship list such as the one above, determine the number of friend groups
in the class.
Solution
This problem is a classic motivating example for the use of a disjoint-set data structure.
To implement this data structure, we must create two main methods: union and
find. Initially, each student will be in a friend group consisting of only him- or
herself. For each friendship in our input, we will call our union method to place the
two students in the same set. To perform this, we must call find to discover which
friend group each student is in, and, if they are not the same, assign one student to
the friend group of the other.
It may be the case that, after a few union calls, friend n may be placed in set n - 1, friend n - 1 may be placed in set n - 2, and so on. As a result, our find operation must follow the chain of friend sets until reaching a student who has not been reassigned, which will properly identify the group.
Because the chain we must follow for each find call can be n students long, both methods run in O(n) time. However, with just two minor changes we can cut this runtime down to O(1) (technically, O(a(n)), where a is the inverse Ackermann function).
The first is called path compression. After we call find, we know exactly which group
the student belongs to, so we can reassign this student directly to that group. Second,
when we unite two students, instead of assigning based on value, we can always
assign the student belonging to the smaller set to the larger set. Taken together, these
optimizations drastically cut down the time it takes to perform both operations.
class DisjointSet:
    def __init__(self, n):
        self.sets = list(range(n))
        self.sizes = [1] * n
        self.count = n

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x != y:
            # Union by size: always add students to the bigger set.
            if self.sizes[x] < self.sizes[y]:
                x, y = y, x
            self.sets[y] = x
            self.sizes[x] += self.sizes[y]
            self.count -= 1

    def find(self, x):
        group = self.sets[x]
        while group != self.sets[group]:
            group = self.sets[group]
        # Path compression: reassign x to the correct group.
        self.sets[x] = group
        return group
With this data structure in place, our solution will be to go through the list of
friendships, calling union on each of them. Each time we reassign a student to a
different group, we decrement a counter for the number of friend groups, which
starts at n. Finally, we return the value of this counter.
def friend_groups(students):
    groups = DisjointSet(len(students))
    for student, friends in students.items():
        for friend in friends:
            groups.union(student, friend)
    return groups.count
Since union and find operations are both O(1), the time complexity of this solution is O(E), where E is the number of edges represented in the adjacency list. We will also use O(n) space to store the list of assigned friend groups.
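As a quick check, a compact, self-contained version of this solution run on the example adjacency list reports three friend groups:

```python
class DisjointSet:
    def __init__(self, n):
        self.sets = list(range(n))
        self.sizes = [1] * n
        self.count = n

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x != y:
            # Union by size: merge the smaller set into the bigger one.
            if self.sizes[x] < self.sizes[y]:
                x, y = y, x
            self.sets[y] = x
            self.sizes[x] += self.sizes[y]
            self.count -= 1

    def find(self, x):
        group = self.sets[x]
        while group != self.sets[group]:
            group = self.sets[group]
        self.sets[x] = group
        return group

def friend_groups(students):
    groups = DisjointSet(len(students))
    for student, friends in students.items():
        for friend in friends:
            groups.union(student, friend)
    return groups.count

students = {0: [1, 2], 1: [0, 5], 2: [0], 3: [6], 4: [], 5: [1], 6: [3]}
```

Here friend_groups(students) returns 3, matching {0, 1, 2, 5}, {3, 6}, {4}.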
11.3
Bloom filter
Implement a data structure which carries out the following operations without
resizing the underlying array:
• add(value): add a value to the set of values.
• check(value): check whether a value is in the set.
The check method may return occasional false positives (in other words, incorrectly
identifying an element as part of the set), but should always correctly identify a true
element.
Solution
While there may be multiple ways of implementing a data structure with these
operations, one of the most well-known is called a Bloom filter. A Bloom filter works
by hashing each item in multiple ways, so that several values in the array will be set
to True for any given input. Then, when we want to check whether a given value has
been added, we examine each of the locations that it can be sent to in the hash table,
and only return True if all have been set.
To give a simple example, suppose we are dealing with an underlying array of size 100, and have two hash functions which take in integers as input, defined as follows:

def h1(value):
    return ((value + 7) ** 11) % 100

def h2(value):
    return ((value + 11) ** 7) % 100
Now suppose we added 3 and 5 to our set. This would involve the following operations:

location1 = h1(3)  # 0
location2 = h2(3)  # 4
array[location1] = array[location2] = True

location1 = h1(5)  # 88
location2 = h2(5)  # 56
array[location1] = array[location2] = True
For most cases this will not cause any problems. However, look what happens when we check 99. Since h1(99) = 56 and h2(99) = 0, both values we check in the array would already be assigned True, and we would report this as being in our set.
Although we can reduce the likelihood of this occurring by using more optimal hash
functions, and by increasing the initial array size, we cannot get around the fact that
a Bloom filter will occasionally return false positives. For this reason it is called a
probabilistic data structure.
import hashlib

class BloomFilter:
    def __init__(self, n=1000, k=3):
        self.array = [False] * n
        self.hash_algorithms = [
            hashlib.md5,
            hashlib.sha1,
            hashlib.sha256,
            hashlib.sha384,
            hashlib.sha512
        ]
        self.hashes = [self._get_hash(f) for f in self.hash_algorithms[:k]]

    def _get_hash(self, f):
        def hash_function(value):
            h = f(str(value).encode('utf-8')).hexdigest()
            return int(h, 16) % len(self.array)
        return hash_function

    def add(self, value):
        for h in self.hashes:
            v = h(value)
            self.array[v] = True

    def check(self, value):
        for h in self.hashes:
            v = h(value)
            if not self.array[v]:
                return False
        return True
In the implementation above we cap the number of hash functions at five, and use built-in cryptographic algorithms that come with Python's hashlib library.
Since the number of hashes we must perform for each value is bounded by a constant, and each hash is O(1), the time complexity of each operation will also be constant. The space complexity will be O(n), where n is the size of the underlying array.
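A short usage sketch of the class above: values that have been added are always reported present (no false negatives), while almost all values never added come back False.

```python
import hashlib

class BloomFilter:
    def __init__(self, n=1000, k=3):
        self.array = [False] * n
        algorithms = [hashlib.md5, hashlib.sha1, hashlib.sha256]
        self.hashes = [self._get_hash(f) for f in algorithms[:k]]

    def _get_hash(self, f):
        def hash_function(value):
            # Hash the string form of the value, then fold it into the array.
            h = f(str(value).encode('utf-8')).hexdigest()
            return int(h, 16) % len(self.array)
        return hash_function

    def add(self, value):
        for h in self.hashes:
            self.array[h(value)] = True

    def check(self, value):
        return all(self.array[h(value)] for h in self.hashes)

bf = BloomFilter()
for word in ['cat', 'dog', 'bird']:
    bf.add(word)
```

After these three inserts at most nine slots out of 1000 are set, so a false positive on any particular unseen value is very unlikely, but never impossible.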
Part II
Algorithms
Recursion
Recursion is one of the core concepts of computer science. It is a powerful technique
that involves breaking down a problem into smaller subproblems in order to obtain
a solution.
In general, every recursive solution consists of defining two parts:
• The base case: what should the algorithm do in the simplest situation?
• The inductive step: how does the algorithm build up the solution?
A common example is computing the nth Fibonacci number (or computing the nth anything, really, as long as there is a clear method of going from n - 1 to n).
The base case involves defining the first two Fibonacci numbers, 1 and 1. From here, we define the inductive step, which will apply to every subsequent term:
f(n) = f(n - 1) + f(n - 2)
A hallmark of every recursive solution is that you call your function on a subset of the input inside the function itself.
To illustrate, here is the full Fibonacci implementation, in only a few lines of code:
CHAPTER 12. RECURSION
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n - 1) + fib(n - 2)
A common mistake beginners make is to leave out the base case. Without this, our
algorithm would never terminate!
The downside of recursion is that it can be very inefficient: the number of calls to
fib in the code above grows exponentially with n! For this reason we will explore
improvements to recursive approaches in subsequent chapters such as dynamic
programming.
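To preview one such improvement: caching each result (memoization) brings the runtime down to linear, since each value of n is then computed only once. A minimal sketch using Python's functools.lru_cache:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Same recursion as above, but each n is computed only once.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
```

With the cache in place, even fib(50) returns instantly, whereas the plain recursive version would make billions of calls.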
You can bet that most problems involving tree or graph traversal, searching, or
backtracking will use some kind of recursion, and even problems that have more
optimal solutions will often have a recursive solution that you can rely on in a pinch.
For more details on recursion, see Chapter 12.
12.1
Tower of Hanoi
The Tower of Hanoi is a puzzle game with three rods and n disks, each a different
size.
[Diagram: three rods labeled A, B, and C, with all disks stacked on rod A.]
All the disks start off on the first rod in a stack. They are ordered by size, with the
largest disk on the bottom and the smallest one at the top.
The goal of this puzzle is to move all the disks from the first rod to the last rod while
following these rules:
• You can only move one disk at a time.
• A move consists of taking the uppermost disk from one of the stacks and
placing it on top of another stack.
• You cannot place a larger disk on top of a smaller disk.
Write a function that prints out all the steps necessary to complete the Tower of
Hanoi. You should assume that the rods are numbered, with the first rod being 1,
the second (auxiliary) rod being 2, and the last (goal) rod being 3.
For example, with n = 3, we can do this in 7 moves:
• Move 1 to 3
• Move 1 to 2
• Move 3 to 2
• Move 1 to 3
• Move 2 to 1
• Move 2 to 3
• Move 1 to 3
Solution
The goal of the Tower of Hanoi is to get all n disks from the source peg to the
target peg, using a spare peg and abiding by all the constraints. Why does this call
for recursion? Note that after some series of moves we will arrive at a new state,
hopefully one closer to the solution. We can think of this sequence of states as
subproblems to be solved.
As mentioned in the introduction, the first step to any recursive solution is formulating
a base case and an inductive case. First let's consider the base cases:
• If there are 0 disks, do nothing, since we are done.
• If there is only 1 disk, we can move it directly from the source peg to the target
peg.
Now, let's assume we have an existing tower_of_hanoi function that can move n disks from a source peg to a target peg using a spare stack. The recurrence would then look like this:
then look like this:
• If there is more than 1 disk, then we can do the following:
- Recursively move n - 1 disks from the source stack to the spare stack
- Move the last (biggest) disk from the source stack to the target stack
- Recursively move all n - 1 disks from the spare stack to the target stack
We are able to recursively move the disks because it doesn't break any constraints:
we can just treat the base disk as if it weren't there.
In our code, we'll call our source stack a, spare stack b, and target stack c.
def tower_of_hanoi(n, a='1', b='2', c='3'):
    if n >= 1:
        tower_of_hanoi(n - 1, a, c, b)
        print('Move {} to {}'.format(a, c))
        tower_of_hanoi(n - 1, b, a, c)
This will run in O(2^n) time, since for each call we're recursively calling ourselves twice. This should also take O(n) space, since the function call stack goes n calls deep.
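To check this against the 7-move example above, here is a variant of the same recursion that collects the moves in a list instead of printing them (the list-returning wrapper is our own addition for testing, not the printed version):

```python
def hanoi_moves(n, a='1', b='2', c='3'):
    # Same structure as tower_of_hanoi, but returns the moves.
    if n < 1:
        return []
    return (hanoi_moves(n - 1, a, c, b)
            + ['Move {} to {}'.format(a, c)]
            + hanoi_moves(n - 1, b, a, c))

moves = hanoi_moves(3)
```

For n = 3 this produces exactly the seven moves listed in the problem statement, and in general the recursion makes 2^n - 1 moves.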
12.2
Implement regular expressions
Implement regular expression matching with the following special characters:

• . (period), which matches any single character
• * (asterisk), which matches zero or more of the preceding element

That is, implement a function that takes in a string and a valid regular expression and returns whether or not the string matches the regular expression.

For example, given the regular expression "ra." and the string "ray", your function should return True. The same regular expression on the string "raymond" should return False.

Given the regular expression ".*at" and the string "chat", your function should return True. The same regular expression on the string "chats" should return False.
Solution
Let's think about how we can apply recursion here. Note that if the head of the
string matches the head of the regex, we can reduce our problem to comparing the
remainders of both. This in fact is a common tactic for finding the inductive step in
string problems.
The special characters . and * make implementing this a bit trickier, however, since with * we can match 0 or any number of characters in the beginning.
The basic idea, then, is to do the following. Let's call the string we want to match s and the regex r. Our base case here will be when r is empty, in which case we can return True if s is also empty, and False otherwise.
For the inductive step, let's first consider the case where the first character in r is not succeeded by a *. In this situation, we can safely compare the first character of both r and s. If these match, we recursively continue to analyze matches(s[1:], r[1:]). Otherwise, we can return False.
Finally, if the first character in r is in fact succeeded by a *, we can try every suffix substring of s on r[2:] and return True if any of them provides a working solution.
The code should look something like this:
def matches_first_char(s, r):
    return len(s) > 0 and (s[0] == r[0] or r[0] == '.')

def matches(s, r):
    if r == '':
        return s == ''
    if len(r) == 1 or r[1] != '*':
        # The first character in the regex is not succeeded by a *.
        if matches_first_char(s, r):
            return matches(s[1:], r[1:])
        else:
            return False
    else:
        # The first character is succeeded by a *.
        # First, try zero length.
        if matches(s, r[2:]):
            return True
        # If that doesn't match straight away, try globbing more prefixes
        # until the first character of the string doesn't match anymore.
        i = 0
        while matches_first_char(s[i:], r):
            if matches(s[i + 1:], r[2:]):
                return True
            i += 1
        return False
This takes O(len(s) x len(r)) time and space, since we potentially need to iterate over each suffix substring again for each character.
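As an independent sanity check, the examples from the problem statement can be verified against Python's built-in re module; re.fullmatch is used here because, like our function, it must match the entire string:

```python
import re

examples = [
    ('ray', 'ra.', True),
    ('raymond', 'ra.', False),
    ('chat', '.*at', True),
    ('chats', '.*at', False),
]

for s, pattern, expected in examples:
    # re.fullmatch returns a Match object on success and None otherwise.
    assert (re.fullmatch(pattern, s) is not None) == expected
```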
Fun fact: Stephen Kleene introduced the * operator in regular expressions and as
such, it is sometimes referred to as the Kleene star.
12.3
Find array extremes efficiently
Given an array of numbers of length n, find both the minimum and maximum using less than 2 * (n - 2) comparisons.
Solution
It is certainly possible to solve this without using a recursive approach. The trick here
is to notice that each comparison actually provides two pieces of information: the
smaller element cannot possibly be the maximum of the list, and the larger element
cannot be the minimum. So if we take successive pairs of the list, we only need to
compare the smaller one to our running minimum, and the larger one to our running
maximum.
For example, take the input [4, 2, 7, 5, -1, 3, 6]. To start out, both the minimum and maximum can be initialized as the first element of the list, 4. Then, we examine the next pair to find that 2 is smaller and 7 is bigger. So we update our running minimum to be min(4, 2) = 2 and update our running maximum to be max(4, 7) = 7. Carrying along like this, we would eventually find the minimum and maximum to be -1 and 7, respectively.
Since it takes n/2 comparisons to order each pair, and each element will be compared
to either the running minimum or maximum, this algorithm uses about 3 * n/2
comparisons.
def min_and_max(arr):
    min_element = max_element = arr[0]
    compare = lambda x, y: (x, y) if y > x else (y, x)
    # Make the list odd so we can pair up the remaining elements neatly.
    if len(arr) % 2 == 0:
        arr.append(arr[-1])
    for i in range(1, len(arr), 2):
        smaller, bigger = compare(arr[i], arr[i + 1])
        min_element = min(min_element, smaller)
        max_element = max(max_element, bigger)
    return min_element, max_element
A more elegant approach is to use a technique called divide and conquer. Divide and
conquer uses the same base-and-inductive approach as typical recursive solutions,
but each subproblem is mutually exclusive. In other words, the input is neatly split
into separate divisions which are then combined to form the solution.
For the problem at hand, our base cases are as follows:
• If there is only one element in the array, return that element for both the
minimum and maximum.
• If there are two elements, return the smaller one as the minimum, and the
larger one as the maximum.
For the general case, we recursively apply our algorithm to the left and right halves
of our array, and return the minimum and maximum of the results.
def min_and_max(arr):
if len(arr) == 1:
return arr[0], arr[0]
elif len(arr) == 2:
        return (arr[0], arr[1]) if arr[0] < arr[1] else (arr[1], arr[0])
else:
n = len(arr) // 2
lmin, lmax = min_and_max(arr[:n])
rmin, rmax = min_and_max(arr[n:])
return min(lmin, rmin), max(lmax, rmax)
To be more concrete, here are the intermediate steps when this is applied to the
example above:
First, let's recursively break down the array:

[4, 2, 7, 5, -1, 3, 6] → [4, 2, 7, 5] and [-1, 3, 6]
→ [4, 2], [7, 5] and [-1, 3], [6]

Then, reorder so that smaller comes before larger:

[2, 4], [5, 7] and [-1, 3], [6]

Finally, merge to find min and max:

(min(2, 5), max(4, 7)) = (2, 7) and (min(-1, 6), max(3, 6)) = (-1, 6)
(min(2, -1), max(7, 6)) = (-1, 7)
We can derive the complexity of this algorithm as follows. For each array of size n, we are breaking down the problem into two subproblems of size n/2, plus 2 additional comparisons.

More formally, T(n) = 2 x T(n/2) + 2, with base case T(2) = 1. When n is a power of two, this recurrence relation resolves exactly to T(n) = 3 x n/2 - 2; otherwise, it will take a few more steps.
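Running the divide-and-conquer version on the example confirms the result:

```python
def min_and_max(arr):
    # Base cases: one or two elements.
    if len(arr) == 1:
        return arr[0], arr[0]
    elif len(arr) == 2:
        return (arr[0], arr[1]) if arr[0] < arr[1] else (arr[1], arr[0])
    # Recurse on each half and combine the results.
    n = len(arr) // 2
    lmin, lmax = min_and_max(arr[:n])
    rmin, rmax = min_and_max(arr[n:])
    return min(lmin, rmin), max(lmax, rmax)
```

Here min_and_max([4, 2, 7, 5, -1, 3, 6]) returns (-1, 7), as traced above.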
12.4 Play Nim
The game of Nim is played as follows. Starting with three heaps, each containing a
variable number of items, two players take turns removing one or more items from
a single pile. The player who eventually is forced to take the last stone loses. For
example, if the initial heap sizes are 3, 4, and 5, a game could be played as shown
below:
A | B | C | Action
3 | 4 | 5 | (start)
3 | 1 | 5 | Player 1 takes 3 items from B
3 | 1 | 3 | Player 2 takes 2 items from C
0 | 1 | 3 | Player 1 takes 3 items from A
0 | 1 | 0 | Player 2 takes 3 items from C
0 | 0 | 0 | Player 1 takes 1 item from B
Player 1 loses.
In other words, to start, the first player takes three items from pile B. The second
player responds by removing two stones from pile C. The game continues in this
way until player one takes the last stone and loses.
Given a list of non-zero starting values [a, b, c], and assuming optimal play,
determine whether the first player has a forced win.
Solution
Problems that involve two-player games often can be solved with a minimax approach.
Minimax is a recursive algorithm that involves evaluating all possible opponent moves
and choosing the one that minimizes the maximum value the opponent can receive.
For the base case, we know that if the piles dwindle down to (0, 0, 0), the current player to move is the winner, since the last player must have removed the final stone.

Now let's say you are faced with the heaps (1, 3, 0). There are many possible moves, but the only good one is to remove everything in pile B, so that your opponent is forced to take the item in pile A. For any other move, there is a response that makes this a losing game. In other words, the value of a given move to player one is equivalent to the value of the best response to player two.
The list of possible moves can be generated by taking between one and all items
from each pile. As a result, we can define a recursive solution that enumerates all
possible moves, and returns True if any of them prevent the opponent from making
an optimal move.
def update(heaps, pile, items):
    heaps = list(heaps)
    heaps[pile] -= items
    return tuple(heaps)

def get_moves(heaps):
    moves = []
    for pile, count in enumerate(heaps):
        for i in range(1, count + 1):
            moves.append(update(heaps, pile, i))
    return set(moves)

def nim(heaps):
    if heaps == (0, 0, 0):
        return True
    moves = get_moves(heaps)
    return any(nim(move) != True for move in moves)
At the start of the game, if each pile has a, b, and c items respectively, there will be a + b + c possible moves, which we can denote by n. Unfortunately, because of our recursive approach, each subsequent move may only bring the number of items down by one, leading to a run time of O(n!).
In fact, though, there is a bitwise solution to this game that is only O(1)!
Note that the losing state, (0, 0, 0), has an xor product, or "nim-sum", of 0. Though it is trickier to see, it is also the case that for any given state it is possible to make a move that turns this product from zero to non-zero, and vice versa. Therefore, we can use the nim-sum after each pair of moves as an invariant.

More concretely, if you are playing Nim, and for a given move your opponent turns the nim-sum from 0 to 3, you can make a move that turns it back to zero, putting your opponent back in a losing state. As a result, with the exception of one special case, a game is a win for the first player if and only if its nim-sum is nonzero.
def nim(heaps):
    a, b, c = heaps
    if a == b == c == 1:
        return False
    return a ^ b ^ c != 0
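To see the two approaches agree, here is a self-contained sketch comparing the recursive minimax version with the nim-sum shortcut on a few small positions (nim_fast is our name for the bitwise version, to keep both in one script):

```python
def get_moves(heaps):
    # All states reachable by removing 1..count items from one pile.
    moves = set()
    for pile, count in enumerate(heaps):
        for i in range(1, count + 1):
            new = list(heaps)
            new[pile] -= i
            moves.add(tuple(new))
    return moves

def nim(heaps):
    # True if the player to move wins with optimal play.
    if heaps == (0, 0, 0):
        return True
    return any(not nim(move) for move in get_moves(heaps))

def nim_fast(heaps):
    # Bitwise shortcut: nonzero nim-sum wins, except the all-ones case.
    a, b, c = heaps
    if a == b == c == 1:
        return False
    return a ^ b ^ c != 0
```

For example, (3, 4, 5) has nim-sum 3 ^ 4 ^ 5 = 2, so it is a first-player win, while (1, 2, 3) has nim-sum 0 and is a loss; both functions agree on these inputs.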
Dynamic Programming
Dynamic programming is a technique which combines the generality of recursion
with the efficiency of greedy algorithms. The key idea is to break a problem down
into reusable subproblems which combine to form the solution.
More formally, dynamic programming is an excellent choice when a problem exhibits
two key features:
• overlapping subproblems: there is a way to partition the problem into smaller,
modular components
• optimal substructure: these modular components can be efficiently put together
to obtain a solution
In particular, each subproblem should only be solved once, after which the solution
should be cached and reused whenever that state is reached again.
To take a simple example, let's try to figure out the number of ways it is possible
to lay pennies and nickels in a line on a table such that they sum to a dollar. If we
were solving this with recursion, we might come up with the following recursive
relationship:
CHAPTER 13. DYNAMIC PROGRAMMING
f(n) = f(n - 5) + f(n - 1)
In other words, to get to n we can either add a nickel to a row that previously summed to n - 5, or add a penny to a row that summed to n - 1. Using the base case f(1) = 1, we could solve for n = 100 and eventually compute the answer. Seeing as there are in fact 823,322,219,501 arrangements, this might take a while!
The reason it would take a while is that we are solving the same subproblems over and over again: each computation path is independent, so even though we may have calculated f(50) previously we cannot reuse the result. With dynamic programming, we cache these previous results, so that in effect we perform as many computations as necessary to find the values for f(1), f(2), f(3), and so on, up to f(n).
We have two options for implementing this logic, known as top-down and bottom-up
dynamic programming. With the top-down approach, we write code very similar to
a recursive solution, but check before each calculation whether the result has already
been stored in a cache. This may also be called memoization.
def coin_ways(n, cache={0: 1}):
    if n in cache:
        return cache[n]
    if n < 0:
        return 0
    cache[n] = coin_ways(n - 1) + coin_ways(n - 5)
    return cache[n]
Note how after each call to coin_ways, we store the result in our cache.
The bottom-up approach, on the other hand, methodically builds up the values for f(1), f(2), and so on, one after the other, typically by adding values to an array or dictionary. Once all values have been computed, we simply return the final one.
def coin_ways(n):
    cache = {0: 1}
    for i in range(1, n + 1):
        cache[i] = cache.get(i - 1, 0) + cache.get(i - 5, 0)
    return cache[n]
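The bottom-up version can be spot-checked against values that are easy to enumerate by hand; for instance, 6 cents can be laid out as six pennies, penny-then-nickel, or nickel-then-penny, so f(6) = 3:

```python
def coin_ways(n):
    # cache[i] counts ordered rows of pennies (1) and nickels (5) summing to i.
    cache = {0: 1}
    for i in range(1, n + 1):
        cache[i] = cache.get(i - 1, 0) + cache.get(i - 5, 0)
    return cache[n]
```

Similarly, 12 cents admits 15 arrangements: one with no nickels, C(8, 1) = 8 with one nickel, and C(4, 2) = 6 with two nickels.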
In general, dynamic programming is a good tool for counting the number of solutions,
as in the problem above, or for finding an optimal solution. As such it is frequently
used as a building block for more complicated algorithms involving shortest path
discovery, text similarity, and combinatorial optimization (such as the knapsack
problem).
A method we recommend to solve these problems, and one we will follow in the
coming solutions, is to carry out the following steps:
1. Identify the recurrence relation: how can the problem be broken down into
smaller parts?
2. Initialize a cache capable of storing the values for each subproblem.
3. Create a memoized function (if top-down) or loop (if bottom-up) which
populates these cache values.
13.1
Number of ways to climb a staircase
There exists a staircase with n steps which you can climb up either 1 or 2 steps at
a time. Given n, write a function that returns the number of unique ways you can
climb the staircase. The order of the steps matters.
For example, if n is 4, then there are 5 unique ways:
• 1, 1, 1, 1
• 2, 1, 1
• 1, 2, 1
• 1, 1, 2
• 2, 2
Follow-up: what if, instead of being able to climb 1 or 2 steps at a time, you could climb any number from a set of positive integers X? For example, if X = {1, 3, 5}, you could climb 1, 3, or 5 steps at a time.
Solution
It's always good to start off with some test cases. Let's start with small cases and see if we can find some sort of pattern.
• n = 1: [1]
• n = 2: [1, 1], [2]
• n = 3: [1, 2], [1, 1, 1], [2, 1]
• n = 4: [1, 1, 2], [2, 2], [1, 2, 1], [1, 1, 1, 1], [2, 1, 1]
What's the relationship?
The only ways to get to n = 3 are to first get to n = 1 and then go up by 2 steps, or get to n = 2 and go up by 1 step. More mathematically, f(3) = f(2) + f(1).

Now let's examine n = 4. Here, we can only get to the 4th step by getting to the third step and moving up by one, or by getting to the second step and moving up by two. Therefore, f(4) = f(3) + f(2).

In other words, our recurrence relation is f(n) = f(n - 1) + f(n - 2). That's just the Fibonacci sequence!
Let's see if we can generalize this to an arbitrary set of steps X. Similar reasoning tells us that if X = {1, 3, 5}, then our formula should be f(n) = f(n - 1) + f(n - 3) + f(n - 5). If n < 0, then we should return 0, since we can't start from a negative number of steps.
def staircase(n, X):
if n < 0:
return 0
elif n == 0:
return 1
else:
return sum(staircase(n - step, X) for step in X)
This is still very slow (O(|X|^n)), since we are repeating computations. We can use bottom-up dynamic programming to speed it up.
Each entry cache[i] will contain the number of ways we can get to step i with the set X. We will build up the array from zero using the same recurrence as before:
def staircase(n, X):
    cache = [0 for _ in range(n + 1)]
    cache[0] = 1
    for i in range(1, n + 1):
        cache[i] += sum(cache[i - step] for step in X if i - step >= 0)
    return cache[n]
Our algorithm now takes O(n x |X|) time and uses O(n) space, thanks to dynamic programming.
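As a quick check, the bottom-up version reproduces the example (5 ways for n = 4 with steps {1, 2}) as well as the follow-up step set:

```python
def staircase(n, X):
    # cache[i] counts ordered ways to reach step i using step sizes in X.
    cache = [0] * (n + 1)
    cache[0] = 1
    for i in range(1, n + 1):
        cache[i] = sum(cache[i - step] for step in X if i - step >= 0)
    return cache[n]
```

For X = {1, 3, 5} and n = 5, the 5 ways are 1+1+1+1+1, 1+1+3, 1+3+1, 3+1+1, and 5.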
13.2 Number of ways to decode a string
Given the mapping a = 1, b = 2, ..., z = 26, and an encoded message, count the number of ways it can be decoded.

For example, the message "111" should give 3, since it could be decoded as "aaa", "ka", and "ak".
You can assume that the messages are always decodable. For example, "001" is not
allowed.
Solution
First, let's try to think of a recurrence we can use for this problem. One way to look
for a pattern is to examine some simple cases.
• "", the empty string, should return 1.
• "1" should return 1, since we can parse it as "a" + "".
• "11" should return 2, since we can parse it as "a" + "a" + "" and "k" + "".
• "111" should return 3, since we can parse it as:
  - "a" + "k" + ""
  - "k" + "a" + ""
  - "a" + "a" + "a" + ""
• "011" should return 0, since no letter starts with 0 in our mapping.
This is a good starting point. We can first note that for our base case, any time a string's length is less than or equal to one, there can only be one encoding.
What happens when the string is at least two digits? There are two possibilities:
• The first letter is encoded alone
• The first two digits form a number k <= 26, and are encoded as a pair
For each of these options, if applicable, we recursively count the number of encodings
using the remainder of the string and add them to a running total.
def num_encodings(s, total=0):
    # There is no valid encoding if the string starts with 0.
    if s.startswith('0'):
        return 0
    # Both the empty string and a single character should return 1.
    elif len(s) <= 1:
        return 1
    total += num_encodings(s[1:])
    if int(s[:2]) <= 26:
        total += num_encodings(s[2:])
    return total
However, this solution is not very efficient. Every branch calls itself recursively twice, so our runtime is O(2^n). We can do better by using dynamic programming.
Using an approach typical of bottom-up dynamic programming, we can use the same idea as above, but modify our logic to start from the base case and build up the solution. In particular, we maintain a cache that stores the number of ways to encode any substring s[i:]. Then, for each index from n - 1 down to 0, we compute the number of possible solutions starting at that index and store the result to use in later calculations.
from collections import defaultdict

def num_encodings(s):
    cache = defaultdict(int)
    cache[len(s)] = 1
    for i in reversed(range(len(s))):
        if s[i].startswith('0'):
            cache[i] = 0
        elif i == len(s) - 1:
            cache[i] = 1
        else:
            cache[i] += cache[i + 1]
            if int(s[i:i + 2]) <= 26:
                cache[i] += cache[i + 2]
    return cache[0]
Since each iteration takes O(1), the whole algorithm now runs in O(n) time.
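As a quick check of the bottom-up solution against the examples discussed above (plus "626", which decodes as "fbf" or "fz"):

```python
from collections import defaultdict

def num_encodings(s):
    # cache[i] holds the number of ways to decode the suffix s[i:].
    cache = defaultdict(int)
    cache[len(s)] = 1
    for i in reversed(range(len(s))):
        if s[i] == '0':
            cache[i] = 0
        elif i == len(s) - 1:
            cache[i] = 1
        else:
            cache[i] = cache[i + 1]
            if int(s[i:i + 2]) <= 26:
                cache[i] += cache[i + 2]
    return cache[0]
```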
13.3
Painting houses
A builder is looking to build a row of n houses that can be of k different colors. She
has a goal of minimizing cost while ensuring that no two neighboring houses are of
the same color.
Given an n by k matrix where the entry at the ith row and jth column represents the cost to build the ith house with the jth color, return the minimum cost required to achieve this goal.
Solution
The brute force solution here would be to generate all possible combinations of houses and colors, filter out invalid combinations, and keep track of the lowest cost seen. This would take O(k^n) time.
We can solve this problem faster using dynamic programming. We will maintain a matrix cache where every entry [i][j] represents the minimum cost of painting house i the color j, as well as painting every house before i. We can calculate this by looking at the minimum cost of painting house i - 1 any color except j, since repeating a color would break our constraint. We'll initialize the first row with zeroes to start. Then, we just have to look at the smallest value in the last row of our cache, since that represents the minimum cost of painting every house.
def build_houses(matrix):
    n = len(matrix)
    k = len(matrix[0])
    # Solution matrix: entry [i][j] represents the minimum cost to
    # build house i with color j.
    solution_matrix = [[0] * k]
    for r, row in enumerate(matrix):
        row_cost = []
        for c, val in enumerate(row):
            row_cost.append(min(solution_matrix[r][i]
                                for i in range(k)
                                if i != c) + val)
        solution_matrix.append(row_cost)
    return min(solution_matrix[-1])
This runs in O(n * k^2) time and O(n * k) space. Can we do even better than this?
First off, notice that we're only ever looking at the last row when computing the
next row's cost. That suggests that we only need to keep track of one array of size k
instead of a whole matrix of size n x k:
def build_houses(matrix):
    k = len(matrix[0])
    solution_row = [0] * k
    for row in matrix:
        new_row = []
        for c, val in enumerate(row):
            new_row.append(min(solution_row[i]
                               for i in range(k)
                               if i != c) + val)
        solution_row = new_row
    return min(solution_row)
Now we're only using O(k) space!
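To sanity-check the space-optimized version, here is a condensed sketch (assuming at least two colors, since each house must differ from its neighbor) run on a couple of small cost matrices:

```python
def build_houses(matrix):
    # Assumes k >= 2 colors, so the min over "other colors" is never empty.
    k = len(matrix[0])
    solution_row = [0] * k
    for row in matrix:
        solution_row = [min(solution_row[i] for i in range(k) if i != c) + val
                        for c, val in enumerate(row)]
    return min(solution_row)

# Two houses, two colors: the valid assignments cost 1 + 4 or 2 + 3.
print(build_houses([[1, 2], [3, 4]]))  # 5
```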
Backtracking
Backtracking is an effective technique for solving algorithmic problems. In backtracking,
we perform a depth-first search for solutions, jumping back to the last valid
path as soon as we hit a dead end.
The benefit of backtracking is that when it is properly implemented, we are guaranteed
to find a solution, if one exists. Further, the solution will be more efficient than a
brute-force exploration, since we weed out paths that are known to be invalid, a
process known as pruning.
On the other hand, backtracking cannot guarantee that we will find an optimal
solution, and it often leads to factorial or exponential time complexity if we are
required to choose one of M paths at each of N steps.
There are three core questions to ask in order to determine whether backtracking is
the right algorithm to use for a problem.
1. Can you construct a partial solution?
2. Can you verify if the partial solution is invalid?
3. Can you verify if the solution is complete?
To illustrate this concept, we will walk through one of the most common examples of
backtracking: the N queens puzzle. In this problem, you are given an N x N board,
and asked to find the number of ways N queens can be placed on the board without
threatening each other. More explicitly, no two queens are allowed to share the same
row, column, or diagonal.
• Can we construct a partial solution?
Yes, we can tentatively place queens on the board.
• Can we verify if the partial solution is invalid?
Yes, we can check a solution is invalid if two queens threaten each other. To speed
this up, we can assume that all queens already placed so far do not threaten each
other, so we only need to check if the last queen we added attacks any other queen.
• Can we verify if the solution is complete?
Yes, we know a solution is complete if all N queens have been placed.
Now that we are confident that we can use backtracking, let's implement it. We'll
loop through the first row and try placing a queen in each column from left to right.
If we are able to find a valid location, we continue with the second row, and third
row, and so on, up to N. What differs here from brute force is that we'll be adding
the queens incrementally instead of all at once.
We will create an is_valid function that checks the board on each incremental
addition. This function will look at the last queen placed and see if any other queen can
threaten it. If so, we prune the branch, since there's no point pursuing it. Otherwise,
we'll recursively call our function with the new incremental solution. We only stop
once we hit the base case, which is when we've placed all queens on the board already.
We can represent our board as a one-dimensional array of integers from 1 to n, where
the value at index i represents the column the queen on row i is on. Since we're
working incrementally, we don't even need to initialize the whole board. We can just
append and pop as we go down the stack.
Here's the actual code in Python:
def n_queens(n, board=[]):
    if n == len(board):
        return 1
    count = 0
    for col in range(n):
        board.append(col)
        if is_valid(board):
            count += n_queens(n, board)
        board.pop()
    return count
def is_valid(board):
    current_queen_row, current_queen_col = len(board) - 1, board[-1]
    # Check if any queens can attack the last queen.
    for row, col in enumerate(board[:-1]):
        diff = abs(current_queen_col - col)
        if diff == 0 or diff == current_queen_row - row:
            return False
    return True
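To see the pruning test on its own, here is is_valid repackaged as a standalone sketch; recall that board[i] holds the column of the queen on row i, so two queens conflict exactly when their columns match or the column gap equals the row gap:

```python
def is_valid(board):
    current_queen_row, current_queen_col = len(board) - 1, board[-1]
    for row, col in enumerate(board[:-1]):
        diff = abs(current_queen_col - col)
        # Same column, or the column offset equals the row offset (diagonal).
        if diff == 0 or diff == current_queen_row - row:
            return False
    return True

print(is_valid([0, 2]))     # True: queens at (0, 0) and (1, 2) do not touch
print(is_valid([0, 1]))     # False: (0, 0) and (1, 1) share a diagonal
print(is_valid([0, 2, 0]))  # False: (0, 0) and (2, 0) share a column
```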
Now that you've got the hang of it, try your hand at some more interesting problems.
14.1
Compute flight itinerary
Given an unordered list of flights taken by someone, each represented as (origin,
destination) pairs, and a starting airport, compute a possible itinerary. If no such
itinerary exists, return null. All flights must be used in the itinerary.
For example, given the list of flights [('SFO', 'HKO'), ('YYZ', 'SFO'), ('YUL',
'YYZ'), ('HKO', 'ORD')] and starting airport 'YUL', you may return the list
['YUL', 'YYZ', 'SFO', 'HKO', 'ORD'].

Given the list of flights [('SFO', 'COM'), ('COM', 'YYZ')] and starting airport
'COM', you should return null.
Solution
Let's walk through our three-step process.
• Can we construct a partial solution?
Yes, we can build an (incomplete) itinerary and extend it by adding more flights to
the end.
• Can we verify if the partial solution is invalid?
Yes, a solution is invalid if there are no flights leaving from our last destination, and
there are still flights remaining that can be taken. Since we must use all flights, this
would mean we are at a dead end.
• Can we verify if the solution is complete?
Yes, a solution is complete once our itinerary uses all the flights.
Our strategy then will be as follows. We maintain a list that represents the current
itinerary. For each possible next flight, we try appending it to our path, and call our
function recursively on this new itinerary with the leftover flights. If no path can
succeed, we pop the last addition and continue.
def get_itinerary(flights, current_itinerary):
    # If we've used up all the flights, we're done.
    if not flights:
        return current_itinerary
    last_stop = current_itinerary[-1]
    for i, (origin, destination) in enumerate(flights):
        if origin != last_stop:
            continue
        # Make a copy of flights without the current one to mark it as used.
        flights_minus_current = flights[:i] + flights[i + 1:]
        current_itinerary.append(destination)
        result = get_itinerary(flights_minus_current, current_itinerary)
        if result is not None:
            return result
        # This flight led to a dead end, so undo it and try the next one.
        current_itinerary.pop()
    return None
At the first step there are up to n candidate flights to try, then n - 1, and so on.
As a result, similar to the preceding N queens problem, our time and space complexity
will be O(n!) in the worst case.
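Putting it together, here is a self-contained sketch of the search, run on the example from the problem statement:

```python
def get_itinerary(flights, current_itinerary):
    if not flights:
        return current_itinerary
    last_stop = current_itinerary[-1]
    for i, (origin, destination) in enumerate(flights):
        if origin != last_stop:
            continue
        # Remove the chosen flight from the pool and extend the itinerary.
        remaining = flights[:i] + flights[i + 1:]
        current_itinerary.append(destination)
        result = get_itinerary(remaining, current_itinerary)
        if result is not None:
            return result
        current_itinerary.pop()  # dead end: backtrack
    return None

flights = [('SFO', 'HKO'), ('YYZ', 'SFO'), ('YUL', 'YYZ'), ('HKO', 'ORD')]
print(get_itinerary(flights, ['YUL']))
# ['YUL', 'YYZ', 'SFO', 'HKO', 'ORD']
```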
14.2
Solve Sudoku
Sudoku is a puzzle where you're given a 9 by 9 grid partially filled with digits. The
objective is to fill the grid subject to the constraint that every row, column, and box
(3 by 3 subgrid) must contain all of the digits from 1 to 9.
Here is an example sudoku puzzle, whose completed solution looks like this:

2 5 8 7 3 6 9 4 1
6 1 9 8 2 4 3 5 7
4 3 7 9 1 5 2 6 8
3 9 5 2 7 1 4 8 6
7 6 2 4 9 8 1 3 5
8 4 1 6 5 3 7 2 9
1 8 4 3 6 9 5 7 2
5 7 6 1 4 2 8 9 3
9 2 3 5 8 7 6 1 4
Implement an efficient sudoku solver.
Solution
Trying brute force on a sudoku board will take a really long time: we would need to
try every combination of the digits 1 through 9 for all the empty squares.
Let's try using backtracking to solve this problem instead. We can fill each empty
cell one by one and backtrack once we hit an invalid state.
By now you should know the drill.
• Can we construct a partial solution?
Yes, we can fill in some portions of the board.
• Can we verify if the partial solution is invalid?
Yes, the board is valid so far if no row, column, or square contains a repeated digit.
• Can we verify if the solution is complete?
Yes, the solution is complete when the board has been filled up.
For our algorithm, we will try filling each empty cell one by one, and backtrack once
we hit an invalid state.
To do this, we'll need a valid_so_far function that tests the board for its validity
by checking all the rows, columns, and squares. Then we'll backtrack as usual:
EMPTY = 0

def sudoku(board):
    if is_complete(board):
        return board
    r, c = find_first_empty(board)
    # Try setting the first empty cell to each value from 1 to 9.
    for i in range(1, 10):
        board[r][c] = i
        if valid_so_far(board):
            result = sudoku(board)
            if is_complete(result):
                return result
    board[r][c] = EMPTY
    return board

def is_complete(board):
    return all(all(val != EMPTY for val in row) for row in board)

def find_first_empty(board):
    for i, row in enumerate(board):
        for j, val in enumerate(row):
            if val == EMPTY:
                return i, j
    return False
def valid_so_far(board):
    if not rows_valid(board):
        return False
    if not cols_valid(board):
        return False
    if not blocks_valid(board):
        return False
    return True

def rows_valid(board):
    for row in board:
        if duplicates(row):
            return False
    return True

def cols_valid(board):
    for j in range(len(board[0])):
        if duplicates([board[i][j] for i in range(len(board))]):
            return False
    return True

def blocks_valid(board):
    for i in range(0, 9, 3):
        for j in range(0, 9, 3):
            block = []
            for k in range(3):
                for l in range(3):
                    block.append(board[i + k][j + l])
            if duplicates(block):
                return False
    return True

def duplicates(arr):
    c = {}
    for val in arr:
        if val in c and val != EMPTY:
            return True
        c[val] = True
    return False
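The duplicates helper is the heart of all three validity checks; note how repeated blanks are tolerated while repeated digits are not:

```python
EMPTY = 0

def duplicates(arr):
    seen = {}
    for val in arr:
        # A repeated value only invalidates the board if it is a real digit.
        if val in seen and val != EMPTY:
            return True
        seen[val] = True
    return False

print(duplicates([5, 3, 5]))          # True: the digit 5 repeats
print(duplicates([EMPTY, EMPTY, 7]))  # False: blanks may repeat freely
```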
14.3
Count Android unlock combinations
One way to unlock an Android phone is by swiping in a specific pattern across a
1-9 keypad, which looks like this:

1 2 3
4 5 6
7 8 9
For a pattern to be valid, it must satisfy the following criteria:
• All of its keys must be distinct.
• It must not connect two keys by jumping over a third key, unless that key has
already been used.
For example, 4 - 2 - 1 - 7 is a valid pattern, whereas 2 - 1 - 7 is not.
Find the total number of valid unlock patterns of length n, where 1 <= n <= 9.
Solution
Let's first try to solve the problem without any restrictions on jumping over keys.
If there are n starting numbers to choose from, we will have n - 1 options for the
second number, n - 2 options for the third, and so on.
Each time we visit a number, we mark it as visited, traverse all paths starting with
that number, and then remove it from the visited set. Along the way we keep a
running count of the number of paths seen thus far, which we eventually return as
our result.
def num_paths(current, visited, n):
    if n == 1:
        return 1
    paths = 0
    for number in range(1, 10):
        if number not in visited:
            visited.add(number)
            paths += num_paths(number, visited, n - 1)
            visited.remove(number)
    return paths
To modify this to account for jumps, we can use a dictionary mapping pairs of keys
to the key they skip over. Before visiting a number, we check to see that either the
current and next number do not exist as a pair in this dictionary, or that their value
has already been visited.
Notice also that because of the symmetry of the keypad, the number of patterns
starting from 1 is the same as the number of patterns starting from 3, 7, and 9. For
example, the path 1 - 6 - 3 - 8 can be rotated 180 degrees to get 9 - 4 - 7 - 2.
Similarly, paths starting with 2, 4, 6, and 8 are all rotationally symmetric. As a
result, our answer can be expressed as 4 * num_paths(1) + 4 * num_paths(2) +
1 * num_paths(5).
Putting it all together, the solution should look something like this:
def num_paths(current, jumps, visited, n):
    if n == 1:
        return 1
    paths = 0
    for number in range(1, 10):
        if number not in visited:
            if (current, number) not in jumps or \
                    jumps[(current, number)] in visited:
                visited.add(number)
                paths += num_paths(number, jumps, visited, n - 1)
                visited.remove(number)
    return paths

def unlock_combinations(n):
    jumps = {(1, 3): 2, (1, 7): 4, (1, 9): 5,
             (2, 8): 5,
             (3, 1): 2, (3, 7): 5, (3, 9): 6,
             (4, 6): 5, (6, 4): 5,
             (7, 1): 4, (7, 3): 5, (7, 9): 8,
             (8, 2): 5,
             (9, 1): 5, (9, 3): 6, (9, 7): 8}
    return 4 * num_paths(1, jumps, set([1]), n) + \
           4 * num_paths(2, jumps, set([2]), n) + \
           1 * num_paths(5, jumps, set([5]), n)
Even though the jump restrictions have limited the options at each next step, the
time complexity for each of the three starting points is still O(n!).
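As a sanity check, here is the solution packaged as a standalone snippet; the values it produces line up with the well-known Android pattern counts (9 patterns of length 1, 56 of length 2, and 389,112 of lengths 4 through 9 combined):

```python
def num_paths(current, jumps, visited, n):
    if n == 1:
        return 1
    paths = 0
    for number in range(1, 10):
        if number not in visited:
            # A move is legal if it jumps over nothing, or over a used key.
            if (current, number) not in jumps or \
                    jumps[(current, number)] in visited:
                visited.add(number)
                paths += num_paths(number, jumps, visited, n - 1)
                visited.remove(number)
    return paths

def unlock_combinations(n):
    jumps = {(1, 3): 2, (1, 7): 4, (1, 9): 5, (2, 8): 5,
             (3, 1): 2, (3, 7): 5, (3, 9): 6, (4, 6): 5,
             (6, 4): 5, (7, 1): 4, (7, 3): 5, (7, 9): 8,
             (8, 2): 5, (9, 1): 5, (9, 3): 6, (9, 7): 8}
    return 4 * num_paths(1, jumps, {1}, n) + \
           4 * num_paths(2, jumps, {2}, n) + \
           1 * num_paths(5, jumps, {5}, n)

print(unlock_combinations(1))  # 9
print(unlock_combinations(2))  # 56
print(sum(unlock_combinations(n) for n in range(4, 10)))  # 389112
```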
Sorting and Searching
Given the fundamental importance of arrays, it is only natural that computer scientists
have developed many tools for finding and ordering their elements.
You may be familiar with classic sorting methods like quicksort and mergesort. Both
are comparison sorting algorithms that use a divide-and-conquer technique. In
quicksort, we choose a pivot element, move other elements to the left or right of
this element depending on whether they are smaller or larger, and then recursively
sort the elements on either side. It has a worst-case performance of O(n^2), but on
average takes O(n log n). For merge sort, we repeatedly divide our array into subarrays
and then join them so that larger and larger pieces end up sorted.
While it is worth implementing these from scratch on your own time, in an interview
setting it is more likely that you can rely on your preferred language's built-in sort
functionality for this purpose. Instead, this chapter will focus on how to recognize
the importance of sorting in a given problem, and when and how to apply more
specific sorting algorithms.
If an array is already sorted, you should leap to binary search as a potential solution.
Instead of having to traverse each element, binary search cuts the search time down
to O(log n), where n is the length of the array.
Binary search works by cutting down the search space in half, adjusting either the
left or right bound and recursively looking in the new subarray. For example, suppose
we have the eleven-element array [4, 5, 6, 15, 29, 65, 88, 99, 190, 250, 300],
and we are seeking the element 15. We can achieve this in only three steps:
• First, we compare 15 to the element at index 5. Since 15 < 65, we shift our
upper bound index down to 4. Our search space is now [4, 5, 6, 15, 29].
• Next, we compare 15 to the element at index 2. Since 15 > 6, we shift our
lower bound index up to 3. Our search space is now [15, 29].
• Finally, we compare 15 to the element at index 3, and find what we are looking
for.
We can code this iteratively or recursively; here is how an iterative version would
look:
def binary_search(array, x):
    low = 0
    high = len(array) - 1
    found = False
    while low <= high and not found:
        mid = (low + high) // 2
        if x == array[mid]:
            found = True
        elif x < array[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return found
Implementing binary search can frequently be tricky, due to subtle off-by-one index
errors. As a result, it is not a bad idea to become familiar with a built-in library that
can perform this task for you, such as Python's bisect module.
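For instance, the same membership test can be written with bisect_left, which returns the leftmost index at which x could be inserted while keeping the array sorted:

```python
import bisect

def binary_search(array, x):
    # x is present only if the insertion point already holds x.
    i = bisect.bisect_left(array, x)
    return i < len(array) and array[i] == x

print(binary_search([4, 5, 6, 15, 29, 65], 15))  # True
print(binary_search([4, 5, 6, 15, 29, 65], 7))   # False
```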
Let's now apply these ideas to a few problems.
15.1
Dutch flag problem
Given an array of strictly the characters R, G, and B, segregate the values of the
array so that all the Rs come first, the Gs come second, and the Bs come last. You
can only swap elements of the array.
Do this in linear time and in-place.
For example, given the array ['G', 'B', 'R', 'R', 'B', 'R', 'G'], you should
transform it to ['R', 'R', 'R', 'G', 'G', 'B', 'B'].
Solution
Let's first consider an easier problem, where there are only the two values R and G.
In this case, we can think of our array as being split into three sections. Regardless
of what happens, we will try to maintain the loop invariant that only Rs will be in
the first section, only Gs will be in the third section, and unidentified elements will
remain in the middle section.
If we let low and high be the indices marking the section boundaries, we can represent
these areas as follows:
• Strictly Rs: array[ :low]
• Unknown: array[low: high]
• Strictly Gs: array[high:]
Initially, low and high will be set to the first and last elements of the array, respectively,
since every element is unknown. As we iterate over the array, we will swap any Gs we
see to the third section and decrement high. Meanwhile, whenever we see an R, we
will increment low, since the boundary of red characters must shift forward. In this
way, we gradually shrink the unknown section down to nothing, finally terminating
our algorithm once the two indices meet.
def partition(arr):
    low, high = 0, len(arr) - 1
    while low <= high:
        if arr[low] == 'R':
            low += 1
        else:
            arr[low], arr[high] = arr[high], arr[low]
            high -= 1
This correctly splits our array into two separate categories. How can we extend this
to three partitions?
Using a similar idea to the method above, we can maintain four sections using three
indices: low, mid, and high. The contents of each section will be as follows:
• Strictly Rs: array [:low]
• Strictly Gs: array[low:mid]
• Unknown: array[mid: high]
• Strictly Bs: array[high:]
As before, we will initialize low and high to be the first and last elements of our array,
respectively. Additionally, mid will initially be equal to low. The key to this strategy
is that we will keep incrementing our midpoint index and moving the corresponding
element to the appropriate location.
More concretely:
• If the element is R, we swap array[mid] with array[low] and increment
low.
• If the element is G, we do nothing, since this element belongs in the middle.
• If the element is B, we swap array[mid] with array[high] and decrement
high.
When our mid and high indices meet, we know that the unknown section is gone
and we can terminate our algorithm.
def partition(arr):
    low, mid, high = 0, 0, len(arr) - 1
    while mid <= high:
        if arr[mid] == 'R':
            arr[low], arr[mid] = arr[mid], arr[low]
            low += 1
            mid += 1
        elif arr[mid] == 'G':
            mid += 1
        else:
            arr[mid], arr[high] = arr[high], arr[mid]
            high -= 1
P.S. This problem is also called the Dutch national flag problem, since that flag
consists of three horizontal stripes of red, white and blue.
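Running the three-way partition on the example from the problem statement:

```python
def partition(arr):
    low, mid, high = 0, 0, len(arr) - 1
    while mid <= high:
        if arr[mid] == 'R':
            arr[low], arr[mid] = arr[mid], arr[low]
            low += 1
            mid += 1
        elif arr[mid] == 'G':
            mid += 1
        else:
            arr[mid], arr[high] = arr[high], arr[mid]
            high -= 1

arr = ['G', 'B', 'R', 'R', 'B', 'R', 'G']
partition(arr)  # sorts in place with swaps only
print(arr)  # ['R', 'R', 'R', 'G', 'G', 'B', 'B']
```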
15.2
Pancake sort
Given a list, sort it using the helper method reverse(lst, i, j).

This method takes a sublist as indicated by the left and right bounds i and j and
reverses all its elements. For example, reverse([10, 20, 30, 40, 50], 1, 3)
would result in [10, 40, 30, 20, 50].
Solution
This type of sorting is also called pancake sorting, since the process of reversing
sublists is analogous to flipping a pancake. We can approach this problem using
a technique similar to selection sort. The idea is to iteratively place the maximum
remaining element at the end of the list.
To see how this works, let size be the size of the list that we're concerned with
sorting at the moment. We can iterate over the first size elements of the list to find
the position of the maximum element, say max_ind. In order to place this element at
the end, we perform two steps:
• Reverse the sublist from 0 to max_ind to move the max element to the front.
• Then, reverse the sublist from 0 to size to move the max element to the end.
Finally, we decrement size and repeat, until there is nothing left to sort.
def pancake_sort(lst):
    for size in reversed(range(len(lst))):
        max_ind = max_pos(lst[:size + 1])
        reverse(lst, 0, max_ind)
        reverse(lst, 0, size)
    return lst

def max_pos(lst):
    return lst.index(max(lst))

def reverse(lst, i, j):
    while i < j:
        lst[i], lst[j] = lst[j], lst[i]
        i += 1
        j -= 1

This algorithm takes O(n^2) time and O(1) space.
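Here is the full routine as a runnable sketch, with max_pos inlined for brevity:

```python
def pancake_sort(lst):
    for size in reversed(range(len(lst))):
        # Flip the largest remaining element to the front, then flip it
        # into its final position at index `size`.
        max_ind = lst.index(max(lst[:size + 1]))
        reverse(lst, 0, max_ind)
        reverse(lst, 0, size)
    return lst

def reverse(lst, i, j):
    while i < j:
        lst[i], lst[j] = lst[j], lst[i]
        i += 1
        j -= 1

print(pancake_sort([10, 50, 30, 20, 40]))  # [10, 20, 30, 40, 50]
```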
15.3
Efficiently sort a million integers
Given an array of a million integers between zero and a billion, out of order, how
would you sort it efficiently with respect to time and space?
Solution
Sorting by an algorithm like quicksort or merge sort would give us an average time
complexity of O(n log n). But we can take advantage of the fact that our input is
bounded and only consists of integers to do even better. One algorithm that performs
particularly well in these cases is called radix sort.
To see how this works, suppose we have a list of non-negative numbers, such as [4,
100, 54, 537, 2, 89], and we know ahead of time that no number has more than
three digits. Then we can (stably) sort our list using three passes, corresponding to
each digit:
• First, we order by the ones' place, giving us [ 100, 2, 4, 54, 537, 89].
• Next, we order by the tens' place, giving us [ 100, 2, 4, 537, 54, 89].
• Finally, we order by the hundreds' place, giving us [2, 4, 54, 89, 100,
537].
Note that if a given number doesn't have a tens' or hundreds' place, we assign that
place value zero.
Each of these sorts is performed using counting sort, which gets around the efficiency
limits of comparison sorts like quicksort. In counting sort, we assign each number to
a bucket, and store each bucket as an element in an array of size equal to our base, in
this case 10. Then, we read off the elements in each bucket, in order, to get the new
sorted array.
We can implement this as follows:
def counting_sort(array, digit, base=10):
    counts = [[] for _ in range(base)]
    for num in array:
        d = (num // base ** digit) % base
        counts[d].append(num)
    result = []
    for bucket in counts:
        result.extend(bucket)
    return result

def radix_sort(array, digits=10):
    # Numbers up to a billion have at most 10 digits.
    for digit in range(digits):
        array = counting_sort(array, digit)
    return array
Counting sort takes O(n + m) time, where n is the length of our input and m is the
number of buckets. Since m << n, we can consider this to be O(n). We must
perform this sort once for each digit in the largest integer. If the largest integer has k
digits, the time complexity of this algorithm will be O(n * k). The space complexity
will be O(n), since each counting sort iteration overwrites an array of length n.
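Running radix sort on the example list from earlier yields the fully sorted result:

```python
def counting_sort(array, digit, base=10):
    # Bucket each number by the given digit, then read the buckets back
    # in order; appending preserves input order, so the sort is stable.
    counts = [[] for _ in range(base)]
    for num in array:
        counts[(num // base ** digit) % base].append(num)
    return [num for bucket in counts for num in bucket]

def radix_sort(array, digits=10):
    for digit in range(digits):
        array = counting_sort(array, digit)
    return array

print(radix_sort([4, 100, 54, 537, 2, 89]))  # [2, 4, 54, 89, 100, 537]
```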
15.4
Find an element in a rotated sorted array
A sorted array of integers has been rotated an unknown number of times.
Given this array, find the index of an element in the array in faster than linear time.
If the element doesn't exist in the array, return null.
For example, given the array [ 13, 18, 25, 2, 8, 10] and the element 8, return 4
(the index of 8 in the array).
You can assume all the integers in the array are unique.
Solution
We can obviously do this problem in linear time if we iterate over the array and
examine each element. How can we make this more efficient?
Whenever there is extra information given in an interview question, it pays to think
about how it can be used. In this case, a big clue should be that the array of integers
was previously sorted and then rotated. As we mentioned in the introduction, binary
search is an excellent technique for finding elements in a sorted array. However, we
must make some tweaks to apply this to our rotated array.
In our solution, we first find the rotation point using binary search. Initially, our low
and high indices will be the start and end of our array. At each step we compare the
midpoint of our array to the first element, and make the following updates:
• If the midpoint is larger, the pivot must come after it, so we set low to be the
midpoint.
• If the midpoint is smaller, the pivot must come before it, so we set high to be
the midpoint.
With each iteration, we cut the search space in half in order to find the index at
which the original list was rotated.
Once we have this rotation point, we can do binary search as usual by remembering
to offset by the correct amount.
The code would look like this:
def shifted_array_search(lst, num):
    # First, find where the breaking point is in the shifted array.
    i = len(lst) // 2
    dist = i // 2
    while True:
        if lst[0] > lst[i] and lst[i - 1] > lst[i]:
            break
        elif dist == 0:
            break
        elif lst[0] <= lst[i]:
            i = i + dist
        elif lst[i - 1] <= lst[i]:
            i = i - dist
        else:
            break
        dist = dist // 2

    # Now that we have the bottom, we can do binary search as usual,
    # wrapping around the rotation.
    low = i
    high = i - 1
    dist = len(lst) // 2
    while True:
        if dist == 0:
            return None
        guess_ind = (low + dist) % len(lst)
        guess = lst[guess_ind]
        if guess == num:
            return guess_ind
        if guess < num:
            low = (low + dist) % len(lst)
        if guess > num:
            high = (len(lst) + high - dist) % len(lst)
        dist = dist // 2
This solution runs in O(log n) time. However, this is definitely not the only solution!
There are many other possible ways to implement this, but as long as you have the
idea of doing binary search, you've got it.
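As one such alternative, here is a sketch (assuming distinct elements, as the problem allows) that first finds the rotation point with a binary search and then delegates to bisect_left on whichever sorted half could contain the target:

```python
import bisect

def rotated_search(lst, num):
    # Find the rotation point: the first index whose value is smaller
    # than lst[0]. If the list is not rotated, this wraps around to 0.
    low, high = 0, len(lst)
    while low < high:
        mid = (low + high) // 2
        if lst[mid] < lst[0]:
            high = mid
        else:
            low = mid + 1
    pivot = low % len(lst)  # index of the minimum element

    # Binary search the sorted half that could contain num.
    if pivot > 0 and num >= lst[0]:
        lo, hi = 0, pivot
    else:
        lo, hi = pivot, len(lst)
    i = bisect.bisect_left(lst, num, lo, hi)
    return i if i < hi and lst[i] == num else None

print(rotated_search([13, 18, 25, 2, 8, 10], 8))    # 4
print(rotated_search([13, 18, 25, 2, 8, 10], 100))  # None
```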
Pathfinding
Since graphs can be used to represent almost anything, computer scientists have
spent a lot of time and energy trying to find efficient algorithms for manipulating
them.
One particularly rich area of application is that of pathfinding. The goal of these
algorithms is to find the shortest, least expensive, or otherwise best path through
a graph with weighted edges. Whether it is powering GPS systems, modeling the
spread of diseases, or calculating the best route through a maze containing hidden
treasure, pathfinding algorithms frequently come in handy.
In particular, in this chapter we will focus on three important algorithms. First, we
motivate and explain Dijkstra's algorithm, used to find the shortest path from one
node to all other nodes. Next, we take a look at Bellman-Ford, which is similar
to Dijkstra's algorithm except it can also handle negative edge weights. Finally,
we explore the Floyd-Warshall algorithm, which efficiently finds the shortest path
between every pair of nodes in a graph.
16.1
Dijkstra's algorithm
A network consists of nodes labeled 0 to n. You are given a list of edges (a, b, t),
describing the time t in seconds it takes for a message to be sent from node a to
node b. Whenever a node receives a message, it immediately passes the message on
to a neighboring node, if possible.
Assuming all nodes are connected, determine how long it will take for every node to
receive a message that begins at node 0.
For example, given n = 5 and the following graph:

You should return 9, because propagating the message from 0 -> 2 -> 3 -> 4 will
take nine seconds.
For convenience, here is the list of weighted edges (u, v, weight) for this graph.

(0, 1, 5),
(0, 2, 3),
(0, 5, 4),
(1, 3, 8),
(2, 3, 1),
(3, 5, 10),
(3, 4, 5)
Solution
To help organize our input, we can think of the network nodes as vertices on a graph,
and each connection as an edge. We will use a helper class that maps each node to a
list of tuples of the form (neighbor, time).
class Network:
    def __init__(self, N, edges):
        self.vertices = range(N + 1)
        self.edges = edges

    def make_graph(self):
        graph = {v: [] for v in self.vertices}
        for u, v, w in self.edges:
            graph[u].append((v, w))
        return graph
Finding the shortest amount of time it will take for each node to receive the message,
then, is equivalent to finding the shortest path between vertex 0 and all other vertices.
For this we can use Dijkstra's algorithm.
Put briefly, Dijkstra's algorithm is used to compute the shortest path between two
vertices of a graph, under the assumption that all edges have nonnegative weight. It
works by repeatedly traveling to the closest vertex which has not yet been reached.
We will first create a dictionary, times, mapping each node to the minimum amount
of time it takes for a message to propagate to it. Initially this will be infinite for all
nodes except the start node, which is zero.
Then, we consider an unvisited node with the smallest propagation time. For each
of its neighbors, we replace the propagation time for that neighbor with the time
it would take to go through the current node to get to that neighbor, if the latter is
smaller. We continue this process until we have visited all nodes.
In the end, the largest value in our dictionary will represent the time it will take for
the last node to get the message.
def propagate(network):
    graph = network.make_graph()
    times = {node: float('inf') for node in graph}
    times[0] = 0
    q = list(graph)
    while q:
        u = min(q, key=lambda x: times[x])
        q.remove(u)
        for v, time in graph[u]:
            times[v] = min(times[v], times[u] + time)
    return max(times.values())
Since we must find the minimum value of our dictionary for each unexamined node,
and there are n nodes, this will take O(n^2) time.
For sparse graphs, we can improve on this by using a priority queue, ordering each
node by propagation time. To start, this queue will just hold node zero, with value
zero.
Starting from 0, then, each time we encounter a new neighbor, we add it to the
queue, with value equal to the sum of the time from node zero to the current node,
and from the current node to the neighbor. Whenever we pop a node off the queue
that does not exist in our times dictionary, we add a new key with the corresponding
value.
import heapq

def propagate(network):
    graph = network.make_graph()
    times = {}
    q = [(0, 0)]
    while q:
        u, node = heapq.heappop(q)
        if node not in times:
            times[node] = u
            for neighbor, v in graph[node]:
                if neighbor not in times:
                    heapq.heappush(q, (u + v, neighbor))
    return max(times.values())
It takes O(log V) time to pop or push an element from the heap, and we must do
this for each vertex and, in the worst case, once per edge. As a result, the complexity
of this algorithm is O((E + V) log V).
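To check the heap-based version on the example graph, here is a condensed sketch that takes the adjacency mapping directly rather than going through the Network helper:

```python
import heapq

def propagate(graph):
    # graph maps each node to a list of (neighbor, time) pairs.
    times = {}
    q = [(0, 0)]  # (time so far, node), starting from node 0
    while q:
        u, node = heapq.heappop(q)
        if node not in times:
            times[node] = u
            for neighbor, v in graph[node]:
                if neighbor not in times:
                    heapq.heappush(q, (u + v, neighbor))
    return max(times.values())

graph = {0: [(1, 5), (2, 3), (5, 4)], 1: [(3, 8)], 2: [(3, 1)],
         3: [(5, 10), (4, 5)], 4: [], 5: []}
print(propagate(graph))  # 9, via the path 0 -> 2 -> 3 -> 4
```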
16.2
Bellman-Ford
Given a table of currency exchange rates, represented as a 2-D array, determine
whether there is a possible arbitrage opportunity. That is, find out if there is some
sequence of trades you can make, starting with some amount X of any currency, so
that you can end up with some amount greater than X of that currency.
Here is one possible sample input, describing the exchange rates between the US
dollar, the pound sterling, the Indian rupee, and the euro.
graph = {
    'USD': {'GBP': 0.77, 'INR': 71.71, 'EUR': 0.87},
    'GBP': {'USD': 1.30, 'INR': 93.55, 'EUR': 1.14},
    'INR': {'USD': 0.014, 'GBP': 0.011, 'EUR': 0.012},
    'EUR': {'USD': 1.14, 'GBP': 0.88, 'INR': 81.95}
}
Assume that there are no transaction costs and you can trade fractional quantities.
Solution
For this question, we can model the currencies and exchange rates as a graph, where
nodes are currencies and edges are the exchange rates between each currency. We can
assume that our table contains rates for every possible pair of currencies, so that the
graph is complete. To solve this problem, then, we must determine if it is possible to
find a cycle whose edge weight product is greater than 1.
Unfortunately, all the algorithms we described in the introduction deal with computing the sum of edge weights, not the product! However, with a little math we can fix
this issue.
Recall that one of the properties of logarithms is that log(a * b) = log(a) + log(b).
Therefore, if we take the logarithm of each exchange rate and negate it, a cycle
whose rate product exceeds 1 becomes a cycle whose edge weights sum to a negative
number. This transforms our problem into one of finding a negative-sum cycle, and
luckily for us, we can use the Bellman-Ford algorithm for just this purpose.
As a refresher, the Bellman-Ford algorithm is commonly used to find the shortest
path between a source vertex and each of the other vertices. If the graph contains
a negative cycle, however, it can detect it and throw an exception (or, in our case,
return true).
The Bellman-Ford algorithm works as follows. We begin by fixing a source node,
and setting the distance to all other nodes to infinity. Then, for each edge (u, v) in
our graph, we check if it is more efficient to get to v along the edge from u than the
current best option. If so, we update the distance value for v.
Note that for any graph with V vertices, the longest path can have at most |V| - 1
edges. As a result, we can repeat the above operation |V| - 1 times and eventually
arrive at the optimal way to reach each vertex.

If after |V| - 1 iterations we can still find a smaller path, there must be a negative
cycle in the graph.
from math import log

def arbitrage(table):
    transformed_graph = [[-log(edge) for edge in row] for row in table]

    # Pick any source vertex - we can run Bellman-Ford from any vertex and
    # get the right result.
    source = 0
    n = len(transformed_graph)
    min_dist = [float('inf')] * n
    min_dist[source] = 0

    # Relax edges |V| - 1 times.
    for i in range(n - 1):
        for v in range(n):
            for w in range(n):
                if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                    min_dist[w] = min_dist[v] + transformed_graph[v][w]

    # If we can still relax edges, then we have a negative cycle.
    for v in range(n):
        for w in range(n):
            if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                return True

    return False
Because of the triply-nested for loop, this runs in O(N^3) time.
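As a quick sanity check, we can run the routine on the table from the problem statement. The code expects a matrix rather than a dictionary, so below we assume the currency order USD, GBP, INR, EUR and a rate of 1.0 on the diagonal (both our assumptions); the routine itself is restated so the snippet runs on its own:

```python
from math import log

def arbitrage(table):
    # Restated from above.
    transformed_graph = [[-log(edge) for edge in row] for row in table]
    source = 0
    n = len(transformed_graph)
    min_dist = [float('inf')] * n
    min_dist[source] = 0

    for i in range(n - 1):
        for v in range(n):
            for w in range(n):
                if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                    min_dist[w] = min_dist[v] + transformed_graph[v][w]

    for v in range(n):
        for w in range(n):
            if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                return True
    return False

# Rates from the problem, in the (assumed) order USD, GBP, INR, EUR,
# with a rate of 1.0 on the diagonal for keeping a currency as-is.
rates = [
    [1.0,   0.77,  71.71, 0.87],
    [1.30,  1.0,   93.55, 1.14],
    [0.014, 0.011, 1.0,   0.012],
    [1.14,  0.88,  81.95, 1.0],
]

# USD -> GBP -> USD yields 0.77 * 1.30 = 1.001 > 1, so arbitrage exists.
print(arbitrage(rates))  # True

# A table with no profitable cycle should return False.
print(arbitrage([[1.0, 1.0], [1.0, 1.0]]))  # False
```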
16.3 Floyd-Warshall
The transitive closure of a graph is a measure of which vertices are reachable from
other vertices. It can be represented as a matrix M, where M[i][j] == 1 if there is
a path between vertices i and j, and otherwise 0.
For example, suppose we are given the following graph:
The transitive closure of this graph would be:
[1, 1, 1, 1]
[0, 1, 1, 0]
[0, 0, 1, 0]
[0, 0, 0, 1]
Given a graph, find its transitive closure.
Solution
One algorithm we can use to solve this is a modified version of Floyd-Warshall.
Traditionally, Floyd-Warshall is used for finding the shortest path between all vertices
in a weighted graph. It works in the following way: for any pair of nodes ( i, j), we
check to see if there is an intermediate vertex k such that the cost of getting from
i to k to j is less than the current cost of getting from i to j. This is generalized
by examining each possible choice of k, and updating every (i, j) cost that can be
improved.
In our case, we are concerned not with costs but simply with whether it is possible to
get from i to j. So we can start with a boolean matrix reachable filled with zeros,
except for the connections given in our adjacency matrix.
Then, for each intermediate node k, and for each connection (i, j), if reachable [ i] [ j]
is zero but there is a path from i to k and from k to j, we should change it to one.
def closure(graph):
    n = len(graph)
    reachable = [[0 for _ in range(n)] for _ in range(n)]

    # Every vertex can reach itself, matching the sample matrix above.
    for i in range(n):
        reachable[i][i] = 1

    for i, v in enumerate(graph):
        for neighbor in v:
            reachable[i][neighbor] = 1

    for k in range(n):
        for i in range(n):
            for j in range(n):
                reachable[i][j] |= (reachable[i][k] and reachable[k][j])

    return reachable
Since we are looping through three levels of vertices, this will take O(V^3) time. Our
matrix uses O(V^2) space.
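The original figure for the example graph is not reproduced in this text, but one graph consistent with the closure matrix shown above is 0 → 1 → 2 with an extra edge 0 → 3 (our reconstruction). A quick check, with the routine restated so it runs standalone (we also mark each vertex as reachable from itself, since the sample closure has ones on the diagonal):

```python
def closure(graph):
    # Restated from above; the diagonal is initialized to 1 since every
    # vertex trivially reaches itself.
    n = len(graph)
    reachable = [[0] * n for _ in range(n)]
    for i in range(n):
        reachable[i][i] = 1
    for i, v in enumerate(graph):
        for neighbor in v:
            reachable[i][neighbor] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reachable[i][j] |= (reachable[i][k] and reachable[k][j])
    return reachable

# Adjacency list: 0 -> 1, 0 -> 3, 1 -> 2.
graph = [[1, 3], [2], [], []]
for row in closure(graph):
    print(row)
# [1, 1, 1, 1]
# [0, 1, 1, 0]
# [0, 0, 1, 0]
# [0, 0, 0, 1]
```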
Bit Manipulation
Questions on bit manipulation are the curveballs of coding interviews:
they're less common, and can frequently trip up candidates who are unprepared. But
as we will explore below, as long as you understand a few key concepts you'll find
these problems are actually very approachable.
First, let's discuss what a bit is. A bit, short for binary digit, is either 0 or 1. String a
bunch of these bits together, and you can represent any integer. Each place value,
starting from the right column and extending left, represents a power of two, so that
00000101 stands for 5 (2^0 + 2^2). In this example we've represented 5 as an 8-bit
number, and most of the bits are zero, or "off".
To negate an n-bit number, we use an operation called two's complement, which is
to say we invert all the bits and then add one. As a result, 5 would become 11111011.
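Since Python integers are arbitrary precision, the 8-bit pattern only appears once we mask with 0xFF; a minimal sketch of the negation above:

```python
# Two's complement negation of 5 in 8 bits: invert the bits, then add one.
x = 0b00000101             # 5
negated = (~x + 1) & 0xFF  # mask to 8 bits, since Python ints are unbounded

print(format(negated, '08b'))  # 11111011
```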
Bits are useful because they provide an extremely fast and space-efficient way of
calculating numerical operations. In particular, you should be familiar with the
following three operators:
• & (AND)
The bitwise AND takes two integers as input and produces a third integer whose
bits are 1 if and only if both corresponding input bits are 1.
For example, 00000101 & 00011110 = 00000100.
• | (OR)
The bitwise OR takes two integers as input and produces a third integer whose bits
are 1 if either corresponding input bit is 1.
For example, 00000101 | 00011110 = 00011111.
• ^ (XOR)
The bitwise XOR takes two integers as input and produces a third integer whose
bits are 1 if the corresponding input bits are different. That is, for each place one of
the input integers must be 0 and the other must be 1.
For example, 00000101 ^ 00011110 = 00011011.
Bits also provide a quick way of multiplying or dividing a number by powers of two.
This method is called bitshifting, and is represented by the operators << and >>. In
effect, << inserts zeroes at the right end of the number, so that each bit is shifted to
the left. Conversely, >> can be thought of as inserting zeroes at the left end, pushing
bits rightward.
Here are some bitshifts in action:
5 << 2 = 20 (00000101 << 2 = 00010100)
5 >> 2 = 1 (00000101 >> 2 = 00000001)
Note that in the last example, when we "push" the last two bits to the right, they
essentially disappear.
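These operator examples can be verified directly in Python, which accepts binary literals with the 0b prefix:

```python
# The operator examples above, checked in Python.
assert 0b00000101 & 0b00011110 == 0b00000100   # AND
assert 0b00000101 | 0b00011110 == 0b00011111   # OR
assert 0b00000101 ^ 0b00011110 == 0b00011011   # XOR
assert 5 << 2 == 20   # 00000101 << 2 == 00010100
assert 5 >> 2 == 1    # 00000101 >> 2 == 00000001
print("all identities hold")
```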
Some common questions that you can expect in this topic are clearing and setting
bits, reversing bits, and using a bit representation to more efficiently solve problems
that can be translated into binary.
Let's dive in.
17.1 Find element that appears once in list
Given an array of integers where every integer occurs three times except for one
integer, which only occurs once, find and return the non-duplicated integer.
For example, given [6, 1, 3, 3, 3, 6, 6], return 1. Given [13, 19, 13, 13],
return 19.
Do this in O(N) time and O(1) space.
Solution
We can find the unique number in an array of two duplicates by XORing all the
numbers in the array. What this does is cancel out all the bits that have an even
number of 1s, leaving only the unique (odd) bits out.
Let's try to extend this technique to three duplicates. Instead of cancelling out all
the bits with an even number of ones, we want to cancel those with a multiple of
three.
Let's assume all integers fit in 32 bits. First, we will create an array 32 zeroes long.
When iterating over each number in our array, we can match up each bit to its proper
spot in this array and increment a counter if that bit is set. Finally, we'll go over each
bit in the array, and if the value at that index is not a multiple of three, we know to
include that bit in our result.
def find_unique(arr):
    result_arr = [0] * 32

    for num in arr:
        for i in range(32):
            bit = num >> i & 1
            result_arr[i] += bit

    result = 0
    for i, bit in enumerate(result_arr):
        if bit % 3 != 0:
            result += 2 ** i

    return result
This runs in linear time, since we iterate over the array once, and in constant space,
since we initialize an array of constant size.
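Restating the function so the snippet runs on its own, we can confirm the examples from the problem statement:

```python
def find_unique(arr):
    # Restated from above: count set bits per position, keep the
    # positions whose counts are not multiples of three.
    result_arr = [0] * 32
    for num in arr:
        for i in range(32):
            result_arr[i] += num >> i & 1

    result = 0
    for i, bit in enumerate(result_arr):
        if bit % 3 != 0:
            result += 2 ** i
    return result

print(find_unique([6, 1, 3, 3, 3, 6, 6]))  # 1
print(find_unique([13, 19, 13, 13]))       # 19
```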
17.2 Implement division without / or * operators
Implement division of two positive integers without using the division, multiplication,
or modulus operators. Return the quotient as an integer, ignoring the remainder.
Solution
We can start by trying the simplest solution. Define x as the dividend and y as the
divisor. To get the quotient, we need to ask how many times we can subtract y from
x until the remainder is less than y. The number of times we subtract is the resulting
quotient, t. The time complexity of this brute force approach is on the order of t,
which can be very high: for example, if x is 2^31 - 1 and y is 1.
Let's instead think about how to perform division on paper. Recall grade-school
long division, where we consider the left-most digit that can be divided by the divisor.
At each step, the quotient becomes the first digit of the result, and we subtract the
product from the dividend to get the remainder. The remainder is initially the value
x. We can abstract this process into subtracting the largest multiple of y × 10^d from
the remainder, where d is the place of the digit (d = 0 for the ones place). Then we
add the multiple times 10^d to our result.
This process would be straightforward if we had the modulus or multiplication
operators. However, we can instead take advantage of the bit shift operators in order
to multiply by powers of two, since a << z results in a multiplied by 2^z (e.g.
3 << 2 = 12). Now, we can find the largest y × 2^d that fits within the remainder.
As we do in long division, we decrease the possible value of d in each iteration. We
start by finding the largest value of y × 2^d ≤ x, then test y × 2^d, y × 2^(d-1), ...,
until the remainder is less than y.
For example, let's say we want to divide x = 31 by y = 3. Here are the steps we'd
follow:
Step               | x (binary)               | Quotient (decimal)
Start              | 11111                    | 0
y × 2^3 fits in x  | 11111 - (11 << 3) = 111  | 1 << 3 = 8
y × 2^1 fits in x  | 111 - (11 << 1) = 1      | (1 << 3) + (1 << 1) = 10
Here is the Python implementation:
def divide(x, y):
    if y == 0:
        raise ZeroDivisionError('Division by zero')

    quotient = 0
    power = 32            # Assume 32-bit integers
    y_power = y << power  # Initial y * 2^power value is y * 2^32
    remainder = x         # Initial remainder is x

    while remainder >= y:
        while y_power > remainder:
            y_power >>= 1
            power -= 1
        quotient += 1 << power
        remainder -= y_power

    return quotient
The time complexity of this solution is O(n), where n is the number of bits used to
represent x / y, assuming shift and add operations take O(1) time.
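We can confirm the worked example (and a couple of other cases) by restating the function and calling it:

```python
def divide(x, y):
    # Restated from above.
    if y == 0:
        raise ZeroDivisionError('Division by zero')
    quotient = 0
    power = 32
    y_power = y << power
    remainder = x
    while remainder >= y:
        while y_power > remainder:
            y_power >>= 1
            power -= 1
        quotient += 1 << power
        remainder -= y_power
    return quotient

print(divide(31, 3))   # 10, matching the worked example
print(divide(100, 7))  # 14
print(divide(5, 10))   # 0
```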
17.3 Compute longest consecutive string of ones in binary
Given an integer n, return the length of the longest consecutive run of ones in its
binary representation.
For example, given 156, which is 10011100 in binary, you should return 3.
Solution
The most straightforward way to solve this would be to loop over the bits of the
number, keeping track of a counter of the maximum number of consecutive ones
seen. Whenever we see a longer run of set bits, we update our counter.
def find_length(n):
    n = bin(n)[2:]
    max_length = current_length = 0
    for digit in n:
        if digit == '1':
            current_length += 1
            max_length = max(max_length, current_length)
        else:
            current_length = 0
    return max_length
This is O(n), where n is the number of digits in our input. Can we do better?
Let's try using bit manipulation. In particular, note that if we perform the
operation x & (x << 1), the longest consecutive run of ones must decrease by one.
This is because all but one of the set bits in the original number will correspond to
set bits in the shifted number. Using the example in the problem, we can see that
the maximum length changes from 3 to 2:
  10011100
& 00111000
= 00011000
With this in mind, we can continue to AND our input with a shifted version of itself
until we reach 0. The number of times we perform this operation will be our answer.
def find_length(n):
    max_length = 0
    while n:
        max_length += 1
        n = n & (n << 1)
    return max_length
While the worst case here is the same as above, the number of operations we must
perform is now limited to the length of the longest consecutive run.
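Restating the bitwise version, we can confirm the example from the problem statement:

```python
def find_length(n):
    # Restated from above: repeatedly AND with a shifted copy.
    max_length = 0
    while n:
        max_length += 1
        n = n & (n << 1)
    return max_length

print(find_length(156))     # 3  (156 is 10011100)
print(find_length(0b1111))  # 4
print(find_length(0))       # 0
```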
17.4 Find nth sevenish number
Let's define a "sevenish" number to be one which is either a power of 7, or the sum of
unique powers of 7. The first few sevenish numbers are 1, 7, 8, 49, and so on. Create
an algorithm to find the nth sevenish number.
Solution
A brute force solution to this problem would involve looking at consecutive integers
one at a time and computing whether they are sevenish. Once we've found n of
these, we return the last one found. To make this a little more efficient, we can use a
helper function to precompute a set of sevenish numbers, by finding the totals of all
subsets of the first n powers of 7. This way, checking whether an integer is sevenish
is O(1).
def get_sevenish_numbers(n):
    powers = [7 ** i for i in range(n)]

    totals = {0}
    for p in powers:
        # Use set union to accumulate sums of powers.
        totals |= {x + p for x in totals}

    return totals
def nth_sevenish_number(n):
    sevenish_numbers = get_sevenish_numbers(n)

    i = 1
    count, last_sevenish_number = 0, 0
    while count < n:
        if i in sevenish_numbers:
            count += 1
            last_sevenish_number = i
        i += 1

    return last_sevenish_number
Still, generating all the subsets of the first n powers of 7 is O(2^n), and we must use
an equivalent amount of space to store these totals.
Often when a problem involves taking powers of numbers, there is a bitwise solution,
and this is no exception. Note that when we convert a number to binary, we represent
it using the form x_k × 2^k + x_(k-1) × 2^(k-1) + ... + x_0 × 2^0. To find unique sums of
powers of 7, then, we can imagine that each bit represents a power of 7 instead of 2!
Let's look at the first few sevenish numbers to see how this works:
• 1 (1 * 7^0 : 1)
• 10 (1 * 7^1 : 7)
• 11 (1 * 7^1 + 1 * 7^0 : 8)
• 100 (1 * 7^2 : 49)
So the nth sevenish number will be the nth binary number, translated into powers of
seven instead of two. This points the way to our solution: we will go through each
bit of n, from least to most significant, and check if it is set. If so, we add 7^bit_place to
our total. Once we bitshift through the entire number, we can return the total.
def nth_sevenish_number(n):
    answer = 0
    bit_place = 0

    while n:
        if n & 1:
            answer += 7 ** bit_place
        n >>= 1
        bit_place += 1

    return answer
This algorithm is linear in the number of digits in our input and requires only constant
space.
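Restating the final function, the first few outputs line up with the sevenish numbers listed in the problem:

```python
def nth_sevenish_number(n):
    # Restated from above: read n's bits as powers of 7 instead of 2.
    answer = 0
    bit_place = 0
    while n:
        if n & 1:
            answer += 7 ** bit_place
        n >>= 1
        bit_place += 1
    return answer

# The first few sevenish numbers: 1, 7, 8, 49, 50, ...
print([nth_sevenish_number(n) for n in range(1, 6)])  # [1, 7, 8, 49, 50]
```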
Randomized Algorithms
A randomized algorithm is one in which there is some element of randomness at
play. As a result, the output from each run of a program may not be the same, and
we may have to rely on probabilistic guarantees for the result or run time.
In general there are two types of random algorithms, both named aptly after gambling
meccas: Las Vegas and Monte Carlo. In a Las Vegas algorithm, you can be sure that
the result will be correct, but the run time is potentially infinite (though finite in
expectation). A simple example would be rolling a die until you see the number 6,
and counting the number of rolls.
On the other hand, a Monte Carlo algorithm is one in which the run time is finite,
but the accuracy can only be stated in terms of probability. For example, if we flip a
fair coin 5 times, we can say there is a 1 - (1/2)^5 = 0.96875 probability we will see at
least one head.
When dealing with probabilities, simulating some effect, or selecting an item among
many according to a particular distribution, random algorithms are a good choice.
Randomness can also show up as part of a larger algorithm, such as quicksort, where
the pivot is often randomly selected in order to improve performance.
All common languages offer support for randomness, typically in the form of pseudorandom number generators (PRNGs). In Python, you should be familiar with the
random library, which offers the following methods:
import random

# Generates a float between 0 and 1
random.random()  # e.g. 0.4288890546751146

# Chooses an integer in the range a and b, inclusive
random.randint(a=3, b=5)  # e.g. 3

# Chooses an element in a sequence
random.choice(range(10))  # e.g. 6

# Permutes the ordering of a sequence
x = [1, 2, 3, 4, 5]
random.shuffle(x)  # x is now e.g. [4, 5, 3, 1, 2]

# Resets the PRNG so that the results of subsequent runs are identical
random.seed(?)
Note that when running this code multiple times, the results will be different, unless
a particular seed is set before each operation.
While it is impossible to cover all applications, we will explore a few common
instances where randomness crops up in interview questions and apply several techniques that are generally useful.
18.1 Pick random element from infinite stream
Given a stream of elements too large to store in memory, pick a random element
from the stream with uniform probability.
Solution
Naively, we could process the stream and store all the elements we encounter in a list.
If the size of this list is n, we could then randomly generate a number between 0 and
n - 1 and choose the element at that index. The problem with this approach is that it
would take O(n) space, and if the stream is very large this would not fit in memory.
Instead, there is a clever method of solving this using a technique called reservoir
sampling. The idea is simple: when examining the ith element in our stream, choose
that element with a probability of 1/(i + 1). To make the calculations easier, we assume
throughout this explanation that our indices start at zero.
We can prove this algorithm works using induction. For the base case, we can see
that choosing the first element with a probability of 1/(0 + 1) = 1 is correct, since there
are no other options.
Now suppose that we are facing the ith element, where i > 1, and the previous
elements have been correctly dealt with. In other words, any element k in [0, i - 1]
had a 1/i chance of being chosen as the random element.
After the current iteration, we would like each element to have a 1/(i + 1) probability of
being selected. Note that the chance of having been chosen previously is 1/i, and the
chance of not being swapped for the current element is 1 - 1/(i + 1) = i/(i + 1).
Multiplying these together, we find that our inductive step indeed holds true:

(1/i) × (i/(i + 1)) = 1/(i + 1)
This is how the code might look:
import random

def pick(big_stream):
    random_element = None

    for i, e in enumerate(big_stream):
        if random.randint(1, i + 1) == 1:
            random_element = e

    return random_element
Since we are only storing a single variable, this only takes up constant space!
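A single call cannot demonstrate uniformity, but a frequency count over many runs should come out roughly even. A rough check (the three-element stream and the loose bounds are our choices):

```python
import random
from collections import Counter

def pick(big_stream):
    # Restated reservoir-sampling routine from above.
    random_element = None
    for i, e in enumerate(big_stream):
        if random.randint(1, i + 1) == 1:
            random_element = e
    return random_element

counts = Counter(pick(['a', 'b', 'c']) for _ in range(30000))
print(counts)  # each element should appear roughly 10000 times

for element in 'abc':
    assert 9000 < counts[element] < 11000
```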
18.2 Shuffle deck of cards
Given a function that generates perfectly random integers between 1 and k (inclusive),
where k is an integer, write a function that shuffles a deck of cards represented as an
array using only swaps.
Hint: Make sure each one of the 52! permutations of the deck is equally likely.
Solution
The most common mistake people make when implementing this shuffle is to use
the following procedure:
• Iterate through the array with an index i
• Generate a random index j between 0 and n - 1
• Swap A[ i] and A[ j]
That code would look something like this:
from random import randint

def shuffle(arr):
    n = len(arr)
    for i in range(n):
        j = randint(0, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr
This looks like it would reasonably shuffle the array. However, the issue with this code
is that it slightly biases certain outcomes. Consider the array [a, b, c]. At each
step i, we have three different possible outcomes, since we can switch the element at
i with any other index in the array. Since we swap up to three times, there are
3^3 = 27 possible (and equally likely) outcomes. But there are only 6 outcomes, and they
all need to be equally likely:
• [a, b, c]
• [a, c, b]
• [b, a, c]
• [b, c, a]
• [c, a, b]
• [c, b, a]
6 doesn't divide into 27 evenly, so it must be the case that some outcomes are overrepresented. Indeed, if we run this algorithm a million times, we see some skew:
(2, 1, 3): 184530
(1, 3, 2): 185055
(3, 2, 1): 148641
(2, 3, 1): 185644
(3, 1, 2): 147995
(1, 2, 3): 148135
Recall that we want every permutation to be equally likely: in other words, any
element should have a 1/n probability of ending up in any spot. Instead, we can do
the following:
• Iterate through the array with an index i
• Generate a random index j between i and n - 1
• Swap A[ i] and A[j]
Why does this generate a uniform distribution? Let's use a loop invariant to prove
this.
Our loop invariant will be the following: at each index i of our loop, all indices
before i have an equally random probability of holding any element from our array.
Consider i = 1. Since we are swapping A[0] with an index that spans the entire
array, A[0] has a uniform probability of being any element in the array. So
our invariant is true for the base case.
Now assume our loop invariant is true until i and consider the loop at i + 1. Then
we should calculate the probability of some element ending up at index i + 1. That's
equal to the probability of not picking that element up until i and then choosing it.
All the remaining prospective elements must not have been picked yet, which means
each avoided being picked from 0 to i. That's a probability of:

(n - 1)/n × (n - 2)/(n - 1) × ... × (n - i - 1)/(n - i)

Finally, we need to actually choose it. Since there are n - i - 1 remaining elements
to choose from, that's a probability of 1/(n - i - 1).
Putting them together, we have a probability of:

(n - 1)/n × (n - 2)/(n - 1) × ... × (n - i - 1)/(n - i) × 1/(n - i - 1)

Notice that everything beautifully cancels out and we are left with a probability of 1/n.
Here's what the code looks like:
def shuffle(arr):
    n = len(arr)
    for i in range(n - 1):
        j = randint(i, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr
P.S. This algorithm is called the Fisher-Yates shuffle.
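We can repeat the million-run experiment on a smaller scale to see that the skew disappears (restating the corrected shuffle; the trial count and bounds are our choices):

```python
from collections import Counter
from random import randint

def shuffle(arr):
    # Restated Fisher-Yates shuffle from above.
    n = len(arr)
    for i in range(n - 1):
        j = randint(i, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr

# Each of the 6 permutations of [1, 2, 3] should now appear about
# 1/6 of the time (bounds are deliberately loose).
counts = Counter(tuple(shuffle([1, 2, 3])) for _ in range(60000))
print(counts)
assert len(counts) == 6
for permutation, count in counts.items():
    assert 9000 < count < 11000
```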
18.3 Markov chain
A Markov chain can be thought of as a description of how likely some events are to
follow others. More mathematically, it describes the probabilities associated with a
given state transitioning to any other state.
For example, let's say the transition probabilities are as follows:

[('a', 'a', 0.9),
 ('a', 'b', 0.075),
 ('a', 'c', 0.025),
 ('b', 'a', 0.15),
 ('b', 'b', 0.8),
 ('b', 'c', 0.05),
 ('c', 'a', 0.25),
 ('c', 'b', 0.25),
 ('c', 'c', 0.5)]
This indicates that if we begin with state a, after one step there is a 90% chance
the state will continue to be a, a 7.5% chance the state will change to b, and a 2.5%
chance the state will change to c.
Suppose you are given a starting state start, a list of transition probabilities such as
the one above, and a number of steps num_steps. Run the associated Markov chain
starting from start for num_steps steps and compute the number of times each
state is visited.
One instance of running this particular Markov chain might produce {'a': 3012,
'b': 1656, 'c': 332}.
Solution
We need to run the Markov chain and keep counts of each state we visit.
It will be useful to define a next_state function that takes in the current state and
perhaps the possible transitions and their probabilities. Then we can run our Markov
chain, starting with start, by running next_state the desired number of times
while keeping track of the current state. All we have to do then is accumulate each
state's counts.
Finally, even though we receive the probabilities as a list of tuples, it will be convenient
if we transform the list into a dictionary so that we can look up the possible transitions
and their probabilities faster:
from collections import defaultdict
from random import random

def histogram_counts(start, trans_probs, num_steps):
    probs_dict = transform_probs(trans_probs)
    count_histogram = defaultdict(int)
    current_state = start

    for i in range(num_steps):
        count_histogram[current_state] += 1
        current_state = next_state(current_state, probs_dict)

    return count_histogram

def next_state(current_state, probs_dict):
    r = random()
    for possible_state, probability in probs_dict[current_state].items():
        r -= probability
        if r <= 0:
            return possible_state

def transform_probs(trans_probs):
    d = defaultdict(dict)
    for start, end, prob in trans_probs:
        d[start][end] = prob
    return d
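Putting the three functions together (restated so the snippet runs standalone) and running the chain from the problem statement:

```python
from collections import defaultdict
from random import random

def transform_probs(trans_probs):
    d = defaultdict(dict)
    for start, end, prob in trans_probs:
        d[start][end] = prob
    return d

def next_state(current_state, probs_dict):
    r = random()
    for possible_state, probability in probs_dict[current_state].items():
        r -= probability
        if r <= 0:
            return possible_state

def histogram_counts(start, trans_probs, num_steps):
    probs_dict = transform_probs(trans_probs)
    count_histogram = defaultdict(int)
    current_state = start
    for _ in range(num_steps):
        count_histogram[current_state] += 1
        current_state = next_state(current_state, probs_dict)
    return count_histogram

trans_probs = [
    ('a', 'a', 0.9), ('a', 'b', 0.075), ('a', 'c', 0.025),
    ('b', 'a', 0.15), ('b', 'b', 0.8), ('b', 'c', 0.05),
    ('c', 'a', 0.25), ('c', 'b', 0.25), ('c', 'c', 0.5),
]

counts = histogram_counts('a', trans_probs, 5000)
print(dict(counts))  # e.g. {'a': 3012, 'b': 1656, 'c': 332}
```

Since 'a' is by far the stickiest state, it should dominate the counts on essentially every run.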
Advanced Algorithms
With the concepts discussed in the previous chapters under your belt, you are well on
your way to acing your next coding interview. In general, interviewers are not out to
"trick" you: they are testing your knowledge of core data structures and algorithms,
and your ability to apply this knowledge in novel situations.
There are cases, however, where more challenging algorithms may come in handy,
and indeed there is practically no limit to the complexity of algorithm design. Whole
books can be written on subfields such as network optimization, approximation
algorithms, computational geometry, and more.
In this chapter, therefore, we aim to showcase a couple of advanced algorithms,
both because of their importance in computer science and also to demonstrate the
way in which simpler building blocks can be combined to create more specialized
procedures.
Our first problem introduces the Rabin-Karp algorithm, which uses a clever hashing
tactic to efficiently find patterns in a string. Next, we describe a method for finding an
Eulerian cycle in a graph known as Hierholzer's algorithm. Lastly, we work through
a question involving A*, a pathfinding algorithm which uses a heuristic function to
more optimally choose subsequent steps.
19.1 Rabin-Karp
Given a string and a pattern, find the starting indices of all occurrences of the pattern
in the string. For example, given the string "abracadabra" and the pattern "abr", you
should return [0, 7].
Solution
One solution would be to traverse the string, comparing the pattern to every slice
of the same size. When we find a match, we append the starting index to a list of
matches.
In the example above, we would compare "abr" against "abr", "bra", "rac", "aca", and
so on. If k is the length of the pattern, we are making up to k comparisons for each
slice, so our time complexity is O(k × n).
def find_matches(pattern, string):
    matches = []
    k = len(pattern)
    for i in range(len(string) - k + 1):
        if string[i : i + k] == pattern:
            matches.append(i)
    return matches
If we could somehow reduce the time it takes to compare the pattern to new slices
of the string, we'd be able to make this more efficient. This is the motivation behind
using a rolling hash.
To explain this idea, suppose we wanted to match the pattern "cat" against the string
"scatter". First let's assign each letter in the alphabet a value: a = 1, b = 2, ...,
z = 26. Now let's make a very simple hash function which adds up the values of
each letter in the string. For example, simple_hash("cat") would be 3 + 1 + 20 = 24.
To find our pattern matches, we start by comparing "cat" to "sca", the first
three-letter window of the string. Since simple_hash("sca") = 19 + 3 + 1 = 23, we
know we have not found a match, and we can continue. This is where the "rolling"
part comes in. Instead of computing the hash value for the next window from scratch,
there is a constant-time operation we can do: subtract the value of "s" and add the
value of "t". And in fact we find that 23 - value(s) + value(t) = 24, so we have a
match!
Continuing in this way, we slide along the string, computing the hash of each slice
and comparing it to the pattern hash. If the two values are equal, we check character
by character to ensure that there is indeed a match, and add the appropriate index to
our result.
def value(letter):
    return ord(letter) - 96

def simple_hash(prev, start, new):
    return prev - value(start) + value(new)

def find_matches(pattern, string):
    matches = []
    k = len(pattern)

    # First get the hash of the pattern.
    pattern_val = 0
    for i, char in enumerate(pattern):
        pattern_val += value(char)

    # Then get the hash of the first k letters of the string.
    hash_val = 0
    for i, char in enumerate(string[:k]):
        hash_val += value(char)

    if pattern_val == hash_val:
        if string[:k] == pattern:
            matches.append(0)

    # Roll the hash over each window of length k.
    for i in range(len(string) - k):
        hash_val = simple_hash(hash_val, string[i], string[i + k])
        if hash_val == pattern_val:
            if string[i + 1 : i + k + 1] == pattern:
                matches.append(i + 1)

    return matches
The problem with this solution, though, is that our hash function is not that good.
For example, when we match "abr" against "abracadabra", our pattern will match both
"abr" and "bra", requiring us to perform extra string matches.
A more sophisticated rolling hash function is called Rabin-Karp, and a simplified
version of it works as follows.
Suppose we have a string of length k. Each letter in the string is first mapped to
1 - 26 as above, then multiplied by a power of 26. The first character is multiplied
by 26^(k-1), the second by 26^(k-2), and so on, all the way to the kth letter, which is
multiplied by 26^0. Finally, we add all these values together to get the hash of the
string.
So for example, "cat" will evaluate to 3 * 26^2 + 1 * 26^1 + 20 * 26^0 = 2074.
Now suppose we are sliding a rolling window over the word "cats", and we next want
to find the hash value of "ats". Instead of computing the value from scratch, we carry
out the following steps:
• subtract the value of the first letter of the current hash (3 * 26^2)
• multiply the current hash value by 26
• add the value of the last letter in the shifted window (19)
Using these steps, the value of "ats" will be (2074 - 3 * 26^2) * 26 + 19 = 1215.
This works because we are essentially shifting the powers of 26 leftward.
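The roll arithmetic can be checked in a few lines of Python (the helper names here are ours):

```python
def value(letter, power):
    return (26 ** power) * (ord(letter) - 96)

def hash_from_scratch(s):
    # Sum of letter values weighted by descending powers of 26.
    k = len(s)
    return sum(value(ch, k - i - 1) for i, ch in enumerate(s))

cat = hash_from_scratch("cat")
print(cat)  # 2074, i.e. 3 * 26^2 + 1 * 26 + 20

# Rolling from "cat" to "ats": drop 'c', scale by 26, add 's'.
rolled = (cat - value('c', 2)) * 26 + value('s', 0)
assert rolled == hash_from_scratch("ats")
```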
If we replace our simple hash function with the new and improved one, our solution
should look like this:
def value(letter, power):
    return (26 ** power) * (ord(letter) - 96)

def rabin_hash(prev, start, new, k):
    return (prev - value(start, k - 1)) * 26 + value(new, 0)
def find_matches(pattern, string):
    matches = []
    k = len(pattern)

    pattern_val = 0
    for i, char in enumerate(pattern):
        pattern_val += value(char, k - i - 1)

    hash_val = 0
    for i, char in enumerate(string[:k]):
        hash_val += value(char, k - i - 1)

    if pattern_val == hash_val:
        if string[:k] == pattern:
            matches.append(0)

    for i in range(len(string) - k):
        hash_val = rabin_hash(hash_val, string[i], string[i + k], k)
        if pattern_val == hash_val:
            if string[i + 1 : i + k + 1] == pattern:
                matches.append(i + 1)

    return matches
In the worst case, if our hash function produces many false positives, we will still
have to check each substring against the pattern, so this would take O(k × n).
Practically, however, we should not see too many false positives. In the average case,
since our algorithm takes O(k) to compute the hash of the pattern and the start of
the string, and O(n - k) to roll the hash over the rest of the string, our running time
should be O(n).
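Restating the final version in slightly condensed form, we can confirm the example from the problem statement:

```python
def value(letter, power):
    return (26 ** power) * (ord(letter) - 96)

def rabin_hash(prev, start, new, k):
    return (prev - value(start, k - 1)) * 26 + value(new, 0)

def find_matches(pattern, string):
    # Restated from above, with the initial hashes condensed into sums.
    matches = []
    k = len(pattern)
    pattern_val = sum(value(char, k - i - 1) for i, char in enumerate(pattern))
    hash_val = sum(value(char, k - i - 1) for i, char in enumerate(string[:k]))

    if pattern_val == hash_val and string[:k] == pattern:
        matches.append(0)

    for i in range(len(string) - k):
        hash_val = rabin_hash(hash_val, string[i], string[i + k], k)
        if hash_val == pattern_val and string[i + 1 : i + k + 1] == pattern:
            matches.append(i + 1)

    return matches

print(find_matches("abr", "abracadabra"))  # [0, 7]
```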
19.2 Hierholzer's algorithm
For a set of characters C and an integer k, a De Bruijn sequence is a cyclic sequence
in which every possible k-length string of characters in C occurs exactly once.
For example, suppose C = {0, 1} and k = 3. Then our sequence should contain the
substrings {'000', '001', '010', '011', '100', '101', '110', '111'}, and
one possible solution would be 00010111.
Create an algorithm that finds a De Bruijn sequence for a given C and k.
Solution
There is a neat way of solving this problem using a graph representation.
Let every possible string of length k - 1 of characters in C be vertices. For the
example above, these would be 00, 01, 10, and 11. Then each string of length k can
be represented as an edge between two vertices that overlap to form it. For example,
000 would be a loop from 00 to itself, whereas 001 would be an edge between 00
and 01.
We can construct this graph like so:
from itertools import product

def make_graph(C, k):
    # Use Cartesian product to get all strings of length k - 1.
    vertices = product(C, repeat=k - 1)

    edges = {}
    for v in vertices:
        # Create edges between two vertices that overlap properly.
        # (Vertices are tuples, so the appended character is wrapped
        # in a tuple before concatenation.)
        edges[v] = [v[1:] + (char,) for char in C]

    return edges
The graph created would be as follows:
In order to find the De Bruijn sequence, we must traverse each edge exactly once.
In other words, we must find an Eulerian cycle. One method to accomplish this is
known as Hierholzer's algorithm, which works as follows.
Starting with a given vertex, we move along edges randomly, adding the vertices seen
to our path, until we come back to where we started. If this path uses up every edge
in our graph, we are done. Otherwise, we can take one of the vertices in our path
that has an unused edge, perform the same process on that vertex, and substitute the
new path back into the original path to replace the vertex.
We can continue inserting new cycles in this manner until every edge of the graph is
used.
For example, suppose we traversed the graph above starting with 01, and found the
following path: 01 -> 11 -> 11 -> 10 -> 00 -> 00 -> 01. Since there is still
an unused edge attached to 01, we would next find the path 01 -> 10 -> 01 and
substitute it for '01' at the beginning of our original path, resulting in a valid De
Bruijn sequence.
def find_eulerian_cycle(graph):
    cycle = []
    start = list(graph)[0]
    before = after = []

    while graph:
        if cycle:
            # Find the next vertex to expand into a cycle.
            start = next(vertex for vertex in cycle if vertex in graph)
            index = cycle.index(start)
            before = cycle[:index]; after = cycle[index + 1:]

        cycle = [start]
        prev = start
        while True:
            # Keep popping from the graph and appending to the chain
            # until a cycle is formed.
            curr = graph[prev].pop()
            if not graph[prev]:
                graph.pop(prev)
            cycle.append(curr)
            if curr == start:
                break
            prev = curr

        cycle = before + cycle + after

    return cycle
Instead of returning the vertices themselves, it suffices to return the last element of each one, since that determines the actual path taken. Therefore, our main function should look like this:
def debruijn(C, k):
    graph = make_graph(C, k)
    cycle = find_eulerian_cycle(graph)
    sequence = [v[-1] for v in cycle[:-1]]
    return sequence
The time required to create the graph will be on the order of the number of edges, which is |C|^k. To find the Eulerian cycle, in the best case, we will not need to insert any new paths, so we only need to consider the time needed to pop off and append each edge, which is O(E). In the worst case, however, we might perform around log E substitutions, so this algorithm would take closer to O(E log E) time.
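As a sanity check, we can verify that a candidate output such as 00010111 really is a De Bruijn sequence by treating it as cyclic and collecting every k-length window. This helper is not part of the solution above, just a quick test harness:

```python
from itertools import product

def is_de_bruijn(seq, C, k):
    # Treat the sequence as cyclic by appending its first k-1 characters,
    # then collect every window of length k.
    extended = seq + seq[:k - 1]
    windows = [extended[i:i + k] for i in range(len(seq))]
    # Every k-length string over C must appear exactly once.
    expected = {''.join(p) for p in product(C, repeat=k)}
    return len(windows) == len(set(windows)) and set(windows) == expected

print(is_de_bruijn('00010111', '01', 3))  # True
```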
19.3
A* search
An 8-puzzle is a game played on a 3 x 3 board of tiles with the ninth tile missing.
The remaining tiles are labeled 1 through 8 but shuffled randomly. Tiles may slide
horizontally or vertically into an empty space, but may not be removed from the
board.
Design a class to represent the board and find a series of steps to bring the board to
the state [[1, 2, 3], [4, 5, 6], [7, 8, None]].
Solution
This is a tough problem to implement in an interview setting, but fear not: we will
go through it step by step.
Let's get the challenging part out of the way first: the algorithm. We might first
think to use a graph algorithm like Dijkstra's or breadth-first search, but there are
some challenges. For one, there are tons of cycles. If we consider a state to be any
configuration of digits in the puzzle, it is clear that we can go back and forth between
any two adjacent states forever.
Even if we prohibit going back to previous states, moving tiles randomly around
the board is extremely inefficient. What we need is an algorithm with the following
properties:
• For any given position, we should evaluate successor states and select the best
one.
• We should track multiple paths at once, and switch paths if there is a path
more promising than the current one.
CHAPTER 19. ADVANCED ALGORITHMS
228
One algorithm with both of these qualities is A*.
This is a pathfinding algorithm in which we store all relevant paths in a priority
queue, implemented using a heap. With each item we pop from the queue, we:
• Check if the current state matches our goal, in which case we can return the
move count and path.
• Find all possible successors to the current state.
• Push each unvisited successor into the heap, along with an updated move
count and path.
Crucially, the priority used to push and pop items will be a heuristic which evaluates
the closeness of the current state to our goal, plus the number of moves already
traveled. This way, shorter paths, as well as those which look more promising, will be
selected first.
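Concretely, the priority is f = g + h: the number of moves already made plus the heuristic estimate of moves remaining. A miniature example with made-up numbers shows how the heap orders candidate states:

```python
import heapq

# Each entry is (moves_taken + heuristic_estimate, label).
heap = []
heapq.heappush(heap, (3 + 4, 'state A'))  # f = 7
heapq.heappush(heap, (5 + 1, 'state B'))  # f = 6
heapq.heappush(heap, (2 + 6, 'state C'))  # f = 8

print(heapq.heappop(heap))  # (6, 'state B')
```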
import heapq

def search(start):
    heap = []
    visited = set()
    heapq.heappush(heap, [start.heuristic, 0, start, ''])

    while heap:
        _, moves, board, path = heapq.heappop(heap)
        if board.tiles == board.goal:
            return moves, path
        visited.add(tuple(board.tiles))
        for successor, direction in board.get_moves():
            if tuple(successor.tiles) not in visited:
                item = [moves + 1 + successor.heuristic, moves + 1,
                        successor, path + direction]
                heapq.heappush(heap, item)

    return None
Note that this algorithm will require our board class to store the tiles and the goal
state. We also must implement get_moves, which finds all valid successor states
from a given position.
We will store the tiles as a simple list.
To find the next moves, we first locate the index of the empty square, represented as
zero in our list. Next, we examine each of the four ways of swapping a tile vertically
or horizontally into this square. If the movement is valid, we add the new board
state into the list of successors to return, as well as the direction the tile was moved.
With this logic settled, we can begin to flesh out our class.
from copy import copy

class Board:
    def __init__(self, nums, goal='123456780'):
        self.goal = list(map(int, goal))
        self.tiles = nums
        self.empty = self.tiles.index(0)

    def swap(self, empty, diff):
        tiles = copy(self.tiles)
        tiles[empty], tiles[empty + diff] = \
            tiles[empty + diff], tiles[empty]
        return tiles

    def get_moves(self):
        successors = []
        empty = self.empty
        if empty // 3 > 0:
            successors.append((Board(self.swap(empty, -3)), 'D'))
        if empty // 3 < 2:
            successors.append((Board(self.swap(empty, +3)), 'U'))
        if empty % 3 > 0:
            successors.append((Board(self.swap(empty, -1)), 'R'))
        if empty % 3 < 2:
            successors.append((Board(self.swap(empty, +1)), 'L'))
        return successors
As you may have noticed above, A* depends heavily on the quality of the heuristic
used. A bad heuristic is not just inefficient; it may doom the algorithm to failure!
For this problem, we would like to estimate how close we are to the goal, which is a
board that looks like this:
1 2 3
4 5 6
7 8 0
One useful metric we can take advantage of is Manhattan distance. To calculate the
Manhattan distance of two items in a grid, we count up the number of horizontal or
vertical moves it would take to get from one to the other. Our heuristic will then be
the sum of each digit's Manhattan distance to its position in the goal state.
class Board:
    def __init__(self, nums, goal='123456780'):
        self.goal = list(map(int, goal))
        self.tiles = nums
        self.heuristic = self.heuristic()

    def manhattan(self, a, b):
        a_row, a_col = a // 3, a % 3
        b_row, b_col = b // 3, b % 3
        return abs(a_row - b_row) + abs(a_col - b_col)

    def heuristic(self):
        total = 0
        for digit in range(1, 9):
            total += self.manhattan(self.tiles.index(digit),
                                    self.goal.index(digit))
        return total
Putting it all together, given an arbitrary starting list of numbers, we can first initialize
a board class with these numbers, and then use our search algorithm to find an efficient
solution.
import heapq
from copy import copy

class Board:
    def __init__(self, nums, goal='123456780'):
        self.goal = list(map(int, goal))
        self.tiles = nums
        self.empty = self.tiles.index(0)
        self.heuristic = self.heuristic()

    def __lt__(self, other):
        return self.heuristic < other.heuristic

    def manhattan(self, a, b):
        a_row, a_col = a // 3, a % 3
        b_row, b_col = b // 3, b % 3
        return abs(a_row - b_row) + abs(a_col - b_col)

    def heuristic(self):
        total = 0
        for digit in range(1, 9):
            total += self.manhattan(self.tiles.index(digit),
                                    self.goal.index(digit))
        return total

    def swap(self, empty, diff):
        tiles = copy(self.tiles)
        tiles[empty], tiles[empty + diff] = \
            tiles[empty + diff], tiles[empty]
        return tiles

    def get_moves(self):
        successors = []
        empty = self.empty
        if empty // 3 > 0:
            successors.append((Board(self.swap(empty, -3)), 'D'))
        if empty // 3 < 2:
            successors.append((Board(self.swap(empty, +3)), 'U'))
        if empty % 3 > 0:
            successors.append((Board(self.swap(empty, -1)), 'R'))
        if empty % 3 < 2:
            successors.append((Board(self.swap(empty, +1)), 'L'))
        return successors

def search(start):
    heap = []
    closed = set()
    heapq.heappush(heap, [start.heuristic, 0, start, ''])

    while heap:
        _, moves, board, path = heapq.heappop(heap)
        if board.tiles == board.goal:
            return moves, path
        closed.add(tuple(board.tiles))
        for successor, direction in board.get_moves():
            if tuple(successor.tiles) not in closed:
                item = [moves + 1 + successor.heuristic, moves + 1,
                        successor, path + direction]
                heapq.heappush(heap, item)

    return float('inf'), None

def solve(nums):
    start = Board(nums)
    count, path = search(start)
    return count, path
The running time and space of A* is O(b^d) in the worst case, where b is the average number of successors per state, and d is the length of the shortest path solution. Using our heuristic, however, reduces this considerably in practice.
Finally, we should note that up to now we have assumed that a solution always exists, which in fact is not the case. For example, we will never be able to permute the following grid to our goal state:

1 2 3
4 5 6
8 7 0

In this case, we will end up evaluating every possible permutation of the board reachable from the starting state, which will be around O(n!), where n is the number of digits.
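In fact, we can detect this situation up front rather than exhausting the search. For a 3-wide sliding puzzle, a configuration is reachable from the goal exactly when the number of inversions among its tiles (ignoring the blank) is even. A small check along these lines:

```python
def is_solvable(tiles):
    # Count inversions among the non-blank tiles; for a 3-wide board,
    # the puzzle is solvable exactly when this count is even.
    nums = [t for t in tiles if t != 0]
    inversions = sum(1 for i in range(len(nums))
                     for j in range(i + 1, len(nums))
                     if nums[i] > nums[j])
    return inversions % 2 == 0

print(is_solvable([1, 2, 3, 4, 5, 6, 7, 8, 0]))  # True
print(is_solvable([1, 2, 3, 4, 5, 6, 8, 7, 0]))  # False
```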
Part III

Applications
We have up to this point gone through the essentials of data structures and algorithms.
If you have worked through each of the preceding problems, you should be confident
in your ability to break down any interview problem and find a solution.
In this section we have collected a set of questions which demonstrate the real-world usefulness of the concepts previously introduced. Each one will require you to
Fair warning: several of these problems are substantially more difficult than ones
in the preceding chapters, and indeed than what you will likely see in an interview
setting. Practicing these higher-difficulty questions will help to ensure that you really
understand the core concepts, and ideally will make your next interview feel relaxing
as a result!
We recommend the following method when working through this chapter's questions,
and it applies just as well to an interview setting.
First, read over each problem carefully. Understand what is being asked for, and look
for key words that jog your memory of data structures and algorithms from previous
chapters. Does a question require finding the top k items? Maybe a heap can be
used. Are there overlapping subproblems? Go for dynamic programming.
CHAPTER 20. APPLICATIONS
Next, think through how these data structures or algorithms can be put to use. Sketch
out a skeleton of how your code would work at a high level, without writing the
implementation of any functions. In an interview setting, this would be a good time
to check in with your interviewer to ensure that your structure makes sense.
Once your skeleton code is in place, flesh out the definitions of each function, and
adapt the code as necessary to deal with problems that arise. Don't be afraid to alter
your approach along the way, but try to figure out why your current implementation isn't working before making any drastic changes.
Of course, you should feel free to refer to the solution if you get stuck. If this happens,
we recommend coming back to the problem later and working through it a second
(or third!) time, to build up your problem-solving skills.
Best of luck!
20.1
Ghost
Ghost is a two-person word game where players alternate appending letters to a word. The first person who spells out a dictionary word, or creates a prefix for which there is no possible continuation, loses. Here is a sample game:
Turn       Letter
Player 1   g
Player 2   h
Player 1   o
Player 2   s
Player 1   t
Player 1 loses since they spelled "ghost".
Given a dictionary of words, determine the letters the first player should start with, such that with optimal play they cannot lose.
For example, if the dictionary is ["cat", "calf", "dog", "bear"], the only winning start letter would be b.
Solution
This is a case where using the right data structure gets you most of the way to a
solution.
For any set of starting letters, we want to efficiently find out which words can be
created later on in the game. In order to achieve this, we should be able to quickly
find words that complete a given prefix. This sounds like a perfect use case for a trie.
Just like in our chapter on tries, we can build one as follows:
ENDS_HERE = '#'

class Trie:
    def __init__(self, words):
        self._trie = {}
        for word in words:
            self.insert(word)

    def insert(self, text):
        trie = self._trie
        for char in text:
            if char not in trie:
                trie[char] = {}
            trie = trie[char]
        trie[ENDS_HERE] = True

    def find(self, prefix):
        trie = self._trie
        for char in prefix:
            if char in trie:
                trie = trie[char]
            else:
                return None
        return trie
When we initialize this trie with the words above, the resulting dictionary will look like this:

{'c': {'a': {'t': {'#': True}, 'l': {'f': {'#': True}}}},
 'd': {'o': {'g': {'#': True}}},
 'b': {'e': {'a': {'r': {'#': True}}}}}
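The same nested-dictionary structure can also be built compactly with setdefault. This standalone sketch reproduces the dictionary above and peeks at the subtree under the prefix 'ca':

```python
ENDS_HERE = '#'

def build(words):
    trie = {}
    for word in words:
        node = trie
        for char in word:
            # setdefault returns the existing child, or creates one.
            node = node.setdefault(char, {})
        node[ENDS_HERE] = True
    return trie

trie = build(['cat', 'calf', 'dog', 'bear'])
# Both continuations of the prefix 'ca' are visible in its subtree.
print(sorted(trie['c']['a']))  # ['l', 't']
```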
Next, we must figure out what the winning prefixes are. Here, we can work backwards.
Any prefix which is itself a word is clearly a losing prefix. Going one step up, if
every next letter that can be added to a given prefix creates a loss, then this must be a
winning prefix. For example, "do" is losing, since the only continuation is g, which
creates a word.
Therefore, we can recursively determine the optimal starting letters by figuring out
which ones have only losing children.
def is_winning(trie, prefix):
    root = trie.find(prefix)
    if '#' in root:
        return False
    else:
        next_moves = [prefix + letter for letter in root]
        if any(is_winning(trie, move) for move in next_moves):
            return False
        else:
            return True
def optimal_starting_letters(words):
    trie = Trie(words)
    winners = []
    starts = trie._trie.keys()
    for letter in starts:
        if is_winning(trie, letter):
            winners.append(letter)
    return winners
Constructing the trie will take O(n x k) time, where n is the number of words in the dictionary, and k is their average length. To find the optimal first letters, we must traverse each path in the trie, which again takes O(n x k). Therefore, this algorithm runs in O(n x k) time.
20.2
Connect 4
Connect 4 is a game where opponents take turns dropping red or black discs into
a 7 x 6 vertically suspended grid. The game ends either when one player creates a
line of four consecutive discs of their color (horizontally, vertically, or diagonally), or
when there are no more spots left in the grid.
Design and implement Connect 4.
Solution
For any design question, it is helpful to spend some time thinking about the structure
of our solution before jumping into coding. What are some of the core features of
Connect 4, as described above?
• We should represent the board in a way that allows it to change state with
each move.
• We should be able to display the board to the user.
• Players should be able to input valid moves.
• We should be able to check whether a win condition has been met.
• To play our game, we should repeatedly display the board, make a move, and
check for a win condition.
In an interview setting, it is often useful to create a skeleton of your solution which contains the main methods and how they interact. Here, then, is our skeleton code for Connect 4:
class Game:
    def __init__(self):
        self.board = [['.' for _ in range(7)] for _ in range(6)]
        self.game_over = False
        self.winner = None
        self.last_move = None
        self.players = ['x', 'o']
        self.turn = 0

    def play(self):
        while not self.game_over:
            self.print_board()
            self.move(self.players[self.turn])
            self.check_win()
        self.print_outcome()

    def print_board(self):
        # Display the board to the user.
        pass

    def move(self, player):
        # Get and validate input from the user, and update board state.
        pass

    def is_valid(self, move):
        # Make sure the move can be made.
        pass

    def check_win(self):
        # Check for four in a row, and set self.game_over if found.
        pass

    def print_outcome(self):
        # Congratulate the winner, or declare that there was a tie.
        pass
Now that we have a general idea for how to proceed, let's go through each of these
methods.
• Printing the board
We will represent our board as a matrix of 6 rows and 7 columns, where board[i][j] represents the i-th row and j-th column. Each location in the board can be filled in by an x or an o, representing opposing players. To display the board, we will print out each row one at a time.
def print_board(self):
    for row in self.board:
        print("".join(row))
• Making a move
To make a move, we should first ask the user for a column to place a disc in. An input is valid as long as it is a number between 0 and 6, and as long as that column is not already full. Given a valid input, we update the board state and change the turn variable. Note that since discs are dropped vertically into the board, each column will be filled from bottom to top.
Additionally, we will track last_move as a class variable, since that will allow us to
more efficiently check the win condition.
def move(self, player):
    col = input("{0}'s turn to move: ".format(player))
    while not self.is_valid(col):
        col = input("Move not valid. Please try again: ")

    row, col = 5, int(col)
    while self.board[row][col] != '.':
        row -= 1

    self.board[row][col] = player
    self.turn = 1 - self.turn
    self.last_move = (row, col)

def is_valid(self, col):
    try:
        col = int(col)
    except ValueError:
        return False
    if 0 <= col <= 6 and self.board[0][col] == '.':
        return True
    else:
        return False
• Checking for a win
A naive way of checking for four in a row would be to enumerate all possible lines
of four, and for each of these, check to see if all the values are either x or o. We can
improve this by noting that if a player has just placed a disc in board [ row] [col],
the only possible wins are those that include that particular row and column.
To make use of this, we will first locate the row, column, positive diagonal, and
negative diagonal corresponding to the location of the last played move. For each
of these, we will check whether any length-four slice consists entirely of the same
player's value, and if so, update the variables game_over and winner.
Finally, we must check if the board is completely full, in which case the game is over
regardless of whether or not someone has won.
def check_win(self):
    row, col = self.last_move

    horizontal = self.board[row]
    vertical = [self.board[i][col] for i in range(6)]
    neg_offset, pos_offset = col - row, col + row
    neg_diagonal = [row[i + neg_offset] for i, row in
                    enumerate(self.board) if 0 <= i + neg_offset <= 6]
    pos_diagonal = [row[-i + pos_offset] for i, row in
                    enumerate(self.board) if 0 <= -i + pos_offset <= 6]

    possible_wins = [horizontal, vertical, pos_diagonal, neg_diagonal]
    for p in possible_wins:
        for i in range(len(p) - 3):
            if len(set(p[i : i + 4])) == 1 and p[i] != '.':
                self.game_over = True
                self.winner = p[i]
                break

    if all(self.board[0][col] != '.' for col in range(7)):
        self.game_over = True
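The diagonal slices are the subtle part: every cell on the same negative diagonal as (row, col) satisfies j - i = col - row, and every cell on the same positive diagonal satisfies j + i = col + row. A standalone sketch on a 3 x 3 grid makes the two comprehensions easier to trust:

```python
board = [list('abc'),
         list('def'),
         list('ghi')]
row, col = 1, 1  # the center cell, 'e'

neg_offset, pos_offset = col - row, col + row
# Cells where column - row matches the offset of (row, col).
neg_diagonal = [r[i + neg_offset] for i, r in enumerate(board)
                if 0 <= i + neg_offset <= 2]
# Cells where column + row matches the offset of (row, col).
pos_diagonal = [r[-i + pos_offset] for i, r in enumerate(board)
                if 0 <= -i + pos_offset <= 2]

print(neg_diagonal)  # ['a', 'e', 'i']
print(pos_diagonal)  # ['c', 'e', 'g']
```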
• The full game
Putting it all together, our code should look something like this:
class Game:
    def __init__(self):
        self.board = [['.' for _ in range(7)] for _ in range(6)]
        self.game_over = False
        self.winner = None
        self.last_move = None
        self.players = ['x', 'o']
        self.turn = 0

    def play(self):
        while not self.game_over:
            self.print_board()
            self.move(self.players[self.turn])
            self.check_win()
        self.print_outcome()

    def print_board(self):
        for row in self.board:
            print("".join(row))

    def move(self, player):
        col = input("{0}'s turn to move: ".format(player))
        while not self.is_valid(col):
            col = input("Move not valid. Please try again: ")

        row, col = 5, int(col)
        while self.board[row][col] != '.':
            row -= 1

        self.board[row][col] = player
        self.turn = 1 - self.turn
        self.last_move = (row, col)

    def is_valid(self, col):
        try:
            col = int(col)
        except ValueError:
            return False
        if 0 <= col <= 6 and self.board[0][col] == '.':
            return True
        else:
            return False

    def check_win(self):
        row, col = self.last_move

        horizontal = self.board[row]
        vertical = [self.board[i][col] for i in range(6)]
        neg_offset, pos_offset = col - row, col + row
        neg_diagonal = [row[i + neg_offset] for i, row in
                        enumerate(self.board) if 0 <= i + neg_offset <= 6]
        pos_diagonal = [row[-i + pos_offset] for i, row in
                        enumerate(self.board) if 0 <= -i + pos_offset <= 6]

        possible_wins = [horizontal, vertical, pos_diagonal, neg_diagonal]
        for p in possible_wins:
            for i in range(len(p) - 3):
                if len(set(p[i : i + 4])) == 1 and p[i] != '.':
                    self.game_over = True
                    self.winner = p[i]
                    break

        if all(self.board[0][col] != '.' for col in range(7)):
            self.game_over = True

    def print_outcome(self):
        self.print_board()
        if not self.winner:
            print("Game over: it was a draw!")
        else:
            print("Game over: {0} won!".format(self.winner))
20.3
Cryptarithmetic
A cryptarithmetic puzzle is a mathematical game where the digits of some numbers
are represented by letters, and you must figure out the correct mapping. Each letter
represents a unique digit.
For example, a puzzle of the form:
  SEND
+ MORE
------
 MONEY

may have the solution:

{'S': 9, 'E': 5, 'N': 6, 'D': 7, 'M': 1, 'O': 0, 'R': 8, 'Y': 2}
Given a three-word puzzle like the one above, create an algorithm that finds a
solution.
Solution
One way of solving this would be to check every numerical value between 0 and 9 for each character, and return the first character-to-number mapping that works. Assuming it takes O(n) to validate a mapping, where n is the number of digits in the sum, this would take O(n x 10^k), where k is the number of distinct characters.
Instead, we can use backtracking to cut down on the number of possible mappings to check. Recall that for backtracking to be effective, we should be able to construct a partial solution, verify if that partial solution is invalid, and check if the solution is complete. Let's answer each of these in turn.
• Can we construct a partial solution?
Yes, we can assign digits to a subset of our distinct characters. In the problem above, for example, a partial solution might just assign S = 5.
• Can we verify if the partial solution is invalid?
Even though we may not know the value of each letter, there are cases where we can disqualify certain solutions. Once we have substituted all the known numbers for letters, we can start with the rightmost column and try to add digits. If we find that column addition results in an incorrect sum, we know the solution will not work. For example, if we had assigned D = 3, E = 2, and Y = 8, our partial solution above would look like the following:

  S2N3
+ MOR2
------
 MON28
No matter what the other characters represent, this cannot work. If the ones column works, we can continue this process, moving leftward across our columns until we find an incorrect sum, or a character whose value is unknown, at which point we cannot check any further.
def is_valid(letters, words):
    a, b, c = words
    n = len(c)
    carry = 0
    for i in range(n - 1, -1, -1):
        if any(letters[word[i]] is None for word in words):
            return True
        elif letters[a[i]] + letters[b[i]] + carry == letters[c[i]]:
            carry = 0
        elif letters[a[i]] + letters[b[i]] + carry == 10 + letters[c[i]]:
            carry = 1
        else:
            return False
    return True
• Can we check if the solution is complete?
Yes, if is_valid returns True, and we have assigned all letters to numbers, then we
have a complete solution.
Therefore, we can implement the solver as a depth-first search, where at each level
we take an unassigned letter, assign a digit to it, and check whether the resulting
letter map is valid. If it is, we go one step deeper. Otherwise, we try a different digit.
If we reach a point where all letters have been assigned and the solution is valid, we
return this result.
def solve(letters, unassigned, nums, words):
    if not unassigned:
        if is_valid(letters, words):
            return letters
        else:
            return None

    char = unassigned[0]
    # Iterate over a copy, since we mutate nums inside the loop.
    for num in list(nums):
        letters[char] = num
        nums.remove(num)
        if is_valid(letters, words):
            solution = solve(letters, unassigned[1:], nums, words)
            if solution:
                return solution
        nums.add(num)
        letters[char] = None
    return None
If we chose characters to assign at random, this would still be fairly inefficient. For example, it is not useful to assign S = 5 first, because we do not have enough other letters in place to check whether this could be valid. To fix this, we can order our letters by when they appear when carrying out our validity check. We can find this order by going down each column, from right to left, and appending new characters to an ordered dictionary of letters.
from collections import OrderedDict

def order_letters(words):
    n = len(words[2])
    letters = OrderedDict()
    for i in range(n - 1, -1, -1):
        for word in words:
            if word[i] not in letters:
                letters[word[i]] = None
    return letters
In order for this to work, we must guarantee that each word is the same length. To do so, we can prepend a character that represents 0, such as #, to each of the first two words, until their length is the same as that of the sum.
def normalize(word, n):
    diff = n - len(word)
    return ['#'] * diff + word
Finally, the code that invokes these functions would be as follows:
def cryptanalyze(problem):
    # Input problem is given as [word, word, total].
    words = list(map(list, problem))
    n = len(words[2])
    words[0] = normalize(words[0], n)
    words[1] = normalize(words[1], n)
    letters = order_letters(words)
    # The padding character always represents 0, so that the leftmost
    # columns are checked properly.
    letters['#'] = 0
    unassigned = [letter for letter in letters if letter != '#']
    nums = set(range(0, 10))
    return solve(letters, unassigned, nums, words)
To analyze the time complexity, we can look at each function. For is_valid, we check up to three words for each column, so this will be O(n), where n is the number of letters in the sum. If we let k be the number of distinct characters, then we will call solve O(k!) times, since each call reduces the number of characters to solve by one. Since each solve attempt requires a validity check, the overall complexity will be O(n x k!).
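For comparison, a throwaway brute-force checker (not part of the solution above, and with a hypothetical name) makes the problem statement concrete on a small puzzle, TO + GO = OUT, by trying every assignment of digits to letters:

```python
from itertools import permutations

def brute_force(a, b, c):
    letters = sorted(set(a + b + c))
    for perm in permutations(range(10), len(letters)):
        mapping = dict(zip(letters, perm))
        # Disallow leading zeros in any of the three words.
        if 0 in (mapping[a[0]], mapping[b[0]], mapping[c[0]]):
            continue
        to_num = lambda word: int(''.join(str(mapping[ch]) for ch in word))
        if to_num(a) + to_num(b) == to_num(c):
            return mapping
    return None

print(brute_force('TO', 'GO', 'OUT'))  # {'G': 8, 'O': 1, 'T': 2, 'U': 0}
```

This confirms the unique solution 21 + 81 = 102, but at the cost of trying up to 10! / (10 - k)! permutations; the backtracking solver prunes most of that work.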
20.4
Cheapest itinerary
You are given a huge list of airline ticket prices between different cities around the world on a given day. These are all direct flights. Each element in the list has the format (source_city, destination, price).
Consider a user who is willing to take up to k connections to get from their origin
city A to their destination B. Find the cheapest fare possible for this journey and
print the itinerary.
For example, our traveler wants to go from JFK to LAX with up to 3 connections, and our input flights are as follows:

('JFK', 'ATL', 150),
('ATL', 'SFO', 400),
('ORD', 'LAX', 200),
('LAX', 'DFW', 80),
('JFK', 'HKG', 800),
('ATL', 'ORD', 90),
('JFK', 'LAX', 500)

Due to some improbably low flight prices, the cheapest itinerary would be JFK -> ATL -> ORD -> LAX, costing $440.
Solution
Let's first think about how we would approach this problem without the constraint of limiting our traveler to k connections. We can think of each location as a vertex
on a graph, and flights between them as edges. Considered in this light, our problem
is really about finding the shortest path between two vertices. As discussed in the
chapter on pathfinding, there are a few ways of achieving this, but one common
approach is to use Dijkstra's algorithm.
We will maintain a heap that is keyed on the total cost of the journey so far, and
which additionally holds the current node and the accumulated path. Initially, this
heap will store a single item representing the fact that it costs nothing to begin at
our source airport.
At each step of the process, we pop the lowest-cost item off the heap. Then, we take all unvisited connecting airports and place them on the heap, with their accumulated flight cost and path. Once we reach our destination, we return these values.
To handle the extra constraint, we can add another variable to each heap item representing how many remaining connections are allowed. Initially this will be k + 1, and for each flight taken we will decrement it by one. If it reaches 0, we know that we cannot continue the current path, so we will skip to the next item.
import heapq
from collections import defaultdict

def get_itinerary(flights, source, destination, k):
    prices = defaultdict(dict)
    for u, v, cost in flights:
        prices[u][v] = cost

    path = [source]
    visited = set()
    heap = [(0, source, k + 1, path)]

    while heap:
        cost, u, k, path = heapq.heappop(heap)
        visited.add(u)

        # Stop once we reach our destination.
        if u == destination:
            return cost, path

        # Decrement k with each flight taken.
        if k > 0:
            for v in prices[u]:
                if v not in visited:
                    heapq.heappush(heap,
                        (prices[u][v] + cost, v, k - 1, path + [v]))

    return -1
The time complexity of Dijkstra's algorithm is O(E + V log V). Here, E represents the number of flights in our input and V represents the number of airports. As for space complexity, our heap may store up to V items, each of which can have a path of length k, so we can describe this as O(V x k).
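To confirm the worked example end to end, here is the same routine restated as a self-contained script (with the remaining-connections counter renamed hops, so it is not shadowed by the parameter k):

```python
import heapq
from collections import defaultdict

def get_itinerary(flights, source, destination, k):
    prices = defaultdict(dict)
    for u, v, cost in flights:
        prices[u][v] = cost

    heap = [(0, source, k + 1, [source])]
    visited = set()
    while heap:
        cost, u, hops, path = heapq.heappop(heap)
        visited.add(u)
        if u == destination:
            return cost, path
        if hops > 0:
            for v in prices[u]:
                if v not in visited:
                    heapq.heappush(heap,
                        (cost + prices[u][v], v, hops - 1, path + [v]))
    return -1

flights = [('JFK', 'ATL', 150), ('ATL', 'SFO', 400), ('ORD', 'LAX', 200),
           ('LAX', 'DFW', 80), ('JFK', 'HKG', 800), ('ATL', 'ORD', 90),
           ('JFK', 'LAX', 500)]
print(get_itinerary(flights, 'JFK', 'LAX', 3))
# (440, ['JFK', 'ATL', 'ORD', 'LAX'])
```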
20.5
Alien dictionary
You come across a dictionary of sorted words in a language you've never seen before. Write a program that returns the correct order of letters in this language.
For example, given ['xww', 'wxyz', 'wxyw', 'ywx', 'ywz'], you should return ['x', 'z', 'w', 'y'].
Solution
It may be hard to know where to start with a problem like this, so let's look at the
example for some guidance.
Note that from the last two words, ywx and ywz, we can tell that "x" must come
before "z". This is because the first two letters of each word match, and therefore the
tiebreaker must be the third letter.
In fact, if we make pairwise comparisons of all the adjacent words in our dictionary, using this same process, we can discover the rest of the rules that govern the ordering of letters. We will store these rules in a graph, where each letter is a key, whose values (if they exist) are letters that must come before it.
Since we compare each letter at most twice, the time complexity of this part is O(n), where n is the number of characters in our input.
def create_graph(words):
    letters = set(''.join(words))
    graph = {letter: set() for letter in letters}

    for pair in zip(words, words[1:]):
        # Unpack each pair of adjacent words.
        for before, after in zip(*pair):
            if before != after:
                graph[after].add(before)
                break

    return graph
For the example above, our graph would be {'z': {'x'}, 'y': {'w'}, 'x': set(), 'w': {'z', 'x'}}.
We now want to find a way of combining these rules together. Luckily, there exists
a procedure for carrying out just such an operation, called topological sort. As
discussed in the chapter on graphs, the idea is that we maintain a to-do list of letters
to visit, and only add a letter to this list once all of its prerequisites have been added
to the result.
For example, in the dictionary above, we would only add w once we've visited y, so
there is no way that y could come before w in the result.
from collections import defaultdict, deque

def toposort(graph):
    # Start off our to-do list with all letters without prerequisites.
    todo = deque([letter for letter, prev in graph.items() if not prev])

    # Create a new data structure to map letters to successor letters.
    letter_to_next = defaultdict(list)
    for letter, prevs in graph.items():
        for prev in prevs:
            letter_to_next[prev].append(letter)

    result = []
    while todo:
        letter = todo.popleft()
        result.append(letter)
        # Remove this prereq from all successor letters.
        # If any letter now does not have any prereqs, add it to todo.
        for n in letter_to_next[letter]:
            graph[n].remove(letter)
            if not graph[n]:
                todo.append(n)

    # Circular dependency
    if len(result) < len(graph):
        return None

    return result
Topological sort is O(V + E), where V is the number of letters in the alphabet and E is the number of rules we found earlier. Since neither of these is greater than O(n) (where n is the number of characters in our input), the overall complexity is still O(n).
The glue holding this code together is given below.
def alien_letter_order(words):
    graph = create_graph(words)
    return toposort(graph)
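As a sanity check, here is a minimal end-to-end run. The two functions are repeated (with set insertion via add) so the snippet is self-contained, and the dictionary words are an invented example, not the one from the text; they imply the alphabet order b, a, c:

```python
from collections import defaultdict, deque

def create_graph(words):
    letters = set(''.join(words))
    graph = {letter: set() for letter in letters}
    for pair in zip(words, words[1:]):
        # The first differing letter between adjacent words gives one rule.
        for before, after in zip(*pair):
            if before != after:
                graph[after].add(before)
                break
    return graph

def toposort(graph):
    todo = deque([letter for letter, prev in graph.items() if not prev])
    letter_to_next = defaultdict(list)
    for letter, prevs in graph.items():
        for prev in prevs:
            letter_to_next[prev].append(letter)
    result = []
    while todo:
        letter = todo.popleft()
        result.append(letter)
        for n in letter_to_next[letter]:
            graph[n].remove(letter)
            if not graph[n]:
                todo.append(n)
    if len(result) < len(graph):
        return None  # circular dependency
    return result

# 'ba' < 'bc' implies a < c; 'bc' < 'ac' implies b < a.
print(toposort(create_graph(['ba', 'bc', 'ac', 'cab'])))  # ['b', 'a', 'c']
```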
20.6 Prime numbers
The Sieve of Eratosthenes is an algorithm used to generate all prime numbers smaller
than n. The method is to take increasingly larger prime numbers and mark their
multiples as composite.
For example, to find all primes less than 100, we would first mark [4, 6, 8, ...] (multiples of two), then [6, 9, 12, ...] (multiples of three), and so on. Once we have done
this for all primes less than n, the unmarked numbers that remain will be prime.
Implement this algorithm.
Bonus: Create a generator that produces primes indefinitely (that is, without taking
n as an input).
Solution
Despite being very old, the Sieve of Eratosthenes is a fairly efficient method of
finding primes. As described above, here is how it could be implemented:
def primes(n):
    is_prime = [False] * 2 + [True] * (n - 1)

    for x in range(2, n):
        if is_prime[x]:
            for i in range(2 * x, n, x):
                is_prime[i] = False

    for i in range(n):
        if is_prime[i]:
            yield i
There are a few ways we can improve this. First, note that for any prime number p,
the first useful multiple to check is actually p², not 2 × p. This is because all numbers
2p, 3p, ..., i × p where i < p will already have been marked when iterating over the
multiples of 2, 3, ..., i respectively.
As a consequence of this we can make another optimization: since we only care
about p² and above, there is no need for x to range all the way up to n: we can stop
at the square root of n instead.
Taken together, these improvements would look like this:
def primes(n):
    is_prime = [False] * 2 + [True] * (n - 1)

    for x in range(2, int(n ** 0.5) + 1):
        if is_prime[x]:
            for i in range(x ** 2, n, x):
                is_prime[i] = False

    for i in range(n):
        if is_prime[i]:
            yield i
To generate primes without limit we need to rethink our data structure, as we can no
longer store a boolean list to represent each number. Instead, we must keep track of
the lowest unmarked multiple of each prime, so that when evaluating a new number
we can check if it is such a multiple, and mark it as composite. Since we want to
keep track of the lowest multiples at any given time, this is an excellent candidate for
a heap based solution.
To implement this, we first start a counter at 2 and incrementally move up through
the integers. The first time we come across a prime number p, we add it to a min-heap
with priority p² (using the optimization noted above), and yield it. Whenever we
come across an integer with this priority, we pop the corresponding key and reinsert
it with a new priority equal to the next multiple of p.
For example, for integers between 2 and 10, we would perform the following actions:
Integer   Actions
2         push (4, 2), yield 2
3         push (9, 3), yield 3
4         pop (4, 2), push (6, 2)
5         push (25, 5), yield 5
6         pop (6, 2), push (8, 2)
7         push (49, 7), yield 7
8         pop (8, 2), push (10, 2)
9         pop (9, 3), push (12, 3)
An important thing to note is that at any given time the next composite number
will be first in the heap, so it suffices to check and update only the highest-priority
element.
import heapq

def primes():
    composite = []
    i = 2
    while True:
        # Note that composite[0][0] is the min element of the heap.
        if composite and i == composite[0][0]:
            while composite[0][0] == i:
                multiple, p = heapq.heappop(composite)
                heapq.heappush(composite, [multiple + p, p])
        else:
            heapq.heappush(composite, [i * i, i])
            yield i
        i += 1
The time complexity is the same as above, as we are implementing the same algorithm.
However, at the point when our algorithm considers an integer n, we will already
have popped all the composite numbers less than n from the heap, leaving only
multiples of the prime ones. Since there are approximately n / log n primes up to n,
the space complexity has been reduced to O(n / log n).
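As a quick check of the generator, we can pull a fixed number of primes with itertools.islice, since the generator itself never terminates. The function is repeated here so the snippet runs on its own:

```python
import heapq
from itertools import islice

def primes():
    composite = []  # min-heap of [next multiple, prime]
    i = 2
    while True:
        if composite and i == composite[0][0]:
            # i is composite: advance every prime whose current multiple is i.
            while composite[0][0] == i:
                multiple, p = heapq.heappop(composite)
                heapq.heappush(composite, [multiple + p, p])
        else:
            # i is prime: schedule its first useful multiple, i * i.
            heapq.heappush(composite, [i * i, i])
            yield i
        i += 1

print(list(islice(primes(), 10)))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```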
20.7 Crossword puzzles
A typical American-style crossword puzzle grid is an n x n matrix with black and
white squares, which obeys the following rules:
• Every white square must be part of an "across" word and a "down" word.
• No word can be fewer than three letters long.
• Every white square must be reachable from every other white square.
• The grid is rotationally symmetric (for example, the colors of the top left and
bottom right squares must match).
Write a program to determine whether a given matrix qualifies as a crossword grid.
Solution
When dealing with a problem with multiple parts, it is a good idea to break apart the
task by creating separate functions for every requirement. In this case, we'll define a
new function to satisfy each bullet point above.
First, let us determine whether the across and down words all have at least three
letters. Note that this will also ensure that each white square is part of two words,
since otherwise there would have to be a one-letter word.
For each row in the grid, we will iterate over all words and increment a counter for
consecutive white squares. (We assume here that white squares are given as zeroes
and black squares are given as ones.) If at any point we encounter a word of length
one or two, we return False.
def has_valid_word_length(grid):
    for row in grid:
        word_length = 0
        for square in row:
            if square == 0:
                word_length += 1
            else:
                if 0 < word_length < 3:
                    return False
                word_length = 0
        if 0 < word_length < 3:
            return False
    return True
Note this will work for both across and down words, since we can transpose the
matrix and reapply on the new grid.
This function visits every square once, taking O(n²) time and O(1) space for an
n × n matrix.
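To see the length check in action, here it is applied to two small hypothetical rows (the function is repeated so the snippet runs standalone):

```python
def has_valid_word_length(grid):
    for row in grid:
        word_length = 0
        for square in row:
            if square == 0:
                word_length += 1
            else:
                if 0 < word_length < 3:
                    return False
                word_length = 0
        if 0 < word_length < 3:
            return False
    return True

# A two-letter word fails; a three-letter word passes.
print(has_valid_word_length([[0, 0, 1]]))     # False
print(has_valid_word_length([[0, 0, 0, 1]]))  # True
```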
To check rotational symmetry, we need to ensure that the grid looks the same after
rotating 180 degrees. While this can be achieved by iterating over the grid square
by square, an alternative method is to use a combination of transpositions and row
reversals.
The following steps will allow us to find the rotated grid:
1. Transpose the matrix
2. Reverse the matrix
3. Transpose the matrix again
4. Reverse the matrix again
Here is how these operations would look on a sample input matrix:
0 1 1      0 0 0      1 1 1      1 1 0      1 0 0
0 0 1  →   1 0 0  →   1 0 0  →   1 0 0  →   1 0 0
0 0 1      1 1 1      0 0 0      1 0 0      1 1 0
Note that in Python, we can transpose a matrix with the zip operation, by performing list(zip(*grid)), and reverse the rows of a matrix using slice notation with
grid[::-1]. The zip operation will change each row into a tuple, so we must map
these back to lists at the end.
We can therefore write this operation as follows:
def is_rotationally_symmetric(grid):
    transpose = list(zip(*grid))
    reverse = transpose[::-1]
    transpose = list(zip(*reverse))
    reverse = transpose[::-1]
    return grid == list(map(list, reverse))
This operation takes O(n²) time and space, since we must iterate over each square
and create a new grid.
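We can verify the transpose-and-reverse approach on two small grids, one symmetric under a 180-degree rotation and one not (the function is repeated here so the snippet stands alone; note that the second transpose must act on the reversed matrix):

```python
def is_rotationally_symmetric(grid):
    transpose = list(zip(*grid))
    reverse = transpose[::-1]
    transpose = list(zip(*reverse))
    reverse = transpose[::-1]
    return grid == list(map(list, reverse))

# This grid is unchanged by a 180-degree rotation.
print(is_rotationally_symmetric([[1, 0, 1],
                                 [0, 0, 0],
                                 [1, 0, 1]]))  # True

# Changing one corner breaks the symmetry.
print(is_rotationally_symmetric([[0, 0, 1],
                                 [0, 0, 0],
                                 [1, 0, 1]]))  # False
```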
Finally, we must check the connectedness of our matrix. Recall from our chapter
on graphs that breadth-first search is an efficient way of traversing all vertices of a
graph. If we think of each grid square as a vertex, then, we can use BFS to check if it
is possible to start from a random white square and reach all other white squares.
from collections import deque

def is_connected(grid):
    # Check how many white squares there are in the grid.
    count = sum(1 - square for row in grid for square in row)

    # Find the first one to begin our search from.
    start = None
    for i, row in enumerate(grid):
        for j, square in enumerate(row):
            if square == 0:
                start = (i, j)
                break
        if start:
            break

    if not start:
        return False

    # Perform BFS, adding each unvisited adjacent white square to a queue.
    queue = deque([start])
    visited = set()
    connected_count = 0
    while queue:
        square = queue.popleft()
        if square not in visited:
            visited.add(square)
            connected_count += 1
            i, j = square
            for adj in [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]:
                row, col = adj
                if (0 <= row < len(grid) and 0 <= col < len(grid) and
                        grid[row][col] == 0):
                    queue.append(adj)

    # Check whether the visited count matches the overall count.
    return count == connected_count
This function too will have O(n²) time and space complexity, as we may iterate over
the entire grid and store many of the squares in our queue.
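Below is a self-contained run of the connectivity check on two tiny grids. In the first, all white squares touch; in the second, the two white squares meet only at a corner, which does not count as adjacency:

```python
from collections import deque

def is_connected(grid):
    # Count the white squares, then BFS from the first one found.
    count = sum(1 - square for row in grid for square in row)
    start = None
    for i, row in enumerate(grid):
        for j, square in enumerate(row):
            if square == 0:
                start = (i, j)
                break
        if start:
            break
    if not start:
        return False
    queue = deque([start])
    visited = set()
    connected_count = 0
    while queue:
        square = queue.popleft()
        if square not in visited:
            visited.add(square)
            connected_count += 1
            i, j = square
            for adj in [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]:
                row, col = adj
                if (0 <= row < len(grid) and 0 <= col < len(grid) and
                        grid[row][col] == 0):
                    queue.append(adj)
    return count == connected_count

print(is_connected([[0, 0], [1, 0]]))  # True: (0,0)-(0,1)-(1,1)
print(is_connected([[0, 1], [1, 0]]))  # False: diagonals are not adjacent
```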
Putting it all together, a valid grid must satisfy all four methods we have just defined.
def is_valid(grid):
    return has_valid_word_length(grid) and \
        has_valid_word_length(zip(*grid)) and \
        is_rotationally_symmetric(grid) and \
        is_connected(grid)
The overall time and space complexity of our solution will be O(n²), since this is the
upper bound for each of our component functions.
20.8 UTF-8 encodings
UTF-8 is a character encoding that maps each symbol to one, two, three, or four
bytes.
For example, the Euro sign, €, corresponds to the three bytes 11100010 10000010
10101100. The rules for mapping characters are as follows:
• For a single-byte character, the first bit must be zero.
• For an n-byte character, the first byte starts with n ones and a zero. The other
n - 1 bytes all start with 10.
Visually, this can be represented as follows.
Bytes   Byte format
1       0xxxxxxx
2       110xxxxx 10xxxxxx
3       1110xxxx 10xxxxxx 10xxxxxx
4       11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
Write a program that takes in an array of integers representing byte values, and
returns whether it is a valid UTF-8 encoding.
Solution
Note that for an encoding to be valid, the number of prefix ones in the first byte
must match the number of remaining bytes, and each of those remaining bytes must
begin with 10.
Therefore, we can divide our algorithm into two parts. To start, we check the first
element of our input to determine how many remaining bytes there should be, and
initialize a counter with that value. Next, we loop through each additional byte.
If the byte starts with 10, we decrement our counter; if not, we can return False
immediately.
Since we are dealing with bytes, we may find bit manipulation operations useful. In
particular, we can perform bit shifts to check the number of starting ones in each
byte.
If at the end of our loop, the counter equals zero, we will know our encoding is valid.
def valid(data):
    first = data[0]

    if first >> 7 == 0:
        count = 0
    elif first >> 5 == 0b110:
        count = 1
    elif first >> 4 == 0b1110:
        count = 2
    elif first >> 3 == 0b11110:
        count = 3
    else:
        return False

    for byte in data[1:]:
        if byte >> 6 == 0b10:
            count -= 1
        else:
            return False

    return count == 0
This algorithm is O(n) in the number of bytes, since we are only performing bit shifts and equality checks on each one.
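Running the function on the Euro sign's three bytes from the problem statement confirms the logic (the function is repeated here so the snippet stands alone):

```python
def valid(data):
    first = data[0]
    # Count the leading ones of the first byte via right shifts.
    if first >> 7 == 0:
        count = 0
    elif first >> 5 == 0b110:
        count = 1
    elif first >> 4 == 0b1110:
        count = 2
    elif first >> 3 == 0b11110:
        count = 3
    else:
        return False
    # Every remaining byte must be a continuation byte, 10xxxxxx.
    for byte in data[1:]:
        if byte >> 6 == 0b10:
            count -= 1
        else:
            return False
    return count == 0

# The Euro sign: a valid three-byte character.
print(valid([0b11100010, 0b10000010, 0b10101100]))  # True

# A leading byte promising one continuation byte, but none follows.
print(valid([0b11000010]))  # False
```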
20.9 Blackjack
Blackjack is a two player card game whose rules are as follows:
• The player and then the dealer are each given two cards.
• The player can then "hit", or ask for arbitrarily many additional cards, so long
as his or her total does not exceed 21.
• The dealer must then hit if his or her total is 16 or lower, otherwise pass.
• Finally, the two compare totals, and the one with the greatest sum not exceeding
21 is the winner.
For this problem, we simplify the card values as follows: each card between 2
and 10 counts as its face value, face cards count as 10, and aces count as 1.
Given perfect knowledge of the sequence of cards in the deck, implement a blackjack
solver that maximizes the player's score (that is, wins minus losses).
Solution
Dynamic programming provides an elegant solution to this problem, but it is not
trivial to implement.
We can think about the problem like this: suppose we start with a fresh deck of
cards, and we deal out two cards each to the player and the dealer. At the end of
the hand, the player may have hit twice, and the dealer may have hit once, so that
in total 7 cards have been dealt. If we already know the best score the player can
obtain starting with the 8th card of the deck, then the overall score for this play can
be expressed as value(play) + best_scores[8].
In fact, this hand may have been played differently: neither player may have hit,
or both may have hit five times, so that the number of cards dealt may have been
anywhere between 4 and 52. Each of these outcomes can be considered subproblems
of our original problem that begin with an alternate starting index. If all these
subproblems have been solved (that is, best_scores[n] is known for all n > start),
we can find the best score of any play beginning at start with the following logic:
scores = []
for play in plays:
    scores.append(value(play) + best_scores[start + cards_used(play)])
best_score = max(scores)
So our dynamic programming approach will be as follows. For each starting index
of the deck, beginning with the fourth-to-last card and going back to the first card:
• Simulate all the possible ways that a round of blackjack starting with that card
can be played.
• For each play, compute its value (1 for wins, 0 for ties, and -1 for losses), and
track the number of cards used.
• Set the value of best_scores[start] to be the result of the logic above.
Once we have our algorithm in place, we just need to add boilerplate code to represent
the deck and the players, and to flesh out the logic for how the game is played.
In particular, Deck will be a class that starts with 52 randomly shuffled cards and
can deal out n cards beginning at a specified start index. The Player class will be
initialized with two starting cards, and will hold cards in a hand that can be appended
to and summed up.
The full implementation can be found below.
import random

class Deck:
    def __init__(self, seed=None):
        self.cards = [i for i in range(1, 10)] * 4 + [10] * 16
        random.seed(seed)
        random.shuffle(self.cards)

    def deal(self, start, n):
        return self.cards[start:start + n]

class Player:
    def __init__(self, hand):
        self.hand = hand
        self.total = sum(hand)

    def deal(self, cards):
        self.hand.extend(cards)
        self.total = sum(self.hand)

def cmp(x, y):
    return (x > y) - (x < y)

def play(deck, start, scores):
    results = []

    # The player can hit as many times as there are cards left.
    for i in range(49 - start):
        player = Player(deck.deal(start, 2))
        dealer = Player(deck.deal(start + 2, 2))
        count = start + 4

        player.deal(deck.deal(count, i))
        count += i

        # Once a player busts, there is no point in exploring further
        # hits, so we skip to evaluation.
        if player.total > 21:
            results.append((-1, count))
            break

        # The dealer responds deterministically, only hitting if below 17.
        while dealer.total < 17 and count < 52:
            dealer.deal(deck.deal(count, 1))
            count += 1

        # If the dealer busts, the player wins.
        # Otherwise, compare their totals.
        if dealer.total > 21:
            results.append((1, count))
        else:
            results.append((cmp(player.total, dealer.total), count))

    options = []
    for score, next_start in results:
        options.append(
            score + scores[next_start] if next_start <= 48 else score)
    scores[start] = max(options)

def blackjack(seed=None):
    deck = Deck(seed)
    scores = [0 for _ in range(52)]
    for start in range(48, -1, -1):
        play(deck, start, scores)
    return scores[0]
We have n overlapping subproblems corresponding to each starting index, and for
each of these the player can hit O(n) times, in response to which the dealer can hit
O(n) times. This leads to a complexity of O(n³).
In reality, neither the player nor the dealer will actually be able to hit n times: after
holding a hand of all four aces, twos, threes, and fours, the next card would put them
over 21. Further, the distribution of low cards in the deck would make it impossible
for a player to hit 10 times near both the start and the end of the deck. But for the
purposes of an interview question, this is more than sufficient.
Part IV
Design
Data Structure Design
One final question type you should be prepared for is design. You can know all the
data structures and algorithms in the world, but if you cannot think through how to
put together a system that solves user needs or makes a product work properly, this
knowledge cannot be put to good use.
On the plus side, design questions usually do not have one "right" answer: the key
is being able to identify obstacles, develop a plan to solve them, and examine the
tradeoffs between different approaches. In the problems that follow we will address
different possibilities in order to illustrate how to work through each situation.
In this chapter we introduce several problems where the goal is to design a data
structure that satisfies specific criteria. As in a typical interview setting, these are
given in terms of class methods your data structure is required to support, often with
time and space complexity guarantees.
In the following chapter we move on to system design questions, in which the goal
is to describe at a high level the design of an application or process.
After reading through each problem, we recommend that you consider various
changes that can be made to the input requirements, or related questions that could
be asked, and think about how your implementation would change as a result.
21.1 Dictionary with time key
Write a map implementation with a get function that lets you retrieve the value of a
key at a particular time.
It should contain the following methods:
• set(key, value, time): sets key to value for t = time.
• get(key, time): gets the value of key at t = time.
If we set a key at a particular time, our map should maintain that value forever, or
until it gets reset at a later time. As a result, when we get a key at a particular time,
it should return the value that was set for that key most recently.
Consider the following examples:
d.set(1, 1, 0)  # set key 1 to value 1 at time 0
d.set(1, 2, 2)  # set key 1 to value 2 at time 2
d.get(1, 1)  # get key 1 at time 1 should be 1
d.get(1, 3)  # get key 1 at time 3 should be 2

d.set(1, 1, 5)  # set key 1 to value 1 at time 5
d.get(1, 0)  # get key 1 at time 0 should be null
d.get(1, 10)  # get key 1 at time 10 should be 1

d.set(1, 1, 0)  # set key 1 to value 1 at time 0
d.set(1, 2, 0)  # set key 1 to value 2 at time 0
d.get(1, 0)  # get key 1 at time 0 should be 2
Solution
One possible way to solve this question is using a map of maps, where each key has
its own hash table of time-value pairs. That would resemble the following:
{
    key: {
        time: value,
        time: value,
    },
    key: {
        time: value,
        time: value,
    },
}
If a particular time does not exist in the time-value map, we must be able to get the
value of the nearest previous time (or return null if it doesn't have one). A sorted
map would be nice in this situation, but unfortunately Python's standard library
doesn't have one. Instead, for each key's inner hash table, we can use binary search to
maintain a sorted list of time keys.
import bisect

class TimeMap:
    def __init__(self):
        self.map = dict()
        self.sorted_keys_cache = None

    def get(self, key):
        value = self.map.get(key)
        if value is not None:
            return value

        if self.sorted_keys_cache is None:
            self.sorted_keys_cache = sorted(self.map.keys())

        # Find the nearest previous time key that has been set.
        # If it exists, return the corresponding value.
        index = bisect.bisect_left(self.sorted_keys_cache, key)
        if index == 0:
            return None
        else:
            return self.map.get(self.sorted_keys_cache[index - 1])

    def set(self, key, value):
        self.sorted_keys_cache = None
        self.map[key] = value
The downside of this solution is that each time we perform a write operation with
set, we wipe the key's cache, causing a full sort of the keys on the next get call. For
write-heavy applications, this would mean most of our calls would take O(n log n)
time.
For mixed workloads, a more suitable approach is to use arrays under the hood.
More specifically, we can store each key and value as corresponding elements in two
separate arrays. For example, the value for keys[3] can be found at values[3].
This way, inserting a new key will amount to performing binary search to find the
correct index, and then performing two insertions. Meanwhile, each get operation
will require a binary search to find the appropriate value index.
import bisect

class TimeMap:
    def __init__(self):
        self.keys = []
        self.values = []

    def get(self, key):
        if not self.keys:
            return None

        i = bisect.bisect_left(self.keys, key)

        # If this exact time is a key, return the corresponding value.
        if len(self.keys) > i and self.keys[i] == key:
            return self.values[i]
        # If this time is less than any previous time, there is no value.
        elif i == 0:
            return None
        # Otherwise, return the value associated with the latest time.
        else:
            return self.values[i - 1]

    def set(self, key, value):
        i = bisect.bisect_left(self.keys, key)

        # If this time exceeds any previous time, add a new key-value pair.
        if len(self.keys) == i:
            self.keys.append(key)
            self.values.append(value)
        # If this time already exists, overwrite the corresponding value.
        elif self.keys[i] == key:
            self.values[i] = value
        # Otherwise, insert a key-value pair at the appropriate indices.
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)
In this way, both get and set behave more predictably from a performance standpoint.
The time complexity of get will be logarithmic due to binary search, and set will be
O(n) in the worst case due to two array insertions.
The last missing part to solve this question is the top level map. Each time we
encounter a new key, we initialize a TimeMap to store its time-value pairs, and use
the operations defined above to store and update that data.
from collections import defaultdict

class MultiTimeMap:
    def __init__(self):
        self.map = defaultdict(TimeMap)

    def set(self, key, value, time):
        self.map[key].set(time, value)

    def get(self, key, time):
        time_map = self.map.get(key)
        if time_map is None:
            return None
        else:
            return time_map.get(time)
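The pieces above can also be collapsed into one condensed, self-contained sketch, shown here against the example calls from the problem statement. This is an alternative packaging (a single class holding per-key parallel lists, using bisect_right for lookups), not the exact classes above:

```python
import bisect
from collections import defaultdict

class MultiTimeMap:
    def __init__(self):
        self.times = defaultdict(list)
        self.values = defaultdict(list)

    def set(self, key, value, time):
        times, values = self.times[key], self.values[key]
        i = bisect.bisect_left(times, time)
        if i < len(times) and times[i] == time:
            values[i] = value  # overwrite an existing time
        else:
            times.insert(i, time)
            values.insert(i, value)

    def get(self, key, time):
        # Index just past the latest set() at or before this time, if any.
        i = bisect.bisect_right(self.times[key], time)
        return self.values[key][i - 1] if i > 0 else None

d = MultiTimeMap()
d.set(1, 1, 0)
d.set(1, 2, 2)
print(d.get(1, 1))   # 1
print(d.get(1, 3))   # 2

d = MultiTimeMap()
d.set(1, 1, 5)
print(d.get(1, 0))   # None
print(d.get(1, 10))  # 1

d = MultiTimeMap()
d.set(1, 1, 0)
d.set(1, 2, 0)      # overwrites the value at time 0
print(d.get(1, 0))  # 2
```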
21.2 Queue with fixed-length array
Implement a queue using a set of fixed-length arrays.
The queue should support enqueue, dequeue, and get_size operations.
Solution
It may be difficult to know where to begin with a problem like this. Do we try
implementing one of the operations? Draw some diagrams to think through how
we might add and remove elements?
These are both good ideas, but often a good first step when building new data
structures is to think of a simpler structure we can build upon. In this case, let's
imagine we only need to use a single fixed-size array, to implement a queue with a
maximum length.
In this data structure, we should be able to enqueue up to n elements, at which point
our queue will become full. We must then dequeue elements from the front in order
to make space. A neat trick when implementing this will be to use a circular array to
store our elements. Each time we enqueue an element, we will shift a tail pointer
right by one. Meanwhile, each time we dequeue an element, we will shift a head
pointer right by one.
Crucially, we will allow these shifts to circle round to the front of the array. To
illustrate this process, let's look at the table below, which describes a series of operations
on a queue of fixed length three.
Last operation   Array          Head index   Tail index   Size
enqueue(2)       [1, 2, None]   0            2            2
enqueue(3)       [1, 2, 3]      0            0            3
dequeue()        [1, 2, 3]      1            0            2
dequeue()        [1, 2, 3]      2            0            1
enqueue(4)       [4, 2, 3]      2            1            2
First, we enqueue 2 to an array that already contains one enqueued item. At this
point our head index remains unchanged, and our tail index has shifted to point to
the last element. When we enqueue another element, our tail has no room left, so it
circles around to the front of the array.
Each of the two times we dequeue, we return the element located at the head index,
and shift that index right by one. Finally, when we enqueue one last time, we can
overwrite the value at our tail location, since we know it has already been dequeued.
Here is how this would look in code:
class Queue:
    def __init__(self, n):
        self.array_size = n
        self.array = [None] * n
        self.head = 0
        self.tail = 0
        self.size = 0

    def enqueue(self, x):
        if self.size == self.array_size:
            print('Queue full, cannot enqueue.')
            return
        self.array[self.tail] = x
        self.tail = (self.tail + 1) % self.array_size
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            print('Cannot dequeue from empty queue.')
            return
        result = self.array[self.head]
        self.head = (self.head + 1) % self.array_size
        self.size -= 1
        return result

    def get_size(self):
        return self.size
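The wrap-around behavior can be replayed directly. The class is repeated in condensed form (without the capacity warnings) so that this snippet runs on its own:

```python
class Queue:
    def __init__(self, n):
        self.array_size = n
        self.array = [None] * n
        self.head = self.tail = self.size = 0

    def enqueue(self, x):
        if self.size == self.array_size:
            return  # full; drop silently in this sketch
        self.array[self.tail] = x
        self.tail = (self.tail + 1) % self.array_size
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            return None
        result = self.array[self.head]
        self.head = (self.head + 1) % self.array_size
        self.size -= 1
        return result

q = Queue(3)
for x in (1, 2, 3):
    q.enqueue(x)
print(q.dequeue(), q.dequeue())  # 1 2
q.enqueue(4)                     # tail wraps around to index 0
print(q.array)                   # [4, 2, 3] -- slot 0 reused; slot 1 is stale
print(q.dequeue(), q.dequeue())  # 3 4
```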
This is fairly close to what we are looking for, except now we must support an arbitrary
number of fixed-length arrays. Instead of limiting our queue size, we should start
filling up a new array when the current one is full. As for dequeuing, we can still shift
our head pointer right each time. However, once we get to the end of our array, we
can simply remove that array from consideration, since it will never be useful again.
To help us perform these operations we will track two new variables, the head_array
and the tail_array. The head array will refer to the first list of elements, and will
always be what we dequeue from. The tail array will consist of all the arrays, including
the head, and will always be what we enqueue onto.
To understand how these arrays work, we can take a look at the following diagram,
similar to the one given previously.
Last operation   Head array        Tail array
enqueue(2)       [1, 2, None]      [[1, 2, None]]
enqueue(3)       [1, 2, 3]         [[1, 2, 3]]
enqueue(4)       [1, 2, 3]         [[1, 2, 3], [4, None, None]]
dequeue()        [1, 2, 3]         [[1, 2, 3], [4, None, None]]
dequeue()        [1, 2, 3]         [[1, 2, 3], [4, None, None]]
dequeue()        [4, None, None]   [[4, None, None]]
enqueue(5)       [4, 5, None]      [[4, 5, None]]
Here, we first enqueue two items to the queue, populating both the head and tail
arrays. When we enqueue our next item, there is no room in our existing array for it,
so we declare a new list and place this as its first element.
We then dequeue from the head array until there is nothing left to take. At this
point, the first array is useless, so we remove it and reassign the head array to the first
list of the tail.
As before, each time we add or remove an element, we update a size parameter, so
that getting the length of our queue is a painless process.
class Queue:
    def __init__(self, n):
        self.array_size = n
        self.head_array = [None] * n
        self.tail_array = [self.head_array]
        self.curr_array = 0
        self.head = 0
        self.tail = 0
        self.size = 0

    def enqueue(self, x):
        self.tail_array[self.curr_array][self.tail] = x
        if self.tail == self.array_size - 1:
            self.tail_array.append([None] * self.array_size)
            self.curr_array += 1
        self.tail = (self.tail + 1) % self.array_size
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            print('Cannot dequeue from empty queue.')
            return
        result = self.head_array[self.head]
        if self.head == self.array_size - 1:
            self.head_array = self.tail_array[1]
            self.tail_array = self.tail_array[1:]
            self.curr_array -= 1
        self.head = (self.head + 1) % self.array_size
        self.size -= 1
        return result

    def get_size(self):
        return self.size
Let the fixed length of each array be n. Then most of our enqueue operations will
take O(1) time, but every once in a while we will need to append a new list to our
tail array, which takes O(n). Since performing an O(n) process every n steps will
lead to a constant time method on average, our amortized time complexity is still
constant.

Similarly, dequeuing takes O(1) most of the time, except for the cases where we must
reassign the head and tail arrays. Since this only happens once every n operations,
this will run in constant time on average.

The space required for this data structure depends on the number of elements we
plan to store. In particular, our space complexity will be O(k), where k is the greatest
size of our queue at any given time.
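Replaying the second table's operations shows the head array being swapped out once it is exhausted. The class is repeated here in condensed form (without the empty-queue message) so the snippet runs standalone:

```python
class Queue:
    def __init__(self, n):
        self.array_size = n
        self.head_array = [None] * n
        self.tail_array = [self.head_array]
        self.curr_array = 0
        self.head = self.tail = self.size = 0

    def enqueue(self, x):
        self.tail_array[self.curr_array][self.tail] = x
        if self.tail == self.array_size - 1:
            # Current array is full: start a fresh one for future enqueues.
            self.tail_array.append([None] * self.array_size)
            self.curr_array += 1
        self.tail = (self.tail + 1) % self.array_size
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            return None
        result = self.head_array[self.head]
        if self.head == self.array_size - 1:
            # Head array is exhausted: discard it and move to the next one.
            self.head_array = self.tail_array[1]
            self.tail_array = self.tail_array[1:]
            self.curr_array -= 1
        self.head = (self.head + 1) % self.array_size
        self.size -= 1
        return result

q = Queue(3)
for x in (1, 2, 3, 4):
    q.enqueue(x)
print(q.tail_array)                     # [[1, 2, 3], [4, None, None]]
print([q.dequeue() for _ in range(4)])  # [1, 2, 3, 4]
```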
21.3 Quack
A quack is a data structure combining properties of both stacks and queues. It can
be viewed as a list of elements written left to right such that three operations are
possible:
• push(x): add a new item x to the left end of the list.
• pop(): remove and return the item on the left end of the list.
• pull(): remove the item on the right end of the list.
Implement a quack using three stacks and O(1) additional memory, so that the
amortized time for any push, pop, or pull operation is O(1).
Solution
Recall that a stack is a last-in first-out (LIFO) container of elements. Therefore,
we can support push(x) and pop() simply by using the same methods provided
by a normal stack. In order to support the pull() operation, we need to access
the least-recently added item, which requires a data structure that supports first-in
first-out (FIFO) access.
We can simulate a deque, or double-ended queue, by using two stacks. One stack
represents the left, or front (for push/pop), and the other represents the right, or
back (for pull). If we assume both the front and back stacks contain elements in the
correct ordering, supporting all three operations is straightforward. We push to the
top of the front stack, and pop from the same stack. When we call pull, we simply
pop from the back stack, assuming we've already reversed the elements correctly.

For example, imagine the integers 1 through 6 being pushed in order. Some configurations might look like this:
Front (Left)         Back (Right)
[4, 5, 6]            [3, 2, 1]
[]                   [6, 5, 4, 3, 2, 1]
[1, 2, 3, 4, 5, 6]   []
When we run out of elements in either stack, operations get tricky. How do we
make sure that both stacks are ordered correctly? Recall that we can reverse a stack
by using an auxiliary stack. We can use the third stack as a buffer to move and
reverse elements correctly. When we need more elements in the right stack, we'll go
ahead and move half of the items over from the left stack. We'll pop half of the left
stack into the buffer stack. Then, we pop the remainder into the right stack. Finally,
we pop the items from the buffer stack back into the left stack.
Front (Left)         Back (Right)   Buffer
[1, 2, 3, 4, 5, 6]   []             []
[1, 2, 3]            []             [6, 5, 4]
[]                   [3, 2, 1]      [6, 5, 4]
[4, 5, 6]            [3, 2, 1]      []
When we run out of elements on the left stack, we can perform the same operations
in reverse. This re-balancing operation takes time proportional to O(n). However,
since we have guaranteed n/2 elements on both stacks, there must be no fewer than
n/2 push or pop operations between each re-balance. Therefore, we can say that the
amortized time for pop and pull are each O(1). The running time for each push
operation is O(1).
class Quack:
    def __init__(self):
        self.right = []
        self.left = []
        self.buffer = []

    def push(self, x):
        self.left.append(x)

    def pop(self):
        if not self.left and not self.right:
            raise IndexError('pop from empty quack')
        if not self.left:
            # Re-balance stacks
            size = len(self.right)
            # Move half of right stack to buffer
            for _ in range(size // 2):
                self.buffer.append(self.right.pop())
            # Move remainder of right to left
            while self.right:
                self.left.append(self.right.pop())
            # Move buffer elements back to right
            while self.buffer:
                self.right.append(self.buffer.pop())
        return self.left.pop()

    def pull(self):
        if not self.left and not self.right:
            raise IndexError('pull from empty quack')
        if not self.right:
            # Re-balance stacks
            size = len(self.left)
            # Move half of left stack to buffer
            for _ in range(size // 2):
                self.buffer.append(self.left.pop())
            # Move remainder of left to right
            while self.left:
                self.right.append(self.left.pop())
            # Move buffer elements back to left
            while self.buffer:
                self.left.append(self.buffer.pop())
        return self.right.pop()
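A short hypothetical session exercises the pull-side re-balancing path. The class is repeated in abbreviated form (comments trimmed) so the snippet runs on its own:

```python
class Quack:
    def __init__(self):
        self.right = []
        self.left = []
        self.buffer = []

    def push(self, x):
        self.left.append(x)

    def pop(self):
        if not self.left and not self.right:
            raise IndexError('pop from empty quack')
        if not self.left:
            # Re-balance: newest half of right goes to left, via the buffer.
            for _ in range(len(self.right) // 2):
                self.buffer.append(self.right.pop())
            while self.right:
                self.left.append(self.right.pop())
            while self.buffer:
                self.right.append(self.buffer.pop())
        return self.left.pop()

    def pull(self):
        if not self.left and not self.right:
            raise IndexError('pull from empty quack')
        if not self.right:
            # Re-balance: oldest half of left goes to right, via the buffer.
            for _ in range(len(self.left) // 2):
                self.buffer.append(self.left.pop())
            while self.left:
                self.right.append(self.left.pop())
            while self.buffer:
                self.left.append(self.buffer.pop())
        return self.right.pop()

q = Quack()
for x in (1, 2, 3, 4):
    q.push(x)
print(q.pop())   # 4 -- newest element
print(q.pull())  # 1 -- oldest element, after re-balancing
print(q.pop())   # 3
print(q.pull())  # 2
```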
System Design
System design is a broad term that refers to problems where you are asked to come up
with a high-level framework to achieve a particular goal, subject to constraints and
tradeoffs. This sounds quite abstract, so let's break down each part of this definition.
• A high-level framework
You will not be expected to have in-depth knowledge of every piece of the application
you are describing. In fact, if you have a solid understanding of just one piece of the
puzzle, that is often enough.
As for the rest, what your interviewer is looking for is typically a description of the
end-to-end process and how it achieves the goal at hand. For an application, for
example, it would be sensible to outline parts such as the backend logic, data storage,
user interface, and network architecture, as well as the APIs that connect them.
• To achieve a particular goal
Sometimes the goal is given directly in the problem: "reduce load time", or "collect
user data". More commonly, you will be told the general purpose of the project, from
which you will want to ask clarifying questions to figure out how to proceed. There
is no end to questions you can ask here, such as:
• Why is this system being built?
• Who will the clients be?
• How frequently will this system be used?
• What kind of data will we need to store?
• What is the scale of the project?
• How efficient does the application need to be?
It is a good idea to keep clarifying until you have a clear idea of what the end result
should look like. At this point you should begin to brainstorm a couple of options
about how to proceed.
• Subject to constraints and tradeoffs
After carefully scoping the goal through the questions above, you should have a sense
of which objectives have higher priorities than others. For example, if the goal is to
design a database, you will want to know if writes are far more prevalent than reads,
or vice versa.
Each piece of information you receive from the interviewer helps you decide between
competing alternatives, and you should feel free to talk aloud while explaining your
design in terms of these tradeoffs.
With this definition in place, we are ready to tackle a few system design questions.
22.1 Crawl Wikipedia
Design a system to crawl and copy all of Wikipedia using a distributed network of
machines.
More specifically, suppose your server has access to a set of client machines. Your
client machines can execute code you have written to access Wikipedia pages, download and parse their data, and write the results to a database.
Some questions you may want to consider as part of your solution are:
• How will you reach as many pages as possible?
• How can you keep track of pages that have already been visited?
• How will you deal with your client machines being blacklisted?
• How can you update your database when Wikipedia pages are added or updated?
Solution
For any design problem, the first step should be to clarify the requirements. What
is the goal of this project? Do we have limited resources? How much error can we
tolerate?
To answer these questions, an interviewer will often recommend making reasonable
assumptions that help you solve the problem. This will be necessary in our case as
well.
We will assume, then, that the number of client machines is sufficiently large that we
do not need to worry about a few of them failing, and that each machine has access
to a separate database to which it can write webpage data. In addition, let us say that
the machines can communicate with the server but not with each other.
It is also necessary to know a little about the structure ofWikipedia. A quick Google
search will tell us that English-language Wikipedia has around 50 million pages,
of which around 6 million are articles. Each article has links to related articles
embedded within the text, which must be extracted as we parse the page.
• Outline
Now we are ready to dive into the general approach of our solution. Each client
machine, or bot, can be seeded with some initial URLs to crawl, designed to explore
different topics. The table of contents page (https://en.wikipedia.org/wiki/
Portal:Contents) has various categories that we can use for this purpose.
For each article, the bot will download the HTML page data, extract new links,
and write the text of the article to its database. The links found will provide us with
new pages to crawl, and ideally through this process we will be able to traverse all of
Wikipedia.
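The crawl loop itself is a breadth-first search over the link graph. Here is a sketch against a toy in-memory graph (the page names and links are invented stand-ins for downloading and parsing real pages):

```python
from collections import deque

# A toy in-memory "Wikipedia": page -> links found on that page.
PAGES = {
    "Portal:Contents": ["Science", "History"],
    "Science": ["Physics", "History"],
    "Physics": ["Science"],
    "History": [],
}

def crawl(seeds):
    """Breadth-first crawl starting from the seed URLs."""
    visited = set()
    frontier = deque(seeds)
    while frontier:
        page = frontier.popleft()
        if page in visited:
            continue
        visited.add(page)
        # A real bot would download and parse the page here, then write
        # the article text to its database; we just read the toy graph.
        for link in PAGES.get(page, []):
            if link not in visited:
                frontier.append(link)
    return visited
```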
There are several options for how each database might be structured, but the simplest
way would probably be a key-value store mapping each URL to a JSON blob. Databases
will be separate, and we will assume we can combine all the results together once our
scraping phase is complete.
Here is how this would look:
[Diagram: the server coordinates three client machines, each of which writes to its own database.]
• Deduplication
The question now arises of how to prevent our bots from crawling the same page
multiple times. For example, it has been calculated that around 97% of the time,
starting from a random page and repeatedly following the first link will eventually
lead to the Philosophy article.
One solution might be to store a local cache on each client of links already traversed,
and only follow links that have not been added to this cache. This essentially amounts
to executing n independent breadth-first searches, where n is the number of client
machines. However, this does nothing to solve the issue of two clients independently
scraping the same page.
Suppose that instead, the server maintains some data structure that keeps track of
unique URLs. Before visiting any page, a client will make a call to the server to
ensure the page has not been seen yet. After scraping this page, the client will send a
second request to the server marking the page as visited.
This is more effective, but at a significant cost of efficiency. A third and improved
approach is as follows. Each bot visits and parses its full list of URLs, and populates a
set of outward links discovered along the way. This entire list is then batched up and
sent to the server. Upon receiving such a batch, the server combines and deduplicates
these pages, and sends back the unvisited ones to be crawled.
Our slightly modified diagram is as follows. We only show two client machines for
the sake of clarity, but there may be arbitrarily many.
[Diagram: each client sends its found links to the server's deduper and receives deduped URLs back; each client writes page data to its own database.]
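A minimal sketch of the server-side deduper described above (the class and method names are ours):

```python
class DedupServer:
    """Tracks which URLs have been seen across all clients."""
    def __init__(self):
        self.seen = set()

    def submit_batch(self, urls):
        """Deduplicate a batch of found links and return only the
        URLs that still need to be crawled."""
        fresh = []
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                fresh.append(url)
        return fresh

server = DedupServer()
print(server.submit_batch(["a", "b", "a"]))  # ['a', 'b']
print(server.submit_batch(["b", "c"]))       # ['c']
```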
How exactly should our deduper work, though, given that storing every URL in
memory may be infeasible? One useful tool we can employ here is a Bloom filter.
As discussed in a previous chapter, a Bloom filter is a space-efficient data structure
that tests set membership. At the cost of some potential false positives, we can
guarantee a fixed, reasonable upper bound on the cost of checking whether a page
has already been visited.
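Here is a small Bloom filter sketch, using SHA-256 with different salts as the hash functions. The bit-array size and hash count below are arbitrary; a real deployment would size them from the expected number of URLs and the target false-positive rate.

```python
import hashlib

class BloomFilter:
    """Space-efficient approximate set: false positives are possible,
    but false negatives are not."""
    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several bit positions by salting the item before hashing.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))
```

Every URL that was added is guaranteed to report as seen; a URL that was never added reports as seen only in the (tunably rare) case that all of its bit positions collide with previously set bits.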
• Blacklisting
The advantage of having multiple bots perform our web crawling is that even if a
few IPs are blacklisted, the remaining ones can still continue scraping. Nevertheless
we would like to avoid this scenario if possible.
First, it helps to follow the rules. Many websites maintain a page called robots.txt,
which gives instructions to any bots crawling the site on what pages should be accessed
and by whom. We can see Wikipedia's version at https://meta.wikimedia.org/
robots.txt. The structure of this page is a series of assignments of the form:
User-agent: *
Disallow: /wiki/forbidden_page.html
User-agent: bad_actor
Disallow: /
The User-agent field defines which bots are being issued an instruction, and the
Disallow field specifies which paths are off-limits. In the first example above,
all bots are told not to visit a particular forbidden page. In the second, the bot
bad_actor is told not to crawl at all.
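Python's standard library can parse these rules for us. Here is a sketch using the excerpt above (in practice the file is fetched over HTTP; here we parse the lines directly):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /wiki/forbidden_page.html

User-agent: bad_actor
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# All bots may fetch ordinary pages, but not the forbidden one,
# and bad_actor may not crawl at all.
allowed = rp.can_fetch("my_bot", "https://en.wikipedia.org/wiki/Physics")
forbidden = rp.can_fetch("my_bot", "https://en.wikipedia.org/wiki/forbidden_page.html")
banned = rp.can_fetch("bad_actor", "https://en.wikipedia.org/wiki/Physics")
```

A well-behaved bot would call can_fetch before every request and skip disallowed pages.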
A common reason that bots are disallowed is due to scraping too fast. To prevent
this from happening, we can first test the number of requests per second Wikipedia
can handle, and throttle our calls to stay within this limit.
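A throttle can be as simple as sleeping between requests. Below is a deliberately minimal sketch; a production crawler would more likely use a token bucket shared across workers.

```python
import time

class Throttle:
    """Cap outgoing requests at a fixed rate by sleeping between calls."""
    def __init__(self, max_per_second):
        self.interval = 1.0 / max_per_second
        self.last_request = 0.0

    def wait(self):
        # Sleep just long enough to keep requests at least
        # `interval` seconds apart.
        delay = self.interval - (time.monotonic() - self.last_request)
        if delay > 0:
            time.sleep(delay)
        self.last_request = time.monotonic()
```

Each bot would call `throttle.wait()` before every page fetch.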
It is also a good idea to map client machines to separate kinds of domains. This
way, we will not have every bot bombarding the same servers with traffic. In our
case, if we wanted to crawl articles in all the different language domains (for example,
en.wikipedia.org, es.wikipedia.org, and so on), we could assign pages to each
client accordingly.
• Updates
Most articles on Wikipedia do not frequently have major updates. However, new
pages are indeed created all the time, and with each news story, scientific discovery,
or piece of celebrity gossip, some article will probably be updated.
One simple implementation would be to store each URL and the date it was scraped
in a database accessible by the server. Each day we can run a query to find articles
older than n days, instruct our client machines to re-crawl them, and update the
date column. The downside of this approach is that unless n can be dynamically
determined to suit different kinds of articles, we will end up scraping far more (or
less) than necessary.
A more sensible approach for keeping track of updates to a site is with an RSS feed,
which allows subscribers to receive a stream of changes. In the case of Wikipedia,
there are both New Page and Recent Changes RSS feeds. As a result, after our initial
scrape is finished, we can direct our server to listen to this feed and send instructions
to the client machines to re-crawl the updated pages.
As is often the case in design problems, we have only touched on a few areas in
detail, and there are many more questions we could answer. Feel free to explore this
in greater depth and implement your own version!
22.2 Design a hit counter
Design and implement a HitCounter class that keeps track of requests (or hits). It
should support the following operations:
• record(timestamp): records a hit that happened at timestamp
• total(): returns the total number of hits recorded
• range(lower, upper): returns the number of hits that occurred between
timestamps lower and upper (inclusive)
Follow-up: What if our system has limited memory?
Solution
Let's first assume the timestamps are given in Unix time; that is, integers that
represent the number of seconds elapsed since January 1, 1970.
We can naively create a HitCounter class by simply using an unsorted list to store
all the hits, and implement range by querying over each hit one by one:
class HitCounter:
    def __init__(self):
        self.hits = []

    def record(self, timestamp):
        self.hits.append(timestamp)

    def total(self):
        return len(self.hits)

    def range(self, lower, upper):
        count = 0
        for hit in self.hits:
            if lower <= hit <= upper:
                count += 1
        return count
Here, record and total take constant time, but range takes O(n) time. One
improvement we could make is to use a sorted list or binary search tree to
keep track of the hits. That way, range would now take O(log n) time, but so would
record. (Strictly speaking, bisect.insort on a Python list still takes O(n) time
to shift elements; a balanced BST would be needed for true O(log n) inserts.)
We'll use Python's bisect library to maintain sortedness:

import bisect

class HitCounter:
    def __init__(self):
        self.hits = []

    def record(self, timestamp):
        bisect.insort_left(self.hits, timestamp)

    def total(self):
        return len(self.hits)

    def range(self, lower, upper):
        left = bisect.bisect_left(self.hits, lower)
        right = bisect.bisect_right(self.hits, upper)
        return right - left
This will still take up a lot of space, though: one element for each timestamp.
To address the follow-up question, let's think about possible tradeoffs we can make.
One approach here would be to sacrifice accuracy for memory by grouping together
timestamps in a coarser granularity, such as minutes or even hours. That means we'll
lose some accuracy around the borders but we'd be using up to a constant factor less
space.
For our solution, we'll keep track of each group in a tuple where the first item is the
timestamp in minutes and the second is the number of hits occurring within that
minute.
import bisect
from math import floor

class HitCounter:
    def __init__(self):
        self.counter = 0
        self.hits = []  # (timestamp in minutes, # of hits)

    def record(self, timestamp):
        self.counter += 1
        minute = floor(timestamp / 60)
        i = bisect.bisect_left([hit[0] for hit in self.hits], minute)
        if i < len(self.hits) and self.hits[i][0] == minute:
            self.hits[i] = (minute, self.hits[i][1] + 1)
        else:
            self.hits.insert(i, (minute, 1))

    def total(self):
        return self.counter

    def range(self, lower, upper):
        lower_minute = floor(lower / 60)
        upper_minute = floor(upper / 60)
        lower_i = bisect.bisect_left(
            [hit[0] for hit in self.hits], lower_minute
        )
        upper_i = bisect.bisect_right(
            [hit[0] for hit in self.hits], upper_minute
        )
        return sum(self.hits[i][1] for i in range(lower_i, upper_i))
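To make the accuracy tradeoff concrete, here is a compact sketch (with made-up timestamps) that buckets hits by minute and answers a range query at minute granularity:

```python
from collections import Counter

hits = [3, 59, 61, 120, 125, 179]  # hypothetical timestamps, in seconds

# One counter per minute instead of one entry per hit.
buckets = Counter(t // 60 for t in hits)

def range_count(lower, upper):
    """Count hits between lower and upper, rounded out to whole minutes."""
    return sum(
        n for minute, n in buckets.items()
        if lower // 60 <= minute <= upper // 60
    )
```

Note that range_count(0, 60) reports 3 rather than the exact 2, because the hit at timestamp 61 shares a minute bucket with the upper boundary; this is the border inaccuracy we accepted in exchange for space.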
22.3 What happens when you visit a URL?
Describe what happens when you type a URL into your browser and press Enter.
Solution
We'll go through a very general and high-level overview of how requests are made,
consisting of the following parts: DNS lookup, HTTP request, server handling, and
rendering.
• DNS Lookup
First, the URL, or domain name, must be converted into an IP address that the
browser can use to send an HTTP request. Each domain name is associated with
an IP address, and if the pair has not been saved in the browser's cache, then most
browsers will ask the OS to look up (or resolve) the domain for it.
An operating system usually has default DNS nameservers that it can ask to perform
the lookup. These DNS servers are essentially huge lookup tables. If an entry is not
found in one of these nameservers, it may query other nameservers to see if the entry
exists there, then forward the results back (and store them in its own cache).
• HTTP Request
Once the browser has the correct IP address, it then sends an HTTP GET request
to that IP.
The HTTP request must go through many networking layers (for example, SSL/TLS,
if it's encrypted). These layers generally serve to protect the integrity of the data
and perform error correction. For example, the TCP layer handles the reliability and
ordering of the data. If packets underneath the TCP layer are corrupted (detected via
checksum), the protocol dictates that they must be resent. If packets arrive in
the wrong order, TCP will reorder them.
In the end, the server will receive a request from the client at the URL specified,
along with metadata in the headers and cookies.
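For concreteness, here is the plain-text form of such a request, assembled by hand with the headers kept to a minimum (a real browser sends many more):

```python
def build_get_request(host, path="/"):
    """Compose a minimal HTTP/1.1 GET request as it appears on the wire."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"   # blank line terminates the headers
    )

print(build_get_request("example.com"))
```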
• Server Handling
Now the request has been received by some server; popular server engines are nginx
and Apache. These servers handle the request accordingly. If the website is static,
for example, the server can serve a file from the local file system. More often, though,
the request is forwarded to an application running on a web framework such as
Django or Ruby on Rails.
These applications eventually return a response to the request, though sometimes they
must perform some logic to serve it. For example, if you're on Facebook, their
servers will parse the request, see that you are logged in, and query their databases
to get the data for your Facebook feed.
• Rendering
Now your browser should have gotten a response to its request, usually in the form
of HTML and CSS. HTML and CSS are markup languages that your browser can
interpret to load content and style the page. Rendering and laying out HTML/CSS
is a very tricky process, and rendering engines have to be very flexible so that an
unclosed tag, for example, doesn't crash the page.
The page might also reference more resources, such as images, stylesheets, or
JavaScript. Fetching these makes more requests, and JavaScript may also be used to
dynamically alter the page and make requests to the backend.
More and more, web applications these days simply load a bare page containing a
JavaScript bundle, which, once executed, fetches content from APIs. The JavaScript
application then manipulates the DOM to add the content it loaded.
• Conclusion
This is only a brief overview of a possible answer to this question. Any of these topics
merit a book-length treatment! Generally, in an interview, this question is asked to
see if you're familiar with the web, how it works, and your mental model of it.
It's impossible to know the whole stack in depth, so sometimes interviewers like to
explore a particular aspect of the stack that they are interested in, or you can go
into more depth about a part of the stack that you're more knowledgeable about. In
any case, interviewers will reward coherent explanations, since they will be working
with you and want to see how you think.
Glossary
array 20, 56, 70, 134, 151, 181, 186
backtracking 167, 248
binary search 179, 275
binary search tree 85, 294
bit manipulation 199, 264
breadth-first search 121, 262
depth-first search 121, 167, 249
design 241, 273, 287
dynamic programming 157, 266
graph 119, 189, 224, 254
hash table 30, 63, 140, 274
heap 105, 193, 228, 252, 258
linked list 41, 65
pathfinding 189, 228, 252
queue 52, 122, 129, 278, 283
randomized algorithm 209
recursion 146
sorting 22, 37, 179
stack 51, 147, 283
string 29, 149, 220
tree 73, 124
trie 93, 239