COM1721: Freshman Honors Seminar
A Random Walk Through Computing
Rajmohan Rajaraman
Tuesdays, 5:20 PM, 149 CN
Introduction
Explore a potpourri of concepts in computing
(pot pourri, literally “rotten pot”, is usually kept in a jar and used for scent)
Readings: Handouts and WWW
Grading: Quizzes, homework, and class participation
Sample Concepts
Abstraction
Modularity
Randomization
Recursion
Representation
Self-reference
…
Sample Topics
Dictionary search
Structure of the Web
Self-reproducing programs
Undecidability
Private communication
Relational databases
Quantum computing, bioinformatics,…
Abstraction
A view of a problem that extracts the essential information relevant to a particular purpose and ignores inessential details
Driving a car:
We are provided a particular abstraction of the car in which we only need to know certain controls
Building a house:
Different levels of abstraction for house owner, architect, construction manager, real estate agent
Related concepts: information hiding, encapsulation, representation
Modularity
Decomposition of a system into components, each of which can be implemented independently of the others
Foundation for good software engineering
Design of a basic processor from scratch
Representation
To portray things or relationships between things
Knowledge representation: model relationships among objects as an edge-labeled graph
Data representation: bar graphs, histograms for statistics
Querying a dictionary; Web as a graph
Randomization
An algorithmic technique that uses probabilistic (rather than deterministic) selection
A simple and powerful tool to provide efficient solutions for many complex problems
Has a number of applications in security
Cryptography and private communication
Recursion
A way of specifying a process by means of itself
Complicated instances are defined in terms of simpler instances, which are given explicitly
Closely tied to mathematical induction
Fibonacci numbers
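As a small illustration of the recursive definition, here is a minimal Python sketch of the Fibonacci numbers. The function name fib and the convention fib(0) = 0, fib(1) = 1 are assumptions; the slides do not fix an indexing.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, assuming fib(0) = 0 and fib(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # The simplest instances are given explicitly.
    if n < 2:
        return n
    # A complicated instance is defined in terms of simpler instances.
    return fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(10)])
```

This direct translation recomputes the same values many times, so it is only practical for small n; it is meant to mirror the definition, not to be fast.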
Self-reference
A statement/program that refers to itself
Examples:
“This statement contains five words”
“This statement contains six words”
“This statement is not self-referential”
“This statement is false”
Important concept in computing theory
Undecidability of the halting problem, self-reproducing programs
Gödel, Escher, Bach: an Eternal Golden Braid, Douglas Hofstadter
Illustration: Representation
Problem: Derive an expression for the sum of the first n natural numbers
1 + 2 + 3 + … + n-2 + n-1 + n = ?
Sum of First n Natural Numbers
1 + 2 + 3 + … + 98 + 99 + 100 = S
100 + 99 + 98 + … + 3 + 2 + 1 = S
101 + 101 + 101 + … + 101 + 101 = 2S
S = 100*101/2
S = n(n+1)/2
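The same pairing argument can be written for general n. The following is a sketch in LaTeX notation of the steps implied by the n = 100 example:

```latex
\begin{align*}
S  &= 1 + 2 + \cdots + (n-1) + n \\
S  &= n + (n-1) + \cdots + 2 + 1 \\
2S &= \underbrace{(n+1) + (n+1) + \cdots + (n+1)}_{n \text{ terms}} = n(n+1) \\
S  &= \frac{n(n+1)}{2}
\end{align*}
```

For n = 100 this gives S = 100 * 101 / 2 = 5050.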
A Different Representation
A “Geometric Derivation”
[Figure: rows labeled 1, 2, 3, 4, 5; annotation 2S = n(n+1)]
Other Equalities
Sum of first n odd numbers
1 + 3 + 5 + … + 2n-1 = ?
Sum of first n cubes
1 + 8 + 27 + 64 + … + n^3 = ?
Representation and Programming
Representation is the essence of programming
Brooks, “The Mythical Man-Month”
Data structures
Dictionary
A collection of words with a specified ordering
Dictionary of English words
Dictionary of IP addresses
Dictionary of NU student names
Searching a Dictionary
Suppose we have a dictionary of 100,000 words
Consider different operations
Search for a word
List all anagrams of a word
Find the word matching the largest prefix
What representation (data structure) should we choose?
Search for a Word
Store the words in sorted order in a linear array
Unsuccessful search:
compare with 100,000 words
Successful search:
on average, compare with 50,000 words
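The comparison counts above can be checked with a minimal sketch of a left-to-right scan. The function name linear_search and the synthetic 100,000-entry word list are assumptions made for illustration; they stand in for a real dictionary.

```python
def linear_search(words, target):
    """Scan words left to right; return (index or -1, number of comparisons)."""
    comparisons = 0
    for i, word in enumerate(words):
        comparisons += 1
        if word == target:
            return i, comparisons
    return -1, comparisons


# Hypothetical stand-in for a 100,000-word dictionary (already in sorted order).
words = [f"w{i:06d}" for i in range(100_000)]

print(linear_search(words, "w049999"))  # a word in the middle of the array
print(linear_search(words, "zzz"))      # a word that is not present
```

A word in the middle costs about 50,000 comparisons, and an absent word costs all 100,000, because this scan never exploits the sorted order.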
Twenty Questions
Compare with 50,000th word
If match, then done
If further in dictionary order, search right half
If earlier in dictionary order, search left half
Until word found, or search space empty
Recursion
Binary search
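The twenty-questions procedure can be sketched as a recursive function. This is one possible rendering, not necessarily the one used in class; the name binary_search and the half-open range [lo, hi) are assumptions. The word list reuses sample words from the slides, with "monet" added as a hypothetical missing word.

```python
def binary_search(words, target, lo=0, hi=None):
    """Return the index of target in the sorted list words, or -1 if absent."""
    if hi is None:
        hi = len(words)
    if lo >= hi:                      # search space empty
        return -1
    mid = (lo + hi) // 2
    if words[mid] == target:          # match: done
        return mid
    if target > words[mid]:           # further in dictionary order: right half
        return binary_search(words, target, mid + 1, hi)
    return binary_search(words, target, lo, mid)   # earlier: left half


words = sorted(["ajuma", "alderaan", "alpheratz", "amber", "dali", "escher",
                "picasso", "reliable", "renoir", "vangogh", "yukon"])

print(binary_search(words, "escher"))
print(binary_search(words, "monet"))
```

Each call discards half of the remaining range, which is what keeps the number of comparisons small.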
How Many Questions?
ajuma alderaan alpheratz amber dali escher picasso reliable renoir vangogh yukon
How Many Questions?
Question #    Search space
0             100,000
1             50,000
2             25,000
3             12,500
5             3,125
10            100
15            4
17            1
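The count of 17 questions can be reproduced with a short sketch that halves the search space until one candidate remains. The helper name questions_needed is an assumption; the rounding up models the worst case of always keeping the larger half.

```python
import math


def questions_needed(size: int) -> int:
    """Count the halvings needed to shrink a search space of `size` to 1."""
    questions = 0
    while size > 1:
        size = (size + 1) // 2   # worst case: keep the larger half
        questions += 1
    return questions


print(questions_needed(100_000))
print(math.ceil(math.log2(100_000)))
```

Both lines print 17, since 2^16 = 65,536 is less than 100,000 and 2^17 = 131,072 is more.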
Anagrams
An anagram of a word is another word with the same distribution of letters, placed in a different order
Input: deposit
Output: posited, topside, dopiest
Anagrams: subessential suitableness
Detecting Anagrams
How do you determine whether two words X and Y are anagrams?
Compare the letter distributions
Time proportional to number of letters in each word
Suppose this subroutine anagram(X,Y) is fast
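One way to sketch the anagram(X,Y) subroutine is to compare letter counts. The use of collections.Counter is an implementation choice made here, not something the slides specify, and this version only compares letter distributions, so a word also matches itself.

```python
from collections import Counter


def anagram(x: str, y: str) -> bool:
    """Return True if x and y have the same distribution of letters."""
    # Counting letters takes time proportional to the length of each word.
    return Counter(x) == Counter(y)


print(anagram("deposit", "topside"))
print(anagram("deposit", "deposed"))
```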
Listing Anagrams of a Word
Dictionary of 100,000 English words
List all anagrams of least
How should we represent the dictionary?
Linear array
Loop through dictionary: if anagram(X,least), include X in list
Running time = 100,000 calls to anagram()
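The linear-array approach can be sketched as follows. The nine-word dictionary is a hypothetical stand-in for the 100,000-word one, and skipping the query word itself is a choice made here to match the definition of an anagram as a different word.

```python
from collections import Counter


def anagram(x, y):
    """Return True if x and y have the same distribution of letters."""
    return Counter(x) == Counter(y)


def list_anagrams(dictionary, word):
    """Loop through the whole dictionary, calling anagram() once per entry."""
    return [x for x in dictionary if x != word and anagram(x, word)]


# Hypothetical miniature dictionary for illustration only.
dictionary = ["acre", "care", "least", "pots", "slate",
              "stale", "steal", "stop", "tales"]

print(list_anagrams(dictionary, "least"))
```

Every query touches every entry, so the cost grows with the size of the dictionary rather than with the number of anagrams found.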
A Different Data Structure
If X and Y are anagrams of each other, they are equivalent; the list of anagrams of X is the same as the list for Y
This indicates an equivalence class of anagrams!
deposit posited topside dopiest
race care acre
adroitly dilatory idolatry
Anagram Signatures
Would like to store anagrams in the same class together
How do we identify a class?
Assign a signature!
Sort all the letters in the anagram word(s)
Same for each word in a class!
acre race care: acer
deposit posited topside dopiest: deiopst
subessential suitableness: abeeilnssstu
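The signature can be sketched in one line of Python. The function name sign follows the label used on the slides; treating words as lowercase strings is an assumption.

```python
def sign(word: str) -> str:
    """Return the signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


for w in ["acre", "race", "care", "deposit", "topside", "suitableness"]:
    print(w, "->", sign(w))
```

Words in the same anagram class produce the same signature, so the signature can serve as a lookup key for the class.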
Anagram Program
Input: acre pots stop care post snap
After sign:
acer: acre
opst: pots
opst: stop
acer: care
opst: post
anps: snap
After sort:
acer: acre
acer: care
anps: snap
opst: pots
opst: stop
opst: post
Anagram Program
Before merge:
acer: acre
acer: care
anps: snap
opst: pots
opst: stop
opst: post
After merge:
acer: acre care
anps: snap
opst: pots stop post
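The sign, sort, and merge stages can be sketched as a short pipeline. The function name build_anagram_dictionary and the list-of-pairs output format are assumptions chosen to mirror the slide layout.

```python
from itertools import groupby


def sign(word):
    """Return the signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


def build_anagram_dictionary(words):
    """Sign each word, sort by signature, then merge equal signatures."""
    signed = [(sign(w), w) for w in words]       # sign
    signed.sort(key=lambda pair: pair[0])        # sort (stable, keeps input order)
    return [(sig, [w for _, w in group])         # merge adjacent equal signatures
            for sig, group in groupby(signed, key=lambda pair: pair[0])]


words = ["acre", "pots", "stop", "care", "post", "snap"]
for sig, group in build_anagram_dictionary(words):
    print(f"{sig}: {' '.join(group)}")
```

Because the sort is stable, words within a class stay in their input order, which reproduces the merged listing from the slide.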
Listing Anagrams for Given Word X
Compute sign(X) and lookup sign(X) in dictionary using binary search
List all words in list adjacent to sign(X)
Example: post → sign → opst → lookup
acer: acre care
anps: snap
opst: pots stop post
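The lookup step can be sketched with Python's bisect module, which performs the binary search over the sorted signatures. The three-entry dictionary is the example from the slides; the names signatures and list_anagrams are assumptions.

```python
from bisect import bisect_left


def sign(word):
    """Return the signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


# Merged dictionary from the slide example, sorted by signature.
anagram_dictionary = [
    ("acer", ["acre", "care"]),
    ("anps", ["snap"]),
    ("opst", ["pots", "stop", "post"]),
]
signatures = [sig for sig, _ in anagram_dictionary]


def list_anagrams(word):
    """Binary-search for sign(word) and return the words stored with it."""
    sig = sign(word)
    i = bisect_left(signatures, sig)
    if i < len(signatures) and signatures[i] == sig:
        return anagram_dictionary[i][1]
    return []


print(list_anagrams("post"))
print(list_anagrams("least"))
```

The returned class includes the query word itself when it is in the dictionary, as in the slide example; a caller could filter it out if only the other words are wanted.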
Efficiency of Anagram Program
Once dictionary has been stored in new representation:
Lookup takes at most 17 queries
Listing time is proportional to number of anagrams in the class
What about the cost of new representation?
Sign each word, sort, and merge
Expensive, but need to do it only once!
Preprocessing
References
Programming Pearls, by Jon Bentley, Addison-Wesley
Great Ideas in Theoretical Computer Science, Steven Rudich (a course at CMU)