CMU 15-505: Internet Search Technologies

15-505 Internet Search Technologies
• Instructors:
– Alona Fyshe
– Scott Larsen
– Chris Monson
– Kamal Nigam
• http://www.cs.cmu.edu/~knigam/15-505
What does it take to build a world-class search engine and related
services?
• Lots of computer science
• Massively parallel computation
• Special-purpose data storage
• Information retrieval
• Machine learning
• Language analysis
• User interface design
• Study each of these topics in narrow but
deep fashion
• Format: small seminar, readings,
interactive discussions, programming
practicum
• Grading:
– 55% programming homework
– 30% reading response
– 15% class participation
What are reading responses?
• Practice for reading and thinking about
computer science research papers
• Meant to be open-ended, fairly short (1
page)
• Can be:
– Summary of paper
– Critique of theory, experiments, approach
– Suggestions for follow-on studies
Collaboration and Cheating
• Please collaborate on ideas, approaches,
diagnosing problems – use the mailing list
• All words and code must be your own
• Disclose all collaborations
• Clarify any doubts
What will make this class
enjoyable?
• Interactive
• Flexibility to explore fun domains and data
• Early feedback to us about what works
and doesn’t
Problems in Internet Search
Technology:
• Huge Problems
– E.g. what changed in the web since this time yesterday?
• Classic Problems
– E.g. sorting a gazillion numbers fast
• New Problems
– E.g. making sense of dynamic Cyrillic web pages
• Practical Problems
– E.g. how do we make both advertisers and consumers happier at
the same time?
• Non-practical Problems
– E.g. what do you see if you zoom all the way in on the moon?
• Beautiful Problems
– And Fun Problems
A Taste
• Sorting
– Scale size up
– Scale time requirements down
• Matrix Operations
– Thinking about the problem in a blend of old
ways and new ways
Classic Sorting Algorithms
• Quick
• Merge
• Selection
• Shell
• Heap
• Radix
• Bucket
• …
Enlarge the Problem:
• 1,000x too many keys for a single machine
• 1024 machines to use
Sorting: Parallel
• How would you do it?
– Quick?
– Merge?
– Selection?
– Shell?
– Heap?
– Radix?
– Bucket?
– ….
Bitonic Sort: Batcher (1968)
• Bitonic Sequence: <a_0, a_1, …, a_{n-1}>
– There exists an i such that <a_0 .. a_i> is monotonically
increasing and <a_{i+1} .. a_{n-1}> is monotonically
decreasing
– Or: there exists a cyclic shift of indices such
that the above is satisfied
– E.g. <8, 9, 2, 1, 0, 4> is a bitonic sequence
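The definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names `is_up_then_down` and `is_bitonic` are made up for illustration, and "monotonically increasing" is assumed to allow equal neighbors. It tries every cyclic shift, which is quadratic but mirrors the definition directly.

```python
def is_up_then_down(seq):
    """True if seq rises (non-strictly) and then falls (non-strictly)."""
    i, n = 0, len(seq)
    while i + 1 < n and seq[i] <= seq[i + 1]:   # climb the increasing part
        i += 1
    while i + 1 < n and seq[i] >= seq[i + 1]:   # descend the decreasing part
        i += 1
    return i >= n - 1                           # consumed the whole sequence


def is_bitonic(seq):
    """True if some cyclic shift of seq is increasing-then-decreasing."""
    seq = list(seq)
    shifts = range(max(1, len(seq)))
    return any(is_up_then_down(seq[k:] + seq[:k]) for k in shifts)


print(is_bitonic([8, 9, 2, 1, 0, 4]))   # the slide's example
print(is_bitonic([1, 3, 2, 4, 0, 5]))   # direction changes too often
```

The slide's example passes only because of the cyclic-shift clause: starting at 0 it reads 0, 4, 8, 9, 2, 1.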
Bitonic Merging Network
Bitonic Merge on a Hypercube
Bitonic Sort
Procedure BitonicSort
  (d is the hypercube dimension; iproc is this processor's label)
  for i = 0 to d - 1
    for j = i downto 0
      if (i + 1)st bit of iproc <> jth bit of iproc
        comp_exchange_max(j, item)
      else
        comp_exchange_min(j, item)
      endif
    endfor
  endfor
comp_exchange_max and comp_exchange_min compare and
exchange the item with the neighbor on the jth dimension,
keeping the larger or the smaller item respectively
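To make the procedure above concrete, here is a small single-machine simulation. It is a sketch under stated assumptions, not the course's implementation: the name `bitonic_sort_hypercube` is hypothetical, each of the 2^d simulated processors holds exactly one item, bit 0 is the least significant bit of the processor label, and the message exchange is replaced by reading the neighbor's slot in a list.

```python
def bitonic_sort_hypercube(items):
    """Simulate the hypercube procedure with one item per processor."""
    n = len(items)
    d = n.bit_length() - 1
    if n == 0 or (1 << d) != n:
        raise ValueError("number of items must be a power of two")
    items = list(items)
    for i in range(d):
        for j in range(i, -1, -1):
            nxt = items[:]                      # all processors act in lockstep
            for iproc in range(n):
                neighbor = iproc ^ (1 << j)     # neighbor on the jth dimension
                bit_i1 = (iproc >> (i + 1)) & 1
                bit_j = (iproc >> j) & 1
                pair = (items[iproc], items[neighbor])
                if bit_i1 != bit_j:
                    nxt[iproc] = max(pair)      # comp_exchange_max
                else:
                    nxt[iproc] = min(pair)      # comp_exchange_min
            items = nxt
    return items


print(bitonic_sort_hypercube([8, 9, 2, 1, 0, 4, 7, 3]))
# -> [0, 1, 2, 3, 4, 7, 8, 9]
```

The two neighbors in each exchange differ only in bit j, so exactly one of them keeps the maximum and the other keeps the minimum; no item is lost or duplicated.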
Bitonic Sort Demo
http://www.inf.fh-flensburg.de/lang/algorithmen/sortieren/bitonic/bitonicen.htm
Parallel Sort: Beauty or a Beast?
• What does it take to implement this?
Bitonic Sort: Why?
• O(n log^2(n)) comparisons
• Data independent
• Resource needs are perfectly defined
• Very parallel friendly
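The "data independent" and "perfectly defined" properties can be seen by generating the comparison schedule before looking at any keys. The sketch below is illustrative only: `bitonic_schedule` and `apply_schedule` are hypothetical names, and n is assumed to be a power of two. For n = 2^d the schedule has d(d+1)/2 stages of n/2 comparisons each, which is where the O(n log^2(n)) comparison count comes from.

```python
def bitonic_schedule(n):
    """Return the stages of (lo, hi) comparator pairs for n = 2**d keys."""
    d = n.bit_length() - 1
    if n < 1 or (1 << d) != n:
        raise ValueError("n must be a power of two")
    stages = []
    for i in range(d):
        for j in range(i, -1, -1):
            stage = []
            for p in range(n):
                q = p ^ (1 << j)
                if p < q:                                 # visit each pair once
                    ascending = ((p >> (i + 1)) & 1) == 0
                    stage.append((p, q) if ascending else (q, p))
            stages.append(stage)
    return stages


def apply_schedule(stages, items):
    """Run the fixed schedule: afterwards items[lo] <= items[hi] per pair."""
    items = list(items)
    for stage in stages:
        for lo, hi in stage:                              # independent in a stage
            if items[lo] > items[hi]:
                items[lo], items[hi] = items[hi], items[lo]
    return items


stages = bitonic_schedule(8)
print(len(stages), sum(len(s) for s in stages))           # 6 stages, 24 comparisons
print(apply_schedule(stages, [8, 9, 2, 1, 0, 4, 7, 3]))
```

Because the schedule depends only on n, every stage can be handed to n/2 workers with no coordination beyond a barrier between stages.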
Matrix Multiplication
c_{ij} = \sum_k a_{ik} b_{kj}
[Figure: worked 4x4 example, A * B = C, with entries built from 0.75, 0.25, and 0.0]
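The definition of the product, where entry (i, j) of C is the sum over k of a_ik times b_kj, translates directly into three nested loops. This is a minimal sketch: the function name `matmul` is made up, and the matrix `A` below is a hypothetical input that merely reuses the values 0.75, 0.25, and 0.0; it is not a reconstruction of the slide's figure.

```python
def matmul(a, b):
    """Return C with C[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


# Hypothetical 4x4 input for illustration only.
A = [
    [0.75, 0.25, 0.0, 0.0],
    [0.0, 0.75, 0.25, 0.0],
    [0.0, 0.0, 0.75, 0.25],
    [0.25, 0.0, 0.0, 0.75],
]
print(matmul(A, A)[0])   # -> [0.5625, 0.375, 0.0625, 0.0]
```

Each of the n^2 output entries needs a full row of one input and a full column of the other, which is the overlap that makes distributing the work non-trivial.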
Matrix Pipeline
c_{ij} = \sum_k a_{ik} b_{kj}
[Figure: the sum over k evaluated as a pipeline of partial terms joined by + and =]
Visualization
[Figure sequence: visualization of the matrix product, operand * operand = result]
Matrix Multiplication
• A cube of processors
• Each does a chunk of the computation
– Each needs different (and overlapping) portions of the
input
– Each passes intermediate results to certain neighbors
• Result is stored across multiple machines
• Seems kinda heavy for a simple algorithm!
• Look up Fox’s algorithm and Cannon’s algorithm
– Very pretty at one level
– Gory at another level
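To give a feel for the "pretty" level, here is a toy simulation of Cannon's algorithm. It is a sketch under simplifying assumptions rather than anything from the course: it uses a two-dimensional q x q grid (not the cube mentioned above), each simulated processor holds a single matrix element instead of a block, and the name `cannon_multiply` is hypothetical. After an initial skew, every processor multiplies what it holds, then passes its A value left and its B value up, q times in total.

```python
def cannon_multiply(a, b):
    """Simulate Cannon's algorithm for two q x q matrices."""
    q = len(a)
    # Initial skew: row i of A shifts left by i, column j of B shifts up by j.
    a_loc = [[a[i][(j + i) % q] for j in range(q)] for i in range(q)]
    b_loc = [[b[(i + j) % q][j] for j in range(q)] for i in range(q)]
    c = [[0] * q for _ in range(q)]
    for _ in range(q):
        # Every processor (i, j) multiplies the pair it currently holds.
        for i in range(q):
            for j in range(q):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # Pass A values one step left and B values one step up (with wraparound).
        a_loc = [[a_loc[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b_loc = [[b_loc[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return c


print(cannon_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# -> [[19, 22], [43, 50]]
```

The "gory" level is everything this toy version hides: block sizes that do not divide evenly, overlapping communication with computation, and failures partway through.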
A Different View
Courtesy http://www.unrealtournament3.com/
• Multiplication (*): multi-texturing
• Addition (+): blending
Graphics Pipeline
Multiply
Multiply
Add
Image (Frame Buffer)
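The mapping of multiplication to multi-texturing and addition to blending into the frame buffer can be imitated on a CPU. The sketch below is an assumption about how such a scheme could be organized, not a description of the slides' actual implementation: for each k it builds two hypothetical "textures" (column k of A repeated across, row k of B repeated down), multiplies them element by element, and adds the result into an accumulating frame buffer. The name `multiply_layered` is made up.

```python
def multiply_layered(a, b):
    """Accumulate the product one k-layer at a time, like blended passes."""
    n, m, p = len(a), len(b), len(b[0])
    frame = [[0.0] * p for _ in range(n)]              # stands in for the frame buffer
    for k in range(m):
        tex_a = [[a[i][k]] * p for i in range(n)]      # column k of A, repeated across
        tex_b = [list(b[k]) for _ in range(n)]         # row k of B, repeated down
        for i in range(n):
            for j in range(p):
                # "multi-texturing" multiplies, "blending" adds to the frame
                frame[i][j] += tex_a[i][j] * tex_b[i][j]
    return frame


print(multiply_layered([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# -> [[19.0, 22.0], [43.0, 50.0]]
```

Each pass touches every output pixel with the same simple operation, which is the kind of workload graphics hardware is built for.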
How the Algorithm Works
[Figure sequence: operands combined step by step with * (multiply) and + (add)]
Performance
Multiplication of 2 Matrices 1024x1024
[Bar chart: time in seconds (axis from 0 to 0.4) for NVIDIA 7800GTX (450,1250), NVIDIA 7900GTX (670, 820), Dual Opteron 280 Dual Core (2.3GHz), P4 Dual Core (3.2GHz), and Dual Xeon Dual Core (3.6GHz); CPU series run with 1, 2, or 4 threads]
GPU Sorting
Problems in Internet Search
Technology:
• Huge Problems
• Classic Problems
• New Problems
• Practical Problems
• Non-practical Problems
• Beautiful Problems
• Fun Problems
Questions?
CMU 15-505: Internet Search Technologies
– Kamal Nigam (knigam@google.com)
– Chris Monson (shiblon@google.com)
– Alona Fyshe (alonaf@google.com)
– Scott Larsen (esl@google.com)
Bitonic Rearranging (cycling)