Introduction Computer Science
Henri Bal
Vrije Universiteit Amsterdam
Goals of this course
● Understand typical Computer Science topics
● Meet with students and some staff members
● Develop skills:
  ● Reading (English) scientific literature
  ● Critical/analytical thinking about CS topics
  ● Discussing
  ● Presenting
  ● Scientific writing
Structure
● Tuesdays: guest lectures
  ● 2 scientific papers provided as context
  ● Questions prepared by the lecturers beforehand
● Thursday/Friday/Monday: working groups
  ● 2 students per group present a paper
  ● Each group discusses both papers + questions
Topics (Tuesday lectures)
● Intro & high-performance computing (Henri Bal)
● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)
● Finding & reading scientific literature (Michel Klein, with LI & IMM students)
● Bioinformatics (Jaap Heringa)
● Watson (Chris Welty, with LI & IMM students)
● e-Science infrastructures (Cees de Laat)
● e-Health (Aart van Halteren)
Working Groups
● Supervised by staff members (instructors)
● First meeting:
  ● Instructors will present 1 paper, you do the discussions
● Other meetings:
  ● Students present/discuss papers
● Course material + working group composition will be made available on Blackboard (bb.vu.nl)
Your tasks
● Attend Tuesday lectures
● Send brief answers to questions + pose 2 new questions per paper before workgroup deadline
● Give 1 presentation in a working group
  ● Make slides, talk for 10-15 minutes
● Participate in working group discussions
● Write 2-page paper on 1 topic of your choice
  ● Use (find!) 2 extra publications in the literature
● Grading:
  ● 40% participation, 40% paper, 20% presentation
First presentation
● My personal view on Computer Science
  ● Why is Computer Science so interesting?
● Biased towards my own research area:
  ● High performance distributed computing
Computer Science (CS)
● CS sits between technology and applications, both of which are developing rapidly
  ● Processors, networks, mobiles, wearables, …
  ● Data explosion in virtually all applications
● CS also studies many fundamental problems of its own
  ● Programming languages, security, AI, theory, …
Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions
Computers
● Mainframe: powerful centralized computer
  ● IBM 704 (1954)
● Minicomputers: under $25K, for small groups
  ● PDP-8, PDP-11, VAX (1960s-1980s)
● Workstations: expensive personal graphical machine
  ● Xerox Alto (1973)
● PCs: inexpensive machine for the masses
  ● IBM PC (1981)
High Performance Computers
● Computer systems with many processors, all computing in parallel
● Paper: “Back to Thin-Core Massively Parallel Processors”
Warning
● Scientific papers may be overwhelming
● You have to learn how to read scientific literature without understanding every word
  ● “Moreover, smart algorithms that exploit data locality, perform loop unrolling, eliminate iterative loops and recursive algorithms, and use idle-power-friendly programming languages and libraries as well as auto-tuning based on multiversion algorithms can achieve higher-energy-efficiency applications.”
  ● (You’re not supposed to understand this yet!)
High Performance Computers (1)
● Vector machines
  ● Can do vector operations in parallel
    ● A and B: 1-dimensional arrays (vectors) with 100 elements
    ● Computing A+B (= 100 additions) takes as much time as doing 1 addition on a sequential computer
● History
  ● 1970s, 1980s (e.g., Cray)
  ● 2000s (Japanese Earth Simulator)
  ● 2010s (GPUs, Graphics Processing Units)
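The A+B example can be sketched in a few lines of Python. This is only an illustration of the two ways to express the computation: the data values and the function names `add_sequential` and `add_vector_style` are assumptions, and plain Python still executes both versions one element at a time. On a vector machine, the second form maps to a single instruction applied to all elements at once.

```python
import operator


def add_sequential(a, b):
    """Add two vectors one element at a time, as a sequential computer would."""
    result = []
    for i in range(len(a)):
        result.append(a[i] + b[i])  # one addition per step, len(a) steps in total
    return result


def add_vector_style(a, b):
    """Express the same sum as one whole-vector operation.

    A vector machine would apply the addition to all elements at once;
    Python only mimics the notation here, not the parallel hardware.
    """
    return list(map(operator.add, a, b))


A = list(range(100))      # hypothetical data: 0, 1, ..., 99
B = [10 * x for x in A]   # hypothetical data: 0, 10, ..., 990

assert add_sequential(A, B) == add_vector_style(A, B)
print(add_vector_style(A, B)[:3])  # [0, 11, 22]
```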
High Performance Computers (2)
● Massively parallel machines
  ● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computation
  ● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene
  ● Connection network uses graph theory (math)
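To give a feel for how graph theory enters network design, here is a small sketch of a hypercube, one well-known interconnection topology. Choosing the hypercube is an assumption made for illustration, since the slide does not name a topology, and the function names are hypothetical. Each processor gets a binary number, and two processors are linked when their numbers differ in exactly one bit.

```python
def hypercube_neighbors(node, dimensions):
    """Return the nodes directly linked to `node` in a hypercube network.

    Flipping one bit of the node number gives one neighbor, so every node
    has exactly `dimensions` links.
    """
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hops(src, dst):
    """Minimum number of links between two nodes: the count of differing bits."""
    return bin(src ^ dst).count("1")


print(hypercube_neighbors(0b000, 3))  # [1, 2, 4]
print(hops(0b000, 0b111))             # 3
```

With d dimensions the network connects 2^d processors while any message needs at most d hops, which is why such graphs are attractive for machines with thousands of processors.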
High Performance Computers (3)
● Cluster computers
  ● Parallel machines built from off-the-shelf (commodity) PCs and networks
  ● Excellent price/performance ratio
● Exponential growth of processor speeds
● See http://www.top500.org for the 500 fastest supercomputers
Multicores & Manycores
● All PCs now have more than 1 compute core
  ● Every PC is a parallel computer!
● Some PCs already have 48 cores
● Core count will increase to hundreds
  ● Intel Xeon Phi (2012): 60 Pentium-class cores on 1 chip, with advanced vector support
● Challenge: how to program these things?
Thinking in parallel is hard
● How to split up the work?
● Load balancing
  ● All cores should do the same amount of work
● Communication & synchronization
  ● Cores must exchange data (=overhead)
● Nondeterminism:
  ● A single processor always gives the same outcome
  ● With >1 core the outcome may depend on the order in which the cores run (called a “race condition” bug)
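The points on this slide can be sketched with Python threads. This is a minimal illustration, not a recipe for fast code: the helper names `split_evenly` and `parallel_sum` are hypothetical, and in the standard CPython interpreter threads do not actually speed up this kind of computation. The sketch shows how the work is split into near-equal parts (load balancing) and how a lock protects the shared total, so that the result does not depend on the order in which the threads finish.

```python
import threading


def split_evenly(n_items, n_cores):
    """Return (start, end) index ranges whose sizes differ by at most 1."""
    base, extra = divmod(n_items, n_cores)
    ranges, start = [], 0
    for core in range(n_cores):
        size = base + (1 if core < extra else 0)  # first `extra` cores get one more
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_sum(data, n_threads):
    """Sum `data` with several threads, each handling one index range."""
    total = 0
    lock = threading.Lock()

    def worker(start, end):
        nonlocal total
        partial = sum(data[start:end])  # independent work, nothing shared
        with lock:                      # synchronization: one thread at a time
            total += partial            # unprotected, this update could race

    threads = [threading.Thread(target=worker, args=r)
               for r in split_evenly(len(data), n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


print(split_evenly(10, 4))                 # [(0, 3), (3, 6), (6, 8), (8, 10)]
print(parallel_sum(list(range(1000)), 4))  # 499500
```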
Graphics Processing Units (GPUs)
Differences between CPUs and GPUs
● CPU: minimize latency of 1 activity (thread)
  ● Must be good at everything
  ● Big on-chip caches
  ● Sophisticated control logic
● GPU: maximize throughput of all threads using large-scale parallelism
  ● 1000s of very simple cores
[Figure: CPU chip layout with four ALUs, control logic, and cache]
Current debates
● Should we build chips with:
  ● Very fast/complicated (superscalar) processors?
    ● Hits a “power wall”, hard to increase clock frequency
  ● Many slower/simpler (thin) processors?
    ● Hard to program
● How to deal with energy consumption?
  ● Performance per Watt becomes key factor
Networks
● Wide area networks (WANs)
● Local area networks (LANs)
● Mobile networks
● Much more in Computer Networks class
Wide area networks
● ARPANET
  ● First computer network, connecting some US sites (1960s)
  ● Speeds measured in kbit/s
● Internet
  ● Based on standardized (IP) protocol suite
  ● Connect everyone/everything (Internet-of-things)
● Dedicated optical networks (light paths)
  ● 10 Gbit/s, point-to-point
Local Area Networks
● Ethernet: developed by Xerox PARC (1974)
  ● Speed increased from 10 Mbit/s to 100 Gbit/s
● Cluster computers use Ethernet or faster commodity networks
  ● Myrinet
  ● InfiniBand
An aside
● In Computer Science
  ● k(ilo) = 1024
  ● m(ega) = 1024²
  ● g(iga) = 1024³
  ● t(era) = 1024⁴
  ● p(eta) = 1024⁵
  ● e(xa) = 1024⁶
● This all has to do with binary numbers (1024 = 2¹⁰)
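The table can be reproduced with a short Python sketch. The names `PREFIXES` and `binary_prefix_value` are made up for this illustration. Note that the 1024-based convention is typical for memory sizes, while transfer rates such as kbit/s and Gbit/s are conventionally counted in powers of 1000.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]


def binary_prefix_value(name):
    """Return the value of a prefix under the 1024-based convention."""
    exponent = PREFIXES.index(name) + 1  # kilo -> 1, mega -> 2, ...
    return 1024 ** exponent


for name in PREFIXES:
    value = binary_prefix_value(name)
    power_of_two = value.bit_length() - 1  # exact, since value is a power of 2
    print(f"{name:>4} = 2^{power_of_two} = {value}")
```

Every value is an exact power of two (2^10, 2^20, ..., 2^60), which is the link to binary numbers mentioned on the slide.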
DAS-5
Dual 8-core Intel E5-2630v3 CPUs
FDR InfiniBand
OpenFlow switches
Various accelerators
CentOS Linux
Bright Cluster Manager
Built by ClusterVision
[Figure: DAS-5 sites connected by SURFnet7 at 10 Gb/s: VU (68), UvA/MultimediaN (18/31), TU Delft (48), Leiden (24), ASTRON (9)]
Mobile computing
● Laptops, sensors, smartphones, tablets
● Many forms of mobile networks
  ● Wi-Fi (local range)
  ● 3G, 4G (lower bandwidth, high coverage)
  ● Bluetooth (for pairing devices)
● Ultimately: ubiquitous computing?
  ● Vision by Mark Weiser (1988)
  ● “machines that fit the human environment instead of forcing humans to enter theirs”
Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions
Application developments
● There is a “data explosion” in many application areas
  ● Huge amounts of data (up to Petabytes/year)
  ● Very complicated/heterogeneous data
● Demand for computing
  ● Model (simulate) designs on a computer
Data explosion
● Society:
  ● Web, social networks
● Industry, economy:
  ● Banks, stock markets
● Science
  ● LHC (“Higgs particle”)
    ● Data stored on world-wide “grid”
  ● Bioinformatics (next generation sequencing)
  ● Astronomy: software telescopes (LOFAR, SKA)
Computing demands
● Computational science:
  ● Modeling ozone layer, climate, ocean, human brain
  ● Simulating galaxies
● Engineering:
  ● Aircraft modeling, designing F1 cars (Virgin VR01)
  ● TVs (mostly software), embedded systems
● Games and multimedia:
  ● Computer chess (Deep Blue)
  ● Watson (Jeopardy)
  ● Analyzing multimedia content
  ● Digital forensics
  ● Generating movies
Pixar’s “Up” (2009)
Whole movie (96 minutes) would take 94 years on 1 PC
(4 frames per day; 1 second of film takes 6 days; about 1 minute of film per year)
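The rendering figures can be checked with a few lines of Python. The rate of 4 frames per day and the 96-minute length come from the slide; the frame rate of 24 frames per second is an assumption (it is the usual cinema rate and matches "1 second takes 6 days"), and the function name `render_days` is made up for this sketch.

```python
FRAMES_PER_SECOND = 24        # assumption: standard cinema frame rate
FRAMES_PER_DAY_ON_ONE_PC = 4  # from the slide
MOVIE_MINUTES = 96            # from the slide


def render_days(film_seconds):
    """Days one PC needs to render the given number of seconds of film."""
    frames = film_seconds * FRAMES_PER_SECOND
    return frames / FRAMES_PER_DAY_ON_ONE_PC


print(render_days(1))                          # 6.0 days for 1 second
print(render_days(60))                         # 360.0 days for 1 minute
print(render_days(MOVIE_MINUTES * 60) / 365)   # between 94 and 95 years
```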
Some fundamental Computer Science topics (1)
● Operating systems:
  ● Windows, Linux, Minix (Andy Tanenbaum)
● Programming languages and systems
  ● Fortran, Cobol, C, Java, Python … (thousands)
What happens if you ask a computer scientist to solve a problem?
He/she will come back 3 months later, with … a new programming language ideally suited for solving your problem
Some fundamental Computer Science topics (2)
● Security
  ● Preventing/detecting attacks, privacy, etc.
● (Semantic) web technology
  ● Finding and reasoning about content on the web
● Cloud computing
  ● Store data and programs remotely, in the Cloud
Some fundamental Computer Science topics (3)
● Artificial intelligence
  ● E.g. machine learning
● Databases
  ● Storing and searching huge amounts of data
● Logic, modelling, graph theory, complexity
  ● Essential for many applications
Conclusion
● Modern Computer Science deals with hectic developments in technology and applications
● Both provide us with many research problems
  ● Application-driven vs technology-driven research
● There are also many fundamental CS problems
Literature (Context)
● Ami Marowka: Back to Thin-Core Massively Parallel Processors, IEEE Computer, December 2011, pp. 49-54
QUESTIONS
● Explain what “thin cores” are
● What are the arguments in favor of and against using “thin cores”?
● Which role does energy consumption play in this discussion?
● Compute the energy efficiency of the current 10 largest supercomputers on www.top500.org
● Which type of machine currently is most energy efficient?
● Compare the maximum performance of the current #1 against the performance of the #1 of 10 years ago. What is the difference?
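For the energy-efficiency question, one common measure is performance per Watt. The sketch below assumes you take a machine's measured performance (in TFLOPS) and its power use (in kW) from the list; the two machines in the example are hypothetical and are NOT real TOP500 entries, and the function name `gflops_per_watt` is made up for this illustration.

```python
def gflops_per_watt(performance_tflops, power_kw):
    """Energy efficiency: performance divided by power draw, in GFLOPS per Watt."""
    gflops = performance_tflops * 1000  # 1 TFLOPS = 1000 GFLOPS
    watts = power_kw * 1000             # 1 kW = 1000 W
    return gflops / watts


# Hypothetical machines for illustration only: (performance in TFLOPS, power in kW)
machines = {
    "machine_a": (10_000.0, 5_000.0),
    "machine_b": (2_000.0, 500.0),
}

for name, (performance, power) in machines.items():
    print(name, gflops_per_watt(performance, power))  # 2.0, then 4.0

most_efficient = max(machines, key=lambda m: gflops_per_watt(*machines[m]))
print(most_efficient)  # machine_b
```

In this made-up example the smaller machine is the more energy efficient one, even though its absolute performance is five times lower.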