Introduction Computer Science
Henri Bal
Vrije Universiteit Amsterdam

Goals of this course
● Understand typical Computer Science topics
● Meet with students and some staff members
● Develop skills:
  ● Reading (English) scientific literature
  ● Critical/analytical thinking about CS topics
  ● Discussing
  ● Presenting
  ● Scientific writing

Structure
● Tuesdays: guest lectures
  ● 2 scientific papers provided as context
  ● Questions made up by lecturers beforehand
● Thursday/Friday/Monday: working groups
  ● 2 students per group present a paper
  ● Each group discusses both papers + questions

Topics (Tuesday lectures)
● Intro & high-performance computing (Henri Bal)
● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)
● Finding & reading scientific literature (Michel Klein, with LI & IMM students)
● Bioinformatics (Jaap Heringa)
● Watson (Chris Welty, with LI & IMM students)
● e-Science infrastructures (Cees de Laat)
● e-Health (Aart van Halteren)

Working Groups
● Supervised by staff members (instructors)
● First meeting:
  ● Instructors will present 1 paper, you do the discussions
● Other meetings:
  ● Students present/discuss papers
● Course material + working group composition will be made available on Blackboard (bb.vu.nl)

Your tasks
● Attend Tuesday lectures
● Send brief answers to questions + pose 2 new questions per paper before workgroup deadline
● Give 1 presentation in a working group
  ● Make slides, talk for 10-15 minutes
● Participate in working group discussions
● Write 2-page paper on 1 topic of your choice
  ● Use (find!) 2 extra publications in the literature
● Grading:
  ● 40% participation, 40% paper, 20% presentation

First presentation
● My personal view on Computer Science
  ● Why is Computer Science so interesting?
● Biased towards my own research area:
  ● High performance distributed computing

Computer Science (CS)
● CS sits between technology and applications, both of which have turbulent developments
  ● Processors, networks, mobiles, wearables, …
  ● Data explosion in virtually all applications
● CS also studies many fundamental problems of its own
  ● Programming languages, security, AI, theory, …

Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions

Computers
● Mainframe: powerful centralized computer
  ● IBM 704 (1954)
● Minicomputers: <$25K, for small groups
  ● PDP-8, PDP-11, VAX (1960s-1980s)
● Workstations: expensive personal graphical machine
  ● Xerox Alto (1973)
● PCs: inexpensive machine for the masses
  ● IBM PC (1981)

High Performance Computers
● Computer systems with many processors, all computing in parallel
● Paper: “Back to Thin-Core Massively Parallel Processors”

Warning
● Scientific papers may be overwhelming
● You have to learn how to read scientific literature without understanding every word
  ● “Moreover, smart algorithms that exploit data locality, perform loop unrolling, eliminate iterative loops and recursive algorithms, and use idle-power-friendly programming languages and libraries as well as auto-tuning based on multiversion algorithms can achieve higher-energy-efficiency applications.”
  ● (You’re not supposed to understand this yet!)

High Performance Computers (1)
● Vector machines
  ● Can do vector operations in parallel
    ● A and B: vectors (1-dimensional matrices) with 100 elements
    ● Computing A+B (= 100 additions) takes as much time as doing 1 addition on a sequential computer
● History
  ● 1970s, 1980s (e.g., Cray)
  ● 2000s (Japanese Earth Simulator)
  ● 2010s (GPUs, Graphics Processing Units)

High Performance Computers (2)
● Massively parallel machines
  ● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computation
  ● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene
  ● Connection network uses graph theory (math)

High Performance Computers (3)
● Cluster computers
  ● Parallel machines built from off-the-shelf (commodity) PCs and networks
  ● Excellent price/performance ratio
    ● Exponential performance growth of processor speeds
  ● See http://www.top500.org for the 500 fastest supercomputers

Multicores & Manycores
● All PCs now have >1 compute core
  ● Every PC is a parallel computer!
● Some PCs already have 48 cores
● Core count will increase to hundreds
  ● Intel Phi (2012): 60 Pentium-1’s on 1 chip, with advanced vector support
● Challenge: how to program these things?

Thinking in parallel is hard
● How to split up the work?
● Load balancing
  ● All cores should do the same amount of work
● Communication & synchronization
  ● Cores must exchange data (= overhead)
● Nondeterminism:
  ● A single processor always gives the same outcome
  ● With >1 core the outcome may depend on the order of execution (called a “race condition” bug)

Graphics Processing Units (GPUs)

Differences CPUs and GPUs
● CPU: minimize latency of 1 activity (thread)
  ● Must be good at everything
  ● Big on-chip caches
  ● Sophisticated control logic
  ● [Figure: CPU block diagram with ALUs, control logic, and cache]
● GPU: maximize throughput of all threads using large-scale parallelism
  ● 1000s of very simple cores

Current debates
● Should we build chips with:
  ● Very fast/complicated (superscalar) processors?
  ● Many slower/simpler (thin) processors?
  ● Superscalar processors hit a “power wall”: hard to increase clock frequency
  ● Thin processors are hard to program
● How to deal with energy consumption?
  ● Performance per Watt becomes key factor

Networks
● Wide area networks (WANs)
● Local area networks (LANs)
● Mobile networks
● Much more in Computer Networks class

Wide area networks
● ARPANET
  ● First computer network, connecting some US sites (1960s)
  ● Speeds measured in kbit/s
● Internet
  ● Based on standardized (IP) protocol suite
  ● Connect everyone/everything (Internet-of-things)
● Dedicated optical networks (light paths)
  ● 10 Gbit/s, point-to-point

Local Area Networks
● Ethernet: developed by Xerox PARC (1974)
  ● Speed increased from 10 Mbit/s to 100 Gbit/s
● Cluster computers use Ethernet or faster commodity networks
  ● Myrinet
  ● InfiniBand

An aside
● In Computer Science
  ● k(ilo) = 1024
  ● M(ega) = 1024²
  ● G(iga) = 1024³
  ● T(era) = 1024⁴
  ● P(eta) = 1024⁵
  ● E(xa) = 1024⁶
● All has to do with binary numbers

DAS-5
● Dual 8-core Intel E5-2630v3 CPUs
● FDR InfiniBand
● OpenFlow switches
● Various accelerators
● CentOS Linux
● Bright Cluster Manager
● Built by ClusterVision
● [Figure: sites linked by SURFnet7 at 10 Gb/s: VU (68), UvA/MultimediaN (18/31), TU Delft (48), Leiden (24), ASTRON (9)]

Mobile computing
● Laptops, sensors, smartphones, tablets
● Many forms of mobile networks
  ● Wi-Fi (local range)
  ● 3G, 4G (lower bandwidth, high coverage)
  ● Bluetooth (for pairing devices)
● Ultimately: ubiquitous computing?
  ● Vision by Mark Weiser (1988)
  ● “machines that fit the human environment instead of forcing humans to enter theirs”

Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions

Application developments
● There is a “data explosion” in many application areas
  ● Huge amounts of data (up to Petabytes/year)
  ● Very complicated/heterogeneous data
● Demand for computing
  ● Model (simulate) designs on a computer

Data explosion
● Society:
  ● Web, social networks
● Industry, economy:
  ● Banks, stock markets
● Science:
  ● LHC (“Higgs particle”)
    ● Data stored on world-wide “grid”
  ● Bioinformatics (next generation sequencing)
  ● Astronomy: software telescopes (LOFAR, SKA)

Computing demands
● Computational science:
  ● Modeling ozone layer, climate, ocean, human brain
  ● Simulating galaxies
● Engineering:
  ● Aircraft modeling, designing F1 cars (Virgin VR01)
  ● TVs (mostly software), embedded systems
● Games and multimedia:
  ● Computer chess (Deep Blue)
  ● Watson (Jeopardy)
  ● Analyzing multimedia content
  ● Digital forensics
  ● Generating movies

Pixar’s “Up” (2009)
● Whole movie (96 minutes) would take 94 years on 1 PC
  ● 4 frames per day; at 24 frames per second, 1 second of film takes 6 days and 1 minute takes about 1 year

Some fundamental Computer Science topics (1)
● Operating systems:
  ● Windows, Linux, Minix (Andy Tanenbaum)
● Programming languages and systems
  ● Fortran, Cobol, C, Java, Python … (thousands)

What happens if you ask a computer scientist to solve a problem?
He/she will come back 3 months later, with … a new programming language ideally suited for solving your problem

Some fundamental Computer Science topics (2)
● Security
  ● Preventing/detecting attacks, privacy, etc.
● (Semantic) web technology
  ● Finding and reasoning about content on the web
● Cloud computing
  ● Store data and programs remotely, in the Cloud

Some fundamental Computer Science topics (3)
● Artificial intelligence
  ● E.g., automatic machine learning
● Databases
  ● Storing and searching huge amounts of data
● Logic, modelling, graph theory, complexity
  ● Essential for many applications

Conclusion
● Modern Computer Science deals with hectic developments in technology and applications
● Both provide us with many research problems
  ● Application-driven vs technology-driven research
● There are also many fundamental CS problems

Literature (Context)
● Ami Marowka: Back to Thin-Core Massively Parallel Processors, IEEE Computer, December 2011, pp. 49-54

QUESTIONS
● Explain what “thin cores” are
● What are the arguments for and against using “thin cores”?
● Which role does energy consumption play in this discussion?
● Compute the energy efficiency of the current 10 largest supercomputers on www.top500.org
● Which type of machine currently is most energy efficient?
● Compare the maximum performance of the current #1 against the performance of the #1 of 10 years ago. What is the difference?
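
For the energy-efficiency question above, one possible starting point is the sketch below. It assumes that energy efficiency means sustained performance divided by power draw (performance per Watt), and that both values are read off the list on www.top500.org. The function name `gflops_per_watt`, the machine names, and all numbers are hypothetical placeholders for illustration, not real TOP500 entries.

```python
def gflops_per_watt(rmax_tflops, power_kw):
    """Energy efficiency as performance per unit of power.

    rmax_tflops: measured performance in Tflop/s (placeholder input)
    power_kw:    power draw in kW (placeholder input)
    """
    if power_kw <= 0:
        raise ValueError("power must be positive")
    # 1 Tflop/s = 1000 Gflop/s and 1 kW = 1000 W, so the factors cancel
    # and Tflop/s per kW equals Gflop/s per Watt.
    return rmax_tflops / power_kw


# Hypothetical placeholder machines: (name, Tflop/s, kW). NOT real TOP500 data.
machines = [
    ("machine-a", 30000.0, 15000.0),
    ("machine-b", 10000.0, 2500.0),
    ("machine-c", 8000.0, 4000.0),
]

# Rank the machines from most to least energy efficient.
ranked = sorted(
    ((name, gflops_per_watt(rmax, power)) for name, rmax, power in machines),
    key=lambda item: item[1],
    reverse=True,
)

for name, efficiency in ranked:
    print(f"{name}: {efficiency:.2f} Gflop/s per Watt")
```

With these placeholder numbers the fastest machine is not the most efficient one, which is the kind of contrast the questions ask you to look for in the real list.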