Effective Software Systems Research
Al Aho
aho@cs.columbia.edu
Bell Labs
December 6, 2013

What it means to be a systems researcher

Outline
1. Forces shaping the global information networking infrastructure
2. Necessary ingredients for research effectiveness
3. NSF CISE Advisory Council 2025 vision
4. The bottom line

Forces Shaping the Network
• Explosion of diverse end devices
• Shift towards ultra-fast access
• Ultra-fast IP network
• Billions of new cloud services
• Rapid deployment of cloud services

A World Connected: 2.4B Internet Users 6/30/2012*
[Timeline figure: ARPANet, 1969; Internet (RFC 675, TCP/IP), 1974; World Wide Web (HTTP 0.9, Mosaic Browser), 1989]
* www.internetworldstats.com

[Image credit: Intel Corporation]

Cellular Networks, Mobile Devices and Pervasive Computing
• Mobile phones are the only digital system accessible to the majority of the planet.
  – 6.8 billion mobile phone connections globally.
  – 85% of new handsets will be able to access the mobile web: 1 in 5 has access to fast service, 3G or better; 10 trillion messages by 2013.
• Growing ecosystem of tools and applications:
  – Banking, commerce, healthcare, social networking: 1M distinct active apps just in App Store.
  – Mobile browsers can now display much of the content available to their desktop counterparts.
• Mobile payment systems are now common in the developing world.
  – Sensitive and private data stored & entered on devices.
Research Themes:
• Infrastructure scalability
• Spectrum management
• Security and privacy
• Energy consumption
• Advanced networking technologies
[Image credit: Nicolle Rager Fuller, NSF]
F. Jahanian, NSF Qatar ARC’13

Available/Emerging Networking Technologies
• Access
  – Ultrabroadband
  – FTTX
• Networking infrastructure
  – IP bearer service
  – Software defined networks
  – OpenFlow
  – Network Functions Virtualization
  – Cloud computing
  – Content-delivery networks

Business Challenges
• Services and content from whose data centers?
• More diverse devices but at a lower cost per device
• More bandwidth but at a lower cost per bit
• Seamless unification of disparate networks
• Hosting, creating and delivering huge numbers of new services but with lower management costs

Cross-cutting Business Issues
• Role of content
• Better security and privacy
• Reducing energy consumption
• Reducing network complexity
• Improving software quality
• Regulatory issues
• Convergence and interoperability of infrastructures

Software Systems
The world depends on software, but most people are not aware of how much software there already is, how expensive it is to develop and maintain, or how hard it is to get it right.

Software, Software Everywhere
“A conservative estimate is that the world is already using hundreds of billions of lines of software to conduct its affairs.”
A. V. Aho, Software and the Future of Programming Languages, Science, February 27, 2004, pp. 1131-1133

IEEE Spectrum Software Hall of Shame
Year | Company | Costs in US $
2004 | UK Inland Revenue | Software errors contribute to $3.45 billion tax-credit overpayment
2004 | J Sainsbury PLC [UK] | Supply chain management system abandoned after deployment costing $527M
2002 | CIGNA Corp | Problems with CRM system contribute to $445M loss
1997 | U. S. Internal Revenue Service | Tax modernization effort cancelled after $4 billion is spent
1994 | U. S. Federal Aviation Administration | Advanced Automation System canceled after $2.6 billion is spent
R. N. Charette, Why Software Fails, IEEE Spectrum, September 2005.

J. D. Zients On Government IT Projects
“[Government] IT projects too often cost hundreds of millions of dollars more than they should, take years longer than necessary to deploy and deliver technologies that are obsolete by the time they are completed.”
Jeffrey D.
Zients
Management consultant and future economic advisor, Obama White House
From a 2010 internal government memo, quoted on the front page of the NY Times, Monday, November 11, 2013

Effective Software Systems Research
• Two aspects of systems research
  – Experimental
  – Theoretical
• But “If you find that you're spending almost all your time on theory, start turning some attention to practical things; it will improve your theories. If you find that you're spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice.” − Don Knuth

Theory in Practice: Regular Expression Pattern Matching in Perl, Python, Ruby vs. AWK
[Plot: time to check whether a?^n a^n matches a^n, against regular expression and text size n. Here a?^n a^n means n copies of a? followed by n copies of a, and a^n is the string of n a's.]
Russ Cox, Regular expression matching can be simple and fast (but is slow in Java, Perl, PHP, Python, Ruby, ...) [http://swtch.com/~rsc/regexp/regexp1.html, 2007]

What is Effective Software Systems Research?
Research that leads to software systems that have a major impact.
Impact can take a variety of forms:
  – connecting people, e.g. WWW and the Internet
  – transforming an existing area, e.g. spreadsheets for accounting
  – creating a new company, e.g. Google search
  – realigning a corporate strategic plan, e.g. in 1991 Bill Gates realigns Microsoft to pursue an intelligent Internet
  – creating new knowledge, e.g. RSA encryption algorithm
  – giving your employer a decided advantage in the marketplace

Traditional Metrics of Scholarly Effectiveness
• Citations
• Publications in influential journals & conferences
• Awards and honors
• Opinion of peers in letters and interviews

Additional New Metrics
• Article downloads and views
• Influence on policy makers
• Effects on industry and the economy
• Public outreach
The maze of impact metrics, Editorial, Nature, October 16, 2013

Examples of Effective Software Systems Research
1.
First modern personal computer (Thacker; Xerox PARC)
2. TCP/IP (Cerf and Kahn; DARPA)
3. Public-key cryptography (Rivest, Shamir, Adleman; MIT)
4. OO programming (Dahl and Nygaard; Norwegian Computing Center)
5. Transaction processing (Gray; IBM)
6. Interactive computing (Engelbart; SRI)
7. Distributed personal computing environments (Lampson; Xerox PARC)
8. Computer graphics (Sutherland; MIT)
9. Unix and C (Thompson and Ritchie; Bell Labs)
10. Relational model of data (Codd; IBM)
ACM A. M. Turing Award Winners, acm.org

More Examples of Effective Software Systems
1. LLVM (compiler infrastructure written in C++; UIUC)
2. Eclipse (multilanguage IDE written in Java; IBM Canada)
3. VMware (virtualization technology)
4. Secure Network Programming (secure sockets; UT Austin)
5. MAKE (build utility; Bell Labs)
6. Java (general purpose OO language; Sun)
7. SPIN (software model checking system; Bell Labs)
8. The S System (statistical programming language; Bell Labs)
9. NCSA Mosaic (multiplatform browsing tool; NCSA, UIUC)
10. WWW (network-oriented hypermedia system; CERN)
ACM Software System Awards, acm.org

Effective Programming Languages
1. C
2. Java
3. Objective-C
4. C++
5. PHP
6. C#
7. (Visual) Basic
8. Python
9. Transact-SQL
10. JavaScript
Tiobe Programming Community Index, www.tiobe.com, November 2013

Three Necessary Ingredients for Effectiveness
1. Your idea should solve an important problem
  • George Heilmeier: “Answer my catechism questions!”

The Heilmeier Catechism Questions
[Photo: George H. Heilmeier]
1. What are you trying to do? Articulate your objectives using absolutely no jargon.
2. How is it done today and what are the limits to current practice?
3. What’s new in your approach and why do you think it will be successful?
4. Who cares?
5. If you are successful, what difference will it make?
6. What are the risks and the payoffs?
7. How much will it cost?
8. How long will it take?
9.
What are the midterm and final “exams” to check for success?
http://en.wikipedia.org/wiki/George_H._Heilmeier

Three Necessary Ingredients for Effectiveness
1. Your idea should solve an important problem
  • George Heilmeier: “Answer my catechism questions!”
2. You need to teach others how to use your idea
  • Richard Hamming: “You not only need to do good work you must also teach others how to use your work.”
3. To be effective in the near future your idea needs a route to the sea
  • AWK case study

AWK: a simple pattern-action language for common data-processing applications
Paradigm problem: Given a list of name-value pairs, print the total value associated with each name.
Input:
    alice 10
    eve 20
    bob 15
    alice 30
An AWK program is a sequence of pattern-action statements:
    { total[$1] += $2 }
    END { for (x in total) print x, total[x] }
Output:
    eve 20
    bob 15
    alice 40

AWK’s Route to the Sea
• AWK solved common data-processing problems simply
• AWK came with Unix and was easy to combine with other Unix commands
• AWK was (and still is!)
small and easy to learn
• On-line man page, tutorials and book showed how to use AWK to solve practical problems
• An initial set of enthusiastic AWK users provided valuable feedback and acted as evangelists
• We evolved AWK over a decade to meet new user needs

Inspiring Innovation: The Programming Languages and Translators Course at Columbia University
• In PLT you will learn the syntactic and semantic elements and the computational models of the most important modern programming languages as well as the algorithms and techniques used by compilers to translate them into machine and other target languages. The course will cover imperative, object-oriented, functional, logic, and scripting languages, as well as trends in the evolution of programming languages.
• A highlight of this course is a semester-long programming project in which you will work in a small team to create and implement an innovative little language of your own design. This project will teach you computational thinking in language design as well as project management, teamwork, and communication skills that you can apply in all aspects of your career.

A Sampling of PLT Languages
• What to Wear: personalized wardrobe recommendations
• Swift Fox: configuring wireless sensor networks
• Trowel: a webscraping language for journalists
• Upbeat: a language for auralizing data
• Suds: a language for shared-session telecom services
• Q-HSK: a language for teaching quantum computing
http://www.cs.columbia.edu/~aho/cs4115/

Phases of a Compiler
[Diagram: source program → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree → Semantic Analyzer → annotated syntax tree → Interm. Code Gen. → interm. rep. → Code Optimizer → interm. rep. → Code Gen. → target program, with a Symbol Table alongside the phases]
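The phase pipeline on the "Phases of a Compiler" slide can be sketched in a few dozen lines of Python. This is only an illustrative toy, not material from the talk: the expression language (integers, `+`, `*`, parentheses), the stack-machine target, and the function names `tokenize`, `parse`, `generate`, and `run` are all assumptions made for the example, and the semantic analyzer, optimizer, and symbol table are left out.

```python
import re

def tokenize(source):
    """Lexical analyzer: source program -> token stream."""
    tokens = []
    for text in re.findall(r"\d+|\S", source):
        if text.isdigit():
            tokens.append(("num", int(text)))
        elif text in "+*()":
            tokens.append((text, text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens

def parse(tokens):
    """Syntax analyzer: token stream -> syntax tree (nested tuples)."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():          # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():          # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():        # factor -> num | '(' expr ')'
        kind = peek()
        if kind == "num":
            return advance()
        if kind == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {kind!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

def generate(tree):
    """Code generator: syntax tree -> instructions for a toy stack machine."""
    if tree[0] == "num":
        return [("push", tree[1])]
    op, left, right = tree
    return generate(left) + generate(right) + [("add" if op == "+" else "mul", None)]

def run(code):
    """Execute the 'target program' on the toy stack machine."""
    stack = []
    for instr, arg in code:
        if instr == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr == "add" else a * b)
    return stack.pop()

if __name__ == "__main__":
    print(run(generate(parse(tokenize("2 + 3 * (4 + 1)")))))  # 17
```

Each function consumes exactly what the previous one produces, which is the point of the diagram: the phases communicate only through the token stream, the syntax tree, and the generated code.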
Front End Compiler Component Generators
[Diagram: lex specification → Lexical Analyzer Generator (LEX) → Lexical Analyzer; yacc specification → Syntax Analyzer Generator (YACC) → Syntax Analyzer; source program → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree]

The PLT Course Project at Columbia
Week | Task
2 | Form a team of five and design an innovative new language
4 | Write a whitepaper on your proposed language modeled after the Java whitepaper
8 | Write a tutorial patterned after Chapter 1 and a language reference manual patterned after Appendix A of Kernighan and Ritchie’s book, The C Programming Language
14 | Give a ten-minute presentation of your language to the class
15 | Give a 30-minute working demo of your compiler to the teaching staff
15 | Hand in the final project report

Individual Roles on the Project Team
• Project manager
  – sets the project schedule, holds weekly meetings with the entire team, maintains the project log, and makes sure the project deliverables get done on time.
• Language and tools guru
  – defines the baseline process to track language changes and maintain the intellectual integrity of the language.
  – teaches the team how to use various tools used to build the compiler.
• System architect
  – defines the compiler architecture, modules, and interfaces.
• System integrator
  – defines the system platform and makes sure the compiler components work together.
• Tester and validator
  – defines the test suites and executes them to make sure the compiler meets the language specification.

Telling Lessons Learned by PLT Students
• “Designing a language is hard and designing a simple language is extremely hard!”
• “During this course we realized how naïve and overambitious we were, and we all gained a newfound respect for the work and good decisions that went into languages like C and Java which we’ve taken for granted for years.”

Where is the Future Headed?
Bell’s Law: Birth and Death of Computer Classes
[Figure: computers per person (axis marked 1:10^6, 1:10^3, 1:1, 10^3:1) versus years, for successive computer classes: Mainframe, Mini, Workstation, PC, Laptop, PDA, Cell, Mote, ???]
G. Bell, Bell’s Law for the Birth and Death of Computer Classes, CACM, Jan 2008, pp. 86-94

Innovation Drivers/Trends for 2025
• Extreme personalization
  – Health/medical, career, mobility, lifestyle, education, language, data, entertainment, …
• Extreme augmentation
  – Memory (perfect), data (always accessible and unbounded), communication (always on), analytics (as/when needed, AI/ML on demand…), multi-locations (virtual presence), …
• Extreme human-world integration
  – In-body devices (vs. on-body or external), brain and sensorial interfaces, seamless virtual+real worlds …
  – Collective systems (e.g. crowdsourcing), assisted+assistive robotics, “green” living, disaster management …
Vision 2025, NSF CISE AC

The Dark Side of the Internet: Rise of Politically Motivated Attacks, Cyber War, Censorship, Hacktivism and Cyber Espionage

Cyber Security Challenges
• Attacks and defenses co-evolve: a system that was secure yesterday might no longer be secure tomorrow.
• The technology base of our systems is frequently updated to improve functionality, availability, and/or performance. New systems introduce new vulnerabilities that need new defenses.
• The environments in which our computing systems are deployed and the functionality they provide are dynamic, e.g. cloud computing, mobile platforms.
• As automation pervades new platforms, vulnerabilities will be found in critical infrastructures, automotive systems, smart grids, medical devices, transportation systems.
• The sophistication of attackers is increasing as well as their sheer number and the specificity of their targets.
• Cyber security is a multi-dimensional problem requiring expertise from computer science, mathematics, economics, public policy, behavioral and social sciences.
[Image credit: ThinkStock]
F.
Jahanian, NSF Qatar ARC’13

The Bottom Line
Software systems will play a key role in shaping the technology and industry structure of the global information infrastructure of the future.

But Most Importantly
You can’t do effective software systems research without a critical mass of great software systems researchers.
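For readers who do not know AWK, the paradigm problem from the AWK case study (total the values associated with each name) can be mirrored in Python for comparison. This is a sketch rather than anything from the talk: the function name `total_by_name` is made up, and it assumes every non-blank line holds a name followed by an integer value, whereas AWK would coerce any second field to a number.

```python
from collections import defaultdict

def total_by_name(lines):
    """Sum the second field of each line, grouped by the first field.

    Mirrors the AWK action { total[$1] += $2 }.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()          # AWK splits fields on whitespace too
        if len(fields) < 2:
            continue                   # assumption: skip blank or short lines
        totals[fields[0]] += int(fields[1])
    return dict(totals)

if __name__ == "__main__":
    sample = ["alice 10", "eve 20", "bob 15", "alice 30"]
    # Mirrors END { for (x in total) print x, total[x] }
    for name, total in total_by_name(sample).items():
        print(name, total)
```

The Python version needs an import, an explicit loop, and explicit field splitting; the AWK program gets the input loop, field splitting, and numeric coercion implicitly, which is part of what the case study means by solving common data-processing problems simply.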