Advanced Programming Practices
SOEN 6441/1 CC

Steven Winikoff
steven.winikoff@concordia.ca
EV-3.301
514-848-2424, ext. 7619

Motivation
Concordia promises "real education for the real world".

When I got out of school, I thought I was the best programmer in the world. I could write an unbeatable tic-tac-toe program, use five different computer languages, and create 1000-line programs that worked. Then I got out into the real world. My first task was to read and understand a 200,000-line FORTRAN program, then speed it up by a factor of two. Any Real Programmer will tell you that all the structured coding in the world won't help you solve a problem like that – it takes actual talent.
- Ed Post, Real Programmers Don't Use Pascal

Rule Number One
When in doubt, ASK!
or
There's no such thing as a stupid question.

What This Course is NOT About
● how to program – you already know that, or you shouldn't be here

What This Course IS About
● the practice of programming
● "tradecraft"

Why a Project?
Oftentimes, I like to think that what my students learn stems right from my explanations and examples in lecture. But one night of office hours watching them work through an assignment reminds me that most of the learning happens at those key, difficult-yet-inescapable points in the assignments.
Lecture is left to play its supporting role of explanation and motivation, but in my heart, I suspect that lecture is theater compared to the learning that goes on when the students actually write the code.
- Nick Parlante, Stanford University, 2004

Evaluations and Grading
● midterm exam (20%)
  ● week 4, i.e. July 22nd
● final exam (30%)
  ● during the final exam period, August 13th-19th
● project (15% + 15% + 20%)
  ● teams of five
  ● three deliverables: July 15th, July 30th, August 11th
    – each one must include actual working code and documentation, representing an increment of the final work
    – each increment is graded independently
  ● form teams by July 8th!
The Practice of Programming
● problems of writing and managing code
● managing complexity
● pragmatic programming
● coding conventions and software documentation
● software configuration management
● advanced debugging techniques
● tools and techniques for testing software
● multithreading concurrency
● code reuse in software development
● quality in coding, fault tolerance

Scaling Up
● a program that you write for your own use
  ● Nobody but you cares what it does or how it's written.
● a program that you write for another person (e.g. an assignment)
  ● You have to code to somebody else's specification.
  ● Getting it right depends on communication with the user.
● a program that's too large to manage by yourself
  ● ...and you also have to coordinate and communicate with other programmers.
● a small industrial contract
  ● Not only many programmers, but many users, with different expectations.
  ● Enhancements and corrections will be needed.
● a large project (e.g. telecommunications, aerospace, etc.)
  ● Typically up to tens of millions of lines of code, thousands of person-years to write, 20% changes per year, thousands of sites with different versions in use simultaneously, more than a day for a complete compile...

That's Not an Exaggeration
approximate operating system sizes:
● Windows XP: 40,000,000 lines of code
● Windows Vista: 50,000,000 lines of code
● Linux: 204,500,000 lines of code!
  ● refers to Fedora 9 as of October 2008
  ● includes the kernel plus ~5500 application packages
  ● development time estimated at 60,000 person-years
  ● see http://www.dwheeler.com/sloc/

"...and managing code"
● The lifetime of a program can range from less than a day (use it once, then throw it away) to more than 30 years.
● The effort required for maintenance ranges from zero to about 80% of the overall cost of the system.
● At some point, a program becomes too large to hold in your head all at once.
● The point where this will happen varies depending on the program and on the programmer.
● ...but no matter what your limit is, you do have one!

Details, Details...
Be as systematic as humanly possible. Draw maps. Make charts. Keep lists. I did as much of this as I could, but inspiration kept getting in the way. My lists would become obsolete, and then I had to decide whether to take the time to update the list, or whether to spend the time actually working on the game.
Months later, I'm sitting in front of the computer thinking, "Did I call that object Winter_Coat, Coat, or Overcoat?" I'm wasting more time opening up various files of source code and searching for text strings than I would have if I'd kept a good, alphabetized list.
- Jim Aikin, from Lessons Learned the Hard Way / Tried-and-true advice for creating your first game
http://www.xyzzynews.com/xyzzy.18d.html

Software Doesn't Wear Out...
...or does it?
● Once it's running correctly, in theory a program should continue to run correctly forever – unlike hardware, which eventually wears out as components age and fail.
● ...but the real world is never that simple: software does deteriorate over time. Why?

Software Doesn't Age Gracefully
● The environment a program lives in is subject to change, as the operating system and libraries it depends on are upgraded or replaced.
● Often the hardware changes also. ("640K ought to be enough for anybody"?)
● The requirements the program was designed to meet evolve.
"Unix is like the state of New Jersey."
- Richard Rashid, developer of Mach

The End of the World As We Know It
The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents.
- Nathaniel Borenstein

If Only It Were This Easy :-)
source: http://xkcd.com/534/
("Just make sure you don't have it maximize instead of minimize.")

Some Things Never Change
The distinctive concerns of software engineering are:
● how to design and build a set of programs into a system
● how to design and build a program or system into a robust, tested, documented and supported product
● how to maintain intellectual control over complexity in large doses
- Fred Brooks, 1978

Approaches to the Problem
● Managing complexity is what software engineering is all about.
● No sufficiently good, sufficiently general solution yet, but...
  ● theories of programming
    ● none, structured, object-oriented, aspect-oriented, ...
  ● tools
    ● profilers, debuggers, disassemblers, IDEs, documentation generators, code generators, revision control systems, ...
  ● development methodologies
    ● ad hoc, waterfall, incremental, spiral, scrum, extreme programming, unified process, ...

Five Easy Pieces
No matter what the methodology, the basic components are the same:
● requirements analysis and specification
● design
● implementation
● testing (validation and verification)
● maintenance
...but the way these are combined can make a huge difference to the development process.

One Piece at a Time
● The key point of iterative development is that every iteration is a working piece of code.
● Early iterations are unlikely to be useful, but they compile and run.
● Each new iteration adds new functionality or extends existing functionality.
● ...but only a bit at a time, which reduces complexity.
● This is the approach we'll take for the project.

Feedback is Good For You
● One major source of complexity is evolving (or misunderstood!) requirements.
● Incremental development means that the users see working code sooner.
● ...so they don't have to wait until the product is done to request changes.

Some Incremental Disadvantages
● Each new build must be incorporated into the existing structure without degrading the quality of what has been built already.
● Problems in subsequent builds may require the existing code base to be reorganized.
● Discipline is required to prevent the model from reverting to build-and-fix.
● Users see possibilities and want to change the requirements. (Or is this actually an advantage?)

Theory vs. Practice
“Wizards don’t scare me. Everyone knows there’s a rule that you mustn’t use magic against civilians.” The man thrust his face close to Ridcully and raised a fist.
Ridcully snapped his fingers. There was an inrush of air, and a croak.
“I’ve always thought of it more as a guideline,” he said, mildly. “Bursar, go and put this frog in the flower bed and when he becomes his old self give him ten dollars.”
- from Soul Music by Terry Pratchett

● The pragmatic approach to programming recognizes that no one theoretical methodology is always right.
● ...so don't think of them as rules, think of them as guidelines
● ...and don't be afraid to mix and match, taking the most useful ideas from every source you encounter.

Being Consistently Stylish
● Coding conventions are all about consistent programming style.
● There is no one perfect programming style!
● ...but consistency helps programmers work together effectively.

What's In a Style?
Coding conventions generally include
● file naming and organization
● indentation, alignment and white space
● comments
● identifier names
● programming practices
  ● use of side effects, cleverness in general, etc.

Style Matters
Consider this code fragment:

    if ( (country == SING) || (country == BRNI) ||
         (country == POL) || (country == ITALY) ) {
        /*
         * If the country is Singapore, Brunei or Poland
         * then the current time is the answer time
         * rather than the off hook time.
         * Reset answer time and set day of week.
         */
        ...

source: The Practice of Programming, Kernighan and Pike

What's wrong with this?
● Why isn't Italy mentioned in the comment?
● If the comment and the code disagree, which one is right?

Being Stylish
● This example is typical of working code.
● It's good, but it could be better.
● Well-written code is easy to read, which means easy to understand and therefore easy to modify.
● You may think that you already understand your code, and that nobody but you will ever read it.
● If that's what you think, maybe you're right.
● ...but you're also wrong:

Being Stylish – Why Do We Care?
● In the end, programming style is about writing code that's easy to read.
● Easy to read means easier to modify.
● ...but it also means code that's more likely to be correct in the first place.
● Good style can – and should – become a habit!

Source Code Control
● Configuration management means keeping control over your code:
  ● knowing what changes were made
  ● knowing when they were made
  ● knowing who made them
  ● being able to undo them as needed
  ● being able to keep the pieces of a system consistent
● One of the earliest revision control systems was actually called sccs – the "source code control system"

Debugging Strategy
● writing code defensively ("Don't be too clever.")
● reading the code
● using tools to analyze the code
● instrumenting the code (the Wolf Fence Algorithm)
● debugging libraries, e.g. for memory allocation
● interactive debuggers (standalone or part of an IDE)

Debugging Approaches
Tools help, but they don't replace human intelligence.
● intuition: Perhaps the symptom itself suggests to you what must be wrong, or perhaps you know that the program was working until the most recent change, or...
● brute force: Generate a memory dump and/or many intermediate output statements, and read through them exhaustively until you find what's wrong.
● backtracking: Begin with the first known symptom, and work backward through the code to see what must have caused it.
● cause elimination: Based on inductive or deductive reasoning: form a hypothesis of what might be happening, and find a way to prove or disprove it, based on gathering data. This is similar to medical diagnosis and testing.

The "Wolf Fence" Algorithm for Debugging*
The "Wolf Fence" method compels attention to that portion of the program containing the error. It is described as follows:
0. Somewhere in Alaska there is a wolf.
1. You may build a wolf-proof fence partitioning Alaska as required.
2. The wolf howls loudly.
3. The wolf does not move.

The procedure is then:
0. Let A be the territory known to contain the wolf (initially all of Alaska).
1. Construct a fence across A, along any convenient natural line that divides A into B and C.
2. Listen for the howls; determine if the wolf is in B or C.
3. Go back to Step 1 until the wolf is contained in a tight little cage.

* From Communications of the ACM, vol. 25, no. 11 and vol. 26, no. 2 (November 1982 and February 1983). This material is copyrighted by the Association for Computing Machinery, and is used by permission for noncommercial purposes.
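The fence-building procedure above is essentially a bisection, so it can be sketched in a few lines of C. This is only an illustration under stated assumptions: the program is modelled as a sequence of numbered steps, the caller supplies a check that reports whether the state is still correct after a given number of steps, and the error stays put once it appears ("the wolf does not move"). The names wolf_fence, state_ok_fn and ok_until_step_7 are hypothetical and do not come from the ACM articles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check supplied by the caller: returns nonzero if the
 * program state is still correct after the first `steps_run` steps. */
typedef int (*state_ok_fn)(size_t steps_run);

/*
 * Returns the 0-based index of the step that introduced the error.
 * Assumptions: the state is correct before any step has run, wrong
 * after all `total` steps, and once wrong it stays wrong.
 */
size_t wolf_fence(size_t total, state_ok_fn state_ok)
{
    assert(total > 0);
    assert(state_ok != NULL);

    size_t lo = 0;      /* state known to be good after `lo` steps */
    size_t hi = total;  /* state known to be bad after `hi` steps  */

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;  /* build a fence across A */
        if (state_ok(mid))
            lo = mid;   /* no howl yet: the wolf is in the later part */
        else
            hi = mid;   /* howl heard: the wolf is in the earlier part */
    }
    return lo;          /* the cage: step `lo` is the culprit */
}

/* Hypothetical example check: pretend step 7 corrupts the state, so
 * everything is still fine after running 7 or fewer steps. */
int ok_until_step_7(size_t steps_run)
{
    return steps_run <= 7;
}
```

With 100 steps, wolf_fence(100, ok_until_step_7) needs only seven checks to cage the wolf at step 7, instead of inspecting every step in turn.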
You and Your Wolf Fence
● Of course, "Alaska" is your program, "wolves" are really semantic or logic errors, and "fences" are just output statements that print the values of important variables – and also the line where the output statement occurs.
● The wolf’s “howling” occurs when the printed value of a variable is different from what you expect.
● By putting appropriate output statements in the right places, we can determine what part of the program is causing the error.
● You can protect these output statements by using a boolean constant or preprocessor macro to control whether or not they should be performed.

Not All Errors are Wolves
Continuing the "Wolf Fence" metaphor, not all errors behave the same way. Some other types include
● foxes, which never howl
● coyotes, which howl loudly until you get near them, then they become silent
● ...and worst of all, the “Cheshire cat”, which intermittently appears and disappears, and which imitates other animals when it is not invisible

Get Help!
● Sometimes the easiest way to debug a program is to enlist a fresh pair of eyes.
● Generally it's much more difficult to debug your own work, just as it's much more difficult to proofread your own writing. In both cases, it's too easy to see what you expect to see rather than what's actually there.
● This applies no matter how experienced and skilled you are. It's always easier to debug somebody else's code, even if your name is Donald Knuth or Linus Torvalds.

When You've Found What's Wrong...
The joy of debugging is finally finding the cause of the problem.
...but what next? Some things to consider:
● How can the problem be fixed without introducing any new problems anywhere else?
● Is the same problem likely to be found elsewhere in the code? This can happen if the same logic pattern is used in more than one place.
● How could this problem have been prevented in the first place? It's always a good idea to build up your defensive coding skills.
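As a sketch of the guarded output statements suggested under "You and Your Wolf Fence", the C fragment below wraps a "fence" in a preprocessor macro that prints the file, line, variable name and value. It is one possible arrangement rather than a prescribed one: the switch WOLF_FENCE, the macro FENCE_INT and the function sum_to are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: WOLF_FENCE is a hypothetical build-time switch.
 * Compile with -DWOLF_FENCE=1 to turn the fences on. */
#ifndef WOLF_FENCE
#define WOLF_FENCE 0
#endif

/* A "fence": reports where it stands and the value of one int variable.
 * When WOLF_FENCE is 0 the condition is constant false, so the output
 * statement is never performed. */
#define FENCE_INT(var)                                            \
    do {                                                          \
        if (WOLF_FENCE)                                           \
            fprintf(stderr, "%s:%d: %s = %d\n",                   \
                    __FILE__, __LINE__, #var, (var));             \
    } while (0)

/* Hypothetical function under investigation: sums 1..n. */
int sum_to(int n)
{
    assert(n >= 0);     /* defensive check on the input */

    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
        FENCE_INT(total);   /* listen for the howl after each step */
    }
    return total;
}
```

Because the fences stay in the source but cost nothing when switched off, they can be re-enabled the next time the same region misbehaves.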
Some Fundamental Truths
● Testing can confirm that errors exist.
● Testing cannot prove that no errors exist.
  "As far as we know, our computer has never had an undetected error."
  - Weisert
● Wherever possible, use tools and techniques that prevent errors, rather than depend on finding and fixing errors. (example: consider C++ vs. Java)

When Are We Finished Testing?
Answer #1:
● Testing is never finished! It's just that at some point we stop testing and the customer takes over.
Answer #2:
● Testing ends when we run out of time or money.
Answer #3:
● No, we cannot be absolutely certain that the software will never fail, but relative to a theoretically sound and experimentally validated statistical model, we have done sufficient testing to say with 95 percent confidence that the probability of 1000 CPU hours of failure-free operation in a probabilistically defined environment is at least 0.995.
  - Musa and Ackerman, 1989

A Word About Error Rates
Given a fixed level of programming skill, the number of vulnerabilities in software is directly proportional to the number of lines of code and inversely proportional to the length of time the software has been in wide use.
- The SANS Institute Consensus Security Vulnerability Alert, Vol. 6, No. 47 (2007/11/19)
What assumptions is this statement based on? Is it universally true?

Test-Driven Development
● The main principle of TDD: Test code is written first, before the code to be tested.
● The basic rhythm is:
  (1) write a little bit of test code
  (2) write a little bit of production code
  (3) test the new production code, and refine it until it passes
  (4) repeat from step (1)
● ...but this is a lot of work to do manually. Fortunately tools exist to help automate the process.

What to Do When a Test Fails
● A failed test indicates an error in the code, which means it's time for debugging.
● Part of the problem is that the immediate symptom may be only distantly related to the actual cause of the problem.
● Sometimes, errors uncovered in unit testing aren't even in your own code!
● Errors may be difficult to reproduce, especially if related to timing.
● Distributed systems make things even more interesting. :-/

Distributed Systems
A distributed system is one in which I cannot get something done, because a machine I've never heard of is down.
- Leslie Lamport
● Now that multi-core CPUs are ubiquitous, distributed programming is becoming more important all the time.
● Concurrency and multithreading introduce specific issues of their own.
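One of those issues is the "lost update". The C sketch below does not create real threads. Instead, as a simplifying assumption, it replays by hand one unlucky interleaving of two workers that each try to add 1 to a shared counter, so the outcome is the same on every run. The names worker, load, store_incremented, run_interleaved and run_serialized are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Each simulated worker performs "counter = counter + 1" in two separate
 * steps, the way a real thread does without synchronization. */
struct worker {
    int local;  /* the worker's private copy of the counter */
};

static void load(struct worker *w, const int *counter)
{
    assert(w != NULL && counter != NULL);
    w->local = *counter;            /* step 1: read the shared value */
}

static void store_incremented(const struct worker *w, int *counter)
{
    assert(w != NULL && counter != NULL);
    *counter = w->local + 1;        /* step 2: write back copy + 1 */
}

/* Unlucky schedule: both workers read before either one writes. */
int run_interleaved(void)
{
    int counter = 0;
    struct worker a, b;

    load(&a, &counter);
    load(&b, &counter);               /* b reads the stale value 0 */
    store_incremented(&a, &counter);  /* counter becomes 1 */
    store_incremented(&b, &counter);  /* counter is overwritten with 1 */
    return counter;                   /* one of the two updates is lost */
}

/* Safe schedule: each read-modify-write completes before the next starts,
 * which is what a lock around the increment would enforce. */
int run_serialized(void)
{
    int counter = 0;
    struct worker a, b;

    load(&a, &counter);
    store_incremented(&a, &counter);
    load(&b, &counter);
    store_incremented(&b, &counter);
    return counter;                   /* both updates survive */
}
```

In a real multithreaded program the scheduler chooses the interleaving, so the bad schedule may show up only occasionally. That is why timing-related errors are so hard to reproduce.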
The Three Great Virtues
“We will encourage you to develop the three great virtues of a programmer: laziness, impatience, and hubris.”
- Larry Wall, Tom Christiansen and Randal Schwartz (in Programming Perl)

Laziness is a Virtue
laziness:
The quality that makes you go to great effort to reduce overall energy expenditure.
It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don't have to answer so many questions about it.

Impatience is a Virtue
impatience:
The anger you feel when the computer is being lazy.
This makes you write programs that don't just react to your needs, but actually anticipate them.

Hubris is a Virtue
hubris:
Excessive pride, the sort of thing Zeus zaps you for.
Also the quality that makes you write (and maintain) programs that other people won't want to say bad things about.

Code Reuse
Code reuse is an aspect of constructive laziness.
In practice, that means
● Before writing a piece of code, check to see whether what you need is already available in a library.
● When you must write new code, make it as general as possible to allow it to be reused in other projects.
● Writing reusable code requires discipline: you have to think beyond the immediate purpose of the project.
● Fortunately, the same habits which promote good code also promote reusable code.
● ...and modern languages also provide support for reusability, in the form of templates and generic classes.

One Step Further...
When you can't reuse existing code, can you at least reuse (part of) an existing design?
If the problem you're trying to solve isn't unique, the solution probably isn't unique either.
This is where design patterns can be helpful.

Fault Tolerance
Fault tolerance means doing something reasonable no matter what happens.
Even if the user's cat does walk across the keyboard, even if the user trips over the power cord and pulls it out, or the machine's battery dies, or...

We have the most thorough test guy in the world [...] I showed him this program and he asked, “But Rob, what if time runs backward?”
- Rob Kolstad (kolstad@sun.com)

...but what if time does run backward?
Consider these entries from a log file on a Unix system:

    04:06:02 sendmail[10665]: i22962vH010665: [...]
    04:06:02 sm-mta[10667]: i22962Kx010667: [...]
    04:06:02 sendmail[10665]: i22962vH010665: [...]
    04:06:16 sm-mta[10669]: i22962Kx010667: [...]
    04:03:02 sm-mta[11180]: i22JjAKx011180: [...]
    04:03:22 sm-mta[11181]: i22JjAKx011180: [...]
    04:05:19 sm-mta[11231]: i22L5KKx011231: [...]
    [...]

Oops, looks like the system clock was reset!

Time Always Runs Backward!
Specifically, once a year. For example, on November 2nd, 2014 (the end of daylight saving time in North America), the system clock on your computer behaved like this:

    [...]
    01:59:57
    01:59:58
    01:59:59
    01:00:00
    01:00:01
    01:00:02
    [...]

Expect the Unexpected
● The key to fault tolerance is anticipating what might go wrong, and deciding what to do about it in advance.
● This often relies on exception handling in modern programming languages.

Bugs (p. 1 of 6)
Round the chattering printer,
The stories that are told
Of programs and their lurking bugs
Would make your blood run cold.
It's the same the whole world over,
From Apple to Big Blue
And I swear upon a stack of cards
These tales I tell are true.

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

written and performed by Steve Savitzky
http://steve.savitzky.net/

Bugs (p. 2 of 6)
Columbia stood ready
For her first trip to the sky.
America's first shuttle,
With the whole world standing by.

With thirty seconds left to go,
A warning flag unfurled.
And it took them all next week to find
The Bug Heard Round the World.

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

Columbia first flew in 1981.

Bugs (p. 3 of 6)
They were heading for the tropics
On a long-range testing flight.
The crew on board the brand-new jet
Thought things were working right.

'Til they went past the equator
And the plane flipped upside down.
They damn near took the software team
And ran them out of town.

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

The plane in question was the U.S. F-16 fighter jet. Fortunately this problem was discovered in a simulation, not on an actual flight.
see volume 3, issue 44 of the Risks Digest
http://catless.ncl.ac.uk/Risks/3.44.html

Bugs (p. 4 of 6)
The tapeworm laid the network low.
It spread itself around
Through loopholes in the system code
Its programmer had found.

But not all of the bugs it found
Were relics from the past.
One more bug made the tapeworm spread
A thousand times too fast.

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

The Robert T. Morris worm nearly shut down the whole internet, in 1988. It was never intended to propagate as widely as it did.
for details, do a web search for “Morris worm”

Bugs (p. 5 of 6)
The century was ending,
Everybody knew the date.
Fixing bugs involving two-byte date fields
Simply couldn't wait.

But in the week that followed
Many fools were heard to moan,
“I could have fixed that program too,
If only I had known”.

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

Bugs (p. 6 of 6)
And when the final program's run
And all its data saved,
They'll take the last dead programmer
And lay him in his grave.

And the very last bug left in sight,
A cockroach passing by,
Will walk across his coffin there,
As if to say “Nice try!”

And it's bugs, bugs, bugs, bugs,
Bugs, bugs, bugs.
There's always one more bug.

Bugs: the epilog
There are far too many similar stories everywhere you look, from the trivial to the life-threatening.
Suggested further reading includes:
● the Risks Digest: http://catless.ncl.ac.uk/Risks
● Computer World's page of epic failures: http://www.computerworld.com/s/article/9183580/Epic_failures_11_infamous_software_bugs
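
Returning to the earlier question of what a program should do if time runs backward: the C sketch below shows one way to "do something reasonable" when two wall-clock readings arrive in the wrong order. It is an illustration, not a recommendation from the course. The function name elapsed_seconds and its flag parameter are hypothetical, and clamping to zero is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Returns the number of seconds from `start` to `end`, based on two
 * wall-clock readings. If the clock was set back in between (so `end`
 * is earlier than `start`), the result is clamped to 0 instead of going
 * negative, and *clock_went_backward is set to 1 so that the caller can
 * log the anomaly. The flag pointer may be NULL if the caller doesn't care.
 */
double elapsed_seconds(time_t start, time_t end, int *clock_went_backward)
{
    double delta = difftime(end, start);    /* end - start, in seconds */

    if (delta < 0.0) {
        if (clock_went_backward != NULL)
            *clock_went_backward = 1;
        return 0.0;                         /* something reasonable */
    }
    if (clock_went_backward != NULL)
        *clock_went_backward = 0;
    return delta;
}
```

On POSIX systems, an alternative that avoids the problem at its source is to measure intervals with a monotonic clock (clock_gettime with CLOCK_MONOTONIC) rather than with wall-clock time.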