An essay on the craft of programming

Nuno A. Fonseca
e-mail: nf@ncc.up.pt
November 29, 2006
1 Introduction
Programming is a kind of art. Why? Well, it is like painting in the sense that the
programmer starts with a blank sheet and, with a combination of science, art and craft,
sketches the overall shape of the program and then fills in the details. As painters need
to know when to stop working on the details, a programmer should also know when to stop
refining and embellishing a program - it can never be perfect.
A programmer is a listener, adviser, interpreter and dictator in the process of making a
computer do something. The programmer tries to capture elusive requirements and express
them using some programming language so that the computer does what it is expected to do.
In the process, a programmer also tries to document the work, so that others (and himself!)
can understand it, and to develop the work so that others can build on it. Furthermore,
all this is done against the clock.
Software development is usually a long process that takes time (and money). Programmers spend most of their time not developing but debugging, fixing bugs, and
maintaining their programs.
Being a good programmer [1] is a difficult and noble aspiration. Several issues related
to programming are important, namely designing, testing and debugging, among other things.
Here we will enumerate some good programming practices that, when followed, may reduce
the development time, including time spent in debugging and maintenance. The outcome
of following good programming practices is also the production of good code (a somewhat
subjective notion).
This essay enumerates several good programming practices and programming rules.
These practices and rules are general, and therefore can be applied to most programming
languages.
2 Programming Rules
A pragmatic philosophy of programming embraces a bottom-up strategy (as opposed to
top-down): complex programs (systems) are constructed by building a set of simpler components
(functions, methods or programs) that perform simple things, but do them well. A
paradigmatic example is the Unix philosophy: write programs that do one thing and do it
well; write programs to work together; and write programs to handle text streams because
that is a universal interface.
Several rules of good programming practice are outlined next (most of them are
found in [2]). The terms program and function are used interchangeably.
1. Modularity Rule
(a) Write simple parts and connect them by clean interfaces;
(b) Modularity reduces debugging time (which often dominates development
time).
2. Clarity Rule
(a) Clarity is better than cleverness;
(b) Code that is clear is less likely to break and is easily comprehended by the next
person who has to change/fix it (who may be you);
(c) Remember: Software maintenance is important (and expensive).
3. Simplicity Rule
(a) Main motto: “small programs/functions are beautiful”!
(b) Design for simplicity - add complexity only when strictly necessary;
4. Parsimony Rule
(a) Write a big program/function only when it is clear by demonstration that nothing else will do;
(b) A program/function is big in the sense of large in volume of code and/or of
internal complexity.
5. Composition Rule
(a) Write programs that can be connected to other programs;
(b) Bigger programs can be a composition of smaller ones;
(c) Connection can be made in many ways, namely through interprocess communication mechanisms or simply by using text streams.
6. Separation Rule
(a) Separate policy from mechanism and separate GUI from engines;
(b) It allows changing the policy or GUI without destabilizing the mechanism/engines.
7. Transparency Rule
(a) A program is transparent if one can look at it working and see what it is doing
and how;
(b) A program is discoverable when it has facilities for monitoring and displaying
the internal state;
(c) Design for visibility to make inspection and debugging easier.
8. Robustness Rule
(a) Software is robust when it performs well under unexpected conditions that
stress the designer’s assumptions;
(b) Robustness results from transparency and simplicity;
(c) Unfortunately, most software systems are fragile and buggy.
9. Representation Rule
(a) Data is more tractable than program logic (e.g. a diagram of a 20-node tree is
more easily understood than the flowchart of a program that generates the tree);
(b) Look for ways to shift the complexity from code to data;
(c) Fold knowledge into data so program logic can be simple and robust.
10. Least surprise Rule
(a) In interface design always do the least surprising thing;
(b) Interfaces should follow the KISS principle - Keep It Simple, Stupid.
11. Silence Rule
(a) When a program has nothing interesting to say, it should say nothing (no spurious messages).
12. Repair Rule
(a) When the program fails, fail noisily and as soon as possible;
(b) Programs should cope with incorrect inputs and their own execution errors as
elegantly as possible;
(c) When errors cannot be dealt with, the programs should fail in a way that makes
the diagnosis of the problem as easy as possible.
13. Economy Rule
(a) Programmer time is expensive - conserve it in preference to machine time.
14. Generation Rule
(a) Use code generators to automate error-prone tasks;
(b) Avoid hand-hacking - write programs to write programs whenever possible;
(c) Hand-hacking or manually customizing programs is error-prone (e.g. some details
are often neglected).
15. Optimization Rule
(a) Prototype before polishing, get it working before optimizing;
(b) Make it work first, then make it work better (faster);
(c) When in doubt, use a brute-force approach;
(d) Bottlenecks often occur in surprising places. Do not try to guess and optimize
the code until you have proof of where the bottleneck is;
(e) Don’t tune for speed until you have measured, and then only when one part of
the code overwhelms the rest;
(f) Fancy algorithms are slow when n (input size) is small, and n is often small.
Before using a fancy algorithm, check the previous point;
(g) Fancy algorithms are much harder to implement and therefore, often, buggier
than simpler ones. Use simple algorithms and data structures first.
(h) When tuning, do it systematically so that bigger performance gains can be
achieved with minimum increase in code complexity.
16. Extensibility Rule
(a) Design for the future because it will be here sooner than you think;
(b) Don’t assume that there is a single solution (one true way) to solve a problem;
(c) Leave room for the code and data formats to grow;
(d) Add comments in the code of the type “if you ever need to do X then ...”.
17. SPOT Rule (Single Point of Truth)
(a) DRY Principle (Don’t Repeat Yourself) - Don’t repeat code: every piece of
knowledge must have a single, unambiguous, authoritative representation within
the system;
(b) Repetition leads to inconsistency and broken code (e.g. when only some of the repetitions
are modified instead of all of them);
(c) Code repetition can be removed by refactoring. Refactoring is the process of
rewriting, reworking and re-architecting code. It is used to eliminate duplication, non-orthogonal design and outdated knowledge, and to improve
performance.
(d) Data repetition motto: No junk, no confusion. The data structure (the model)
should be minimal, i.e., it should not be so general that it can represent situations
which cannot exist;
(e) Seek data structures whose states have a one-to-one correspondence with the
states of the real world.
18. Pragmatic rule: there is no perfect software.
19. Beware of offered code: avoid wizard code or other code that you don’t understand.
20. The users rule: when in doubt, work with a user to think like a user.
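The Representation Rule (rule 9) can be illustrated with a minimal sketch. The rate table, the region names and the function `shipping_cost` are hypothetical and chosen only for illustration; the point is that the knowledge sits in a data table, so the logic stays a simple lookup, and an unknown input fails noisily as the Repair Rule asks.

```python
# Hypothetical example: the knowledge lives in a table, not in if/elif branches.
SHIPPING_RATES = {
    # region: (base cost, cost per kg)
    "domestic": (2.0, 0.5),
    "europe": (5.0, 1.0),
    "world": (9.0, 2.0),
}


def shipping_cost(region: str, weight_kg: float) -> float:
    """Look up the rate for a region and apply it to the weight."""
    if region not in SHIPPING_RATES:
        # Repair Rule: fail early, with a message that eases diagnosis.
        known = ", ".join(sorted(SHIPPING_RATES))
        raise ValueError(f"unknown region {region!r}; expected one of: {known}")
    base, per_kg = SHIPPING_RATES[region]
    return base + per_kg * weight_kg


if __name__ == "__main__":
    print(shipping_cost("europe", 3))
```

Adding a new region then means adding one line of data rather than a new branch of code.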
3 Programming Philosophy
3.1 Duplication is evil
The DRY principle - Don’t Repeat Yourself - states that the programmer should not duplicate knowledge throughout the system. One way to avoid duplication is to be aware of its main
types:
• imposed duplication (the programmer does not have a choice)
• inadvertent duplication
• impatient duplication (laziness, and because it is easier to copy code)
• inter-developer duplication
3.2 Orthogonality is good
Do not split pieces of knowledge across multiple system components. Organize the code/system
around functionality and not around job functions (of the client). Try to have independent and decoupled components. Doing this eliminates effects between unrelated
things/components and increases productivity since changes and tests are local.
Therefore, a programmer should
• keep the code decoupled - modules/components do not reveal anything to each other
- manipulation is done through some kind of API;
• avoid global data - it allows components to leak information to each other in an
uncontrolled way;
• avoid similar functions;
• perform testing at components/modules level.
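As a minimal sketch of the first two points, the component below keeps its state private and is manipulated only through a small API instead of through global data. The names `Counter` and `count_words` are hypothetical.

```python
class Counter:
    """Hypothetical component: state is private, access goes through methods."""

    def __init__(self) -> None:
        self._count = 0  # internal detail, not exposed as global data

    def increment(self) -> None:
        self._count += 1

    def value(self) -> int:
        return self._count


def count_words(text: str, counter: Counter) -> None:
    """Depends only on the Counter API, not on how the count is stored."""
    for _ in text.split():
        counter.increment()


if __name__ == "__main__":
    words = Counter()
    count_words("keep the code decoupled", words)
    print(words.value())
```

Because each `Counter` owns its state, two callers cannot interfere with each other, and the component can be tested in isolation.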
3.3 Target for modularity
Are the functions/methods too large? The metric is not the number of lines but
the complexity of the function in terms of what it does. If you can’t describe
what it does in one line then it is big.
Another hint that a function should be divided into smaller functions is
that it has too many levels of indentation or too many local variables.
In order to profit from modularity, one needs to have a good interface, an interface that
makes sense without looking at the implementation behind it. Test: try to describe it to
another programmer over the phone and see if they understand it.
3.4 Tracer coding
When doing something novel, and where the requirements are vague, a programmer should
use something like a tracer bullet. Tracer bullets are bullets that burn very brightly during
their flight, making them visible to the naked eye. They are used by soldiers and placed
among the other bullets, allowing the shooter to follow the bullet trajectory relative to the
target in order to make corrections to the aim.
The same concept can be applied to programming. Instead of specifying the system in
every detail and producing tons of paper, a programmer starts by developing the underlying structure: a framework with the
basic functionality (to see if everything works together). Then, incrementally, the remaining functionalities are added. The
code added at each stage is kept, therefore it should contain error checking, be structured
and have documentation. Another advantage of this approach is that a demo is available
early for both programmers and users to see, thus allowing one to better see the progress
and correct the course if necessary.
3.5 A prototype?
Prototypes are used to analyze and expose risk, and often to answer a few questions.
What to prototype? Anything experimental or that you do not have experience with.
The goal of prototyping is to learn - it is the only time that the value does not reside in
the code. For instance, you can prototype:
• architectures
• new functionalities
• structure or contents of external data
• third-party tools or components
• performance issues
• user interface design
3.6 Always Automate
Automation ensures consistency, repeatability and accuracy. Do not use manual procedures
(to compile, test, back up, version, generate websites, ...), but always automate them.
One example of automation is code generators, i.e. code that writes code. There are
two main types of code generators:
• Passive code generators - are run once to produce a result, e.g. creating new source
files (templates, copyright notices, ...) or performing one-off conversions among programming languages;
• Active code generators - are used each time the results are required, e.g. generating the
source code for data structures from a database schema.
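A minimal sketch of an active code generator follows. It assumes a hypothetical schema held in a dictionary (in practice it might be read from a database catalog) and emits Python source for one dataclass per table; the names `SCHEMA` and `generate_dataclasses` are illustrative only.

```python
# Hypothetical schema: table name -> list of (column name, Python type name).
SCHEMA = {"Person": [("name", "str"), ("age", "int")]}


def generate_dataclasses(schema: dict) -> str:
    """Return Python source code defining one dataclass per table."""
    lines = ["from dataclasses import dataclass", ""]
    for table, columns in schema.items():
        lines += ["", "@dataclass", f"class {table}:"]
        lines += [f"    {name}: {type_name}" for name, type_name in columns]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Rerun whenever the schema changes, so the code never drifts from it.
    print(generate_dataclasses(SCHEMA))
```

Since the generator is rerun each time the schema changes, the schema remains the single point of truth and the generated file is never edited by hand.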
3.7 Debugging
Debugging a program is seen by many programmers as a nightmare, a task to avoid at all
costs, because it may be tedious and frustrating, and often takes a long time. The problem
gets worse when a programmer needs to debug third-party code. However, debugging can
be rather painless, and maybe fun, if it is attacked as a puzzle to be solved.
Use a debugger to pinpoint the problem. Whenever possible use a debugger that allows
you to visualize the data. Often, it is necessary to use tracing statements - little diagnostic
messages printed by a program to the screen or a file that say things like “I got here” and
“the value of X=10”. Tracing is important in concurrent systems, real-time systems and
event-based applications.
A useful technique for finding the cause of a problem is simply to explain it to someone
else. They do not need to say anything; the simple act of explaining the problem, step by
step, often causes the reason for the problem to be understood.
In the process of debugging do not assume anything; you should prove it. Identify the
reasons that caused the bug and check if they exist anywhere else in the code.
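Tracing statements of the kind described above could be sketched with Python’s standard `logging` module, so that they can be switched off without deleting them. The logger name `trace_demo` and the function `mean` are hypothetical.

```python
import logging

# Messages go to stderr by default; level DEBUG enables the trace messages.
logging.basicConfig(level=logging.DEBUG,
                    format="%(levelname)s %(funcName)s: %(message)s")
log = logging.getLogger("trace_demo")  # hypothetical logger name


def mean(values: list) -> float:
    """Average of a non-empty list, with tracing statements."""
    log.debug("got here with %d values", len(values))
    total = sum(values)
    log.debug("the value of total=%s", total)
    return total / len(values)


if __name__ == "__main__":
    print(mean([1.0, 2.0, 3.0]))
```

Raising the level to `logging.WARNING` silences the trace, in line with the Silence Rule.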
3.8 Meta-programming
Systems should be highly configurable. The choice of algorithms, database product,
middleware technology and user interface style should be implemented as configuration
options. To this effect, a programmer should use metadata¹ to describe configuration
options for an application: tuning parameters, user preferences, installation directory, etc.
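As a small illustration, the sketch below selects behaviour from configuration metadata rather than hard-coding it. The JSON keys `sort_order` and `page_size`, the `SORTERS` table and the function `first_page` are all hypothetical; a real application would probably read the configuration from a file.

```python
import json

# Hypothetical configuration text; in practice it would live in a file.
CONFIG_TEXT = '{"sort_order": "descending", "page_size": 2}'

# Map each allowed option value to the behaviour it selects.
SORTERS = {
    "ascending": lambda items: sorted(items),
    "descending": lambda items: sorted(items, reverse=True),
}


def first_page(items: list, config_text: str = CONFIG_TEXT) -> list:
    """Order the items and cut the first page, both as the configuration says."""
    config = json.loads(config_text)
    ordered = SORTERS[config["sort_order"]](items)
    return ordered[: config["page_size"]]


if __name__ == "__main__":
    print(first_page([3, 1, 2]))
```

Changing the sort order or the page size then requires editing the configuration only, not the code.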
3.9 Avoid programming by coincidence
A strategy often used by novice programmers is what is called programming by coincidence.
It works as follows: the programmer types some code, tries it and it seems to work; he
then types more code, tries it and continues this process until the program seems to work.
Naturally, after some time the program stops working and then the programmer will have
great difficulties in fixing the bug. The problem with this strategy lies in the fact that
the programmer does not know why the code worked in the first place, therefore understanding why it does
not work becomes a more difficult job.
¹ Metadata is data about data: any data that describes the application.
3.10 Testing
Test early, often, and automatically.
What to test?
• unit testing: a software unit test is code that exercises a module, by establishing an
artificial environment and then invoking routines in the module being tested. Testing
should be done at the module level, in isolation, to verify its behavior.
• integration testing: checks that the major subsystems that make up the project work
well with each other.
• validation and verification: validates the results produced by the system.
• resource exhaustion or errors: checks what happens when resources fail (lack of
memory, disk space, network, ...).
• performance testing: verifies how the system behaves under stress and evaluates
its performance.
• usability testing: is done by users.
How to test?
• regression tests - compare the output of the system with previously known values
• test data (synthetic and real)
• exercising the GUI
• testing the tests - use saboteurs that introduce errors in the code and then check if
the tests detect the errors
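Unit testing and “testing the tests” can be sketched together with the standard `unittest` module. The function `clamp` and its deliberately broken variant `sabotaged_clamp` are hypothetical; the same suite is run against both, and it should pass for the first and fail for the second.

```python
import unittest


def clamp(value: int, low: int, high: int) -> int:
    """Function under test (hypothetical): restrict value to [low, high]."""
    return max(low, min(value, high))


def sabotaged_clamp(value: int, low: int, high: int) -> int:
    """Saboteur: a deliberately broken variant that ignores the lower bound."""
    return min(value, high)


def make_suite(func) -> unittest.TestSuite:
    """Build the same unit tests against any implementation of clamp."""

    class ClampTest(unittest.TestCase):
        def test_inside_range(self):
            self.assertEqual(func(5, 0, 10), 5)

        def test_below_range(self):
            self.assertEqual(func(-3, 0, 10), 0)

        def test_above_range(self):
            self.assertEqual(func(42, 0, 10), 10)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ClampTest)


def run(func) -> bool:
    """Return True when every test passes for the given implementation."""
    result = unittest.TestResult()
    make_suite(func).run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("real code passes:", run(clamp))
    print("saboteur detected:", not run(sabotaged_clamp))
```

If the suite still passed with the saboteur in place, that would indicate a missing test case rather than correct code.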
3.11 Optimization
The most important thing to know about optimization for performance is to know when
not to do it.
The smartest, cheapest, and often fastest way to obtain performance gains is to wait
a few months for the hardware to become faster (Moore’s law implies that you can
improve performance by 26% in six months just by buying new hardware).
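The 26% figure can be checked with a line of arithmetic, assuming that performance doubles every 18 months. That doubling period is an assumption made here (a common reading of Moore’s law), not something stated in the text.

```python
# Assumption: performance doubles every 18 months.
DOUBLING_PERIOD_MONTHS = 18


def speedup(months: float) -> float:
    """Performance multiplier after waiting the given number of months."""
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)


if __name__ == "__main__":
    gain = speedup(6) - 1  # 2 ** (1/3) - 1, roughly 0.26
    print(f"{gain:.0%}")
```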
A programmer, before optimizing the code, should measure where the program spends
time (profile your code).
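One way to measure before optimizing is the standard `cProfile` module, sketched below. The function `slow_sum` is a hypothetical stand-in for real work; the report lists where the time was spent, sorted by cumulative time.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot spot: a deliberately naive loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_report(n: int = 100_000) -> str:
    """Run slow_sum under the profiler and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    slow_sum(n)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

Only when one entry in such a report clearly dominates the rest is it worth tuning that part of the code.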
3.12 Complexity
“Everything should be made as simple as possible, but no simpler.” - Albert Einstein
How to define simplicity in programming? There are several ways to look at complexity:
• Implementation complexity (programmer’s view)
– degree of difficulty that a programmer will experience while attempting to understand a program so that he can mentally model or debug it
• Interface complexity (user’s view)
• Number of lines of code (codebase size): more lines of code tend to mean more
bugs
3.13 Develop multivalent programs
A multivalent program should have the following traits:
• The application logic lies in a library with a documented API and can be linked to
other programs;
• One UI mode is a GUI either linked directly to the core library or acting as a separate
process.
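The first trait can be sketched as a core function with a thin front end on top of it. The sketch uses a command-line front end rather than a GUI to stay short, which is an assumption of this example; `word_count` and `main` are hypothetical names.

```python
import argparse


def word_count(text: str) -> int:
    """Core logic (hypothetical): usable as a library call from any front end."""
    return len(text.split())


def main(argv=None) -> int:
    """Thin command-line front end over the same library function."""
    parser = argparse.ArgumentParser(description="Count words in a string.")
    parser.add_argument("text")
    args = parser.parse_args(argv)
    print(word_count(args.text))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

A GUI, a web front end or another program could call `word_count` directly without going through `main`.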
3.14 Estimating Times
When you estimate the duration of some task, like developing a system, you should pay
attention to the units used. If you say that something takes about 120 working days,
then people will expect it to be completed on a date pretty close to that (a difference of one day or
less). But if you say 6 months, people will expect the conclusion in 5 to 7
months. Therefore, select the unit of the estimate (days, weeks, or months) wisely.
How to estimate? The easiest and probably the most accurate approach is to ask
someone who has done the same thing previously. If that does not apply then, after
understanding what is being asked, you build a model of the system (a rough picture of
how the system, and its components, is going to be implemented) and estimate
the time for developing each component. The estimates should get more accurate with
experience.
4 About User Interfaces
4.1 User Interface Designs
• Apply the rule of least surprise whenever possible.
• A program interface should be
– transparent (WYSIWYG)
– concise
– expressive
– scriptable
• A program interface is scriptable if it is easily manipulated by other programs (allows
task automation);
• A program interface is concise when the length and complexity of the actions required
to perform some task are low (few keystrokes, mouse clicks, ...);
• A program interface is expressive when it can be readily used to perform a variety of
actions.
4.2 Web Browser as a Universal Front End
For a large class of applications it makes increasing sense to use a web browser as the interface.
The main advantages are
• the GUI does not need to be implemented, instead it can be described using languages
like HTML
• avoids complex and expensive coding to implement the interface
• the application becomes Internet ready
On the other hand, a possible disadvantage is that it imposes a batch-style interaction.
5 Final Remarks
To conclude, a good (pragmatic) programmer [3], besides following the above rules, philosophies, and principles, should also be:
• early adopter/fast adopter
• inquisitive
• critical thinker
• realistic
• a jack of all trades
To become a good programmer, or just to keep updated, one should:
• invest regularly in improving oneself: e.g. learn a new language every year, read a
technical book each quarter, ...;
• diversify: e.g. read a non-technical book periodically;
• invest in learning emerging technologies;
• review and balance your methods and choices.
Finally, a good programmer does not leave bad design or poor code in place; he/she fixes them
when discovered, to avoid the increase of entropy in the code. The problem of inaction is
similar to what happens when a broken window in a building is left unrepaired
- it instills a sense of abandonment of the building.
References
[1] Robert L Read. How to be a programmer: A short, comprehensive, and personal
summary. http://samizdat.mines.edu/howto/HowToBeAProgrammer.html.
[2] Eric S. Raymond. The Art of UNIX Programming. Addison-Wesley, 2004.
[3] Andrew Hunt and David Thomas. The Pragmatic Programmer. Addison-Wesley, 1999.