Chapter 28 Formal Methods Software Engineering: A Practitioner’s Approach, 6th edition by Roger S. Pressman 1 Problems with Conventional Specification contradictions ambiguities vagueness incompleteness mixed levels of abstraction 2 Formal Specification Desired properties—consistency, completeness, and lack of ambiguity—are the objectives of all specification methods The formal syntax of a specification language (Section 28.4) enables requirements or design to be interpreted in only one way, eliminating ambiguity that often occurs when a natural language (e.g., English) or a graphical notation must be interpreted The descriptive facilities of set theory and logic notation (Section 28.2) enable clear statement of facts (requirements). Consistency is ensured by mathematically proving that initial facts can be formally mapped (using inference rules) into later statements within the specification. 3 Formal Methods Concepts data invariant—a condition that is true throughout the execution of the system that contains a collection of data state Many formal languages, such as OCL (Section 28.5) , use the notion of states as they were discussed in Chapters 7 and 8, that is, a system can be in one of several states, each representing an externally observable mode of behavior. The Z language (Section 28.6)defines a state as the stored data which a system accesses and alters operation—an action that takes place in a system and reads or writes data to a state precondition defines the circumstances in which a particular operation is valid postcondition defines what happens when an operation has completed its action 4 An Example—Print Spooler 5 States and Data Invariant The state of the spooler is represented by the four components Queues, OutputDevices, Limits, and Sizes. The data invariant has five components: • Each output device is associated with an upper limit of print lines • Each output device is associated with a possibly nonempty queue of files awaiting printing • Each file is associated with a size • Each queue associated with an output device contains files that have a size less than the upper limit of the output device • There will be no more than MaxDevs output devices administered by the spooler 6 Operations An operation which adds a new output device to the spooler together with its associated print limit An operation which removes a file from the queue associated with a particular output device An operation which adds a file to the queue associated with a particular output device An operation which alters the upper limit of print lines for a particular output device An operation which moves a file from a queue associated with an output device to another queue associated with a second output device 7 Pre- & Postconditions For the first operation (adds a new output device to the spooler together with its associated print limit): Precondition: the output device name does not already exist and that there are currently less than MaxDevs output devices known to the spooler Postcondition: the name of the new device is added to the collection of existing device names, a new entry is formed for the device with no files being associated with its queue, and the device is associated with its print limit. 8 Mathematical Concepts sets and constructive set specification set operators logic operators e.g., i, j: • i > j i2 => j2 which states that, for every pair of values in the set of natural numbers, if i is greater than j, then i2 is greater than j2. sequences 9 Sets and Constructive Specification A set is a collection of objects or elements and is used as a cornerstone of formal methods. Enumeration {C++, Pascal, Ada, COBOL, Java} #{C++, Pascal, Ada, COBOL, Java} implies cardinality = 5 Constructive set specification is preferable to enumeration because it enables a succinct definition of large sets. {x, y : N | x + y = 10 (x, y2)} 10 Set Operators A specialized set of symbology is used to represent set and logic operations. Examples The P operator is used to indicate membership of a set. For example, the expression xPX The operators , , and # take sets as their operands. The predicate A,B has the value true if the members of the set A are contained in the set B and has the value false otherwise. The union operator, <, takes two sets and forms a set that contains all the elements in the set with duplicates eliminated. {File1, File2, Tax, Compiler} < {NewTax, D2, D3, File2} is the set {Filel, File2, Tax, Compiler, NewTax, D2, D3} 11 Logic Operators Another important component of a formal method is logic: the algebra of true and false expressions. Examples: V or ¬ not => implies Universal quantification is a way of making a statement about the elements of a set that is true for every member of the set. Universal quantification uses the symbol, . An example of its use is i, j : N i > j => i2 > j2 which states that for every pair of values in the set of natural numbers, if i is greater than j, then i2 is greater than j2. 12 Sequences Sequences are designated using angle brackets. For example, the preceding sequence would normally be written as k Jones, Wilson, Shapiro, Estavezl Catenation, X, is a binary operator that forms a sequence constructed by adding its second operand to the end of its first operand. For example, k 2, 3, 34, 1l X k12, 33, 34, 200 l = k 2, 3, 34, 1, 12, 33, 34, 200 l Other operators that can be applied to sequences are head, tail, front, and last. head k 2, 3, 34, 1, 99, 101 l = 2 tail k 2, 3, 34, 1, 99, 101 l = 73, 34, 1,99, 1018 last k 2, 3, 34, 1, 99, 101 l = 101 front k 2, 3, 34, 1, 99, 101 l = 72, 3, 34, 1, 998 13 Formal Specification The block handler The block handler maintains a reservoir of unused blocks and will also keep track of blocks that are currently in use. When blocks are released from a deleted file they are normally added to a queue of blocks waiting to be added to the reservoir of unused blocks. The state used, free: P BLOCKS BlockQueue: seq P BLOCKS Data Invariant used > free = \ used < free = AllBlocks i: dom BlockQueue BlockQueue i # used i, j : dom BlockQueue i ≠ j => BlockQueue i > BlockQueue j = \ Precondition #BlockQueue > 0 Postcondition used' = used \ head BlockQueue free’ = free < head BlockQueue BlockQueue' = tail BlockQueue 14 Formal Specification Languages A formal specification language is usually composed of three primary components: a syntax that defines the specific notation with which the specification is represented semantics to help define a "universe of objects" [WIN90] that will be used to describe the system a set of relations that define the rules that indicate which objects properly satisfy the specification The syntactic domain of a formal specification language is often based on a syntax that is derived from standard set theory notation and predicate calculus. The semantic domain of a specification language indicates how the language represents system requirements. 15 Object Constraint Language (OCL) a formal notation developed so that users of UML can add more precision to their specifications All of the power of logic and discrete mathematics is available in the language However the designers of OCL decided that only ASCII characters (rather than conventional mathematical notation) should be used in OCL statements. 16 OCL Overview Like an object-oriented programming language, an OCL expression involves operators operating on objects. However, the result of a complete expression must always be a Boolean, i.e. true or false. The objects can be instances of the OCL Collection class, of which Set and Sequence are two subclasses. See Table 28.1 for summary of OCL notation 17 BlockHandler using UML 1 * Block BlockSet e le m e n t s num ber * * * fre e b lo ckQu e u e u se d { o rd e re d } * allBlo cks { su b se t } { su b se t } 1 1 1 BlockHandler 1 ad d Blo ck( ) re m o v e Blo ck( ) 18 BlockHandler in OCL No block will be marked as both unused and used. context BlockHandler inv: (self.used->intersection(self.free)) ->isEmpty() All the sets of blocks held in the queue will be subsets of the collection of currently used blocks. context BlockHandler inv: blockQueue->forAll(aBlockSet | used->includesAll(aBlockSet )) No elements of the queue will contain the same block numbers. context BlockHandler inv: blockQueue->forAll(blockSet1, blockSet2 | blockSet1 <> blockSet2 implies blockSet1.elements.number->excludesAll(blockSet2.elements.number)) The expression before implies is needed to ensure we ignore pairs where both elements are the same Block. The collection of used blocks and blocks that are unused will be the total collection of blocks that make up files. context BlockHandler inv: allBlocks = used->union(free) The collection of unused blocks will have no duplicate block numbers. context BlockHandler inv: free->isUnique(aBlock | aBlock.number) The collection of used blocks will have no duplicate block numbers. context BlockHandler inv: used->isUnique(aBlock | aBlock.number) 19 The Z Language organized into schemas defines variables establishes relationships between variables the analog for a “module” in conventional languages notation described in Table 28.2 20 BlockHandler in Z The following example of a schema describes the state of the block handler and the data invariant: ———BlockHandler—————————————— used, free : P BLOCKS BlockQueue : seq P BLOCKS ——————————————————————— used > free = \ used < free = AllBlocks i: dom BlockQueue BlockQueue i # used i, j : dom BlockQueue i ≠ j => BlockQueue i > BlockQueue j = \ ———————————————————————— See Section 28.6.2 for further expansion of the specification 21 Chapter 29 Cleanroom Software Engineering Software Engineering: A Practitioner’s Approach, 6th edition by Roger S. Pressman 22 The Cleanroom Process Model 23 The Cleanroom Strategy-I Increment Planning—adopts the incremental strategy Requirements Gathering—defines a description of customer level requirements (for each increment) Box Structure Specification—describes the functional specification Formal Design—specifications (called “black boxes”) are iteratively refined (with an increment) to become analogous to architectural and procedural designs (called “state boxes” and “clear boxes,” respectively). Correctness Verification—verification begins with the highest level box structure (specification) and moves toward design detail and code using a set of “correctness questions.” If these do not demonstrate that the specification is correct, more formal (mathematical) methods for verification are used. Code Generation, Inspection and Verification—the box structure specifications, represented in a specialized language, are transmitted into the appropriate programming language. 24 The Cleanroom Strategy-II Statistical Test Planning—a suite of test cases that exercise of “probability distribution” of usage are planned and designed Statistical Usage Testing—execute a series of tests derived from a statistical sample (the probability distribution noted above) of all possible program executions by all users from a targeted population Certification—once verification, inspection and usage testing have been completed (and all errors are corrected) the increment is certified as ready for integration. 25 Box Structure Specification black box clear box state box 26 Box Structures black box state box clear box 27 Design Refinement & Verification If a function f is expanded into a sequence g and h, the correctness condition for all input to f is: • Does g followed by h do f? When a function f is refined into a conditional (if-then-else), the correctness condition for all input to f is: • Whenever condition <c> is true does g do f and whenever <c> is false, does h do f? When function f is refined as a loop, the correctness conditions for all input to f is: • Is termination guaranteed? • Whenever <c> is true does g followed by f do f, and whenever <c> is false, does skipping the loop still do f? 28 Advantages of Design Verification It reduces verification to a finite process. It lets cleanroom teams verify every line of design and code. It results in a near zero defect level. It scales up. It produces better code than unit testing. 29 Cleanroom Testing statistical use testing tests the actual usage of the program determine a “usage probability distribution” analyze the specification to identify a set of stimuli stimuli cause software to change behavior create usage scenarios assign probability of use to each stimuli test cases are generated for each stimuli according to the usage probability distribution 30 Certification 1. Usage scenarios must be created. 2. A usage profile is specified. 3. Test cases are generated from the profile. 4. Tests are executed and failure data are recorded and analyzed. 5. Reliability is computed and certified. 31 Certification Models Sampling model. Software testing executes m random test cases and is certified if no failures or a specified numbers of failures occur. The value of m is derived mathematically to ensure that required reliability is achieved. Component model. A system composed of n components is to be certified. The component model enables the analyst to determine the probability that component i will fail prior to completion. Certification model. The overall reliability of the system is projected and certified. 32 Chapter 30 Component-Based Software Engineering Software Engineering: A Practitioner’s Approach, 6th edition by Roger S. Pressman 33 The Key Questions When faced with the possibility of reuse, the software team asks: Are commercial off-the-shelf (COTS) components available to implement the requirement? Are internally-developed reusable components available to implement the requirement? Are the interfaces for available components compatible within the architecture of the system to be built? At the same time, they are faced with the following impediments to reuse ... 34 Impediments to Reuse Few companies and organizations have anything that even slightly resembles a comprehensive software reusability plan. Although an increasing number of software vendors currently sell tools or components that provide direct assistance for software reuse, the majority of software developers do not use them. Relatively little training is available to help software engineers and managers understand and apply reuse. Many software practitioners continue to believe that reuse is “more trouble than it’s worth.” Many companies continue to encourage of software development methodologies which do not facilitate reuse Few companies provide an incentives to produce reusable 35 program components. The CBSE Process 36 Domain Engineering 1. Define the domain to be investigated. 2. Categorize the items extracted from the domain. 3. Collect a representative sample of applications in the domain. 4. Analyze each application in the sample. 5. Develop an analysis model for the objects. 37 Identifying Reusable Components • Is component functionality required on future implementations? • How common is the component's function within the domain? • Is there duplication of the component's function within the domain? • Is the component hardware-dependent? • Does the hardware remain unchanged between implementations? • Can the hardware specifics be removed to another component? • Is the design optimized enough for the next implementation? • Can we parameterize a non-reusable component so that it becomes reusable? • Is the component reusable in many implementations with only minor changes? • Is reuse through modification feasible? • Can a non-reusable component be decomposed to yield reusable components? • How valid is component decomposition for reuse? 38 Structural Modeling every application has structural patterns that have the potential for reuse a “structure point” is a construct with the structure A structure point is an abstraction that should have a limited number of instances. Restating this in object-oriented jargon , the size of the class hierarchy should be small. The rules that govern the use of the structure point should be easily understood. In addition, the interface to the structure point should be relatively simple. The structure point should implement information hiding by hiding all complexity contained within the structure point itself. This reduces the perceived complexity of the overall system. 39 Structural Patterns An interface that enables the user to interact with the system. A bounds-setting mechanism that allows the user to establish bounds on the parameters to be measured. A sensor management mechanism that communicates with all monitoring sensors. A response mechanism that reacts to the input provided by the sensor management system. A control mechanism that enables the user to control the manner in which monitoring is carried out. 40 Component-Based Development a library of components must be available components should have a consistent structure a standard should exist, e.g., OMG/CORBA Microsoft COM Sun JavaBeans 41 CBSE Activities Component qualification Component adaptation Component composition Component update 42 Qualification Before a component can be used, you must consider: • application programming interface (API) • development and integration tools required by the component • run-time requirements including resource usage (e.g., memory or storage), timing or speed, and network protocol • service requirements including operating system interfaces and support from other components • security features including access controls and authentication protocol • embedded design assumptions including the use of specific numerical or non-numerical algorithms • exception handling 43 Adaptation The implication of “easy integration” is: (1) that consistent methods of resource management have been implemented for all components in the library; (2) that common activities such as data management exist for all components, and (3) that interfaces within the architecture and with the external environment have been implemented in a consistent manner. 44 Composition An infrastructure must be established to bind components together Architectural ingredients for composition include: Data exchange model Automation Structured storage Underlying object model 45 OMG/ CORBA The Object Management Group has published a common object request broker architecture (OMG/CORBA). An object request broker (ORB) provides services that enable reusable components (objects) to communicate with other components, regardless of their location within a system. Integration of CORBA components (without modification) within a system is assured if an interface definition language (IDL) interface is created for every component. Objects within the client application request one or more services from the ORB server. Requests are made via an IDL or dynamically at run time. An interface repository contains all necessary information about the service’s request and response formats. 46 ORB Architecture Interface Repository Client Dynamic Invocation Client IDL Stubs ORB interface Server Objects LAN ORB Core ORB interface Server IDL Stubs Object Adapter Interface Repository 47 Microsoft COM The component object model (COM) provides a specification for using components produced by various vendors within a single application running under the Windows operating system. COM encompasses two elements: COM interfaces (implemented as COM objects) a set of mechanisms for registering and passing messages between COM interfaces. 48 Sun JavaBeans The JavaBeans component system is a portable, platform independent CBSE infrastructure developed using the Java programming language. The JavaBeans component system encompasses a set of tools, called the Bean Development Kit (BDK), that allows developers to analyze how existing Beans (components) work customize their behavior and appearance establish mechanisms for coordination and communication develop custom Beans for use in a specific application test and evaluate Bean behavior. 49 Classification Enumerated classification—components are described by defining a hierarchical structure in which classes and varying levels of subclasses of software components are defined Faceted classification—a domain area is analyzed and a set of basic descriptive features are identified Attribute-value classification—a set of attributes are defined for all components in a domain area 50 Indexing 51 The Reuse Environment A component database capable of storing software components and the classification information necessary to retrieve them. A library management system that provides access to the database. A software component retrieval system (e.g., an object request broker) that enables a client application to retrieve components and services from the library server. CBSE tools that support the integration of reused components into a new design or implementation. 52 Reuse Economics Consider a new application, X, that requires 60 percent new code and the reuse of three structure points, SP1, SP2, and SP3. Average costs for qualification, adaptation, integration, and maintenance are available. overall effort = Enew + Equal + Eadapt + Eint where Enew = effort required to engineer and construct new software components (determined using techniques described in Chapter 23). Equal = effort required to qualify SP1, SP2, and SP3. Eadapt = effort required to adapt SP1, SP2, and SP3. Eint = effort required to integrate SP1, SP2, and SP3. The effort required to qualify, adapt, and integrate SP1, SP2, and SP3 is determined by taking the average of historical data collected for qualification, adaptation, and integration of the 53 reusable components in other applications. Reuse Metrics The benefit associated with reuse within a system S can be expressed as a ratio Rb(S) = [Cnoreuse – Creuse]/Cnoreuse where Cnoreuse is the cost of developing S with no reuse. Creuse is the cost of developing S with reuse. Devanbu and his colleagues [DEV95] suggest that Rb will be affected by the design of the system since Rb is affected by the design, it is important to make Rb a part of an assessment of design alternatives the benefits associated with reuse are closely aligned to the cost benefit of each individual reusable component. A general measure of reuse in object-oriented systems, termed reuse leverage [BAS94], is defined as Rlev = OBJreused/OBJbuilt where OBJreused is the number of objects reused in a system. OBJbuilt is the number of objects built for a system. 54 Chapter 31 Reengineering Software Engineering: A Practitioner’s Approach, 6th edition by Roger S. Pressman 55 Reengineering Business processes IT systems Reengineering Software applications 56 Business Process Reengineering Business definition. Business goals are identified within the context of four key drivers: cost reduction, time reduction, quality improvement, and personnel development and empowerment. Process identification. Processes that are critical to achieving the goals defined in the business definition are identified. Process evaluation. The existing process is thoroughly analyzed and measured. Process specification and design. Based on information obtained during the first three BPR activities, use-cases (Chapter 7) are prepared for each process that is to be redesigned. Prototyping. A redesigned business process must be prototyped before it is fully integrated into the business. Refinement and instantiation. Based on feedback from the prototype, the business process is refined and then instantiated within a business system. 57 Business Process Reengineering 58 BPR Principles Organize around outcomes, not tasks. Have those who use the output of the process perform the process. Incorporate information processing work into the real work that produces the raw information. Treat geographically dispersed resources as though they were centralized. Link parallel activities instead of integrated their results. When different Put the decision point where the work is performed, and build control into the process. Capture data once, at its source. 59 Software Reengineering Forward engineering Data restructuring code restructuring inventory analysis document restructuring reverse engineering 60 Inventory Analysis build a table that contains all applications establish a list of criteria, e.g., name of the application year it was originally created number of substantive changes made to it total effort applied to make these changes date of last substantive change effort applied to make the last change system(s) in which it resides applications to which it interfaces, ... analyze and prioritize to select candidates for reengineering 61 Document Restructuring Weak documentation is the trademark of many legacy systems. But what do we do about it? What are our options? Options … Creating documentation is far too time consuming. If the system works, we’ll live with what we have. In some cases, this is the correct approach. Documentation must be updated, but we have limited resources. We’ll use a “document when touched” approach. It may not be necessary to fully redocument an application. The system is business critical and must be fully redocumented. Even in this case, an intelligent approach is to pare documentation to an essential minimum. 62 Reverse Engineering dirty source code restructure code clean source code processing extract abstractions interface initial specification database refine & simplify final specification 63 Code Restructuring Source code is analyzed using a restructuring tool. Poorly design code segments are redesigned Violations of structured programming constructs are noted and code is then restructured (this can be done automatically) The resultant restructured code is reviewed and tested to ensure that no anomalies have been introduced Internal code documentation is updated. 64 Data Restructuring Unlike code restructuring, which occurs at a relatively low level of abstraction, data structuring is a full-scale reengineering activity In most cases, data restructuring begins with a reverse engineering activity. Current data architecture is dissected and necessary data models are defined (Chapter 9). Data objects and attributes are identified, and existing data structures are reviewed for quality. When data structure is weak (e.g., flat files are currently implemented, when a relational approach would greatly simplify processing), the data are reengineered. Because data architecture has a strong influence on program architecture and the algorithms that populate it, changes to the data will invariably result in either architectural or code-level changes. 65 Forward Engineering 1. The cost to maintain one line of source code may be 20 to 40 times the cost of initial development of that line. 2. Redesign of the software architecture (program and/or data structure), using modern design concepts, can greatly facilitate future maintenance. 3. Because a prototype of the software already exists, development productivity should be much higher than average. 4. The user now has experience with the software. Therefore, new requirements and the direction of change can be ascertained with greater ease. 5. CASE tools for reengineering will automate some parts of the job. 6. A complete software configuration (documents, programs and data) will exist upon completion of preventive maintenance. 66 Economics of Reengineering-I A cost/benefit analysis model for reengineering has been proposed by Sneed [SNE95]. Nine parameters are defined: P1 = current annual maintenance cost for an application. P2 = current annual operation cost for an application. P3 = current annual business value of an application. P4 = predicted annual maintenance cost after reengineering. P5 = predicted annual operations cost after reengineering. P6 = predicted annual business value after reengineering. P7 = estimated reengineering costs. P8 = estimated reengineering calendar time. P9 = reengineering risk factor (P9 = 1.0 is nominal). L = expected life of the system. 67 Economics of Reengineering-II The cost associated with continuing maintenance of a candidate application (i.e., reengineering is not performed) can be defined as Cmaint = [P3 - (P1 + P2)] x L The costs associated with reengineering are defined using the following relationship: Creeng = [P6 - (P4 + P5) x (L - P8) - (P7 x P9)] ` Using the costs presented in equations above, the overall benefit of reengineering can be computed as cost benefit = Creeng - Cmaint 68 Chapter 32 The Road Ahead Software Engineering: A Practitioner’s Approach, 6th edition by Roger S. Pressman 69 Importance of SoftwareRevisited In Chapter 1, software was characterized as a differentiator. The function delivered by software differentiates products, systems, and services and provides competitive advantage in the marketplace. But software is more that a differentiator. The programs, documents, and data that are software help to generate the most important commodity that any individual, business, or government can acquire—information. 70 The Scope of Change Software connected technologies will impact communications, energy, healthcare, transportation, entertainment, economics, manufacturing, and warfare, to name only a few Some technologies to watch: Carbon nanotubes Biosensors OLED displays Grid Computing Cognitive machines 71 People - Building Systems Communication is changing e.g., video conferencing Work patterns are changing e.g., intelligent agents Knowledge acquisition is changing e.g., data mining, the Web 72 The “New” SE Process Agile the process and the people must be adaptable Incremental Delivery occurs in increments All software engineering activities are iterative Object-oriented Classes are defined Responsibilities are identified Collaboration is described 73 An Information Spectrum 74 Technology Trends Combination technologies. When two important technologies are merged, the impact of the merged result is often greater that sum of the impact of each taken separately. Data fusion. The more data we acquire, the more data we need. More importantly, the more data we acquire, the more difficult it is to extract useful information. Technology Push. Today, some technologies evolve as solutions looking for problems. Networking and serendipity. In this context networking implies connections between people or between people and information. Information overload. A vast sea of information is accessible by anyone with an Internet connection. 75 Software Engineering Ethics-I An ACM/IEEE-CS Joint Task Force has produced a Software Engineering Code of Ethics and Professional Practices (Version 5.1). The code [ACM98] states: Software engineers shall commit themselves to making the analysis, specification, design, development, testing and maintenance of software a beneficial and respected profession. In accordance with their commitment to the health, safety and welfare of the public, software engineers shall adhere to the following Eight Principles: 76 Software Engineering Ethics-I 1. PUBLIC - Software engineers shall act consistently with the public interest. 2. CLIENT AND EMPLOYER - Software engineers shall act in a manner that is in the best interests of their client and employer consistent with the public interest. 3. PRODUCT - Software engineers shall ensure that their products and related modifications meet the highest professional standards possible. 4. JUDGMENT - Software engineers shall maintain integrity and independence in their professional judgment. 5. MANAGEMENT - Software engineering managers and leaders shall subscribe to and promote an ethical approach to the management of software development and maintenance. 6. PROFESSION - Software engineers shall advance the integrity and reputation of the profession consistent with the public interest. 7. COLLEAGUES - Software engineers shall be fair to and supportive of their colleagues. 8. SELF - Software engineers shall participate in lifelong learning regarding the practice of their profession and shall promote an ethical approach to the practice of the profession. 77 Ethics-On a Personal level Never steal data for personal gain. Never distribute or sell proprietary information obtained as part of your work on a software project. Never maliciously destroy or modify another person’s programs, files, or data. Never violate the privacy of an individual, a group, or an organization. Never hack into a system for sport or profit. Never create or promulgate a computer virus or worm. Never use computing technology to facilitate discrimination or harassment. 78