PHASE 2: SYSTEMS ANALYSIS Requirements Analysis Introduction Structured Analysis Data-Flow Diagrams Data Dictionary Process Descriptions Introduction The Requirements Analysis is the second major step of the system analysis phase of the systems development life cycle (SDLC). When analyzing requirements, various structured analysis tools are used to document the systems requirements, analyze them, and develop a model of the system. Structured Analysis Structured analysis examines a system in terms of its inputs, outputs, and processes and is the most common way to describe the requirements for a new system. It focuses on the flow of data through the system to convert into useful information. The end product of structured analysis is a logical model of the system, which shows what the system does, regardless of how it will be done. Logical model is also termed as business model, as it describes the business flow of data. A physical model describes how the system will function when the logical model is implemented. In this chapter we describe logical models. Structured analysis uses three main tools: Data-Flow Diagrams Data Dictionary Process Descriptions In this chapter, we will discuss data-flow diagrams. 1 Data Flow Diagrams Data flow diagrams are used to graphically represent the flow of data in a business information system. It describes the processes that are involved in a system to transfer data from the input to the file storage and reports generation. Data flow diagrams can be divided into logical or physical. The logical data flow diagram describes flow of data through a system to perform certain functionality of a business. The physical data flow diagram describes the implementation of the logical data flow. There are four basic symbols that are used to represent a data-flow diagram. Process Symbol A process receives input data and produces output with a different content or form. Processes can be as simple as collecting input data and saving in the database, or it can be complex as producing a report containing monthly sales of all retail stores in the northwest region. Every process has a name that identifies the function it performs. The name consists of a verb, followed by a singular noun. Examples are: Apply Payment, Calculate Commission, and Verify Order. The details of a process are not shown in the data flow diagram. They are documented separately in a process description. Data Flow Symbol A data-flow is a path for data to move from one part of the information system to another. A data-flow may represent a single data element such the Customer ID or it can represent a set of data element (or a data structure). Examples of data structure are: customer information (LastName, FirstName, SS#, Tel #, etc.) or order information (OrderId, Item#, OrderDate, CustomerID, etc.). The data-flow name consists of a singular noun and an adjective, if needed. The direction of the data flow shows the direction to which data moves in the information system. Because every process changes data from one form into another, at least one data-flow must enter and one data-flow must exit each process symbol. 2 Data Store Symbol A data store or data repository is used in a data-flow diagram to represent a situation when the system must retain data because one or more processes need to use the stored data in a later time. For example, transaction report of a customer bank account is printed at the end of a monthly cycle. Thus a record of each customer account transaction must be kept in the system. A data store name is a plural name consisting of a noun and adjectives. Examples are: Students, Accounts receivables, Orders, and Daily Payments. A data store must be connected to a process with a data-flow. Each data store must have at least one input data-flow and at least one output data-flow (even if the output data-flow is a control or confirmation message). External Entity Symbol An external entity is a person, department, outside organization, or other information system that provides data to the system or receives outputs from the system. External entities are components outside of the boundaries of the information systems. They represent how the information system interacts with the outside world. For example, a customer submitting an order is an external entity because the customer supplies data to the order processing system (although we store information about a customer in a data store). External entities also are called terminators because they are data origins or final destinations. An external entity must be connected to a process through a data-flow. 3 Developing Data Flow Diagrams Data flow diagrams should be drawn systematically. Figure below illustrates the steps involved in successfully developing a data flow diagram. 4 5 Context-Level Diagram The context diagram is the highest level in a data flow diagram, containing only one process representing the entire system. All external entities are shown on the context diagram as well as major data flow to and from them. The diagram does not contain any data storage. Creating Diagram 0 The single process in the context-level diagram, representing the entire system, can be exploded to include the major processes of the system in the next level diagram, which is termed as diagram 0. 6 Next-Level Diagrams Processes in diagram 0 (with a whole number) can be exploded further to represent details of the processing activities. Example below shows the next level ((Diagram 1) of process explosion. 7 Conventions for Data-Flow Diagram The context diagram must fit in one page. The process name in the context diagram should be the name of the information system. For example, Grading System, Order Processing System, Registration System. Use unique names within each set of symbols. For example, there can be only one entity CUSTOMER in all levels of the data-flow diagrams. There can be only one process named CALCULATE OVERTIME among all levels of data-flow diagrams. Do not cross lines. One way to achieve this is to restrict the number of processes in a data-flow diagram. On lower-level data-flow diagrams with multiple processes, one should not have more than nine (9) process symbols. Another way to avoid crossing lines is to duplicate an external entity or data store. Use a special notation such as an asterisk, to denote the duplicate symbol. Use abbreviated identification for external entities and data-flows. For example, C for entity CUSTOMER, and D1 for data store STUDENT. Use a unique reference number for each process symbol. The process number in the context-level diagram is 0 (zero). Other process numbers are in the hierarchy of (1, 2, 3,…); (1.1, 1.2, 1.3, …., 2.1, 2.2, 2.3,….); (1.1.1, 1.1.2, 1.1.3,…………………………………………………………). Verify Errors in DFD 8 Balancing and Leveling Data-Flow Diagrams Balancing refers to the preservation of input and output data flows of the parent diagram on the child diagrams Leveling means that the information system is first displayed as a single process, and then shows more detail in subsequent child diagrams until all processes become functional primitives 9 Logical and Physical Data Flow Diagrams Data flow diagrams are categorized as either logical or physical. A logical data flow diagram focuses on the business and how the business operates. It is not concerned with how the system will be constructed. A physical data flow diagram shows how the system will be implemented, including the hardware, software, files, and people in the system. It is developed such that the processes described in the logical data flow diagrams are implemented correctly to achieve the goal of the business. Table below contracts the features of logical and physical data flow diagrams. Figure next page illustrates a logical data flow diagram and a physical data flow diagram for a grocery store cashier. Advantages of Logical Data Flow Diagram Better communication: A logical diagram is easier to use when communicating with the users of a system, because it is centered on business activities. More stable system: Systems formed using a logical data flow diagram are often more stable than those that are not because they are based on business events and not on a particular technology or method of implementation. Better understanding of business: A logical data flow diagram has a business emphasis and helps the analyst to understand the business being studied, to grasp why procedures are performed, and to determine the expected result of performing a task. Flexibility and maintenance: The new system will be more flexible and easier to maintain if its design is based on a logical model and the business functions are not subject to frequent change. Physical aspects of the system change more frequently than do business functions. Simple to develop: The logical model is easy to create and simpler to use because it does not often contain data stores other than files or a database. Easy conversion to physical model: Once the logical data flow diagrams are created, the creation of physical data flow diagrams becomes easy. Developing Physical Data Flow Diagrams Create the physical data flow diagram for a system by analyzing its output and input. Determine which data fields or elements need to be keyed and stored in a file or table. 10 When examining the output, determine whether the information must immediately be displayed or made available to users as reports. Processes that immediately display output on screens are on-line processes. Processes that require high volume of transactions, such as billing or check processing, or a large number of records that need to be summarized are usually batch processes. Create a process for each distinct output. Figure below describes what should be included in a physical data flow diagram. Advantages of Physical Data Flow Diagram Clarifying which processes are manual and which are automated: Manual processes require detailed documentation and automated process require computer programs to be developed. Describing processes in more detail than do logical DFDs: Describes all steps for processing of data. Sequencing processes that have to be done in a particular order: Sequence of activities that lead to a meaningful result are described. For example, update must be performed before a producing a summary report. Identifying temporary data storage: Temporary storage such as a sales transaction file for a customer receipt (report) in a grocery store, are described. Specifying actual names of files and printouts: Logical data flow diagrams describes actual filenames and reports, so that the programmers can relate those with the data dictionary during the developmental phase of the system. Adding controls to ensure the processes are done properly: These are conditions or validations of data that are to be met during input, update, delete, and other processing of data. 11 Partitioning Data Flow Diagrams Partitioning is the process of examining a data flow diagram and determining how it should be divided into collections of manual procedures and collections of computer programs. Analyze each process to determine whether it should be manual or automated procedure. Examples of manual processes are, organizing checks for payment posting, scanning a form, verifying all required fields on a form, and so on. Group automated procedures into a series of computer programs. Also determine whether a process is a batch processing (high volume of data processing from a file) or on-line processing (payment entry). To develop a collection or group of programs, analyze the nature of each process and group them according to the functionality of the processes. There are six reasons for partitioning data flow diagrams: Processing performed by different user groups: If the processes performed by several different user groups are physically located in different locations with the company, they should be partitioned into different computer programs. Timing: if two processes execute at different times, they can not be grouped into one program. Similar Tasks: If two processes perform similar tasks and both are batch processes, they may be grouped into one program. Efficiency: Several batch processes may be combined into one program for efficient processing. This can improve computer access time. Consistency of data: Processes may be combined into one program for consistency of data. Security: Processes may be partitioned into different programs for security reasons. 12