IS-270, Week 2
The database development lifecycle and fact-finding
Jean-François Blanchette
Department of Information Studies
Graduate School of Education and Information Studies

Components of a Database System

Organizational Database Systems

Database Management System
- A database is a self-describing collection of related records
- A database management system (DBMS) serves as an intermediary between database applications and the database
  - It creates, processes and administers the databases it controls
- A database application is a set of one or more computer programs that serves as an intermediary between the user and the DBMS
- Users use a database application to enter, read, delete and query data and to produce reports

Functions of a DBMS
- Create databases
- Create tables
- Create supporting structures
- Read database data
- Modify database data (insert, update, delete)
- Maintain database structures
- Enforce rules
- Control concurrency
- Provide security
- Perform backup and recovery

Real-world DB design is difficult!
- The implementation of an information processing/management system is a significant engineering endeavor:
  - The project must complete on time and on budget;
  - When operational, the system must meet its requirements and operate reliably and efficiently;
  - Documentation and coding for the project must be such that the system can be maintained and enhanced over a long lifetime;
  - Most importantly, it must meet the needs of its users

Very difficult!
- “The software crisis”: a 1996 UK study states that:
  - 80-90% of systems are unable to meet their goals!
  - 80% are delivered late or over budget
  - 40% of projects fail or are abandoned
  - Fewer than 25% properly integrate business with technology objectives
  - Only 10-20% meet their success criteria
- Two major causes:
  - (A) Poor or incomplete requirements specification
  - (B) Lack of structured development methodology

(A) Poor requirements
- One explanation: systems analysts are trained to focus on considerations of technical efficiency (speed, storage, cost)…
- … not on organizational aspects of information systems implementation:
  - How the system will fit within a given organizational structure
  - How the system will fit the mental structures of users (cognitive ergonomics)
- Fact-finding methodologies have evolved in response to this problem

(B) Steps to structured development
- (1) Planning
- (2) System definition
- (3) Requirements collection and analysis
- (4) Logical design
- (5) Physical design
- (6) Application design
- (7) Prototyping
- (8) Implementation
- (9) Testing
- (10) Maintenance

Database planning
- Mission statement and objectives:
  - Defines the major aims of the DBMS
  - Identifies particular tasks the DBMS must support
  - Estimates resources: work, money, time
- Documents how data will be collected
- Documents legal and business requirements regarding data acquisition (feasibility?), manipulation (privacy?), storage (security?)
- Other contexts to consider: organizational culture, ethical practices, political issues, etc., may all play a role in the success of the project

System definition
- System boundaries: How does the database interface with other systems within the organization?
- User view: What is required of a database from the perspective of particular users/departments.
  For example, for a library catalog database:
  - Patrons need to perform searches and browse through the catalog to view descriptions of library material;
  - Clerks need to enter new patron information, check out books, and print out receipts for late fees;
  - Branch managers need to enter new employee information;
  - Regional managers need to extract figures for all branches within the province

Requirements
- Collect information for each major user view using fact-finding techniques:
  - Data to be used and/or generated
  - Expected transactions (retrievals, updates)
  - Policies for each view (what can and can’t be done with the data)
- Analyze and distill information into statements regarding the expected functionality of the system:
  - Patron: perform searches by title and author; …
  - Clerk: enter patrons, etc…

Database design
- Logical design:
  - Identify and model the objects to be represented
  - Identify and model their attributes, relationships, constraints
- Physical design:
  - Translate the model into relational tables
  - Decide file organization
  - Decide indexes

Application design
- In our case (Access), application means queries (transactions) and the input/output screens (forms and reports);
- Three types of transactions:
  - Retrieval of data for display or report
  - Update: insert, delete, or modify records
  - Mixed: retrieval and update
- For each transaction, document:
  - Data to be used by the transaction
  - Output of the transaction
  - Importance and expected rate of usage
- Good and friendly user interface design is crucial to usability

Prototyping, etc.
- Prototyping: create a working mockup in order to test and refine the design
- Implementation: physical realization
- Testing:
  - Debug
  - Verify conformance to requirements
- Operational maintenance:
  - Performance monitoring and tuning
  - Upgrading/updating, as new requirements surface

Discussion: TR Commission
- Is it possible to fit emotional & violent events into the cold analytical structure of databases? What kind of stories do databases tell?
- Was the data model effective in capturing 'real-world' relationships between victims, events, violations?
- What is the relationship between (a) keeping track of individual violations, and (b) revealing trends?
- Advantages/disadvantages of recording violations using free-form narrative text, forms, and semi-structured forms?

Fact-finding

Why fact-finding?
- In order to properly identify requirements…
- … and thus derive an accurate data model…
- … the designer must learn about the problems, opportunities, constraints, terminology, and priorities of the organization and users
- Crucial issue! If we correctly identify and model information management needs, we will:
  - Make better information systems
  - Guarantee user/market acceptance
  - Even create new markets!

Fact-finding techniques
- Information for the stages of database planning, system definition, and requirements collection and analysis is obtained through fact-finding techniques
- Common methods:
  - (1) Review documentation
  - (2) Questionnaires
  - (3) Interviewing
  - (4) Participant-observation
  - (5) ‘Naturalistic inquiry’
  - (6) Good old-fashioned research

(1) Reviewing documentation
- Overall information management problem is found in:
  - Memos, e-mails, minutes of meetings
- Description of organization:
  - Organizational charts, business/strategic plans, job descriptions, forms, reports (manual and computerized)
- Current information system documentation:
  - Flowcharts, diagrams, program documentation, user/training manuals

(2) Questionnaires
- Advantages:
  - Gather facts from a large number of people while controlling the format of answers
  - Confidential, anonymous, convenient
  - (Fixed-format) easily tabulated and analyzed
- But …
  - Preparation is time-consuming
  - Body language of respondent unavailable
  - No opportunity to reword questions when they are misinterpreted
  - Low (5-10%) response rate

(3) Interviews
- Most commonly used method: meet face-to-face with relevant staff and management, using a (semi-)structured
  format
- Advantages:
  - Fine-grained, flexible, allows for rewording
  - Body language of respondent is observable
  - Involves end-users in the design process
- But …
  - Very time-consuming
  - Costly to transcribe, code, and process …
  - Requires good communication skills from the system analyst

(4) Participant-observation
- Participate in/observe staff performing job tasks
- Advantages:
  - Enables you to observe tacit knowledge that is not made explicit by other means
  - Makes interpersonal and organizational dynamics visible
- But:
  - To obtain representative observations, it must be performed over a significant time period
  - “Heisenberg Principle” (observer effect): observation alters what is being observed

(5) ‘Naturalistic inquiry’
- The productivity paradox and the emergence of a collective/cooperative context for IS have led to a growing appreciation for ‘naturalistic inquiry’
- Suite of research methodologies/data collection techniques:
  - Participant observation, respondent interviewing, artifact collection, generation of field notes
  - From the anthropological tradition of ethnography & the sociological tradition of ethnomethodology
- Better at discovering the (often tacit) meanings and understandings that participants in social contexts negotiate and derive from interactions with IT and other participants, increasing the likelihood that the system will be used effectively

Example:
- In “The Myth of the Paperless Office”, Sellen & Harper (2002) use ‘naturalistic inquiry’ to examine how office personnel use paper to organize their work:
  - Fine-grained analysis of reading practices (reports, memos, etc.); observation of filing practices; etc.
- Findings:
  - Paper is intimately tied to how knowledge workers organize their work, individually and collectively
  - Replacing all paper with IT is inefficient: better to rethink work organization and design information systems (with or without paper) around it

(6) Old-fashioned research
- Other people might have solved the same problem you are facing
- Search:
  - Reference books
  - Computer trade journals
  - Web
- For example, generic data models are available for many common information management scenarios:
  - Lending, invoicing, scheduling, grading, etc.

Outputs of fact-finding
- Collected data is used to produce several descriptions of the future database:
  - (1) Mission statement
  - (2) Objectives
  - (3) User views
  - (4) Requirements specification
  - (5) System specification

(1) Mission Statement
- Concise summary of the overall purpose of the database (e.g. Making the Case):
  - “Telling the truth in such a way that it cannot be denied is the first need of a truth commission established in the aftermath of gross human violations. The purpose of the TRC information management system is to provide a collective memory for those violations and the ability to relate information from different sources.
    By so doing, it allows anyone in the organization to access information collected by any investigator, without restriction.”

(2) Objectives
- High-level description of the expected functionalities of the DB:
  - (a) Maintain (enter, update, delete) data on branches, staff, material, patrons, borrowing, suppliers, orders to suppliers
  - (b) Perform searches on material, staff, borrowed material, members, suppliers
  - (c) Track status of book orders, stock, borrowed material
  - (d) Report on staff, borrowed material, patrons (at each branch), suppliers, orders (system-wide)

(3) User views
- Identify classes of users according to objectives:
  - Director: To report on video, staff, members at all branches
  - Manager: To maintain (enter, update, delete) data on branch and staff at given branch
  - Clerk: To maintain (enter, update, delete) data on videos and members at given branch

(4a) Requirements specification
- For each user view, establish:
  - (a) What data is needed + constraints
  - (b) Transaction (entry, update, queries) requirements
- (a) Data requirements (Branch view):
  - Before borrowing, patrons must register as a member of a local branch
  - Must provide first and last name, address
  - Each member is assigned a unique number across all branches
  - Members can borrow up to 10 books at any one time

(4b) Requirements specification
- (b) Transaction requirements:
  - Data entry:
    - Create new member
    - Create new borrowing agreement
  - Data update/deletion:
    - Update/delete member of staff
    - Update/delete details on book
  - Data queries:
    - List name, position, salary of staff at a given branch, ordered by staff name
    - List the details of all copies of a given book at a specified branch

(5) System specification
- How much data will the database contain?
  - 20 000 titles; 400 000 copies; 10 branches; 2000 staff; 100 000 patrons
- How much will the database grow by?
  - 100 new titles, 20 staff, 1000 patrons/month
- What types & average number of record searches:
  - Catalog searches: 10 000/day (peak 11 am)
  - Borrowing: 5 000/day (peak 5 pm)
- Performance (response time): for which transactions is good performance critical to the smooth operation of the organization?
- Networking issues, backup, recovery
- Security:
  - Determine privileges appropriate to each user view

Assignments
- Labs:
  - To make sure you absorb the relational DB concepts presented in class (relationships, queries, modeling)
- Requirement analysis:
  - To make sure you understand the difficulty of analyzing and solving a data management problem using relational databases
  - To increase your familiarity with this genre of technical writing
- Database project:
  - Get to design a reasonably complex DB
- Final paper:
  - Knowledge + writing & analytical skills
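
Worked sketch: from requirements to tables
As a rough illustration of how the library requirements in slides (4a) and (4b) might be carried into a relational design, the script below uses Python's standard-library sqlite3 module rather than Access. Every table, column, trigger, and function name is an assumption made for this example; none comes from the slides. It covers two items only: the data requirement that a member can hold up to 10 books at any one time (enforced here with a trigger, which is one possible choice among several), and the query "list name, position, salary of staff at a given branch, ordered by staff name".

```python
import sqlite3

# Hypothetical schema: all names below are illustrative assumptions.
SCHEMA = """
CREATE TABLE branch (
    branch_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE staff (
    staff_id  INTEGER PRIMARY KEY,
    branch_id INTEGER NOT NULL REFERENCES branch(branch_id),
    name      TEXT NOT NULL,
    position  TEXT NOT NULL,
    salary    REAL NOT NULL
);
CREATE TABLE member (
    member_no  INTEGER PRIMARY KEY,  -- one numbering shared by all branches
    branch_id  INTEGER NOT NULL REFERENCES branch(branch_id),
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    address    TEXT NOT NULL
);
CREATE TABLE loan (
    loan_id    INTEGER PRIMARY KEY,
    member_no  INTEGER NOT NULL REFERENCES member(member_no),
    book_title TEXT NOT NULL,
    returned   INTEGER NOT NULL DEFAULT 0  -- 0 = still on loan
);
-- Data requirement: at most 10 books on loan per member at any one time.
CREATE TRIGGER loan_limit BEFORE INSERT ON loan
WHEN (SELECT COUNT(*) FROM loan
      WHERE member_no = NEW.member_no AND returned = 0) >= 10
BEGIN
    SELECT RAISE(ABORT, 'member already has 10 books on loan');
END;
"""


def open_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def staff_at_branch(conn, branch_id):
    """Data query: name, position, salary of staff at a branch, ordered by name."""
    return conn.execute(
        "SELECT name, position, salary FROM staff "
        "WHERE branch_id = ? ORDER BY name",
        (branch_id,),
    ).fetchall()


def borrow(conn, member_no, book_title):
    """Data entry: create a borrowing agreement.

    Returns False if the database rejects the insert, for example
    because the 10-book limit has been reached.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO loan (member_no, book_title) VALUES (?, ?)",
                (member_no, book_title),
            )
        return True
    except sqlite3.DatabaseError:
        return False


if __name__ == "__main__":
    conn = open_db()
    conn.execute("INSERT INTO branch (branch_id, name) VALUES (1, 'Central')")
    conn.execute(
        "INSERT INTO staff (branch_id, name, position, salary) "
        "VALUES (1, 'Zoe', 'Clerk', 30000)"
    )
    conn.execute(
        "INSERT INTO member (member_no, branch_id, first_name, last_name, address) "
        "VALUES (100, 1, 'Ann', 'Lee', '1 Main St')"
    )
    conn.commit()
    print(staff_at_branch(conn, 1))
    print(borrow(conn, 100, "Some Title"))
```

Putting the 10-book rule in the database rather than in forms or application code means every data-entry path is subject to it. The same rule could instead be checked in the application layer, which may be the more practical route in Access.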