Introduction to Database Processing
IS 240 – Database Management
Lecture #1 – 2004-01-15
Prof. M. E. Kabay, PhD, CISSP
Norwich University
mkabay@norwich.edu
http://www2.norwich.edu/mkabay
Copyright © 2004 M. E. Kabay. All rights reserved.

Topics
- Why study DBMS?
- DBMS Applications
- Historical Overview
- Defining “Database”
- How a DBMS Is Used
- Historical Development of DBMS
- Some Fundamental Issues in DB Applications
- HOMEWORK

Why study DBMS?
- Central technology of today’s information technology (IT)
- Teaches orderly analysis of data requirements and relationships
- Opportunity to understand internals underlying externals of applications
- Provides basis for rapid assimilation and application of wide range of specific DBMS tools
- Structured Query Language (SQL) almost universally used in industry
- Increases likelihood of good jobs

DBMS Applications
- DBMS = database management system
- Database contains one or more tables (files, datasets)
  - columns = fields
  - rows = records
- Relations among tables help navigate DB
- DB application allows access to database
  - usable formats
  - data entry
  - reports

Concurrency
- Single-user database allows only one user at a time
  - aka exclusive access
- Types of access permissions
  - READ
  - WRITE
  - APPEND
  - LOCK
- Multi-user databases need to protect against damage to records

Concurrency (cont’d)
Events in TIME order:
- Joe accesses Widget record
- Shakheena accesses Widget record
- Inventory shows 25 to both
- Joe takes out 10
- Application writes out record to DB
- Inventory now shows 15
- Shakheena takes out 5 from her copy of data (25)
- Inventory now shows 20

Historical Overview
- How did people handle masses of data?
- Manual systems
  - Clay tablets
  - Parchment
  - Paper
  - Abacus
- Punch cards (1890-1960)
- File systems (1950-present)
- DBMS (1970-present)
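The Joe and Shakheena timeline on the Concurrency slides is a lost update: both users read 25, and the second write overwrites the first. The sketch below replays that sequence with Python's built-in sqlite3 module. The table name `inventory`, the helper functions, and the use of a single in-memory connection to stand in for two users are assumptions made for illustration, not part of the lecture. The second function shows one common remedy, letting the database compute the new quantity in a single UPDATE statement.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding one Widget row with 25 in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO inventory VALUES ('Widget', 25)")
    conn.commit()
    return conn


def read_qty(conn):
    """Return the current Widget quantity stored in the database."""
    row = conn.execute(
        "SELECT qty FROM inventory WHERE item = 'Widget'"
    ).fetchone()
    return row[0]


def lost_update():
    """Replay the slide: each user writes back a value computed from a stale copy."""
    conn = make_db()
    joe_copy = read_qty(conn)          # Joe sees 25
    shakheena_copy = read_qty(conn)    # Shakheena also sees 25
    # Joe takes out 10 and the application writes 15.
    conn.execute(
        "UPDATE inventory SET qty = ? WHERE item = 'Widget'", (joe_copy - 10,)
    )
    # Shakheena takes out 5 from her stale copy and overwrites Joe's result.
    conn.execute(
        "UPDATE inventory SET qty = ? WHERE item = 'Widget'",
        (shakheena_copy - 5,),
    )
    conn.commit()
    return read_qty(conn)


def atomic_update():
    """Let the database subtract from the stored value instead of a stale copy."""
    conn = make_db()
    for taken in (10, 5):
        conn.execute(
            "UPDATE inventory SET qty = qty - ? WHERE item = 'Widget'", (taken,)
        )
    conn.commit()
    return read_qty(conn)


if __name__ == "__main__":
    print("lost update leaves:", lost_update())
    print("atomic update leaves:", atomic_update())
```

In the first function the inventory ends at 20 even though 15 widgets were removed; in the second it ends at 10. A real multi-user DBMS would also rely on locking or transactions, which this single-connection sketch does not attempt to show.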

Problems with File Systems
- Separated, isolated data
- Duplication of data
- File-format dependency
- File incompatibilities
- Hard to show useful views of data

Problems: Separated, Isolated Data
- Multiple files for different aspects of system
- Linkages handled entirely by application programming
- Coordinate access to multiple files for different functions
- Some databases have hundreds of files

Problems: Duplication of Data
- Early collections of files duplicated data
  - e.g., identifiers (name, address, ...)
- Easy to generate discrepancies
  - copies of data in different records and different files could diverge from each other
- Frustrating for users and clients
  - enter same information over and over
- Results inconsistent, contradictory
  - send invoice to old address in one program, new address in other program

Problems: File-format Dependency
- Structure of data files hard-coded in application program
- All changes to data files require modification of programs
  - rewrite data description
  - rewrite special code for linking or searching
  - recompile source code to generate object code
  - update documentation

Problems: File Incompatibilities
- Different analysts and programmers used different data definitions
  - NAME has 20 chars
  - NAME has 40 chars
- Different names for fields
  - SSN vs SS#
  - LAST_NAME vs L_NAME
- Different record structures
  - LAST | FIRST | STREET1 | STREET2 | CITY
  - NAME | ADDRESS | CITY_&_STATE

Problems: Hard to Show Useful Views of Data
- Combining fields from different records in different files necessary for most users
  - reports
  - on-screen visualization
- Every report / screen requires special programming
  - find data (often by serial search)
  - place in output in specific positions
  - all require a great deal of programming

Defining “Database”
- “A database is a self-describing collection of integrated records.”
  - Self-describing
  - Integrated
  - Model of a model

How a DBMS Is Used
[Diagram labels: APPLICATION PROGRAMS, QUERY, TOOLS, API, INTERNALS, DATA DICTIONARY, DATA]

Self-describing
- Databases have data dictionaries
  - aka data directory or metadata
- Data dictionary supports independence between programs and database
  - Change in data dictionary does NOT usually require change in program
  - Enormous reduction in programming complexity and maintenance of programs
- Data dictionary supports independence between database and documentation
  - Constant problem: bad documentation
  - DBMS helps reduce dependence on manual documents

Integrated
- Files are accessed in systematic way
- Special files maintain indexes that help speed access
  - “Find all records where name begins with S”
  - “Find records where city_population > 750,000 and household_median_income > $50,000”
- Application metadata can include report structures
  - “Print the invoice for Mrs Smith’s fuel oil deliveries this month”

Model of a model
- Databases are designed by people
  - DB does not directly reflect “reality”
  - DB reflects designer’s decisions about how to represent user’s perceptions of what matters
- “The availability of a tool determines perceptions of what’s a reasonable request.”
  - As users learn to use their DB, they begin to think in new ways
  - Recognize new possibilities, need new functions
- Databases evolve as they are used

Historical Development of DBMS
- 1970s: E. F. Codd – relational DB model
  - normalization of data
  - reduce repetition
- 1980s: Microcomputers
  - dBase II
    - not DBMS
    - not relational
    - but interfaces improved
  - mainframe products ported to PCs
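The “Self-describing” and “Integrated” slides earlier in the lecture can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a DBMS; the `customer` table, its columns, the index name, and the sample rows are all hypothetical and chosen only for illustration. The example asks the database to describe its own structure (its data dictionary) and then runs the “name begins with S” query from the slide.

```python
import sqlite3

# In-memory database with one hypothetical table and one index on name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Smith", "Northfield"), ("Jones", "Barre"), ("Sanchez", "Montpelier")],
)

# Self-describing: the database reports its own column names and types.
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

# The catalog table sqlite_master lists the tables and indexes that exist.
objects = [
    (row[0], row[1])
    for row in conn.execute("SELECT type, name FROM sqlite_master ORDER BY name")
]

# Integrated: "Find all records where name begins with S".
s_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE name LIKE 'S%' ORDER BY name"
    )
]

if __name__ == "__main__":
    print(columns)
    print(objects)
    print(s_names)
```

Because the program obtains the column list from the database rather than hard-coding a record layout, adding a column to `customer` would not by itself require changing the query code, which is the program-data independence the slide describes.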

Historical Development of DBMS (cont’d)
- Mid-1980s: client-server architecture
  - Link inexpensive computers in networks (LANs)
  - Store data on servers
  - Run client programs on workstations for user interface, some computations, reports
- 1990s: Web-based systems
  - Web exploded into use ~1993
  - Common interface: browser
  - client software reading standard formatting codes – HTML, XML, JAVA

Some Fundamental Issues in DB Applications
- Ethical constraints on data gathering and usage (details later in course)
  - How do we protect data subjects against abuse?
- Security (details later)
  - confidentiality
  - control
  - integrity
  - authenticity
  - availability
  - utility

Homework
- Study Kroenke Chapter 1 using SQ3R
- By next Thu, 23 Jan 2003: required
  - Write out answers to each of the Group 1 Questions 1.1 through 1.23 using a computer (23 points); print your responses and bring them to class
  - Short (½ to 1 page) response to Project A on page 23 (2 points)
  - Short response to Project B (2 points)

For Extra Credit
- 1 point each: optional
- Write out & print answers by Thu 30 Jan 2003
  - FiredUp Project Question A
  - FiredUp Project Question B (only if A done)

Preparation for Next Class
- Review Kroenke Chapter 1
- Scan Kroenke Chapter 2
- Apply SQ3R global scan to Ferrett et al
- Study Ferrett et al Projects 1 & 2
- Load files from Ferrett CD-ROM onto your own computer if you have one
- Amazon has the 1st chapter of Kroenke; search on Kroenke and look for the 8th edition. Also see the Kroenke Web site at http://www.prenhall.com/kroenke

DISCUSSION