CSUN Information Systems
Systems Analysis & Design
http://www.csun.edu/~dn58412/IS431/IS431_SP16.htm

Database Design
IS 431: Lecture 9

Database Design in SDLC
[Figure]

Database Design
Elements of a Database
Evolution of Database Systems
DBMS Architecture
Relational Data Model and Relational Database Systems

Data Fields
A field is the physical implementation of a data attribute. Fields are the smallest unit of meaningful data.
A key (identifier) is a field whose unique values identify one and only one record in a file.
A secondary key is an alternate identifier for a record.
A descriptive field is any other (non-key) field that stores business data.

Records
A record is a collection of fields arranged in a predefined format.
  – Fixed-length record structures
  – Variable-length record structures

Files
A file is the set of all occurrences of a given record structure.
  – Types: master files, transaction files, document files, archival files, table lookup files, audit files
  – File organization: indexed, sequential …
  – File access: direct, sequential …

Conventional Files vs. the Database
File: a collection of similar records.
  – Files are unrelated to each other except in the code of an application program.
  – Data storage is built around the applications that use the files.
Database: a collection of interrelated files.
  – Records in one file (or table) are physically related to records in another file (or table).
  – Applications are built around the integrated database.

Files vs. Database
[Figure]

Pros and Cons of Conventional Files
Pros:
  – Easy to design because of their single-application focus
  – Excellent performance because the organization is optimized for a single application
Cons:
  – Harder to adapt to sharing across applications
  – Harder to adapt to new requirements
  – Need to duplicate attributes in several files

Pros and Cons of Databases
Pros:
  – Data independence from applications increases adaptability and flexibility
  – Superior scalability
  – Ability to share data across applications
  – Less and/or controlled redundancy (total non-redundancy is not achievable)
Cons:
  – More complex than file technology
  – Somewhat slower performance
  – Investment in DBMS and database experts
  – Need to adhere to design principles to realize benefits
  – Increased vulnerability due to consolidating data in a centralized database

Logical Structure of a Database
External level: View A, View B, View C
Conceptual level: Inventory, Sales, Customer, Cash Receipt
Internal level: INVENTORY RECORD
  – Item_No: INT(5), non-null, index=itemx
  – Description: CHAR(20)
  – Cost: CURRENCY(6,2)

Evolution of Database Systems
File Management (Flat File) Systems
Hierarchical Databases
Network Databases
Relational Databases
Object-Oriented Databases
Data Warehouse

File Management Systems
[Figure labels: Employee Update Program, Employee Report Program, and Check-Writing Program, each with file descriptions (FD); Employee Master File; Timecard File]

Hierarchical Databases
[Figure: parts hierarchy with the nodes Car, Engine, Body, Chassis, Left Door, Right Door, Hood, Roof, Handle, Window, Lock]

Network Databases
[Figure labels: CUSTOMERS (Acme Mfg., First Corp.), PRODUCTS (Size 4 Widget, 4D Bolt), ORDERS (#11231, #11232, #11233, #11234, #11235)]

Relational Databases
CUSTOMERS (CUST ID)
PRODUCTS (PRODUCT ID)
ORDERS (ORDER #, CUST ID, PRODUCT ID, QUANTITY)
  – CUSTOMERS to ORDERS: 1 to M
  – PRODUCTS to ORDERS: 1 to M

Object-Oriented Databases
CUSTOMERS
  – Data: CUST ID, CUST NAME, ADDRESS
  – Methods: Add Customer, Drop Customer, Change Customer
PRODUCTS
  – Data: PRODUCT ID, PRICE, QTY-ON HAND
  – Methods: Buy Product, Sell Product, New Product
ORDERS
  – Data: ORDER #, CUST ID, PRODUCT ID, QUANTITY
  – Methods: Take Order, Drop Order
CUSTOMERS and PRODUCTS each relate to ORDERS as 1 to many (*).

Data Warehouse
[Figure]

Difficulties of Non-Relational Tables
Update anomaly: not changing every occurrence of a data item (stored in many places)
Insert anomaly: adding an invalid (null) record to the database
Delete anomaly: not removing all information (stored in many places) about a deleted record

Relational Data Model
Table (Relation)
  – Row: a specific occurrence of an entity
  – Column: a characteristic of an entity; must be single-valued and of the same data type for all occurrences
Primary Key: the unique identifier of an occurrence of an entity (entity integrity)
Foreign Key: links tables together; must be null or match the value of a primary key in another table (referential integrity)

Requirements of Relational Data Model
The primary key must be unique.
A foreign key must either be null or match an existing value of the primary key of another table (the table it links to).
A column describes a characteristic of the object identified by the primary key.
A column is single-valued (ex: not “$100, $250”).
Values in the rows of a specific column must be of the same data type.
Column order and row order are unimportant.

Database Integrity
Entity integrity: an error exists when identifiers (primary keys) are not unique and so cannot identify specific instances of the entity (two customers have the same identification).
Referential integrity: an error exists when a foreign key value in one table has no matching primary key value in the related table (selling to non-existing customers).
Domain integrity: an error exists when a field value is outside the allowed range (a pencil costs $100,000!).

Data Architecture
A business’s data architecture defines how that business will develop and use files and databases to store all of the organization’s data; the file and database technology to be used; and the administrative structure set up to manage the data resource.
Data is stored in some combination of:
  – Conventional files
  – Operational databases (also called transactional databases)
  – Data warehouses
  – Personal databases
  – Work group databases

A Modern Data Architecture
[Figure labels: users and programmers; in-house and purchased information systems; legacy file-based information systems (built in-house and purchased) with their files; operational databases; data warehouse with end-user tools and end-user applications; personal databases; work-group database used by an end-user work group]

Database Architecture
Database architecture refers to the database technology, including the database engine, database utilities, CASE tools, and database development tools.
A database management system (DBMS) is specialized software that is used to create, access, control, and manage the database.
The core of the DBMS is a database engine.
  – A data definition language (DDL) is the part of the engine used to physically define tables, fields, and structural relationships (CREATE TABLE; ALTER TABLE …).
  – A data manipulation language (DML) is the part of the engine used to create, read, update, and delete records in the database, and to navigate between different files (tables) in the database (INSERT; SELECT … FROM … WHERE …).

Typical DBMS Architecture
[Figure]

Goals of Database Design
A database should provide for efficient storage, update, and retrieval of data.
A database should be reliable: the stored data should have high integrity and promote user trust in that data.
A database should be adaptable and scalable to new and unforeseen requirements and applications.

Normalization Revisited
First normal form (1NF)
  – No repeating groups of the same attribute
  – If not: create a new entity for the repeating group
Second normal form (2NF)
  – Non-key attributes depend on the whole (composite) key, not on just part of it
  – If not: create a new entity for the partially dependent attributes
Third normal form (3NF)
  – Non-key attributes depend only on the (primary) key, not on each other
  – If not: create a new entity for the attributes that depend on other non-key attributes

Database Models
An Entity Relationship Diagram (ERD) is the logical model of the data requirements (the data to be stored in the system and their relationships).
A Database Schema (physical data model) is the blueprint of the planned implementation of the logical model.
One can also use the Relational Data Model as a blueprint for relational database design.
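
The DDL and DML statements named on the Database Architecture slide can be tried out in a few lines. The following is only a minimal sketch: it assumes SQLite through Python's standard sqlite3 module rather than the MS Access engine used in the lecture, borrows the Item_No, Description, and Cost fields from the INVENTORY RECORD example, and invents the qty_on_hand column, the sample rows, and the costs purely for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the lecture's DBMS.
conn = sqlite3.connect(":memory:")

# DDL: physically define the table, its fields, and its key.
conn.execute("""
    CREATE TABLE inventory (
        item_no     INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        cost        NUMERIC
    )
""")
# DDL: change the structure later without touching application code.
conn.execute("ALTER TABLE inventory ADD COLUMN qty_on_hand INTEGER DEFAULT 0")

# DML: create records (sample values are invented).
conn.execute(
    "INSERT INTO inventory (item_no, description, cost) VALUES (?, ?, ?)",
    (1, "Size 4 Widget", 2.50),
)
conn.execute(
    "INSERT INTO inventory (item_no, description, cost) VALUES (?, ?, ?)",
    (2, "4D Bolt", 0.75),
)
# DML: update and delete records.
conn.execute("UPDATE inventory SET qty_on_hand = 40 WHERE item_no = 1")
conn.execute("DELETE FROM inventory WHERE item_no = 2")

# DML: read records with SELECT ... FROM ... WHERE.
rows = conn.execute(
    "SELECT item_no, description, qty_on_hand FROM inventory WHERE cost > 1"
).fetchall()
print(rows)  # [(1, 'Size 4 Widget', 40)]
```

The CREATE TABLE and ALTER TABLE statements change the structure of the database, while INSERT, UPDATE, DELETE, and SELECT only touch the records stored in it, which is the distinction the two bullets draw.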

Physical Data Types
Logical data type (to be stored in a field), followed by the physical data type in each DBMS:
Fixed-length character data (use for fields with relatively fixed-length character data)
  – MS Access: TEXT
  – Microsoft SQL Server: CHAR (size) or character (size)
  – Oracle: CHAR (size)
Variable-length character data (use for fields that require character data but for which size varies greatly, such as ADDRESS)
  – MS Access: TEXT
  – Microsoft SQL Server: VARCHAR (max size) or character varying (max size)
  – Oracle: VARCHAR (max size)
Very long character data (use for long descriptions and notes, usually no more than one such field per record)
  – MS Access: MEMO
  – Microsoft SQL Server: TEXT
  – Oracle: LONG VARCHAR or LONG VARCHAR2
Integer number
  – MS Access: NUMBER
  – Microsoft SQL Server: INT (size) or integer or smallinteger or tinyinteger
  – Oracle: INTEGER (size) or NUMBER (size)
Decimal number
  – MS Access: NUMBER
  – Microsoft SQL Server: DECIMAL (size, decimal places) or NUMERIC (size, decimal places)
  – Oracle: DECIMAL (size, decimal places) or NUMERIC (size, decimal places) or NUMBER
Financial number
  – MS Access: CURRENCY
  – Microsoft SQL Server: MONEY
  – Oracle: see decimal number
Date (with time)
  – MS Access: DATE/TIME
  – Microsoft SQL Server: DATETIME or SMALLDATETIME, depending on precision needed
  – Oracle: DATE
Current time (use to store the date and time from the computer’s system clock)
  – MS Access: not supported
  – Microsoft SQL Server: TIMESTAMP
  – Oracle: not supported
Yes or No; or True or False
  – MS Access: YES/NO
  – Microsoft SQL Server: BIT
  – Oracle: use CHAR(1) and set a yes or no domain
Image
  – MS Access: OLE OBJECT
  – Microsoft SQL Server: IMAGE
  – Oracle: LONG RAW
Hyperlink
  – MS Access: HYPERLINK
  – Microsoft SQL Server: VARBINARY
  – Oracle: RAW
Can the designer define new data types?
  – MS Access: NO
  – Microsoft SQL Server: YES
  – Oracle: YES

Data Distribution
Data distribution analysis establishes which business locations need access to which logical data entities and attributes.
Centralization
Horizontal distribution (data partitioning: rows or records)
Vertical distribution (data partitioning: columns or attributes)
Replication

Data Distribution Options
Store all data on a single server (centralization).
Store specific tables on different servers.
Store subsets of a specific table on different servers:
  – Subsets of columns (vertical partitioning)
  – Subsets of rows (horizontal partitioning)
Replicate (duplicate) specific tables or subsets on different servers for backup (replication).

Database Distribution: Vertical partitioning
[Figure]

Database Distribution: Horizontal partitioning
[Figure]

Database Distribution
[Figures]

Relational Databases
Relational databases implement stored data in a series of two-dimensional tables that are “related” to one another via foreign keys.
  – The physical data model is called a schema.
  – The DDL and DML for a relational database is called SQL (Structured Query Language).
  – Triggers are programs embedded within a table that are automatically invoked by updates to another table.
  – Stored procedures are programs embedded within a table that can be called from an application program.

A Method for Database Design
1. Review the logical data model.
2. Create a table for each entity.
3. Create fields for each attribute.
4. Create an index for each primary and secondary key.
5. Create an index for each subsetting criterion.
6. Designate foreign keys for relationships.
7. Define data types, sizes, null settings, domains, and defaults for each attribute.
8. Create or combine tables to implement supertype/subtype structures.
9. Evaluate and specify referential integrity constraints.

From Logical Data Model …
[Figure: ERD in which Customer (Cust No) places Order (Order No) with cardinalities (1,1) and (1,M), and Order contains Product (Product No) with cardinalities (1,M) and (1,N)]
CUSTOMER (Cust No, ….)
ORDER (Order No, Cust No, ….)
PRODUCT (Product No, …)
ORDER-PRODUCT (OrderNo, ProductNo, …)

… to Physical Data Model (Relational Schema) …
[Figure]

… to “MS Access” Implementation
[Figure]

Database Components
[Figure labels: Form Builder, Report Writer, Interactive Query Tool, Application Program, Database Front-end, Database Engine, Database, Database Gateway (to other computer systems and to other DBMS brands)]

The Role of SQL
[Figure: the same components as on the Database Components slide, with SQL as the language passed between them]

Administrators
The data administrator is responsible for data planning, definition, architecture, and management in the organization. (CIO)
Database administrators are responsible for the database technology, database design and construction, security, backup and recovery, and performance tuning. (Supervisors)
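
Returning to the design method and the CUSTOMER, ORDER, PRODUCT, and ORDER-PRODUCT relations from earlier in the lecture, the steps can be sketched in code. This is an illustration under several assumptions, not the lecture's MS Access implementation: it uses SQLite through Python's sqlite3 module, renames ORDER to orders because ORDER is a reserved word in SQL, and invents the non-key columns, the price range in the CHECK constraint, and all sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Steps 2-3, 6-7, 9: one table per entity, fields with types, null settings,
# defaults and domains (CHECK), and foreign keys for the relationships.
conn.executescript("""
    CREATE TABLE customer (
        cust_no   INTEGER PRIMARY KEY,
        cust_name TEXT NOT NULL
    );
    CREATE TABLE product (
        product_no INTEGER PRIMARY KEY,
        price      NUMERIC NOT NULL CHECK (price >= 0 AND price < 1000)
    );
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        cust_no  INTEGER NOT NULL REFERENCES customer (cust_no)
    );
    CREATE TABLE order_product (
        order_no   INTEGER NOT NULL REFERENCES orders (order_no),
        product_no INTEGER NOT NULL REFERENCES product (product_no),
        quantity   INTEGER NOT NULL DEFAULT 1 CHECK (quantity > 0),
        PRIMARY KEY (order_no, product_no)
    );
    -- Steps 4-5: an extra index on a column used to select subsets of orders.
    CREATE INDEX idx_orders_cust_no ON orders (cust_no);
""")


def violates_integrity(sql):
    """Return True if the DBMS rejects the statement on integrity grounds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Valid sample data (invented for illustration).
with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Acme Mfg.')")
    conn.execute("INSERT INTO product VALUES (10, 2.50)")
    conn.execute("INSERT INTO orders VALUES (11231, 1)")
    conn.execute("INSERT INTO order_product VALUES (11231, 10, 5)")

# Entity integrity: a second customer with the same identifier.
entity_error = violates_integrity("INSERT INTO customer VALUES (1, 'First Corp.')")
# Referential integrity: an order for a customer that does not exist.
referential_error = violates_integrity("INSERT INTO orders VALUES (11232, 99)")
# Domain integrity: a price outside the allowed range.
domain_error = violates_integrity("INSERT INTO product VALUES (11, 100000)")

# Tables are linked through their foreign keys at query time.
rows = conn.execute("""
    SELECT c.cust_name, o.order_no, op.product_no, op.quantity
    FROM customer AS c
    JOIN orders AS o ON o.cust_no = c.cust_no
    JOIN order_product AS op ON op.order_no = o.order_no
""").fetchall()
print(entity_error, referential_error, domain_error)  # True True True
print(rows)  # [('Acme Mfg.', 11231, 10, 5)]
```

All three invalid inserts are rejected by the database itself rather than by application code, which is what declaring keys, domains, and referential integrity constraints in the schema is meant to achieve.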