Introduction
Ellen Walker
CPSC 356 Database Design
Hiram College

CPSC 356: Database Design
• Sources of information
  – Ellen Walker (walkerel@hiram.edu)
  – Web page (http://cs.hiram.edu/~walkerel/cs356)
  – Syllabus handout (available on the web page)

Database Implementation Projects
• External clients
• Clients will…
  – Meet with you within the next two weeks
  – Approve each of 3 major phases
  – Attend presentations on the last day of class
  – Help me to evaluate your projects
• You may find your own client, but you cannot be your own client

Database Implementation Projects
• You will…
  – Each take a role on the project (see overview document) and switch roles for each phase
  – Meet regularly (at least once a week) - in person or electronically (and keep minutes)
  – Submit preliminary reports and final phase deliverables as specified in the syllabus. (Only final deliverables are graded).
  – Submit confidential evaluations of your group members with each deliverable.

Project Groups
• Usually, projects are done by groups of 3-4 people
• We have 5 in the class
  – Divide into 2 groups (2, 3)
  – Work as a single group (5)

Entrepreneurial Mindset
• “A way of thinking and acting to create a new product, service or activity that satisfies a need and adds value to one’s self and community” - from the grant proposal
• As consultants, you will apply this mindset throughout your project
• At points in the course, we will look at relevant literature and examples of entrepreneurship in the software industry.

What is a Database?
• “Collection of related data”
• Usually large (too large to fit in computer memory at once)
• Can be centralized or distributed
• Generally accessed by “query” -- retrieving only “relevant” parts at once

Where are Databases?
• E-commerce (e.g. shopping carts)
• Airline & hotel reservation systems
• Credit card bureaus
• Manufacturers (e.g. parts, tests, defects)
• Libraries
• ERP systems (e.g. SCT Banner)

Where else are databases?
• The human genome project
• Finance software (e.g. Quicken)
• …

Who Interacts with Databases?
• Creating the System
  – System Analyst
  – Database Designer
  – Application Programmer
  – Project Manager
• Once the System Exists
  – Database Administrator
  – System Administrator
  – End User

Advantages of Databases
• Redundancy Control
• Data Consistency and Integrity
• Data Sharing and Integration
• Security
• Improved Maintenance
• Concurrency (without data loss)
• Backup & Recovery services

Disadvantages…
• Complexity
• Size
• Cost (Hardware & Software)
• Performance
• Conversion
• Risk of failure

File-based Systems
• Since the 1950s…
• Methods and vocabulary from paper records
• Custom-programmed individual applications
  – Sales (enter data, retrieve properties, contact clients)
  – Contracts (enter data, record leases)

Terminology
• File - collection of records
• Record - set of logically connected data (one instance)
• Field - element of a record
• Example:
  – File = library card catalog
  – Record = one card
  – Field = author’s name

Dream Home Example
• Sales
  – Property record (address, owner, rooms, rent)
  – Client record (name, address, phone, property requirements)
• Contracts
  – Lease record (client, property, rent, payment info)
  – Property record (address, rent)
  – Client record (name, address, phone)

What’s Wrong?
• Each program stores its own data
  – No cross-program queries (% of clients that actually rented?)
  – Multiply entered data (property addresses)
    • Cost
    • Inconsistency
• … in its own format
  – Incompatible data files
  – Change the format -> change the program!
• … to satisfy a limited set of queries
  – Program proliferation!

How to Fix?
• Separate data handling from application program
• Make data definitions standardized and external
• Develop reusable query algorithms to be controlled by external information

Better Definition of Database
• Database: A shared collection of logically related data and a description of data, designed to meet the information needs of the organization.
• Database Management System (DBMS): A software system that enables users to define, create, maintain, and control access to the database.

Data / Program Separation
• In addition to data, database holds metadata (data about data), e.g. field names and data types.
• DBMS now a single program (for all databases) that acts on both metadata and data
• Provides a form of data abstraction

Organizing Data
• Entity - a distinct object (noun) to represent
• Attribute - property of an entity
• Relationship - association between entities (usually a verb)
• Example
  – Entities: customer, home
  – Attributes: name (of customer), address (of home)
  – Relationship: customer RENTS home

Database Management System
[Figure. Labels: Transaction (Query, Update, non-db op); High Level Language (e.g. SQL); DBMS; Low-level Operations (e.g. Relational Algebra); Database]

DBMS Software
• Data Definition Language
  – Specifies metadata (data types, structure, constraints)
• Data Manipulation Language (e.g. SQL)
  – Insert, update, delete and retrieve data
• Access control
  – Security, Integrity, Concurrency, Recovery

Components of DBMS Environment
• Hardware (servers, clients, storage)
• Software
  – DBMS environment (e.g. Access, Oracle)
  – Application programs
• Data
• Procedures (in the “real world”)
  – Control the design & use of the database
• People

Database Design
• Find an appropriate schema (organization of attributes into tables) based on
  – Needs of the entire organization
  – Efficiency of access
  – Ease of maintenance
  – Logical relationships among the attributes (normalization)

Bad Ideas for Database Design
• Representing the same information multiple times in different tables
• Putting entirely unrelated attributes in the same table
• Confusing views and schemas

3 Phase Database Design
• Conceptual Design
  – Independent of all physical considerations
  – Validate against requirements (cannot be implemented!)
• Logical Design
  – Lower-level model for a particular kind of database (Relational vs. Hierarchical vs. Object-oriented)
• Physical Design
  – Data structures, disk layout, etc.
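
Worked sketch: Dream Home in a single shared database

The following is a minimal sketch, not part of the original slides, of how the Dream Home records could live in one shared schema instead of separate Sales and Contracts files. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS. The table names, column names, and sample rows are hypothetical and chosen only for illustration. The CREATE TABLE statements play the role of the data definition language, the INSERT and SELECT statements the data manipulation language, and the PRAGMA call reads back the metadata the DBMS stores alongside the data.

```python
import sqlite3


def build_dream_home(conn):
    """Define one shared schema (DDL) and load made-up rows (DML)."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE client (
            client_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            phone     TEXT
        );
        CREATE TABLE property (
            property_id INTEGER PRIMARY KEY,
            address     TEXT NOT NULL,  -- stored once, not once per program
            rent        REAL NOT NULL CHECK (rent > 0)
        );
        CREATE TABLE lease (
            lease_id    INTEGER PRIMARY KEY,
            client_id   INTEGER NOT NULL REFERENCES client(client_id),
            property_id INTEGER NOT NULL REFERENCES property(property_id)
        );
        """
    )
    # Hypothetical sample data: four clients, three properties, three leases.
    conn.executemany(
        "INSERT INTO client (client_id, name, phone) VALUES (?, ?, ?)",
        [
            (1, "Client A", "555-0101"),
            (2, "Client B", "555-0102"),
            (3, "Client C", "555-0103"),
            (4, "Client D", "555-0104"),
        ],
    )
    conn.executemany(
        "INSERT INTO property (property_id, address, rent) VALUES (?, ?, ?)",
        [
            (1, "1 Example St", 450.0),
            (2, "2 Example St", 600.0),
            (3, "3 Example St", 375.0),
        ],
    )
    conn.executemany(
        "INSERT INTO lease (lease_id, client_id, property_id) VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 1, 2), (3, 3, 3)],  # client 1 holds two leases
    )
    conn.commit()


def percent_clients_renting(conn):
    """Answer the cross-program query: % of clients that actually rented."""
    row = conn.execute(
        """
        SELECT 100.0 * COUNT(DISTINCT lease.client_id)
                     / COUNT(DISTINCT client.client_id)
        FROM client
        LEFT JOIN lease ON lease.client_id = client.client_id
        """
    ).fetchone()
    return row[0]


def column_names(conn, table):
    """Read a table's field names from the metadata the DBMS keeps."""
    # PRAGMA arguments cannot be bound as parameters, so only pass
    # trusted table names here.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]  # row[1] is the column name


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_dream_home(conn)
    print(percent_clients_renting(conn))
    print(column_names(conn, "property"))
```

Because sales and contract data share the client and property tables, the "% of clients that actually rented" question becomes a single query, and each property address is entered only once.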