Database Design

advertisement
Introduction
Ellen Walker
CPSC 356 Database Design
Hiram College
CPSC 356: Database Design
• Sources of information
– Ellen Walker (walkerel@hiram.edu)
– Web page (http://cs.hiram.edu/~walkerel/cs356)
Syllabus handout (available on the web page)
Database Implementation
Projects
• External clients
• Clients will…
–
–
–
–
Meet with you within the next two weeks
Approve each of 3 major phases
Attend presentations on the last day of class
Help me to evaluate your projects
• You may find your own client, but you
cannot be your own client
Database Implementation
Projects
• You will…
– Each take a role on the project (see overview
document) and switch roles for each phase
– Meet regularly (at least once a week) - in person or
electronically (and keep minutes)
– Submit preliminary reports and final phase deliverables
as specified in the syllabus. (Only final deliverables are
graded).
– Submit confidential evaluations of your group members
with each deliverable.
Project Groups
• Usually, projects are done by groups of 3-4
people
• We have 5 in the class
– Divide into 2 groups (2, 3)
– Work as a single group (5)
Entrepreneurial Mindset
• “A way of thinking and acting to create a new
product, service or activity that satisfies a need
and adds value to one’s self and community” -from the grant proposal
• As consultants, you will apply this mindset
throughout your project
• At points in the course, we will look at relevant
literature and examples of entrepreneurship in the
software industry.
What is a Database?
• “Collection of related data”
• Usually large (too large to fit in computer memory
at once)
• Can be centralized or distributed
• Generally accessed by “query” -- retrieving only
“relevant” parts at once
Where are Databases?
•
•
•
•
•
•
E-commerce (e.g. shopping carts)
Airline & hotel reservation systems
Credit card bureaus
Manufacturers (e.g. parts, tests, defects)
Libraries
ERP systems (e.g. SCT Banner)
Where else are databases?
• The human genome project
• Finance software (e.g. quicken)
• …
Who Interacts with Databases?
• Creating the System
–
–
–
–
System Analyst
Database Designer
Application Programmer
Project Manager
• Once the System Exists
– Database Administrator
– System Administrator
– End User
Advantages of Databases
•
•
•
•
•
•
•
Redundancy Control
Data Consistency and Integrity
Data Sharing and Integration
Security
Improved Maintenance
Concurrency (without data loss)
Backup & Recovery services
Disadvantages…
•
•
•
•
•
•
Complexity
Size
Cost (Hardware & Software)
Performance
Conversion
Risk of failure
File-based Systems
• Since the 1950’s…
• Methods and vocabulary from paper records
• Custom-programmed individual
applications
– Sales (enter data, retrieve properties, contact
clients)
– Contracts (enter data, record leases)
Terminology
• File - collection of records
• Record - set of logically connected data
(one instance)
• Field - element of a record
• Example:
– File = library card catalog
– Record = one card
– Field = author’s name
Dream Home Example
• Sales
– Property record (address, owner, rooms, rent)
– Client record (name, address, phone,property
requirements)
• Contracts
– Lease record (client, property, rent, payment info)
– Property record (address, rent)
– Client record (name, address, phone)
What’s Wrong?
• Each program stores its own data
– No cross-program queries (% of clients that actually
rented?)
– Multiply entered data (property addresses)
• Cost
• Inconsistency
• … in its own format
– Incompatible data files
– Change the format -> change the program!
• … to satisfy a limited set of queries
– Program proliferation!
How to Fix?
• Separate data handling from application
program
• Make data definitions standardized and
external
• Develop reusable query algorithms to be
controlled by external information
Better Definition of Database
• Database: A shared collection of logically
related data and a description of data,
designed to meet the information needs of
the organization.
• Database Management System (DBMS): A
software system that enables users to define,
create, maintain, and control access to the
database.
Data / Program Separation
• In addition to data, database holds metadata (data about data), e.g. field names and
data types.
• DBMS now a single program (for all
databases) that acts on both meta-data and
data
• Provides a form of data abstraction
Organizing Data
• Entity - a distinct object (noun) to represent
• Attribute - property of an entity
• Relationship - association between entities
(usually a verb)
• Example
– Entities: customer, home
– Attributes: name (of customer), address (of home)
– Relationship: customer RENTS home
Database Management System
Query
Update
DBMS
Update
(non-db op)
Query
Transaction High Level
Language
(e.g. SQL)
Low-level
Operations
e.g.
Relational
Algebra
Database
DBMS Software
• Data Definition Language
– Specifies meta-data (data types, structure,
constraints)
• Data Manipulation Language (e.g. SQL)
– Insert, update, delete and retrieve data
• Access control
– Security, Integrity, Concurrency, Recovery
Components of DBMS
Environment
• Hardware (servers, clients, storage)
• Software
– DBMS environment (e.g. Access, Oracle)
– Application programs
• Data
• Procedures (in the “real world”)
– Control the design & use of the database
• People
Database Design
• Find an appropriate schema (organization of
attributes into tables) based on
–
–
–
–
Needs of the entire organization
Efficiency of access
Ease of maintenance
Logical relationships among the attributes
(normalization)
Bad Ideas for Database Design
• Representing the same information multiple
times in different tables
• Putting entirely unrelated attributes in the
same table
• Confusing views and schemas
3 Phase Database Design
• Conceptual Design
– Independent of all physical considerations
– Validate against requirements (cannot be implemented!)
• Logical Design
– Lower-level model for a particular kind of database
(Relational vs. Hierarchical vs. Object-oriented)
• Physical Design
– Data structures, disk layout, etc.
Download