Info 275 Quiz #2 Review
Database Development
Software Realities
- An enormous number of information systems are conceived and implemented
every year
- Often:
o Delivered late or over budget (80%)
o Fail completely or are abandoned (40%)
o Fail to address the needs of users, e.g. training (40%)
o Don’t align with organizational goals (75%)
- Major reasons for failure of software projects include:
o Lack of a complete requirements specification
o Lack of an appropriate development methodology
o Poor decomposition of design into manageable components
- We need a well-defined and logical approach to guide development
- A structured approach to development was proposed, called the Systems Development
Lifecycle (SDLC)
Information Systems
- Resources that enable collection, management, control, and dissemination of
information throughout an organization
- Database is fundamental component of IS, and its development/usage should be
viewed from perspective of the wider requirements of the organization.
Project
- A planned undertaking that has a specified beginning and end and that produces
some definite result
- For our purpose: the result is a new or modified information system and
associated database
- Usually requires a team of experts (data analysts, DBAs)
- The development of databases is a critical component of information systems
projects
System Definition
- Describes scope and boundaries of database system and the major user views
- User view defines what is required of a database system from perspective of:
o A particular job role (such as manager or supervisor) or
o Enterprise application area (such as marketing, personnel or stock
control)
- Database application may have one or more user views (applications, modules,
subsystems)
- Identifying user views helps ensure that no major users of the database are
forgotten when developing requirements for the new system
- User views also help in development of complex database system allowing
requirements to be broken down into manageable pieces
Representation of a Database System with Multiple User Views
Example: Course Admin
- Assume it's many years ago, and X is considering building a new information
system (that ends up looking like Banner) for student registration in course
sections.
Key Functions
Create / Update Students
Create / Update Courses
Create / Update Sections
Create / Update Buildings and Classrooms
Create / Update Faculties and Departments
Create / Update Professors
Create / Update Timeblocks
Assign Sections to Timeblocks
Assign Sections to Classroom(s)
Assign Sections to Professors
Enroll Student in Section
Drop Student From Section
Provide Section Override to Student
Generate Timetable
Assign Student Grades
Requirements Collection and Analysis
- Process of collecting and analyzing information about the part of organization to
be supported by the database system, and using this information to identify
users’ requirements of new system
- Requirements analysis is about understanding user requirements in detail, in terms of:
• Functions and Events (transactions)
• Things (?)
Defining System Requirements
- Requirement:
• Create a means to transport a single individual from home to place of
work
• Management Interpretation
• IT Interpretation
• User Interpretation
Requirements Collection and Analysis
- Information is gathered for each major user view including:
• A description of data used or generated
• Details of how data is to be generated/used
• Any additional requirements for new database systems
- Information is analyzed to identify requirements to be included in the new database
system. These are described in the requirements specification
- As a result of requirements analysis, we develop a series of data models (conceptual,
logical and physical)
- Another important activity is deciding how to manage the requirements for a
database system with multiple user views.
- Three main approaches:
• centralized approach;
• view integration approach;
• combination of both approaches.
Centralized approach
- Requirements for each user view are merged into a single set of requirements.
- A data model is created representing all user views during the database design
stage.
View integration approach
• Requirements for each user view remain as separate lists.
• Data models representing each user view are created and then merged
later during the database design stage.
• Data model representing single user view (or a subset of all user views) is
called a local data model.
• Local data models are then merged at a later stage during database
design to produce a global data model
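The merge step in the view integration approach can be sketched in a few lines. This is only an illustration: the two user views, their entity names and their attribute names are hypothetical, and a real merge also has to resolve naming conflicts (synonyms and homonyms), which the sketch ignores.

```python
# Hypothetical local data models: entity name -> set of attribute names.
registrar_view = {
    "Student": {"studentNo", "name"},
    "Section": {"sectionNo", "courseNo"},
}
faculty_view = {
    "Section": {"sectionNo", "professorNo"},
    "Professor": {"professorNo", "name"},
}


def merge_local_models(*local_models):
    """Union the entities and attributes of each local model into one global model."""
    global_model = {}
    for model in local_models:
        for entity, attributes in model.items():
            # Entities that appear in several views get the union of their attributes.
            global_model.setdefault(entity, set()).update(attributes)
    return global_model


global_model = merge_local_models(registrar_view, faculty_view)
```

Section appears in both local models, so the global model carries the attributes from both views.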
Database Design
- Process of creating a design for a database that will support the enterprise’s
mission statement and mission objectives for the required database system
- Major deliverable: Data model
- Main purposes of data modeling include:
• To assist in understanding the meaning (Semantics) of the data
• To facilitate communication about the information requirements
- Building data model requires answering questions about entities, relationships,
and attributes
- A data model ensures we understand:
o Each user’s perspective of the data;
o Nature of the data itself, independent of its physical representations;
o Use of data across user views.
- Three phases of database design:
o Conceptual database design
o Logical database design
o Physical database design.
Conceptual Database Design
- Process of constructing a model of the data used in an enterprise, independent of
all physical considerations
- Data model is built using the information in the users' requirements specification
- Conceptual data model is source of information for logical design phase
Logical Database Design
- Process of constructing a model of the data used in an enterprise based on a
specific data model (relational) but independent of a particular DBMS and other
physical considerations
- Conceptual data model is refined and mapped on to a logical data model
Physical Database Design
- Process of producing a description of the implementation of the database on
secondary storage
- Describes base relations, file organizations and indexes used to achieve efficient
access to data. Also describes any associated integrity constraints and security
measures
- Tailored to a specific DBMS
Three-Level ANSI-SPARC Architecture and Phases of Database Design
Application Design
- Design of user interface and application programs that use and process the
database
- Database design and application design are parallel activities and are done in
tandem, often by the same team
- Includes two important activities:
o User interface design
o Transaction design
User Interface Design
Application Design- Transactions
- Transaction: An action, or series of actions, carried out by a single user or
application program, which accesses or changes content of the database
- Important characteristics of transactions:
o Data to be used by the transaction
o Functional characteristics of the transaction
o Output of the transaction
o Importance to the users
o Expected rate of usage
- Three main types of transactions:
o Retrieval
o Update
o Mixed
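The three transaction types can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The section table, its columns and the function names are assumptions made up for the Course Admin example, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section (section_no INTEGER PRIMARY KEY, capacity INTEGER, enrolled INTEGER)"
)
conn.execute("INSERT INTO section VALUES (1, 2, 0)")


def seats_left(section_no):
    """Retrieval transaction: reads data and changes nothing."""
    row = conn.execute(
        "SELECT capacity - enrolled FROM section WHERE section_no = ?", (section_no,)
    ).fetchone()
    return row[0]


def set_capacity(section_no, capacity):
    """Update transaction: changes data without returning any."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE section SET capacity = ? WHERE section_no = ?", (capacity, section_no)
        )


def enroll(section_no):
    """Mixed transaction: retrieves the free seats, then updates if one is available."""
    with conn:
        if seats_left(section_no) <= 0:
            return False
        conn.execute(
            "UPDATE section SET enrolled = enrolled + 1 WHERE section_no = ?", (section_no,)
        )
        return True
```

Calling enroll(1) three times on the two-seat section succeeds twice and then returns False.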
Implementation
- Physical realization of the database and application designs
o Use DDL to create database schemas and empty database files.
o Use DDL to create any specified user views
o Use programming language to create the application programs. This will
include the database transactions implemented using DML, possibly
embedded in a host programming language.
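A minimal sketch of these implementation tasks, assuming SQLite as the DBMS and Python as the host language. The table, view and column names are hypothetical, and the user view is realized here as an SQL view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the schema (empty base tables) and one user view.
conn.executescript("""
CREATE TABLE student (
    student_no INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE enrollment (
    student_no INTEGER NOT NULL REFERENCES student(student_no),
    section_no INTEGER NOT NULL,
    PRIMARY KEY (student_no, section_no)
);
CREATE VIEW class_list AS
    SELECT e.section_no, s.name
    FROM enrollment e JOIN student s ON s.student_no = e.student_no;
""")

# DML embedded in the host language: an "Enroll Student in Section" transaction.
with conn:
    conn.execute("INSERT INTO student VALUES (?, ?)", (100, "Ada"))
    conn.execute("INSERT INTO enrollment VALUES (?, ?)", (100, 1))

# DML: retrieval through the user view.
class_list = conn.execute("SELECT section_no, name FROM class_list").fetchall()
```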
Data Conversion and Loading
- Transferring any existing data into new database and converting any existing
applications to run on new database.
- Only required when new database system is replacing an old system
o DBMS normally has utility that loads existing files into new database
- May be possible to convert and use application programs from old systems for
use by new system
Testing
- Process of running the database system with intent of finding errors
- Use carefully planned test strategies and realistic data
- Demonstrates that database and application programs appear to be working
according to requirements
Operational Maintenance
- Process of monitoring and maintaining database system following installation
- Monitoring performance of system
o If performance falls, may require tuning or reorganization of the database
- Maintaining and upgrading database application (when required)
- Incorporating new requirements into database applications.
CASE Tools
- Automated tools that assist with requirements, design and implementation tasks
- Support provided by CASE tools includes:
o data dictionary to store information about database system’s data;
o design tools to support data analysis;
o tools to permit development of corporate data model, and conceptual
and logical data models;
o tools to enable prototyping of applications.
- Provide the following benefits:
o Standards/Consistency
o Automation/Higher Productivity
o Higher Quality Design/Fewer Defects
Concepts of the ER Model:
Entity Types
o Group of objects with same properties, identified by enterprise as having
independent existence (a table).
Entity Occurrence
o A unique object of an entity type (a row).
Entity Type Examples:
o Tangible Things
o Roles played by people
o Organization units
o Sites/Locations
o Incidents, Events, Transactions
Relationship Types
o Set of meaningful associations among entity types.
Relationship Occurrence
o Unique association, which includes one occurrence from each
participating entity type.
Ternary relationship:
o Relationship of degree three, i.e. one in which three entity types participate;
often implemented as a new table. Ex: Staff registers Client at Branch.
Recursive relationship:
o Relationship where the same entity type participates more than once in
different roles.
o Relationship may be given role names to indicate purpose that each
participating entity type plays in a relationship.
Attributes
o Property of an entity or relationship type.
Attribute domain:
o Set of allowable values for one or more attributes
Simple Attribute:
o Attribute composed of a single component with an independent
existence.
Composite Attribute:
o Attribute composed of multiple components, each with an independent
existence.
Single Valued Attribute:
o Attribute that holds a single value for each occurrence of an entity type.
Multi-valued attribute:
o Attribute that holds multiple values for each occurrence of an entity type.
o We need special rules for dealing with these.
Derived attribute:
o Attribute that represents a value that is derivable from value of a related
attribute or set of attributes, not necessarily in the same entity type.
Example: Age from Date of Birth.
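A short sketch of the Age example: only the date of birth is stored, and age is computed when needed. The function name and its parameters are illustrative assumptions.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive age in whole years from a stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the current year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Because the value is derived, it never goes stale the way a stored age column would.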
Strong Entity:
o Entity that is not existence-dependent on some other entity type.
Weak Entity:
o Entity that is existence-dependent on some other entity.
Strong/Weak Example: A CLIENT has a PREFERENCE. A PREFERENCE cannot exist
without a CLIENT.
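One common way to enforce this kind of existence dependency in a relational DBMS is a foreign key with cascading delete. The sketch below assumes SQLite, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE client (
    client_no INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Weak entity: its key includes the owning client's key.
CREATE TABLE preference (
    client_no INTEGER NOT NULL REFERENCES client(client_no) ON DELETE CASCADE,
    pref_type TEXT NOT NULL,
    PRIMARY KEY (client_no, pref_type)
);
""")

conn.execute("INSERT INTO client VALUES (1, 'Pat')")
conn.execute("INSERT INTO preference VALUES (1, 'Flat')")

# A preference for a client that does not exist is rejected.
try:
    conn.execute("INSERT INTO preference VALUES (99, 'House')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the client removes its dependent preferences.
conn.execute("DELETE FROM client WHERE client_no = 1")
remaining = conn.execute("SELECT COUNT(*) FROM preference").fetchone()[0]
```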
Structural Constraints:
o Multiplicity: number or range of possible occurrences of an entity type
that may relate to a single occurrence of an associated entity through a
particular relationship.
▪ Represents policies or business rules established by user or
company.
o The most common degree for relationships is binary. Binary relationships
are referred to as:
▪ One-to-one (1:1)
▪ One-to-many (1:*)
▪ Many-to-many (*:*)
o Multiplicity is made up of two types of restrictions on relationships:
▪ Cardinality: describes maximum number of possible relationship
occurrences for an entity participating in a given relationship type.
▪ Participation: determines whether all or only some entity occurrences
participate in a relationship.
Problems with ER Models:
o Fan Trap:
▪ Where a model represents a relationship between entity types,
but the pathway between certain entity occurrences is unclear.
o Chasm Trap:
▪ Where a model suggests the existence of a relationship between
entity types, but the pathway does not exist between certain
entity occurrences.
Normalization
Purpose of Normalization:
Major aim of relational database design is to group attributes into relations to minimize
data redundancy.
Normalization is a technique for producing a set of suitable relations that support the
requirements of a database.
Characteristics of a suitable set of relations include:
- The minimal number of attributes necessary to support the data requirements of the
enterprise.
- Only attributes with a close logical relationship are found in the same table.
- Minimal redundancy, with each attribute represented only once, with the
important exception of attributes that form all or part of foreign keys.
The benefits of using a database that has a suitable set of relations are that the
database:
- is easier for the user to access and maintain
- takes up minimal storage space on the computer
- has fewer potential issues related to data integrity
- offers better performance for operations such as insert
Functional Dependency:
- Describes relationship between attributes.
- Goals of functional dependency analysis:
o Ensure each relation contains information about a specific thing and each
attribute serves to describe that thing
o Ensure that relation contains only attributes with full functional
dependency on the primary key.
- Characteristics of Functional Dependency:
o There is a 1:1 relationship between the attributes on the left-hand side
(determinant) and those on the right-hand side.
o Holds for ALL time.
o The determinant has the minimal number of attributes necessary to
maintain the dependency with the attributes on the right side.
- Determinants should have the minimum number of attributes necessary to maintain
the functional dependency with the attributes on the right side; this is called
full functional dependency.
- Transitive Dependency describes a condition where A, B and C are attributes of a
relation such that if A → B and B → C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
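A small sketch of how a candidate functional dependency can be checked against sample rows. The helper name and the staff data are hypothetical. Since a dependency must hold for all time, sample data can only refute it, never prove it.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by the given rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # The same left-hand value must always map to the same right-hand value.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


# Hypothetical sample rows.
staff = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "1 Main St"},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "1 Main St"},
    {"staffNo": "S3", "branchNo": "B2", "branchAddress": "9 Oak Ave"},
]
```

In this data staffNo → branchNo and branchNo → branchAddress both hold, so branchAddress is transitively dependent on staffNo via branchNo. branchNo → staffNo is violated, because branch B1 has two staff members.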
Process of Normalization:
- Formal technique for analyzing a relation based on its PK and the functional
dependencies between the attributes of that relation.
- Executed as a series of steps.
o UNF (Unnormalized Form)
▪ A table that contains one or more repeating groups, i.e. multiple
values in a single column.
o 1NF (First Normal Form)
▪ A relation in which the intersection of each row and column
contains one and only one value.
▪ Attained by identifying repeating groups and either flattening the table
(filling empty cells) or creating a new table to hold the multi-valued
attributes.
o 2NF (Second normal form)
▪ Based on the concept of full functional dependency
▪ Only applies to relations with composite keys
▪ A relation that is in 1NF and in which every non-PK attribute is fully
functionally dependent on the PK.
▪ 1NF → 2NF
o Identify the PK for the 1NF table
o Identify functional dependencies
o If partial dependencies exist on the PK, remove them
by placing them in a new table
o 3NF (Third normal form)
▪ Based on the concept of transitive dependency
▪ A relation in 1NF and 2NF in which no non-PK attribute is
transitively dependent on the PK.
▪ 2NF → 3NF
o Identify the PK in the 2NF relation.
o Identify functional dependencies.
o If transitive dependencies exist on the PK, remove them by placing
them in a new table with a copy of the determinant, which
becomes the PK of the new table.
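The normalization steps above can be traced on a small, hypothetical enrollment table. The data, attribute names and the project helper are assumptions for illustration. The assumed dependencies are studentNo → studentName, courseNo → title and profNo, profNo → profName, and (studentNo, courseNo) → grade.

```python
# UNF: each student row carries a repeating group of courses.
unf = [
    {"studentNo": 1, "studentName": "Ada", "courses": [
        {"courseNo": "DB1", "title": "Databases", "profNo": "P1", "profName": "Lee", "grade": "A"},
        {"courseNo": "AL1", "title": "Algorithms", "profNo": "P2", "profName": "Kim", "grade": "B"},
    ]},
    {"studentNo": 2, "studentName": "Bob", "courses": [
        {"courseNo": "DB1", "title": "Databases", "profNo": "P1", "profName": "Lee", "grade": "C"},
    ]},
]


def project(rows, attributes):
    """Project rows onto the given attributes, dropping duplicate rows (order preserved)."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# UNF -> 1NF: flatten the repeating group so every cell holds exactly one value.
first_nf = [
    {"studentNo": s["studentNo"], "studentName": s["studentName"], **c}
    for s in unf
    for c in s["courses"]
]

# 1NF -> 2NF: the PK is (studentNo, courseNo); attributes that depend on only
# part of the PK move to their own tables.
student = project(first_nf, ["studentNo", "studentName"])
course_2nf = project(first_nf, ["courseNo", "title", "profNo", "profName"])
enrollment = project(first_nf, ["studentNo", "courseNo", "grade"])

# 2NF -> 3NF: courseNo -> profNo -> profName is transitive, so profName moves to
# a new table keyed by its determinant, profNo.
course = project(course_2nf, ["courseNo", "title", "profNo"])
professor = project(course_2nf, ["profNo", "profName"])
```

After the last step each fact is stored once: a professor's name appears only in professor, however many courses that professor teaches.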
Conceptual Database Design
Design Methodology
- A structured approach that uses procedures, techniques, tools and
documentation aids to support and facilitate the process of design
3 Main phases:
- Conceptual database design: design with no technology/implementation
assumptions
- Logical database design: design for a specific data model (e.g. relational)
- Physical database design: design for a specific DBMS (e.g. Oracle vs. SQL Server)
Critical Success Factors in Database Design
• Work interactively with the users as much as possible.
• Follow a structured methodology throughout the data modelling process.
• Use diagrams to represent as much of the data models as possible.
• Build a data dictionary to supplement the data model diagrams.
• Be willing to repeat steps.
Conceptual Database Design Steps
• Step 1: Identify entity types
• Step 2: Identify relationship types
• Step 3: Identify and associate attributes with entity or relationship types
• Step 4: Determine attribute domains
• Step 5: Determine candidate, primary, and alternate key attributes
• Step 6: Consider use of enhanced modeling concepts (optional step)
• Step 7: Check model for redundancy
• Step 8: Validate conceptual model against user transactions
• Step 9: Review conceptual data model with user
Build Conceptual Data Model
• Goal: To build a conceptual data model of the data requirements of the
enterprise.
• Model comprises entity types, relationship types, attributes and attribute
domains, primary and alternate keys, and integrity constraints.
• Documented in the form of an Entity-Relationship model and associated
documentation.
• ER Models document:
• Entities
• Attributes
• Relationships
Step 1: Identify Entity Types
• Goal is to identify all of the ‘things’ that users need in the computer system.
• Dream Homes:
• PropertyForRent, PrivateOwner, Business Owner, Client, Branch, Staff,
Lease, Preference
• Document in a data dictionary: a document where we describe entities,
attributes and relationships textually as well as on ER Diagram.
Step 2: Identify Relationship
• Relationships: naturally occurring associations between entities
• Multiplicity: number (or range) of possible occurrences of an entity type that
may relate to a single occurrence of an associated entity type through a
particular relationship.
• Document relationships in the data dictionary and on an Entity Relationship
Diagram
Step 3: Define each entity’s attributes
• Identify each individual piece of information associated with each entity.
• Only capture attributes that are required by the application we are building.
• There are multiple types:
• Simple
• Composite
• Derived
• Should we store staff age?
Step 4: Identify Attribute Domains
• Basic Domains:
• Character
• Numeric
• Dates
• We can also specify specific ranges or even specific values
• Province: NS, NF, NB ……
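Attribute domains are decided at the conceptual stage, but a sketch of how they might later be enforced makes the idea concrete. The example assumes SQLite CHECK constraints. The branch table, the staff_count range and the try_insert helper are hypothetical, and the province list uses only the three values the notes spell out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE branch (
    branch_no   TEXT PRIMARY KEY,                                    -- character domain
    province    TEXT NOT NULL CHECK (province IN ('NS', 'NF', 'NB')), -- specific values
    staff_count INTEGER NOT NULL CHECK (staff_count BETWEEN 0 AND 500) -- numeric range
)
""")


def try_insert(row):
    """Return True if the row satisfies every domain constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO branch VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

A row such as ("B1", "NS", 12) is accepted, while an unknown province code or a negative staff count is rejected by the DBMS.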
Step 5: Identify Keys
• Process of defining how each entity will be uniquely identified.
• Candidates
• Primary
• Alternate
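As a rough aid, candidate keys can be suggested by looking for minimal attribute sets whose values are unique in sample data. The staff rows and attribute names below are hypothetical. Real keys come from the meaning of the data, so sample rows can only suggest or rule out candidates.

```python
from itertools import combinations


def is_unique(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    values = [tuple(row[a] for a in attributes) for row in rows]
    return len(values) == len(set(values))


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets whose values are unique across the rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if is_unique(rows, combo) and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys


# Hypothetical sample rows: two staff members share a name.
staff = [
    {"staffNo": "S1", "email": "ann1@example.com", "name": "Ann"},
    {"staffNo": "S2", "email": "ann2@example.com", "name": "Ann"},
    {"staffNo": "S3", "email": "raj@example.com", "name": "Raj"},
]

keys = candidate_keys(staff, ["staffNo", "email", "name"])
```

Here staffNo and email are both candidates; choosing staffNo as the primary key would leave email as an alternate key.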
Step 6: Apply Advanced modeling techniques
• Covers situations such as superclass/sub-class
• Owner:
• Private_owner
• Business_owner
• Required when different subclasses exist and have substantially different
attributes
Step 7: Check the model for redundancy
• This step is to ensure that there are no unnecessary, redundant / duplicated
relationships in the design.
• Examine 1:1 relationships
• Remove duplicates
• Consider time dimension
Step 8: Validate the model vs. user transactions
• This step is taken to ensure that the model meets the users' functional
requirements for the database.
• Correct entities?
• Correct attributes?
• Correct relationships?
Validate model vs. Transactions (requirements)
• Initial version of the model is developed based on requirements analysis
• From application design, we get a list of required transactions
• We then ‘map’ the transactions to our model to ensure we have the right
entities, attributes and relationships in the model.
• If we do not, we modify the model
• Sample query transactions:
• a) Generate a list of staff supervised by each supervisor
• b) Generate a list of staff alphabetically
• c) List the properties and owners sorted by branch
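Transactions (a) and (b) can be mapped onto a model as a quick check. The sketch assumes SQLite and a hypothetical staff table in which the recursive "supervises" relationship is a supervisor_no column; the names and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (
    staff_no      TEXT PRIMARY KEY,
    name          TEXT NOT NULL,
    supervisor_no TEXT REFERENCES staff(staff_no)  -- recursive relationship
);
INSERT INTO staff VALUES ('S1', 'Mary', NULL);
INSERT INTO staff VALUES ('S2', 'David', 'S1');
INSERT INTO staff VALUES ('S3', 'Ann', 'S1');
""")

# a) Staff supervised by each supervisor: only answerable because the model
#    records who supervises whom.
supervised = conn.execute("""
    SELECT sup.name, s.name
    FROM staff s JOIN staff sup ON sup.staff_no = s.supervisor_no
    ORDER BY sup.name, s.name
""").fetchall()

# b) Staff listed alphabetically: needs only the name attribute.
alphabetical = [row[0] for row in conn.execute("SELECT name FROM staff ORDER BY name")]
```

If the model had no supervises relationship, query (a) could not be written, which is the kind of gap this validation step is meant to expose. Transaction (c) would additionally need branch, property and owner entities and the relationships between them.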
Step 9: Review the model with users
• Walk through the model with users to ensure that we have reflected their
requirements in the database
• First step: train users on how to read an ERD!