outline - University Directory: Western Illinois University

advertisement
School of Computer Sciences
IS 342
Western Illinois University
AMARAVADI
FINAL REVIEW – OUTLINE OF TOPICS
Database Preliminaries (Database introduction)
- FYI Data vs Information: concepts of entities, attributes, primary key, files, file processing
problems, usage of data/information.
- Definition of DBMS vs database; file/table, records, fields, schema, primary and secondary keys.
- Operational and advanced uses of databases.
- Organizational importance of databases (engine for all IS applications, starting point in the
transformation of data to information, decision support, strategic resource).
- Database objectives: to make data available, to ensure data quality and integrity, centralized control
of data, data independence.
- DBMS activities: define schema/structure (data definition), enter data, modify data, query data
(SQL), get reports – query-by example (QBE).
- The database development cycle (planning, requirements definition, logical and physical design,
implementation, maintenance).
- The database concept/approach.
- Entities, entity classes and attributes.
- Ignore the student administration application.
Database Fundamentals (Database evolution and environment)
- Evolution of DBMS: intro of business computers, use in TP, problems, academic formulations,
standardization of DBMS concepts.
- The file processing era and problems caused: uncontrolled redundancy, program-data dependence,
program maintenance, poor data quality, inability to get reports, application backlog.
- The DBMS approach : conceptualization of data, organized design, centralized management, system
controlled access, checks on data quality, query and reporting.
- Rules for getting information from multiple tables
- Database terminology: schema/structure/definition, file/table/relation, record, attribute/field.
- Database architecture: the three-schema architecture (external, conceptual and internal)
- Different classes of users in a DBMS (end users, administrators and developers).
- DBMS components: Data definition, SQL, Programming language interface, Data dictionary,
Screen/Report generation, Application generation, Export/Import.
- Components of a database environment: Data dictionary, database client, SQL server, directory server
Enterprise Applications programs, database.
Database Planning
- The database development cycle (planning, requirements definition, logical and physical design,
implementation, maintenance).
- About database planning: areas of organization, database size, # of analysts, HW/SW requirements.
- Without adequate planning: org. changes impact development, schedule changes, rework, inaccurate
estimation of resources (ignore).
- Notion of a business process: a group of related activities
- Functions, processes and activities
- Activity data relationship: process/activity can create/update/Delete data.
- The Enterprise Data Model (EDM): you need to be able to draw an EDM.
- Pre-requisites of planning: management commitment, project team.
- Planning metamodel: Study the organization from the top down, identify functions, processes and
activities, identify which functions impact CSFs, study the applications and the eclasses they handle.
- Steps in planning : identify background, information used, develop planning matrices, Enterprise model,
define scope of project; planning matrices (function vs CSF, process vs eclass, application vs eclass)
- Planning objectives: scope, overview of info requirements, critical processes & information classes,
-
resource requirements.
Elements of a database plan (ignore).
Outcomes of planning: planning matrices, preliminary data model, team assignments, application
requirements, project priorities.
Planning pitfalls: insufficient management support, strategic plan not available, direction not known and
lack of co-ordination.
Database Analysis
- Information collected during RD: data structure and rule/constraint information.
- Constraints: Integrity constraints and performance constraints.
- Steps in RD (FYI): define scope, select methodology, identify user views, develop data model, crosscheck, specify constraints.
- Top-down vs bottom-up approaches: Conceptual Data Model vs Enterprise Data Model;
- ER Model (FYI)
- Basic modeling concepts: Entity classes, attributes (multi-valued, derived) and relationships (degree,
cardinality, class/subclass, recursive).
- Basic symbols of ER: Entity class, relationship, primary-key, attribute, multi-valued attribute, Gerunds,
relationships of different degrees (unary/recursive, binary, ternary), relationship cardinalities, classsubclass (sub-types, super-types), exclusive and exhaustive, non-exclusive and non-exhaustive;
minimum-maximum cardinalities.
- ER drawing guidelines: identify eclasses, identify relationships among eclasses, use rectangles for
eclasses and diamonds for relationships, depict the cardinalities, use ovals for attributes, label attributes,
underline the pkey.
- ER additional guidelines (FYI): build model around central eclass, group attributes with entity classes,
show only relevant attributes, do not show computed attributes or values, underline pkey, label all
relationships, refine.
- User views: reports, receipts, forms, memos and screen displays.
- Constraints: domain, integrity, business and performance – you need to be able to define and give
examples of each.
Logical Database Design - I
- Ill-structured, well-structured tables, anomalies
- Design concepts: insertion, update/modification and deletion anomalies.
- Design objectives: efficient form, no redundancies, queries/reporting facilitated, database implementable.
- Design from ER using thumb rules, for 1:1, 1:M and M:N type relationships.
- Use normalization theory if: complex data relationships, no planning/ER, maintenance
- More design concepts: FD concept, FD diagrams, determinant, full F.D. rule
- Functional dependency rules: reflexive, augmentation, union, decomposition, transitivity & substitution
rules.
- Concepts in design: Primary key, candidate key, composite key, non-key, foreign key/cross-reference
key (when do we need these?)
Logical Database Design -II
- The normalization process:
1. Remove repeating groups  1st NF
2. Remove partial dependencies  2nd NF
3. Remove transitive dependencies  3rd NF
4. Remove multivalued dependencies  4th NF
-
More design concepts: repeating groups, partial dependency, transitive dependency and multi-valued
functional dependency.
-
The normalization process with the FD approach:
1. Draw an FD chart
2. Group FDs together based on common determinants
3. Place each FD in a separate table
4. Add foreign keys if necessary.
-
The normal forms: 1st, 2nd, 3rd, 4th
SQl: A Standard for Database Processing
- Codd’s rules for RDBMS: information representation, guaranteed access, treatment of nulls, dynamic
on-line catalog, comprehensive data sub-language, view updating, high-level insert, update and delete,
physical, logical, integrity and distribution independence, non-subversion.
- Types of SQL: DDL, DML, SQL/T, SQL/AU (DCL), SQL/I
- DDL: Create/alter/drop table, Create/drop index, create/drop view. [need to be able to write DDL &
DML statements]
- DML: Insert, update, delete, select.
- Types of Functions (and examples): Logical (=,>,<,<=,>=, <>,!=, Between, NOT, LIKE, %, IN)
Arithmetic (ABS, ROUND, TRUNC, COUNT, AVG, SUM, MAX, MIN), String (Length, Substr,
Lower, Upper) and Date (Add_months, Months_between, Next_Day, To_date).
- Select statements using functions, expressions, logical operators, aggregates, group by and having.
Physical Database Design (only till types of file organizations)
- Physical Database Design: The process of mapping logical structures to internal structures.
- Components of physical design (tables, storage and distribution, file organization, specification of
integrity constraints).
- Denormalization: de-normalization examples (1:1 and 1:M).
- Storage strategies: volume and usage analysis, usage maps, estimating storage requirements
- Data distribution strategies: centralized vs distributed (replicated, partitioned – vertical and horizontal),
General principle for data distribution.
- File organization criteria: retrieval time, access type, storage space, maintenance effort.
- Types of file organizations: sequential, indexed-sequential, relative and hashed.
- Relative and hashed organizations: IRG, IBG, buckets and slots, hashing algorithm, hash addresses,
collisions/overflows (open overflow and chained overflow), load factor, search length.
- Evaluation of hashing: hash sequence, extra space, low file activity ratio, real-time and OO.
- Indexing: primary, secondary/non-key, clustering.
- The indexed organization: cylinder, track, sequence set, overflow tracks; insertions and deletions
- Indexing strategies: index on pkey, on foreign keys and on secondary keys depending on query
frequency.
- Advantages and disadvantages of ISAM: sequential access, # of indexes, uniform retrieval time,
workhorse organization.
Download