MIS 4340 Class Introduction

G. Green
Foundations of Database Systems
Class Introduction
Agenda
• Introductions
• Course Overview
• Syllabus
• Case
• Database Development Overview
Foundations of Database Systems
• Objectives
 › Understand data-related activities of SDLC
 › Implement data modeling, database design, and database implementation techniques
   • CASE (Visio)
   • Database (SQL Server)
• Course Contents
 › Lectures, Examples, In-Class Exercises
 › Individual Assignments (3)
 › Team Project* (3 parts)
 › Quizzes (3)
 › Exams (2)
*Can request teammates; see syllabus for Team Preferences deadline
Research
• Service Learning & Kolb’s Learning Cycle
• Motivators for Choosing MIS Major
• International and US
• Periodic Assessments
 › Some are NOT graded; others are
Learning
• Prepare:
 › Prepare (read & reread book, notes) for each class
 › Do book, in-class, and extra problems on your own
 › Come see me during office hours for help
• Participate:
 › Attend, listen, be attentive (no internet distractions), be engaged
 › Ask and answer questions, & add to discussion
 › Do each assignment completely & in a timely and professional manner
• Take PLENTY of notes in class:
 › Do NOT rely on PowerPoint alone
• Explore:
 › Go beyond classroom material to research topics
Class Resources
• Syllabus/Schedule, Grades, Attendance:
 › http://canvas.baylor.edu
 › Schedule contains links to all lecture slides, study guides, assignments and project write-ups
• Other Resources:
 › http://blogs.baylor.edu/gina_green/mis-4340-resources/
 › Copies of in-class handouts, team resources, database tables, …
Syllabus…
Introduction to Databases
Chapter 1
Topics
• Chapter 1
 › The Database Environment
 › Database Development Process
• Chapter 9 (Pages 409 – 410)
 › Big Data
• Chapter 10 (Pages 444 – 445, 446 – 447)
 › Master Data Management
 › Data Federation
• Chapter 11 (Pages 464 – 472, 486, 499 – 506)
 › Database Personnel
 › Metadata Management (e.g., Data Dictionaries)
 › Backup Facilities
 › Overview of Tuning the Database for Performance
Evolution of Database Technologies
[Figure: timeline from the 1960’s through 2000+ showing Traditional Files, Hierarchical, Network, Relational, Object, Object-Relational, MDDB, XML, Federated, NoSQL, …]
Figure 1-3 Old file processing systems: Example
Duplicate Data
Traditional File Processing Environment
• Disadvantages:
 › Program-data dependence = “structural” & “data”
 › Limited data sharing = “islands of automation”
 › Duplication of data = “redundancy”
 › Lengthy development times
 › Excessive program maintenance
The Database Environment
Advantages of Databases
• Program-data independence
• Improved data sharing
• Minimal data redundancy
• Improved data accessibility/responsiveness
• Improved data consistency
• Faster application development
• Enforcement of standards
• Improved data quality
• Reduced program maintenance
Database Development Process
Chapter 1
SDLC for this class
Systems Development Life Cycle phase: DB Activities in SDLC
• Planning: Enterprise Modeling*
• Analysis: DB Scope, Requirements (Conceptual Data Model)
• Design: DB Design (Logical DB Design); DB Design (Physical DB Design)
• Implementation: DB Implementation (Load, Test, Eval, Op)
• DB Maintenance*
Enterprise Data Modeling
• Determine organizational data requirements
• Build enterprise data model
 › outcome is a very high-level Entity-Relationship Diagram
 › see:
   • http://da.ks.gov/kito/ITPlans/data_maps06.ppt
   • http://www.tdan.com/view-articles/5205
Source: http://www.tdan.com/view-articles/5205
Conceptual Data Modeling
• Determine user data requirements
• Determine business rules
• Build conceptual data model
 › outcome is an Entity-Relationship Diagram (conceptual schema)
Logical Database Design
• Select database model
 › e.g., the Relational Model
• Transform conceptual (ERD) into logical (relational) data model
• Normalize and link data structures
 › Outcome is normalized, linked relational tables
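
As a minimal sketch of what "normalized, linked relational tables" can look like, the Python snippet below uses the standard-library sqlite3 module as a stand-in for SQL Server. The customer and orders tables, their columns, and the helper names are hypothetical and are not taken from the course case. Customer data is stored once, and each order links back to it through a foreign key.

```python
import sqlite3

def build_schema(conn):
    """Create two hypothetical tables linked by a foreign key."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            order_date  TEXT NOT NULL,
            -- The link: every order must point at an existing customer.
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
        );
    """)

def try_insert_order(conn, order_id, order_date, customer_id):
    """Return True if the order was stored, False if the link was invalid."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (order_id, order_date, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    print(try_insert_order(conn, 10, "2024-01-15", 1))   # existing customer
    print(try_insert_order(conn, 11, "2024-01-16", 99))  # no such customer
```

Because the customer name lives in one table only, changing it requires a single update, which is the redundancy reduction that normalization aims for.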
Physical Database Design
• Select database product (e.g., SQL Server)
• Select storage device(s)
• Design fields, records, files (physical schema)
 › outcomes are detailed, physical definitions for:
   • fields (data dictionary)
   • records (space requirements for physical structures)*
   • files (access methods)
*Will not do in this class
Database Implementation
• Create database file/table structures
• Load test data
• Create views (external schema)
• Establish access rights
• Write/test programs that process data
• Install database (with production data) into production operations
 › outcomes are secured database tables loaded with data
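
The first implementation steps above can be sketched in a few lines. This is an illustration only, using Python's standard-library sqlite3 module rather than SQL Server; the employee table, its rows, and the employee_directory view are hypothetical. SQLite has no user accounts, so the "establish access rights" step is not shown; in a server DBMS such as SQL Server that step would typically use GRANT statements.

```python
import sqlite3

def implement_database(conn):
    """Create a table structure, load test data, and define a view."""
    # Step 1: create the table structure.
    conn.execute("""
        CREATE TABLE employee (
            emp_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            dept   TEXT NOT NULL,
            salary REAL NOT NULL
        )
    """)
    # Step 2: load (hypothetical) test data.
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?, ?)",
        [
            (1, "Ann", "MIS", 70000.0),
            (2, "Bob", "ACC", 65000.0),
            (3, "Cy", "MIS", 72000.0),
        ],
    )
    # Step 3: create a view (external schema) that hides the salary column.
    conn.execute("""
        CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee
    """)
    conn.commit()

def directory_for(conn, dept):
    """A small program that processes data through the view."""
    rows = conn.execute(
        "SELECT name FROM employee_directory WHERE dept = ? ORDER BY name",
        (dept,),
    )
    return [row[0] for row in rows]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    implement_database(conn)
    print(directory_for(conn, "MIS"))
```

Programs that read only through the view never see salary data, which is one simple way a view supports the "secured database tables" outcome.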
Database Maintenance
• Maintain database structures
• Performance, tuning
 › Storage/space management
 › I/O Contention
 › CPU Usage
 › Application Tuning
• Data availability
• DBMS upgrades, "fixes"
• Backup, recovery …….
Database Maintenance, cont…
• Backup
 › Full
 › Incremental
 › Differential
• Business Continuity
 › Data Replication ("fallback")
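
The difference between the three backup types comes down to which changes each one captures. The sketch below is a simplified, product-independent illustration in Python: a full backup copies everything, a differential backup copies what changed since the last full backup, and an incremental backup copies what changed since the most recent backup of any type. The function name, the item names, and the integer timestamps are hypothetical.

```python
def select_for_backup(modified_at, kind, last_full, last_backup):
    """Return the sorted names of items a backup of the given kind would copy.

    modified_at: dict mapping item name -> last-modified time
    last_full:   time of the most recent full backup
    last_backup: time of the most recent backup of any type
    """
    if kind == "full":
        # Everything, regardless of when it changed.
        return sorted(modified_at)
    if kind == "differential":
        cutoff = last_full      # changes since the last FULL backup
    elif kind == "incremental":
        cutoff = last_backup    # changes since the last backup of ANY type
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(name for name, t in modified_at.items() if t > cutoff)

if __name__ == "__main__":
    # Full backup taken at time 3, an incremental backup taken at time 7.
    changes = {"customers": 1, "orders": 5, "invoices": 9}
    for kind in ("full", "differential", "incremental"):
        print(kind, select_for_backup(changes, kind, last_full=3, last_backup=7))
```

The trade-off this illustrates: incremental backups are the smallest, but a restore needs the full backup plus every incremental taken since it, while a differential restore needs only the full backup plus the latest differential.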
Data and Database
Administration
Chapter 11
Traditional Administration Definitions
• Data Administration: A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
• Database Administration: A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
Data People Involved in SDLC
• Data Administrators
 › Data(base) Analysts/Designers
   • requirements elicitation, design
 › Business (Intelligence) Analyst
   • BI requirements, design
 › Data Architects
   • strategy, governance
 › Data Stewards
   • quality, metadata, MDM
 › Business Analytics Engineer
   • data analytics, statistics, mining
 › Data Mining Engineer; Big Data Engineer; Data Scientist …
   • “big data” specialists
• Database Administrators
 › (System) DBAs
   • implementation/maintenance
 › Application DBAs
 › Procedural DBAs
   • stored code
 › e-DBAs
   • web-enabled DBMSs
 › Data Warehouse Administrators
   • ETL, DW implementation
Growing Skillset
• Relational database design, implementation
• Database programming
• ETL (extract, transform, load)
• Data warehousing design (star schema) and implementation (MDDB)
• Data analysis, reporting, and mining techniques
• Statistical modeling with tools such as R, SAS, or SPSS
• Data visualization tools
• Cloud database implementations
• Technologies for structured and unstructured data
 › Hadoop (Hadoop is an Apache project to provide an open-source implementation of frameworks for reliable, scalable, distributed computing and data storage.)
 › NoSQL
 › "NewSQL"
***See Big Data University for (mostly) free self-study training
Data Quality and
Integration
Chapter 10
Metadata Management
• System Catalog
 › Part of DBMS
 › "Active" dictionary
• Data Dictionary
 › Typically "passive"
 › Extension of catalog metadata
• Information Repository (e.g., IRDS)
 › Standards for data dictionaries
 › Integrates dictionaries
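
To make the system catalog idea concrete, the sketch below reads a DBMS's own metadata to list each table's columns and declared types. It is an illustration using Python's standard-library sqlite3 module, where the catalog is the sqlite_master table and column details come from PRAGMA table_info; other products, including SQL Server, expose their catalogs through different views. The describe_tables helper and the student table are hypothetical.

```python
import sqlite3

def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} from the catalog."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Double any embedded quotes so the name is a safe quoted identifier.
        quoted = table.replace('"', '""')
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
    for table, columns in describe_tables(conn).items():
        print(table, columns)
```

The catalog is "active" in the sense that the DBMS maintains it automatically: the CREATE TABLE statement updated it with no separate documentation step.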
Master Data Management
• "Ensuring the currency, meaning, and quality of reference data within and across various subject areas" (pg 444)
• Identify
 › Common Data Subjects
 › Common Data Elements
 › Sources of "the truth"
• Cleanse
• Update applications to reference Master Data repository
• Ensures consistency of key data (not ALL data) throughout organization
Data and Database
Administration
Chapter 11
Cloud Computing
• Business Model
 › Computing resources on demand
 › Need-based architectures
 › Internet-based delivery
 › Pay as you go
• History (VERY high-level and approximate)
[Figure: timeline from the 50's through the 2000's showing Time-sharing, Utility Computing, Virtual Machines, Personal Computers, WWW, Grid Computing, Cloud Computing]
Cloud Computing Services
• Impacts to Data(base) Administration
• See textbook page 469
Summary
• Evolution of Data Management
• Database Concepts
 › Components of a DBMS Environment
 › Database Advantages
 › Disadvantages of file processing
• People Involved in Data Management
 › Traditional job divisions and responsibilities
 › Newer job titles
• Database Development:
 › Overall SDLC
 › Database Activities in the SDLC
• Special Topics
 › Metadata Management
 › MDM
 › Cloud Computing Impacts