Database Beginnings CIS 121 – Computer Concepts II Instructor: Ron Christensen What is Data? • Data are facts about things, places, events • Facts about data are called metadata • Data are building blocks for information 3/10/2016 CIS 121 - Portland Community College 2 Common Data Applications • Word Processors – Like fifth generation language, very difficult to track • Spreadsheets – Mostly two dimensional. When items change, multiple spreadsheets become problem some. • Databases – More efficient, real-time, online. 3/10/2016 CIS 121 - Portland Community College 3 DBMS – Database Management System • DBMS software… – – – – – 3/10/2016 Manages metadata Organizes data Enforces security Manages access to data and metadata Enforces integrity CIS 121 - Portland Community College 4 Figure 1.2 Information System Input Output Database 3/10/2016 CIS 121 - Portland Community College 5 What is a Database (Black Box)? • A computer accepts input, processes it according to some rules, and makes output. • Everything will be represented as data, including the input and the output • So what the DBMS is doing is taking the data produced (the input) and converting it into information (the output) 3/10/2016 CIS 121 - Portland Community College 6 Relational DBMS • Relational refers to the method of organizing data, i.e. in tables • Developed in the1970s • Most common of several different types of DBMS systems – Hierarchical – Network – Object 3/10/2016 CIS 121 - Portland Community College 7 Microsoft ACCESS • Relational DBMS software for Windows • Available as part of MS-Office 3/10/2016 CIS 121 - Portland Community College 8 Type of Databases • Production Database – Transaction oriented – Lots of well-defined read/write/update actions • Decision Support Database – Query oriented – Complex queries – Few updates or writes 3/10/2016 CIS 121 - Portland Community College 9 A Bigger Picture… • Databases are a critical part of information systems • Information systems help people solve problems • Your personal database and web page will form the basis of a personal information system 3/10/2016 CIS 121 - Portland Community College 10 Figure 1.2 Information System Software Operating System Application Programs People DBMS Write and Use Database Design and Manage Hardware 3/10/2016 CIS 121 - Portland Community College 11 Entities • Entities are anything about which you want to know something – People, places, things, events, concepts • Data are facts about entities 3/10/2016 CIS 121 - Portland Community College 12 Entity Sets • Entity sets are collections of related entities. Entities are related by their classification: • student entities are related by the fact that they are all students • invoice entities are related by the fact that they are all invoices • car entities are related by the fact that they are all cars 3/10/2016 CIS 121 - Portland Community College 13 Entity Sets, cont. •Entity sets are named as a broad definition of the whole entity. •Entity and Entity set names are singular. •Entity and Entity set names are capitalized. Examples: An entity named HR contains HR entities. An entity named EMPLOYEE contains employee records. An entity named DEPARTMENT contains department records. 3/10/2016 CIS 121 - Portland Community College 14 Entity Sets, cont. • Entity Sets can only contain related records – a STUDENT entity set may not contain INVOICE data – a DEPARTMENT entity set may not contain invoice data – a PRODUCT entity set may not contain employee data …. And so on 3/10/2016 CIS 121 - Portland Community College 15 Relational Databases • Data is stored in TABLES (also called relational sets) • Tables are collections of attributes for related records • The reality is that lots of different things get stuffed into a table for a variety of reasons • (Tables are another word for entity sets) 3/10/2016 CIS 121 - Portland Community College 16 Database Tables • A database table is used to store a record (or tuple) – An entity set is a collection of related records • An record is anything you want to keep track of, so an entity may be a person, place, thing, event, etc. – (Record is another word for Entity) 3/10/2016 CIS 121 - Portland Community College 17 Database Table Components • At the conceptual level, a database table may be viewed as a matrix. – Matrix rows are also known as tuples or records • each row contains an record – Matrix columns are also called fields or attributes. • Each column (field) contains the record’s attribute values 3/10/2016 CIS 121 - Portland Community College 18 Attributes • Attributes are the characteristics that describe records – A STUDENT record may be described by attributes that may include... • • • • • 3/10/2016 social security number name address date of birth major CIS 121 - Portland Community College 19 Records, Attributes, Data • Records are things about which you want to know something, e.g. STUDENT • Attributes describe something about the entity, e.g. the name of the student • Data are the values of the attributes, e.g. DOE, JOHN 3/10/2016 CIS 121 - Portland Community College 20 Null • A null is an absence of value • A null is NOT – A blank – A zero • A null has several meanings in a database – Value does not exist – Value is not known 3/10/2016 CIS 121 - Portland Community College 21 Attribute Names • Attribute names are capitalized. • For documentation reasons, attribute names are composed of two parts: – the first few characters reflect the entity they help describe. – subsequent characters are sufficiently descriptive to identify the attribute. 3/10/2016 CIS 121 - Portland Community College 22 Attribute Names, cont. • Examples of attribute names: EMP_LNAME = employee last name STU_GPA = student grade point average PROD_CODE = product code CUST_LNAME = customer last name INV_NUM = invoice number 3/10/2016 CIS 121 - Portland Community College 23 Keys • Primary Key (PK) – an attribute (or combination of attributes) that uniquely identifies each row (tuple) in a table. • A PK composed of two or more attributes is known as a composite PK. • Foreign Key (FK) – an attribute in one table whose values match the PK values in a related table or whose “values” are null. • FKs are used to link (connect) related tables. 3/10/2016 CIS 121 - Portland Community College 24 Data Integrity. • Entity Integrity – PK uniquely identifies each entity in a table • PK may not include nulls • Referential Integrity – FK values in one table match the PK values in the related table • FK may not include nulls 3/10/2016 CIS 121 - Portland Community College 25 A Conceptual View of a Database Table Tables are named. The table you see here is the EMPLOYEE table. Named attributes (fields) Each column contains the values of an attribute. The EMP_FNAME column only contains employee first names; the EMP_PHONE may only contain employee phone numbers. Each row represents an record A row is also called a tuple. The Primary Key (PK) is a unique record identifier. If you know the PK value, you will know all of its CIS row’s attribute values 3/10/2016 121 - Portland Community College Each row/column intersection contains only one of an entity’s attribute values 26 Entity Integrity EMPLOYEE table A table exhibits entity integrity when all of its Primary Key (PK) values uniquely identify each table row (record.) 1. A PK cannot contain duplicate values 2. A PK cannot contains nulls Note: A null indicates the absence of a value; it is not a blank. (You create a null when you tap the ENTER key without first making an entry. A blank is created when you tap the space bar and then tap the ENTER key.) 3/10/2016 CIS 121 - Portland Community College 27 Foreign keys (FK) and Referential Integrity A Foreign Key (FK) is an attribute located in one table that “points to” a Primary Key (PK) in a related table. The use of FKs allows you relate one table to another. To maintain referential integrity, a foreign key (FK) must reference an existing PK value in a related table it may be null. INVOICE table LINE table 3/10/2016 CIS 121 - Portland Community College 28 Attribute types 1. Simple (atomic) 2. Composite A. Single-valued B. Multi-valued Derived 3/10/2016 CIS 121 - Portland Community College 29 Simple (Atomic) Attributes • A simple (atomic) attribute cannot be decomposed into meaningful components • Examples: – The attribute EMP_LNAME cannot be decomposed, because you cannot subdivide EMP_LNAME into a set of new attributes. – The attribute PROD_PRICE cannot be decomposed, because you cannot subdivide PROD_PRICE into a new set of attributes. 3/10/2016 CIS 121 - Portland Community College 30 Simple (Atomic) Attributes, cont. Simple attributes may be …. • single-valued or • multi-valued 3/10/2016 CIS 121 - Portland Community College 31 Simple (Atomic) Attributes, cont. • Single-valued simple attributes • Example: an employee can have only one gender, so EMP_GENDER is a single-valued attribute. The attribute EMP_GENDER cannot be decomposed, so it is a simple attribute. • Multi-valued simple attributes • Example: an employee can have many degrees, so EMP_DEGREE is multi-valued. The attribute EMP_DEGREE cannot be decomposed, so it is a simple attribute. 3/10/2016 CIS 121 - Portland Community College 32 Composite Attributes •A composite attribute can be decomposed into meaningful components • Example: an employee’s address, shown as 123 East Main Street, Nashville, TN 32123 may be decomposed into EMP_ADDRESS = 123 East Main Street EMP_CITY = Nashville EMP_STATE = TN EMP_ZIP = 32123 3/10/2016 CIS 121 - Portland Community College 33 Composite Attributes, cont. A composite attribute may be …. • single-valued or • multi-valued 3/10/2016 CIS 121 - Portland Community College 34 Composite Attributes, cont. • single-valued composite attributes – Example: an employee can have only one date of birth, so EMP_DOB is single-valued. But the attribute EMP_DOB can be decomposed into year, month, and day, so it is a composite attribute. • multi-valued composite attributes – Example: an employee can have more than one address, so EMP_ADDRESS may be multi-valued. The attribute EMP_ADDRESS can be decomposed into street address, city, state, and ZIP code, so it is a composite attribute. 3/10/2016 CIS 121 - Portland Community College 35 Attribute Storage • Each table row/column intersection contains a single attribute value for a single entity. – Ideally, attributes are simple and single-valued. – Single-valued composite attributes are acceptable • but composite attributes may make queries more complex and may impose reporting limitations. – Multi-valued attributes, either simple or composite, may • create structural problems • make queries more complex and may impose reporting limitations. • May be necessary for a variety of reasons, but should be avoided where possible 3/10/2016 CIS 121 - Portland Community College 36 Multi-valued Attribute Storage • Multi-valued attributes are sometimes stored as – strings • this approach yields query complexity and reporting limitations – multiple attributes • this approach yields structural problems • Ideally, multi-valued attributes are handled through the use of composite tables. 3/10/2016 CIS 121 - Portland Community College 37 Multi-valued Attributes Stored As Strings Multi-valued attribute Poor practice: Makes it difficult to generate queries such as “How many employees have earned BA or MBA degrees?” 3/10/2016 CIS 121 - Portland Community College 38 Storing Multi-valued Attributes In Separate Columns Poor structure: many nulls table structure must be altered when additional degrees are earned 3/10/2016 CIS 121 - Portland Community College 39 Composite Tables • Multi-valued attributes can generate many nulls, increasing uncertainty about the data • One solution is to add another table, called a composite table or bridge table • It is often better to use more tables rather than poor attributes in fewer tables 3/10/2016 CIS 121 - Portland Community College 40 A Composite Table (EDUCATION) is Used To Convert M:N Relationships To 1:M Relationships Table name: EMPLOYEE Table name: EDUCATION Table name: DEGREE 3/10/2016 CIS 121 - Portland Community College 41 Supertype/Subtype Relationships Are Used To Eliminate or Control the Occurrence of Nulls A table with many nulls Supertype. (Table name: EMPLOYEE Subtype. (Table name: PILOT) 3/10/2016 CIS 121 - Portland Community College 42 Relationships • Tables can be related to each other in a variety of ways – – – – 1:1 Relationships 1:M Relationships M:N Relationships Recursive Relationships • Tables are related by their PKs and FKs 3/10/2016 CIS 121 - Portland Community College 43 1:1 Relationships • Entity in one table is related to only one entity in another table • Example – STUDENT : SOCIAL SECURITY NUMBER Each student has only one Social Security Number, each Social Security Number refers to only one student 3/10/2016 CIS 121 - Portland Community College 44 1:M Relationships • Entity in one table is related to many entities in another table • Example – STUDENT : TUITION INVOICE Each student may have several tuition invoices, each tuition invoice is assigned to only one student 3/10/2016 CIS 121 - Portland Community College 45 M:N Relationships • An entity can occur more than once on both sides of the relationship – causes several different problems in databases • Example: – STUDENT : CLASS Each student may register for many classes, each class may contain many students 3/10/2016 CIS 121 - Portland Community College 46 Some Issues with M:N • May signal redundant data, resulting in anomalies – Modification anomaly – Deletion anomaly • Keep it simple: DO NOT IMPLEMENT M:N RELATIONSHIPS. Use composite tables instead 3/10/2016 CIS 121 - Portland Community College 47 Recursive Relationships • Entity is related to itself • Examples – TEAM : TEAM Teams in an athletic league play each other – CLASS : CLASS Some classes are pre-requisites for others 3/10/2016 CIS 121 - Portland Community College 48 Optional (vs. Mandatory) Relationships • Optional relationships occur when entities in two tables may be related, but do not need to be. • Example (Optional Relationship): – STUDENT : SOCIAL SECURITY NUMBER A student is not required to have a social security number • Example (Mandatory Relationship) – STUDENT : CLASS A student must be registered in a class to be a student 3/10/2016 CIS 121 - Portland Community College 49 End of Lecture 3/10/2016 CIS 121 - Portland Community College 50