Chapter 9
Database Systems
Courtesy of Chris Pascucci, Shelly/Vermaat, Joanne Nichols

Database Basics

Database
  – Collection of data on a specific topic or for a specific purpose, stored for future use.
  – Data is organized so you can access, retrieve, sort, and edit it.
Database Management System (DBMS)
  – Software used to create, use, and manage a database.
  – Used to create forms, reports, and queries.
Database System
  – Composed of a database, a DBMS, and applications.
  – Applications such as e-commerce and scheduling.
  – University example: registration applications, financial applications, etc.

Database Basics

Data
  – Unprocessed items such as raw facts, numbers, and text.
Information
  – Data that has been processed in an organized and meaningful way.
A major function of a computer is to process data into information.

Database Integrity

Data integrity is maintained when data is accurate and up-to-date.
Garbage In, Garbage Out (GIGO)
  – Computing phrase meaning that you cannot create correct information from incorrect data.
  – Garbage in leads to garbage out: data integrity is lost.

Characteristics of Information

Accurate
Verifiable
Timely
Organized
Accessible
Useful
Cost-effective

Data Dictionary

A data dictionary contains data about each file in the database and each field in those files.

Validating Data

Alphabetic/numeric check
Range check
Consistency check
Completeness check
Check digit
Other checks

Database Systems

Multiple users can interact with the same database.

The Hierarchy of Data

A database contains files, files contain records, and records contain fields.
Database
  – A collection of integrated and related files.
File
  – A collection of related records.
Record
  – A collection of related fields.
Field
  – A collection of characters that describe some aspect of an object.
  – A single piece of information such as a name, number, city, or state.

Data File Example

Member ID   First Name   Last Name   Address              City          State
2295        Milton       Brewer      54 Lucy Court        Shelbyville   IN
2928        Shannon      Murray      33099 Clark Street   Montgomery    AL
3876        Louella      Drake       33 Timmons Place     Cincinnati    OH
3928        Adelbert     Ruiz        99 Tenth Street      Carmel        IN
4872        Elena        Gupta       2 East Penn Drive    Pittsboro     IN

Each row is a record and each column is a field; Member ID is the key field.

Database Management Systems

File Maintenance

Modifying records
Adding records
Deleting records

Benefits of Using a DBMS

Enter data quickly and easily.
Organize records in different and useful ways.
Locate records quickly.
Eliminate redundant data.
Create queries for specific data.
Create reports.

DBMS

Database Approach to Data Management

Database Approach
  – Many applications and users can share data in a database.
  – Secures data so only authorized users can access it.
    • Access privileges (none, read-only, full-update)
    • Principle of least privilege
  – Provides a means to back up data.
  – Requires a DBMS.
File Processing System Approach
  – Each department or area within an organization has its own set of files.
  – Data redundancy: the same data is stored in multiple files.
  – Isolated data: data stored in files at various physical locations is difficult to access.

Benefits of Using a Database Approach

File Processing vs DBMS

Database Management

Creating and implementing the right database system involves determining:
  – How data is stored and retrieved.
  – How people will see and use the database.
  – How the database will be created and maintained.
  – How reports and documents will be generated.

Types of Databases

Relational databases (most commonly used)
Object-oriented databases
Multi-dimensional databases (used for data warehouses)
Others

Relational Databases

A relational database stores data in tables that consist of rows and columns.
Most common type of database; used for payroll, inventory, ordering, and other business-related functions.
Also stores data relationships, which are connections within the data.

Object-Oriented Databases

An object-oriented database stores data in objects.
An object is an item that contains data, as well as actions that read and process the data.
Mainly used for multimedia databases (video, audio, graphics), CAD (computer-aided design), and Web databases.

Multi-Dimensional Databases

A multi-dimensional database stores data in dimensions.
Multiple dimensions, also called a hypercube, allow users to analyze any view of the data.
Can consolidate data much faster than a relational database.

Large-Scale Databases

Data Warehouse
  – Huge database that stores and manages massive amounts of data.
  – Holds important information from a variety of sources.
  – Usually a subset of multiple databases.
Data Mart
  – Smaller version of a data warehouse.
  – Often developed for a specific purpose.
    • Examples: sales department, inventory and shipping department, finance department, upper-level management, and so on.
  – Regional operating centers might each have their own data mart that contributes to the master data warehouse.

Large-Scale Databases

Data Mining
  – A technique used to extract information from a data warehouse or a data mart.
  – Sorts through huge amounts of data to find patterns and establish relationships among the data.
Business Intelligence
  – Business use of data mining can help increase efficiency, reduce costs, or increase profits.
  – Identifies trends.
  – Identifies patterns in customer behavior.

Example of Data Mining

Wal-Mart captures point-of-sale transactions from over 2,900 stores in 6 countries and continuously transmits this data to its massive 500+ terabyte data warehouse.
1 terabyte = 1 trillion characters (bytes).
Can determine which products are selling well or poorly in which regions.
Database is refreshed every hour.
Wal-Mart allows more than 3,500 suppliers to access data on their products and perform data analyses.
These suppliers use this data to identify customer buying patterns at the store display level.
They use this information to manage local store inventory and identify new merchandising opportunities.

Data Mining

Some concerns regarding data mining:
  – DARPA (Defense Advanced Research Projects Agency) developed project TIA (Terrorism Information Awareness).
    • Main goal of TIA is to preemptively uncover and disrupt terrorist attacks.
    • TIA helps the U.S. government monitor daily transactions and records, such as credit card transactions, airline tickets, car rentals, passports, and driver's licenses.
  – Medical information
    • Prescription reminders sent from a pharmacy require access to certain personal information.
    • Profiling patients based on factors such as age, gender, and disease.
  – Clinicians must make patients aware of how their information may be used.
  – Limitations:
    • Data mining tools are not self-sufficient applications; they require trained specialists to analyze the information they generate.
    • Patterns and connections that are found depend on "real world" circumstances that may be casual and not necessarily a threat.

Databases

How are databases important to us?
Shop for products or services
Buy or sell stocks
Search for a job
Make airline reservations
Register for college classes
Check semester grades

Databases in Action

NCIC (National Crime Information Center)
FBI's huge database, created in 1967 under J. Edgar Hoover.
Over 15 million active records in 19 files.
Makes available a variety of records for law enforcement and security purposes.
Information in this database assists in:
  – Apprehending fugitives
  – Locating missing persons
  – Locating and returning stolen property
http://www.fbi.gov/about-us/cjis/ncic

Databases in Action

National DNA Database
Originally intended for sex offenders; has since been extended to include almost any criminal offender.
FBI uses this database to store missing persons' DNA.
Stores DNA samples from crime scenes.
Used to identify unidentified human remains.
The US has over 9 million records in CODIS (Combined DNA Index System), the largest DNA database in the world.

Databases in Action

National Security Agency (NSA) Database
Described as the "largest database ever assembled in the world" by an unnamed source in the NSA.
Contains hundreds of billions of records of telephone calls.
The existence of this database and the NSA program that compiled it was unknown to the general public until USA TODAY broke the story in May 2006.
Reportedly collects call records and taps telecommunications traffic via a "black room" called "Room 641A".
Supercomputers analyze all data in the database to find certain flags.
  – Terrorist chatter
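
Sketch: The Data File Example as a Relational Table

The Data File Example, File Maintenance, and Relational Databases slides can be tied together with a short program. What follows is a minimal sketch, not a reference implementation: it uses Python's built-in sqlite3 module as a stand-in DBMS, and the table name `member`, the column names, and the extra record with ID 5001 are assumptions made up for illustration. It loads the five member records, treats Member ID as the key field, performs the three file maintenance operations, and runs one query.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE member (
        member_id  INTEGER PRIMARY KEY,  -- key field: uniquely identifies a record
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        address    TEXT,
        city       TEXT,
        state      TEXT
    )
    """
)

# Records from the Data File Example slide (one tuple per record).
members = [
    (2295, "Milton", "Brewer", "54 Lucy Court", "Shelbyville", "IN"),
    (2928, "Shannon", "Murray", "33099 Clark Street", "Montgomery", "AL"),
    (3876, "Louella", "Drake", "33 Timmons Place", "Cincinnati", "OH"),
    (3928, "Adelbert", "Ruiz", "99 Tenth Street", "Carmel", "IN"),
    (4872, "Elena", "Gupta", "2 East Penn Drive", "Pittsboro", "IN"),
]
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?, ?)", members)

# File maintenance: add, modify, and delete a record.
# The record with ID 5001 and the new address are hypothetical sample values.
conn.execute(
    "INSERT INTO member VALUES (?, ?, ?, ?, ?, ?)",
    (5001, "Sample", "Member", "1 Example Road", "Carmel", "IN"),
)
conn.execute(
    "UPDATE member SET address = ? WHERE member_id = ?",
    ("100 Example Avenue", 3928),
)
conn.execute("DELETE FROM member WHERE member_id = ?", (5001,))

# The key field must be unique, so a second record with ID 2295 is rejected.
try:
    conn.execute(
        "INSERT INTO member VALUES (?, ?, ?, ?, ?, ?)",
        (2295, "Duplicate", "Key", "n/a", "n/a", "IN"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Query: last names of all members in Indiana, sorted alphabetically.
indiana_members = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM member WHERE state = ? ORDER BY last_name", ("IN",)
    )
]
record_count = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]

print(indiana_members)     # ['Brewer', 'Gupta', 'Ruiz']
print(duplicate_rejected)  # True
print(record_count)        # 5
```

The query illustrates the "locate records quickly" and "create queries for specific data" benefits: the program states which records it wants and the DBMS finds them.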
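
Sketch: Validation Checks in Code

The checks named on the Validating Data slide can each be written as a small function. This is a hedged sketch under stated assumptions: the function names, the sample record, and the field names are invented for illustration, and the check digit uses the Luhn scheme, which is only one of several check-digit methods and is not one the slides specify.

```python
from datetime import date


def is_alphabetic(value):
    """Alphabetic check: only letters (spaces allowed between words)."""
    return value.replace(" ", "").isalpha()


def is_numeric(value):
    """Numeric check: only digits."""
    return value.isdigit()


def in_range(value, low, high):
    """Range check: value lies between low and high, inclusive."""
    return low <= value <= high


def is_consistent(start, end):
    """Consistency check: two related fields agree (start is not after end)."""
    return start <= end


def is_complete(record, required_fields):
    """Completeness check: every required field is present and non-blank."""
    for name in required_fields:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            return False
    return True


def has_valid_check_digit(number):
    """Check digit (Luhn scheme, chosen here as an example)."""
    if not number.isdigit():
        return False
    total = 0
    # Walk the digits right to left, doubling every second one.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Hypothetical record used only to exercise the checks.
record = {"first_name": "Elena", "last_name": "Gupta", "state": ""}

results = {
    "alphabetic": is_alphabetic("Elena"),
    "numeric": is_numeric("48A2"),
    "range": in_range(25, 0, 40),
    "consistency": is_consistent(date(2024, 1, 15), date(2024, 1, 10)),
    "completeness": is_complete(record, ["first_name", "last_name", "state"]),
    "check_digit": has_valid_check_digit("79927398713"),
}
print(results)
```

Here the numeric, consistency, and completeness checks fail on purpose: "48A2" contains a letter, the start date falls after the end date, and the state field is blank. Rejecting such input at entry time is how a DBMS helps avoid the garbage-in, garbage-out problem.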