CSCI N207 Data Analysis with Spreadsheets
2a. What and Why Database?
Lingma Acheson
Department of Computer and Information Science
IUPUI

What is a Database?
• In a general sense, a database is any organized collection of data.
• Examples:
  – Grocery List
  – Audio CD Catalog
  – Phone Book
  – Airline Ticketing
  – Oncourse
  – Amazon
  – eBay
• Forms of managing data:
  – Manual bookkeeping
  – Spreadsheets
  – Word documents
  – Database management tool
  – …

Why Use a Database Tool?
• From a technical point of view, a database is an organized collection of data stored using a database management tool, e.g. Microsoft Office Access, Oracle, Microsoft SQL Server, MySQL.
• Purposes of using a database tool:
  – Keep better track of things
  – Obtain information quickly
  – Analyze large amounts of data and gain more insight
• Some data can be stored in a spreadsheet.
  – E.g. [Figure: spreadsheet listing students]
  – No problem with adding or deleting students.
• When data involves two different yet related kinds of information:
  – E.g. student and adviser information stored in one spreadsheet: [Figure: spreadsheet combining students and advisers]
• Problems occur:
  – Modification problems:
    • Deleting row 6 results in the loss of one adviser.
    • Changing the email in row 8 requires the same change in row 5.
    • Adding an adviser with no students produces null values.
  – Other problems:
    • Duplication, e.g. 20 students with one adviser would result in the same adviser information repeated 20 times.
    • Confusion, e.g. if several different emails were found for one adviser, which one is correct?
• How about using two spreadsheets, one storing information about students and the other storing information about advisers?
  – Question: How would the relationship between students and their advisers be reflected?
• Solution: Use a relational database. Store student information in one table and adviser information in another table; the database system lets you define relationships between the two tables.
• E.g. STUDENT table and ADVISER table
  – Can add a student without touching the ADVISER table, and vice versa.
  – Can link the two tables together to see the relationship.
[Figure: STUDENT table and ADVISER table]
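
The two-table idea above can be sketched with Python's built-in sqlite3 module, used here only as a convenient stand-in for the database tools the slides name. The slides give the table names STUDENT and ADVISER but not their contents, so the column names and every sample row below are hypothetical.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical schema: the column names are assumptions, not taken from the slides.
conn.executescript("""
CREATE TABLE ADVISER (
    adviser_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    email      TEXT NOT NULL
);
CREATE TABLE STUDENT (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    adviser_id INTEGER REFERENCES ADVISER(adviser_id)
);
""")

# Hypothetical sample rows. Adviser B has no students yet, and no null-filled row is needed.
conn.executemany("INSERT INTO ADVISER VALUES (?, ?, ?)", [
    (1, "Adviser A", "a@example.edu"),
    (2, "Adviser B", "b@example.edu"),
])
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)", [
    (101, "Student X", 1),
    (102, "Student Y", 1),
])

# Add a student without touching the ADVISER table.
conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", (103, "Student Z", 2))

# Change an adviser's email in exactly one place, however many students they advise.
conn.execute("UPDATE ADVISER SET email = ? WHERE adviser_id = ?",
             ("a.new@example.edu", 1))

# Delete a student: the adviser's row is unaffected.
conn.execute("DELETE FROM STUDENT WHERE student_id = ?", (103,))

# Link the two tables to see each student together with their adviser.
rows = conn.execute("""
    SELECT s.name, a.name, a.email
    FROM STUDENT AS s
    JOIN ADVISER AS a ON a.adviser_id = s.adviser_id
    ORDER BY s.student_id
""").fetchall()
adviser_count = conn.execute("SELECT COUNT(*) FROM ADVISER").fetchone()[0]

for row in rows:
    print(row)
print("advisers on file:", adviser_count)
```

Each adviser's details are stored once, so the duplication and conflicting-email problems of the single spreadsheet do not arise in this layout, and deleting the only student of Adviser B still leaves two advisers on file.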