Brian Harrington (Slides adapted from material by Dr. Purva Gawde and Dr. Anna Bretscher)

Welcome to CSCB20

What is this course?
● A practical introduction to databases and web app development

Why Databases?
● It’s how computers organize information

Why Web Apps?
● It’s how people access information

Course Structure
● Database Design: First 5-6 weeks
● Web Application Design: Next 6-7 weeks
● Weekly Work:
  ○ Lectures: 2 hours per week
  ○ Tutorials: 1 hour per week (starting in week 2)
  ○ Reading: Approx 1 hour per week
  ○ Practice: Approx 2 hours per week
  ○ Assignments: Approx 1-2 hours per week

Evaluation
● Term Work (35%)
  ○ 3 Assignments
    ■ Assignment 1 - 8%
    ■ Assignment 2 - 12%
    ■ Assignment 3 - 15%
● Midterm (25%)
● Exam (40%)
  ○ (Need to score at least 35% on the final exam to pass the course)

How To Succeed in CSCB20
● Come to Lecture/Tutorial
● Stay on top of material
● Do assigned readings
● Practice
● Have fun!

Lecture Plan
● Databases
  ○ What/Where/Why/When/How
● Web Applications
  ○ Very brief overview
● Back to Databases
  ○ Introduction to Relational Algebra
    ■ Procedural query language
  ○ Terminology

Databases: What
● A Database is just a collection of interrelated data
  ○ (technically doesn’t have to be interrelated, just really boring if it’s not)
[Figure: "A Database"]

Databases: What
● A Database Management System (DBMS) is:
  ○ One or more databases
  ○ A set of programs to access/manipulate data
  ○ Another set of programs to ensure integrity of data
[Figure: diagram labelled "You (not touching data directly)", "DBMS", "Databases"]

Databases: Why?
● Ummm… to store data?
● But why not just plain text files?
  ○ Fine for small amounts of data
  ○ Difficult to format
  ○ Can’t perform any checks/calculations
● Spreadsheets?
  ○ Fine for medium amounts of data
  ○ Designed for single access
  ○ No redundancy/accuracy checking

Databases: Why?
● Database transactions are (or should be) ACID
  ○ Atomic (all or nothing)
  ○ Consistent (every transaction leaves the data in a valid state)
  ○ Isolated (can’t impact other transactions)
  ○ Durable (once committed, stays committed)

Databases: Where?
● Basically anywhere you have large amounts of structured, interrelated data that needs concurrent access
  ○ Retail
  ○ Manufacturing
  ○ Banking
  ○ Universities
  ○ Travel
  ○ Telecommunications
[Figure. Source: Brand Finance (@visualcap)]

Databases: Which?
● Many types of DBMS
  ○ Object-Oriented
  ○ Network
  ○ NoSQL
  ○ Hierarchical
  ○ Relational (← this is us)
● Why are we using Relational?
  ○ Commonly used
  ○ Cross-platform
  ○ Integrates with common web app software

Relational Database Model
● Data is stored in tables
  ○ Each row represents a single entity
  ○ Each column represents a specific attribute
  ○ Relations between entities can be indicated by shared keys
[Photo: Ted Codd, Turing Award 1981]

Important Aside: Data Abstraction
● The Relational Model is only an abstraction, a collection of tools for describing data, semantics, relationships and constraints.
● The data (probably) isn’t actually held in simple tables; complex data structures can be used to make things more efficient. But thanks to abstraction… we don’t have to care about them (unless we take CSCC43)

Database Languages
● Data Definition Language (DDL)
  ○ Specify the database schema (tables, attributes, relations)
● Data Manipulation Language (DML)
  ○ Access and update data
  ○ We will be using SQL

Web Applications (A very brief overview)

Web Applications
● Front End: Interacting with the user
  ○ HTML (put content on the page)
  ○ CSS (ensure consistent style)
  ○ JavaScript (add dynamic elements)
● Back End: Interacting with the database/servers/etc.
  ○ JavaScript/Ruby/Java/Python (manipulate user requests)
  ○ SQL (interact with DBMS)

BREAK

Relational Databases Introduction
● Some Terminology:
  ○ Tables (Relations): set of data about a specific relationship
  ○ Tuples (Rows): information about a particular entity with respect to a particular relation
  ○ Attributes (Columns): The specific values of a relation
  ○ Schema: Logical design of tables/relations

Relational Databases Introduction

Instructor
name        dept_name   office
Harrington  CMS         IC342
Turing      CMS         IC123
Einstein    Physics     SW999
Mozart      Music       AA123
Freud       Psychology  SY234

Course
code    semester  enrolment
CSCA20  F         500
CSCA08  F         800
CSCB20  S         175
PSYA01  F         1000
PHYD01  S         3

Instructor(name, dept_name, office)
Course(code, semester, enrolment)

Relational Databases Introduction

Instructor
ID     name        dept_name   office
12345  Harrington  CMS         IC342
23456  Turing      CMS         IC123
41231  Einstein    Physics     SW999
23124  Mozart      Music       AA123
98765  Freud       Psychology  SY234

Course
code    dept_name   semester  enrolment
CSCA20  CMS         F         500
CSCA08  CMS         F         800
CSCB20  CMS         S         175
PSYA01  Psychology  F         1000
PHYD01  Physics     S         3

Instructor(ID, name, dept_name, office)
Course(code, dept_name, semester, enrolment)

Relational Databases Introduction
● Keys:
  ○ Need to make sure we can uniquely identify a tuple within a relation
  ○ Superkey: a set of attributes that will uniquely identify a tuple
  ○ Candidate key: Superkey with a minimal set of attributes
  ○ Primary key: Candidate key that we (database designers) choose as our unique ID

Relational Databases Introduction
● Keys:
  ○ Superkeys?
  ○ Candidate Keys?
  ○ Primary Key?

Instructor
ID     name        dept_name   office
12345  Harrington  CMS         IC342
23456  Turing      CMS         IC123
41231  Einstein    Physics     SW999
23124  Mozart      Music       AA123
98765  Freud       Psychology  SY234

Things to do this week:
● Readings: Course Syllabus
● Practice: None (1 bonus hour of self-care time)
● To-dos:
  ○ Check that you can log into Piazza
  ○ Check that you are enrolled in a tutorial

Next Week
● Tutorials start
● More on relational model
● Relational Algebra
  ○ (don’t worry, it’s not as scary as it sounds)
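
Optional sketch for the keys exercise on the Instructor relation above. This is not part of the original slides; it is a minimal Python illustration of the superkey and candidate key definitions. The function names (is_superkey, superkeys, candidate_keys) are made up for this example, and the check only looks at the five sample rows. A real key is a promise about every row the table could ever hold, so uniqueness in a small sample is evidence, not proof.

```python
from itertools import combinations

# Instructor relation copied from the slide: (ID, name, dept_name, office)
ATTRIBUTES = ("ID", "name", "dept_name", "office")
INSTRUCTOR = [
    (12345, "Harrington", "CMS", "IC342"),
    (23456, "Turing", "CMS", "IC123"),
    (41231, "Einstein", "Physics", "SW999"),
    (23124, "Mozart", "Music", "AA123"),
    (98765, "Freud", "Psychology", "SY234"),
]


def is_superkey(attrs, rows=INSTRUCTOR):
    """True if no two of the given rows agree on every attribute in attrs."""
    idx = [ATTRIBUTES.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)


def superkeys():
    """All non-empty attribute sets that are unique across the sample rows."""
    return [
        set(combo)
        for size in range(1, len(ATTRIBUTES) + 1)
        for combo in combinations(ATTRIBUTES, size)
        if is_superkey(combo)
    ]


def candidate_keys():
    """Superkeys with no proper subset that is also a superkey."""
    keys = superkeys()
    return [k for k in keys if not any(other < k for other in keys)]


if __name__ == "__main__":
    print("dept_name alone is a superkey:", is_superkey(("dept_name",)))
    print("number of superkeys:", len(superkeys()))
    print("candidate keys:", candidate_keys())
```

In this sample, name and office each happen to be unique, but two instructors could easily share a name or an office later. That is the usual reason a designer would pick ID as the primary key.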