CS 100 - Lecture 22, 4/1/04

Lecture 22
Databases
Numeric & Symbolic Computing (S&G, §§11.3–11.4)
Read S&G ch. 12 (Computer Networks) for next week

Data Organization
• A database is a collection of related files
  – analogy: all the file cabinets in a business
• A file is a collection of related records
  – analogy: all the folders in one drawer (holding, say, the personnel records)
• A record is composed of fields
  – analogy: the folder for a particular employee, containing, for example, their name, employment history, pay rate, insurance information, evaluations

Example File “Employee”

ID   Name               Age  PayRate  Hours  Pay
86   Janet Kay          51   16.50    94     1560.40
123  Francine Perreira  18   8.50     185    1572.50
149  Fred Takasano      43   12.35    250    3087.50
71   John Kay           53   17.80    245    4361.00
165  Butch Honou        17   6.70     53     355.10

Labels on the slide: Record (one row), Field (one entry within a row)

How is this different from a spreadsheet?
• Databases are typically oriented toward very large amounts of data
  – think of IRS databases, Wal-Mart employee & inventory databases
• Therefore efficiency is critical:
  – efficiency of data storage
  – efficiency of retrieval
• The data in a database is usually static
  – updated manually, not automatically

Relational Database Model
• A file is viewed as a table
• Each table contains information about a number of instances of some entity
  – an entity is a fundamental distinguishable object, such as “employee”
• Each instance of the entity is represented by a tuple
  – e.g., the data for a particular employee
• Each tuple has a number of attributes
  – which characterize the instance (e.g., a particular employee’s attributes)
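The relational view described above (a table of tuples, each with named attributes) can be sketched in Python. This is only an illustration under stated assumptions: the names `Employee`, `employee_table`, and `select` are hypothetical and do not come from the lecture or from any database system; the data values are copied from the example “Employee” file.

```python
from typing import Callable, List, NamedTuple


class Employee(NamedTuple):
    """One tuple of the "Employee" table; each field is an attribute."""
    id: int
    name: str
    age: int
    pay_rate: float
    hours: int
    pay: float


# The table (file) is a collection of tuples (records).
# Values are taken as given from the example file in the lecture.
employee_table: List[Employee] = [
    Employee(86, "Janet Kay", 51, 16.50, 94, 1560.40),
    Employee(123, "Francine Perreira", 18, 8.50, 185, 1572.50),
    Employee(149, "Fred Takasano", 43, 12.35, 250, 3087.50),
    Employee(71, "John Kay", 53, 17.80, 245, 4361.00),
    Employee(165, "Butch Honou", 17, 6.70, 53, 355.10),
]


def select(table: List[Employee],
           predicate: Callable[[Employee], bool]) -> List[Employee]:
    """Hypothetical helper: keep the tuples for which the predicate holds."""
    return [row for row in table if predicate(row)]


# Every tuple has the same attributes, so a condition on one attribute
# can be applied uniformly to the whole table.
minors = select(employee_table, lambda e: e.age < 21)
print([e.name for e in minors])  # ['Francine Perreira', 'Butch Honou']
```

A real database system stores and indexes such tables far more efficiently than a Python list; the sketch is meant only to show the table / tuple / attribute vocabulary.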
A Table for the “Employee” Entity

ID   Name               Age  PayRate  Hours  Pay
86   Janet Kay          51   16.50    94     1560.40
123  Francine Perreira  18   8.50     185    1572.50
149  Fred Takasano      43   12.35    250    3087.50
71   John Kay           53   17.80    245    4361.00
165  Butch Honou        17   6.70     53     355.10

Labels on the slide:
• Primary key: attribute(s) that uniquely identify a tuple (here, ID)
• Tuple: one row of the table
• Attribute: one column of the table

Query Languages
• A query language allows users to:
  – retrieve information from a database
  – relate information in different files in a database
  – update information in a database
  – perform statistical and other data processing operations on selected information
• SQL (Structured Query Language)
  – a standard query language
  – a textual language
  – sometimes used behind a graphical “front end”

Example Query
>SELECT ID, NAME, AGE, PAYRATE, HOURS, PAY
>FROM EMPLOYEE
>WHERE ID = 123;
123  Francine Perreira  18  $8.50  185  $1572.50
>

Example Query (2)
>SELECT ID, NAME, AGE, PAYRATE, HOURS, PAY
>FROM EMPLOYEE
>WHERE NAME = 'John Kay';
71  John Kay  53  $17.80  245  $4361.00
>

Example Query (3)
>SELECT NAME, PAY
>FROM EMPLOYEE
>WHERE NAME = 'John Kay';
John Kay  $4361.00
>

Example Query (4)
>SELECT *
>FROM EMPLOYEE
>ORDER BY PAYRATE;
ID   Name               Age  PayRate  Hours  Pay
165  Butch Honou        17   $6.70    53     $355.10
123  Francine Perreira  18   $8.50    185    $1572.50
149  Fred Takasano      43   $12.35   250    $3087.50
86   Janet Kay          51   $16.50   94     $1560.40
71   John Kay           53   $17.80   245    $4361.00

Example Query (5)
>SELECT *
>FROM EMPLOYEE
>WHERE AGE > 21;
ID   Name           Age  PayRate  Hours  Pay
86   Janet Kay      51   $16.50   94     $1560.40
149  Fred Takasano  43   $12.35   250    $3087.50
71   John Kay       53   $17.80   245    $4361.00

Modifying Databases
• DELETE FROM EMPLOYEE WHERE AGE < 21;
• UPDATE EMPLOYEE SET PAYRATE = 8.75 WHERE ID = 123;
• INSERT INTO EMPLOYEE (ID, NAME, PAYRATE, HOURS, PAY) VALUES (456, 'Sandy Beech', 13.25, 0, 0);

Another Table

InsuredID  PlanType  DateIssued
86         A4        02/23/78
123        B2        12/03/91
149        A1        06/11/85
71         A4        10/01/72
149        B2        04/23/90

Labels on the slide: Foreign Key, Primary Key
• The “InsuredID” attribute is a foreign key because its values are primary-key values of a different table (EMPLOYEE)
• Foreign keys establish relationships between tables
• E.g., between the employee (with all his/her attributes) and the insurance plan (with all its attributes)

Example Query of Joined Tables
>SELECT EMPLOYEE.NAME, INSURANCE.PLANTYPE
>FROM EMPLOYEE, INSURANCE
>WHERE EMPLOYEE.NAME = 'Fred Takasano' AND EMPLOYEE.ID = INSURANCE.INSUREDID;
NAME           PLANTYPE
Fred Takasano  A1
Fred Takasano  B2
>

Computer Science Issues
• SQL is a very high-level language
  – nonprocedural
  – problem-specific
• Performance is a major issue
• Consistency issues with simultaneous updates
• Distributed databases (files stored in many locations)
  – access time & consistency problems

Numeric and Symbolic Computing

Numeric Computation
• Applications that make heavy use of real arithmetic
• Especially used in science, engineering, economics, statistics, animation
• The motivation for the first computers
• Still drives the development of supercomputers and parallel computers
  – a teraflop machine performs at least 10^12 (a trillion) floating-point operations per second
  – 36 Tflops already achieved (Japan’s Earth Simulator, which cost $350–500M)

Computer Science Issues
• Performance:
  – better algorithms
  – accessing of data in memory hierarchies
  – parallel computation
  – data communication in networks
• Mathematical software libraries
• Accuracy and stability of numerical approximations

Symbolic Computing
• Manipulate mathematical formulas, equations, etc. much the way a mathematician would
  – automate processes that are mechanical, tedious, and error-prone
• Examples: Macsyma, Mathematica, Maple, MatLab

Example: Simplification
(x − 1)^2 + (x + 2) + (2x − 3)^2 + x
• Simplify[(x-1)^2 + (x+2) + (2x-3)^2 + x]
• 12 - 12x + 5x^2

Example: Expansion
(1 + x + 3y)^4
• Expand[(1 + x + 3y)^4]
• 1 + 4x + 6x^2 + 4x^3 + x^4 + 12y + 36xy + 36x^2y + 12x^3y + 54y^2 + 108xy^2 + 54x^2y^2 + 108y^3 + 108xy^3 + 81y^4

Typical Expansion Rules
Expand[X × (Y + Z)] ⇒ X × Y + X × Z
Expand[(X + Y) × Z] ⇒ X × Z + Y × Z
Expand[X^2] ⇒ X × X
Hence,
Expand[(n + 1)^2]
⇒ Expand[(n + 1)(n + 1)]
⇒ Expand[(n + 1)n + (n + 1)1]
⇒ Expand[n×n + 1×n + n×1 + 1×1]

Example: Solving Equations
2x + y = 11
6x − 2y = 8
• Solve[{2x + y == 11, 6x - 2y == 8}, {x, y}]
• {{x -> 3, y -> 5}}

Digression
• Recall our discussion of formalized mathematics, and the idea of reducing mathematics to the mechanical application of formal rules
• Formal rules: depend on the form of expressions, not their meaning
• Symbolic computation is an application of the idea of a calculus

Computer Science Issues
• Symbolic computation systems are:
  – very high-level languages
  – problem-specific
  – nonprocedural
• Depend on many algorithms, e.g.:
  – pattern matching
  – efficient management of complex data structures representing formulas
• Results should be presented in a form familiar and useful to the mathematically literate
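To make the pattern-matching point concrete, here is a minimal Python sketch of the three expansion rules from the “Typical Expansion Rules” slide, applied purely to the form of an expression. It is an illustration only, not how Mathematica or any other real system implements `Expand`. The tuple representation and the function names `expand` and `show` are assumptions made for this example.

```python
# Assumed representation: an expression is either an atom (a variable name
# or an integer) or a tuple (operator, left, right) with operator in
# "+", "*", "^".


def expand(expr):
    """Rewrite expr with the three expansion rules until none applies."""
    if not isinstance(expr, tuple):
        return expr  # an atom is already fully expanded
    op, left, right = expr
    if op == "^" and right == 2:
        # Expand[X^2] => X * X
        return expand(("*", left, left))
    left, right = expand(left), expand(right)
    if op == "*" and isinstance(right, tuple) and right[0] == "+":
        # Expand[X * (Y + Z)] => X*Y + X*Z
        return ("+", expand(("*", left, right[1])),
                     expand(("*", left, right[2])))
    if op == "*" and isinstance(left, tuple) and left[0] == "+":
        # Expand[(X + Y) * Z] => X*Z + Y*Z
        return ("+", expand(("*", left[1], right)),
                     expand(("*", left[2], right)))
    return (op, left, right)


def show(expr):
    """Print without parentheses; intended for fully expanded results only."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    symbol = {"+": " + ", "*": "×", "^": "^"}[op]
    return show(left) + symbol + show(right)


# (n + 1)^2, the example worked on the slide
result = expand(("^", ("+", "n", 1), 2))
print(show(result))  # n×n + 1×n + n×1 + 1×1
```

The rules never look at what `n` means; they only match the shape of the tuple, which is the sense in which formal rules depend on form rather than meaning. A real system would go on to collect like terms (giving n^2 + 2n + 1), which this sketch does not attempt.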