CS 100 - Lecture 22
4/1/04

Databases
Numeric & Symbolic Computing
(S&G, §§11.3–11.4)
Read S&G ch. 12 (Computer Networks) for next week
Data Organization
• A database is a collection of related files
– analogy: all the file cabinets in a business
• A file is a collection of related records
– analogy: all the folders in one drawer (holding,
say, the personnel records)
• A record is composed of fields
– analogy: the folder for a particular employee,
containing, for example, their name,
employment history, pay rate, insurance
information, evaluations
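The database / file / record / field hierarchy above can be sketched as nested data structures. This is only an illustration: the class names are hypothetical, and the sample values are taken from the "Employee" example file in this lecture.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One folder: the named fields for a single employee."""
    fields: dict


@dataclass
class File:
    """One drawer: a collection of related records."""
    name: str
    records: list = field(default_factory=list)


@dataclass
class Database:
    """All the cabinets: a collection of related files, looked up by name."""
    files: dict = field(default_factory=dict)


# Build a one-record "Employee" file and place it in a database.
personnel = File("Employee")
personnel.records.append(Record({"ID": 86, "Name": "Janet Kay", "Age": 51}))
db = Database({"Employee": personnel})

# Navigate database -> file -> record -> field.
first_name = db.files["Employee"].records[0].fields["Name"]
print(first_name)
```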
Example File “Employee”
ID   Name               Age  PayRate  Hours  Pay
86   Janet Kay          51   16.50    94     1560.40
123  Francine Perreira  18   8.50     185    1572.50
149  Fred Takasano      43   12.35    250    3087.50
71   John Kay           53   17.80    245    4361.00
165  Butch Honou        17   6.70     53     355.10
Each column is a field; each row is a record.
How is this different from a spreadsheet?
• Databases are typically oriented toward
very large amounts of data
– think of IRS databases, Wal-Mart employee &
inventory databases
• Therefore efficiency is critical:
– efficiency of data storage
– efficiency of retrieval
• The data in a database is usually static
– updated manually, not automatically
Relational Database Model
• A file is viewed as a table
• Each table contains information about a number of
instances of some entity
– an entity is a fundamental distinguishable object, such as
“employee”
• Each instance of the entity is represented by a tuple
– e.g., the data for a particular employee
• Each tuple has a number of attributes
– which characterize the instance (e.g., a particular
employee’s attributes)
• Primary key: attribute(s) that uniquely identify a tuple
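A minimal sketch of what "uniquely identify a tuple" means in practice, using Python's built-in sqlite3 module rather than whatever system the lecture had in mind. The table is a cut-down version of the employee table, and the second row ('Someone Else') is a made-up value used only to trigger the key violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,   -- the primary key attribute
        name TEXT,
        age  INTEGER
    )
""")
conn.execute("INSERT INTO employee VALUES (86, 'Janet Kay', 51)")

# A second tuple with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (86, 'Someone Else', 30)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, row_count)
```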
A Table for the “Employee” Entity
ID   Name               Age  PayRate  Hours  Pay
86   Janet Kay          51   16.50    94     1560.40
123  Francine Perreira  18   8.50     185    1572.50
149  Fred Takasano      43   12.35    250    3087.50
71   John Kay           53   17.80    245    4361.00
165  Butch Honou        17   6.70     53     355.10
The ID column is the primary key; each row is a tuple; each column is an attribute.
Query Languages
• A query language allows users to:
– retrieve information from a database
– relate information in different files in a database
– update information in a database
– perform statistical and other data processing operations
on selected information
• SQL (Structured Query Language)
– a standard query language
– a textual language
– sometimes used behind a graphical “front end”
Example Query
>SELECT ID, NAME, AGE, PAYRATE,
HOURS, PAY
>FROM EMPLOYEE
>WHERE ID = 123;
123 Francine Perreira 18 $8.50 185 $1572.50
>
Example Query (2)
>SELECT ID, NAME, AGE, PAYRATE,
HOURS, PAY
>FROM EMPLOYEE
>WHERE NAME = 'John Kay';
71 John Kay 53 $17.80 245 $4361.00
>
Example Query (3)
>SELECT NAME, PAY
>FROM EMPLOYEE
>WHERE NAME = 'John Kay';
John Kay $4361.00
>
Example Query (4)
>SELECT *
>FROM EMPLOYEE
>ORDER BY PAYRATE;
ID   Name               Age  PayRate  Hours  Pay
165  Butch Honou        17   $6.70    53     $355.10
123  Francine Perreira  18   $8.50    185    $1572.50
149  Fred Takasano      43   $12.35   250    $3087.50
86   Janet Kay          51   $16.50   94     $1560.40
71   John Kay           53   $17.80   245    $4361.00
Example Query (5)
>SELECT *
>FROM EMPLOYEE
>WHERE AGE > 21;
ID   Name           Age  PayRate  Hours  Pay
86   Janet Kay      51   $16.50   94     $1560.40
149  Fred Takasano  43   $12.35   250    $3087.50
71   John Kay       53   $17.80   245    $4361.00
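The example queries can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types are guesses, the lecture's query system may behave differently (for instance in how it formats money or orders unsorted results), and the data is copied from the "Employee" table as given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, age INTEGER,
        payrate REAL, hours INTEGER, pay REAL
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        (86, "Janet Kay", 51, 16.50, 94, 1560.40),
        (123, "Francine Perreira", 18, 8.50, 185, 1572.50),
        (149, "Fred Takasano", 43, 12.35, 250, 3087.50),
        (71, "John Kay", 53, 17.80, 245, 4361.00),
        (165, "Butch Honou", 17, 6.70, 53, 355.10),
    ],
)

# Example Query: select one tuple by its primary key.
by_id = conn.execute(
    "SELECT id, name, age, payrate, hours, pay FROM employee WHERE id = ?",
    (123,),
).fetchall()

# Example Query (3): select only some attributes.
name_pay = conn.execute(
    "SELECT name, pay FROM employee WHERE name = ?", ("John Kay",)
).fetchall()

# Example Query (4): all tuples, ordered by pay rate.
by_rate = [row[0] for row in conn.execute(
    "SELECT * FROM employee ORDER BY payrate")]

# Example Query (5): filter on a condition. Without ORDER BY the row
# order is not guaranteed, so the IDs are sorted here for display.
over_21 = sorted(row[0] for row in conn.execute(
    "SELECT * FROM employee WHERE age > 21"))

print(by_id, name_pay, by_rate, over_21)
```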
Modifying Databases
• DELETE FROM EMPLOYEE
  WHERE AGE < 21;
• UPDATE EMPLOYEE
  SET PAYRATE = 8.75
  WHERE ID = 123;
• INSERT INTO EMPLOYEE
  VALUES (456, 'Sandy Beech',
  13.25, 0, 0);
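A sketch of the three modifications using Python's built-in sqlite3 module. Two assumptions are made here and are not from the slide: the statements are run in a different order (update first, so that employee 123 still exists when the pay rate changes), and the five INSERT values are assumed to be ID, name, pay rate, hours, and pay, leaving the age unset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, age INTEGER,
        payrate REAL, hours INTEGER, pay REAL
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        (86, "Janet Kay", 51, 16.50, 94, 1560.40),
        (123, "Francine Perreira", 18, 8.50, 185, 1572.50),
        (149, "Fred Takasano", 43, 12.35, 250, 3087.50),
        (71, "John Kay", 53, 17.80, 245, 4361.00),
        (165, "Butch Honou", 17, 6.70, 53, 355.10),
    ],
)

# UPDATE: change one attribute of the tuple with ID 123.
conn.execute("UPDATE employee SET payrate = 8.75 WHERE id = 123")
new_rate = conn.execute(
    "SELECT payrate FROM employee WHERE id = 123").fetchone()[0]

# DELETE: remove every tuple that satisfies the condition.
conn.execute("DELETE FROM employee WHERE age < 21")
remaining_ids = sorted(
    row[0] for row in conn.execute("SELECT id FROM employee"))

# INSERT: add a new tuple. Columns are named explicitly because only
# five values are supplied (assumed mapping; age is left as NULL).
conn.execute(
    "INSERT INTO employee (id, name, payrate, hours, pay) "
    "VALUES (456, 'Sandy Beech', 13.25, 0, 0)"
)
sandy = conn.execute("SELECT * FROM employee WHERE id = 456").fetchone()

print(new_rate, remaining_ids, sandy)
```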
Another Table
InsuredID  PlanType  DateIssued
86         A4        02/23/78
123        B2        12/03/91
149        A1        06/11/85
71         A4        10/01/72
149        B2        04/23/90
[Slide callout: Primary Key]
Foreign Key
• The “InsuredID” attribute is a foreign key
because it is a primary key into a different
table (EMPLOYEE)
• Foreign keys establish relationships
between tables
• E.g., between the employee (with all his/her
attributes) and the insurance plan (with all
its attributes)
Example Query of Joined Tables
>SELECT EMPLOYEE.NAME,
INSURANCE.PLANTYPE
>FROM EMPLOYEE, INSURANCE
>WHERE EMPLOYEE.NAME = 'Fred Takasano'
AND EMPLOYEE.ID = INSURANCE.INSUREDID;
NAME           PLANTYPE
Fred Takasano  A1
Fred Takasano  B2
>
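The joined query can be reproduced with Python's built-in sqlite3 module. This is a sketch, not the lecture's own setup: the composite primary key on (INSUREDID, PLANTYPE) is an assumption made because InsuredID 149 appears twice, and only the employee attributes needed for the join are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE insurance (
        insuredid  INTEGER REFERENCES employee(id),  -- foreign key
        plantype   TEXT,
        dateissued TEXT,
        PRIMARY KEY (insuredid, plantype)            -- assumed composite key
    )
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    (86, "Janet Kay"), (123, "Francine Perreira"), (149, "Fred Takasano"),
    (71, "John Kay"), (165, "Butch Honou"),
])
conn.executemany("INSERT INTO insurance VALUES (?, ?, ?)", [
    (86, "A4", "02/23/78"), (123, "B2", "12/03/91"), (149, "A1", "06/11/85"),
    (71, "A4", "10/01/72"), (149, "B2", "04/23/90"),
])

# Match each employee tuple with the insurance tuples whose foreign key
# equals the employee's primary key, then keep only Fred Takasano.
query = """
    SELECT EMPLOYEE.NAME, INSURANCE.PLANTYPE
    FROM EMPLOYEE, INSURANCE
    WHERE EMPLOYEE.NAME = ?
      AND EMPLOYEE.ID = INSURANCE.INSUREDID
"""
# Sorted in Python because the query itself does not fix a row order.
plans = sorted(conn.execute(query, ("Fred Takasano",)).fetchall())
print(plans)
```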
Computer Science Issues
• SQL is a very high-level language
– nonprocedural
– problem-specific
• Performance is a major issue
• Consistency issues with simultaneous
updates
• Distributed databases (files stored in many
locations)
– access time & consistency problems
Numeric and Symbolic Computing
Numeric Computation
• Applications that make heavy use of real
arithmetic
• Especially used in science, engineering,
economics, statistics, animation
• The motivation for the first computers
• Still drives the development of supercomputers
and parallel computers
– a teraflop machine performs at least 10^12 (a trillion)
floating-point operations per second
– 36 Tflops already achieved (Japan’s Earth Simulator,
which cost $350–500M)
Computer Science Issues
• Performance:
– better algorithms
– accessing of data in memory hierarchies
– parallel computation
– data communication in networks
• Mathematical software libraries
• Accuracy and stability of numerical
approximations
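A small illustration of why accuracy is an issue in floating-point arithmetic, using only Python's standard library. The specific numbers are chosen for illustration and are not from the lecture: decimal fractions such as 0.1 have no exact binary representation, so rounding errors appear and can accumulate.

```python
import math

# 0.1, 0.2 and 0.3 are each rounded when stored, so the sum is slightly off.
naive_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

# Rounding error accumulates over repeated additions.
values = [0.1] * 10
naive_sum = sum(values)         # plain left-to-right addition
stable_sum = math.fsum(values)  # tracks the lost low-order bits

print(naive_equal, close_enough, naive_sum, stable_sum)
```

Comparing with a tolerance and using a compensated summation routine are two common ways numerical software works around these errors.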
Symbolic Computing
• Manipulate mathematical formulas,
equations, etc. much the way a
mathematician would
– automate processes that are mechanical,
tedious, and error-prone
• Examples: Macsyma, Mathematica, Maple,
MATLAB
Example: Simplification
(x - 1)^2 + (x + 2) + (2x - 3)^2 + x
• Simplify[(x-1)^2 + (x+2) +
(2x-3)^2 + x]
• 12 - 12x + 5x^2
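The simplified result can be checked mechanically. The sketch below is not how Mathematica works internally; it is a minimal illustration that represents each polynomial in x as a list of coefficients (lowest degree first) and uses two hypothetical helpers, poly_add and poly_mul.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists, lowest degree first."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Multiply two polynomials by combining every pair of terms."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


x_minus_1 = [-1, 1]      # x - 1
x_plus_2 = [2, 1]        # x + 2
two_x_minus_3 = [-3, 2]  # 2x - 3
x = [0, 1]               # x

total = poly_mul(x_minus_1, x_minus_1)                           # (x - 1)^2
total = poly_add(total, x_plus_2)                                # + (x + 2)
total = poly_add(total, poly_mul(two_x_minus_3, two_x_minus_3))  # + (2x - 3)^2
total = poly_add(total, x)                                       # + x

print(total)  # coefficients of 1, x, x^2
```

The resulting list [12, -12, 5] corresponds to 12 - 12x + 5x^2.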
Example: Expansion
(1 + x + 3y)^4
• Expand[(1 + x + 3y)^4]
• 1 + 4x + 6x^2 + 4x^3 + x^4 + 12y
+ 36xy + 36x^2y + 12x^3y + 54y^2
+ 108xy^2 + 54x^2y^2 + 108y^3
+ 108xy^3 + 81y^4
Example: Solving Equations
2x + y = 11
6x − 2y = 8
• Solve[{2x + y == 11,
         6x - 2y == 8},
        {x, y}]
• {{x -> 3, y -> 5}}
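One mechanical way to solve a pair of linear equations like this is Cramer's rule. The sketch below is an illustration only (the function name solve_2x2 is hypothetical, and a symbolic system may use a different method). It uses exact fractions so that no rounding occurs.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# 2x + y = 11 and 6x - 2y = 8
x, y = solve_2x2(2, 1, 11, 6, -2, 8)
print(x, y)
```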
Typical Expansion Rules
Expand[X × (Y + Z)] ⇒ X × Y + X × Z
Expand[(X + Y) × Z] ⇒ X × Z + Y × Z
Expand[X^2] ⇒ X × X
Hence,
Expand[(n + 1)^2]
⇒ Expand[(n + 1)(n + 1)]
⇒ Expand[(n + 1)n + (n + 1)1]
⇒ Expand[n×n + 1×n + n×1 + 1×1]
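The three expansion rules can be applied mechanically by matching on the form of an expression. The sketch below is a toy rewriter, not the algorithm of any real symbolic system: expressions are nested tuples, and the tags "+", "*", and "sq" (for squaring) are a representation chosen for this example.

```python
def expand(e):
    """Apply the three expansion rules until no product contains a sum."""
    if not isinstance(e, tuple):
        return e                      # a variable or a number
    op = e[0]
    if op == "sq":                    # Expand[X^2] => X * X
        return expand(("*", e[1], e[1]))
    a, b = expand(e[1]), expand(e[2])
    if op == "+":
        return ("+", a, b)
    # op == "*"
    if isinstance(b, tuple) and b[0] == "+":   # X*(Y+Z) => X*Y + X*Z
        return ("+", expand(("*", a, b[1])), expand(("*", a, b[2])))
    if isinstance(a, tuple) and a[0] == "+":   # (X+Y)*Z => X*Z + Y*Z
        return ("+", expand(("*", a[1], b)), expand(("*", a[2], b)))
    return ("*", a, b)


def show(e):
    """Render an expression tree as text, parenthesizing sums in products."""
    if not isinstance(e, tuple):
        return str(e)
    if e[0] == "sq":
        return "(" + show(e[1]) + ")^2"
    if e[0] == "+":
        return show(e[1]) + " + " + show(e[2])
    parts = []
    for sub in e[1:]:
        text = show(sub)
        if isinstance(sub, tuple) and sub[0] == "+":
            text = "(" + text + ")"
        parts.append(text)
    return "*".join(parts)


result = show(expand(("sq", ("+", "n", 1))))   # (n + 1)^2
print(result)
```

The rewriter never looks at what n stands for; it only matches the shape of each expression, which is what makes the rules formal.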
Digression
• Recall our discussion of formalized
mathematics, and the idea of reducing
mathematics to the mechanical application
of formal rules
• Formal rules: depend on the form of
expressions, not their meaning
• Symbolic computation is an application of
the idea of a calculus
Computer Science Issues
• Symbolic computation systems are:
– very high-level languages
– problem-specific
– nonprocedural
• Depend on many algorithms, e.g.:
– pattern matching
– efficient management of complex data structures
representing formulas
• Results should be presented in a form familiar and
useful to the mathematically literate