Database Design

Dr. M.E. Fayad, Professor
Computer Engineering Department, Room #283I
College of Engineering
San José State University
One Washington Square
San José, CA 95192-0180
http://www.engr.sjsu.edu/~fayad, m.fayad@sjsu.edu

2003 SJSU -- CmpE

Lesson 1: Infinite Relational Database

Lesson Objectives
• Understand infinite relational databases
• Explore the view level
• Understand the logical view
• Abstract data types

Infinite Relational Databases
Data abstraction allows people to forget unimportant details.
• View level: a way of presenting data to a group of users
• Logical level: how the data is understood to be structured when writing queries

The View Level
• The highest level of data abstraction is the view level.
• A view is a way of presenting data to a particular group of users.
• Data presentation may depend on users' preferences.
• Each view has to be functional for its users. This means that when designing a view we must keep in mind the functions to be performed on the data.

The View Level
• View-level presentation of the data: science, art, or both? (discussion)
• We will illustrate with examples from different computer fields, such as computer graphics, for view-level presentation of complex data, especially spatiotemporal data, such as the realistic display of images and movies.

The View Level
Examples:
• Charts
• Graphs
• Drawings
• Maps
• Video or animation
Examples? What is a view? What is a model? What are the differences between a model and a view?
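As a rough illustration of the idea that a view is a presentation of data tailored to a group of users, the sketch below renders the same small data set in two ways: as a table and as a text bar chart. The data and the function names `table_view` and `chart_view` are hypothetical and are not taken from the lecture.

```python
# Hypothetical data: (month, value) readings, invented for illustration.
readings = [("Jan", 3), ("Feb", 5), ("Mar", 2)]

def table_view(rows):
    """Present the data as aligned text rows, e.g. for an analyst."""
    return [f"{month:<5}{value:>3}" for month, value in rows]

def chart_view(rows):
    """Present the same data as a text bar chart, e.g. for a casual reader."""
    return [f"{month:<5}{'#' * value}" for month, value in rows]

# Both views are computed from the same underlying data.
for line in table_view(readings):
    print(line)
for line in chart_view(readings):
    print(line)
```

Neither function changes `readings`; each one only decides how the data is shown, which is the sense in which a view depends on its users and on the functions they need to perform.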

The Logical Level
Example: Infinite relational data model
• Relation: a table. Each table has a name and defines a relation.
• Relation scheme: the top row of a table, that is, its list of attributes. Each entry in the top row is called an attribute name, and the ordered set of attributes of a table is called a relation scheme.
• Arity or dimension: the number of attributes of a relation. We will use arity and dimension interchangeably, with a preference for dimension in the case of spatiotemporal relations.

The Logical Level
Example: Infinite relational data model
• Database schema: the set of relation names and schemes.
• Tuple or point: each row below the scheme. We will use these two terms interchangeably, with a preference for point in the case of spatiotemporal relations.
• Instance: the set of tuples in a table. Each row describes an instance of the scheme. Remember that relation schemes are usually fixed, while relation instances may change over time due to database updates.

Example (1)

SSN          Surname  First Name(s)   Telephone Number
123-45-6789  Doe      Jane Q.         512-555-1234
987-65-4321  Fulano   Juan            210-543-9876
567-89-0123  Roe      Richard Rodney  512-987-6431

SSN          Wages    Interest  Capital Gain
123-45-6789  100,000  3,400     0
987-65-4321  83,640   2,821     3,400
567-89-0123  46,000   501       1,200

Example (2)
• Name the relations!
• What is the arity of each relation?
• What is the relation scheme of each relation?
• What is the database scheme?
• How many tuples are in each relation?
• How many instances of each of these relations are there?

Relation schemes & Instances (1)
True or false:
• Relation schemes are usually fixed. (T)
• Relation instances change with updates. (T)
Example schemes:
• Taxrecord(SSN, Wages, Interest, Capital_gain)
• Taxtable(Income, Tax)
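The terms relation, scheme, arity, tuple, and instance can be made concrete with a minimal Python sketch built on the two tables of Example (1). The `Relation` class, the `insert` helper, the relation name `Person`, its attribute names, and the inserted row are assumptions made for illustration; only the Taxrecord scheme and the table values come from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str
    scheme: tuple      # ordered attribute names (the relation scheme)
    tuples: frozenset  # the current instance: a set of rows

    @property
    def arity(self):
        """Number of attributes of the relation."""
        return len(self.scheme)

# "Person" is a hypothetical name; the slides leave the first table unnamed.
person = Relation(
    "Person",
    ("SSN", "Surname", "First_names", "Telephone"),
    frozenset({
        ("123-45-6789", "Doe", "Jane Q.", "512-555-1234"),
        ("987-65-4321", "Fulano", "Juan", "210-543-9876"),
        ("567-89-0123", "Roe", "Richard Rodney", "512-987-6431"),
    }),
)

taxrecord = Relation(
    "Taxrecord",
    ("SSN", "Wages", "Interest", "Capital_gain"),
    frozenset({
        ("123-45-6789", 100000, 3400, 0),
        ("987-65-4321", 83640, 2821, 3400),
        ("567-89-0123", 46000, 501, 1200),
    }),
)

# The database scheme: relation names together with their schemes.
database_scheme = {r.name: r.scheme for r in (person, taxrecord)}

def insert(relation, row):
    """Return a new instance with one more tuple; the scheme stays fixed."""
    if len(row) != relation.arity:
        raise ValueError("row does not match the relation scheme")
    return Relation(relation.name, relation.scheme, relation.tuples | {row})

# Hypothetical update: the instance changes, the scheme does not.
updated = insert(taxrecord, ("111-22-3333", 52000, 0, 0))
print(taxrecord.arity, len(taxrecord.tuples), len(updated.tuples))
```

The update produces a different instance with the same scheme, which mirrors the point that schemes are usually fixed while instances change over time.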

Relation schemes & Instances (2)
Example: Streets(Name, X, Y)
Streets contains pairs of street names and (x, y) points such that the point belongs to the street. There are an infinite number of (x, y) locations associated with each street.
Example: Crops(Corn, Rye, Sunflower, Wheat)
Crops contains all possible combinations of four crops that a farmer could plant. There are an infinite number of tuples in any instance of this relation.

Infinite Relational Data Model
Other examples:
• Temporal data
• Spatial data
• Operations research

Temporal & Spatial Data
In many application areas of machine learning and data mining, researchers face challenges posed by temporal and spatial data.
What are the differences between temporal and spatial data?

Temporal Data Type (1)
• A user-defined temporal data type is a time representation specially designed to meet the specific needs of the user. For example, the designers of a database used for class scheduling in a school might base the time representation on a "Year:Term:Day:Period" format.
• Terms belonging to a user-defined temporal data type get the same query language support as terms belonging to built-in temporal data types such as the DATE data type.

Temporal Databases
A temporal database is a database that supports some aspect of time, not counting user-defined time.

Spatiotemporal
The modifier "spatiotemporal" indicates that the modified concept concerns simultaneous support of some aspect of time and some aspect of space, in one or more dimensions.

Abstract Data Types (1)
• Domain: the range of values for an attribute
  - for example strings, integers, or real numbers
• Scalar domain: always a single value
  - for example a string, an integer, or a real number
• Abstract data type domain: composed of scalar domains

Abstract Data Types (2)
Example: Vertices(Cities)
The domain of Cities is a set of strings.
Example: Streets(Name, Extent)
The domain of Extent is a set of (x, y) points.

Database Glossary (1)
• A database is a collection of related data.
• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
• A database system = database + DBMS

Database Glossary (2)
A database can be of any size and of varying complexity.
Example: the IRS database
• Assume there are 100 million taxpayers.
• Each taxpayer files an average of 5 forms.
• Each form is approximately 200 characters.
• Assume also that the IRS keeps the past three returns for each taxpayer in addition to the current one.
What is the size of the IRS's database?
One return per taxpayer takes 100 * 10^6 * 200 * 5 = 10^11 characters (bytes). Four returns per taxpayer take 4 * 10^11 bytes = 400 gigabytes.

Characteristics of the Database Approach
Self-describing nature of a database system
• A database system contains the database itself together with the definition or description of the database structure and constraints.
• The definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.
• The information stored in the catalog is called meta-data.

Characteristics of the Database Approach
Insulation between programs and data, and data abstraction
• In OO databases, users can define operations on data as part of the database definitions.
• An operation (also called a function) is specified in two parts: the interface (or signature) and the implementation.
• Data abstraction
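The split between an operation's interface and its implementation, together with the Streets(Name, Extent) example, can be sketched in Python. Here Extent is treated as an abstract data type whose interface is a single membership test, `contains(x, y)`. One hypothetical implementation, `Segment`, represents the infinitely many points of a straight street segment by its two endpoints. The class names, street names, coordinates, and tolerance are illustrative assumptions, not part of the lecture.

```python
from abc import ABC, abstractmethod

class Extent(ABC):
    """Interface (signature) only: a possibly infinite set of (x, y) points."""

    @abstractmethod
    def contains(self, x: float, y: float) -> bool:
        ...

class Segment(Extent):
    """One implementation: all points on the segment between two endpoints."""

    def __init__(self, x1, y1, x2, y2, tol=1e-9):
        self.x1, self.y1, self.x2, self.y2, self.tol = x1, y1, x2, y2, tol

    def contains(self, x, y):
        # The cross product is zero when (x, y) is collinear with the endpoints.
        cross = ((x - self.x1) * (self.y2 - self.y1)
                 - (y - self.y1) * (self.x2 - self.x1))
        if abs(cross) > self.tol:
            return False
        # A collinear point must also lie inside the endpoints' bounding box.
        in_x = min(self.x1, self.x2) - self.tol <= x <= max(self.x1, self.x2) + self.tol
        in_y = min(self.y1, self.y2) - self.tol <= y <= max(self.y1, self.y2) + self.tol
        return in_x and in_y

# Hypothetical instance of Streets(Name, Extent): two stored entries
# stand for infinitely many (name, x, y) tuples.
streets = {
    "Main St": Segment(0, 0, 10, 0),
    "Oak Ave": Segment(5, -5, 5, 5),
}

def streets_at(x, y):
    """Query written against the interface only: streets containing (x, y)."""
    return sorted(name for name, extent in streets.items() if extent.contains(x, y))

print(streets_at(5, 0))    # the point where the two segments cross
print(streets_at(2.5, 0))  # a point on Main St only
print(streets_at(1, 1))    # a point on neither street
```

Because `streets_at` relies only on `contains`, a different implementation of `Extent` (a polyline, say) could be substituted without changing the query, which is the insulation between programs and data that the slide describes.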

Characteristics of the Database Approach
Support of multiple views of the data
• Dealing with raw data
• Many users = different perspectives or views of the database
• Facilities for multiple views

Characteristics of the Database Approach
Sharing of data and multiuser transaction processing
• A multiuser DBMS must allow multiple users to access the database at the same time.
• Concurrency control ensures that several users trying to update the same data do so in a controlled manner, so that the result of the updates is correct.

Actors on the Scene
• Database administrators
• Database designers
• End users (casual end users, naïve or parametric end users, sophisticated end users, and stand-alone users)
• System analysts and application programmers or software engineers

Workers Behind the Scene
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel

Advantages of Using DBMS (1)
Controlling redundancy
Redundancy is storing the same data multiple times, which leads to several problems:
1. Duplication of effort
2. Waste of storage space
3. Inconsistent data

Advantages of Using DBMS (1)
Restricting unauthorized access
• A DBMS should provide security and authorization mechanisms that specify account restrictions.
• A DBMS should enforce these restrictions automatically.

Advantages of Using DBMS (1)
Providing persistent storage for program objects and data structures
• In OO database systems, an object is said to be persistent if it survives the termination of program execution and can later be retrieved by another program.
• Compatibility: OODBs offer data structures compatible with one or more OO programming languages.
• Traditional DB systems often suffer from the so-called impedance mismatch problem.

Advantages of Using DBMS (1)
Permitting inferencing and actions using rules
• Some database systems provide capabilities for defining deduction rules for inferencing new information from the stored database facts.
• Such systems are called deductive database systems.

Advantages of Using DBMS (2)
• Providing multiple user interfaces
• Representing complex relationships among data
• Enforcing integrity constraints
• Providing backup and recovery

Additional Advantages of Using DBMS (2)
• Potential for enforcing standards
• Reducing application development time
• Flexibility
• Availability of up-to-date information
• Economies of scale

Discussion Questions
T/F:
a. A view is a way of presenting data to a particular group of users.
b. Any relation can be presented by multiple views.
c. Arity = the number of columns in the relation.
d. An instance = any row of a relation.
e. A spatial database is a database that supports some aspect of time, not counting user-defined time.
f. Spatial data is in the form of two- or three-dimensional images.
g. Spatial data is any information about the location and shape of, and relationships among, geographic features. This includes remotely sensed data as well as map data.

Tasks for Next Lecture
Task 1: Data Modeling Using Entity-Relationship Model