Database System Architecture 2. DATABASE SYSTEM ARCHITECTURE 2.1 INTRODUCTION (The architecture of a database system is greatly influenced by the underlying computer system on which it runs, in particular by such aspects of computer architecture as networking, parallelism, and distribution :) 2.2 Schemas, Sub-schemas, and Instances Schema The overall design of the database is called the database schema. The logical structure of database Example: the database consists of information about a set of customers and accounts and the relationship between them Database system has several schemas Physical schema at the lowest level(physical level) Logical schema at the intermediate level(logical level) Sub-schema A database may also have several schemas at the view level, sometimes called subschema. Subschema at the highest level Instances The collection of information stored in the database as a particular moment is called an instance of the database. Database change over time as information is inserted and deleted. OR The actual content of the database at a particular point in time Analogous to the value of a variable 1 Database System Architecture 2.3 Three-level ANSI SPARC database Architecture External Level Users’ view of the database. The highest level of abstraction describes only part of the entire database. Describes that part of database that is relevant to a particular user. Mostly end users interact with this level most frequently. This level deals with mainly the end display of the database structure along with records stored here. Conceptual Level Community view of the database. The next higher level of abstraction. Describes what data is stored in database and relationships among the data. 2 Database System Architecture The logical level of abstraction is used by database administrator who must decide what information is to be kept in the database. It is a “level of indirection” between the internal and external level. Internal Level Physical representation of the database on the computer. The lowest level of abstraction. Describes how the data is stored in the database. Complex low-level data structures are described in detail. Differences between Three Levels of ANSI-SPARC Architecture Objectives of three-tire architecture All users should be able to access same data. A user’s view is immune to changes made in other views. Users should not need to know physical database storage details. DBA should be able to change database storage structures without affecting the users’ views. Internal structure of database should be unaffected by changes to physical aspects of storage. DBA should be able to change conceptual structure of database without affecting all users. 3 Database System Architecture 2.4 Data Independence Definition: The ability to modify a schema definition in one level without affecting a schema definition in the next higher level is called data independence. There are two level of data independence. Physical data independence Logical data independence Physical data independence Refers to immunity of conceptual schema to changes in the internal schema. Internal schema changes (e.g. using different file organizations, storage structures/devices). Should not require change to conceptual or external schemas. It is occasionally when improve performance. Logical data independence Refers to immunity of external schemas to changes in conceptual schema. Conceptual schema changes (e.g. addition/removal of entities). Should not require changes to external schema or rewrites of application programs. It is done whenever logical structure of the database is altered. 2.5 Mappings There are the two types of mappings Conceptual/Internal Mapping External/Conceptual Mapping Conceptual/Internal Mapping The conceptual/internal mapping defines the correspondence between the conceptual view and the store database; it specifies how conceptual record and fields are represented at the internal level. If structure of the store database is changed. 4 Database System Architecture If changed is made to the storage structure definition-then the conceptual/internal mapping must be changed accordingly, so that the conceptual schema can remain invariant. External/Conceptual Mapping The external/conceptual mapping defines the correspondence between a particular external view and conceptual view. The differences that can exist between these two levels are analogous to those that can exist between the conceptual view and the stored database. Example: fields can have different data types; fields and record name can be changed; several conceptual fields can be combined into a single external field. Any number of external views can exist at the same time; any number of users can share a given external view: different external views can overlap. 2.6 Structure Components, and Functions of DBMS Structure of DBMS DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of DBMS perform these requested operations on the database and provide necessary data to the users. The various components of DBMS are shown below: - Fig. 2.1 Structure of DBMS 5 Database System Architecture DDL Compiler - Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the name of the files, data items, storage details of each file, mapping information and constraints etc. DML Compiler and Query optimizer - The DML commands such as insert, update, delete, retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized in the best way to execute a query by the query optimizer and then send to the data manager. Data Manager - The Data Manager is the central software component of the DBMS also knows as Database Control System. The Main Functions Of Data Manager Are:– Convert operations in user's Queries coming from the application programs or combination of DML Compiler and Query optimizer which is known as Query Processor from user's logical view to physical file system. Controls DBMS information access that is stored on disk. It also controls handling buffers in main memory. It also enforces constraints to maintain consistency and integrity of the data. It also synchronizes the simultaneous operations performed by the concurrent users. It also controls the backup and recovery operations. Data Dictionary - Data Dictionary, which stores metadata about the database, in particular the schema of the database. Data - names of the tables, names of attributes of each table, length of attributes, and number of rows in each table. Detailed information on physical database design such as storage structure, access paths, files and record sizes. Usage statistics such as frequency of query and transactions. Data dictionary is used to actually control the data integrity, database operation and accuracy. It may be used as a important part of the DBMS Data Files – which store the database itself. Compiled DML - The DML complier converts the high level Queries into low level file access commands known as compiled DML. 6 Database System Architecture End Users : The second class of users then is end user, who interacts with system from online workstation or terminals. Use the interface provided as an integral part of the database system software. Components of DBMS Four components of database system.(explain to the CHAPTER -1) Function and services of DBMS Functions of DBMS DBMS free the programmers from the need to worry about the organization and location of the data i.e. it shields the users from complex hardware level details. DBMS can organize process and present data elements from the database. This capability enables decision makers to search and query database contents in order to extract answers that are not available in regular Reports. Programming is speeded up because programmer can concentrate on logic of the application. It includes special user friendly query languages which are easy to understand by non programming users of the system. The various common examples of DBMS are Oracle, Access, SQL Server, Sybase, FoxPro, Dbase etc. The service provided by the DBMS includes: Authorization services like log on to the DBMS start the database stop the Database etc. Transaction supports like Recovery, Rollback etc, Import and Export of Data. Maintaining data dictionary User's Monitoring 7 Database System Architecture 2.7 Data Models The structure of the database is called the data models: A Collection of conceptual tools for describing data, data relationship, data semantic and consistency constraint. There are three different groups. Object-based logical models The entity-relationship model The object-oriented model The semantic data model The functional data model Record-based logical models Relational model Network model Hierarchical model Physical models Unifying model Frame-memory model Record-based logical models Record based logical models are used in describing data at the logical and view levels. Record-base models are named as database structure have fixed format records of several types. Each record type define a fixed number of fields or attributes. Each attributes and each fields is a usually of a fixed length. There are three most widely used record-based models are: Relational model Network model Hierarchical model Relational model: The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has unique name. Network model: Data in the network model are represented by collections of record and relationships among data are represented by links, which can be viewed as pointers. 8 Database System Architecture The records in the database are organized as collection of arbitrary graphs Network model Advantages Network model. Conceptual Simplicity Ease of data access Data Integrity and capability to handle more relationship types Data independence Database standards Disadvantages Network model. System complexity Absence of structural independence Hierarchical model: In hierarchical model the data and relationships among the data are represented by records and links. It is same as network model but differs in terms of organization of records as collections of trees rather than graphs. 9 Database System Architecture Hierarchical model Advantages Hierarchical model Simplicity Data Security and Data Integrity Efficiency Disadvantages Hierarchical model Implementation Complexity Lack of structural independence Programming complexity Object-based logical models Object-based logical models are used in describing data at the logical and the view levels. They provide fairly flexible structuring capabilities and allow data constraints to be specified explicitly. There are many different models more widely know ones are; Entity-relationship model Object-oriented model Semantic model Functional data model Physical data model: This model is used to describe the data at the lowest level. Two widely know ones are the unifying model and the frame-memory model. 10 Database System Architecture Relational model The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name. It does not use pointers or links. This model relates record by values that they contain. Customer-ID C00001 C00002 Customer-name Smith Jones Customer-street 4 north St 3 main St Customer-city Rye Harrison (a) Customer table Account-number A-101 A-215 Balance 500 700 (b) Account table Customer-ID C00001 C00002 Account-number A-101 A-215 (c) Depositor table Entity-Relational model Models an enterprise as a collection of entities and relationships Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects E.g. each person is an entity, bank account is an entity. Attribute: Described by a set of attributes E.g. the attributes account-number and balance describes one particular account Relationship: an association among several entities E.g. Depositor relationship associates a customer with each account Entity set and Relationship: The set of all entities of the same type are called entity set and the set of relationship of the same type are called the relationship set. 11 Database System Architecture Represented diagrammatically by an entity-relationship diagram: (A sample E-R diagram) Rectangles, which represent entity sets Ellipses, which represent attributes Diamonds, which represent relationships among entity sets Lines, which link attributes to entity sets and entity sets to relationships Each component is labelled with the entity or the relationship that it represents. Object- Oriented Data Model Object corresponds to an entity in the E-R model Object has associated with it A set of variables that contain the data for the object; variables correspond to attributes in the E-R model. A set of messages to which the object responds; each message may have zero, one, or more parameters A set methods, each of which is a body of code to implement a message; a method returns a value as the response to the message. Class employee { /* Variables */ string name; string address; date start-date; int salary; 12 Database System Architecture /*Messages*/ int annual-salary(); string get-name(); string get-address(); int set-address(string new-address); int employment-length(); }; The methods for handling the message are usually defined separately from the class definition. The methods get-address() and setaddress() would be defined. string get-address(){ return address; } int set-address(string new-address){ address=new-address; } 2.8 Types of Database System Centralized Database Systems A Centralized Computer System Run on a single computer system and do not interact with other computer systems. 13 Database System Architecture General-purpose computer system: one to a few CPUs and a number of device controllers that are connected through a common bus that provide access to shared memory. Single-user system (e.g., personal computer or workstation): desk-top unit, single user, usually has only one CPU and one or two hard disks; the OS may support only one user. Multi-user system: more disks, more memory, multiple CPUs, and a multiuser OS. Serve a large number of users who are connected to the system vie terminals. Often called server systems. Parallel Database Systems Parallel database systems consist of multiple processors and multiple disks connected by a fast interconnection network. A coarse-grain parallel machine consists of a small number of powerful processors A massively parallel or fine grain parallel machine utilizes thousands of smaller processors. Two main performance measures: throughput --- the number of tasks that can be completed in a given time interval response time --- the amount of time it takes to complete a single task from the time it is submitted Speedup: a fixed-sized problem executing on a small system is given to a system which is N-times larger. Measured by: Speedup = small system elapsed time Large system elapsed time Speedup is linear if equation equals N. Speedup 14 Database System Architecture Scaleup: increase the size of both the problem and the system N-times larger system used to perform N-times larger job Measured by: Scaleup = small system small problem elapsed time Big system big problem elapsed time Scale up is linear if equation equals 1. Scaleup Batch scaleup: A single large job; typical of most decision support queries and scientific simulation. Use an N-times larger computer on N-times larger problem. Transaction scaleup: Numerous small queries submitted by independent users to a shared database; typical transaction processing and timesharing systems. N-times as many users submitting requests (hence, N-times as many requests) to an N-times larger database, on an N-times larger computer. Well-suited to parallel execution. 15 Database System Architecture Client-Server Database Systems Server systems satisfy requests generated at m client systems, whose general structure is shown below: Database functionality can be divided into: Back-end: manages access structures, query evaluation and optimization, concurrency control and recovery. Front-end: consists of tools such as forms, report-writers, and graphical user interface facilities. The interface between the front-end and the back-end is through SQL or through an application program interface. Advantages of replacing mainframes with networks of workstations or personal computers connected to back-end server machines: better functionality for the cost flexibility in locating resources and expanding facilities better user interfaces easier maintenance Server systems can be broadly categorized into two kinds: transaction servers which are widely used in relational database systems, and 16 Database System Architecture data servers, used in object-oriented database systems Distributed Database Systems Data spread over multiple machines (also referred to as sites or nodes). Network interconnects the machines Data shared by users on multiple machines Homogeneous distributed databases Same software/schema on all sites, data may be partitioned among sites Goal: provide a view of a single database, hiding details of distribution Heterogeneous distributed databases Different software/schema on different sites Goal: integrate existing databases to provide useful functionality Differentiate between local and global transactions A local transaction accesses data in the single site at which the transaction was initiated. A global transaction either accesses data in a site different from the one at which the transaction was initiated or accesses data in several different sites. Sharing data – users at one site able to access the data residing at some other sites. Autonomy – each site is able to retain a degree of control over data stored locally. 17 Database System Architecture Disadvantage: added complexity required to ensure proper coordination among sites. Software development cost. Greater potential for bugs. Increased processing overhead. Atomicity needed even for transactions that update data at multiple sites REVIEW QUESTION 1) 2) 3) 4) Explain Schemas, Sub-schemas, and Instances. Explain Three-level ANSI SPARC database Architecture. Explain External Level, Conceptual Level, And Internal Level. Explain Data Independence OR Difference between Physical data independence and Logical data independence. 5) Explain to the Mapping OR Explain Conceptual/Internal Mapping and External/Conceptual Mapping 6) Explain structure of DBMS. 7) Explain components of DBMS. 8) Explain Function and services of DBMS. 9) Explain DATA Model. 10) Explain Object-based logical models. 11) Explain entity-relationship model. 12) Explain object-oriented model. 13) Explain Record-based logical models. 14) Explain Relational model. 15) Explain Network model. 16) Explain Hierarchical model. 17) Explain Physical models 18) Comparison between Data Models 19) Types of Database System any one will be explain? 20) Explain Centralized Database System 21) Explain Parallel Database System 22) Explain Client / Server Database System 23) Explain Distributed Database System 18