Introduction to Database
Lec 01
Dr Adel Khaled

Outline
• BASIC CONCEPTS AND DEFINITIONS
• Data Versus Information
• Metadata
• TRADITIONAL FILE PROCESSING SYSTEMS
• Data Models
• Database Management Systems (DBMS)
• COMPONENTS OF THE DATABASE ENVIRONMENT
• Characteristics of the Database Approach

BASIC CONCEPTS AND DEFINITIONS
➢ Data: Historically, the term data referred to facts concerning objects and events that could be recorded and stored on computer media.
➢ For example, in a salesperson’s database, the data would include facts such as customer name, address, and telephone number.
➢ This type of data is called structured data.
➢ The most important structured data types are numeric, character, and dates.
➢ Structured data are stored in tabular form (in tables, relations, arrays, spreadsheets, etc.) and are most commonly found in traditional databases and data warehouses.

Databases today are used to store objects such as documents, e-mails, maps, photographic images, sound, and video segments in addition to structured data. For example, the salesperson’s database might include a photo image of the customer contact. This type of data is referred to as unstructured data, or as multimedia data.

Data Versus Information
Consider a bare list of names and numbers presented with no headings. These facts satisfy our definition of data, but most people would agree that the data are useless in their present form. Even if we guess that this is a list of people’s names paired with their Social Security numbers, the data remain useless because we have no idea what the entries mean.

Metadata
• Metadata are data that describe the properties or characteristics of end-user data and the context of that data.
• Some of the properties that are typically described include data names, definitions, length (or size), and allowable values.
• Metadata describing data context include the source of the data, where the data are stored, ownership (or stewardship), and usage.
• Although it may seem circular, many people think of metadata as “data about data.”

TRADITIONAL FILE PROCESSING SYSTEMS
• When computer-based data processing first became available, there were no databases.
• As business applications became more complex, it became evident that traditional file processing systems had a number of shortcomings and limitations.

Disadvantages of File Processing Systems
➢ PROGRAM-DATA DEPENDENCE: File descriptions are stored within each database application program that accesses a given file.
➢ DUPLICATION OF DATA: Because applications are often developed independently in file processing systems, unplanned duplicate data files are the rule rather than the exception.
➢ LIMITED DATA SHARING: With the traditional file processing approach, each application has its own private files, and users have little opportunity to share data outside their own applications. Managers often find that a requested report requires a major programming effort because data must be drawn from several incompatible files in separate systems.
➢ EXCESSIVE PROGRAM MAINTENANCE: The preceding factors all combine to create a heavy program maintenance load in organizations that rely on traditional file processing systems.

Data Models
Data models capture the nature of and relationships among data and are used at different levels of abstraction as a database is conceptualized and designed. The effectiveness and efficiency of a database are directly associated with the structure of the database.
Entity: A person, a place, an object, an event, or a concept in the user environment about which the organization wishes to maintain data.
Attribute: A piece of data you are interested in capturing about the entity (e.g., Customer Name).
RELATIONSHIPS
A well-structured database establishes the relationships between entities that exist in organizational data so that desired information can be retrieved. Most relationships are one-to-many (1:M) or many-to-many (M:N).

Database Management Systems (DBMS)
Database management system (DBMS): A software system that is used to create, maintain, and provide controlled access to user databases.
In Figure 1-4 there is only one place where the CUSTOMER information is stored, rather than the two Customer Master Files. Both the Order Filling System and the Invoicing System will access the data contained in the single CUSTOMER entity.

[Figure: File system versus DBMS]

Advantages of the Database Approach

Cautions About Database Benefits

COMPONENTS OF THE DATABASE ENVIRONMENT
• Computer-aided software engineering (CASE) tools: CASE tools are automated tools used to design databases and application programs. These tools help with the creation of data models and in some cases can also help automatically generate the “code” needed to create the database.
❑ Repository: A repository is a centralized knowledge base for all data definitions, data relationships, screen and report formats, and other system components. A repository contains an extended set of metadata important for managing databases as well as other components of an information system.
❑ DBMS: A DBMS is a software system that is used to create, maintain, and provide controlled access to user databases.
❑ Database: A database is an organized collection of logically related data, usually designed to meet the information needs of multiple users in an organization. It is important to distinguish between the database and the repository. The repository contains definitions of data, whereas the database contains occurrences of data.
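
The difference between definitions and occurrences of data, and the idea of a single shared CUSTOMER entity, can be tried out in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, columns, sample rows, and the two query functions are assumptions made for illustration and are not taken from Figure 1-4. SQLite's sqlite_master table stands in for a repository only in the narrow sense that it holds table definitions; a real repository holds much more.

```python
import sqlite3

# In-memory database; every name and row below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Definitions of data: one CUSTOMER entity and a 1:M relationship to orders.
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL
    );
""")

# Occurrences of data: one customer with many orders.
conn.execute("INSERT INTO customer VALUES (1, 'Acme Furniture', '12 Main St')")
conn.executemany(
    "INSERT INTO orders (order_id, customer_id, order_date) VALUES (?, ?, ?)",
    [(1001, 1, "2024-01-15"), (1002, 1, "2024-02-03")],
)


def stored_definitions(conn):
    """Return the table names recorded in SQLite's schema table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def orders_to_fill(conn):
    """Hypothetical order-filling query: reads the single customer table."""
    return conn.execute(
        "SELECT o.order_id, c.name, c.address "
        "FROM orders o JOIN customer c ON c.customer_id = o.customer_id "
        "ORDER BY o.order_id"
    ).fetchall()


def invoice_summary(conn):
    """Hypothetical invoicing query: reads the same customer table."""
    return conn.execute(
        "SELECT c.name, COUNT(*) "
        "FROM customer c JOIN orders o ON o.customer_id = c.customer_id "
        "GROUP BY c.customer_id"
    ).fetchall()
```

Calling stored_definitions(conn) lists what the database knows about its own structure, while orders_to_fill(conn) and invoice_summary(conn) both read the one customer table. Changing the customer's address once is therefore visible to both functions, which is the point of storing CUSTOMER in a single place.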
COMPONENTS OF THE DATABASE ENVIRONMENT (continued)
➢ Application programs: Computer-based application programs are used to create and maintain the database and provide information to users.
➢ User interface: The user interface includes languages, menus, and other facilities by which users interact with various system components, such as CASE tools, application programs, the DBMS, and the repository.
➢ Data and database administrators: Data administrators are persons who are responsible for the overall management of data resources in an organization. Database administrators are responsible for physical database design and for managing technical issues in the database environment.
➢ System developers: System developers are persons such as systems analysts and programmers who design new application programs. System developers often use CASE tools for system requirements analysis and program design.

Database Applications
• The examples discussed so far are what we call traditional database applications.
• More recent applications include the following:
• Geographic information systems (GIS): store and analyze maps, weather data, and satellite images.
• Multimedia databases: store images, audio clips, and video streams digitally.
• Data warehouses: used in many companies to extract and analyze useful business information from very large databases to support decision making.
• Real-time and active database technology: used to control industrial and manufacturing processes.
• Database search techniques are being applied to the World Wide Web to improve the search for information.

A database can be of any size and complexity. For example:
• A list of names and addresses.
• The IRS (Internal Revenue Service): assume it has 100 million taxpayers and each taxpayer files 5 forms with 400 characters of information per form. One year of returns is then 100,000,000 × 5 × 400 = 2 × 10^11 characters, or about 200 GB at one byte per character. The 800 GB figure is reached only if four years of returns are kept (the current return plus the past three).
• Amazon.com: about 15 million people visit per day, and about 100 people are responsible for database updates.

A UNIVERSITY Example
• A UNIVERSITY database maintains information concerning students, courses, and grades in a university environment.
• The STUDENT file stores data on each student.
• The COURSE file stores data on each course.
• The SECTION file stores data on each section of each course.
• The GRADE_REPORT file stores the grades that students receive.
• The PREREQUISITE file stores the prerequisites of each course.

Database manipulation involves querying and updating.
Examples of queries are as follows:
• Retrieve the transcript (a list of all courses and grades) of ‘Smith’.
• List the names of students who took the section of the ‘Database’ course offered in fall 2008 and their grades in that section.
• List the prerequisites of the ‘Database’ course.
Examples of updates include the following:
• Change the class of ‘Smith’ to sophomore.
• Create a new section for the ‘Database’ course for this semester.
• Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester.

Characteristics of the Database Approach
The main characteristics of the database approach include the following.

Self-Describing Nature of a Database System
➢ A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints.
➢ This definition is stored in the DBMS catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.
➢ The information stored in the catalog is called metadata.

Support of Multiple Views of the Data
➢ A database typically has many types of users, each of whom may require a different perspective or view of the database.
➢ A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored.
➢ Some users may not need to be aware of whether the data they refer to is stored or derived.
➢ A multiuser DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.

Sharing of Data and Multiuser Transaction Processing
• A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time.
• This is essential if data for multiple applications is to be integrated and maintained in a single database.
• The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
• A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate correctly and efficiently.

Database Users
Database administrators: Responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use, and monitoring the efficiency of operations.
Database designers: Responsible for defining the content, the structure, the constraints, and the functions or transactions against the database. They must communicate with the end users and understand their needs.
End users: Access the database for querying, updating, and generating reports.
System analysts: Determine the requirements of end users.
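
As a closing illustration, the UNIVERSITY example and the characteristics of the database approach can be combined in one small experiment. This is a minimal sketch using Python's built-in sqlite3 module. The column names, the sample rows, and the transcript view are assumptions made for illustration, since the lecture names the files but not their fields. The sketch shows the catalog being queried like ordinary data, a view that is derived rather than stored, and an update wrapped in a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema for four of the UNIVERSITY files, plus one derived view.
conn.executescript("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY, name TEXT, class TEXT
    );
    CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE section (
        section_id    INTEGER PRIMARY KEY,
        course_number TEXT REFERENCES course(course_number),
        semester      TEXT,
        year          INTEGER
    );
    CREATE TABLE grade_report (
        student_number INTEGER REFERENCES student(student_number),
        section_id     INTEGER REFERENCES section(section_id),
        grade          TEXT
    );
    -- A view: virtual data derived from the stored tables.
    CREATE VIEW transcript AS
        SELECT s.name, c.course_name, g.grade
        FROM grade_report g
        JOIN student s   ON s.student_number = g.student_number
        JOIN section sec ON sec.section_id = g.section_id
        JOIN course c    ON c.course_number = sec.course_number;

    -- Made-up sample rows.
    INSERT INTO student VALUES (17, 'Smith', 'freshman');
    INSERT INTO course VALUES ('CS3380', 'Database');
    INSERT INTO section VALUES (135, 'CS3380', 'Fall', 2008);
    INSERT INTO grade_report VALUES (17, 135, 'A');
""")


def catalog_entries(conn):
    """Self-describing nature: list tables and views from SQLite's catalog."""
    return conn.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY type, name"
    ).fetchall()


def transcript_of(conn, student_name):
    """Query: retrieve the transcript of one student through the view."""
    return conn.execute(
        "SELECT course_name, grade FROM transcript "
        "WHERE name = ? ORDER BY course_name",
        (student_name,),
    ).fetchall()


def change_class(conn, student_name, new_class):
    """Update: change a student's class inside a transaction."""
    # 'with conn' commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "UPDATE student SET class = ? WHERE name = ?",
            (new_class, student_name),
        )
```

Here transcript_of(conn, 'Smith') answers the first sample query, and change_class(conn, 'Smith', 'sophomore') performs the first sample update. A single in-memory connection cannot demonstrate concurrency control; the transaction block only shows where a multiuser DBMS would apply it.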