

Chapter 1: Introduction

What is a database?

shared file containing integrated data with controlled redundancy

Often implemented as a group of related tables (examples on pages 4-5)

Relationships between tables often implemented as other tables.

Making Distinctions

Database:

A collection of tables as in the previous slide

Database management system (DBMS):

Database plus tools to process requests, enforce integrity constraints, provide security, analyze usage, optimize access, etc.

Database application.

Software that accesses data from a database.

Definition:

Relation:

2-dimensional table having the following properties

Table entries are single valued or atomic.

This means that an entry does not consist of a structure more complex than strings, dates, etc.

For example, a column type cannot be another table or relation.

Entries in any one column are of the same type.

Each column, also called an attribute, has a name.

The order of columns and rows is unimportant.

No two rows are identical.

NOTE: The purist may disagree with this definition as he or she will see a table as an implementation of a relation.

A relation is more of an abstract concept but is usually implemented using a table.

Some do use the terms interchangeably.

Benefits

reduce redundancy and inconsistency. e.g. most student information from the tables on page 5 is not replicated.

One fact is in one place.

Data is shared

Security is applied centrally

Language independent (COBOL, VB, C++, Java, C#, etc.)

Multiple applications

Easier to maintain integrity

Data independence: allows data access without knowledge of its internal organization and structure

Design: How many tables?

This is an important decision, based on a set of rules known as normalization, which we will cover later.

However, the figure on page 18 illustrates a simple example related to an important issue.


SQL: Structured Query Language

Used to extract information from one or more tables.

Can specify what you want without specifying how to get it.

Example on p. 9

Can be standalone or embedded in application software.

General format

Select stuff

From one or more tables

Where conditions

Ranges from nearly trivial to fairly complex logic
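
The general format above can be tried with Python's built-in sqlite3 module. This is only a minimal sketch: the student table, its columns, and its rows are made up for illustration and are not the textbook's example.

```python
import sqlite3

# In-memory database; the table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student (id, name, major, gpa) VALUES (?, ?, ?, ?)",
    [(1, "Ann", "CS", 3.8), (2, "Bob", "Math", 3.1), (3, "Cara", "CS", 2.9)],
)

# SELECT stuff FROM one or more tables WHERE conditions.
# The query states what is wanted, not how to find it.
honor_cs = conn.execute(
    "SELECT name, gpa FROM student WHERE major = ? AND gpa >= ? ORDER BY name",
    ("CS", 3.0),
).fetchall()
print(honor_cs)
```

Here the SQL is embedded in application software; the same SELECT statement could also be typed standalone into a query tool.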

Some Definitions

Metadata (Sysfiles), also called the data dictionary.

Description of all tables in the database.

Example on p. 12.

More of a concern for a Database Administrator (DBA).

Know what it is - but we will not focus on it.

Client/Server Environment

Database is stored on a server

Application software often runs on a client using languages such as C# and others.

Typically written by application programmers.

Stored Procedure.

Procedures stored on a server.

Typically written by DB people.

Can be invoked by client applications.

Can be used for common activities used by multiple users.

Some DBAs may limit database access except through certain stored procedures.

It gives them more control.

Trigger.

A special type of procedure that is invoked automatically when a certain action occurs.

Can be used to make sure needed data elements are updated due to user actions.

Example: A student adds a class and a trigger is activated to update tuition & fees.

The app that adds the student does not know about the trigger but the DBMS knows.
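
The tuition example can be sketched with SQLite through Python's sqlite3 module. The table names, the trigger name add_tuition, and the rate of 300 per credit are assumptions chosen for illustration; SQL Server uses a different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, tuition REAL DEFAULT 0);
CREATE TABLE enrollment (student_id INTEGER, course TEXT, credits INTEGER);

-- Fires automatically after each new enrollment row.
-- The rate of 300.0 per credit is a made-up value.
CREATE TRIGGER add_tuition AFTER INSERT ON enrollment
BEGIN
    UPDATE student
    SET tuition = tuition + NEW.credits * 300.0
    WHERE id = NEW.student_id;
END;

INSERT INTO student (id, name) VALUES (1, 'Ann');
""")

# The application only adds the class; it never mentions tuition.
conn.execute("INSERT INTO enrollment VALUES (1, 'CS451', 3)")
tuition = conn.execute("SELECT tuition FROM student WHERE id = 1").fetchone()[0]
print(tuition)
```

The INSERT statement issued by the application contains no reference to tuition; the DBMS runs the trigger on its own.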

Building a database:

Some terms:

Entity-Relationship (E-R) Diagrams.

Entities are somewhat like the classes you’ve designed in previous courses.

Relationships define how entities are related to one another.

Together they must reflect the reality as it is understood.

Design phase:

Design entities, relationships, and constraints consistent with perceived reality.

Test phase:

Create tables, stored procedures, triggers, forms, reports, etc. consistent with the E-R diagram and test.

Implementation phase:

Put into production

Early database models:

Hierarchical model

IMS (Information Management System)

Developed largely by IBM

Required all data be organized as a hierarchy

Tended to be awkward since not all realities are hierarchical in nature

Network model (CODASYL: Conference on Data Systems Languages)

Data organized into complex graph (network) data structures.

Application programs reflected the actual data structures.

Changes in design potentially affect ALL applications – costly!!

Relational Model (dominant form today)

Object Model (not a commercial success).

See table on p. 21 for a general history – also the prose on p. 23.

We will NOT cover web-based databases/services since we already have a course for that.

Relational Model

Edgar Codd’s landmark paper in CACM (Communications of the Association for Computing Machinery), “A Relational Model of Data for Large Shared Data Banks”, in 1970.

Codd, a mathematician working for IBM in San Jose, CA, applied concepts of relational algebra to the problem of a “stored data bank”.

Paved the way for the development of the relational database.

Mapping objects to relational databases.

E.g. how does an object oriented program access non-object-oriented data in a relational database?

http://www.agiledata.org/essays/mappingObjects.html

We will see how this works when we discuss ADO.NET
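
As a rough preview of the idea (not ADO.NET itself), the sketch below copies one row of a relational table into an object. The Student class, the load_student helper, and the table layout are hypothetical names used only for illustration.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Student:
    """Object-oriented view of one row of the (hypothetical) student table."""
    id: int
    name: str
    major: str

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ann', 'CS')")

def load_student(conn, student_id):
    """Return the row with this id as a Student object, or None if absent."""
    row = conn.execute(
        "SELECT id, name, major FROM student WHERE id = ?", (student_id,)
    ).fetchone()
    return Student(*row) if row else None

ann = load_student(conn, 1)
print(ann)
```

The database still holds only atomic column values; the mapping code is what turns a row into an object and would have to do the reverse to save changes.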

Data Structures for databases:

Appendix D

We will not focus on complex data structures but there are a few things you must be aware of.

This appendix is online in a zipped file at http://wps.prenhall.com/bp_kroenke_database_11/127/32761/8386898.cw/index.html

Disk

Contains concentric magnetic tracks on each surface.

Each track is divided into sectors.

Disk head moves radially inward and outward while the disk rotates.

Disks are SLOW and a potential bottleneck

Need to minimize disk head movement (seek time) for optimal performance.

Rotational delay (time for the desired sector to rotate under the head) is also a factor.

File Organizations

Linked List:

Database records, disk sectors, or clusters of sectors are maintained in a linear linked list.

Simple but can be very slow, especially for finding a record based on a key or index value.

e.g. find an employee record given the employee’s ID.

See pages D-3 and D-4 of the appendix.

Indexes: list of field values that identify records, along with the location of each record.

List can be linear or some other structure.

e.g. a textbook often has a linear index at the end

Searching the index is a lot faster than searching through all of the content.

However, for many millions of records a linear index can still be inefficient.

B-tree: hierarchical arrangement of index values.

Provides quick access & order to data.

See pages D-5 and D-6 of the appendix.

Typically each level would correspond to a sector or cluster.

Might have millions of records accessible via only a few index layers.

http://technet.microsoft.com/en-us/library/cc917672.aspx

Hash function

Index value (sometimes called a key) fed into a hash function which specifies where to store the entire record.

To locate a record, given its key, apply the hash function to the key value and the location is calculated.

R is a record; R.k is the value of R’s key field.

H is a hash function that calculates a location.

H(R.k) gives the location in the hash table (database records) where R is stored.

Time to find a record can be independent of the number of records.

Assumes a good hash function and sufficient space.

Both are nontrivial and are the subject of a course in data structures or algorithms.
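
A toy version of the scheme, assuming integer keys and a deliberately simple hash function (key modulo table size). The names NUM_SLOTS, store, and find are made up for this sketch; real systems use better hash functions and map slots to disk pages.

```python
NUM_SLOTS = 8   # hypothetical table size

def H(key):
    """Toy hash function: map an integer key to a slot number."""
    return key % NUM_SLOTS

# Each slot holds a small bucket so that colliding records can share it.
table = [[] for _ in range(NUM_SLOTS)]

def store(record):
    """Place the whole record at the location computed from its key R.k."""
    table[H(record["k"])].append(record)

def find(key):
    """Recompute the location from the key and search only that bucket."""
    for record in table[H(key)]:
        if record["k"] == key:
            return record
    return None

store({"k": 1234, "name": "Ann"})
store({"k": 1242, "name": "Bob"})   # 1242 % 8 == 2, the same slot as 1234
print(find(1242))
```

The two keys collide in slot 2, which is why the bucket is searched. With enough slots and a hash function that spreads keys evenly, buckets stay short and lookup time does not grow with the number of records.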

Indexes

Dense

1 entry for each record

Useful if records are stored in random order

Non-dense

1 index entry for a group of records (say 1 per page)

Useful if records are maintained in order
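
The difference between the two kinds of index can be sketched in memory. The records, the page size of two, and the function names are assumptions for illustration; a real DBMS would store pages on disk.

```python
from bisect import bisect_right

# Dense index: records stored in arbitrary order, one index entry per record.
heap = [(42, "Dee"), (7, "Ann"), (19, "Bob"), (33, "Cy")]   # (key, data)
dense = {key: pos for pos, (key, _) in enumerate(heap)}

def dense_lookup(key):
    """Follow the index entry straight to the record's position."""
    pos = dense.get(key)
    return None if pos is None else heap[pos]

# Non-dense index: records kept in key order and grouped into pages;
# the index holds only the first key on each page.
pages = [[(7, "Ann"), (19, "Bob")], [(33, "Cy"), (42, "Dee")]]
first_keys = [page[0][0] for page in pages]

def sparse_lookup(key):
    """Pick the one page that could hold the key, then scan that page."""
    i = bisect_right(first_keys, key) - 1   # last page whose first key <= key
    if i < 0:
        return None
    for record in pages[i]:
        if record[0] == key:
            return record
    return None

print(dense_lookup(19), sparse_lookup(42))
```

The non-dense index needs only two entries for four records, but it works only because the records are kept in key order.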

3 Levels (views) of a database.

Internal: physical storage

Conceptual, sometimes called the DBA view: a collection of base tables

Base table is a table with a direct underlying storage structure.

Created from the E-R diagram.

Described by a data dictionary or metadata.

External, sometimes called the user view

Collection of tables defined for a particular user using SQL.

They are called logical tables, virtual tables, derived tables, or just views.

The data in a view is presented to the user as a single table though it is actually derived from one or more base tables specified in its definition.

These logical tables do not exist in the same sense as a base table: there is NO direct underlying storage structure.

A view can simplify the user’s view of the database and provide security by hiding certain parts of a base table.

Examples:

A table consisting of students with a given major or GPA value.

The single table in Fig 1-20 (page 18) could be a view derived by joining the two other base tables in the same figure.
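
A view of students with a given major can be sketched with SQLite via Python's sqlite3 module. The student table, its rows, and the view name cs_student are hypothetical and do not come from Fig 1-20.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL);
INSERT INTO student VALUES (1, 'Ann', 'CS', 3.8), (2, 'Bob', 'Math', 3.1);

-- The view stores only its definition, not a copy of the rows,
-- and it hides the gpa column from whoever queries it.
CREATE VIEW cs_student AS
    SELECT id, name FROM student WHERE major = 'CS';
""")

before = conn.execute("SELECT * FROM cs_student ORDER BY id").fetchall()

# Changing the base table changes what the view shows.
conn.execute("INSERT INTO student VALUES (3, 'Cara', 'CS', 2.9)")
after = conn.execute("SELECT * FROM cs_student ORDER BY id").fetchall()
print(before, after)
```

The user queries cs_student as if it were a single table, yet the new row appears in it without anything being written to the view.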


DBA (Database Administrator) responsibilities: defining the conceptual and internal schemas, user liaison, security and integrity, backup/recovery, and performance.

Microsoft SQL Server 2008

Accessing SQL Server:

Start > All Programs > Microsoft SQL Server 2008 R2 > SQL Server Management Studio.

You may see a window indicating that MS SQL Server Management Studio is configuring for first time use.

Just wait and be patient.

In the Connect to Server window, select

Database Engine for the Server Type

ICSD for the server name

Windows Authentication for Authentication

(These should all be defaults).

Then press the connect button.

In the Object Explorer pane (left side of screen), expand the Databases folder.

If you don’t see an Object Explorer Pane, select it from the View menu.

There are four databases that start with “CS451”; you should have read-only access to each one. I will use these during the semester.

To see the tables in one of them, expand the database folder and the subsequent Tables folder that appears. Right click on one of the table names and select Select Top 1000 Rows.

Test this and let me know of any access problems.
