this pdf excerpt

advertisement
cnapter2
lcl exprect
rnclancy?
, h o r vn i n
r: tor the
DatabaseSystem Concepts
and Architecture
I hold on
: io na lfi l e
rJn gesto
1rL'ntand
rntifi, tl-re
f N, ar.rd
' ., 'l 't.l ?
i n clude
tratures
" i t su e of
.rnri r-r-rity
'':al$**ffir
tr he architecture of DBMS packageshas evolyed
ffi from the early rnonolithii systems,where the
ivhole DBMS software packagewas one tightly integrated system,to the modern
DBMS packagesthat are rnodular in design,with a client/serversystemarchitecture.
This evolution mirrors the trends in computing, where large centralizedmainframe
computersarebeing replacedby hundredsof distributedworkstationsand personal
computers connected via contntunications networks to various types of server
nrachines-Web servers,databaseservers,file servers,application servers,and so on.
In a basic client/server DBMS architecture,the system functionality is distributed
benveentwo types of modules.r A client module is typically designedso that it rvill
run on a user workstation or personalcomputer.Typically,applicationprograms
and user interfacesthat accessthe datirbaserun in the client module. Hence,the
client module handles user interaction and provides the user-friendlv interfaces
such as forms- or menu-basedGUIs. The other kind of module, called a server
module, typically handles data storage,access,search,and other functions. We discussclient/serverarchitecturesin more detail in Section2.5. First, we rnust study
more basic concepts that will give us a better understanding of moderr-rdatabase
architectures.
In this chapter we presentthe terminology and basic conceptsthartrvill be used
throughout the book. Section2.1 discusses
data models and definesthe conceptsof
1 . As we sh a llse e in Se ctio n2 ,5,there are vari ati onson thi s s mpl e l w o-l rerclent/serverarchtecture,
29
Chapter 2 DatabaseSystem Concepts and Architecture
schemasand instirnces,which are fundanrentalto the study of databirses)/stems.
Therr,we discussthe three-schemaDBlvlS architecturear.rdclataindependencein
or.rwhat a DBMS is supposedto do. Lr
Section2.2;this providesa user'sperspectir.e
Section2.3 we describethe typesof interfacesand languagesthat are typically prothe databasesystemsoftwareenvironrnent.
vided by a DBMS. Section2.4 discusses
2.5 givesan overviewof varioustypesof client/serverarchitectures.
Finally,
Sectior-r
Sect ion2. 7
a cl assi fi cati onof the t,vpesof D B MS packages.
Se c ti o n2.6 preser.rts
sunrmirrizesthe chapter.
in Sections2.4 through 2.6 providesmore detailedconceptsthat rnav
The r-naterial
to the basicintroductory material.
be consideredas sr-rpplementary
2.1 Data Models,Schemas,and Instances
characteristicof tl.redirtabirseapproachis tl-ratit providessome
One fundan.rer.rtal
level of data abstrirction.Data abstraction generallyrefersto the suppressionof
detailsof data organizationand storageand the highlighting of the essentialfeaof the
turesfor an improved understandingof clata.One of the main characteristics
databaseapproachis to support data abstractionso that different usersmay perlevelof detail.A data model-a collectionof concepts
ceivedata at their pref-erred
that can be used to describethe structure of a database-provides the necessary
o.fa databascw e m ean t he dat a
n re a n st o achi evethi s abstracti on.lB v strrtctrtrt:
and constraintsthat should hold tbr the data.Most data ntodtypes,relatior.rships,
elsalsoincludea setof basicoperations fbr specifing retrievirlsand updateson the
database.
In addition to the basicoperationsprovided by the data model, it is becomingmore
comnon to include conceptsin the data model to speciff the dynamic aspect or
appiication.This allorvsthe databasedesignerto specifra set
behavior of a databirse
ofvalid user-definedoperationsthat are allorvedon the databaseobjects,rAn exanrrvl-richcan be appliedto a
ple of a user-det'ined
operationcould be COMPUTE-GPA,
STUDENTobject.On the other hand, genericoperationsto insert,delete,modiSr,or
retrieveany kind of object are often included in the Dnsicdnto nrodeloperatiorts.
Conceptsto specifubehavior are fundamentalto object-orienteddata models (see
Chapters20 and 21) but are alsobeing incorporatedin more traditional data rnodmodels (seeChapter 22) extend the basic relaels. For example,ol'rject-relational
tionirl model to incltrdesuch concepts,amons others.In the relationaldata model,
th e re i s a provi si on to attachbehavi or to the rel ati onsi n the form of per sist ent
storedmodules,poprularlyknown as storedprocedures(seeChapter9).
2 , So r e t' Te) tre rrota t", del 's u>ed to oenote d speti ',c databasedescrol or, o- schena for e' arrol e.
th e m a r keti ngdata model .We w l l not use thi s ni erpretatron,
rel recl 5aTrenow rereD voaLaD ased e> rgrards o' Il a' e
3 .- l- e r c,.tsi oro{-orccptsl ooesrrbebera'ro
o e sr g l d ctr" l e. are r'creds'rgl , be -g co'nb,.ed , )l o a S rngl eactrvrtll.adr'ol a 7 sl ecr' y rrg oel ^a/ro' s
a sso ca te d w i th softw aredes qn,
2.1 Data Models,Schemas.and Instances
\\'5ternS.
: . icncei n
i : t r r lo.In
-.rlh prrolr tllllle D t ,
.. I rrn al l y,
. l io n 2 .7
::r.lt ntay
.:J.
sOllle
.. . iort of
:t : i. r lfea:-. o t -th e
: : r. rVp er!( ) l l ce P t s
r cccs sary
: hc cl ata
.tt.lnrod: a\( )ll th e
. : r gr t l o re
rspect or
..;iti a set
\lt L'\rtnt-
:.Ii.'ti to a
:trtlih',or
'( ' r (l Il ()/ ls .
' Jcls (se e
.ri.rnrod. i. ic r el a .r nrociel,
' a r \ lS te nt
2.1.1 Categoriesof Data Models
\lany data models havebeen proposed,which rve can categorizeaccordingto the
t\.pesof conceptsthey use to describethe databasestructure. HighJevel or conceptual data models provide conceptsthat are closeto the way many usersperceive
data,whereaslow-level or physical data models provide conceptsthat describethe
detailsof how data is storedin the computer.Conceptsprovided by low-leveldata
nrodelsare generally meant for computer specialists,not for typical end users.
Betweenthese two extremesis a classof representational (or implementation)
data models,awhich provide conceptsthat may be understood by end usersbut that
.lre not too far removed from the way data is organized within the computer.
Representational
data models hide sorre detailsof data storagebut can be implernentedon a computer systemdirectly.
(-onceptualdata models use conceptssuch as entities,attributes,and relationships.
.\n entity representsa real-worldobject or concept,such as an employeeor a projAn attribute represents
r'ctthat is describedin the database.
someproperty of intercst that further describesan entity, such as the employee'sname or salary.A
re l a ti o n s h i pa mo n g tw o or more euti ti esrepresents
un l ssoci ati ouamong tw o or
nrore entities,for example,a works-on relationship between an employeeand a
project.Chapter 3 presentsthe Entity-Relationshipmodel-a popular high-level
.onceptual data model. Chapter 4 describesadditional abstractions used for
.rdvancedmodeling,such as generalization,
and categories.
specialization,
Representationalor implementation data models are the models Llsedmost frequently in traditional commercial DBMSs. Theseinclude the widely used relational
clatamodel, aswell as the so-calledlegacydata models-the network and hierarchical models-that havebeenwidely usedin the past.Part 2 is devotedto the relational
data model, its operationsand languages,
and some of the techniquesfor programnring relational databaseapplications.5The SQL standard for relational databasesis
describedin Chapters8 and 9. Representational
data modelsrepresentdataby using
record structuresand henceare sometimescalledrecord-baseddata models.
\\'e can regard the object data model group (ODMG) as a new family of higherlevel implementation data models that are closerto conceptualdata models.We
describethe generalcharacteristics
of object databasesand the ODMG proposed
standardin Chapters20 and 21. Object dartamodels are also frequentlyutilized as
high-levelconceptualmodels,particularlyin the softwareengineeringdomain.
Physicaldata models describehow data is stored as files in the cornputer by representinginformation such as record formats,record orderings,and accesspaths.An
accesspath is a structure that makesthe searchfor particular databaserecordsefficient.We discussphysicalstoragetechniquesand accessstrlrcturesin Chapters13
-:.:--De,
J. The term implementatrondata model is not a standard termi we have ntroduced t to refer to ihe availa cie d a ta m o d e lsin co m m e r cia databasesystems,
::. l'
S
5 , A su m m a r yo f th e n e two r k a nd hi erarchi cadata model s i s i ncl udedi n Append ces E and F. They are
a cce ssib lefr o m th e b o o k' sWe b srte,
31
Chapter2 Database
SystemConceptsand Architecture
and 14. An index is an example of an accesspath that allows direct accessto data
using an index term or a key.word.It is similar to the index at the end of this book,
exceptthat it may be organizedin a linear,hierarchical,or some other fashion.
2.1.2 Schemas,Instances,and DatabaseState
In any data model, it is important to distinguishbetweenthe descriptionof the database and the databaseitself. The description of a databaseis called the database
schema, which is specified during databasedesign and is not expectedto change
frequently.6Most data models have certain conventions for displaying schemasas
diagrams.TA displayedschema is called a schema diagram. Figure 2.1 shows a
schemadiagram for the databaseshown in Figure 1.2; the diagram displaysthe
structureof each record type but not the actual instancesof records.We call each
object in the schema-such as STUDENTor COURSE-a schemaconstruct.
A schemadiagram displaysonly sorre aspectsof a scherna,such as the names of
record types and data items,and some types of constraints.Other aspectsare not
specifiedin the schemadiagram;fbr example,Figure2.1 showsneitherthe datatype
of eachdata itern nor the relationshipsamong the vnrious files.Many typesof constraints are not representedin schema diagrams.A constraint such as students
majoringin cornputerscience
must toke C51310belbrethe end of theirsoplrcmore
)tear
is quite difficult to represent.
The actual data in a databasemay changequite frequently. For example,the databaseshown in Figure 1.2changeseverytime we add a studentor enter a new grade.
The data in the databaseat a particularmornent in time is calleda databasestateor
snapshot. It is also called the atrrent set of occurrences or instances in the database.In a given databasestate,each schernaconstruct has its own current set of
instances;for example,the STUDENTconstruct will contain the set of individr"ral
student entities(records)as its instances.Many databasestatescan be constructed
to correspond to a particular databaseschema.Every time we insert or delete a
record or change the value of a data item in a record, we change one state of the
databaseinto anotherstate.
The distinction between datarbase
schemaand databasestate is very important.
When we define a new database,we specif, its databaseschemaonly to the DBMS.
At this point, the corresponding databasestate is the entpty sfcte with no data. We
get the initial stateof the databasewhen the databaseis first populated or loaded
with the initial data.From then on, everytime an updateoperatior.r
is appliedto the
database,we get another databasestate.At any point in time, the databasehas a
currentstnte.EThe DBMS is partly responsiblefor ensuring that every stateof the
6
Srhama
nhonnoc
d; r o L
r ra , ,ta l t.,
J rr u
noodan
e<
r n a,u
r eunyu
r ,, i r oLm o n r r
n{
rho
n o r a h r co
r r .r i n -a ",u ,
a:ngnpl,,\
)
L- r,
ln^^
a , q s,
N
,\rl ^vvc ^ "
d a ta b a sesyste m sin clu d eo perati onsfor al l ow i ngschemachanges,a though the schemachangeprocess
i. m o r e r lo lve d ll' a n s n p le daraba.eupdal es.
7 lr r s cr sr o r a ' y r T o a T a o asepdr arce ro L<e sLhe/r?sas t'e pl u'a 'o' s"hpma,e\e
th e o r o o e r o u r a lfo r m ,T h e word schemei s someti mesused to refer to a schema.
B. T h e cu r r e n tsta ie is a so c a ed the currentsnapsholof the database.
tl 'oJgl - .' hcmafdi \
Architecture
2.2 Three-Schema
and Datalndeoendence
to i 11il
i. bo ok,
) 11.
r c d Jtaa t a base
. ir . rnge
r': ll. l5 i l S
.:) ( )\ \ 'S i l
.. r ',r t he
.r ,lcach
databaseis a valid state-that is, a state that satisfiesthe structure and constraints
specified in the schema. Hence, specifying a correct schema to the DBMS is
c'rtremely important and the schema must be designedwith utmost care. The
DBMS storesthe descriptionsof the schemaconstructsand constraints-also called
the meta-data-in the DBMS catalog so that DBMS software can refer to the
schemawhenever it needsto. The schemais sometimescalled the intension, and a
databasestateis called an extension of the schema.
,\lthough, as mentioned earlier,the schemais not supposedto changefrequently,it
is r.rotuncommon that changesoccasionallyneed to be applied to the schemaas the
.rpplicationrequirementschange.For example,we may decide that another data
to
item needsto be stored for each record in a file, such as adding the Date_of_birth
the STUDENTschemain Figure2.1.This is known as schemaevolution.Most modern DBMSs include some operations for schema evolution that can be applied
rlhile the databaseis operational.
.i:l l cs o f
. i:c I t ot
. ll.l t|Pe
\) [ co l l :; rr/ r't t f -s
. ' fi . l 'c{l r-
:c t l a t a-
\ qracle.
state()r
Architecture
2.2 Three-Schema
and Data Independence
Three of the four important characteristicsof the databaseapproach, listed in
Section1.3,are (1) insulation of programs and data (program-dataand programoperationindependence),(2) support of multiple userviews,and (3) useof a catalog to store the databasedescription (schema). In this section we specify an
architecture for databasesystems,called the three-schema architecture,q that was
:c t l i t t a: : -i r 'l O f
i ir it iual
' rru ct ed
-: clct ea
-' oi the
Figure 2.1
f or t he
S chema
diagr am
S T U D EN T
Name Studentnumber Class Maior
datehac p
C OU R SE
Course name Course number Credit hours Department
'( ) r t i l I l t .
I )B\ lS.
.rt.r.\Ve
loaded
. l to t he
'c h;:s it
: oi tl.re
:..e r
-. SS
:.il
P R E R E OU IS ITE
number
CoursenumberI Prerequisite
SECTION
Section identifier Coursenumber Semester Year
Instructor
GR AD ER E POR T
Student_number
I Section_identifier
I Grade
S
rt (Tschrtzis
afterthe committee
that proposed
architecture,
9 Ths is alsoknownas the ANSI/SPARC
andKluo1978) ,
rn Fi n rrrp
1 9
Chapter2 DatabaseSystemConceptsand Architecture
proposed to help achieveand visualizethesecharacteristics.
Then we discussthe
conceptof data independencefurther.
2.2.1 The Three-Schema
Architecture
The goal of the three-schema
architecture,illustratedin Figure2.2,is to separatethe
user applicationsand the physicaldatabase.In this architecture,schemascan be
defined at the follorving three levels:
The internal level has an internal schema,which describesthe physicalstoragestructureof the database.
The internalschemausesa physicaldatarnodel
and describesthe completedetailsof data storageand accesspaths for the
database.
The conceptual level has a conceptual schema, which describesthe structure of the whoie databasefor a comrnunity of users.The conceptualschema
hides the detailsof physicalstoragestructuresand concentrateson describi n g e n ti ti e s,di .rtatypes, rel ati onshi ps,user operati ons,and constraint s.
Usually,a representationaldata model is used to describethe conceptual
schemawhen a databasesystemis implernented.This intplenrentatiotr
corrceptualschenrnis often based on a conceptualsclrcmodesignin a high-level
data n-rodel.
: The external or view level includes a number of external schemas or user
views. Eachexternalschemadescribesthe part of the databasethat a particular user group is interestedin and hides the rest of the databasefrom that
Figure2,2
The three-schema
architecture.
End Users
*
ExternalLevel
External/Conceptu
al
M apping
\
Conceptual Level
Conceptual/Internal
Mapping
Internal Level
InternalSchema
Stored Database
2.2 Three-Schema
Architecture
and DataIndependence
, .ussthe
.tr.ltcthe
. ..ln be
.. r i st or:.1lnr)del
. t ir r the
la \ t l'L lC -
. ch ema
.: cr cri b . i r. lints.
t . r . pt ual
:. rl l 1 1 ) r l -
sh -l e vel
r )f u sef
.
,-
r ['(ll- - c ; ^ rtl-
\) l t l t h at
user group. As in the previous case,each external schemais typically implemented using a representational
dertarnodel,possiblybasedon an external
schemadesignin a high-ler.eldata model.
The three-schemaarchitectureis a convenienttool with which the usercan visualize
the schemalevelsin a databasesysterr.Most DBMSs do not separatethe threelevels
completelyand explicitly,but support the three-schemaarchitectureto some extent.
Some DBMSs may include physical-leveldetails in the conceptual schema.The
three-levelANSI architecturehas an ir-nportantplace in clatabasetechnology development becauseit clearlyseparates
the users'externallevel,the system'sconceptual
level,and the internal storagelevelfor designinga database.
It is very much applicable in the designof DBMSs, even today.In most DBMSs that support user views,
e x te rn a l s c h e ma s a re s p eci fi ed i n the same data model that descri besthe
conceptual-level
information (for example,a relationalDBMS like OracleusesSQL
tbr this). SomeDBMSs allow differentdata rnodelsto be usedat the conceptualand
external levels.An example is Universal Data Base (UDB), a DBMS from IBM,
rvhichusesthe relationalmodel to describethe conceptualschema,but may use an
object-orientedmodel to describean externalschema.
Notice that the three schemasare only descriptionsof data; the stored data that
octuallyexistsis at the physicallevel.In a DBMS basedon the three-schemaarchitecture, each user group refers only to its orvn external schema.Hence, the DBMS
must transform a requestspecifiedon an external schemainto a requestagainstthe
conceptualschema,and then into a requeston the internal schemafor processing
over the stored database.If the requestis a databaseretrieval,the data extracted
from the stored databasemust be reformatted to match the user'sexternalview. The
processesof transforming requestsand resultsbetween levelsare called mappings.
Thesemappings may be time-consuming,so some DBMSs-especially those that
are meant to support small databases-do not support external views.Even in such
systems,however,a certain arnount of mapping is necessaryto transform requests
betweenthe conceptualand internal levels.
2.2.2 Data Independence
The three-schemaarchitecture can be used to further explain the cor-rceptof data
independence, which can be defined as the capacity to change the schema at one
levelof a databasesystemwithout having to changethe schemaat the next higher
level.We can define two types of data independence:
r Logical data independence is the capacityto charrgethe conceptual schema
without having to changeexterntrlschemasor application programs. We
may changethe conceptualschemato expand the database(by adding a
record type or data item), to changeconstraints,or to reducethe database
(by removing a record type or data item). In the last case,externalschemas
that referonly to the remainingdatashould not be affected.For example,the
external schema of Figure 1.5(a) should not be affectedby changing the
GR AD E_ R EP ORfil
T e (or record type) show n i n Fi gure 1.2 i nto the one
36
Chapter 2 DatabaseSystem Concepts and Architecrure
shown in Figure L6(a). onlv the viervdefinition and the mappingsneed be
changedin a DBMS that supportslogicaldata independence.
After the conceptual schemaundergoesa logical reorganization,applicatiou progriuus
thertreferencethe externalschemaconstructsmust work as betore.Chanses
to constraintscan be appliedto the conceptualschemawithout affectingthe
externalschemasor irpplicationprograms.
l.,',Physical data independence is the capacity to change the internal schena
w i th o u t h a v i n g to change the conceptualschema.H ence, the external
schentasneednot be changedaswell.Changesto the internalschen.ra
ntaybe
neededbecausesome physicirlfiles were reorganized-[or example,by creating additionalaccessstructures-to improve the performanceof retrievalor
update.If the same data as before remains in the database,tve should not
have to changethe conceptualscherna.For example,providing an access
path to improve retrievalspeedof sectionrecords(Figure 1.2) by semester
and yearshould not requirea querv such as listoll sectiotts
oJJbred
in fnlt 2004
to be changed,although the query would be e-xecuted
.rore efficiently by the
DBMS by utilizing the new accesspatl'r.
Generally,physicaldata independenceexists in most databasesand tlle environnlents in which the exact location of data on disk, hardware details of storage
encoding,placement,compression,splitting,mergingof records,and so or-rare hidden from the user.Applicirtionsrenrainunilivareof thesedetails.On the other hanj,
logical data independenceis very hard to come by becauseit allorvsstructuraland
c o n s tra i n t c h a n g e sw i t hout affecti ng appl i cati on programs-a much stri cter
requirement.
whenever we havea multiple-levelDBMS, its catalogmust be expandedto include
infbrmation on how to ntap reqLlests
and dataalrons the variouslevels.Tl-reDBIVIS
usesadditionalsotiwareto accomplishthesemappingsby ret'erringto the mapping
information in the catalog.Data independenceoccursbecausewhen the schemais
changedat some level,the schemaat tl.renext higher levelremainsunchanged;on11,
the mappingbetweenthe trvo levelsis changed.Hence,applicationprogramsreferring to the higher-levelschemaneednot be changed.
The three-schentaarchitecturecirn make it easierto achievetrue clataindependence,both physicaland logical. Horvever,the two levelsof mappings createan
overheadduring cornpilationor executiorrof i.rquery or progrirm,leadingto inefficienciesin the DBMS. Beciruse
of this, few DBMSs haveimplementedthe full threeschemaarchitecture.
2.3 DatabaseLanguagesand Interfaces
In section 1.4we discussedthe variety of userssupportedbv a DBNIS.The DBlvtS
must provide appropriatelanguages
and ir.rterfaces
fbr eachcategoryof users.In this
sectionwe discussthe types of languagesand interfacesprovided by a DBMS and
the usercategoriestargetedby eachinterfirce.
2.3 DatabaseLanouaoesand Interfaces
cci be
' col l I r.l l11s
. rn ges
r g t he
hcma
c rn al
i.rvbe
. rciltr. r l 0 r
.l trot
i c ccss
r a st er
. :004
.r the
iro n()rirge
: hiclh a nd ,
rl .rr.rd
ricter
c lu de
) B\ I S
ppillg
nt.l ls
: o nlv
rcfer-
-'Pe11.tc a l l
nt'fllh r.-e-
)I J\ I S
n tl.ris
: .r n cl
2.3 .1D B MS La n g u a ges
Once the designof a databaseis completedand a DBMS is chosento implenrentthe
tlltabase,the first stepis to specif' conceptualand internal schemasfor the database
.rnclany mappingsbetweenthe two. In marryDBMSs where no strict separationof
lc.r'elsis maintained, one language,calied the data definition language (DDL), is
Lrsed
by the DBA and by databasedesignersto defineboth schemas.The DBMS will
h.u.ea DDL compiler whosefunction is to processDDL statementsin order to identif\'descriptionsof the schemaconstructsand to storethe scher.na
descriptionin the
l)BMS catalog.
ln DBMSswherea clearseparationis maintainedbetweenthe conceptualand internal levels,the DDL is usedto specifythe conceptualschemaonly.Another liu-rguage,
the storage definition language (SDL), is used to specif' the internal scl-rer.ntr.
The
nrirppingsbetween the two schemasmay be specifiedin either one of theselanquages.In most relationalDBMSs today,there is no specificlanguagethat perfbrms
the role of SDL. Instead, the internal schema is specified by a cor-r-rbinatior.r
of
parametersand specificationsrelatedto storage-the DBA staff typically controls
inclexingand mapping of data to storage.For a true three-schemaarchitecture,rve
rvould need a third language,the view definition language (VDL), to specify user
viervsand their mappingsto the conceptualschema,but in most DBMSs the DDL is
usedto define both conceptualand externalschemas.In relationalDBMSs,SQL is
usedin the role of VDL to define user or applicationviews as resultsof predefined
(seeChapters8 and 9).
qLreries
()nce the databaseschemasare compiled and the databaseis populatedwith data,
usersmust have some means to manipulate the database.Typical manipulations
includeretrieval,insertion,deletion,and modification of the data.The DBMS pror ides a set of operations or a languagecalled the data manipulation language
DML) for thesepurposes.
ln current DBMSs,the precedingtypes of languagesare usually not considered
distirrct languages;rather, a comprehensiveintegrated languageis used that includes
constructsfor conceptualschemadefinition, view definition, and data manipr-rlation. Storagedefinitior-ris typicallykept septrrate,
sinceit is usedfor defining phvsical storagestructuresto fine-tune the performanceof the databasesystem,lvl-richis
tusr"rally
done by the DBA staff.A typical exarnpleof a cornprehensivedatabaselanguageis the SQL relationaldatabaselanguage(seeChapters8 and 9), rvhich representsa combination of DDL, VDL, and DML, as well as statementsfor constraint
specification,schemaevoiution, and other features.The SDL was a component in
earlyversionsof SQL but hasbeen rernovedfror.nthe languageto keepit at the conceptualand externallevelsonly.
'fhere are two main types of DMLs. A high-level or nonprocedural DML can be
used on its own to specif, complex databaseoperationsconcisely.Many DBMSs
allow high-levelDML statementseither to be enteredinteractivelyfrom a display
in a general-purposeprogramming lanrnonitor or terminal or to be en-rbedded
guage.In the latter case,DML statementsmust be identifiedwithin the program so
37
Chapter2 DatabaseSystemConceptsand Architecture
that they can be extractedby a precornpilerand processedby the DBMS. A lowlevel or procedural DML rrrrr-sr
be embeddedin a general-purposeprogramming
language.This type of DML typically retrieves individual records or objects from
eachseprarately.
Therefore,it needsto rlseprogranrming
the databaseand processes
languageconstructs,such aslooping, to retrieveand processeachrecordfrom a set
of records.Low-level DMLs are erlsocalled record-at-a-time DMLs becauseof this
property.DL/1, a DML designedfor the hierarchicalmodel, is a lol-level DML that
to
u s e sc o mma n d ss u c ha s GE TU N IOU EGE
, T N E X T,or GE TN E X TW ITH INP A R E N T,
navigatefrom recordto record rvithin a hierarchyof recordsin the database.Highlevel DMLs, such as SQL, can specifl,and retrievemany recordsin a single DML
statement;therefbre,they.arecalledset-at-a-time or set-oriented DMLs. A qr-reryin
a high-levelDivll often specifiesrll ich datato retrievelather than lrcw lo retrieveit;
therefore,sr.rchlanguagesare also called declarative.
Whenever DML comn-rands,whether high level or lorv level, are embedded in a
ger-reral-purpose
progran-uninglanguage,that languageis called the host language
and the DN4L is called the data sublanguage.r0On the other hand, a high-level
DML usedin a standaloneinteractivemanner is calleda querylanguage.In general,
both retrievaland updatecommandsof a high-levelDML may be usedinteractively
part of tl-requery language.rl
and are hencecoi'rsidered
Casual end userstypically use a high-level query languageto specifr their requests,
whereasprogrammers use the DML in its embedded form. For naive and parametric users,there usuallv irre user-friendly interfaces for interacting with the database;thesecan alsobe usedby cirsualusersor otherslvho do not want to learn the
detailsof a high-levelquery language.We discussthesetypesof interfacesnext.
2.3.2 DBMS Interfaces
User-friendly interfacesprovided by a DBMS may include the following:
Menu-Based Interfaces for Web Clients or Browsing, Theseinterfacespresent
the userwith listsof options (calledmenus) that leadthe userthrough the formulation of a reqlrest.Menus do awayrvith the need to mernorize the specificcommands
and syntaxof a query language;rather,the query is con-rposed
step-by-stepby picking options from a menu that is displayed by the system.Pull-down menus are a
very popular technique in Web-based user interfaces. They are also often used in
browsing interfaces, which allow a user to look through the contents of a database
in an exploratoryand unstructuredmanner.
1 0 . lr o b le ct o a la o a se s,tre rost a1d odta srbaaguages [yoi cal l y'orrr o1e .rl eQrareritar^guage-fo'
e xa m p le ,C+ + with so m e extensi onsto supportdatabasefunct onal i ty,Some rel ati onalsystemsa so pro v d e in te g r a te dla n g u a g e s-forexampl e,Orac e's PLISOL,
1 1 .Acco r d r n gto th e En g lishmeani ngof the w orcjquery,i i shoul dreal l ybe used to descr be retri evas onl y ,
n o t u p d a te s.
2.3 DatabaseLanouaoesand Interfaces
-{ lowrn r ming
.ts fiom
rn.rming
)m a set
: oi this
\ l L t hat
I EN T ,tO
: . H ighl c DML
l u e r y in
:riL'\'eiU
l cd in a
rnguage
i h -leve l
!c-n€f?1,
.rctively
r'QU€StS,
.iramettc' data-'arn the
present
rrmular m an ds
:rvpickJS are a
u s edi n
l.rtabase
displaysa fbrm to each user.
Forms-Based Interfac€s. A forrns-basedir-rterface
L-serscan fill out all of the form entriesto insert new data,or they can fill out only
ccrtainentries,in which casethe DBMS will retrievernatchingdata for the remaining entries.Forms are usually designedand programmed for naive usersas intertircesto canned transactions.Many DBMSs have forms specification languages,
rvhichare speciallanguagesthat help programmersspecifysucl-rfbrms. SQL*Forms
is ir form-basedlanguagethat specifiesqueriesusing a form designedin conjuncti o n rv i th th e re l a ti o n a ld a tabaseschema.Oracl e Forms i s a component of the
L)racleproduct suite that providesan extensiveset of featuresto designand build
.rpplicationsusing forrns.Some systemshaveutilities that define a form by letting
the end userinteractivelvconstructa sampleforrn on the screen.
Graphical User Interfaces. A GUI tl,picallydisplaysa schena to the user in diaqrammaticform. The userthen can specifya query by manipulatingthe diagram.It-t
nriinycases,GUIs utilize both menus and forrns.Most GUIs use a pointing device,
parts of the displayedschemadiagram.
\uch elsa mouse,to pick certair.r
Natural Language Interfaces. Theseinterfacesacceptrequestsrvritten in English
()r some other languageand attempt to ruulerstotrd
them. A natural languageinterwhich is similar to the databaseconceptualschema,
l.rceusuallyhas its own schenra,
.rsrvellas a dictionary of important words.The natural languageinterfacerefersto
the rvordsir-rits schema,as rvell as to the set of standardwords in its dictionary,to
generatesa
interpret the request.lf the interpretation is successful,the ir.rterface
high-levelquery correspondingto the natural languagerequestand submits it to
the DBMS for processing;otherrvise,a dialogueis startedwith the userto clarifl' the
rcquest.The capabilitiesof natural languageinterfaceshave not advancedrapidly.
lirclal',we seesearchengir.res
that acceptstringsof ntrturallanguage(like Englishor
:panish) words and match them u'ith docurnentsat specificsites(for local search
cngines)or Web pageson the Web at large (for engineslike Google or AskJeeves).
'l'hev
use predefinedindexeson rvords and use ranking functions to retrieveand
prr-sentresultingdocumentsin a decreasingdegreeof match. Such"free form" texttral query interfacesare not common on structured relational or legacvr.r.roclel
databases.
Speech Input and Output. Limited useof speechas an ir-rputquerv and speechas
.l n a n s w e r to a q u e s ti o n or resul t of a request i s becomi ng cornmonpl ace.
directory,
,\pplications with limitecl vocabulariessuch as inquiries for telephor-re
t'light arrival/departure,and bank account inforrnation are allowing speechfor
inpr-rtand output to enableordir-raryfolks to accessthis inforrnation. The speech
input is detectedusing a library of predefinedwords and usedto set up the paramt.tersthat are suppliedto the queries.For output, a similar conversionfrom text or
into speechtakesplace.
nr.rmbers
Interfaces for Parametric Users. Parametricusers,such as bank tellers,ofterr
havea small set of operationsthat they must perform repeatedly.For example,a
Chapter 2 DatabaseSystem Concepts and Architecrure
teller is able to use single-function keys to invoke routine and repetitive
transactions
such as depositsor withdrawals into accounts,or balanceinquiries.
S),stemsanalysts
and programmers design and implement a specialinterfacefor
eachknown classof
naive.users.Usually a small set of abbreviatedcommands is included,
with the goal
of minimizing the number of keystrokesrequired for each request.
For example,
function keys in a terminal can be p.og.un-r-.d to initiate,nurious
commands. This
allows the parametric user to proceed with a minirnal number of
keystrokes.
Interfaces for the DBA. Most databasesystemscontainprivileged
commandsthat
can.beused only by the DBAs staff.Theseinclude .o,rr-u.,d, foi
creating accounrs,
settingsystemparameters,granting accountauthorization,changi'g
a schema,and
reorganizingthe storagestructuresof a database.
2.4 The DatabaseSystemEnvironment
A DBMS is a complex softwaresystem.In this sectio.rwe discuss
the types of software componentsthat constitutea DBMS and the types of computer
systemsoft_
ware with which the DBMS interacts.
2.4.1 DBMS ComponentModules
Figure2.3 illustrates,in a simplified form, the typical DBMS components.
The fig_
ure is divided into two halves.The top half of the hgure refersto
the various usersof
the databaseenvironment and their interfaces.Thelower half shows
the internals of
the DBMS responsiblefor storageof data and processingof transactions.
The databaseand the DBMS catalogare usually stored on disk. Access
to the disk is
controlled primarily by the operating system (os), rvhich
schedules disk
input/output. A higher-level stored data manager module of the
DBMS controls
access
to DBMS information that is storedon disk,whetherit is part of
the database
or the catalog.
Let us consider the top half of the figure first. It shows interf'acesfor
the DBA staff,
casualuserswho work with interactive interfacesto formulate queries,
application
programmers who program using some host languages,
atrd paiametric userswho
do data entry work by supplying parametersto predefined tia.sactions.
The DBA
staff works on defining the databaie and tuning it by making changes
to its defini_
tion using the DDL and other privilegedcommands.
DDr compiler processe.s
schemadefinitions, specified i' the DDL, and stores
lhe
descriptionsof the schemas(meta-data)in the DBMS catalog.The catalog
includes
information suchasthe namesand sizesof files,namesand dita types
of dataitems,
storagedetails of each file, mapping infbrmation among schemas,
and constraints,
in addition to many other types of information rhar areireeded
by the DBMS rnod_
ules.DBMS softwaremodulesthen look up the cataloginformatio'
as .eeded.
Casuai users and persons with occasionalneed fbr infbrmation from the
database
interact using some form of interface,which we show, as interactive
query inter-
2.4 The DatabaseSystem Environment
Ir on s
rh'sts
lss of
' g o al
n p le ,
T h is
CasualUsers
,
IDDLI
I
t
Sta lp m o n lq
- ' - ' " " ' ""' "
-----T-----
; that
unts,
. and
softsoft-
r f'lo-
r'rs Of
.i l s o f
iskis
disk
rtrols
rbase
stafl
ation
. rr'ho
I)BA
etrni!t ore s
I u d es
tcnts,
r i nt s,
n r od a b ase
l n ter-
I
,/
V
f-h-;,^^-l
Ouery
--T--
I
J
I
V
Application
Programmers
41
Parametric
Users
+
)
l--:---__l
uuery
IDDL I
I
l
I Compiler
I
--T--
Itl C o m p i l e rI
--------r-------
i
I
I
I
I
DBA Commands,
Oueries,andTransactions
V
Il^,,1 System l,
/
I uatarog/ r
Data
]
l-*,
I\- Dictionary I
-/
Concurrency
Control/
Backup/Recovery
Subsystems
Oueryand Transaction
Execution
Figur e2. 3
C nmnnnent nrndrrl ec nf : D B MS and thei r nterac tons .
or form-basedinteractionthat
tirce.We havenot explicitll-shown any r.t.te'nu-btrsed
Thesequeri esare
m a y b e L i s e dto g e l l e ra teth e i nteracti vequery ar,rtomati cal l y.
parsed,analyzeclfor correctnessof the operationsfor the model, the namesof data
elements,and so on by a query compiler that compilesthem into irn internal form.
This internal query is subjectedto query optirnizationthat we discussin Chapter
42
Chapter 2 DatabaseSystem Concepts and Architecture
15.Among other things, the query optimizer is concernedwith rearrangementand
possiblereorderingof operations,elimination of redundancies,and use of correct
algorithms and indexesduring execution.It consults the systemcatalog for statistical and other physical information about the stored data and generatesexecutable
code that performs the necessaryoperations for the query and makes calls on the
runtime processor.
Application programmers write programs in host languagessuch as fava, C, or
COBOL that are subn-rittedto a precompiler. The precompiler extractsDML commands from an application program written in a host programming language.
Thesecommandsaresentto the DML compiler for compilation into objectcodefor
databaseaccess.
The rest of the program is sent to the host languagecompiler.The
object codesfor the DML commands and the rest of the program are linked, forming a cannedtransactionwhoseexecutablecode includescallsto the runtime database processor.These canned transactions are useful to parametric users who
simply supply the parametersto these canned transactions so they can be run
repeatedlyas separatetransactions.An exampleis a bank withdrawal transaction
where the accountnumber and the amount may be suppliedas parameters.
In the lower half of Figure 2.3,the runtime databaseprocessoris shown to execute
(l) the privilegedcommands, (2) the executablequery plans,and (3) the canned
transactionswith runtime parameters.It works with the systemdictionary and may
update it with statistics.It works with the stored data manager,which in turn uses
basic operating system servicesfor carrying out low-level input/output operations
betweendisk and main memory. It handlesother aspectsof data transfer such as
managementof buffers in main memory. Some DBMSs havetheir own buffer management module while others depend on the OS for buffer management.We have
shown concurrency control and backup and recoverysystemsseparatelyas a module in this figure. They are integrated into the working of the run-time database
processorfor purposesof transactionmanagement.
It is now common to have the client program that accesses
the DBMS running on a
separatecomputer from the computer on which the databaseresides.The former is
called the client computer running a DBMS client and the latter is called the databaseserver. In some cases,the client accesses
a middle computer, calledthe application server, which in turn accesses
the databaseserver.We elaborateon this topic in
Se c ti o n2 .5 .
Figure 2.3 is not meant to describea specificDBMS; rather, it illustratestypical
DBMS modules. The DBMS interacts with the operating system when disk
accesses-to the databaseor to the catalog-are needed.If the computer system is
sharedby many users,the OS will scheduleDBMS disk accessrequestsand DBMS
processingalongwith other processes.
On the other hand, if the computer systemis
mainly dedicatedto running the databaseserver,the DBMS will control main memory buffering of disk pages.The DBMS also interfaceswith compilers for generalpurpose host programming languages,and rvith application serversand client
programs running on separatemachinesthrough the systemnetwork interface.
2.4 The DatabaseSvstem Environmenr
nt a nd
orrect
: a t ist i utable
rn the
C, or
L(rlll-
t u ag e.
,defor
'r. The
f orm. -l^+^
ud td-
j \fho
)e run
action
\ecute
a nn ed
d may
N U SC S
ations
lch as
m an .' have
modtabase
g on a
mer is
data'plicarp icin
v pical
r disk
tem is
)BMS
tem is
ntemneralclien t
,c',
2.4.2 DatabaseSystem Utilities
ln addition to possessing
the softrvaremodules just described,most DBMSs have
databaseutilities that help the DBA managethe databasesystem.Common utilities
hai'ethe following typesof functions:
Loading. A loading utility is usedto load existingdata files-such as text files
or sequentialfiles-into the database.Usually,the current (source)format of
the data file and the desired(target) databasefile structure are specifiedto the
utility, which then automaticallyreformatsthe clataand storesit in the database.With the proliferationof DBMSs,transferringdata frorn one DBMS to
another is becoming colnllrolt in many organizations.Some vendors are
offering products that generatethe appropriate loading programs, given the
existingsourceand target databasestoragedescriptions(internal schernas).
Such tools are also called conversion tools. For the hierarchicalDBMS called
IMS (l B M ) a n d fo r many netw ork D B MS s i ncl udi ng ID MS (C ornputer
Associates),
SUPRA (Cincorr), or IMAGE (HP), the vendorsor third party
companiesare rnakinga variety of conversiontools available(e.g.,Cincom's
SUPRAServerSQL) to transform data into the relationalmodel.
Backup. A backup utility createsa backup copy of the database,usuallyby
dumping the entire databaseonto tape. The backup copy can be used to
restorethe databasein caseof catastrophicfailure.Incrementalbackupsare
also often used,where only changessincethe previousbackup are recorded.
Incremental backup is r.norecomplex, but savesspace.
Databasestorage reorganization. This utility can be used to reorganizea set
of databasefiles into a differentfile organizationto inrproveperformance.
Performancemonitoring. Such a utility monitors databaseusageand provides statisticsto the DBA. The DtsA usesthe statisticsin making decisions
such as whether or not to reorganizefilesor whether to add or drop indexes
to improve performarnce.
t)ther utilities may be availablefor sorting files,handling data compression,monitoring access
by users,interf'acing
with the network,and performingother functions.
2.4.3Tools,ApplicationEnvironments,
and CommunicationsFacilities
Other tools areoften avaiiableto databasedesigners,
users,and DBMS. CASEtoolsrr
.rreusedin the designphaseof databasesysterns.
Another tool that can be quite usetirl in large organizations is an expandeddata dictionary (or data repository) system. In addition to storing cataloginforr.nationabout schemasand constraints,the
riatadictionary storesother ir.rformation,such as designdecisions,usagestandards,
L Ar th o u g aCASE sta r o s io ' co r p u t e-a
r d a ta b a sed e sig n ,
oed so't"l a e e-g 'l ee rng.nary C AS f too 5 dre L5eo p rq-dry
Chapter2 DatabaseSystemConceptsand Architecture
applicationprogram descriptions,and userinformation. Sucha systemis alsocaiied
an information repository. This information can be accesseddirectly by users or
the DBA when needed.A data dictionary utility is similar to the DBMS catalog,but
mainly by usersrather
it includes a wider variety of information and is accessed
than by the DBMS software.
Application development environments, such as the PowerBuilder (Sybase)or
JBuilder (Borland) system,are becoming quite popular. These systernsprovide an
environment for developing databaseapplications and include facilitiesthat help in
many facetsof databasesystems,including databasedesign,GUI development,
querying and updating, and application program development.
The DBMS also needsto interfacewith communications softI^/are,whose function
is to allow usersat locations remote frorn the databasesystemsite to accessthe databasethrough computer terminals, workstations, or their local personal computers.
These are connected to the databasesite through data communications hardware
suchas phone lines,long-haulnetworks,local netlvorks,or satellitecommunication
devices.Many commercial databasesystenlshave communication packagesthat
work with the DBMS. The integratedDBMS and data communicationssysternis
calleda DB/DC system.lnaddition,somedistributedDBMSs are physicallydistributed over multiple machines.In this case,communications networks are neededto
connect the machines. These are often local area networks (LANs), but they can
also be other types of networks.
2.5 Centralizedand Client/Server
Architecturesfor DBMSs
2.5.1 CentralizedDBMSsArchitecture
Architectures for DBMSs have fbllowed trends sirrilar to those for general computer system architectures.Earlier architecturesused mainframe computers to
provide the main processingfor all systemfunctions, including user application
programs and user interfaceprograms,as well as all the DBMS functionality.The
such systemsvia computer terminalsthat did
reasonwas that most usersaccessed
not have processingpower and only provided display capabilities.Therefore,all
processingwas performed remotely on the computer system,and only display
information and controls were sent from the computer to the display terminals,
which were connectedto the central computer via various types of communications networks.
As prices of hardware declined, most users replacedtheir terminals with PCs and
workstations.At first, databasesystensusedthesecomputerssimilarly to how they
had used display terminals, so that the DBMS itself was still a centralized DBMS in
which all the DBMS functionality,applicationprogram execution,and user interfaceprocessingwere carried out on one machine.Figure2.4 illustratesthe physical
2.5 Centralized
and Client/Server
Architectures
for DBMSs
\r) called
u scrsor
r log ,but
': rather
rJsc) or
, r i de an
t h p ln
in
) P nt ent,
i n ct ion
ne da ta :lp Uters.
r rd\\'are
l ic. rt ion
cc.sthat
, \ t em i S
.iistrib:cdedto
h cl can
45
i o m p o n e n ts i n a c e n tra l i z edarchi tecture.Gradual l y,D B MS systemsstarted to
trploit the availableprocessingpower at the user side,which led to client/server
l)BNISarchitectures.
2.5.2 Basic Client/ServerArchitectures
Irirst,we discussclient/serverarchitecturein general,then we seehow it is appliedto
I )llN4Ss.The client/server architecture was developedto deal with computing envir()nmentsin which a largenumber of PCs,workstations,file servers,printers,datal..rseservers,Web servers,and other equipment are connectedvia a network. The
idea is to define specialized servers with specific functionalities. For example,it is
possibleto connecta number of PCs or small workstationsas clientsto a file server
that maintainsthe files of the client machines.Another machine can be designated
.is a printer server by being connected to various printers; thereafter,all print
rr.questsby the clients are forwarded to this machine.Web servers or e-mail servers
.rlsofall into the specializedservercategory.In this way,the resourcesprovided by
.pecializedserverscan be accessed
by many client machines.The client machines
the
user
with
the
appropriate
interfacesto utilize theseservers,as well as
l.rovide
rvith local processingpower to run local applications.This conceptcnn be carried
()\'crto software,with specialized
prograrns-such asa DBMS or a CAD (computer.rrdeddesign)package-being stored on specificservermachinesand being made
.rccessible
to multiple clients.Figure 2.5 illustratesclient/serverarchitectureat the
Figure 2.4
A physical
centralized
architecture.
l l comrlers to
l ica t ion
r t r. T he
h.rt did
o re, ai l
i i s p la y
.-. : .- .,1 ^
rllllldlJt
t Apdb"ti*l
t
Programs
I
t
Terminal
DisplayControl
| fDisp
t pBMSI
Software
OperatingSystem
r un i c a-
(- s a nd
rr r ' t h ey
u.\lsin
r ln t er'hr sical
l/O Devices
(Printers,
TapeD ri ves,...)
Chapter2 DatabaseSystemConceptsand Architecrure
logical level; Figure 2.6 is a simplified diagram that shows the physical architecture.
Some machineswould be client sitesonly (for example,disklessworkstationsor
workstations/PCs rvith disks that have only clieni software installed). other
machineswould be dedicatedservers,and otherswould haveboth client and server
functionality.
The concept of client/serverarchitectureassumesan underlying framework that
consistsof rnany PCs and workstations as well as a smaller tru-b., of mainframe
machines,connectedvia LANs and other types of computer networks.A client in
this framework is typically a user machine that providei user interface capabilities
and loca-lprocessing.When a client requiresaccessto additional functionalitysuch as databaseaccess-that doesnot existat that machine,it connectsto a server
that provides the needed functionality. A server is a system containing both hardware and software that can provide servicesto the client machines,such as file
Figure 2.5
Logicaltwo-tier
client/server
architecture.
Figure 2.6
Physical
two-tier
client,/server
architecture,
Diskless
Client
tilf
Site 1
C l i ent
with Disk
2.5 Centralized
and Client/Server
Architectures
for DBMSs
'cture.
)n s o r
O the r
server
k that
trame
ent in
rilities
lin'server
hardas file
.l.ccss,printing, archiving,or databaseaccess.
In the generalcase,some machines
installonly client software,othersonly serversoftrvare,and still othersmay include
iroth client and serversoftrvare,
as illustratedir-rFigure2.6.However,it is more comnron that client and serversoftrvareusually run on separatemachines.TWo main
lvpes of basic DBMS architecturesrverecreatedon this underlying client/server
il'irnrework:two-tier and three-tier.1r
We discussthem next.
2.5.3Two-TierClient/ServerArchitectures
for DBMSs
I'hc'client/serverarchitectureis increasinglybeing incorporated into commercial
l)BN,{Spackages.In relationaldatabasemanagementsystems(RDBMSs),many of
rr hich started as centralizedsysten-rs,
the s1'sterncomponents that were first moved
:o the client side were the user interfaceand application programs.BecauseSQL
sceChapters8 and 9) provided a standardlanguagefor RDBMSs,this createda
l,rgicaldividing point betweenclient and server.Hence,the query and transaction
:irnctionaiityrelatedto SQL processingremainedon the serverside.In such archiiccture,the serveris often called a query server or transaction server becauseit
r.ror.idesthesetwo functionalities.In an RDBMS the serveris also often calledar-r
SQL server.
in such a client/serverarchitecture,the r-rserinterfaceprograms and application
prosramscan run on the client side.When DBMS accessis required,the program
c:tablishesa connectionto the DBMS (rvhich is on the serverside);once the conncction is created,the client program can communicatewith the DBMS. A standard
..rlled Open Database Connectivity (ODBC) provides an application programming interface (API), which allowsclient-sideprogramsto call the DBMS, as long
ls both client and server machines have the necessarysoftware installed. Most
l)BMS vendorsprovide ODBC driversfor their systems.
A client program can actu.rllvconnectto severalRDBMSs and sendquery and transactionrequestsusing the
t)DBC API, which are then processedat the serversites.Any querv resultsare sent
b.rckto the client program, rvhich can processor displaythe resultsas needed.A
rclatedstandard for the lava programming language,called JDBC, has also been
.lt-tlned.This allows Javaclient programs to accessthe DtsMS through a standard
i rlterface.
fh e s e c o n d a p p ro a c h to c l i ent/serverarchi tecturel vas taken by some obj ectoriented DBMSs,where the softrvaremoduies of the DBMS were divided between
clicnt and server in a more integrated way. For example, the server level may
includethe part of the DBN{Ssoftwareresponsiblefbr handling datastorageon disk
p.rges,
local collcurrencycontrol and recovery,buffering and cachingof disk pages,
.rndother such functions.N'leanwhile,
the client level may handle the userinterface;
clatadictionary functions;DBMS interactionswith programming languagecompilt'rs;global query optimization. concrlrrencycontrol, and recoveryacrossmultiple
\crvers;structuring of comprlexobjectsfrom the data in the buffers;and other such
We d scuss i he tw o most basi c ones
3 T h e r e a r e m a n y o th e r va r ia to n s of cl i ent/serverarchrtectures.
,.T e ,
47
48
Chapter2 DatabaseSystemConceptsand Architecture
functions. In this approach,the client/serverinteraction is more tightly coupled and
is done internally by the DBMS modules-some of which reside on the client and
some on the server-rather than by the users.The exact division of functionalitv
varies from systemto system.In such a client/serverarchitecture,the server has
been called a data seryer becauseit provides data in disk pagesto the client. This
data can then be structured into objects for the client programs by the client-side
DBMS software itself.
The architecturesdescribedhere are called two-tier architectures becausethe software components are distributed over two systems:client and server.The advantagesof this architecture are its simplicity and seamlesscompatibility with existing
systems.The emergenceof the Web changedthe roles of clients and server,Ieading
to the three-tier architecture.
2.5.4 Three-Tierand n-TierArchitecturesfor Web Applications
Many Web applications use an architecturecalledthe three-tier architecture, which
adds an intermediate layer between the client and the databaseserver,as illustrated
i n F i g u re2 .7(a).
This intermediate layer or middle tier is sometimes called the application server
and sometimesthe Web server, depending on the application. This server plays an
intermediaryrole by storingbusinessrules (proceduresor constraints)that are used
to accessdata from the databaseserver.It can also improve databasesecurity by
checking a client's credentialsbefore forwarding a request to the databaseserver.
Clients contain GUI interfacesand some additional application-specificbusiness
rules.The intermediate serveracceptsrequestsfrom the client, processesthe request
and sendsdatabasecommands to the databaseserver.and then acts as a conduit for
Figure 2.7
I o n ica l ih r e e - tie r
;iw
C l i ent
client/server
architecture,
w i tha c o u p l eo f c o mmo n l ,
usednomenclatures.
Application
Server
or
WebServer
Database
Server
2.6 Classification
of DatabaseManagement
Systems
pl.'d and
l i c n t a nd
tion irl i ty
.'rr.erhas
crt. This
: cnt -si d e
:h c so ftc .rdvan: e r isti ng
-. 1c'ading
tions
re.rr'hich
, -:rt rated
In Server
l'la\-san
irc used
. r rin ' by
.!
\!'fvef.
passing(partially) processeddata from the databaseserverto the clients,where it
nrrrybe processedfurther and filtered to be presentedto usersin GUI format. Thus,
tlrc lser interface,applictttiottrules, a'nddata nccess
act as the three tiers. Figure
l.l(b) showsanother architectureused by databaseand other applicationpackage
\r'ndors. The presentationlayer displaysinformation to the user and allows data
cntry. The businesslogic layer handlesintermediaterules and constraintsbefore
tl.rtais passedup to the user or down to the DBMS. The bottom layer includesall
Jrta managementservices.If the bottom layeris split into two layers(a Web server
.urcla databaseserver),then this becomesa four-tier architecture.It is custornaryto
.livide the layersbetweenthe user and the stored data further into finer componcnts, thereby giving rise to n-tier architectureswhere n may be four or five.
I\ picaily,the businesslogic layeris divided into rnultiplelayers.Besidesdistributing
i)rogrammingand datathroughout a netrvork,r-tier applicationsafford the advan:.rq eth a t a n y o n e ti e r c a n ru n on an appropri ateprocessor
or operati ngsystenrpl attirrrn and can be handled independently.Another layer typically used by vendors of
I:R P (e n te rp ri s ere s o u rc ep l anni r-rg),
and C R M (customerrel ati onshi pmauagerrrent) packagesis the rniddlewarelnyer which,accounts for the front-end modules
.ommunicating with a number of back-enddatabases.
\clvancesin encryption and decryption technologymake it saferto transfersensirivedatafrom serverto client in encryptedform, whereit will be decrypted.The latIr.r con be done by the hardwareor by advancedsoftware.This technologygives
hieher levelsof data security,but the network securityissuesremain a major concr.rn.Varioustechnologiesfor data compressionalsohelp to transferiargeamounts
tri data from serversto clientsover wired altd wirelessnetworks.
:.u : i n eS S
r lc'tlU€St
::Juitfor
2.6 Classification
of Database
ManagementSystems
Severalcriteria are normally usedto classifyDBMSs.The first is the data model or-r
rvhichthe DBMS is based.The main data model usedin many current commerciirl
l)BMSs is the relational data model. The object data model has been implementecl
in some commercialsystemsbut has not had widespreaduse.Many legacyapplications still run on databasesystemsbasedon the hierarchical and network data
models.Examplesof HierarchicalDBMSs include IMS (lBM) and some other systcms like System2K (SAS Inc.) or TDMS, which did not succeedmuch commercially.IMS continues to be a very dominant player among the DBMSs in use at
sovernmentaland industrial installations,including hospitalsand banks.The netrvork data model was usedby many vendorsand the resultingproducts like IDMS
rCullinet-now Computer Associates),
DMS 1100 (Univac-now Unisys),IMAGE
, Hewlett-Packard),VAX-DBMS (Digital-now Compaq), and SUPRA (Cincom)
still have a following and their user groups have their own activeorganizations.If
rveadd IBM's popular VSAM file systemto these,we can easilysaythat (at the time
of writing), more than 50% of the worldwide-computerizeddata is in these socalledlegary databasesystems.
Chapter2 DatabaseSystemConceptsand Architecture
The relational DBMSs arreevolving continuously, and, in particular, have been
incorporating n-ranyof the conceptsthat were developedin object databases.This
has led to a new classof DBMSs calledobject-relationalDBMSs.We can categorize
DBMSs basedon the data model: relational,obiect,object-relational,hierarchical,
network, and other.
The secondcriterion used to classifyDBMSs is the number of users supportedbv
the system.Single-usersystemssupport only one user at a time and are mostly used
with PCs.Multiuser systems,which include the majority of DBMSs, support concurrent multiple users.
The third criterion is the number of sites over which the databaseis distributed. A
DBMS is centralized if the data is stored at a single computer site.A centralized
DBMS can support mr-iltipleusers,but the DBlv{Sand the databaseresidetotally at
a singlecomputer site.A distributed DBMS (DDBMS) can havethe actualdatabase
and DBMS softrvaredistributed over many sites,connectedby a computer network.
Homogeneous DDBMSs use the same DBMS softrvareat multiple sites.A recent
trend is to develop softwirreto accessseveralautonomous preexistingdatabases
stored under heterogeneous DBMSs. This leads to a federated DBMS (or multidatabasesystem),in which the participatingDBMSs irrelooselycoupledand havea
degreeof local autonomy.Many DDBMSs usea client-serverarchitecture.
The fourth criterion continues to be cost. But it is very difficult to propose any type
of classiflcationof DBlv{Ssbasedorl cost. Today we have open source (free) DBMS
products like N'IYSQLand PostgreSQlthat are supported by third-party vendors
The main RDBMS products are availableas free examinawith additional services.
tion 30-da1,copy'r,ersions
as u'ell as personalversious,rvhich may cost under $100
and allow a fair amourrtof functionality.The giant systemsare being sold in modular form with componentsto handle distribution, replication,parallel processing,
n-robilecapability,and so on, with a lirrgenurnberof parirmetersthat rnustbe defined
for the configuration. Furthermore, they are sold in the tbrm of licenses-site
licensesallow unlimited use of the databasesystemrvith any number of copiesrunning at the customersite.Another type of licenselimits the number of concurrent
usersor the nurnber of user seatsat a location. Standalonesingle user versionsof
in the overallconfiguration
somesystemslike ACCESSaresold per copy or ir-rcluded
of a desktopor laptop.In irddition,datawarehousingand mining features,aswell as
clatatypes,aremadeavailableat extracost.It is quite common
support for aclclitional
to pay in millions for the installation and maintenanceof databasesystemsannualll'.
We can also classifra DBI\{S on tbe basisof the types of accesspath options for
storing files.One well-known tamily of DBMSs is basedon invertedfile structures.
Finall,v,a DBMS can be general purpose or special purpose. When performance is
DBMS can be designedand built for a
a primary consideration,a special-prurpose
sr-rcha systemcanrlot be used for other applicationswithout
speciticapplicirtior-r;
major changes.Many airline reservationsand telephonedirectory systemsdeveloped in the pirstare specialpurposeDBMSs.Thesefall into the categoryof online
transaction processing (OITP) systems,which rnust support a large number of
delays.
concurrenttransactionswithout imposing excessive
2.6 Classification
of DatabaseManaoement
Svstems
r a v ebe en
ases.This
:Jtegorize
'rarchical,
rtrrtc'dbY
. 'rt lv use d
;.r)rt COn- ih ut e d.A
.'ntralized
I ()I rl l Va t
i J at a base
nctu'ork.
.\ recent
J.rtabases
\)r multintl havea
j rnv type
r DBN{S
, r'endors
cramina: Jcr S100
::'imcldu1,r;g5sing,
'c dct-tned
.. ss-si te
'picsrun) ncu rrent
:: . ions o f
:lguration
.r: rr'ellas
aonltnon
.innually.
r l i o ns fo r
tnrct u re s.
r nra ncei s
.uilt fo r a
\ \\'ithout
:rs devel,ri online
u nrb er o f
51
l.et us briefly elaborateon the n-raincriterion for classiS,ing
DBMSs:the datamodel.
t'hebasicrelationaldatamodel representsa databaseas a collectionof tables,rvhere
cach table can be stored irs a separateflle. The databasein Figure 1.2 resen.rbles
a
rc-lationalrepresentation.Nlost relational databasesuse the high-level query lan*uagecalledSQL and support a limited form of userviews.We discussthe relational
rrrodel,its languagesirnd operatiol.ls,2urdtechniques[or programming relational
.rpplicationsin Chtrpters5 through 9.
l'he object data model definesa databasein terrnsof objects,their properties,and
their operations.Objectswith the same stnlcture and behavior belong to a class,
.rr.rdclassesare organizedinto hierarchies (or acyclic graphs). The operationsof
cach class are specified in terms of predefined procedures called methods.
l{clationalDBMSs hzrvebeenextendingtheir modelsto incorporateobjectdatabase
conceptsand other capabilities;thesesystemsare referredto as object-relationalor
extended relational systems.We discussobject databasesand oL'rject-relational
systcrnsin Chapters20 to 22.
Trvoolder, historically important data rnodels,r-rowknown as legacydata moclels,iire
the network and hierarchical rnodels.The network model representsdata as record
tvpesand alsorepresentsa limited type of l:N relationship,calleda set t)?e. A 1:N,
o r o n e -to -m a n y ,re l a ti o n shi prel atesone i nstanceof a record to many record
instancesusingsomepointer linking mechanismin thesernodels.Figure2.8 shorvsa
rretrvorkschemadi:rgram for the datirbaseof Figure 1.2,where record types are
and settvpesare shown as labeleddirectedarrows.The network
.hown as rectangles
nrodel,alsoknown as the CODASYL DBTG modei,lahas an associated
record-at-atime languagethat rnust be embeddedin a host prograrnminglanguage.The netn'ork DML was proposedin tl-re1971 DatabaseTask Group (DBTG) Repe1135 ur't
crtensiorr of the COBOL language.It provides commands for locating records
COUR S E _OFFE R IN GS
IS A
S T U D E N TGR A D E S
P R E R E OU IS ITE
S E C TIONGR A D E S
G R A D E _R E P OR T
1 4 , CODASYLDBT G sta n d sfo r Con{erenceon D ata SystemsLanguagesD atabaseTaskGroup,w hi ch s
th e co m m itte eth a t sp e c { e d th e net'l ork mode and ts anguage.
Figur e2. 8
Theschem aof
Frgur e2,1 in net wor k
m odelnot at ion.
Chapter2 DatabaseSystemConceptsand Architecture
d i re c tl y (e .g .,F IND A N Y (record-type> U S IN G < fi el d-l i st> , or FIN D D U P LICATE
<record-type> USING <field-list>). It has commands to support traversalswithin
set-types(e.g.,GET OWNER,GET {FlRST,NEXT,LAST}MEMBERWITHIN<set-type>
WHERElcondition)).It alsohascommandsto storenew data (e.g.,STORE<recordtype>) and to make it part of a set type (e.g.,CONNECT<record-type>TO <settype>). The languagealso handles many additional considerationssuch as the
currency of record types and set types,which are defined by the current position of
the navigation processwithin the database.It is prominently usedby IDMS, IMAGE,
and SUPRA DBMSs today. The hierarchical model representsdata as hierarchical
tree structures.Each hierarchy representsa number of related records.There is no
standardlanguagefor the hierirrchicalrnodel.A popular hierarchicalDML is DL/l of
the IMS system.It dominated the DBMS market for over 20 yearsbetween 1965and
1985and is a very widely used DBMS u'orldwide eventoday and holds a very high
percentageof data in governmental,health care, and banking and insurance databases.Its DML calledDL/ 1 was a de facto industry standardfor a long time. DL/ I has
commands to locatea record (e.g.,GET {UNIOUE,NEXT}<record-type>WHERE,
<condition>). It has navigational facilities to navigatewithin hierarchies(e.g.,GET
NEXTWITHINPARENTor GET{FlRST,
NEXT}PATH<hierarchical-path-specification)
WHERE<condition>). It has appropriatefacilitiesto storeand updaterecords(e.g.,
INSERT(record-type>, REPLACE<record-type>).Currency issuesduring navigation are also handled with aclditional featuresin the language.We give a brief
overviewof the network and hierarchicalmodelsin AppendicesE and F.rs
The eXtended Markup Language (XML) model, now consideredthe standard for
It combines
data interchangeover the Internet,alsouseshierarchicaltree structLlres.
with
from
representation
models.
Data is
databaseconcepts
concepts
docurnent
nested
create
representedas elements;with the use of tags,data can be
to
complex
the object rnodel, but
hierarchical structures. This model conceptually reser.nbles
uses different terminology. We discussXML irnd how it is related to databasesin
Chapter 27.
2.7 Summary
In this chapter we introduced the main conceptsused in databasesystems.We
three main categories:
defineda data model and we clistinguished
xs High-levelor conceptualdata rnodels(basedon entitiesand relationships)
s Low-level or physicaldrrtamodels
$s Representationalor irnplementation data models (record-based,objectoriented)
15 . T h e fu ll ch a o te r son the netw ork and hi erarchi calmodel s from the second edi ti onof th s book are
a va ila b leo ve r th e In te rnetfrom thi s book's C ompani onWebsi teat http:,//w w w .aw ,com/el nrasri ,
2.7 Sum m ar v
)LICATE
i \\'ithin
t- tvPe>
rccordO (se tr as t he
. it io n of
\I AG E,
.rrchical
rc ls no
I) L / 1 o f
9 6 5a nd
- 'n'h ig h
cc data)l / 1 has
I / H E RE,
. 9. ,G ET
cation)
' d s (e .9 .,
nin'igaa b r i ef
Jard for
rmbines
Da t a i s
--omplex
,del,but
basesin
We
-'r.ns.
rships)
trbiect-
\\'e distinguished the schema,or description of a database,from the databaseitself.
I'he schema does not changevery often, whereasthe databasestate changesevery
time data is inserted,deleted,or modified. Then we describedthe three-schema
l)BMS architecture,which allows three schemalevels:
.r' An internal schemadescribesthe physical storagestructure of the database.
',' A conceptual schemais a high-level description of the whole database.
,,r External schemasdescribethe views of different user groups.
.\ DBMS that cleanly separatesthe three levels must have mappings between the
\chemasto transform requestsand results from one level to the next. Most DBMSs
do not separatethe three levelscompletely.We used the three-schemaarchitecture
to definethe conceptsoflogical and physicaldata independence.
'l-hen
we discussedthe main typesof languagesand interfacesthat DBMSs support.
.\ datadefinition language(DDL) is usedto definethe databaseconceptualschema.
ln most DBMSs, the DDL also definesuser views and, sometimes,storagestructures; in other DBMSs, separatelanguages(VDL, SDL) may exist for specifying
r i*vs and storagestructures.This distinction is fading away in today'srelational
implementationswith SQL servingas a catchalllanguageto perform multiple roles,
including view definition. The storagedefinition part (SDL) was included in SQL
()\'ermany versions,but has now been relegatedto specialcommandsfor the DBA
in relationalDBMSs. The DBMS compilesall schemadefinitions and storestheir
descriptionsin the DBMS catalog.A data manipulation language(DML) is usedfor
speci$ing databaseretrievalsand updates.DMLs can be high level (set-oriented,
nonprocedural)or low level (record-oriented,procedural).A high-levelDML can
bc' embeddedin a host programming language,or it can be used as a standalone
i.rnguage;in the latter caseit is often called a query language,
\\'e discusseddifferent types of interfacesprovided by DBMSs, and the types of
l)BMS userswith which eachir.rterface
is associated.
Then we discussedthe database
environment,
DBMS
:\'st€rrr
typical
softwaremodules,and DBMS utilities for helping usersand the DBA perform their tasks.We continued with an overviervof the
nvo-tierand three-tierarchitecturesfor databaseapplications,progressively
moving
torvardn-tier, which are now very common in most modern applications,particul.rrly Web databaseapplications.
I-inally,we classifiedDBMSs accordingto severalcriteria: data model, number of
users,number of sites,types of accesspaths,and generality.We discussedthe avail.rbilityof DBMSs and additionalmodules-frorn no costin the form of open source
software,to configurationsthat annuallycostmillions to maintain.\Ve alsopointed
out the variety of licensingarrangementsfor DBMS and relatedproducts. The main
classificationof DBMSs is basedon the data model. We briefly discussedthe main
data models used in current commercial DBMSs and provided an example of the
network data model.
53
Download