TDDB48 Lecture 1: Introduction (2007-03-20)

Database system
[Figure: users 1-4 send queries and updates to the database management system (DBMS) and receive answers. The database is a model of the world and is updated as the world changes. Inside the DBMS, one layer processes queries and updates and another provides access to the stored data in the physical database.]

[Figure: query processing. An SQL query passes through parsing and validating (producing an intermediate form of the query), the query optimizer (producing an execution plan, also called an access plan), the query code generator (producing code to execute the query) and the runtime DB processor (producing the query result). The system catalogue/data dictionary (DD) holds the metadata, that is, application schema naming and structure information; the stored database holds the application data.]

Example SQL query:
SELECT ORDER_ID, ENTRY_DATE
FROM ORDER
WHERE ENTRY_DATE > '2001-08-30'

Intermediate form and execution plan, written as relational algebra:
π ORDER_ID,ENTRY_DATE (σ ENTRY_DATE>2001-08-30 (ORDER))
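
The intermediate form of the ORDER query, a projection (π) applied to a selection (σ), can be mimicked in a few lines of Python. This is only a sketch of the idea, not how a real DBMS evaluates a plan; the helper names select and project and the sample rows are assumptions made for illustration.

```python
# Minimal sketch of the intermediate form: project(select(ORDER)).
# The helper names and the sample rows are invented for illustration.

def select(rows, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection (pi): keep only the named columns of each row."""
    return [{col: row[col] for col in columns} for row in rows]

# Hypothetical contents of the ORDER table.
orders = [
    {"ORDER_ID": 1, "ENTRY_DATE": "2001-08-15", "CUSTOMER": "A"},
    {"ORDER_ID": 2, "ENTRY_DATE": "2001-09-02", "CUSTOMER": "B"},
    {"ORDER_ID": 3, "ENTRY_DATE": "2001-10-11", "CUSTOMER": "A"},
]

# ISO dates (YYYY-MM-DD) compare correctly as plain strings.
recent = select(orders, lambda row: row["ENTRY_DATE"] > "2001-08-30")
result = project(recent, ["ORDER_ID", "ENTRY_DATE"])
print(result)
```

Applying the selection first and the projection last mirrors the tree in the figure, where σ sits directly above ORDER and π is the final step.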

Basic Definitions
• Database: A collection of related data.
• Data: Known facts that can be recorded and have an implicit meaning.
• Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.
• Database Management System (DBMS): A software package/system to facilitate the creation and maintenance of a computerized database.
• Database System: The DBMS software together with the data itself. Sometimes, the applications are also included.

Typical DBMS Functionality
• Define a database: in terms of data types, structures and constraints
• Construct or load the database on a secondary storage medium
• Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content
• Concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent
Typical DBMS Functionality
Other features:
– Protection or Security measures to
prevent unauthorized access
– “Active” processing to take internal
actions on data
– Presentation and Visualization of data

Information retrieval (IR) on the Internet
1. Locate document collections
2. Formulate query
3. Judge relevance.
Traditional IR research and development has been concentrated on 2 and 3. The Internet (the web) requires 1 too.

IRS, DBMS, and AI
                 IRS                         DBMS                        AI
Data object      document                    table                       logic expressions
Basic function   retrieval (probabilistic)   retrieval (deterministic)   inference
Database size    small to very large         small to very large         usually small

DBMS
• deterministic
SQL> select * from kund
where nummer = 17;
• meets the exact information need of the
user
• Cf.: search for memory stick in discussion
forums

Lab policy
• You are expected to do the lab assignments by yourself. Merely copying others' solutions will not be tolerated, even if you make cosmetic changes to the code/solution. If we suspect that this, or any other form of cheating, has happened, we will report it to the disciplinary board of the university.
• Be prepared to be asked questions by your lab assistant about detailed and specific code, and about why you have selected a specific solution. This applies to all lab group members.
• If you have problems meeting a deadline, it is much better to talk to the instructor about it than to cheat.
(It is a shame that we have to say these things. They should be obvious.)

LiU: Disciplinary actions
• Any kind of academic dishonesty, such as cheating, plagiarism, use of unauthorized assistance, fraud and failure to comply with University examination rules, may result in the filing of a complaint to the University Disciplinary Committee. The potential penalties include expulsion, suspension, and revocation of a previously earned grade or degree.
• LiU Rules and regulations

Historical Development of Database Technology
• Early Database Applications: The Hierarchical and Network Models were introduced in the mid-1960s and dominated during the seventies. The bulk of worldwide database processing still occurs using these models.
• Relational Model based Systems: The model, originally introduced in 1970, was heavily researched and experimented with at IBM and the universities. Relational DBMS products emerged in the 1980s.

Historical Development of Database Technology
• Object-oriented applications: OODBMSs were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications. Their use has not taken off much.
• Data on the Web and E-commerce Applications: The Web contains data in HTML (Hypertext Markup Language) with links among pages. This has given rise to a new set of applications, and E-commerce is using new standards like XML (eXtensible Markup Language).

Why a DBMS?
Example, customer register in C:
struct kund {
    int nummer;
    char namn[50 + 1];
    char adress[50 + 1];
    struct kund* nextp; };

Why a DBMS: Simple
create table kund
(nummer integer,
namn char(50),
adress char(50));

select namn, adress
from kund
where nummer = 17;
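
To try the kund table and the lookup of customer 17 without installing a database server, one option is Python's built-in sqlite3 module. This is a minimal sketch under that assumption; the two sample customers are invented, and SQLite is only a stand-in for whichever DBMS the course uses.

```python
import sqlite3

# An in-memory SQLite database stands in for the course DBMS (assumption).
conn = sqlite3.connect(":memory:")

# Same table definition as on the slide.
conn.execute("create table kund (nummer integer, namn char(50), adress char(50))")

# Invented sample customers, for illustration only.
conn.executemany(
    "insert into kund values (?, ?, ?)",
    [(16, "Anders", "Vägen 8"), (17, "Sara", "Gatan 3")],
)

# The query from the slide, with the customer number passed as a parameter.
rows = conn.execute(
    "select namn, adress from kund where nummer = ?", (17,)
).fetchall()
print(rows)
conn.close()
```

Compared with the C struct, no list traversal or memory management has to be written by hand: the query states what is wanted and the DBMS decides how to find it.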

Why a DBMS: Powerful
select *
from kund
where namn like 'S%'
order by adress;

select namn
from kund
where adress = 'Vägen 8'
and namn like 'S%';

select adress, count(*)
from kund
where namn = 'Anders'
group by adress;

Why a DBMS: Flexible
alter table kund add telefon char(10);
create index foo on kund(namn);
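
The grouping query and the two schema changes can be exercised the same way with Python's sqlite3 module. This is a sketch with invented sample rows; an order by clause is added to the grouping query so that the output order is predictable, and the alter table statement spells out the optional column keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table kund (nummer integer, namn char(50), adress char(50))")

# Invented sample customers, for illustration only.
conn.executemany(
    "insert into kund values (?, ?, ?)",
    [
        (1, "Anders", "Vägen 8"),
        (2, "Sara", "Vägen 8"),
        (3, "Anders", "Gatan 3"),
        (4, "Anders", "Vägen 8"),
    ],
)

# Powerful: count the customers named Anders per address.
counts = conn.execute(
    "select adress, count(*) from kund "
    "where namn = 'Anders' group by adress order by adress"
).fetchall()

# Flexible: add a column and an index while the data stays in place.
conn.execute("alter table kund add column telefon char(10)")
conn.execute("create index foo on kund(namn)")

# Column names after the change; existing rows get NULL for telefon.
columns = [row[1] for row in conn.execute("pragma table_info(kund)")]
print(counts, columns)
conn.close()
```

The application code that reads namn and adress keeps working after the new column and index are added, which is the flexibility the slide points at.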

More: Why a DBMS?
• Data independence
• Multiple simultaneous users
• Persistence in case of failure
• Data modelling.

Multiple simultaneous users
[Figure: Pelle sums the salary cost while Kalle updates the salaries of 1000 employees in the same database.]

Persistence in case of failure
[Figure: Kalle is updating the salaries of 1000 employees in the database when a power failure occurs.]

DBMS: Summary of advantages
– Control of redundant information
– Data access
– Persistent data storage
– Allows queries and analysis
– Allows multiple users
– Represents multiple users
– Efficient storage of data
– Integrity constraints
– Backup and recovery
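
The salary update interrupted by a power failure can be imitated with a transaction that is rolled back half-way. This sketch uses Python's sqlite3 module with an in-memory database, so it illustrates the all-or-nothing behaviour of a transaction rather than real crash recovery; the table anstalld, its columns and the function raise_salaries are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical employee table: 1000 employees, each earning 20000.
conn.execute("create table anstalld (nummer integer primary key, lon integer)")
conn.executemany(
    "insert into anstalld values (?, ?)", [(n, 20000) for n in range(1, 1001)]
)
conn.commit()

def raise_salaries(connection, fail_after=None):
    """Raise every salary by 1000; optionally simulate a failure part-way."""
    numbers = connection.execute("select nummer from anstalld").fetchall()
    try:
        for i, (nummer,) in enumerate(numbers):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated power failure")
            connection.execute(
                "update anstalld set lon = lon + 1000 where nummer = ?", (nummer,)
            )
        connection.commit()  # all 1000 updates become visible together
    except RuntimeError:
        connection.rollback()  # none of the partial updates survive

raise_salaries(conn, fail_after=500)
total_after_failure = conn.execute("select sum(lon) from anstalld").fetchone()[0]

raise_salaries(conn)
total_after_success = conn.execute("select sum(lon) from anstalld").fetchone()[0]
print(total_after_failure, total_after_success)
conn.close()
```

After the simulated failure the total is unchanged, so someone summing the salary cost never sees a state where only half of the employees got their raise.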

Categories of data models
• Conceptual (high-level, semantic)
• Implementation (representational)
• Physical (low-level, internal)
• The data model implies the schema, which implies what type of data can be stored

History of Data Models
• Network Model: the first one to be implemented, by Honeywell in 1964-65 (IDS System). Adopted heavily due to the support by CODASYL (CODASYL - DBTG report of 1971). Later implemented in a large variety of systems: IDMS (Cullinet, now CA), DMS 1100 (Unisys), IMAGE (H.P.), VAX-DBMS (Digital Equipment Corp.).
• Hierarchical Data Model: implemented in a joint effort by IBM and North American

History of Data Models
• Object-oriented Data Model(s): several models have been proposed for implementation in a database system since the 1980s. One set comprises models of persistent OO programming languages such as C++ (e.g., in OBJECTSTORE or VERSANT) and Smalltalk (e.g., in GEMSTONE). Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P. - used in Open OODB).
• Object-Relational Models: Started with Informix Universal Server in the 1990s. Exemplified in the latest versions of Oracle 10g, DB2, SQL Server and similar systems.
• XML-based Models in the 2000s

Network Model
• ADVANTAGES:
• Able to model complex relationships and represent the semantics of add/delete on the relationships.
• Can handle most situations for modeling using record types and relationship types.
• Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc. Programmers can do optimal navigation through the database.
• DISADVANTAGES:
• Navigational and procedural nature of processing
• Database contains a complex array of pointers that thread through a set of records.
• Little scope for automated "query optimization"

Hierarchical Model
• ADVANTAGES:
• Simple to construct and operate on
• Corresponds to a number of natural hierarchically organized domains, e.g., assemblies in manufacturing, personnel organization in companies
• Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.
• DISADVANTAGES:
• Navigational and procedural nature of processing
• Database is visualized as a linear arrangement of records
• Little scope for "query optimization"

The Relational Model
• Data is stored as tables
• Theoretical model
• Standardized query language
• In the beginning, however, these databases were slow; the hierarchical databases were faster.

The three-schema architecture
• Different schemas at different levels
• Data independence between the levels
[Figure: three levels, with several views (Vy) at the top, the conceptual level in the middle and the physical level at the bottom.]

Database users and roles
• Database administrator
• Database designer
• End user
• Application programmer
• DBMS designer
• Tool developer
• Operator, maintenance

Database languages
• Data Definition Language - DDL
– Specifies the conceptual schema
• Data Manipulation Language - DML
– Stores and retrieves data
• Data Control Language - DCL
– Controls database execution
• Host language
– Extension to a programming language

Data models today
• Relational databases are the most common
• Hierarchical databases still exist (mainly in the aviation industry)
• Object-oriented and object-relational databases make up a small share
• XML databases: new.

ER modelling
[Figure: attributes describing a person: Personnummer (personal identity number), Namn (name), Telefon (phone), Adress (address), E-post (e-mail), Ålder (age).]

ER diagrams
• A structured way to model data
• Independent of database type
• Documentation of your data structure.

Symbols in ER diagrams
[Figure: ER symbols for entity (Anställd, employee), attribute, candidate keys, composite attribute, derived attribute and multivalued attribute. Attribute labels in the figure: PNummer, E-post, AnstÅr, Namn, FNamn, ENamn, Age, Free.]

Relationships
[Figure: entity types Anställd (employee) and Avdelning (department) connected by the relationship type Arbetar på (works at); key attributes Anum and Pnum.]
"Employees work at departments"

Total participation
"Every department must have at least one employee"

Total participation, cont.
"Every employee must work at a department"

Cardinality: restrictions on number
[Figure: Anställd and Avdelning connected by Arbetar på, marked 1 and 1.]
"Every department has exactly one employee and every employee works at exactly one department"

Restrictions on number, cont.
[Figure: Anställd and Avdelning connected by Arbetar på, marked N and 1.]
"Every department can have many employees but every employee can only work at one department"

Restrictions on number, cont.
[Figure: Anställd and Avdelning connected by Arbetar på, marked N and M.]
"Every department can have many employees and every employee can work at several departments"

Restrictions on number, cont.
[Figure: Anställd and Avdelning connected by Arbetar på, with the (min,max) annotations (1,1) and (1,100).]
"Every department can have up to 100 employees but every employee can only work at one department"

Weak entities
[Figure: Anställd drawn as a weak entity of Avdelning, marked N and 1.]
"Employees are identified through their department, e.g. 'Kalle in sales'"
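
One common way to carry the N:1 case with total participation of the employee ("every employee must work at exactly one department") over to SQL is a foreign key column that may not be NULL. The sketch below uses Python's sqlite3 module; the table and column names (avdelning, anstalld, avdelning_id, anstalld_id) and the helper try_insert are hypothetical and not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("pragma foreign_keys = on")

# Hypothetical tables for the entity types Avdelning and Anställd.
conn.execute("create table avdelning (avdelning_id integer primary key)")
conn.execute(
    """
    create table anstalld (
        anstalld_id integer primary key,
        -- not null: every employee must belong to a department
        -- single column: at most one department per employee (N:1)
        avdelning_id integer not null references avdelning(avdelning_id)
    )
    """
)

conn.execute("insert into avdelning values (1)")
conn.execute("insert into anstalld values (10, 1)")
conn.execute("insert into anstalld values (11, 1)")  # many employees per department

def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

no_department = try_insert("insert into anstalld values (12, null)")
unknown_department = try_insert("insert into anstalld values (13, 99)")
print(no_department, unknown_department)
```

The opposite constraint, that every department must have at least one employee, cannot be stated with a plain foreign key and usually needs a trigger or an application-level check.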

SUMMARY OF NOTATION FOR ER SCHEMAS
[Figure: table of symbols and their meanings. Meanings listed: ENTITY TYPE, WEAK ENTITY TYPE, RELATIONSHIP TYPE, IDENTIFYING RELATIONSHIP TYPE, ATTRIBUTE, KEY ATTRIBUTE, MULTIVALUED ATTRIBUTE, COMPOSITE ATTRIBUTE, DERIVED ATTRIBUTE, TOTAL PARTICIPATION OF E2 IN R, CARDINALITY RATIO 1:N FOR E1:E2 IN R, STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R.]

Example
• Students study in study programmes and take a number of courses. Each course is identified by a course code and gives the student a number of earned credits.

PROBLEM with ER notation
THE ENTITY RELATIONSHIP MODEL IN ITS ORIGINAL FORM DID NOT SUPPORT THE SPECIALIZATION/GENERALIZATION ABSTRACTIONS

Extended Entity-Relationship (EER) Model
• Incorporates Set-subset relationships
• Incorporates Specialization/Generalization Hierarchies
How the ER model can be extended with set-subset relationships and specialization/generalization hierarchies, and how to display them in EER diagrams

Example: Two types of employees
[Figure: Tekniker (technician) and Administratör (administrator) drawn as separate entity types, each with the attributes ANummer, Telefon and Lön; the attributes Språk and System also appear in the figure.]

[Figure: Anställd (employee) with the attributes ANummer, Lön and Telefon, specialized through a disjoint (d) constraint into the subclasses Tekniker and Administratör; Språk and System appear as subclass attributes.]
"Employees can be technicians or (XOR) administrators"

[Figure: the same disjoint (d) specialization, now mandatory for every employee.]
"Employees must be either technicians or (XOR) administrators"

[Figure: the same specialization with an overlapping (o) constraint; a second diagram adds AdmTekn with the attribute Procent.]
"There can be employees who are both technicians and administrators"
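
The case "employees must be either technicians or (XOR) administrators" can be approximated in SQL with a single table and a type column. This is one of several possible mappings, sketched here with Python's sqlite3 module; the column typ, its two allowed values and the helper accepted are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table mapping of the Anställd specialization.
conn.execute(
    """
    create table anstalld (
        anummer integer primary key,
        lon integer,
        telefon text,
        -- not null: every employee belongs to a subclass (mandatory)
        -- one value per row: an employee cannot be in both (disjoint)
        typ text not null check (typ in ('tekniker', 'administrator'))
    )
    """
)

def accepted(row):
    """Return True if the row satisfies the constraints, False otherwise."""
    try:
        conn.execute("insert into anstalld values (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok_technician = accepted((1, 30000, "013-111111", "tekniker"))
ok_administrator = accepted((2, 28000, "013-222222", "administrator"))
rejected_other = accepted((3, 25000, "013-333333", "chef"))
rejected_missing = accepted((4, 25000, "013-444444", None))
print(ok_technician, ok_administrator, rejected_other, rejected_missing)
```

Dropping the not null constraint would give the optional variant ("employees can be technicians or administrators"), while the overlapping case needs a different design, for example one boolean column per subclass.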

Example
• At the university there are two types of students, doctoral students and undergraduate students, and a student cannot belong to both categories. Which courses are allowed depends on the category the student belongs to. Some courses are only for doctoral students, some are for undergraduate students and some are for all types of students.

UML Example for Displaying Specialization / Generalization

Alternative Diagrammatic Notations
[Figure: alternative notations, in five panels: symbols for entity type / class, attribute and relationship; displaying attributes; displaying cardinality ratios; various (min, max) notations; notations for displaying specialization / generalization.]