Conceptual Architecture of PostgreSQL

Overview
• What is PostgreSQL?
• Research Methods
• Considered Alternatives
• Reference Architecture
• Conceptual Architecture
• Inside Subsystems – Query Processor
• Inside Subsystems – Storage Manager
• Inside Subsystems – Utilities
• Use Case
• Concurrency Control
• Design Trade-offs
• Limitations of Research
• Lessons Learned
• Summary
• Q&A
What is PostgreSQL?
• Open-source database management system
• Successor to the ‘Ingres Project’ at UC Berkeley
• First release under the PostgreSQL name (version 6.0) in 1997
• Cross-platform
• Written in C
• Used by organisations such as:
– Yahoo
– MySpace
– Skype
Research Methods
General understanding of PostgreSQL
– Developer's guide
– PostgreSQL wiki page
– PostgreSQL manual
– Wikipedia
Reference architecture for Database Management Systems
- Backbone of conceptual architecture
Conceptual architecture for PostgreSQL
- Various online documentation of conceptual architectures of PostgreSQL
Considered Alternatives
1. Client – Server
2. Client – Server w/ Pipe & Filter
3. Client – Server w/ Pipeline & Repository
Reference Architecture
Figure 1.
Conceptual Architecture
Components: Client Communications Manager; Server (Query Processor); Storage Manager; Utilities & Shared Components
Legend: Dependencies
Figure 2.
Query Processor
Figure 3.
Inside Subsystems
Query Processor
• Consists of:
– Parser: checks syntax
– Traffic Cop: separates simple from complex queries
– Utility Command: handles simple queries
– Rewriter: rule augmentation
– Planner/Optimizer: finds the optimal plan
– Executor: executes the optimal plan
• Models a Pipe & Filter style Architecture
• Uses storage management & shared utilities
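The pipe & filter style named above can be sketched as a chain of functions, each consuming the previous stage's output. This is a minimal illustration only: the stage names follow the slide, but the function bodies, the `State` dictionary and `run_pipeline` are hypothetical stand-ins, not PostgreSQL internals. The Traffic Cop and Utility Command branch is left out to keep the pipeline linear.

```python
from functools import reduce
from typing import Callable, Dict, List

# Hypothetical types and stages: names mirror the slide, bodies are toy
# stand-ins and do not reflect PostgreSQL's real implementation.
State = Dict[str, object]
Filter = Callable[[State], State]


def parse(state: State) -> State:
    """Toy syntax check: reject empty input and split the SQL text into tokens."""
    sql = str(state["sql"]).strip().rstrip(";")
    if not sql:
        raise ValueError("empty query")
    return {**state, "tokens": sql.split()}


def rewrite(state: State) -> State:
    """Toy rule augmentation: normalise the leading keyword to upper case."""
    tokens = list(state["tokens"])
    tokens[0] = tokens[0].upper()
    return {**state, "tokens": tokens}


def plan(state: State) -> State:
    """Toy planner: always choose one fixed plan over the last token."""
    return {**state, "plan": f"SeqScan({state['tokens'][-1]})"}


def execute(state: State) -> State:
    """Toy executor: report which plan would have been run."""
    return {**state, "result": f"executed {state['plan']}"}


def run_pipeline(sql: str, filters: List[Filter]) -> State:
    """Feed the output of each filter into the next one, in order."""
    return reduce(lambda state, f: f(state), filters, {"sql": sql})


PIPELINE: List[Filter] = [parse, rewrite, plan, execute]

if __name__ == "__main__":
    print(run_pipeline("select * from users;", PIPELINE)["result"])
```

Each filter only knows the shape of the data passed to it, so a stage can be swapped or tested on its own, which is the property the pipe & filter style is chosen for.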
Inside Subsystems
Storage Manager
Provides shared memory for buffers & access to the database.
Suggests repository style
Figure 4.
Inside Subsystems
Utilities
Consists of:
– Utilities
– Catalog
– Access Methods
– Nodes/Lists
Utilities are used by all sub-components of the query processor
Figure 5.
Use Case – Select Query
Figure 6.
Concurrency Control
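Postmaster spawns a separate backend server process for each client connection (processes, not threads)
Problem - concurrent sessions overwriting or modifying the same data
Solution:
- MVCC – Multi-version concurrency control
- Each transaction works from a point-in-time snapshot of the database
- Locks – table-level locks keep an entire table from being altered or dropped while in use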
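The snapshot idea behind MVCC can be sketched with a toy versioned store. This is a simplified illustration under stated assumptions: `ToyMVCCStore`, its commit counter and its method names are hypothetical, and PostgreSQL's real mechanism is considerably more involved.

```python
from typing import Dict, List, Optional, Tuple


class ToyMVCCStore:
    """Hypothetical multi-version store; not how PostgreSQL stores rows."""

    def __init__(self) -> None:
        # key -> list of (commit sequence number, value), oldest first
        self._versions: Dict[str, List[Tuple[int, str]]] = {}
        self._commit_seq = 0

    def snapshot(self) -> int:
        """Return a point-in-time marker: everything committed so far is visible."""
        return self._commit_seq

    def commit_write(self, key: str, value: str) -> int:
        """Append a new version instead of overwriting the old one."""
        self._commit_seq += 1
        self._versions.setdefault(key, []).append((self._commit_seq, value))
        return self._commit_seq

    def read(self, key: str, snapshot: int) -> Optional[str]:
        """Return the newest version that was committed at or before the snapshot."""
        for seq, value in reversed(self._versions.get(key, [])):
            if seq <= snapshot:
                return value
        return None


if __name__ == "__main__":
    store = ToyMVCCStore()
    store.commit_write("balance", "100")
    reader_snapshot = store.snapshot()      # reader starts here
    store.commit_write("balance", "80")     # a later writer commits
    print(store.read("balance", reader_snapshot))   # reader still sees the old version
    print(store.read("balance", store.snapshot()))  # a new reader sees the new version
```

Because writers add versions rather than overwrite them, a reader holding an older snapshot is not blocked by, and does not observe, later writes.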
Design Trade-offs
Reliability vs Performance
Scalability vs Maintainability
Security vs Performance
Limitations of Research
Limited personal knowledge of, and experience with, architectures & databases
Determining depth of research
Sources are incomplete
Lessons Learned
Cannot rely on one source for information; several sources are needed to build a complete picture
Hard to decide on an architecture style
The value of the reference architecture
Summary
Hybrid Conceptual Architecture
Client Server – front/back connection
Pipe & Filter – back end processes
Repository – storage management/access
Design Attributes
Reliable & Secure
- data integrity, strict SQL compliance, user authentication
Performance
- slower and more complicated, the cost of reliability and security
Thank You!
Questions?