LES05

Creating the Physical Model Designing the Physical Model Phase IV: Defining the physical model Database Object Naming Conventions • Keep the logical and physical names similar and descriptive. • Capitalize table and attribute names. • Use underscores instead of spaces to delineate separate words in an object’s name. • Use a suffix of _PK to indicate primary keys. • Use a suffix of _ID to indicate production keys. • Find a good balance between using very specific and very vague names. Database Object Naming Conventions • Develop a reasonable list of abbreviations. • List all the objects’ names, and work with the user community to define them. • Resolve name disputes. • Document your naming standards in the metadata document. • Plan for the naming standards to be a living document. Translating the Dimensional Model into a Physical Model • Apply the naming standards to the tables and attributes of the dimensional model. • List table columns with primary keys listed first. • Label primary keys consistently. • Identify the format and length of columns. • Label unique keys with a (#). • Label column optionality with NULL (o) or NOT NULL (*) constraints. • Label foreign keys with _FK. • Use synonyms for user tables. Physical Model Product # *Product _PK # *Channel_PK # *Promotion_PK n n n * * * * * * * * * * * * * * * v(11) v(125) v(35) v(20) v(25) v(20) v(10) n v(20) v(20) v(10) v(10) v(10) d d PRODUCT_ID PRODUCT_DESC PRODUCT_NAME CATEGORY_ID CATEGORY_DESC SUPPLIER_ID PRODUCT_STATUS LIST_PRICE CATALOG_ID PRODCUT_TYPE PRODUCT_CODE PROMOTION_CODE WHSE_LOCATION VALID_FROM_DATE VALID_TO_DATE Defining the Hardware Transforming the base dimensional data model into the physical model includes some of the following: • Defining naming and database standards • Performing an initial sizing • Designing tablespaces • Defining an initial indexing strategy • Using partitioning to split table and index data into smaller, more manageable chunks • Determining where to place database objects on disk (RAID, striping, disk mapping) • Using parallel processing Architectural Requirements Scalability Manageability Availability Extensibility Flexibility Integrated Accessibility Reliability User Business Budget Technology Architecture Characteristics • Robust • Available • VLM (very large memory) • Reliable • 64-bit • Extensible • Connective • Scalable • Open • Supportable • Recoverable • Parallel Hardware Requirements • SMP (Symmetric multiprocessing) • Cluster and MPP (massively parallel processing) • Hybrids using SMP and MPP Evaluation Criteria Determine the platform for your needs: SMP Clusters MPP High Low Scalability High Maturity Low Parallel Processing • Parallel daily operations • Shared resources – Memory – Disk – Nothing • Loosely or tightly coupled Operating system Hardware Application Database Making the Right Choice • Requirements differ from operational systems • Benchmark – Available from vendors – Develop your own – Use realistic queries • Scalability important Symmetric Multiprocessing (SMP) • Communication by shared memory • Disk controllers accessible to all CPUs • Proven technology CPU CPU CPU CPU Common bus Shared memory Shared disks SMP CPU CPU CPU CPU Benefits: • High concurrency • Workload balancing Shared memory • Moderate scalability • Easy administration Limitations: • Memory (cluster for improvements) • Bandwidth Clusters Node 1 Node 2 Node 3 CPU CPU CPU CPU CPU CPU CPU CPU CPU Shared memory Shared memory Common high-speed bus Shared memory Common high-speed bus Shared disks Clusters • • • • • Shared disk, loosely coupled Dedicated memory High-speed bus Shared resources SMP node Massively Parallel Processing (MPP) CPU CPU CPU CPU Memory Memory Memory Memory Disk Disk Disk Disk MPP nCube Arrangements • • • • • • • A shared nothing architecture Many nodes Fast access Exclusive memory on a node Low cost per node Scalable nCUBE configuration MPP Benefits • • • • • Unlimited incremental growth Very scalable Fast access Low cost per node Good for DSS MPP Limitations • • • • • • Rigid partitioning Cache consistency Restricted disk access High memory cost per node High management burden Careful data placement Architectural Tiers Tiered structures: • Modular • Logical separation DB server Distributed structures: • Two-tier • Three-tier • Four-tier (and more) Apps server Workstations Web server Internet Sample System Architecture Middleware Technologies for integration Gateway Database Server Requirements • • • • • • • • Robust Available Reliable Extensible Scalable Supportable Recoverable Parallel Parallelism • • • • • • • Database Query Load Index Sort Backup Recovery Further Considerations • • • • • • • Optimization strategy Partitioning strategy Summarization strategy Indexing techniques Hardware and software scalability Availability Administration Parallel Processing A large task broken into smaller tasks: • Concurrent execution • One or more processors Elapsed time Not parallel Processor 1 Parallel Processor 1 Processor 2 Processor 3 Processor 4 Parallel Database • Increased speed • Improved scalability • Performance gains – Availability – Flexibility – More users Parallel Processor 1 Processor 2 Processor 3 Processor 4 Parallel Query SQL code split among server processes Subquery Subquery Query Subquery Parallel Load Bypass SQL processing to speed throughput Jan 98 Feb 98 Order table Mar 98 Parallel Processing • Index: reduces the time to create • Sort: allocates memory in cache efficiently Parallel Processing • Backup: runs simultaneously from any node (online and offline) • Recovery: runs simultaneously from redo logs • Summaries: uses the CREATE TABLE AS SELECT statement

LES05

Related documents

Products

Support

LES05

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib