Uploaded by Rajesh Kanna

datawarehouse 2mm.docx

1) Define a data warehouse with diagram.
A data warehouse is a centralized repository that integrates data from multiple source systems and
stores it, including historical data, for querying and analysis. It is designed to support business
intelligence (BI) activities, especially analytics.
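2) What is a virtual warehouse?
In data warehouse design, a virtual warehouse is a set of views defined over operational databases.
It is easy to build but places extra load on the operational database servers. In some cloud data
platforms, the term instead refers to a compute environment that dynamically allocates and scales
computing resources for data analysis without the need for physical infrastructure.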
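3) How is a data warehouse different from a database?
A data warehouse is specialized for reporting and analysis: it integrates data from several sources
and stores historical data for decision-making. In contrast, a database is a broader term for
structured data storage and retrieval, and in practice usually means an operational system optimized
for recording current transactions.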
4) Write the tangible and intangible benefits of a data warehouse.
Tangible benefits of a data warehouse are those that can be measured in money, such as cost savings
and revenue growth. Intangible benefits are harder to quantify and include improved decision-making,
enhanced customer satisfaction through better insights, and better strategic planning.
5) Differentiate ETL and ELT.
ETL (Extract, Transform, Load) extracts data from source systems, transforms it outside the data
warehouse, and then loads the transformed result. ELT (Extract, Load, Transform) loads the raw
extracted data into the data warehouse first and then transforms it inside the warehouse itself.
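The difference in ordering can be sketched in Python. This is only an illustration under stated
assumptions: an in-memory SQLite database stands in for the warehouse, and the table names and
sample rows (SOURCE_ROWS, sales_etl, sales_raw, sales_elt) are hypothetical.

```python
import sqlite3

# Hypothetical raw source rows: (order_id, amount as text, country code)
SOURCE_ROWS = [(1, " 10.50", "in"), (2, "4.25 ", "us"), (3, "7.00", "in")]


def run_etl(conn):
    """ETL: clean the rows in application code, then load the finished result."""
    cleaned = [(oid, float(amt.strip()), cc.upper()) for oid, amt, cc in SOURCE_ROWS]
    conn.execute("CREATE TABLE sales_etl (order_id INTEGER, amount REAL, country TEXT)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows untouched, then transform with SQL in the warehouse."""
    conn.execute("CREATE TABLE sales_raw (order_id INTEGER, amount TEXT, country TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT order_id, CAST(TRIM(amount) AS REAL) AS amount, "
        "UPPER(country) AS country FROM sales_raw"
    )


conn = sqlite3.connect(":memory:")  # in-memory SQLite stands in for the warehouse
run_etl(conn)
run_elt(conn)
etl_rows = conn.execute("SELECT * FROM sales_etl ORDER BY order_id").fetchall()
elt_rows = conn.execute("SELECT * FROM sales_elt ORDER BY order_id").fetchall()
```

Both paths end with the same cleaned rows. The ELT path additionally keeps the raw table
(sales_raw) inside the warehouse, so the data can be transformed again later in a different way.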
6) What is MOLAP?
MOLAP (Multidimensional Online Analytical Processing) is a type of OLAP that organizes data into
multidimensional cubes, providing efficient and fast analysis capabilities.
7) List any four tools for performing OLAP.
Four tools for performing OLAP are Microsoft Excel PivotTables, IBM Cognos, Oracle OLAP, and SAP BW.
Each enables interactive analysis of multidimensional data.
8) Differentiate the characteristics of OLTP and OLAP.
OLTP (Online Transaction Processing) handles real-time transactional data with many short read and
write operations, ensuring data integrity. OLAP (Online Analytical Processing) is designed for
complex, mostly read-only queries over large volumes of historical data, supporting decision-making.
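9) Define apex cuboid.
The apex cuboid is the 0-D cuboid at the top of the lattice of cuboids in a data cube. It holds the
highest level of summarization, a single aggregate over all dimensions, and is often denoted "all".
At the opposite end of the lattice is the base cuboid, which holds the least summarized data.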
10) List the OLAP operations on multidimensional data.
OLAP operations on multidimensional data include Roll-up, Drill-down, Slice, Dice, and Pivot (also
called Rotate). Roll-up and drill-down move between coarser and finer levels of detail, slice and
dice select sub-cubes, and pivot reorients the axes of the view.
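The operations above can be sketched in plain Python on a tiny fact table. The rows, the
dimension names, and the helper functions aggregate and select are hypothetical and exist only for
this illustration; a real OLAP server would work on pre-built cubes rather than raw rows.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
FACTS = [
    (2023, "Q1", "North", "Pen", 100),
    (2023, "Q1", "South", "Pen", 80),
    (2023, "Q2", "North", "Ink", 50),
    (2024, "Q1", "North", "Pen", 120),
    (2024, "Q1", "South", "Ink", 70),
]
DIMS = ("year", "quarter", "region", "product")


def aggregate(rows, dims):
    """Sum sales grouped by the named dimensions (fewer dims = coarser view)."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def select(rows, **allowed):
    """Keep rows whose dimension values fall in the allowed sets."""
    return [r for r in rows
            if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())]


roll_up = aggregate(FACTS, ["year"])                    # coarser: totals per year
drill_down = aggregate(FACTS, ["year", "quarter"])      # finer: totals per quarter
slice_2023 = select(FACTS, year={2023})                 # fix one value on one dimension
dice = select(FACTS, year={2023, 2024}, region={"North"}, product={"Pen"})
by_region_year = aggregate(FACTS, ["region", "year"])   # pivot: region first, then year
```

Here roll_up gives 230 for 2023 and 190 for 2024, while drill_down splits the 2023 total into 180
for Q1 and 50 for Q2.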
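11) Why do we need a multidimensional database?
We need multidimensional databases because business measures such as sales are naturally analyzed
along several dimensions at once, for example time, product, and region. Organizing data this way
aligns with real-world business scenarios and makes aggregation across dimensions efficient, which
enables better analysis and reporting of business metrics.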
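12) Differentiate horizontal parallelism and vertical parallelism.
Horizontal parallelism partitions the data across multiple disks, nodes, or servers so that the same
task runs concurrently on different processors, each against a different set of rows. Vertical
parallelism runs the different tasks of a query (such as scan, join, and sort) concurrently in a
pipeline, with the output of one task feeding the next as it is produced. Both aim to improve
processing efficiency.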
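Horizontal parallelism can be sketched in Python as follows. This is a minimal illustration, not
how a parallel database is implemented: worker threads stand in for separate nodes or processors,
and the PARTITIONS data and the partial_sum function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales amounts, split row-wise into three partitions
PARTITIONS = [[10, 20, 30], [5, 15], [40, 60, 100]]


def partial_sum(partition):
    """The same operation, applied to one partition of the rows."""
    return sum(partition)


# Worker threads stand in for the separate nodes or processors
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(partial_sum, PARTITIONS))

total = sum(partials)  # combine the per-partition results
```

Each worker computes the sum for its own partition, and the partial results (60, 20, and 200) are
then combined into the overall total of 280.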
13) Define data partitioning and its types.
Data partitioning involves dividing a large dataset into smaller, more manageable parts. Types of data
partitioning include Range Partitioning, Hash Partitioning, List Partitioning, and Composite
Partitioning, each serving different optimization purposes.
14) List the needs for a data mart.
Data marts are needed for specific business units or departments to cater to their analytical needs.
They provide a focused subset of data from the larger data warehouse, improving performance and
relevance for targeted analysis and reporting.
15) Write the tiers in a 3-tier architecture.
In a 3-tier data warehouse architecture:
- Bottom Tier (Data): The warehouse database server, which stores and manages the data.
- Middle Tier (Application): The OLAP server, which handles the analytical processing logic.
- Top Tier (Presentation): Front-end tools for querying, reporting, and analysis, which handle user
interaction.
16) Define a data mart and give an example.
A data mart is a subset of a data warehouse focused on a specific business function or department.
Example: a Sales data mart for in-depth analysis of sales-related data.
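As a rough sketch of the idea, a sales data mart can be derived from a larger warehouse table by
keeping only the rows and columns that one department needs. The example assumes an in-memory
SQLite database as a stand-in for the warehouse, and the table names (warehouse_facts, sales_mart)
and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory SQLite stands in for the warehouse
conn.execute("CREATE TABLE warehouse_facts (dept TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [("sales", "Pen", 100.0), ("hr", "Training", 300.0), ("sales", "Ink", 50.0)],
)

# The sales data mart keeps only the rows the sales department needs
conn.execute(
    "CREATE TABLE sales_mart AS "
    "SELECT item, amount FROM warehouse_facts WHERE dept = 'sales'"
)
mart_rows = conn.execute("SELECT item, amount FROM sales_mart ORDER BY item").fetchall()
```

The mart contains only the two sales rows, so queries from the sales team scan less data than they
would against the full warehouse table.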
17) What are the types of partitioning strategy?
Types of partitioning strategies include:
- Range Partitioning: Assigns rows to partitions based on specified ranges of a key, such as dates.
- Hash Partitioning: Distributes rows across partitions using a hash function on a key.
- List Partitioning: Assigns rows to partitions based on predefined lists of key values.
- Composite Partitioning: Combines multiple partitioning methods, for example range then hash.
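The four strategies can be sketched as small Python functions that decide which partition a row
belongs to. The partition names, the REGION_LISTS mapping, and the choice of keys (order date,
customer id, country code) are hypothetical and chosen only for illustration.

```python
import zlib
from datetime import date


def range_partition(order_date):
    """Range: pick the partition by the yearly range the date falls into."""
    return f"orders_{order_date.year}"


def hash_partition(customer_id, n_partitions=4):
    """Hash: spread keys over partitions using a stable hash of the key."""
    return zlib.crc32(str(customer_id).encode()) % n_partitions


# Hypothetical explicit value lists, one per partition
REGION_LISTS = {"asia": {"IN", "JP"}, "europe": {"DE", "FR"}}


def list_partition(country):
    """List: pick the partition whose explicit value list contains the key."""
    for name, members in REGION_LISTS.items():
        if country in members:
            return name
    return "other"  # catch-all for values in no list


def composite_partition(order_date, customer_id):
    """Composite: range by year first, then hash within that range."""
    return (range_partition(order_date), hash_partition(customer_id))
```

For example, range_partition(date(2024, 3, 1)) returns "orders_2024" and list_partition("JP")
returns "asia". The hash variant uses zlib.crc32 rather than Python's built-in hash so that the
same key maps to the same partition on every run.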