Uploaded by azsuhas

ETL Interview qus

advertisement
Q #1. What is ETL?
Ans. ETL refers to Extracting, Transforming and Loading of Data from any outside system to the
required place. These are the basic 3 steps in the Data Integration process. Extracting means
locating the Data and removing from the source file, transforming is the process of transporting
it to the required target file and Loading the file in the target system in the format applicable.
Q #2. Why ETL testing is required?
Ans.
To keep a check on the Data which are being transferred from one system to the other.
To keep a track on the efficiency and speed of the process.
To be well acquainted with the ETL process before it gets implemented into your business and production.
Q #3. What are ETL tester responsibilities?
Ans.
Requires in depth knowledge on the ETL tools and processes.
Needs to write the SQL queries for the various given scenarios during the testing phase.
Should be able to carry our different types of tests such as Primary Key, defaults and keep a
check on the other functionality of the ETL process.
Quality Check
Q #4. What are Dimensions?
Ans. Dimensions are the groups or categories through which the summarized data are sorted.
Q #5. What is Staging area referring to?
Ans. Staging area is the place where the data is stored temporarily in the process of Data
Integration. Here, the data s cleansed and checked for any duplication.
Q #6. Explain ETL Mapping Sheets.
Ans. ETL mapping sheets contains all the required information from the source file including
all the rows and columns. This sheet helps the experts in writing the SQL queries for the ETL tools testing.
Q #7. Mention few Test cases and explain them.
Ans.
Mapping Doc Validation – Verifying if the ETL information is provided in the Mapping Doc.
Data Check – Every aspect regarding the Data such as Data check, Number Check, Null check are tested in this case
Correctness Issues – Misspelled Data, Inaccurate data and null data are tested.
Q #8. List few ETL bugs.
Ans. Calculation Bug, User Interface Bug, Source Bugs, Load condition bug, ECP related bug.
In addition to the above ETL testing questions, there may be other vital questions where you
may be asked to mention the ETL tools which you have used earlier. Also, you might be asked
about any debugging issues you have faced in your earlier experience or about any real time
experience.
**********ETL Testing Interview Questions and answers******************
Question.1 What is a Data Warehouse?
Answer: A Data Warehouse is a collection of data marts representing historical data from
different operational data source (OLTP). The data from these OLTP are structured and
optimized for querying and data analysis in a Data Warehouse.
Question.2 What is a Data mart?
Answer: A Data Mart is a subset of a data warehouse that can provide data for reporting
and analysis on a section, unit or a department like Sales Dept, HR Dept, etc. The Data Mart
are sometimes also called as HPQS (Higher Performance Query Structure).
Like or Follow or Subscribe To Unlock The Content
like
follow us
Youtube
Question.3 What is OLAP?
Answer: OLAP stands for Online Analytical Processing. It uses database tables
(Fact and Dimension tables) to enable multidimensional viewing, analysis and querying of large amount of data.
Question.4 What is OLTP?
Answer: OLTP stands for Online Transaction Processing Except data warehouse databases the other databases are OLTPs. These OLTP uses normalized schema structure. These OLTP
databases are designed for recording the daily operations and transactions of a business.
Question.5 What are Dimensions?
Answer: Dimensions are categories by which summarized data can be viewed. For example a profit Fact table can be viewed by a time dimension.
Question.6 What are Confirmed Dimensions?
Answer: The Dimensions which are reusable and fixed in nature Example customer, time, geography dimensions.
Question.7 What are Fact Tables?
Answer: A Fact Table is a table that contains summarized numerical (facts) and historical data. This Fact Table has a foreign key-primary key relation with a dimension table. The Fact Table
maintains the information in 3rd normal form.
A star schema is defined is defined as a logical database design in which there will be a
centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.
Question.8 What are the types of Facts?
Answer: The types of Facts are as follows.
1. Additive Facts: A Fact which can be summed up for any of the dimension available in the
fact table.
2. Semi-Additive Facts: A Fact which can be summed up to a few dimensions and not for all
dimensions available in the fact table.
3. Non-Additive Fact: A Fact which cannot be summed up for any of the dimensions
available in the fact table.
Question.9 What are the types of Fact Tables?
Answer: The types of Fact Tables are:
1. Cumulative Fact Table: This type of fact tables generally describes what was happened over the period of time. They contain additive facts.
2. Snapshot Fact Table: This type of fact table deals with the particular period of time. They contain non-additive and semi-additive facts.
Question.10 What is Grain of Fact?
Answer: The Grain of Fact is defined as the level at which the fact information is stored in a fact table. This is also called as Fact Granularity or Fact Event Level.
Question.11 What is Factless Fact table?
Answer: The Fact Table which does not contains facts is called as Fact Table. Generally
when we need to combine two data marts, then one data mart will have a fact less fact table and other one with common fact table.
Question.12 What are Measures?
Answer: Measures are numeric data based on columns in a fact table.
Question.13 What are Cubes?
Answer: Cubes are data processing units composed of fact tables and dimensions from the data warehouse. They provided multidimensional analysis.
Question.14 What are Virtual Cubes?
Answer: These are combination of one or more real cubes and require no disk space to store them. They store only definition and not the data.
Question.15 What is a Star schema design?
Answer: A Star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.
Question.16 What is Snow Flake schema Design?
Answer: In a Snow Flake design the dimension table (de-normalized table) will be further divided into one or more dimensions (normalized tables) to organize the information in a
better structural format. To design snow flake we should first design star schema design.
Question.17 What is Operational Data Store [ODS] ?
Answer: It is a collection of integrated databases designed to support operational monitoring.
Unlike the OLTP databases, the data in the ODS are integrated, subject oriented and
enterprise wide data.
Question.18 What is Denormalization?
Answer: Denormalization means a table with multi duplicate key. The dimension table follows Denormalization method with the technique of surrogate key.
Question.19 What is Surrogate Key?
Answer: A Surrogate Key is a sequence generated key which is assigned to be a primary key in the system (table).[/signinlocker]
Download