Home
Blog
ETL Testing
ETL Testing Interview Questions

ETL Testing Interview Questions

Ratings:

(4.5)Views: 1261

Share this blog:

Q1) What is ETL?

Ans: In data-warehousing architecture, ETL is an important component, which manages the data for any business process. ETL stands for Extract, Transform, and Load. The extract does the process of reading data from a database. Transform does the converting of data into a format that could be appropriate for reporting and analysis. While load does the process of writing the data into the target database.

Q2) Explain what the ETL testing operations include.

Ans: ETL testing includes

Verify whether the data is transformed correctly according to business requirements
Verify that the projected data is loaded into the data warehouse without any truncation and data loss
Make sure that the ETL application reports invalid data and replaces it with default values
Make sure that data loads at the expected time frame to improve scalability and performance

Q3) Mention what are the types of data warehouse applications and what is the difference between data mining and data warehousing?

Ans: The types of data warehouse applications are

Info Processing
Analytical Processing
Data Mining

Data mining can be defined as the process of extracting hidden predictive information from large databases and interpreting the data while data warehousing may make use of a data mine for analytical processing of the data in a faster way. Data warehousing is the process of aggregating data from multiple sources into one common repository

Q4) What are the various tools used in ETL?

Ans:

Cognos Decision Stream
Oracle Warehouse Builder
Business Objects XI
SAS business warehouse
SAS Enterprise ETL server

Q5) What is a fact? What are the types of facts?

Ans:

It is a central component of a multi-dimensional model that contains the measures to be analyzed. Facts are related to dimensions.

Types of facts are

Additive Facts
Semi-additive Facts
Non-additive Facts

Q6) Explain what are Cubes and OLAP Cubes?

Ans:

Cubes are data processing units comprised of fact tables and dimensions from the data warehouse. It provides a multi-dimensional analysis.
OLAP stands for Online Analytics Processing, and OLAP cube stores large data in a multi-dimensional form for reporting purposes. It consists of facts called measures categorized by dimensions.

Q7) Explain what is tracing level is and what are the types.

Ans: The tracing level is the amount of data stored in the log files. The tracing levels can be classified into two Normal and Verbose. The normal level explains the tracing level in a detailed manner while verbose explains the tracing levels at each and every row.

tracing level

Q8) Explain what is Grain of Fact?

Ans: Grain fact can be defined as the level at which factual information is stored. It is also known as Fact Granularity

Q9) Explain what factless fact schema is and what is Measured.

Ans:

A fact table without measures is known as a Factless fact table. It can view the number of occurring events. For example, it is used to record an event such as the employee count in a company.

The numeric data based on columns in a fact table is known as Measures

Q10) Explain what is transformation?

Ans: A transformation is a repository object which generates, modifies, or passes data. Transformation is of two types Active and Passive

Q11) Explain the use of Lookup Transformation.

Ans: The Lookup Transformation is useful for

Getting a related value from a table using a column value
The Update slowly changes the dimension table
Verify whether records already exist in the table

Q12) Explain what is partitioning, hash partitioning, and round-robin partitioning?

Ans:

To improve performance, transactions are subdivided, this is called Partitioning. Partitioning enables Informatica Server for creating multiple connections to various sources
The types of partitions are

Round-Robin Partitioning:

By Informatica data is distributed evenly among all partitions
In each partition where the number of rows to process is approximately the same, this Partitioning is applicable

Hash Partitioning:

For the purpose of partitioning keys to group data among partitions, the Informatica server applies a hash function
It is used when ensuring the process groups of rows with the same partitioning key in the same partition need to be ensured

Q13) Mention what is the advantage of using the DataReader Destination Adapter.

Ans: The advantage of using the DataReader Destination Adapter is that it populates an ADO recordset (consisting of records and columns) in memory and exposes the data from the DataFlow task by implementing the DataReader interface so that another application can consume the data.

Inclined to build a profession as ETL Testing Developer? Then here is the blog post on, explore ETL Testing Training

Q14) Using SSIS ( SQL Server Integration Service) what are the possible ways to update the table?

Ans: To update the table using SSIS the possible ways are:

Use a SQL command
Use a staging table
Use Cache
Use the Script Task
Use the full database name for updating if MSSQL is used

Q15) In case you have a non-OLEDB (Object Linking and Embedding Database) source for the lookup what would you do?

Ans: In case you have a non-OLEBD source for the lookup then you have to use Cache to load data and use it as a source

Q16) In what case do you use dynamic cache and static cache in connected and unconnected transformations?

Ans:

The dynamic cache is used when you have to update the master table and slowly change dimensions (SCD) type 1
For the flat files, a Static cache is used

Q17) Explain what are the differences between the Unconnected and Connected lookup.

Ans:

Connected Lookup	Unconnected Lookup
Connected lookup participates in mapping	It is used when a lookup function is used instead of an expression transformation while mapping
Multiple values can be returned	Ony returns one output port
It can be connected to another transformation and returns a value	Another transformation cannot be connected
Static or Dynamic cache can be used for connected Lookup	Unconnected as only static cache
Connected lookup supports user-defined default values	Unconnected lookup does not support user-defined default values
In Connected Lookup, multiple columns can be returned from the same row or inserted into the dynamic lookup cache	Unconnected lookup designates one return port and returns one column from each row

Q18) Explain what is the data source view.

Ans: A data source view allows for defining the relational schema which will be used in the analysis services databases. Rather than directly from data source objects, dimensions, and cubes are created from data source views.

Q19) Explain what is the difference between OLAP tools and ETL tools.

Ans: The difference between ETL and OLAP tools is that

ETL tool is meant for the extraction of data from the legacy systems and loads into a specified database with some process of cleansing data.

Example: Data stage, Informatica, etc.

While OLAP is meant for reporting purposes on OLAP data available in the multi-directional model.

Example: Business Objects, Cognos, etc.

Q20) How you can extract SAP data using Informatica?

Ans:

With the power connect option you extract SAP data using Informatica
Install and configure the PowerConnect tool
Import the source into the Source Analyzer. Informatica and SAP PowerConnect act as a gateway. The next step is to generate the ABAP code for the mapping, then only informatics can pull data from SAP
To connect and import sources from external systems Power Connect is used

Q21) Mention what is the difference between Power Mart and Power Center.

Ans:

Power Center	Power Mart
Suppose to process a huge volume of data	Suppose to process a low volume of data
It supports ERP sources such as SAP, People soft, etc.	It does not support ERP sources
It supports local and global repository	It supports the local repository
It converts local into a global repository	It had no specification to covert local into a global repository

Q22) Explain what a staging area is and what is the purpose of a staging area.

Ans: Data staging is an area where you hold the data temporarily on the data warehouse server. Data staging includes the following steps

Source data extraction and data transformation ( restructuring )
Data transformation (data cleansing, value transformation )
Surrogate key assignments

Q23) What is Bus Schema?

Ans: In the various business process to identify the common dimensions, BUS schema is used. It comes with a conformed dimension along with a standardized definition of information

Q24) Explain what is data purging.

Ans: Data purging is a process of deleting data from the data warehouse. It deletes junk data like rows with null values or extra spaces.

Q25) Explain what are Schema Objects.

Ans: Schema objects are the logical structure that directly refers to the database data. Schema objects include tables, views, sequence synonyms, indexes, clusters, function packages, and database links

Q26) Explain these terms Session, Worklet, Mapplet, and Workflow.

Ans:

Mapplet: It arranges or creates sets of transforms
Worklet: It represents a specific set of tasks given
Workflow: It’s a set of instructions that tell the server how to execute tasks
Session: It is a set of parameters that tells the server how to move data from sources to the target

Q27) List a few ETL bugs.

Ans: Calculation Bug, User Interface Bug, Source Bugs, Load condition bug, ECP related bug.

You liked the article?

Like: 1

Vote for difficulty

Current difficulty (Avg): Medium

EasyMediumHardDifficultExpert

IMPROVE ARTICLEReport Issue

Recommended Courses

ETL Testing Training 4.62732 : Learners

IBM BigFix Training 4.83213 : Learners

1/15

About Author

Name

TekSlate

Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.

Stay Updated

Get stories of change makers and innovators from the startup ecosystem in your inbox

Related Blogs

ETL Testing Interview Questions

Q1) What is ETL?

Q2) Explain what the ETL testing operations include.

Q3) Mention what are the types of data warehouse applications and what is the difference between data mining and data warehousing?

Q4) What are the various tools used in ETL?

Q5) What is a fact? What are the types of facts?

Q6) Explain what are Cubes and OLAP Cubes?

Q7) Explain what is tracing level is and what are the types.

Q8) Explain what is Grain of Fact?

Q9) Explain what factless fact schema is and what is Measured.

Q10) Explain what is transformation?

Q11) Explain the use of Lookup Transformation.

Q12) Explain what is partitioning, hash partitioning, and round-robin partitioning?

Q13) Mention what is the advantage of using the DataReader Destination Adapter.

Q14) Using SSIS ( SQL Server Integration Service) what are the possible ways to update the table?

Q15) In case you have a non-OLEDB (Object Linking and Embedding Database) source for the lookup what would you do?

Q16) In what case do you use dynamic cache and static cache in connected and unconnected transformations?

Q17) Explain what are the differences between the Unconnected and Connected lookup.

Q18) Explain what is the data source view.

Q19) Explain what is the difference between OLAP tools and ETL tools.

Q20) How you can extract SAP data using Informatica?

Q21) Mention what is the difference between Power Mart and Power Center.

Q22) Explain what a staging area is and what is the purpose of a staging area.

Q23) What is Bus Schema?

Q24) Explain what is data purging.

Q25) Explain what are Schema Objects.

Q26) Explain these terms Session, Worklet, Mapplet, and Workflow.

Q27) List a few ETL bugs.

Recommended Articles

Recommended Courses

About Author