Home
Blog
Informatica
ETL Performance optimaization & Aggrigate Cache in Informatica

ETL Performance optimaization & Aggrigate Cache in Informatica

Ratings:

(4)Views: 0

Share this blog:

ETL Performance optimaization:

Target-Performance optimization:

Use the following techniches to improve the performance of data loading.

(1) Drop the Index beforeloading

(2) Increase commit interval

(3) Use bulk Loading

ETL Performance optimaization & Aggrigate Cache in Informatica

Source -Performance optimization:

Use the following techniques to improve the performance of data extraction.

Create index on the source system
Optimize th equery using hints [SQ override - filter]

Transformation - Performance optimization:

(1) Filter transformation:

Keep the filter T/R as close to the SQ as possible to filter the data carry in the data flow.

(2) Sequence generator transformation:

Configure the sequence generator transformation as close to the target as possible in the mapping.

Other wise mappings will be carrying extra sequence nols through the transformation pross which you not be transfer.

Ex:

ETL Performance optimaization & Aggrigate Cache in Informatica

Sorter transformation:

Keep the sorter transformation before an aggregation may improve the perforation over an order by cause in "SQL override"

Because the source data may not be tone with buffer size need for databse sort.

(4) Joiner trasformation:

Defining the source with the less no of the records as a master which occupy the less amount of memory in the cache.

--> Use sorted input

(5) aggregator T/R:

--> Use sorter input

--> Define incremental aggregation

--> Reduce the no of groups

(6) Expression transformation:

--> Create variable port to simple the complex expression

--> Use pipe operator ll rather than using concatenation function.

--> Use decode function rather than using nested if function.

(7) Router transformation:

Create a one router transformation rather than multiple filter T/R because a row is read once into input groups but evaluate multiple times based on no of groups. Whereas using filter T/R requires the same row data to be duplicated for filter transformation.

LOOKUP transformation:

Define SQL override to reduce no of records in the LOOKUP cache.

--> Use persistence LOOKUP cache

--> Use shared LOOKUP cache

(9) UPdate Strategy transformation:

Performance Consideration:

The UPdate strategy T/R performance can vary depending on the number of UPdates and inserts. In some cases ther emay be a performance benefit to split a mapping with UPdates and inserts into mapping and sessions, one mapping with inserts and the other with UPdates.

If you want to enrich your career and become a professional in Informatica, then visit Tekslate - a global online training platform: "Informatica Training" This course will help you to achieve excellence in this domain.

Increment Aggregation - Aggregate cache:

How it works:

i) There are two types of cache memory, index and data cache.
ii) All rows are loaded into cache before any aggregation takes place.

iii) The index cache contains group by port values.

iv) The data cache contains all ports values variables and connected output ports.
v) Now group by input port used in non-aggregate output local variable ports.

-> ports containing aggregate function (multiple by three)

vi) One out port rowa will be required for each unique occurrence of the group by ports.

An incremental aggregation is a process of calculating the summary's for only new records that pass through mapping.

At first run the integration service creates index cache which is saved with an extension ".Idx" , a data cahce which is saved with an extension ".det"

At second run the integration service performs following actions.

(a) For each input record the integration service checks historical information in the index cache for a corresponding group.

(b) If it finds the group then the integration service performs the aggregation incrementary.

An Incremental aggregation with aggregate cache improves the session performance.

Empno Ename SQL Deptno

101 Sri 4000 10

102 Sree 3000 20

103 Ram 2500 30

104 Shyam 3500 10

105 Prasad 5000 20

106 Ravi 7000 20

107 Srinu 5000 10

Second Run:

Empno Ename Sal Deptno

101 Srinu 3000 10

102 Ram 1500 20

103 Ravi 3500 30

ETL Performance optimaization & Aggrigate Cache in Informatica

Create source and target definations

Create a mapping with the Name m - Incremental- Agger

Drop the source and target definition

ETL Performance optimaization & Aggrigate Cache in Informatica

Create a session with the name S - m - Incremental - Aggregation

Double click the session select the properties tab

Attribute Value

Incrementals Aggregation

Select the mapping tab set the reader connection and writer

Connection click apply and click ok

From repository menu click on save

The following transformation are know castly transformation in the memory usage

The Integration service uses a cache memory to process the transformation, there by it improve the performance.

Rank Cache:

An integration service uses a cache memory to process the rank transformations.

An integration service loads the data into the cache before performing the calculation.

Ex: Identifying top 4 sales

ETL Performance optimaization & Aggrigate Cache in Informatica

Process:

(1) The integration service loads first four records into the data cache.

ETL Performance optimaization & Aggrigate Cache in Informatica

(2) The integration service reads next record from the source, comparist with cache values. If the record is lower in the rank than cached rows it discards the row.

(3) If the row is higher in the rank than one of the cached values then integration service replaced the cached row with the higher ranked input row.

(4) Finally the data cache contains

Sorter Cache:

The integration service uses cache memory to process the sorter transformation.

The integration services pass all of the incoming data into sorter cache before peforming the sort operation.

If the amount of incoming data is greater than the cache size specific then the integration services will temperarly store the data in a sorter transformation work directory.

The integration service create sorter cache to store sort keys and data while the integration services sorts the data.

By default the integration service creates "1 memory cache" and "disk cache" to perform sort operation.

LOOKUP Cache - how it works:

There are 2 types of cache memory , index and data cache

All ports values from the LOOKUP table where the port is part of the LOOKUP condition are loaded into index cache.
The index cache contains all ports values from the LOOKUP table where the port is specified in the LOOKUP condition.
The datacache contains all ports values from the LOOKUP table that are not in LOOKUP condition and that are specifieed as "output" as ports.
After the cache of the index cache upon a match the ros from the cache are included in stream.

The following are the types of LOOKUP cache.

(1) static LOOKUP Cache:

It's a read only cache , this is the default cache created by integration service.

The integration service don't UPdate the cache while it process the LOOKUP transformation.

(2) Dynamic LOOKUP Cache:

The integration service UPdates the cache while it process the LOOKUP transformation.

Use dynamic LOOKUP cache when you perform LOOK on "target" The dynamic LOOKUP cache is used in implementing slowly changing dimension type1.

(3) Persistence LOOKUP Cache:

You can save the LOOKUP cache files and reuse them the next time the integration service process the LOOKUP transformation configure to use the cache.

It improves the session performance.

(4) Shared LOOKUP Cache:

You can shared the LOOKUP cache between multiple transformation, It improve the performance of mapping.

You liked the article?

Like: 0

Vote for difficulty

Current difficulty (Avg): Medium

EasyMediumHardDifficultExpert

IMPROVE ARTICLEReport Issue

Recommended Courses

DataStage Training 4.95789 : Learners

Teradata Training 4.92763 : Learners

1/6

About Author

Name

TekSlate

Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.

Stay Updated

Get stories of change makers and innovators from the startup ecosystem in your inbox

Related Blogs

ETL Performance optimaization & Aggrigate Cache in Informatica

ETL Performance optimaization:

Target-Performance optimization:

Source -Performance optimization:

Transformation - Performance optimization:

Sorter transformation:

LOOKUP transformation:

(9) UPdate Strategy transformation:

Increment Aggregation - Aggregate cache:

Rank Cache:

Sorter Cache:

LOOKUP Cache - how it works:

Recommended Articles

Recommended Courses

About Author