
Partitioning Technique in DataStage

Partitioning Techniques with Respect to Performance Tuning

Partitioning is the process of dividing an input data set into multiple segments, or partitions. Each processing node in your system then performs an operation on an individual partition of the data set rather than on the entire data set.

DataStage supports two types of partitioning:

- Key-based partitioning

- Keyless partitioning

How to tune jobs using partitioning techniques?

Key-based Techniques

Hash

Modulus

Range

DB2

Hash

– rows with the same key column value (or the same combination of values across multiple key columns) go to the same partition. Hash is used very often and can improve performance; however, keep in mind that hash partitioning does not guarantee load balance, and misuse may lead to skewed data and poor performance.
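
As a rough illustration of the behaviour described above, the following Python sketch hashes the key columns of each row to pick a partition. It is not DataStage code: the function name `hash_partition`, the sample rows, and the use of `zlib.crc32` as the hash function are all assumptions made for the example.

```python
import zlib

def hash_partition(rows, key_columns, num_partitions):
    """Assign each row to a partition by hashing its key column values."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Combine all key columns into one string so multi-column keys work too.
        key = "|".join(str(row[c]) for c in key_columns)
        # crc32 is only a stand-in hash (assumption); it is deterministic across runs.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions

# Hypothetical data: one department dominates, which produces skew.
rows = [{"dept": "SALES", "emp": i} for i in range(8)]
rows += [{"dept": "HR", "emp": 100}, {"dept": "IT", "emp": 101}]

parts = hash_partition(rows, ["dept"], 4)
sizes = [len(p) for p in parts]
print(sizes)  # one partition holds at least the 8 SALES rows
```

All SALES rows land in a single partition because they share a key, so that partition does most of the work while the others stay nearly idle. This is the skew the paragraph above warns about.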

Modulus

– data is partitioned on one specified numeric field by taking the field value modulo the number of partitions. Not used very often.
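
The modulus calculation can be sketched in a few lines of Python. This is an illustration only, not DataStage code; `modulus_partition`, the `dept_no` column, and the sample values are hypothetical.

```python
def modulus_partition(rows, key_column, num_partitions):
    """Place each row in partition (integer key value % number of partitions)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[row[key_column] % num_partitions].append(row)
    return partitions

# Hypothetical integer keys.
rows = [{"dept_no": n} for n in (10, 20, 30, 40, 11, 21)]
parts = modulus_partition(rows, "dept_no", 4)

for index, part in enumerate(parts):
    print(index, [r["dept_no"] for r in part])
# 0 [20, 40]
# 1 [21]
# 2 [10, 30]
# 3 [11]
```

Rows with the same key always share a partition, as with hash, but no hash function has to be computed.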

Range

– an expensive refinement to hash partitioning. It is similar to hash, but the partition mapping is user-determined and the partitions are ordered. Rows are distributed according to the values in one or more key fields, using a range map (the ‘Write Range Map’ stage needs to be used to create it). Range partitioning requires processing the data twice, which makes it hard to justify in most cases.
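
The two passes can be pictured with the Python sketch below: a first pass over the key values builds a range map, and a second pass routes each row using it. This only illustrates the idea and does not reproduce how the Write Range Map stage works internally; `build_range_map`, `range_partition`, and the sample values are assumptions.

```python
from bisect import bisect_right

def build_range_map(values, num_partitions):
    """First pass: pick boundary values that split the sorted keys evenly."""
    ordered = sorted(values)
    step = len(ordered) / num_partitions
    return [ordered[int(step * i)] for i in range(1, num_partitions)]

def range_partition(rows, key_column, boundaries):
    """Second pass: route each row to the range its key value falls into."""
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        partitions[bisect_right(boundaries, row[key_column])].append(row)
    return partitions

# Hypothetical key values.
values = [5, 1, 9, 3, 7, 2, 8, 4, 6]
rows = [{"id": v} for v in values]

boundaries = build_range_map(values, 3)
parts = range_partition(rows, "id", boundaries)

print(boundaries)                                     # [4, 7]
print([sorted(r["id"] for r in p) for p in parts])    # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Unlike hash, the partitions are ordered: every key in partition 0 is smaller than every key in partition 1, and so on.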

Keyless Techniques

Same

Entire

Round Robin

Random

Entire

– every partition receives a complete copy of the dataset. Because the rows are duplicated, the data volume increases significantly.
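
A minimal Python sketch of the volume increase, with a hypothetical `entire_partition` function and made-up rows:

```python
def entire_partition(rows, num_partitions):
    """Give every partition its own full copy of the dataset."""
    return [list(rows) for _ in range(num_partitions)]

rows = [{"code": "A"}, {"code": "B"}, {"code": "C"}]
parts = entire_partition(rows, 4)

total_rows = sum(len(p) for p in parts)
print(total_rows)  # 12, i.e. 3 rows x 4 partitions
```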

Same

– existing partitioning remains unchanged. No data is moved between nodes.

Round robin

– rows are dealt out to the partitions in turn. This method guarantees an even load balance between nodes (partition row counts differ by at most one) and is very fast.
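
Round robin can be sketched as follows. The function name `round_robin_partition` and the sample rows are hypothetical; this is an illustration, not DataStage code.

```python
def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in turn: row 0 to partition 0, row 1 to partition 1, ..."""
    partitions = [[] for _ in range(num_partitions)]
    for position, row in enumerate(rows):
        partitions[position % num_partitions].append(row)
    return partitions

rows = [{"id": i} for i in range(10)]
parts = round_robin_partition(rows, 4)

sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2, 2]
```

No key is inspected, so rows with the same key value can end up in different partitions. That is why round robin balances well but cannot feed a key-based stage directly.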

Random

– rows are randomly distributed across partitions.

By default, all key-based stages use Hash as their key-based technique.

Hash Technique

Hash partitioning is one of the most popular and frequently used techniques in DataStage. With this technique, rows with the same key column value are sent to the same partition.

Example


Rows with the same key column value are sent to the same node.

Hash partitioning is selected in two cases:

  1. When there is more than one key column.
  2. When there is a single key column whose type is not integer.

 Why Modulus?

  • Modulus performs better than Hash.
  • Modulus distributes the data by calculating the MOD value of the key (the key value modulo the number of partitions).


Note: Modulus has better performance than Hash.


Modulus (MOD) is selected when there is only one key column and it is an integer.

Join keys: Dept no, E no

We use Hash, because there is more than one key column: Dept no and E no (two key columns).


We use SAME because we want the same distribution from the Join to the Aggregator, and the key columns are also the same. SAME is a keyless technique.


We can't use SAME here, because the Join has two key columns (D no, loc). If we used SAME, we would not know how the data is distributed between the Join and the Aggregator.

We use Modulus, because there is only one key column (D no) and it is an integer.
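
The selection rules used in these examples can be summarised in a small Python sketch. It encodes only the rules of thumb given in this article, not any DataStage API; the function `choose_partitioning`, the column names, and the type labels are hypothetical, and the third scenario assumes the Aggregator groups on D no alone.

```python
def choose_partitioning(stage_keys, key_types, upstream_keys=None):
    """Pick a technique for a key-based stage using the article's rules of thumb.

    stage_keys    -- key column names used by this stage
    key_types     -- dict mapping column name to a type label (assumed labels)
    upstream_keys -- keys the incoming data is already partitioned on, if any
    """
    # Same keys as the previous stage: keep the existing distribution.
    if upstream_keys is not None and list(upstream_keys) == list(stage_keys):
        return "Same"
    # Exactly one key column and it is an integer: Modulus.
    if len(stage_keys) == 1 and key_types[stage_keys[0]] == "integer":
        return "Modulus"
    # More than one key column, or a single non-integer key: Hash.
    return "Hash"

types = {"dept_no": "integer", "e_no": "integer", "loc": "varchar"}

# Join on Dept no and E no: two key columns.
print(choose_partitioning(["dept_no", "e_no"], types))                         # Hash
# Aggregator on the same keys as the Join before it.
print(choose_partitioning(["dept_no", "e_no"], types, ["dept_no", "e_no"]))    # Same
# Join on D no and loc, then Aggregator on D no only.
print(choose_partitioning(["dept_no"], types, ["dept_no", "loc"]))             # Modulus
```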
