mobileNavlogo
headerlogo

Blog

Horizontal Sorting in Data Stage

20 September, 2018

Related Blogs

Horizontal Combining

Combining primary rows with Secondary rows based on key column values The stage that perform  Horizontal combining are
  • Join
  • Look – up
  • Merge
These Stage differ with each other w. r. t
  1. input requirements
  2. Treatment of unmatched records
  3. Memory usage
 
Description join lookup Merge
Input Names  Screenshot_38
  1. Primary /i/p
  2. Reference lookup
S1 - -> lookup
 Screenshot_39
Types of joins Inner leftouter,right outer, full outer Inner, leftouter Inner left outer
i/p,o/p,rejects
n-i/p2- i/p1-o/p
No reject
n-i/p2- i/p1-o/p
1-reject
n-i/p1-o/p
n-1-rejects
key column Name requirements Key column Name should be SAME
In general- optional
It should be SAME in look up file set
Key column name should in SAME
inner join type
p.k column values should have matching with “ALL” secondary records
o/p:10
p/k column values should have matching with “all” secondary records.
o/p
p.k columns values should have matching with “Any” secondary records.
o/p:10,20,30
  1. i/p requirements w.r.to sorting
  1. primary sorting
secondary sorting
  1. mandatory
mandatory
  1. optional
optional
  1. mandatory
mandatory
  1. i/p requirements w.r.t de- duplication (remaining duplicates)
  1. primary
secondary
  1. no need to de- duplicate- ok
ok
  1. ok
warnings
  1. warnings(make sure that primary should be de- duplicate)
ok
  1. treatment of un matched records

a.  primary

b.  secondary
drop(inner)Target(left)b.Drop(inner)
target(right)
  1. drop/target/reject
only drop
  1. drop/target
drop/reject
memory usage light heavy light
 

lookup

Cross  verification of primary records with Secondary records. It is a processing Stage which performs Horizontal combining In Data stage we have 4 types of look – ups Screenshot_40

Normal lookup

Look up at memory, that is all the Secondary records are copied  to the memory and primary records are cross verified with Secondary records at memory. Sparse look up Look – up at Source , that is each primary record is send to the Source for cross verification. look up, failure condition :-
  1. Continue 7 left outer joins
  2. Drop 7inner join
  3. fail
  4. Reject

Join

4  types of joins can be performed  
E no E name Dept no
11 a 10
22 b 20
33 c 30
   
Dept No D Name loc
10 Account NY
20 Search OL
40 Sales CH
  Inner Join  T1 n T2 = {10, 20} Matched records from primary and Secondary Tables . Left outer join T1 n (t1 u t2)  = {10, 20, 30} Matched records from primary and Secondary Tables and unmatched   records from primary. Right outer join  T2 n (t1 u t2)  = {10, 20, 40} Matched records from primary and Secondary Tables and unmatched  records from Secondary. Full outer Join  T1 u T2 = {10,20,30, 40} 7Matched and un matched records from primary and secondary table  

Job :- look up

Screenshot_41  

  Primary 

 
Eid Ename State Status
11 SUMAN AP 1001
22 MOON TN 1000
33 SumaN AP 1111
44 Moon STAR NN 1011
55 SUMAN AP 1101
66 MooN JK 1001
77 MoonSUN AP 1011
88 MooN MP 1111
   
State State desc
Ap Andhra Pradesh
TN TamilNadu
JK Jammu Kashmir
DL Delhi
KK Kenya kumari
 

Sequential file 1 

Load primary.txt 7 columns7 select   state  Sequential file  0  Load primary.txt 7columns7 select   state  

Look –up 

Drag key column value “State” from primary to ref   7In ref, for State, Set key = case less 7(that is case sensitive can look -up)  
Key expression Range column
Eid
Ename
State
Status
  Dslink
Derivation Column Name
Primary – EID EID
Primary- Ename EName
Ref.state State-desc
Primary- status Status
 
Key expression Key types Column name
State
State-desc
  Click on yellow link on the TOP  7Constraints Set look up failure = Drop   (or) Continue  (or) Reject 7Needs are reject link Output
  1. (a) job :- Normal look –u p :-
  Screenshot_42
Interested in mastering DataStage Training?
Enroll now for FREE demo on DataStage Training.
Dept  (4 rows)    [Secondary records are copied to memory, and primary are cross verified with Secondary at memory so, 4 rows]   oracle  0 Load dept 7 Set look up Type = Normal oracle  1 Load EMP look  up Map key Column Value “Dept no”7Drag and drop required columns7Select continue 7Compile and RUN   (b )Job :- Sparse look up 7 when we need to use Sparse ?   Sol:-
  1. When memory is insufficient
  2. When primary records are relative less than Secondary records
  Screenshot_43  
  • In Sparse :- (conditions apply)
  1. If ref is data base, then it Supports only 2 inputs .
Screenshot_44   Dept  (14 rows)  [Each primary record is send to the Source that is dept for cross verification So 14 rows] oracle 0 Load dept 71st look – up Type = Sparse   o/p   Job:- Sparse look up Sequential file If Ref file is Sequential file, then we have an Stage called “Look up file set” 7 extension is .lfs   Screenshot_45   Sequential file 0   :-  load ref file Sequential file 1 :- load primary file Look up  Map key column value “State”7Drag and drop required columnsàSet look – up failure = continue7Compile and RUN Screenshot_46   Look up file set Look up keys7Key = State Keep = false Target look up file set  = D: /shilpa/new sql look up.lfs Look up Range Range look up = No Options Allow Duplicates = false
  • Compile and RUN
Look up file set 
  1. It creates indexing on that particular key   (key column value)
  2. In look – up file such same key column Name Should be there]
  • Copy the look up first set path
  1. Now Delete ref Sequential file in (a)
  7Paste the lookup file set in the place of ref SEQ file is   (a) 7In the lookup file  set  à paste the file name here 7there is no need to map in the look up first Screenshot_47   Output 
E id E name State Status State - dense
11 SUMAN AP 1001 Andhra Pradesh
22 MOON TN 1000 Tamilnadu
  job :- Database Sparse look up   In Data base , we can use lookup file set if we have 2 ref Data bases Screenshot_48   Output Range look up:-
  1. job :- Grade based on Salary 
Screenshot_49 Oracle enterprise 1 Load EMP table 7 convert Sal to integer in EMP table Oracle enterprise 1  Load Sal grade table  7Convert Grade, lo Sal, HSal to integer in Sal grade table  

 Lookup

Screenshot_50 EMP. Sal >= Sal grade . Lo Sal And EMP. Sal <= Sal grade . H Sal  7Drag and Drop required columns7click on constraints7Select Drop/Continue Data Set PropertiesàFile7Browse  for file 7Range look up .ds Compile and RUN For indepth understanding of DataStage click on
About Author
Authorlogo
Name

TekSlate

Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills .