Home
Blog
IBM DataStage
Horizontal Sorting in Data Stage

Horizontal Sorting in Data Stage

Ratings:

(4)Views: 0

Share this blog:

Horizontal Combining

Combining primary rows with Secondary rows based on key column values The stage that perform Horizontal combining are

Join
Look – up
Merge

These Stage differ with each other w. r. t

input requirements
Treatment of unmatched records
Memory usage

Description

join

lookup

Merge

Input Names

Primary /i/p
Reference lookup

S1 - -> lookup

Types of joins

Inner leftouter,right outer, full outer

Inner, leftouter

Inner left outer

i/p,o/p,rejects

n-i/p2- i/p1-o/p

No reject

n-i/p2- i/p1-o/p

1-reject

n-i/p1-o/p

n-1-rejects

key column Name requirements

Key column Name should be SAME

In general- optional

It should be SAME in look up file set

Key column name should in SAME

inner join type

p.k column values should have matching with “ALL” secondary records

o/p:10

p/k column values should have matching with “all” secondary records.

o/p

p.k columns values should have matching with “Any” secondary records.

o/p:10,20,30

i/p requirements w.r.to sorting

primary sorting

secondary sorting

mandatory

mandatory

optional

optional

mandatory

mandatory

i/p requirements w.r.t de- duplication (remaining duplicates)

primary

secondary

no need to de- duplicate- ok

warnings

warnings(make sure that primary should be de- duplicate)

treatment of un matched records

a. primary

b. secondary

drop(inner)Target(left)b.Drop(inner)

target(right)

drop/target/reject

only drop

drop/target

drop/reject

memory usage

light

heavy

light

lookup

Cross verification of primary records with Secondary records. It is a processing Stage which performs Horizontal combining In Data stage we have 4 types of look – ups Screenshot_40

Normal lookup

Look up at memory, that is all the Secondary records are copied to the memory and primary records are cross verified with Secondary records at memory. Sparse look up Look – up at Source , that is each primary record is send to the Source for cross verification. look up, failure condition :-

Continue left outer joins
Drop inner join
fail
Reject

Join

4 types of joins can be performed

E no	E name	Dept no
11	a	10
22	b	20
33	c	30

Dept No	D Name	loc
10	Account	NY
20	Search	OL
40	Sales	CH

Inner Join T1 n T2 = {10, 20} Matched records from primary and Secondary Tables . Left outer join T1 n (t1 u t2) = {10, 20, 30} Matched records from primary and Secondary Tables and unmatched records from primary. Right outer join T2 n (t1 u t2) = {10, 20, 40} Matched records from primary and Secondary Tables and unmatched records from Secondary. Full outer Join T1 u T2 = {10,20,30, 40} Matched and un matched records from primary and secondary table

Job :- look up

Screenshot_41

Primary

Eid	Ename	State	Status
11	SUMAN	AP	1001
22	MOON	TN	1000
33	SumaN	AP	1111
44	Moon STAR	NN	1011
55	SUMAN	AP	1101
66	MooN	JK	1001
77	MoonSUN	AP	1011
88	MooN	MP	1111

State	State desc
Ap	Andhra Pradesh
TN	TamilNadu
JK	Jammu Kashmir
DL	Delhi
KK	Kenya kumari

Sequential file 1

Load primary.txt columns select state Sequential file 0 Load primary.txt columns select state

Look –up

Drag key column value “State” from primary to ref In ref, for State, Set key = case less (that is case sensitive can look -up)

Key expression	Range	column
	Eid
	Ename
	State
	Status

Dslink

Derivation	Column Name
Primary – EID	EID
Primary- Ename	EName
Ref.state	State-desc
Primary- status	Status

Key expression	Key types	Column name
		State
		State-desc

Click on yellow link on the TOP Constraints Set look up failure = Drop (or) Continue (or) Reject Needs are reject link Output

(a) job :- Normal look –u p :-

Screenshot_42

Interested in mastering DataStage Training?
Enroll now for FREE demo on DataStage Training.

Dept (4 rows) [Secondary records are copied to memory, and primary are cross verified with Secondary at memory so, 4 rows] oracle 0 Load dept Set look up Type = Normal oracle 1 Load EMP look up Map key Column Value “Dept no”Drag and drop required columnsSelect continue Compile and RUN (b )Job :- Sparse look up when we need to use Sparse ? Sol:-

When memory is insufficient
When primary records are relative less than Secondary records

Screenshot_43

In Sparse :- (conditions apply)

If ref is data base, then it Supports only 2 inputs .

Screenshot_44 Dept (14 rows) [Each primary record is send to the Source that is dept for cross verification So 14 rows] oracle 0 Load dept 1^st look – up Type = Sparse o/p Job:- Sparse look up Sequential file If Ref file is Sequential file, then we have an Stage called “Look up file set” extension is .lfs Screenshot_45 Sequential file 0 :- load ref file Sequential file 1 :- load primary file Look up Map key column value “State”Drag and drop required columnsàSet look – up failure = continueCompile and RUN Screenshot_46 Look up file set Look up keysKey = State Keep = false Target look up file set = D: /shilpa/new sql look up.lfs Look up Range Range look up = No Options Allow Duplicates = false

Compile and RUN

Look up file set

It creates indexing on that particular key (key column value)
In look – up file such same key column Name Should be there]

Copy the look up first set path

Now Delete ref Sequential file in (a)

Paste the lookup file set in the place of ref SEQ file is (a) In the lookup file set à paste the file name here there is no need to map in the look up first Screenshot_47 Output

E id	E name	State	Status	State - dense
11	SUMAN	AP	1001	Andhra Pradesh
22	MOON	TN	1000	Tamilnadu

job :- Database Sparse look up In Data base , we can use lookup file set if we have 2 ref Data bases Screenshot_48 Output Range look up:-

job :- Grade based on Salary

Screenshot_49 Oracle enterprise 1 Load EMP table convert Sal to integer in EMP table Oracle enterprise 1 Load Sal grade table Convert Grade, lo Sal, HSal to integer in Sal grade table

Lookup

Screenshot_50 EMP. Sal >= Sal grade . Lo Sal And EMP. Sal <= Sal grade . H Sal Drag and Drop required columnsclick on constraintsSelect Drop/Continue Data Set PropertiesàFileBrowse for file Range look up .ds Compile and RUN For indepth understanding of DataStage click on

You liked the article?

Like: 0

Vote for difficulty

Current difficulty (Avg): Medium

EasyMediumHardDifficultExpert

IMPROVE ARTICLEReport Issue

Recommended Courses

DataStage Training 4.95789 : Learners

Teradata Training 4.92763 : Learners

1/6

About Author

Name

TekSlate

Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.

Stay Updated

Get stories of change makers and innovators from the startup ecosystem in your inbox

Related Blogs