Scheduling Jobs in DataStage

Ratings:

(4)Views: 0

Share this blog:

Schedule DataStage Jobs

Step 1

Start DB/2 repository and Data stage Server (In the Task box, we have a Green colour icon, àRight click àStart)

DataStage Server

Start Program Web sphere Application server profiles à default start the server Next click on Web Console we find the login page that is Server has Started

If the server is not started, the page cannot be displayed is displayed.

Do you want to master DataStage? Then enrol in "DataStage Training" This course will help you to master DataStage

Step 2

Select Designer from Desktop Show a login page Authenticate your self

User id: Admin

Password: phi, and attach to an appropriate project

Project: INDIA / DWS okDisplay a New Window

Step 3

Select parallel jobs from new
A designer canvas and palette of parallel jobs is opened
Select appropriate Stages from the palette
Palette is a component that has Short shortcuts of the stages.
8 categories of stages are divided into 2 groups
Active Stages: ETL –“T”
Passive Stages: ETL – “E” & “L”
General
Data Quality
Database
Development
File
Processing
Real-time
Restructure
Favourites

In General, Annotations are used for comments.

Step 4

Drag and Drop required Stage from Palette, and create a Sample job
Set the Stage properties.
Save, Compile and Run
Shortcuts:-

Compile –F7 Run – Ctrl +F5

Reading Data From Files

We have 3 types of files.

.txt: ( also have different formats)

Fixed width flat files
comma-separated files
variable length files
Head & Tail files
Single column format
Space & tab Separated

In this current job, We are using a “comma-separated format”. We are using “Sequential file” Sequential file – to read the file When we read the Sequential file, we have to follow 4 things

File Name
Location (paths)
Format (comma Structure)
Structure (Table Definition or Meat data)
IN this jobs, Source is a file, loading the data that is Target is also a file

Scheduling Jobs in DataStage

The naming conversion will be decided by the people who prepare transformation SQ- PRIMARY –DATA
Double click on Sequential file (or) Right-click on Sequential fileWe get the properties window Select file

File - C:/data /primary .net

↓

Columns (importing the Structure of the text file into O.S from D.S load into Repository T>B)

↓

Click load

↓

We get a window

↓

“Table Definition” it shows the list of files in Tb (Right-click on TB )

↓

Import Table Definition

↓

Sequential file Definition

↓

Browse Directory

↓

Select files

↓

Select “primary 22 .txt”

↓

Import (click)

↓

The first line is the column name “Define” Sequential Metadata

↓

Ok now the file imports the location into D.S if we want to view the data click “VIEW DATA”

↓

Target file à “Select and drive” take forex: D:/

↓

File Name -- Sample.txt

↓

Save -- jobs (RC) new folder Shilpa jot name (or) item

↓

Compile run

The green colour indicates that jobs have finished

o/p:- total rows

Sequential files stage properties:

It is a file Stage that reads or writes Sequentially or parallels.

Note:

1 file – Sequential (Reading data from only 1 file)

N file – parallel (Reading data from more than 1 file)

Sequential Stage supports 1 i/p or 1 o/p and 1 reject link

(That is either 1 input or 1 output, but not both from each sequential file) - - > IT can Support i/p + reject link (or) o/p +reject link

Multiple Sources

We have an option to read data from Multiple Source Structure of file 1 and file 2 should be the same There is a Memory limit for Text files (That is As the text file, can accommodate only 2 GB, if we have 5 GB, so we need To have multiple text file) - - >TO Read the data from Multiple Sources /files, Sequential file Supports. “Read Method 2 types”

Specific files give file name specifically
File pattern users wild card characters

Ex:- if we select “filename” C:/data/primary * .txt wild card characters Any characters which have multi-purpose That is we have different text files as Primary 11.txt Primary 22.net Primary 11.det * specific char. Matching? multiple char matching # numbers Primary is common, for that reason, we use * symbol C:/ Data primary ? ? txt After primary Characters or numbers or numbers occupied.

Handling Format Mismatch Records

“Reject Mode” à 3 options Continue - Drop format mismatch record and load rest. Fail - job aborts if any format mismatch records Output - captures rejected data, through Reject link Reject link is allowed only with Reject link “output option” is Reject Mode.

Reject Data

Format Mismatch
Data type Mismatch
Condition Mismatch

Reject link is allowed only with the “output option” is Reject Mode

Missing file Mode: --- (if the file is missed)

Error – fails job if any file is missed.
Ok – continue with the rest of the file.

Option:- (To different which record is Coming from which file including the path)

fileName column source properties option
Get src into tag Select file name column come to the column
Select FNC – add to columns Add the at Column Name
Row number columns

1) gives record no. at src to target gets sequence no. of the src at the target, it is integers

Read first rows
Get first ‘n’ records from f1,f2,f3……….fn(that is Read rows = 3, display 1^st 3 records from f1,f2,f3……….fn)
Filter – we can use UNIX based comments like a group
grep “William” grep “moon” – case Sensitive

grep – I “moon” – ignore the case

grep – I “moon” – exact Matching

grep – I “moon” – other than Moon,(case sensitive)

Types of links

Link Types

link Default Colors

Red – Stage Not Connected property (or) job Aborted

Blue – In the process

Green – finished

Black - Stage is ready to set the properties

Link Marks

Indicates what data is moved between Stages

Link Marks

Box at SRC SRC Stage ready with Mata Data.

BOX a Tag Parallel to parallel auto Partitioning

Link mark

W.r.to Data stage :-

The compilation is a process of Converting GUI into its machine code, in Process, it checks link requirements, Mandatory Stage properties, and logical errors

Datastage

Lab Exercise

In the file pattern, What are the options that are missing?

Sol: Options

First line is column Name = false

Keep file partitions = false

Reject Mode = continue

Report progress = yes

Properties File Name column Row Number column Schema file

Specific files

Options

The first line is column Name = false

Keep file partitions = false

Missing File Mode = Depends

Reject Mode = continue

Report progress = yes

Properties

File Name column
Filter
of Record deeper Node
Read first Rows
Read from multiple nodes
Row number column
Schema file

SRC = 1 file (10 records) , Tag = 2 files, how the output will at the target.

Sol: Output will come in 2 files with alternate records.

That is if we have to records, we get 5 in each target file

Example:- Support, we have 10 records will 1,2,3,4…………..10

The output will be

Target 1 1,3,5,7,9

Target 2 2,4,6,8,10 SRC = 2 files , Tag = 2 files, output = ?

Sol: 1^st SRC file into 1^st target file

2^st SRC file into 2nd target file

SRC = 2 files , Tag = 1 files, output = ?

File 1 = c id, c name, address 2 records

File 2 = c id, c name, address, country 2 record

Sol: When we load a file for the first – time

File 1

C id	C name	Address
1	ABC	HYD
2	def	BNG

File 2

C id	C name	Address	Country
3	gri	chne	India
4	sha	knk	India

While file 1 loads, the output will be 4 rows

C id	C name	address
1	abc	HYd
2	def	BNG
3	gri	Chne, India
4	sha	KNK, India

That is, c id, c name, Address in the file is these in file 2(file 1 Structure is there file 2, so it display records from 2 tables)

- - > While file 2 Is loaded, the output will be 2 rows

C id	C name	Address	Country
3	gri	chne	India
4	sha	knk	India

That is cid,c name, Address, in file 2, are not there in file 1, to it does not consider file 2.

That is file 2 structure is not there in file 1 structure so file 1 is discarded and display only File records.

For an in-depth understanding of DataStage click on

You liked the article?

Like: 0

Vote for difficulty

Current difficulty (Avg): Medium

EasyMediumHardDifficultExpert

IMPROVE ARTICLEReport Issue

Recommended Courses

DataStage Training 4.95789 : Learners

Teradata Training 4.92763 : Learners

1/6

About Author

Name

TekSlate

Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.

Stay Updated

Get stories of change makers and innovators from the startup ecosystem in your inbox

Related Blogs

Scheduling Jobs in DataStage

Schedule DataStage Jobs

DataStage Server

Reading Data From Files

.txt: ( also have different formats)

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

↓

Sequential files stage properties:

Note:

Multiple Sources

Handling Format Mismatch Records

Reject Data

Types of links

link Default Colors

Link Marks

W.r.to Data stage :-

Lab Exercise

Specific files

Properties

File 1

File 2

For an in-depth understanding of DataStage click on

Recommended Articles

Recommended Courses

About Author