Teradata Tutorials

Ratings:
(4)
Views:0
Banner-Img
  • Share this blog:

 

Welcome to Teradata Tutorials. The intent of these tutorials is to provide an in-depth understanding of the Teradata Database. In addition to Teradata Tutorials, we will look at common interview questions, how-to tutorials, issues, and their resolutions

Teradata Introduction

Teradata database now able to connect Hadoop with this QueryGrid so it’s called Teradata Database-to-Hadoop also referred to as Teradata-to-Hadoop connector.

It provides a SQL interface for transferring data between Teradata Database and remote Hadoop hosts.

  • Import Hadoop Data into a temporary or permanent Teradata table.
  • Export data from temporary or permanent Teradata tables into existing Hadoop tables.
  • Create or drop tables in Hadoop from Teradata Database.
  • Reference tables on the remote hosts in SELECT and INSERT statements.
  • Select Hadoop data for use with a business tool.
  • Select and join Hadoop data with data from independent data warehouses for analytical use.

Index

 

Teradata Objects

Tables: A table in a relational database management system is a two-dimensional structure made up of columns and physical rows stored in data blocks on the disk drives.

Views: A view is like a "window" into tables that allows multiple users to look at portions of the same base data. A view may access one or more tables and may show only a subset of columns from the table(s).

Macros: Macros are predefined, stored sets of one or more SQL commands and/or report-formatting (BTEQ) commands. Macros can also contain comments.

Triggers: A trigger is a set of SQL statements usually associated with a column or table that are programmed to be run (or "fired") when specified changes are made to the column or table. The pre-defined change is known as a triggering event, which causes the SQL statements to be processed.

Stored Procedures: A stored procedure is a pre-defined set of statements invoked through a single CALL statement in SQL. While a stored procedure may seem like a macro, it is different in that it can contain:

Teradata SQL data manipulation statements (non-procedural)

Procedural statements (in Teradata, referred to as Stored Procedure Language)

Inclined to build a profession as Teradata Developer? Then here is the blog post on, explore Teradata Training

Teradata Joins 

To Retrieve multiple tables data and to get multiple columns information joins are Recommendable, it performs on column-wise

Types of Joins

Teradata - Joins

How do write Join Query common Join syntax?

Select columns/* From Table A

Cross Join Table

INNER Join Table B ON Condition

Left[outer]Join Table B ON Condition

Right[outer]Join Table B ON Condition

Full[outer]Join Table B ON Condition

Where<Residual Condition>

Note

This Residual condition applied after all the joins perform a single query

Emp

Eid name Did
1 A 10
2 B 20

Dept

DID D Name
10 IT
20 HR

Set   E.EID,EEN, E.DID,D.DID,

  1. D Name From emp E

Cross Join Dept D

Inner Join Dept D ON E.DID=D.DID

Left Join Dept D ON E.DID=D.DID

Right, Join Dept D ON E.DID=D.DID

Full Join Dept D ON E.DID=D.DID

Teradata

Teradata - Joins

Cross Join

It is the cross product of 2 Tables if table A contains m Rows and Table B contains N Rows After cross Join contains max Rows if there is nowhere condition

Real-time uses

Unless this default situation, we should not go for this Because it occupies more memory any operates many rows so it may cause spool space issue(or)Buffer issue

Example:

Select E.EID, E. E Name, D. Deptid, D. Dept Name from emp E, Dept D

Select E.EID, E. E Name, D. Deptid, D. Dept name From emp E Cross Join Dept D

INNER Join

It gets the data based on the condition

EQUI Join

The condition is based on equals operator then it is equal join select E.EID, E. E Name, D. DEPID, D. DEPT Name From emp E. Inner Join Dept D on E. Deptd= D. Dept id

Non-Equi Join

The conditions contain other than equals operator(<,<=,>,>=,<>)

Select E.EID, E. E Name, D. Deptid, D. Dept Name From emp E INNER Join Dept D on E. Deptid<>D. Deptid

Self Join

The Table Joins itself self Join

Example:

Identify employee and it corresponding manager from the below table

Eid ename egrid depid
1 Vinay 2 10
2 Ram Krishna 3 50
3 Raju 1 10
4 Krishna 2 20
5. Mukesh 5 50
6 Raghava 3 30

Select E1.EName AS Employee, E2.EName As manager front

Emp E1

Inner Join emp E2

Are E2.EID=E1.MGRID;

Teradata

Outer Join

It takes matched data

Tacked in-matched data from the left or right

Left outer join

Tacked matched data

Tacked in matched data from the left table

Select E.EID, E. EName, D.Def, D –DEPT Name from

Emp E

Left Join Dept D ON E.DEPT ID= D.DEPT ID

Right Join

It gets matched Record

It gets in the matched record from the table

Note:

In the case of an in-matched record, the left table data will be null

Select e.eid, E. EName, D.Deptid, D.Dept name From

Emp E

Right Join Dept.D ON E. Deptid= D. Dept

Full[Outer]Join

It gets matched data

It gets in- matched data from left and Right Table

Note:

In the case of a match, the other table data will be null

EID EName Deptid DeptName
5 Raj # #
2 Ram # #
3 Krishna # #
10 Ramu # #
1 Vinay 10 IT
4 Kiran 20 HR
2 Raju 40 Sales

Difference  between cross join and full outer join, union and union all, sub Query and correlated sub-Query difference more than 2 tables joining in the same table select E.EID, E. EName  D. DEPTID, D. Dept Name From

Emp E

INNER Join Dept D on E. Deptid= D. Dept id

INNER Join emp- Address E A ON EA.empid= E. Eid

Difference Between cross join and full outer join

Customer Table

CID CName
1 ABC
2 MNQ
3 RTP
4 DEl

 

CID Call id
1 0001
2 0002

a) Display all customers who are available in the customer table and not available in the call table?

Note:

Do with only Joins, Do not use NOT IN,<>, NOT EXISTS operators

Select CT.CID, CT. C Name, CA. Cid CA. called from customer CT

Left Join calls CA-ON CT.CID = CA. CID

Where CA? Cid is Null

 

Teradata Databases

  • A logical repository for Tables, Views, Macros, Stored Procedures
  • The database may own objects
  • Perm Space – max amount of space available for objects
  • Spool Space – the max amount of workspace available for requests (like tempdb)

The database is empty until objects created in it.

Creating Database:

CREATE DATABASE finance from sysadminASPERMANENT = 60000000,SPOOL = 120000000,FALLBACK PROTECTION,AFTER JOURNAL,BEFORE JOURNAL

Teradata Users:

  • User is a database with an assigned password
  • May own objects
  • User may logon to Teradata and access objects within itself and other databases where the user has access rights
  • A user is empty until objects are created within it.
  • SA equivalent user is DBC (Database Computer).

Creating a User:

CREATE USER testuser FROM MyApplASPERM=2000000,SPOOL=5000000 ,PASSWORD=SECRET,DEFAULT DATABASE = Finance,NO FALLBACK

The Hierarchy of Databases

Teradata basics - Hierarchy of Databases

  • A new database or user must be created from an existing database or user.
  • All Perm space specifications are subtracted from the immediate owner or parent.
  • Perm space is a zero-sum game - the total of all Perm Space allocations must equal the total amount of disk space available.
  • Perm space is used for tables only.
  • Perm space currently unassigned is available to be used as a pool.

Teradata temporary tables

Teradata database provides various options in case of a need to use temporary tables. The temporary tables are especially useful when performing complicated calculations, they usually make multiple, complex SQL queries simpler and increase overall SQL query performance. Temporary tables are especially useful for reporting and performing operations on summarized values.

Teradata provides the flexibility to use three types of temporary tables which helps user to accomplish their work more easily. This kind of table is temporary to the database that means tables are not going to store permanently in the disk space, will be discarded after a specific time based on the type of table. 

Teradata Components

Types of temporary tables in Teradata are

1.Global temporary tables

2.Volatile temporary tables

3.Derived tables

Global Temporary Tables(GTT)

  • They exist only for the duration of the SQL session in which they are used.
  • The contents of these tables are private to the session, and System Automatically drops the table at the end of that session.
  • The system saves the Global Temporary Table Definition Permanently in the Data Dictionary.
  • The Saved Definition may be Shared by Multiple Users and Sessions with Each Session getting its Own Instance of the Table.

Example of Global table

CREATE GLOBAL TEMPORARY TABLE MYDB.EMPLOYEE (EMP_NO VARCHAR(10), EMP_NAME VARCHAR(10), SALARY INTEGER) UNIQUE PRIMARY INDEX(EMP_NO)  ON COMMIT PRESERVE ROWS;

Volatile Temporary Tables(VTT)

  • If you need a temporary table for single use only, you can define a volatile table.
  • The definition of a volatile table resides in memory (RAM) but does not survive across a system restart.
  • It improves performance even more than using global temporary tables because the system does not store the definitions of volatile tables in the Data Dictionary.
  • Access-rights checking is not necessary because only the creator can access the volatile table.

Example of Volatile table

CREATE VOLATILE TABLE EMPLOYEE( EMP_NO VARCHAR(10), EMP_NAME VARCHAR(10), SALARY INTEGER) UNIQUE PRIMARY INDEX(EMP_NO)  ON COMMIT PRESERVE ROWS;

ON COMMIT PRESERVE ROWS means to keep the data upon completion of the transaction.

We need to mention ON COMMIT PRESERVE rows explicitly as the default is ON COMMIT DELETE ROWS which delete the data from the table upon completion of the transaction.

Derived Tables(DT)

  • A special type of temporary table is the derived table. It is specified in the SQL SELECT statement.
  • A Derived Table is Obtained from One or More Other Tables as the Result of a Sub-Query.
  • Scope of A Derived Table is only Visible to the Level of the SELECT statement calling the Sub-Query.
  • Using Derived Tables avoids having to use the CREATE and DROP TABLE Statements for Storing Retrieved Information and Assists in Coding More Sophisticated, Complex Queries.

Example of Derived table

SEL EMP_NAME, SALARY FROM    EMPLOYEE,( SEL    AVG(SALARY) FROM   EMPLOYEE) AS EMPLOYEE_TEMP(AVGSAL) WHERE SALARY > AVGSAL ORDER BY 2 DESC;

Here we want to know the employee name whose salary is greater than the average salary. From the above example, you can see that in the from clause, we have calculated the average salary of employees. Here EMPLOYEE_TEMP will act alike a derived table. Please note that we need to mention clearly table name and column list clearly in the derived table. 

Permanent VS Temporary Tables

Permanent and Temporary Tables

  • Permanent storage of tables is necessary when different sessions and users must share table contents.
  • When tables are required for only a single session, we can request the system to create temporary tables.
  • Using this type of table, we can save query results for use in subsequent queries within the same session.
  • We can break down complex queries into smaller queries by storing results in a temporary table for use during the same session. When the session ends, the system automatically drops the temporary table.

Teradata Columnar – Column Partitioned Tables

Column Partitioning (CP) is a new physical database design implementation option that allows single columns or sets of columns of a NoPI table to be stored in separate partitions. Column partitioning can also be applied to join indexes. Combined with Teradata‘s existing multilevel partitioning capability, this provides the capability for a table or join index to be column (vertically) partitioned, row(horizontally) partitioned, or both.

Columnar 

Teradata 14.0 introduces Teradata Columnar – a new approach for organizing the data of a user-defined table or join index on disk.

Teradata Columnar offers the ability to partition a table or join index by column.

Teradata Columnar can be used alone or in combination with row partitioning in multilevel partitioning definitions. Column partitions may be stored using traditional‗ROW‘ storage or alternatively stored using the new ‗COLUMN‘ storage option. In Either case, columnar can automatically compress physical rows where appropriate.

A table or join index that is partitioned by column has several key characteristics:

  • It does not have a primary index.
  • Each column partition can be composed of single or multiple columns.
  • Each column partition usually consists of multiple physical rows.
  • A new physical row format COLUMN may be utilized for a column partition.

Such a physical row is called a ‗container‘ and it is used to implement columnar-storage for a column partition.

  • Alternatively, a column partition may also have traditional physical rows with ROW format. Such a physical row for columnar partitions is called a sub row. This is used to implement row-storage for a column partition.

Note that in subsequent discussions, when ‗row storage‘ or ‗row format‘ is stated, it is referring to columnar partitioning with the ROW storage option selected. This is not to be confused with row-partitioning which we associate with a PPI table.

-In a table with multiple levels of partitioning, only one level may be columnar.

All other levels must be row-partitioned (i.e., PPI).

Increased Partition Size

Concurrent with the columnar feature is an expansion of the number of partitions supported by a Teradata table. Previously, 65,535 was the maximum number of partitions supported. With Teradata 14, the number of partitions permitted is9,223,372,036,854,775,807 or 9.223 quintillion. Whereas a two-byte integer is needed to support up to 65,535 partition numbers, any table exceeding that amount will now require an 8-byte partition number.

Column Partitioning Implementation

With column partitioning, each column or specified group of columns in the table can become a partition containing the column partition values of that column partition. This is the simplest partitioning approach since there is no need to define partitioning expressions, as seen in the following example:

CREATE TABLE SALES (TAx No INTEGER, TxnDate DATE, ItemNo INTEGER, Quantity INTEGER )

Teradata Administrator Tool

Partition by Column

UNIQUE INDEX (Txn No);

The clause PARTITION BY COLUMN specifies that the table has column partitioning. Each column of this table will have its own partition and will be (by default) in column storage since no explicit column grouping is specified.

Teradata Applications

Enterprise Data warehouse

Active Data warehouse

Internet And E-Commerce

CRM[Customer Relationship Management]

Data Mart Appliance etc

Teradata Enterprise Architecture

Teradata Enterprise Architecture

Teradata Architecture

Teradata Database system are 2 types

  • SMP
  • MPP

a) SMP[Symmetric Multi- Processing]

Teradata database system is called SMP System it has a single NODE That contains multiple CPU sharing Memory pool.

b) MPP[Massively parallel Processing]

  • Here multiple NODES are connected together via a component called as BYNET
  • Here all the NODES communication with each other with the help of virtual processes

Real-time usage

To perform mode operation and to have better storage this is recommended. 

 

NODE ARCHITECTURE

Teradata - Node Architecture

PE  2  Passing  Engine

PDF 2 Parallel DataBase Extension

AMP 2 Access Module Processor

VDISK 2 Virtual Disk

VPRDC 2 Virtual Processor

NODE

  • It is an important building block of the Teradata Database system it is a collection of hardware and software components.
  • A server can also be called as a node
  • Channel Attached system and network-attached systems can be connected to a NODE
  • Channel driver and Teradata GATEWAY are the application RUN under the operating system as processes
  • The remaining component runs under PDE
  • PE and AMP Virtual processor, where BYNET Is an Internal Layer between PE and AMP

PDE[Parallel DataBase Extension]

It Runs Teradata component in parallel

TPA[Trusted parallel Application]

A database is called pure parallel application (or) Trusted parallel application, it’s Runs under PDE

Teradata is a database, which is running under PDE, so we call it. Teradata has a pure parallel Database and trusted database 

SMP ARCHITECTURE[Symmetric Multi- Processing]

Single NODE Architecture can also be called as SMP Architecture, Here BYNET Can also refer as BROAD LESS BYNET(or) VIRTUAL BYNET(or)S OFT Ware BYNET

Recommended

For GIGABYTES of data processing with the minimum operation, this is recommended

MPP Architecture(Massively parallel processor)

Teradata - MPP Architecture

Collection of NODE which makes larger configuration which Is called MPP

All these nodes are connected via a component called BYNET,

Which allows multiple virtual processors and multiple NODES to Communicate with each other

This BYNET can also call as BOARD ORIENTED BYNET(OR) HARDWARE BYNET.

Loops in Teradata

Teradata BYNET Features

Fault Tolerance

Each BYNET having a network path to connect to the NODES, if there is a failure in any network path simple it reconfiguration Itself and avoids the unusable or failed path, in this way it is a fault occurrence. It tolerates itself

Load Balances

If BYNET O is not able to reconfiguration or not able to handle traffic, then all instructions redirected to BYNET1 and it is Balance the Load.

High scalability

If we increase the Number of NODES Teradata Doesn’t sacrifice any perform are and it scales Linearly.

NOTE

V2 R5 à512 NODES

V2R6, TD12à1024 NODES

Latest Versionà2048 NODES be can connect to BYNET upcoming

Some companies And Number of the NODES

JPMC -- 40 NODES

BOA -- 40 NODES

DBS -- 8 NODES

ICICI Production -- 4 NODES

CISCO -- 44 NODES

WALL MART -- 340 NODES

Barclays -- 11 NODES (or)

Recommendation

If Processing Tera BYTES of data and many operations the MPP is recommended.

Teradata Interview Questions

Operations and Operators in TeraData

Arithmetic Operations in Teradata

+

--

**

[]

()

Order of evaluation:-[(2+3)]/(4*5/6)

Teradata - Arithmetic operations

Logical Operators

AND

OR

NOT

AND B. OR C. NOT

NOT, AND, OR

Comparison Operators

<

< =

>

>=

<>                Not equal       2<=3>5<6>7  2  Left to right

Arithmetic expression

ABS(Argument)2Absolute value

 

Sqrt(Argument)2sqrt are root LIG(Argument)2Log 10 Algorithm LN(Argument)2 Log Algorithm Exp (Argument)2 ranking to the power of ‘e’ Teradata -DDL

Built-in functions Select  date;

Select  time;

Select  current – date;

Select  current – time;

Select  current – timestamp;

Select  session/user, Account, etc  

Difference between Teradata And Other RDBMS

           TERA DATA      Other RDBMS
Architectures Shared Nothing Shared Every thing
Processes MIPS[Millions of Instructions/sec KIPS[Thousand of institutions/sec]
Indexes Better Distribution And Retrieval Only FASI Retrieval
Parallelism Un-conditional conditional
Bulk Load Many Limited
Facilities Enterprise-wide Data warehousing OLTP More
stores TERA BYTES[Billions of rows] GIGA BYTES[Millions of rows]

 

Teradata

CRM[Customer, Relationship Management]

Teradata CRM

Teradata Advantages

Automatic, Even Data Distribution

In Other RDBMS Sequential distribution is Automatic, But in Teradata Even(or) Uniform (or)Random distribution is Automatic

High Scalability

NODE                         NODE

WORK                        WORK

USERS                        USERS

Other, RDBMS Linear Scalability Application Teradata

If increase Number of Node, USERS (OR) WORK, Teradata Doesn’t sacrifice any performance and it scales linearly

Matura optimizer

Older Version of Oracle/SQL Server

  • 32 Joins/Queries
  • 32 Sub Queries/Queries

Older Version of Tera data

  • 64 Joins/Queries
  • 64 Sub Queries/Queries
  • Formatting Commands
  • Analytical Commands
  • Aggregate Commands etc.

Teradata is having a powerful optimizer so that it is able to perform the above operations.

Models The Business

Teradata highly flexible to 3NF[3rd Normal Form] and it supports the below models also

  • Normalization
  • Star schema
  • Snow Flake schema
  • Bus schema
  • Galaxy Schema
  • Hybrid Schema

 

Acts Like a Single Data Store

Teradata - Single Data Store

Tera data is a warehouse, where we can stoke both CHANNEL ATTACHED Data and NETWORK ATTACHED Data.

Low-cost TCO[total cost of ownership]

Teradata

Functions in TeraData

Many Bulk Load Facilities

There are many facilities in Teradatato load and unload

BTEQ-Load/Unload

FAST LOAD-LOAD

MULTI LOAD- LOAD

FAST EXPORT- UN-LOAD

T PUMP-LOAD

OLE LOAD- LOAD/UN-LOAD

TeradataPARALLEL TRANSPORTER (TPT)-LOAD/UN-LOAD

TeradataPARALLEL TRANSPORTER API etc.

Many appliances added supported from teradata 13 onwards those are

Extreme data appliance1550/1555

Active Enterprise Data warehouse 5555/5550

Datamart appliance 2550/2555/2500 etc

Teradata Installation

Installing Teradata is very simple. It asks 2 things.

Storage space

Custom and typical option

Unzip folder, zip folder

Go to setup.exe

Click Main Menu

Install TeradataExpress

Installation Requirement

Minimum32-bit system Requirement

Micro-soft Windows XP with SP3

Micro-soft Windows Server 2003 with SP2

Micro-soft Windows Server 2008

Micro-soft windows vista

Micro-soft windows 7

Minimum hardware Requirement

1 GB Memory

5 GB to 12GB of Free Disk space

Click Next to install Menu and install the below component one by one

Teradata BYNET DRIVER

Teradata Database

Teradata Express tool

Teradata Tools and utilities

Note

It takes Max 20 to 25 minutes for the entire installation

For In-depth knowledge on Teradata click on:

 

About Author
Authorlogo
Name
TekSlate
Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.