Welcome to Teradata Tutorials. The intent of these tutorials is to provide an in-depth understanding of Teradata Database. In addition to the tutorials, we will look at common interview questions, how-to guides, and issues and their resolutions.
Teradata Database can now connect to Hadoop through QueryGrid. This capability is called Teradata Database-to-Hadoop, also referred to as the Teradata-to-Hadoop connector.
It provides a SQL interface for transferring data between Teradata Database and remote Hadoop hosts.
-Import Hadoop data into a temporary or permanent Teradata table.
-Export data from temporary or permanent Teradata tables into existing Hadoop tables.
-Create or drop tables in Hadoop from Teradata Database.
-Reference tables on the remote hosts in SELECT and INSERT statements.
-Select Hadoop data for use with a business tool.
-Select and join Hadoop data with data from independent data warehouses for analytical use.
Tables: A table in a relational database management system is a two-dimensional structure made up of columns and physical rows stored in data blocks on the disk drives.
Views: A view is like a “window” into tables that allows multiple users to look at portions of the same base data. A view may access one or more tables, and may show only a subset of columns from the table(s).
Macros: Macros are predefined, stored sets of one or more SQL commands and/or report-formatting (BTEQ) commands. Macros can also contain comments.
Triggers: A trigger is a set of SQL statements usually associated with a column or table that are programmed to be run (or “fired”) when specified changes are made to the column or table. The pre-defined change is known as a triggering event, which causes the SQL statements to be processed.
Stored Procedures: A stored procedure is a pre-defined set of statements invoked through a single CALL statement in SQL. While a stored procedure may seem like a macro, it is different in that it can contain:
Teradata SQL data manipulation statements (non-procedural)
Procedural statements (in Teradata, referred to as Stored Procedure Language)
Joins are recommended for retrieving data from multiple tables and combining their columns. A join matches rows by comparing column values.
Types of Joins
How to write a join query: common join syntax (choose one join type per joined table)
SELECT columns | * FROM TableA
CROSS JOIN TableB
INNER JOIN TableB ON condition
LEFT [OUTER] JOIN TableB ON condition
RIGHT [OUTER] JOIN TableB ON condition
FULL [OUTER] JOIN TableB ON condition
WHERE residual condition
The residual condition in the WHERE clause is applied after all the joins in the query have been performed.
SELECT E.EID, E.EName, E.DID, D.DID, D.DName FROM emp E
CROSS JOIN Dept D
INNER JOIN Dept D ON E.DID = D.DID
LEFT JOIN Dept D ON E.DID = D.DID
RIGHT JOIN Dept D ON E.DID = D.DID
FULL JOIN Dept D ON E.DID = D.DID
A cross join is the cross product of 2 tables. If table A contains m rows and table B contains n rows, the result of the cross join contains m × n rows when there is no WHERE condition.
Real-time uses
Unless a cross product is genuinely required, we should not use a cross join, because it occupies more memory and operates on many rows, so it may cause spool space or buffer issues.
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E, Dept D
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E CROSS JOIN Dept D
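The m × n row count of a cross join can be checked on a small scale. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Teradata; the sample rows are made up for illustration, and the column names follow the queries above.

```python
import sqlite3

# In-memory SQLite stands in for Teradata here; the table contents are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp  (EID INTEGER, EName TEXT, Deptid INTEGER);
    CREATE TABLE Dept (Deptid INTEGER, DeptName TEXT);
    INSERT INTO emp  VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', 10);
    INSERT INTO Dept VALUES (10, 'Sales'), (20, 'Finance');
""")

# Explicit CROSS JOIN syntax.
cross_rows = conn.execute(
    "SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E CROSS JOIN Dept D"
).fetchall()

# Comma syntax without a WHERE condition produces the same cross product.
comma_rows = conn.execute(
    "SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E, Dept D"
).fetchall()

m = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
n = conn.execute("SELECT COUNT(*) FROM Dept").fetchone()[0]
print(len(cross_rows), m * n)  # 6 6
```

With 3 employees and 2 departments the result has 6 rows, which shows how quickly a cross join grows on real tables.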
An inner join gets the data based on the join condition.
If the condition is based on the equals operator, it is an equi join:
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E INNER JOIN Dept D ON E.Deptid = D.Deptid
If the condition contains an operator other than equals (<, <=, >, >=, <>), it is a non-equi join:
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E INNER JOIN Dept D ON E.Deptid <> D.Deptid
When a table joins to itself, it is a self join.
Identify each employee and the corresponding manager from the emp table by joining emp to itself on the manager relationship:
SELECT E1.EName AS Employee, E2.EName AS Manager FROM emp E1
INNER JOIN emp E2
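A self join needs a join condition that links an employee row to the manager's row. The sketch below runs the idea in SQLite through Python. MgrID is a hypothetical column name (the original table layout is not shown) that holds the EID of each employee's manager, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- MgrID is a hypothetical column holding the manager's EID.
    CREATE TABLE emp (EID INTEGER, EName TEXT, MgrID INTEGER);
    INSERT INTO emp VALUES
        (1, 'Asha',  NULL),
        (2, 'Ravi',  1),
        (3, 'Meena', 1),
        (4, 'Kiran', 2);
""")

# E1 plays the role of the employee, E2 the role of the manager.
pairs = conn.execute("""
    SELECT E1.EName AS Employee, E2.EName AS Manager
    FROM emp E1
    INNER JOIN emp E2 ON E1.MgrID = E2.EID
    ORDER BY E1.EID
""").fetchall()

print(pairs)  # [('Ravi', 'Asha'), ('Meena', 'Asha'), ('Kiran', 'Ravi')]
```

Asha has no manager, so the inner join leaves her out of the result. A left join would keep her with a null manager.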
An outer join takes matched data
It also takes unmatched data from the left or right table
Left outer join
Takes matched data
Takes unmatched data from the left table
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E
LEFT JOIN Dept D ON E.Deptid = D.Deptid
A right outer join gets matched records
It gets unmatched records from the right table
For an unmatched record, the left table's columns will be null
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E
RIGHT JOIN Dept D ON E.Deptid = D.Deptid
A full outer join gets matched data
It gets unmatched data from both the left and right tables
For unmatched rows, the other table's columns will be null
Common comparison questions: cross join versus full outer join, UNION versus UNION ALL, subquery versus correlated subquery.
Joining more than 2 tables in the same query:
SELECT E.EID, E.EName, D.Deptid, D.DeptName FROM emp E
INNER JOIN Dept D ON E.Deptid = D.Deptid
INNER JOIN emp_Address EA ON EA.empid = E.EID
Difference Between cross join and full outer join
a) Display all customers who are available in the customer table and not available in the calls table.
Use only joins. Do not use the NOT IN, <>, or NOT EXISTS operators.
SELECT CT.CID, CT.CName, CA.CID, CA.called FROM customer CT
LEFT JOIN call CA ON CT.CID = CA.CID
WHERE CA.CID IS NULL
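This left join plus IS NULL pattern can be tried out with a minimal sketch in SQLite through Python. The table is named calls here, following the wording of the question, and the sample rows and the CallDate column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (CID INTEGER, CName TEXT);
    CREATE TABLE calls    (CID INTEGER, CallDate TEXT);
    INSERT INTO customer VALUES (1, 'Anil'), (2, 'Bela'), (3, 'Chand');
    INSERT INTO calls    VALUES (1, '2020-01-05'), (1, '2020-01-09'), (3, '2020-02-01');
""")

# The left join keeps every customer; customers without a call get NULLs
# in the calls columns, and the WHERE clause keeps only those rows.
no_calls = conn.execute("""
    SELECT CT.CID, CT.CName
    FROM customer CT
    LEFT JOIN calls CA ON CT.CID = CA.CID
    WHERE CA.CID IS NULL
""").fetchall()

print(no_calls)  # [(2, 'Bela')]
```

Only the customer with no row in the calls table survives the IS NULL filter, without using NOT IN, <>, or NOT EXISTS.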
- A logical repository for Tables, Views, Macros, Stored Procedures
- Database may own objects
- Perm Space – max amount of space available for objects
- Spool Space – max amount of work space available for requests (like tempdb)
Database is empty until objects created in it
CREATE DATABASE finance FROM sysadmin AS
PERMANENT = 60000000,
SPOOL = 120000000,
FALLBACK PROTECTION,
AFTER JOURNAL,
BEFORE JOURNAL
- User is a database with an assigned password
- May own objects
- A user may log on to Teradata and access objects within itself and in other databases where the user has access rights
- A user is empty until objects are created within it.
- SA equivalent user is DBC (Database Computer).
Creating a User:
CREATE USER testuser FROM MyAppl AS
PERM = 2000000,
SPOOL = 5000000,
PASSWORD = SECRET,
DEFAULT DATABASE = Finance,
NO FALLBACK
The Hierarchy of Databases
- A new database or user must be created from an existing database or user.
- All Perm space specifications are subtracted from the immediate owner or parent.
- Perm space is a zero sum game – the total of all Perm Space allocations must equal the total amount of disk space available.
- Perm space is used for tables only.
- Perm space currently unassigned is available to be used as Spool.
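The rule that Perm space is subtracted from the immediate owner can be illustrated with a small model. This is only a sketch in Python: the Space class and its method are hypothetical names, and all sizes other than the 60000000 given to finance in the earlier example are made up.

```python
class Space:
    """A database or user in the hierarchy, tracking only its Perm space (bytes)."""

    def __init__(self, name, perm, parent=None):
        self.name = name
        self.perm = perm
        self.parent = parent

    def create_child(self, name, perm):
        """Create a child database or user; its Perm space is taken from this owner."""
        if perm > self.perm:
            raise ValueError(f"{self.name} has only {self.perm} bytes of Perm space left")
        self.perm -= perm
        return Space(name, perm, parent=self)


# Hypothetical system: DBC starts out owning all of the disk space.
dbc = Space("DBC", 1_000_000_000)
sysadmin = dbc.create_child("sysadmin", 400_000_000)
finance = sysadmin.create_child("finance", 60_000_000)

print(dbc.perm, sysadmin.perm, finance.perm)  # 600000000 340000000 60000000

# Zero-sum: the allocations still add up to the original total.
total = dbc.perm + sysadmin.perm + finance.perm
print(total)  # 1000000000
```

Creating finance reduces only sysadmin, its immediate owner. DBC is unaffected by that step.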
Teradata temporary tables
Teradata database provides various options in case of a need to use temporary tables. The temporary tables are especially useful when performing complicated calculations, they usually make multiple, complex SQL query simpler and increase overall SQL query performance. Temporary tables are especially useful for reporting and performing operations on summarized values.
Teradata provides the flexibility to use three types of temporary tables, which help users accomplish their work more easily. These tables are temporary to the database: they are not stored permanently on disk and are discarded after a specific time that depends on the type of table.
Types of temporary tables in Teradata are
1.Global temporary tables
2.Volatile temporary tables
3.Derived tables
Global Temporary Tables(GTT)
- They exist only for the duration of the SQL session in which they are used.
- The contents of these tables are private to the session, and the system automatically drops the session's instance of the table at the end of that session.
- System saves the Global Temporary Table Definition Permanently in the Data Dictionary.
- The Saved Definition may be Shared by Multiple Users and Sessions with Each Session getting its Own Instance of the Table.
Example of Global table
CREATE GLOBAL TEMPORARY TABLE MYDB.EMPLOYEE(
EMP_NO INTEGER, -- illustrative column list
EMP_NAME VARCHAR(50),
SALARY DECIMAL(10,2))
UNIQUE PRIMARY INDEX(EMP_NO)
ON COMMIT PRESERVE ROWS;
Volatile Temporary Tables(VTT)
- If you need a temporary table for a single use only, you can define a volatile table.
- The definition of a volatile table resides in memory (RAM) but does not survive across a system restart.
- It improves performance even more than using global temporary tables because the system does not store the definitions of volatile tables in the Data Dictionary.
- Access-rights checking is not necessary because only the creator can access the volatile table.
Example of Volatile table
CREATE VOLATILE TABLE EMPLOYEE(
EMP_NO INTEGER, -- illustrative column list
EMP_NAME VARCHAR(50),
SALARY DECIMAL(10,2))
UNIQUE PRIMARY INDEX(EMP_NO)
ON COMMIT PRESERVE ROWS;
ON COMMIT PRESERVE ROWS means to keep the data upon completion of the transaction.
We need to specify ON COMMIT PRESERVE ROWS explicitly because the default is ON COMMIT DELETE ROWS, which deletes the data from the table upon completion of the transaction.
- A special type of temporary table is the derived table. It is specified in SQL SELECT statement.
- A derived table is obtained from one or more other tables as the result of a subquery.
- A derived table is visible only at the level of the SELECT statement that calls the subquery.
- Using derived tables avoids having to use the CREATE and DROP TABLE statements for storing retrieved information and assists in coding more sophisticated, complex queries.
Example of Derived table
SEL EMP_NAME, SALARY FROM EMPLOYEE, (
SEL AVG(SALARY)
FROM EMPLOYEE) AS EMPLOYEE_TEMP(AVGSAL)
WHERE SALARY > AVGSAL
ORDER BY 2 DESC;
Here we want to know the names of the employees whose salary is greater than the average salary. In the example above, the average salary of the employees is calculated in the FROM clause, and EMPLOYEE_TEMP acts as a derived table. Note that the derived table must be given a name and its columns must be named, here through the column list (AVGSAL).
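The derived-table query can be run on sample data as a minimal sketch, using SQLite through Python rather than Teradata. The rows are made up, and because SQLite does not accept a column list after the derived table's name, the column is named with AS AVGSAL inside the subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT, SALARY REAL);
    INSERT INTO EMPLOYEE VALUES
        (1, 'Asha',  3000),
        (2, 'Ravi',  5000),
        (3, 'Meena', 7000),
        (4, 'Kiran', 9000);
""")

# EMPLOYEE_TEMP is the derived table: a one-row result holding the average salary.
above_average = conn.execute("""
    SELECT EMP_NAME, SALARY
    FROM EMPLOYEE,
         (SELECT AVG(SALARY) AS AVGSAL FROM EMPLOYEE) AS EMPLOYEE_TEMP
    WHERE SALARY > AVGSAL
    ORDER BY 2 DESC
""").fetchall()

print(above_average)  # [('Kiran', 9000.0), ('Meena', 7000.0)]
```

The average of the four salaries is 6000, so only the two higher-paid employees are returned, and no table had to be created or dropped to hold the intermediate result.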
Permanent VS Temporary Tables
- Permanent storage of tables is necessary when different sessions and users must share table contents.
- When tables are required for only a single session, we can request the system to create temporary tables.
- Using this type of table, we can save query results for use in subsequent queries within the same session.
- We can break down complex queries into smaller queries by storing results in a temporary table for use during the same session. When the session ends, the system automatically drops the temporary table.
Teradata Columnar – Column Partitioned Tables
Column Partitioning (CP) is a new physical database design implementation option that allows single columns or sets of columns of a NoPI table to be stored in separate partitions. Column partitioning can also be applied to join indexes.
Combined with Teradata's existing multilevel partitioning capability, this provides the capability for a table or join index to be column (vertically) partitioned, row (horizontally) partitioned, or both.
Teradata 14.0 introduces Teradata Columnar – a new approach for organizing the data of a user-defined table or join index on disk.
Teradata Columnar offers the ability to partition a table or join index by column.
Teradata Columnar can be used alone or in combination with row partitioning in multilevel partitioning definitions. Column partitions may be stored using traditional 'ROW' storage or alternatively stored using the new 'COLUMN' storage option. In either case, columnar can automatically compress physical rows where appropriate.
A table or join index that is partitioned by column has several key characteristics:
- It does not have a primary index.
- Each column partition can be composed of single or multiple columns.
- Each column partition usually consists of multiple physical rows.
- A new physical row format COLUMN may be utilized for a column partition.
Such a physical row is called a 'container' and it is used to implement columnar storage for a column partition.
- Alternatively, a column partition may also have traditional physical rows with ROW format. Such a physical row for columnar partitions is called a subrow. This is used to implement row storage for a column partition.
Note that in subsequent discussions, when 'row storage' or 'row format' is stated, it refers to columnar partitioning with the ROW storage option selected. This is not to be confused with row partitioning, which we associate with a PPI table.
- In a table with multiple levels of partitioning, only one level may be columnar.
All other levels must be row-partitioned (i.e., PPI).
Increased Partition Size
Concurrent with the columnar feature is an expansion of the number of partitions supported by a Teradata table. Previously, 65,535 was the maximum number of partitions supported. With Teradata 14, the number of partitions permitted is 9,223,372,036,854,775,807, or about 9.223 quintillion. Whereas a two-byte integer is enough to support up to 65,535 partition numbers, any table exceeding that number now requires an 8-byte partition number.
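The two limits quoted above correspond to the largest values that fit in 2 bytes and in 8 bytes. The short Python check below shows the arithmetic. Treating the 2-byte number as unsigned and the 8-byte number as signed is an assumption made here because it reproduces the quoted figures, and the helper function fits is a hypothetical name.

```python
def fits(value, n_bytes, signed):
    """Return True if value can be stored in an integer of n_bytes bytes."""
    try:
        value.to_bytes(n_bytes, "big", signed=signed)
        return True
    except OverflowError:
        return False


max_2_byte = 2**16 - 1  # largest unsigned 2-byte value
max_8_byte = 2**63 - 1  # largest signed 8-byte value

print(f"{max_2_byte:,}")  # 65,535
print(f"{max_8_byte:,}")  # 9,223,372,036,854,775,807

print(fits(65_535, 2, signed=False))  # True
print(fits(65_536, 2, signed=False))  # False: one more partition needs a wider number
print(fits(max_8_byte, 8, signed=True))  # True
```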
Column Partitioning Implementation
With column partitioning, each column or specified group of columns in the table can become a partition containing the column partition values of that column partition. This is the simplest partitioning approach since there is no need to define partitioning expressions, as seen in the following example:
CREATE TABLE SALES (TxnNo INTEGER, TxnDate DATE, ItemNo INTEGER, Quantity INTEGER)
PARTITION BY COLUMN,
UNIQUE INDEX (TxnNo);
The clause PARTITION BY COLUMN specifies that the table has column partitioning. Each column of this table will have its own partition and will be (by default) in column storage since no explicit column grouping is specified.
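As a conceptual picture only (this is not Teradata's on-disk format), the effect of giving every column its own partition can be sketched in Python. The sample rows are made up, and the column names follow the SALES table.

```python
# Conceptual picture only; this is not Teradata's on-disk format.
COLUMNS = ("TxnNo", "TxnDate", "ItemNo", "Quantity")

# Made-up sample rows for the SALES table.
rows = [
    (1, "2024-01-05", 101, 2),
    (2, "2024-01-05", 102, 1),
    (3, "2024-01-06", 101, 5),
]

# PARTITION BY COLUMN with no grouping: one partition per column.
column_partitions = {
    name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)
}

# A query such as SELECT SUM(Quantity) only needs the Quantity partition.
total_quantity = sum(column_partitions["Quantity"])

# A full row is rebuilt by reading the same position from every partition.
second_row = tuple(column_partitions[name][1] for name in COLUMNS)

print(total_quantity)  # 8
print(second_row)      # (2, '2024-01-05', 102, 1)
```

A query that touches one column reads one partition, while a query that needs whole rows has to visit all of them. That is the trade-off behind column partitioning.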
Enterprise Data Warehouse
Active Data Warehouse
Internet and E-Commerce
CRM [Customer Relationship Management]
Data Mart Appliance, etc.
Teradata Enterprise Architecture
Teradata Database systems are of 2 types
a) SMP [Symmetric Multi-Processing]
A Teradata Database system that has a single node containing multiple CPUs sharing a memory pool is called an SMP system.
b) MPP [Massively Parallel Processing]
Here multiple nodes are connected together via a component called the BYNET.
All the nodes communicate with each other with the help of virtual processors.
Real-time usage
MPP is recommended for performing more operations and having more storage.
PE Parsing Engine
PDE Parallel Database Extensions
AMP Access Module Processor
VDISK Virtual Disk
VPROC Virtual Processor
A node is an important building block of the Teradata Database system. It is a collection of hardware and software components.
A server can also be called a node.
Channel-attached systems and network-attached systems can be connected to a node.
The channel driver and the Teradata Gateway are applications that run under the operating system as processes.
The remaining components run under PDE.
PE and AMP are virtual processors, and the BYNET is an internal layer between PE and AMP.
PDE [Parallel Database Extensions]
It runs Teradata components in parallel.
TPA [Trusted Parallel Application]
A database that runs under PDE is called a pure parallel application or trusted parallel application.
Teradata is a database that runs under PDE, so we call Teradata a pure parallel database and a trusted parallel application.
SMP Architecture [Symmetric Multi-Processing]
A single-node architecture can also be called an SMP architecture. Here the BYNET can also be referred to as the boardless BYNET, virtual BYNET, or software BYNET.
This is recommended for processing gigabytes of data with minimal operations.
MPP Architecture [Massively Parallel Processing]
A collection of nodes that makes up a larger configuration is called MPP.
All these nodes are connected via a component called the BYNET,
which allows multiple virtual processors and multiple nodes to communicate with each other.
This BYNET can also be called the board-oriented BYNET or hardware BYNET.
Teradata BYNET Features
Each BYNET has network paths connecting it to the nodes. If there is a failure in any network path, it simply reconfigures itself and avoids the unusable or failed path. In this way it is fault tolerant.
If BYNET 0 is not able to reconfigure itself or not able to handle the traffic, then all instructions are redirected to BYNET 1, which balances the load.
If we increase the number of nodes, Teradata doesn't sacrifice any performance, and it scales linearly.
V2R5 → 512 nodes
V2R6, TD12 → 1024 nodes
Latest version → 2048 nodes can be connected to the BYNET (upcoming)
Some companies and their number of nodes
JPMC - 40 nodes
BOA - 40 nodes
DBS - 8 nodes
ICICI Prudential - 4 nodes
CISCO - 44 nodes
Walmart - 340 nodes
Barclays - 11 nodes
MPP is recommended for processing terabytes of data with many operations.
Operations and Operators in TeraData
Arithmetic Operations in Teradata
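Order of evaluation: [(2+3)]/(4*5/6), with parentheses evaluated first
Logical operators: A. AND B. OR C. NOT
Order of precedence: NOT, AND, OR
Comparison operators (such as <> for not equal, <=, >, <) are evaluated left to right
SQRT(argument): square root
LOG(argument): base-10 logarithm
LN(argument): natural logarithm
EXP(argument): e raised to the power of the argument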
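The NOT, AND, OR order of precedence listed above can be checked with constant expressions. The sketch below uses SQLite through Python as a stand-in, on the assumption that it parses these logical operators in the same order. The helper function evaluate is a hypothetical name, and 1 and 0 stand for true and false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expression):
    """Return the value of a constant SQL expression (1 = true, 0 = false)."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


# AND binds tighter than OR: parsed as 1 OR (0 AND 0).
and_before_or = evaluate("1 OR 0 AND 0")
forced_or_first = evaluate("(1 OR 0) AND 0")

# NOT binds tighter than AND: parsed as (NOT 0) AND 0.
not_before_and = evaluate("NOT 0 AND 0")
forced_and_first = evaluate("NOT (0 AND 0)")

print(and_before_or, forced_or_first)    # 1 0
print(not_before_and, forced_and_first)  # 0 1

# Parentheses first, with integer arithmetic in SQLite: 5 / (20 / 6) = 5 / 3 = 1.
arithmetic = evaluate("(2+3)/(4*5/6)")
print(arithmetic)  # 1
```

Each pair gives different results with and without the explicit parentheses, which is why complex conditions are safer when parenthesized.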
Built in functions
SELECT CURRENT_DATE;
SELECT CURRENT_TIME;
SELECT CURRENT_TIMESTAMP;
SELECT SESSION, USER, ACCOUNT; and so on
Difference between Teradata And Other RDBMS
|Feature||Teradata||Other RDBMS|
|Architecture||Shared nothing||Shared everything|
|Processes||MIPS [millions of instructions/sec]||KIPS [thousands of instructions/sec]|
|Indexes||Better distribution and retrieval||Only fast retrieval|
|Facilities||Enterprise-wide data warehousing||More OLTP|
|Stores||Terabytes [billions of rows]||Gigabytes [millions of rows]|
CRM [Customer Relationship Management]
Automatic, Even Data Distribution
In other RDBMSs sequential distribution is automatic, but in Teradata even (uniform, or random) distribution is automatic.
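Teradata achieves this by hashing each row's primary index value to choose the AMP that stores the row. The Python sketch below illustrates the idea only: MD5 is used as a stand-in hash (it is not Teradata's hashing algorithm), and the number of AMPs, the function name amp_for, and the employee numbers are all hypothetical.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical number of AMPs


def amp_for(primary_index_value):
    """Pick an AMP by hashing the primary index value (MD5 as a stand-in hash)."""
    digest = hashlib.md5(str(primary_index_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


# Distribute 10,000 rows with sequential employee numbers as the primary index.
rows_per_amp = Counter(amp_for(emp_no) for emp_no in range(10_000))

for amp in sorted(rows_per_amp):
    print(f"AMP {amp}: {rows_per_amp[amp]} rows")
```

Even though the employee numbers are sequential, each AMP receives roughly a quarter of the rows, so no single AMP becomes a bottleneck. A primary index with few distinct values would hash to few AMPs and lose this property.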
Linear Scalability
If we increase the number of nodes, users, or work, Teradata doesn't sacrifice any performance, and it scales linearly.
Older versions of Oracle/SQL Server
-32 subqueries per query
Older versions of Teradata
-64 subqueries per query
-Aggregate commands, etc.
Because Teradata has a powerful optimizer, it is able to perform the above operations.
Models The Business
Teradata is highly flexible with 3NF [3rd Normal Form], and it supports the models below as well
-Snowflake schema
Acts Like a Single Data Store
Teradata is a warehouse where we can store both channel-attached data and network-attached data.
Low cost TCO[total cost of ownership]
Many Bulk Load Facilities
There are many facilities in Teradata to load and unload data
MULTILOAD - LOAD
FASTEXPORT - UNLOAD
OLE LOAD - LOAD/UNLOAD
Teradata PARALLEL TRANSPORTER (TPT) - LOAD/UNLOAD
Teradata PARALLEL TRANSPORTER API, etc.
Many appliances are supported from Teradata 13 onwards, including
Extreme Data Appliance 1550/1555
Active Enterprise Data Warehouse 5555/5550
Data Mart Appliance 2550/2555/2500, etc.
Installing Teradata is very simple. It asks for 2 things.
Custom or typical option
Zipped or unzipped folder
Go to setup.exe
Click Main Menu
Minimum 32-bit system requirements
Microsoft Windows XP with SP3
Microsoft Windows Server 2003 with SP2
Microsoft Windows Server 2008
Microsoft Windows Vista
Microsoft Windows 7
Minimum hardware requirements
1 GB memory
5 GB to 12 GB of free disk space
Click Next in the install menu and install the components below one by one
Teradata BYNET driver
Teradata Express tool
Teradata Tools and Utilities
The entire installation takes at most 20 to 25 minutes