Aggregation functions:
Using aggregation functions, we can do arithmetic manipulation in SQL. Using aggregate function, we can do column and row wise analysis.
- SQL produces each and every result in column wise.
- Aggregate functions are called summarize functions.
Ex:
Data medi;
Input Gid $ week 1 drug1 $ sub1
Week2 drug2 $ sub2;
Cards;
G100 3 col5mg 56 6 col10mg 40
G200 3 col5mg 50 6 col10mg 45
G300 3 col5mg 60 6 col10mg 49;
Proc sql;
Select Gid, week1,drug1,sub1,
Sum?(sub1) as total,
Mean(sub1) as avgsub,
Max(sub1)as maxsub from
Medi;
Quit;
/* row wise analysis*/
Ex:
Proc sql;
Select * sum(sub1,sub2)as total
From medi;
Quit;
Group by clause:
It can be used to do grouping analysis.
Ex:
Data clinical;
Input center $ trail $
Sub adsub;
Cards;
Appolo phase1 67 12
Nims phase1 78 14
Care phase1 34 10
Appolo phase2 267 22
Nims phase2 178 14
Care phase2 2 34 40
;
Proc sql;
Select center, trail,sub,sum(sub)
As total
From clinical
Group by center;
Quit;
Having clause:
Work like a where clause if your want to do grouping analysis based on condition we use having clause.
Ex:
Proc sql;
Select center, trail,sub,sum(sub)
As total
From clinical
Group by center having center;
Not in (‘care’);
Quit;
Interested in mastering SAS Developer? Learn more about SAS Training and Placement in this blog post.
Count functions:
It can be used to report frequency analysis. If we use * as the argument in count function , ots report number of observation generated by the current query statement.
Proc sql;
Select count(*) as obs
From clinical
Quit;
- If we use variable as argument, it reports number of non missing values generated by the required query statement.
/* no. of group participated in each and every trail*/
Ex:
Proc sql;
Select trail,count(trail) as obs
From clinical group by trail;
Quit;
/* each center conducted each trail no. of times*/
Proc sql;
Select center, trail,
count(center) as obs
From clinical group by
Center, trail;
Quit;
/* Total no. of patients participated in each trail and each center */
Ex:
Proc sql;
Select center, trail,
count(center), sum (sub )as obs
From clinical group by
Center, trail;
Quit;
Ex:
Proc sql;
Select center, trail,
count(center) as obs, sum(sub)
as totsub
From clinical group by
Center, trail;
Quit;
Distinct function:
It can be used to report unique values from the required variables
Syntax:
Distinct (argument)
Ex:
Proc sql;
Select distinct (center) as
Cenlist from clinical;
Quit;
Ex:
Proc sql;
Select distinct (trail) as
traillist from clinical;
Quit;
Coalesce function:
It can be used to replace the missing values for reporting.
- Numeric missing values we can replace using with numeric.
- Character missing values we can replace using with Character
Syntax:
Coalesce (variable name, replace char or replace number)
Ex:
Proc sql;
Select Coalesce (center,’Miss’) as center,trail,
Coalesce (sub) as sub
Adsub from clinical;
Quit;
Views:
Its work like sas data sets. We can create a views from the existed SAS data sets, views are working based on SAS data set.
- Using query statement we can create views
Ex:
Proc sql;
Create view appolo as
Select * from clinical
Where center = ‘appolo’;
Quit;
Views are 3 types:
- Data set views
- Sql views
- Access views
Data set views:
It is created by the data set block.
View option:
Using view option, we can create data set views.
Ex:
Data one/ view =one;
Set sashelp=class;
Where age <=14;
Run;
SQl views:
It is created by the SQL statements.
Ex:
Proc sql;
Create to view two as
Select * from
Sashelp.class;
Quit;
Access point:
it is created by the access procedure.
Difference between data set and views:
Data set | Views |
|
|
Describe statement:
It can be used to report structure of the table or view.
Syntax:
Describe table /view <table name>/<view name>;
Ex:
Proc sql;
Describe table
Sashelp .class;
Quit;
Proc sql;
Describe view two;
Quit;
- Describe statement can be used only for sql statement.
Creation process:
Data set create from data set.
View create from data set
View create from view
Data set create from view.
/* data set to data set*/
Ex:
Proc sql;
Create table emp1
As select * from emp
Where sale>=500;
Quit;
/* data set to view */
Ex:
Proc sql;
Create view emp2
As select * from emp
Quit;
/* view to view*/
Ex:
Proc sql;
Create view emp3
As select * from emp
Where sale>=500;
Quit;
/* view to data set */
Ex:
Proc sql;
Create table emp4
As select * from emp2
Where sale< 50;
Quit;
DROP THE TABLE OR VIEWS FROM THE SAS ENVIRONMENT:
DROP STATEMENT:
Can be used to drop the table or views from the SAS environment
Syntax: Drop table / view <table name>/<view name>
Eg: proc sql;
Drop table emp 1;
Drop table emp 3;
Quit;
CONSTRAINTS:
Can be used to load the necessary data in tables we can assign constrains in two ways
- Column constrains
- Table constrains
Integrity constrains type:
- UNIQUE: can be used to load the data without duplicate data values
- Not null : can be used to load the data without missing values (numeric type and character type )
- Check: can be used to load the data based on condition
- Primary key: can be used to load the data without duplicate data values and duplicate observations and without missing values
- Primary key, foreign key, reference key: can be used to link the tables
Unique:
Using unique constrains we can load the data without duplicate observations
Condata | paid | Age | Gender | Race |
Sheet 1 | 100 | 56 | Female | Asian |
101 | 45 | Female | African | |
101 | 46 | Female | African | |
102 | 34 | Male | Asian |
/* Extraction*/
Eg:
Proc import
Data file =‘d: \condata.xls’
Out =one dbms= XL replace;
Sheet =’sheet 1 $’,
Run;
Proc sql;
Describe table one;
Quit;
/*transformation*/
Proc sql;
Create table
Demo (pid num unique, age num, gender char, race char);
Quit;
Proc sql;
Describe table demo;
Quit;
/*loading*/
Proc append base = demo
Data = one force
- Only one constraint for one variable in table
- Multiple constraint in multiple variables in table
- Multiple constraint for only one variable in table
Not null:
Sheet 2
Eg:
Paid | Age | Gender | race |
100 | 56 | Female | Asian |
101 | 45 | Male | African |
34 | Female | African | |
103 | 34 | Male | Asian |
104 | 56 | Female | Asian |
Step 3 /*loading*/
Proc append base = demo 4
Data = four force;
Run;
Multiple constraints for one variable in table:
Raw data:
Paid | Age | Gender | Race |
100 | 56 | Female | Asian |
101 | 48 | Male | African |
34 | Female | African | |
103 | 34 | Male | Asian |
104 | 56 | Female | Asian |
104 | 48 | Female | Asian |
Step 1:
Importing the data in to SAS dataset name 5
Step 2:
To create a table with multiple constraints for one variable
/*transformation*/
Pro sql;
Create table
Demo (page num unique is not null, age num, gender char, race char);
Quit;
Proc sql;
Describe table demo,
Run;
Step 3:
/* loading*/
Note: using primary key as a column constraint we can load the data without duplicate data values and without missing values
Multiple constraints for multiple variables in table:
Raw data:
Paid | Age | Gender | Race |
100 | 56 | Female | Asian |
101 | 48 | Male | African |
102 | 34 | Female | African |
103 | Male | Asian | |
104 | 56 | Female | Asian |
104 | 45 | Female | Asian |
Step1:
/* extraction*/
Step 2:
/* null data set creation with constraints */ transformation
Proc sql;
Create table
Demo 2(paid num primary key, age num not null,
Check (age<=50), gender char, race char);
Quit;
Proc sql;
Describe table = demo;
Run;
Step3:
/*loading*/
Table constraint:
Using table constraint, we can avoid duplicate observations
Primary key:
Using primary key as a table constraint, we can avoid duplicate observations in loading time
Constraint statement:
Can be used to assign required constraint for required variable
Syntax: constraint <integrity name> <type> (variable);
-UN0001- Unique Pid num
Eg:
Proc sql;
Create table demo
(pis num unique, age num not null );
Quit;
Proc sql;
Describe table demo;
Quit;
Or
Proc sql;
Create table demo 1(pid num, age num, constraint UK unique (pid), constraint nt not null (age));
Quit;
Proc sql;
Describe table demo 1;
Quit;
Eg:
Raw data:
Paid | Test | Units | /*extraction*/ |
200 200 200 201 201 201 205 205 205 | Hr Sbp Dbp Hr Hr Sbp Sbp Dbp dbp | 76 156 90 89 89 145 156 89 89 | Proc importData file=’d: \conast.xls’Out=one dbms=excel replace;Sheet = ‘sheet 4$’;Quit;Proc sql;Describe table one;Quit; |
/*transformation*/ | |||
Proc sql;Create table lab( pid.num,test char, Units num, constraint pk,Primary key(pid, test, units));Quit; | |||
/*loading*/ | |||
Proc append base = labData= one force;Run;Proc print data= lab;Run; |
Link the tables for data loading:
We can link the tables using three keys
- Primary key
- Foreign key
- Reference key
These three constraints can be used as column constraints to link up the tables
In linking concept, we can link up dimension table (master table) with fact table (transformation table) for data loading we must use primary key in dimension table, foreign key and reference key in fact table
Note:
We should use only one primary key in dimension table but in fact table we can use multiple foreign keys
/*EXTRACTION*/
Proc import
Data=’d:\conast-xls’
Out= d dbms = excel replace;
Sheet = ‘sheet 2 $’
Quit;
Proc import
Out =l dbms = excel replace ;
Sheet = ‘sheet 3$’
Quit;
Proc sql;
Describe table d;
Describe table l;
Quit;
Proc sql;
Create table demo(pid num name char, age num, race char, color char, constraint pk primary kry(pid));
Quit;
Proc sql;
Create table lab( pid num, test char, units num, constraint pk primary key (pid) reference demo);
Quit;
Proc sql;
Create table lab 1(num, test char, units num, constraint fk foreign key (pid) reference demo);
Quit;
Proc sql;
Describe table demo;
Describe table lab;
Describe table lab 1;
Quit;
Proc append base = demo;
Data = d force;
Run;
Proc append base = lab
Data = l force;
Run;
Proc append base = lab 1;
Data = lab force;
Run;
Eg:
/*extraction*/
Proc import
Data file=‘d: \conast.xls’
Out = d dbms = =excel replace;
Sheet = ‘sheet 1 $’;
Quit;
Proc import
Data file = ‘d:\lata.xls ‘
Out = L dbms = =excel replace;
Sheet = ‘sheet 2 $’;
Quit;
Proc sql;
Describe table d;
Describe table l;
Quit;
/* Transformation*/
Proc sql;
Create table demo ( pid num, age num, center char, constraint pk primary key (pid));
Quit;
Proc sql;
Create table lab (labno num, center char, constraint pk primary key (labno));
Quit;
Proc sql;
Create table lab ( pid num, test char, unit num, constraint PK primary key(pid));
Quit;
Proc sql;
Create table lab1 ( pid num, test char, unit num, constraint FK foreign key(pid) references demo;
Quit;
Proc sql;
Describe table demo;
Describe table lab;
Describe table lab1;
Quit;
/* Loading */
Proc append base=demo
Data = d force;
Run;
Proc append base=lab
Data = l force;
Run;
Proc append base=lab1
Data = s force;
Run;
EX:
/*extraction*/
Proc import
Data file=‘d: \data.xls’
Out = d dbms = =excel replace;
Sheet = ‘sheet 1 $’;
Quit;
Proc import
Data file = ‘d:\data.xls ‘
Out = L dbms = =excel replace;
Sheet = ‘sheet 2 $’;
Quit;
Proc import
Data file = ‘d:\data.xls ‘
Out = L dbms = =excel replace;
Sheet = ‘sheet 3 $’;
Quit;
Proc sql;
Create table demo ( pid num, age num, center char, constraint pk primary key (pid));
Quit;
Proc sql;
Create table lab (labno num, center char, constraint pk primary key (labno));
Quit;
Proc sql;
Create table sam( pid num, labno num, test char, units num, constraint FK foreign key (pid)
references demo, constraint FK, foreign key(lab no) reference lab);
Quit;
Proc sql;
Create table lab1 ( pid num, test char, unit num, constraint FK foreign key(pid) references demo;
Quit;
Proc sql;
Describe table demo;
Describe table lab;
Describe table sam;
Quit;
Proc append base=demo
Data = d force;
Run;
Proc append base=lab
Data = l force;
Run;
Proc append base=sam
Data = s force;
Run;
TO DELETE THE DATA SETS:
If we want to delete table or data set from the SAS environment, the process is to
- Delete constraints
- Delete observations
- Drop the table from SAS environment
à We can delete the constraints using after statement
Delete constraints:
Syntax: alter table<table name>drop<integrity constraint name>
Eg:
Proc import
Data file = ‘d:\ condata.xls’
Out = one dbms = excel replace;
Sheet = ‘sheet 2$;
Quit;
Proc sql;
Create table demo ( pid num primary key, age num, gender char, race char constraint nt notnull(gender));
Quit;
Proc sql;
Describe table demo;
Quit;
/* Delete constraints */
Proc sql;
After table demo
Drop constraint-PK 0001-,nt;
Quit;
/* delete obs*/
Proc sql;
Delete from demo;
Quit;
/* delete SAS data set*/
Proc sql;
Drop table demo;
Quit;
Delete observations:
Syntax: proc sql;
Delete from <table name>;
Quit;
Delete table from SAS environment:
Proc sql:
Drop table<table name>;
Quit;
To assign the constraint for existed data set:
If we want to assign constraint for existed data set, then it existed data set all observations must be satisfied in required constraint in this case, SAS will be accepted required constraint, otherwise it is rejected.
Eg:
Data demo 1;
Input pid age;
Cards;
100 56
101 89
100 67
;
Data demo 2 ;
Input pid age;
Cards;
100 67
101 56
102 45
;
Proc sql;
After table demo 1 add
Constraint UK wnique(pid);
Quit;
Proc sql;
Describe table demo 1;
Quit;
Proc sql;
Describe table demo 2;
Quit;
For indepth knowledge on SAS, click on below