Group Discounts available for 3+ students and Corporate Clients

# SQL Aggregations & DataSet(Row & colummn operations) in SAS

## Aggregation functions:

Using aggregation functions, we can do  arithmetic manipulation in SQL. Using aggregate function, we can do column and row wise analysis.

• SQL produces each and every result in column wise.
• Aggregate functions are called summarize functions.

Ex:

Data medi;

Input Gid \$ week 1 drug1 \$ sub1

Week2 drug2  \$ sub2;

Cards;

G100  3  col5mg  56   6   col10mg  40

G200  3  col5mg  50   6   col10mg  45

G300  3  col5mg  60   6   col10mg  49;

Proc sql;

Select Gid, week1,drug1,sub1,

Sum?(sub1) as total,

Mean(sub1) as avgsub,

Max(sub1)as maxsub from

Medi;

Quit;

/* row wise analysis*/

Ex:

Proc sql;

Select * sum(sub1,sub2)as total

From medi;

Quit;

### Group by clause:

It can be used to do grouping analysis.

Ex:

Data clinical;

Input center \$ trail \$

Cards;

Appolo   phase1  67  12

Nims      phase1    78  14

Care        phase1  34  10

Appolo   phase2  267  22

Nims      phase2    178  14

Care        phase2  2 34  40

;

Proc sql;

Select  center, trail,sub,sum(sub)

As total

From clinical

Group by center;

Quit;

### Having clause:

Work like a where clause if your want to do grouping analysis based on condition we use having clause.

Ex:

Proc sql;

Select  center, trail,sub,sum(sub)

As total

From clinical

Group by center  having center;

Not in (‘care’);

Quit;

`Interested in mastering SAS Developer? Learn more about SAS Training and Placement in this blog post.`

### Count functions:

It can be used to report frequency analysis. If we use * as the argument in count function , ots report number of observation generated by the current query statement.

Proc sql;

Select   count(*) as obs

From clinical

Quit;

• If we use variable as argument, it reports number of non missing values generated by the required query statement.
##### /* no. of group participated in each and every trail*/

Ex:

Proc sql;

Select   trail,count(trail) as obs

From clinical  group by trail;

Quit;

##### /* each center conducted each trail no. of times*/

Proc sql;

Select   center, trail,

count(center) as obs

From clinical group by

Center, trail;

Quit;

##### /* Total no. of patients participated in each trail and each center */

Ex:

Proc sql;

Select   center, trail,

count(center), sum (sub )as obs

From clinical group by

Center, trail;

Quit;

Ex:

Proc sql;

Select   center, trail,

count(center) as obs, sum(sub)

as totsub

From clinical group by

Center, trail;

Quit;

### Distinct function:

It can be used to report unique values from the required variables

Syntax:

Distinct (argument)

Ex:

Proc sql;

Select distinct (center) as

Cenlist from clinical;

Quit;

Ex:

Proc sql;

Select distinct (trail) as

traillist from clinical;

Quit;

## Coalesce function:

It can be used to replace the missing values for reporting.

• Numeric missing values we can replace using with numeric.
• Character missing values we can replace using with Character

Syntax:

Coalesce (variable name, replace char or  replace number)

Ex:

Proc sql;

Select Coalesce (center,’Miss’) as center,trail,

Coalesce (sub) as sub

Quit;

## Views:

Its work like sas data sets. We can create a views from the existed SAS data sets, views are working based on SAS data set.

• Using query statement we can create views

Ex:

Proc sql;

Create view appolo as

Select * from clinical

Where center = ‘appolo’;

Quit;

Views are 3 types:

1. Data set views
2. Sql views
3. Access views

## Data set views:

It is created by the data set block.

### View option:

Using view option, we can create data set views.

Ex:

Data one/ view =one;

Set sashelp=class;

Where age <=14;

Run;

### SQl views:

It  is created by the SQL statements.

Ex:

Proc sql;

Create to view two as

Select * from

Sashelp.class;

Quit;

### Access point:

it is created by the access procedure.

### Difference between data set and views:

 Data set Views It is a physical storage area. To occupy some place for storage. Data set can be created using required statements import ,insert ,create table etc. It takes less processing time. If we do any changes in data set, the changes are occurred in current data set or same data set.   Data set store the data in physical format(variables and observations) It is a logical area, it does not occupy any dise place for storage. Views are created by the existed data set.   It takes more processing time. If we do any changes in views, these changes are occurred in existed data sets. If we do any changes in existed data set, these changes are occurred in views. Views are stored in query statement instead of data.

## Describe statement:

It can be used to report structure of the table or view.

Syntax:

Describe table /view <table name>/<view name>;

Ex:

Proc sql;

Describe table

Sashelp .class;

Quit;

Proc sql;

Describe view two;

Quit;

• Describe statement can be used only for sql statement.

### Creation process:

Data set create from data set.

View create from data set

View create from view

Data set create from view.

##### /* data set to data set*/

Ex:

Proc sql;

Create table emp1

As select * from emp

Where sale>=500;

Quit;

##### /* data set to view */

Ex:

Proc sql;

Create view emp2

As select * from emp

Quit;

##### /* view to view*/

Ex:

Proc sql;

Create view emp3

As select * from emp

Where sale>=500;

Quit;

##### /* view to data set */

Ex:

Proc sql;

Create table emp4

As select * from emp2

Where sale< 50;

Quit;

## DROP THE TABLE OR VIEWS FROM THE SAS ENVIRONMENT:

DROP STATEMENT:

Can be used to drop the table or views from the SAS environment

Syntax: Drop table / view <table name>/<view name>

Eg: proc sql;

Drop table emp 1;

Drop table emp 3;

Quit;

### CONSTRAINTS:

Can be used to load the necessary data in tables we can assign constrains in two ways

1. Column constrains
2. Table constrains

### Integrity constrains type:

• UNIQUE: can be used to load the data without duplicate data values
• Not null : can be used to load the data without missing values (numeric type and character type )
• Check: can be used to load the data based on condition
• Primary key: can be used to load the data without duplicate data values and duplicate observations and without missing values
• Primary key, foreign key, reference key: can be used to link the tables

Unique:

Using unique constrains we can load the data without duplicate observations

 Condata paid Age Gender Race Sheet 1 100 56 Female Asian 101 45 Female African 101 46 Female African 102 34 Male Asian

/* Extraction*/

Eg:

Proc import

Data file =‘d: \condata.xls’

Out =one dbms= XL replace;

Sheet =’sheet 1 \$’,

Run;

Proc sql;

Describe table one;

Quit;

/*transformation*/

Proc sql;

Create table

Demo (pid num unique, age num, gender char, race char);

Quit;

Proc sql;

Describe table demo;

Quit;

Proc append base = demo

Data = one force

• Only one constraint for one variable in table
• Multiple constraint in multiple variables in table
• Multiple constraint for only one variable in table

Not null:

Sheet 2

Eg:

 Paid Age Gender race 100 56 Female Asian 101 45 Male African 34 Female African 103 34 Male Asian 104 56 Female Asian

Proc append base = demo 4

Data = four force;

Run;

### Multiple constraints for one variable in table:

Raw data:

 Paid Age Gender Race 100 56 Female Asian 101 48 Male African 34 Female African 103 34 Male Asian 104 56 Female Asian 104 48 Female Asian

Step 1:

Importing the data in to SAS dataset name 5

Step 2:

To create a table with multiple constraints for one variable

/*transformation*/

Pro sql;

Create table

Demo (page num unique is not null, age num, gender char, race char);

Quit;

Proc sql;

Describe table demo,

Run;

Step 3:

Note: using primary key as a column constraint we can load the data without duplicate data values and without missing values

## Multiple constraints for multiple variables in table:

Raw data:

 Paid Age Gender Race 100 56 Female Asian 101 48 Male African 102 34 Female African 103 Male Asian 104 56 Female Asian 104 45 Female Asian

Step1:

/* extraction*/

Step 2:

/* null data set creation with constraints */ transformation

Proc sql;

Create table

Demo 2(paid num primary key, age num not null,

Check (age<=50), gender char, race char);

Quit;

Proc sql;

Describe table = demo;

Run;

Step3:

Table constraint:

Using table constraint, we can avoid duplicate observations

### Primary key:

Using primary key as a table constraint, we can avoid duplicate observations in loading time

Constraint statement:

Can be used to assign required constraint for required variable

Syntax: constraint <integrity name>     <type>     (variable);

-UN0001-          Unique      Pid num

Eg:

Proc sql;

Create table demo

(pis num unique, age num not null );

Quit;

Proc sql;

Describe table demo;

Quit;

Or

Proc sql;

Create table demo 1(pid num, age num, constraint UK unique (pid), constraint nt not null (age));

Quit;

Proc sql;

Describe table demo 1;

Quit;

Eg:

Raw data:

 Paid Test Units /*extraction*/ 200 200 200 201 201 201 205 205 205 Hr Sbp Dbp Hr Hr Sbp Sbp Dbp dbp 76 156 90 89 89 145 156 89 89 Proc importData file=’d: \conast.xls’Out=one dbms=excel replace;Sheet = ‘sheet 4\$’;Quit;Proc sql;Describe table one;Quit; /*transformation*/ Proc sql;Create table lab( pid.num,test char, Units num,  constraint pk,Primary key(pid, test, units));Quit; /*loading*/ Proc append base = labData= one force;Run;Proc print data= lab;Run;

We can link the tables using three keys

1. Primary key
2. Foreign key
3. Reference key

These three constraints can be used as column constraints to link up the tables

In linking concept, we can link up dimension table (master table) with fact table (transformation table) for data loading we must use primary key in dimension table, foreign key and reference key in fact table

Note:

We should use only one primary key in dimension table but in fact table we can use multiple foreign keys

/*EXTRACTION*/

Proc import

Data=’d:\conast-xls’

Out= d dbms = excel replace;

Sheet = ‘sheet 2 \$’

Quit;

Proc import

Out =l dbms = excel replace ;

Sheet = ‘sheet 3\$’

Quit;

Proc sql;

Describe table d;

Describe table l;

Quit;

Proc sql;

Create table demo(pid num name char, age num, race char, color char, constraint pk primary kry(pid));

Quit;

Proc sql;

Create table lab( pid num, test char, units num, constraint pk primary key (pid) reference demo);

Quit;

Proc sql;

Create table lab 1(num, test char, units num, constraint fk foreign key (pid) reference demo);

Quit;

Proc sql;

Describe table demo;

Describe table lab;

Describe table lab 1;

Quit;

Proc append base = demo;

Data = d force;

Run;

Proc append base = lab

Data = l force;

Run;

Proc append base = lab 1;

Data = lab force;

Run;

Eg:

/*extraction*/

Proc import

Data file=‘d: \conast.xls’

Out = d dbms = =excel replace;

Sheet = ‘sheet 1 \$’;

Quit;

Proc import

Data file = ‘d:\lata.xls ‘

Out = L dbms = =excel replace;

Sheet = ‘sheet 2 \$’;

Quit;

Proc sql;

Describe table d;

Describe table l;

Quit;

/* Transformation*/

Proc sql;

Create table demo ( pid num, age num, center char, constraint pk primary key (pid));

Quit;

Proc sql;

Create table lab (labno num, center char, constraint pk primary key (labno));

Quit;

Proc sql;

Create table lab ( pid num, test char, unit num, constraint PK primary key(pid));

Quit;

Proc sql;

Create table lab1 ( pid num, test char, unit num, constraint FK foreign key(pid) references demo;

Quit;

Proc sql;

Describe table demo;

Describe table lab;

Describe table lab1;

Quit;

Proc append base=demo

Data = d force;

Run;

Proc append base=lab

Data = l force;

Run;

Proc append base=lab1

Data = s force;

Run;

EX:

/*extraction*/

Proc import

Data file=‘d: \data.xls’

Out = d dbms = =excel replace;

Sheet = ‘sheet 1 \$’;

Quit;

Proc import

Data file = ‘d:\data.xls ‘

Out = L dbms = =excel replace;

Sheet = ‘sheet 2 \$’;

Quit;

Proc import

Data file = ‘d:\data.xls ‘

Out = L dbms = =excel replace;

Sheet = ‘sheet 3 \$’;

Quit;

Proc sql;

Create table demo ( pid num, age num, center char, constraint pk primary key (pid));

Quit;

Proc sql;

Create table lab (labno num, center char, constraint pk primary key (labno));

Quit;

Proc sql;

Create table sam( pid num, labno num, test char, units num, constraint FK foreign key (pid)

references demo, constraint FK, foreign key(lab no) reference lab);

Quit;

Proc sql;

Create table lab1 ( pid num, test char, unit num, constraint FK foreign key(pid) references demo;

Quit;

Proc sql;

Describe table demo;

Describe table lab;

Describe table sam;

Quit;

Proc append base=demo

Data = d force;

Run;

Proc append base=lab

Data = l force;

Run;

Proc append base=sam

Data = s force;

Run;

## TO DELETE THE DATA SETS:

If we want to delete table or data set from the SAS environment, the process is to

1. Delete constraints
2. Delete observations
3. Drop the table from SAS environment

à We can delete the constraints using after statement

Delete constraints:

Syntax: alter table<table name>drop<integrity constraint name>

Eg:

Proc import

Data file = ‘d:\ condata.xls’

Out = one dbms = excel replace;

Sheet = ‘sheet 2\$;

Quit;

Proc sql;

Create table demo ( pid num primary key, age num, gender char, race char constraint nt notnull(gender));

Quit;

Proc sql;

Describe table demo;

Quit;

/* Delete constraints */

Proc sql;

After table demo

Drop constraint-PK 0001-,nt;

Quit;

/* delete obs*/

Proc sql;

Delete from demo;

Quit;

/* delete SAS data set*/

Proc sql;

Drop table demo;

Quit;

### Delete observations:

Syntax: proc sql;

Delete from <table name>;

Quit;

### Delete table from SAS environment:

Proc sql:

Drop table<table name>;

Quit;

### To assign the constraint for existed data set:

If we want to assign constraint for existed data set, then it existed data set all observations must be satisfied in required constraint in this case, SAS will be accepted required constraint, otherwise it is rejected.

Eg:

Data demo 1;

Input pid age;

Cards;

100         56

101         89

100         67

;

Data demo 2 ;

Input pid age;

Cards;

100         67

101         56

102         45

;

Proc sql;

Constraint UK wnique(pid);

Quit;

Proc sql;

Describe table demo 1;

Quit;

Proc sql;

Describe table demo 2;

Quit;

For indepth knowledge on SAS, click on below