String function:
a. Length:
Using length function, we can find out length of the string(no. of characters include blank spaces). Length function returns numeric value.
b. Index function:
Returns position of character in string. It works based on character and word wise.
Note:
Specific character not available in string it returns zero. Length and index function returns numeric value.
c. Scan function:
Using scan function we can get required word from the string.
d. Sub string:
Using this function, we can get part of the string from string. If requires 3 arguments.
- Variable name
- Starting position
- Number of characters
e. concatenation(combine):
it is used for combining the strings. Symbol is||
ex:
data string;
input pid surname $
name & : $15 age ;
cards;
100 chittu Chandra Sekhar 56
101 Kolla ram mohan rao 45
102 konda pavan kumar 34
;
Data string1;
Set string;
Sl = length (name);
Si = index( name, ’a’);
Sib= index (name, ‘ ‘);
Siw = index (name, ’ra’);
Sc = Scan(name,2);
Sub = Substr(name,5,10);
Fname=Surname|| name;
Run;
Proc print data =string1;
Run;
Interested in mastering SAS Certification? Enroll now for FREE demo on SAS Training Program.
f. Compress function:
It can be used to remove specific characters from the string. It requires 2 arguments. It is working based on the characters wise.
Syntax:
Compress =(variable name, ’character’);
Note:
If we omit the 2nd argument in compress function, it removes the blank spaces from the string.
Data:
G100 col 0.005 headache and backpain
G200 Aspirin 0.005 Null and skin problem
G300 col 0.1 headache and Null
G400 Aspirin 0.1 headache and eye problem
Ex:
Data one;
Infile ‘d:\medi.txt’;
Input group $ drug $ dos
Adevent :$12 .;
Run;
Data medi;
Set one;
If dos =0.005 then dos =5;
Else dos =10;
Drug1 = drug ||’-’|| dos || ‘mg’;
Drug1= compress (drug1);
Exad = scan(adevent ,1);
Unexad = scan(adevent ,2);
Drop drug dos adevent;
Rename drug1 =drug;
Run;
Proc print data =medi;
Run;
g. Translate function:
It can be used to replace the required character in string.
Ex:
Data medi;
Input pid drug$;
Cards;
100 col5mg
101 col10mg
102 col15mg
;
Data medi;
Set medi;
Drug = translate(drug.’C’.’c’);
Run;
Proc print data =medi;
Run;
Ex:
Data demo;
Input stno name & :$21 age color $;
Cards;
100 chittu Chandra Sekhar 23 white
101 Kolla ram mohan rao 34 black
Data demo1;
Set demo;
Surname =scan (name,1);
Bp = index (name, ‘ ’);
Name= Substr(name, bp+1);
Drop bp;
Run;
Proc print data =demo1;
Run;
Date and time function:
Ex:
Data demo;
Input stno scvdate :date9
Set time :time 8.;
Format scvdate date 9.
Scvtime time 8;
Cards;
100 12jan2003 12:23:34
101 13feb2003 13:23:34
102 14feb2003 11:23:34
103 15mar2003 10:23:34
;
Data demo1;
Set demo;
Scv day = day(scvdate);
Scv month= month(scvdate);
Scv year= year(scvdate);
Sc hour =hour(scvtime);
Sc min =minute(scvtime);
Sc sec =second (scvtime);
Run;
Proc print data =demo1;
Run;
/*To report Feb month */
Proc print data =demo1;
Where scv month =2;
Run;
Date part function:
It can be used to get date value from the date and time variables.
Syntax:
Datepart(argument)
2. Time part function:
It can be used to get time value from the date and time variables.
Syntax:
Time part(argument)
Ex:
Data demo;
Input stno scvdtime :datetime18;
Format scvdtime datetime 18.;
Cards;
100 12jan2003 12:23:34
101 13feb2003 13:23:34
;
Data demo1;
Set demo;
Scdate = datepart(scvdtime);
Sctime = timepart(scvdtime);
Format scvdate date 9. sctime time 8.;
Drop scvdtime;
Run;
Proc print data =demo1;
Run;
3. Intck function:
It can be used to report difference between the date values in day intervals , month intervals or year intervals.
Ex:
Data medi;
Input stno sdate edate;
Informat sdate edate date 9;
Format sdate edate date 9;
Cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
;
Data medi;
Set medi;
Days = inteck(‘day’, sdate edate);
Run;
Proc print data =medi1;
Run;
Data management:
It is one type of data reading concept using this concept, we can access data from different SAS files and arrange the data into specific order.
It can be 2 types
- Adding
- Combine
Adding:
This concept can be done in 2 ways
- Appending
- Concatenation
Appending:
To add one or more data set data into a existed data set. This concept is called appending
Append procedure:
Using this procedure, we can do append procedure and concatenation.
Syntax:
Proc append base =<master> data = <trans>;
Ex:
Data demo 1;
Input pid age gender $;
Cards;
100 34 female
101 56 male
;
Data demo2;
Input pid age gender $;
Cards;
200 56 male
201 34 female
;
Data demo3;
Input pid age gender $;
Cards;
300 56 male
301 34 female
Proc append base =demo 1
Data =demo2;
Run;
Proc append base =demo 1
Data =demo3;
Run;
Concatenation:
To add one or more data sets observation into another new data set this concept is called concatenation.
Using append procedure we can create new data sets for concatenation.
Ex:
Proc append base = demo
Data =demo1;
Run;
Proc append base = demo
Data =demo2;
Run;
Proc append base = demo
Data =demo3;
Run;
Case 1:
If we run the append procedure multiple, observations are added in based data set. To overcome this problem use with ‘constraints’.
Case 2:
In appending time, additional variables are occurred in master table we can run the appending. Its default takes missing values for transition data or observation for additional variable.
Case 3:
In appending time, additional variables are occurred in transition data set. We can’t run the append if we want to run we should use force option.
Ex:
Data demo1;
Input pid age gender $;
Cards;
100 34 female
101 34 male
;
Data =demo2;
Input pid age gender $ race $;
Cards;
200 56 male African
201 34 female Asian
;
Proc append base = demo1
Data =demo2 force;
Run;
Case 4:
If variables names are different ,then we will use rename option in appending procedure. It is renamed only for temporary purpose.
Ex:
Data demo1;
…
Data demo4;
Input pid age gender $;
Cards;
200 56 male
201 34 female
;
Proc append base =demo 1
Data =demo4(rename =()subid=pid));
Run;
Case 5:
If we load transition data into master file then transition data follow the master data set or file formats.
Ex:
Data demo5;
Input pid age gender $ vdate :ddmmyy10.
Format vdate :ddmmyy10;
Cards;
100 34 female 12/07/2003
101 56 male 13/08/2003
;
Data =demo6;
Input pid age gender $ vdate :date 9;
Format vdate :date9;
Cards;
200 56 male 12sept2003
201 34 female 14oct2003
;
Proc append base = demo5
Data =demo6force;
Run;
Update statement:
It can be used to replace the master file data values with transition files based on matching variable.
/* To report employee new salaries */
Ex:
Data old emp1;
Input eid salary;
Cards;
100 4000
102 2000
103 3000
;
Data new emp2;
Input eid salary;
Cards;
103 6000
100 7000
103 .
;
Proc sort data = old emp1;
By eid;
Run;
Proc sort data = new emp2;
By eid;
Run;
Data emp3;
Update old emp1 new emp2;
By eid;
Run;
After updations are completed, these updations can be stored in master file or in new file.
Syntax:
Data < master>/<new>;
Update < master><trans>;
By <match var>;
Run;
If we get missing value in transition data set, the corresponding value in master data set will not be changed in updation time.
Update mode option:
Default working in update statement it avoids missing data value from the transition data or updates.
Default value from missing check:
If we want to get missing values in updation we will use update mode = no missing check
Ex:
Data emp3;
Update emp1 emp2
Update mode = no missing check;
By eid;
Run;
Proc print data =emp3;
Run;
Case 1:
Additional variables or non – matching variables are occurred in master file or transition file we can run the update statement.
Case 2:
Additional observation or non matching observations are occurred in master file or transition file we can run the update statement.
Case 3:
Non- matching observation is occurred in transition file. These observation default load in master file in updation time.
Ex:
Data emp1;
Input eid salary if no $;
Cards;
100 2000 P100
101 4000 P200
102 2000 P300
103 3000 P400;
Data emp2;
Input eid salary bonus;
Cards;
100 7000 300
100 9000 400
103 . .
102 5000 700
;
Proc sort data = emp1;
By eid;
Run;
Proc sort data = emp2;
By eid;
Run;
Data emp3;
Update emp1 emp2;
By eid;
Proc print data =emp3;
Run;
Note:
Both data sets have no matching observations, in these cases if we run the update concept internally appending is running.
For Indepth knowledge on SAS, click on below
- Normalization in SAS
- SQL Aggregations & DataSet(Row & colummn operations) in SAS
- SQL Statements in SAS
- Working with Macro Variable in SAS
- SAS Interact With Oracle
- SAS Interview Questions & Answers