String function in SAS

String function:

   a.        Length:

Using length function, we can find out length of the string(no. of characters include blank spaces). Length function returns numeric value.

 

   b.       Index function:

Returns position of character in string. It works based on character and word wise.

Note:

       Specific character not available in string it returns zero. Length and index function returns numeric value.

 

   c.     Scan function:

          Using scan function we can get required word from the string.

 

    d.     Sub string:

Using this function, we can get part of the string from string. If requires 3 arguments.

  1. Variable name
  2. Starting position
  3. Number of characters

 

e.   concatenation(combine):

it is used for combining the strings. Symbol is||

ex:

data string;

input pid surname $

name & : $15 age ;

cards;

100  chittu Chandra  Sekhar  56

101  Kolla ram mohan rao  45

102  konda pavan kumar  34

;

Data string1;

Set string;

Sl = length (name);

Si = index( name, ’a’);

Sib= index (name, ‘ ‘);

Siw = index (name, ’ra’);

Sc = Scan(name,2);

Sub = Substr(name,5,10);

Fname=Surname|| name;

Run;

Proc print data =string1;

Run;

Interested in mastering SAS Certification? Enroll now for FREE demo on SAS Training Program.

     f.    Compress function:

    It can be used to remove specific characters from the string. It requires 2 arguments. It is working based on the characters wise.

Syntax:

Compress =(variable name, ’character’);

Note:

   If we omit the 2nd argument in compress function, it removes the blank spaces from the string.

 

Data:

G100     col      0.005   headache and backpain

G200      Aspirin  0.005   Null and skin problem

G300     col      0.1   headache and Null

G400      Aspirin  0.1 headache and eye problem

Ex:

Data one;

Infile ‘d:\medi.txt’;

Input group $ drug $ dos

Adevent  :$12 .;

Run;

Data medi;

Set one;

If dos =0.005 then dos =5;

Else dos =10;

Drug1 = drug ||’-’|| dos || ‘mg’;

Drug1= compress (drug1);

Exad = scan(adevent ,1);

Unexad = scan(adevent ,2);

Drop drug dos adevent;

Rename drug1 =drug;

Run;

Proc print data =medi;

Run;

 

       g.       Translate function:

It can be used to replace the required character in string.

Ex:

Data medi;

Input pid drug$;

Cards;

100 col5mg

101 col10mg

102 col15mg

;

Data medi;

Set medi;

Drug = translate(drug.’C’.’c’);

Run;

Proc print data =medi;

Run;

Ex:

Data demo;

Input stno name & :$21 age color $;

Cards;

100  chittu Chandra  Sekhar  23 white

101  Kolla ram mohan rao  34 black

Data demo1;

Set demo;

Surname =scan (name,1);

Bp = index (name, ‘ ’);

Name= Substr(name, bp+1);

Drop bp;

Run;

Proc print data =demo1;

Run;

 

Date and time function:

Ex:

Data demo;

Input stno scvdate :date9

Set time :time 8.;

Format scvdate date 9.

Scvtime time 8;

Cards;

100 12jan2003  12:23:34

101  13feb2003  13:23:34

102 14feb2003  11:23:34

103  15mar2003  10:23:34

;

Data demo1;

Set demo;

Scv day = day(scvdate);

Scv month= month(scvdate);

Scv year= year(scvdate);

Sc hour =hour(scvtime);

Sc min =minute(scvtime);

Sc sec =second (scvtime);

Run;

Proc print data =demo1;

Run;

/*To report Feb month */

Proc print data =demo1;

Where scv month =2;

Run;

 

  1. Date part function:

It can be used to get date value from the date and time variables.

Syntax:

    Datepart(argument)

 

         2. Time part function:

    It can be used to get time value from the date and time variables.

Syntax:

Time part(argument)

Ex:

Data demo;

Input stno scvdtime :datetime18;

Format scvdtime datetime 18.;

Cards;

100 12jan2003  12:23:34

101  13feb2003  13:23:34

;

Data demo1;

Set demo;

Scdate = datepart(scvdtime);

Sctime = timepart(scvdtime);

Format scvdate date 9. sctime time 8.;

Drop scvdtime;

Run;

Proc print data =demo1;

Run;

 

   3.  Intck function:

It can be used to report difference between the date values in day intervals , month intervals or year intervals.

Ex:

Data medi;

Input stno sdate edate;

Informat sdate edate date 9;

Format sdate edate date 9;

Cards;

100 12jan2003  14dec2003

101  15jan2003 14oct2004

;

Data medi;

Set medi;

Days = inteck(‘day’, sdate edate);

Run;

Proc print data =medi1;

Run;

 

Data management:

It is one type of data reading concept using this concept, we can access data from different SAS files and arrange the data into specific order.

It can be 2 types

  1. Adding
  2. Combine

Adding:

This concept can be done in 2 ways

  1. Appending
  2. Concatenation

Appending:

To add one or more data set data into a existed data set. This concept is called appending

 

   Append procedure:

Using this procedure, we can do append procedure and concatenation.

Syntax:

Proc append base =<master> data = <trans>;

Ex:

Data demo 1;

Input pid age gender $;

Cards;

100  34  female

101  56  male

;

Data demo2;

Input pid age gender $;

Cards;

200  56 male

201  34  female

;

Data demo3;

Input pid age gender $;

Cards;

300  56 male

301  34  female

Proc append base =demo 1

Data =demo2;

Run;

Proc append base =demo 1

Data =demo3;

Run;

 

Concatenation:

To add one or more data sets observation into another new data set this concept is called concatenation.

Using append procedure we can create new data sets for concatenation.

Ex:

Proc append base = demo

Data =demo1;

Run;

Proc append base = demo

Data =demo2;

Run;

Proc append base = demo

Data =demo3;

Run;

Case 1:

If we run the append procedure multiple, observations are added in based data set. To overcome this problem use with ‘constraints’.

Case 2:

In appending time, additional variables are occurred in master table we can run the appending. Its default takes missing values for transition data or observation for additional variable.

Case 3:

    In appending time, additional variables are occurred in transition data set. We can’t run the append if we want to run we should use force option.

Ex:

Data demo1;

Input pid age gender $;

Cards;

100  34 female

101  34  male

;

Data =demo2;

Input pid age gender $  race $;

Cards;

200  56 male  African

201  34  female Asian

;

Proc append base = demo1

Data =demo2 force;

Run;

Case 4:

If variables names are different ,then we will use rename option in appending procedure. It is renamed only for temporary purpose.

Ex:

Data demo1;

Data demo4;

Input pid age gender $;

Cards;

200  56 male

201  34  female

;

Proc append base =demo 1

Data =demo4(rename =()subid=pid));

Run;

Case 5:

If we load transition data into master file then transition data follow the master data set or file formats.

Ex:

Data demo5;

Input pid age gender $ vdate :ddmmyy10.

Format vdate :ddmmyy10;

Cards;

100  34 female   12/07/2003

101  56  male     13/08/2003

;

Data =demo6;

Input pid age gender $  vdate :date 9;

Format vdate :date9;

Cards;

200  56 male  12sept2003

201  34  female 14oct2003

;

Proc append base = demo5

Data =demo6force;

Run;

 

Update statement:

It can be used to replace the master file data values with transition files based on matching variable.

/* To report employee new salaries */

Ex:

Data old emp1;

Input eid salary;

Cards;

100  4000

102  2000

103  3000

;

Data new emp2;

Input eid salary;

Cards;

103  6000

100  7000

103  .

;

Proc sort data = old emp1;

By eid;

Run;

Proc sort data = new emp2;

By eid;

Run;

Data emp3;

Update old emp1 new emp2;

By eid;

Run;

After updations are completed, these updations can be stored in master file or in new file.

Syntax:

Data < master>/<new>;

Update < master><trans>;

By <match var>;

Run;

If we get missing value in transition data set, the corresponding value in master data set will not be changed in updation time.

 

Update mode option:

Default working in update statement it avoids missing data value from the transition data or updates.

 

Default value from missing check:

If we want to get missing values in updation we will use update mode = no missing check

Ex:

Data emp3;

Update emp1 emp2

Update mode = no missing check;

By eid;

Run;

Proc print data =emp3;

Run;

Case 1:

     Additional variables or non – matching  variables are occurred in master file or transition file we can run the update statement.

Case 2:

   Additional observation or non matching observations are occurred in master file or transition file we can run the update statement.

Case 3:

Non- matching observation is occurred in transition file. These observation default load in master file in updation time.

Ex:

Data emp1;

Input eid salary  if no $;

Cards;

100  2000    P100

101  4000     P200

102  2000      P300

103  3000      P400;

Data emp2;

Input eid salary  bonus;

Cards;

100  7000   300

100  9000   400

103  .            .

102   5000   700

;

Proc sort data = emp1;

By eid;

Run;

Proc sort data = emp2;

By eid;

Run;

Data emp3;

Update emp1 emp2;

By eid;

Proc print data =emp3;

Run;

Note:

Both data sets have no matching observations, in these cases if we run the update concept internally appending is running.

For Indepth knowledge on SAS, click on below