Handle Missing values for Analysis In SAS

Handle missing values for analysis

Missing Option: By default, the frequency procedure excludes missing values from the analysis. If we want to include them, we should use the missing option in the table statement.

Eg:         

proc freq data = medi;
table group / missing;
run;

Out Option: It can be used to store the frequency analysis results in an output dataset.

Eg:         

proc freq data = medi;
table group / missing out = medi1;
run;
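
The medi dataset is not shown in this article, so here is a minimal sketch on made-up data. The names medi_demo and medi_demo_freq and the data values are assumptions for illustration only. It shows how the missing and out options work together.

```sas
/* Hypothetical sample data: a single period is read as a missing group */
data medi_demo;
input pid group $;
cards;
100 A
101 .
102 B
103 A
;
run;

/* missing keeps the blank group as its own level; out = saves the table */
proc freq data = medi_demo;
table group / missing out = medi_demo_freq;
run;

/* The output dataset holds group, COUNT and PERCENT */
proc print data = medi_demo_freq;
run;
```

Without the missing option the blank group is left out of the table and only reported as a count of missing values below it.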

/* To apply required format for reporting */

Eg:         

proc format;
value $GP ' ' = 'Miss';
run;
proc freq data = medi;
table group / missing out = medi1;
format group $GP.;
run;

/* Sample data: doses (dos1 - dos3) taken by each patient */

data HDR;
input pid dos1 $ dos2 $ dos3 $;
cards;
100         Y  N  Y
101         Y  Y  Y
100         Y  N  N
100         Y  Y  N
100         N  Y  Y
100         N  N  Y
100         Y  Y  N
100         Y  Y  Y
;
proc print data = HDR;
run;

/* Patients who have taken dos1 */

proc freq data = HDR;
table dos1;
where dos1 = 'Y';
run;

/* Patients who have taken only dos1 */

proc freq data = HDR;
table dos1;
where dos1 = 'Y' and dos2 = 'N' and dos3 = 'N';
run;

/* Patients who have taken dos1 and dos2 */

proc freq data = HDR;
table dos1 * dos2;
where dos1 = 'Y' and dos2 = 'Y';
run;

/* Patients who have taken only dos1 and dos2 */

proc freq data = HDR;
table dos1 * dos2;
where dos1 = 'Y' and dos2 = 'Y' and dos3 = 'N';
run;

- If we run the frequency procedure without a table statement, it produces by default a one-way frequency analysis for every variable present in the current dataset, both character and numeric. The same result can be requested explicitly with _all_ in the table statement.

Eg:         

proc freq data = hdr;
table _all_;
run;

Tabulate Procedure

Using this procedure, we can generate the required analysis in table format.

Eg:         

data trtment;
input Gid $ drug  $ visit sub;
cards;
G1234                   col5mg                  1              70
G2345                   col5mg                  1              89
G4567                   col5mg                  1              78
G1234                   col5mg                  2              50
G2345                   col5mg                  2              79
G4567                   col6mg                  2              38
G1234                   col6mg                  3              70
G2345                   col6mg                  3              89
G1234                   col7mg                  4              90
G2345                   col7mg                  4              89
;

Class Statement:

It requires the grouping variables (classification variables).

Var Statement:

It requires the analysis variables (numeric).

Table Statement:

Using the table statement, we can build the table. Any variable written in the table statement must also be listed in either the class or the var statement.

Syntax: table row-expression , column-expression;


/* No of times each group received drug doses */

proc tabulate data = trtment;
class gid;
table gid;
run;

/* No of times each group received each drug */

proc tabulate data = trtment;
class gid drug;
table gid, drug;
run;

/* Total no of sub who took each drug in each group */

proc tabulate data = trtment;
class gid drug;
var sub;
table gid * drug, sub;
run;

proc tabulate data = trtment;
class gid drug;
var sub;
table gid, drug * sub;
run;

/* No of times each group took each drug
Total no of sub who took each drug in each group
Average no of sub who took each drug in each group
Max no of sub who took each drug in each group
Min no of sub who took each drug in each group */

proc tabulate data = trtment;
class gid drug;
var sub;
table gid * drug, sub * (N sum mean max min);
run;

/* No of groups , total no of sub, average no of sub, max no of sub, min no of sub taken each drug in each visit */

proc format;
/* one value statement lists all ranges and ends with a single semicolon */
value vs 1 = 'visit1'
         2 = 'visit2'
         3 = 'visit3'
         4 = 'visit4';
run;
proc tabulate data = trtment;
class drug visit;
var sub;
table visit * drug, sub * (N sum mean max min);
format visit vs.;
run;

/* Total no of sub (patients) taken each drug in each visit in each group */

proc tabulate data = trtment;
class gid drug visit;
var sub;
table gid * drug, visit * sub / misstext = "Didn't take";
format visit vs.;
run;

Misstext Option:

It can be used to supply the text shown in table cells that have missing values, for example where a group did not take a drug in a visit.

Note: We cannot run the tabulate procedure without a table statement.

/* To apply the labels for reporting */

proc tabulate data = trtment;
class gid;
var sub;
table gid = 'Group id', sub = 'patients';
run;

/* Total no of sub who took each drug dose in each group & total no of sub in each group */

proc tabulate data = trtment;
class gid drug;
var sub;
keyword sum;
keylabel all = 'Total';
table gid * (drug all) , sub;
run;

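KeyWord Statement:

It names the statistic keywords (such as sum) that appear in the table. In current SAS releases its role is to attach a style element to the headings of those keywords; the statistics themselves are requested in the table statement.

KeyLabel:

Using keylabel, we can change the label of a keyword (such as all or sum) for reporting.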
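
As a minimal sketch of keylabel on its own, the example below relabels two statistic keywords and the all summary row. The dataset trt_demo and its values are made up for illustration and are not part of the article's data.

```sas
/* Hypothetical sample data */
data trt_demo;
input gid $ drug $ sub;
cards;
G100 col5mg 90
G100 col10mg 85
G200 col5mg 90
G200 col10mg 80
;
run;

proc tabulate data = trt_demo;
class gid;
var sub;
/* relabel the sum and mean statistics and the all summary row */
keylabel sum = 'Total' mean = 'Average' all = 'All groups';
table gid all, sub * (sum mean);
run;
```

The labels apply wherever the keyword appears in the table, so a single keylabel statement covers every table statement in the step.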

/* No of sub taken drug doses in each visit in each group and total no of sub taken drug doses in each visit */

proc format;
value vs 1 = 'visit1'
         2 = 'visit2'
         3 = 'visit3'
         4 = 'visit4';
run;
proc tabulate data = trtment;
class gid visit;
var sub;
keyword sum;
keylabel all = 'Total';
table visit * (gid all), sub;
format visit vs.;
run;

Report Procedure:

Using this procedure, we can generate the required analysis and produce reports in the required format. It is a powerful reporting tool: it can do the work of the frequency, means, tabulate and print procedures.

Report Window:

In the windowing environment, the report procedure displays the report in the report window.

/* No of sub who took each dose in each group and total no of sub in each group who received the drug */

proc report data = trtment headline;
columns gid drug sub;
define gid/group;
define drug/group;
define sub/sum;
break after gid/ol ul summarize;
rbreak after/dol dul summarize;
compute after gid;
gid = 'Total';
endcomp;
compute after;
gid = 'Gtotal';
endcomp;
run;

HeadLine Option:

It can be used to draw a line between the column headings and the rows of the report.

Columns Statement:

It requires the list of variables that appear in the report, in the order they should be displayed.

Define Statement:

It can be used to specify how each variable is used in the analysis and the report.

Order, Group, Across Options:

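The main use of these options is to arrange the data in the required order for reporting: order sorts the rows, group also collapses rows with the same value, and across turns the values into columns.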

Break Statement:

It can be used to give summary breaks in the middle of the report, based on a group variable. The break statement works with one of 2 options.

1) After Option: It gives the break after each group.

2) Before Option: It gives the break before each group.

Options:             

ol - overline      dol - double overline
ul - underline     dul - double underline

Summarize Option:

It can be used to write a summary line at the break, containing the summarized analysis values for that group.

RBreak Statement:

It can be used to give a summary break at the end or at the beginning of the report, based on the after or before option.
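
The examples in this article use the after option. As a minimal sketch of the before option, the step below puts the summary line ahead of each group and ahead of the whole report. The dataset trt_demo is made up for illustration, and nowd is added here on the assumption that the report should go to the output destination instead of the report window.

```sas
/* Hypothetical sample data */
data trt_demo;
input gid $ drug $ sub;
cards;
G100 col5mg 90
G100 col10mg 85
G200 col5mg 90
G200 col10mg 80
;
run;

proc report data = trt_demo nowd headline;
columns gid drug sub;
define gid / group;
define drug / group;
define sub / sum;
/* summary line printed ahead of each group's detail rows */
break before gid / summarize ul;
/* grand total printed at the top of the report */
rbreak before / summarize dul;
run;
```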

Compute Block:

Using this block, we can do new calculations for reporting.

  1. To generate new data value for reporting.
  2. To create new variable for reporting.

- Compute block ends with endcomp.

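- A compute block also works with the after and before options.

- A new data value assigned to an existing variable is limited to that variable's length (8 characters by default for a character variable read with list input).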
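
One way around the 8 character limit is to give the variable a longer length when the dataset is created. The sketch below assumes a made-up dataset trt_long; the length of 12 and the label 'Grand total' are illustrative choices, and nowd is an added assumption to bypass the report window.

```sas
/* Hypothetical sample data with a wider gid */
data trt_long;
length gid $ 12;   /* room for labels longer than 8 characters */
input gid $ drug $ sub;
cards;
G100 col5mg 90
G100 col10mg 85
G200 col5mg 90
G200 col10mg 80
;
run;

proc report data = trt_long nowd headline;
columns gid drug sub;
define gid / group;
define drug / group;
define sub / sum;
rbreak after / summarize dol;
compute after;
gid = 'Grand total';   /* 11 characters, fits in $12 */
endcomp;
run;
```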

/* To transpose the data for reporting using across option */

data company;
input cname $ details $ amount;
cards;
satyam      invest       6700
tcs         invest       6800
satyam      invest       3400
tcs         invest       2300
wipro       invest       5600
wipro       invest       3400
;
proc report data = company headline;
columns cname details, amount;
define cname/group;
define details/across;
define amount/sum;
break after cname/dol;
run;

New Variable Creation for Reporting: (numeric variable)

  Eg:         

data trtment;
input Gid $ drug $ visit sub;
cards;
G100      col5mg      1       90
G200      col5mg      1       90
G300      col5mg      1       90
G100      col10mg     2       85
G200      col10mg     2       80
G300      col10mg     2       82
G100      col15mg     3       75
G200      col15mg     3       80
G300      col15mg     3       78
;

_Cn_: It is an automatic variable in the report procedure. 'n' indicates the column number.

/* Total no of patients received drug doses in each group */

proc report data = trtment headline;
columns gid drug, sub totsub;
define gid/group;
define drug/across;
define sub/sum;
define totsub/computed;
compute totsub;
/* _c2_ to _c4_ are the three drug columns created by across */
totsub = _c2_ + _c3_ + _c4_;
endcomp;
compute after;
line ' ';
line ' ';
line 'This data belongs to phase1';
line 28*'_';
endcomp;
run;

Line Statement:

It can be used to print the required text in the report. The line statement can be written only inside a compute block.

New Variable Creation for Reporting: (character variable)

Eg:         

data medi;
input pid Bsbp drug $ Asbp;
cards;
100         178         col5mg        167
101         156         col5mg        159
102         178         col10mg       168
103         177         col10mg       177
104         180         col15mg       182
105         169         col15mg       134
;

Note: An if condition can be written in a compute block of the report procedure to generate values based on a condition.

proc report data = medi headline;
columns pid drug Bsbp Asbp status;
define pid/order;
define status/computed;
break after pid/ol;
compute status/character length = 19;
/* _c3_ is Bsbp (before) and _c4_ is Asbp (after) */
if _c3_ < _c4_ then
status = 'Drug is not working';
else if _c3_ > _c4_ then
status = 'Drug is working';
else status = 'Change the drug';
endcomp;
run;

Note: Functions can also be used in the report procedure for reporting.

Eg:         

proc report data = medi headline;
columns pid drug Bsbp Asbp status;
define pid/order;
define status/computed;
break after pid/ol;
/* length raised to 25 so the drug name plus the message fits */
compute status/character length = 25;
if _c3_ < _c4_ then
status = trim(drug) || ' is not working';
else if _c3_ > _c4_ then
status = trim(drug) || ' is working';
else
status = 'Change drug ' || drug;
endcomp;
run;

/* To assign the formats and labels */

Eg:         

proc format;
value vs 1 = 'visit1'
         2 = 'visit2'
         3 = 'visit3'
         4 = 'visit4';
run;
proc report data = trtment headline;
columns ('_ _' Gid drug visit sub);
define gid/group 'Group';
define drug/order 'study drug';
define visit/order format = vs.;
break after gid/ol;
run;

/* Total no of sub taken each drug in each center & total, avg, max, min no of sub received drug doses in each group in each center */

Eg:         

data clinical;
input center $ Gid $ drug $ sub;
cards;
Appolo          G100          col5mg         500
Appolo          G200          col5mg         560
Appolo          G300          col5mg         600
Nims            G400          col5mg         500
Nims            G500          col5mg         560
Nims            G600          col5mg         600
Care            G700          col5mg         700
Care            G800          col5mg         670
Care            G900          col5mg         680
Appolo          G100          col10mg        450
Appolo          G200          col10mg        500
Appolo          G300          col10mg        580
Nims            G400          col10mg        500
Nims            G500          col10mg        500
Nims            G600          col10mg        560
Care            G700          col10mg        670
Care            G800          col10mg        600
Care            G900          col10mg        650
Appolo          G100          col15mg        400
Appolo          G200          col15mg        450
Appolo          G300          col15mg        580
Nims            G400          col15mg        420
Nims            G500          col15mg        450
Nims            G600          col15mg        560
Care            G700          col15mg        600
Care            G800          col15mg        600
Care            G900          col15mg        680
;

Color Option:

It can be used to apply colors in the report window. This option can be written in the break, rbreak and define statements.

proc report data = clinical headline box;
columns ('_ _' center gid drug, sub
Totsub Avgsub Maxsub Minsub);
define center/group color = green;
define gid/group color = red;
define drug/across '  ' color = blue;
define sub/sum '  ' color = brown;
define Totsub/computed color = orange;
define Avgsub/computed color = yellow;
define Maxsub/computed color = green;
define Minsub/computed color = pink;
break after center/ol ul summarize color = orange;
rbreak after/dol dul summarize color = brown;
/* _c3_ to _c5_ are the three drug columns created by across */
compute Totsub;
Totsub = sum(_c3_, _c4_, _c5_);
endcomp;
compute Avgsub;
Avgsub = round(mean(_c3_, _c4_, _c5_));
endcomp;
compute Maxsub;
Maxsub = max(_c3_, _c4_, _c5_);
endcomp;
compute Minsub;
Minsub = min(_c3_, _c4_, _c5_);
endcomp;
compute after;
center = 'Gtotal';
line '   ';
line '   ';
line 'This data belongs to phase - II';
line 30*'__';
endcomp;
compute after center;
center = 'Total';
endcomp;
compute before center;
line '   ';
line '   ';
line 'This analysis belongs to ' center $9.;
line 30*'__';
endcomp;
compute before _page_;
line '   ';
line "Sponsor is Reddy's lab";
line '   ';
endcomp;
run;

_Page_ Option: Using this option together with before or after in a compute block, we can print text at the top or bottom of each page of the report.

 
