List input is used when data values are separated by at least one blank space and no character value contains an embedded blank.
By default, list input gives a character variable a length of 8 bytes, so longer values are truncated to 8 characters. A numeric variable is stored in 8 bytes and is printed with up to 12 digits (the default BEST12. format).
In a single-byte encoding 1 character takes 1 byte. Every numeric value takes 8 bytes, however many digits it has.
Ex:
data demo;
infile 'd:\one.txt';
input pid name $ age color $;
run;
proc print data=demo;
run;
Contents of one.txt:
100 kumar 78 white
101 kiran 89 black
102 lava 78 white
INFILE options:
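The INFILE statement identifies the raw data source, and its options control how each line is read so that values reach the right variables. All of the INFILE options described here work with the list input method.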
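Dlm:
The dlm= option indicates the delimiters that separate data values in the raw data.
Ex:
data demo5;
infile cards dlm=', &$';
input pid name $ age color $;
cards;
100, kumar,78,white
101 ,kiran ,89 $black
102 ,lava, 78 & white
;
run;
proc print data=demo5;
run;
Note: The dlm= list replaces the default delimiter, so if we use the dlm= option we should indicate all the delimiters, including the default delimiter (space). Once the space is a delimiter, a value such as kumar rao would be split into two values, so the values here must not contain embedded blanks.
Dsd: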
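When data values in the raw data are separated by commas, we use dsd instead of dlm. DSD (delimiter-sensitive data) makes the comma the default delimiter, treats two consecutive delimiters as a missing value, removes quotation marks from quoted character values, and keeps a delimiter that appears inside a quoted value as part of the value.
This is useful when character values are written with quotation marks in the raw data (for example, for manual checking) and must be stored in the SAS data set without the quotation marks.
Ex:
data demo6;
infile cards dsd;
input pid name $ age color $;
cards;
100,kumar rao,78,white
101,kiran,89,"black"
102,lava,78,white
;
run;
proc print data=demo6;
run;
Here the blank inside kumar rao is no longer a delimiter, but with the default length of 8 the value is stored as kumar ra.
To use other delimiters with the same behavior, combine dsd with dlm=.
Ex:
data demo6;
infile cards dsd dlm=',$';
input pid name $ age color $;
cards;
100$kumar rao$78,white
101,kiran,89$"black"
102$lava,78,white
;
run;
proc print data=demo6;
run;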
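The examples above do not show how DSD handles a value that is absent between two commas. The following is a minimal sketch of that case; the data set name dsd_missing and the data lines are made up for illustration.

```sas
/* Sketch: with DSD, two consecutive commas mean a missing value. */
/* dsd_missing and its data lines are hypothetical.               */
data dsd_missing;
  infile datalines dsd;
  input pid name $ age color $;
  datalines;
100,kumar,,white
101,,89,black
;
run;

proc print data=dsd_missing;
run;
```

In the first line age should come out as a numeric missing value (.), and in the second line name should come out blank, while the other values stay in their own variables.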
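Flowover:
By default SAS reads one line of raw data for one observation. If a value is missing in the raw data, SAS reads the next available data value into that variable instead, and moves to the next line if the current line runs out of values. This default behavior is called flowover. In some cases the read succeeds (although the value lands in the wrong variable) and in some cases it fails.
Missing value | Next value | Result |
Character | Numeric | Success |
Numeric | Character | Failure (invalid data) |
Character | Character | Success |
Numeric | Numeric | Success |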
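The default behavior described above can be seen in a small sketch. The data set name flow_demo and the data lines are assumptions made for illustration; flowover is written out only for clarity, since it is the default.

```sas
/* Sketch: the first line has no value for race, so SAS goes to the   */
/* next line to find one. flow_demo and its data are hypothetical.    */
data flow_demo;
  infile datalines flowover;
  input pid age color $ race $;
  datalines;
100 23 white
101 63 white Asian
;
run;

proc print data=flow_demo;
run;
```

The log should contain a note that SAS went to a new line when the INPUT statement reached past the end of a line. The data set ends up with a single observation in which race holds 101, the first value of the second line, and the rest of that line is lost.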
In list input, every missing value in the raw data, character or numeric, is marked with a dot (.) as a placeholder. SAS then stores a missing character value as a blank and a missing numeric value as a dot.
Ex:
data sasuser.demo;
infile cards;
input pid age color $ race $;
cards;
100 23 white Asian
101 . white Asian
102 78 . African
;
run;
proc print data=sasuser.demo;
run;
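Stopover:
With the stopover option, SAS stops reading and reports an error as soon as a line of raw data does not contain a value for every variable in the INPUT statement. Errors are of 2 types:
1. Compile-time (syntax) errors, which we can identify at compilation.
2. Execution errors (or data errors), which can be identified only at execution.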
Miss over:
When values are missing at the end of a line of raw data, there is no need to write a dot (.) for them. We use the missover option instead: SAS sets the remaining variables to missing and does not go to the next line. Missover replaces the default flowover behavior.
Ex:
data demo;
infile cards missover;
input pid age color $ race $;
cards;
100 23 white
101 63 white Asian
102 78
103 23 black African
104 45 black
105 56 white Asian
106
;
run;
proc print data=demo;
run;
Syntax for comments: /* comment */ or * comment ;
Ex:
/* use of multiple infile options */
data demo3;
infile cards dsd missover dlm=',$';
input pid age color $ race $;
cards;
100,23$white
101,67$white,"Asian"
102,78
103$23,black,"African"
;
run;
proc print data=demo3;
run;
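Truncover:
It works like missover. In addition, when a line is shorter than the width the INPUT statement asks for, truncover assigns the characters that are available to the variable instead of setting it to missing, so it can read values of varying length.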
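A minimal sketch of how this differs from missover, assuming the option described here is TRUNCOVER. The fileref shortrec, the data set names, and the values are made up for illustration. An external file is used because lines after cards are padded with blanks, which hides the difference.

```sas
/* Sketch: write three records of different lengths to a temporary file. */
/* shortrec, with_missover and with_truncover are hypothetical names.    */
filename shortrec temp;

data _null_;
  file shortrec;
  put '1';
  put '22';
  put '55555';
run;

/* MISSOVER: a record shorter than the 5 columns asked for gives a missing value. */
data with_missover;
  infile shortrec missover;
  input code $5.;
run;

/* TRUNCOVER: the characters that are available are kept. */
data with_truncover;
  infile shortrec truncover;
  input code $5.;
run;

proc print data=with_missover;
run;
proc print data=with_truncover;
run;
```

With missover only the third record should produce a non-missing code, while truncover should keep 1, 22 and 55555.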
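Firstobs and obs:
We can read part of the data in sequential order. firstobs= gives the line at which reading starts and obs= gives the line at which reading stops; both are based on lines of raw data. The example below reads lines 2 to 4 (pid 101, 102 and 103).
Ex:
data demo;
infile cards firstobs=2 obs=4;
input pid age color $ race $;
cards;
100 23 white Asian
101 63 white Asian
102 78 white Asian
103 23 black African
104 45 black African
105 56 white Asian
;
run;
proc print data=demo;
run;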
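Scanover:
It is used to read required data values based on a key data value. With @'string' in the INPUT statement, SAS searches the line for the string and starts reading right after it, and scanover makes SAS skip the lines that do not contain the string. The key is always a character string, not numeric data, and the search is case-sensitive.
Ex:
data clinical;
infile cards scanover;
input @'Appolo' trail $ year sub;
cards;
Appolo phase1 1996 28
Nims phase1 1997 30
Appolo phase2 1997 290
Nims phase2 1998 250
;
run;
proc print data=clinical;
run;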
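A tab delimiter is indicated by the hexadecimal constant '09'x.
Ex:
data clinical;
infile 'd:\tab file.txt' dlm='09'x;
input center $ trail $ year sub;
run;
proc print data=clinical;
run;
Note: If we run a DATA step without a data set name, SAS assigns a default name: data1, data2, ..., datan.
The words of a SAS statement are called tokens, and blank spaces separate one token from the next.
Note: The blank between a special character such as $ and the next word is optional, so the following two INPUT statements read the data in the same way.
input name $ age;
cards;
Uma 45
Kiran 78
;
input name $age;
cards;
Uma 45
Kiran 78
;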
& modifier list input method
: modifier list input method
~ (tilde) modifier list input method
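& modifier:
It indicates that a character value can contain single embedded blanks. SAS then treats 2 or more consecutive blank spaces as the delimiter between that value and the next required data value.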
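: modifier:
It lets us specify an informat in list input, for example to increase the storage length of a character variable beyond the default 8, without changing the order of the data. In the example below name is read with the & modifier and a length of 13, and color with the colon modifier and a length of 10. Each name must be followed by at least two blanks.
Ex:
data demo6;
input pid name & $13. age color : $10.;
cards;
100 kiran kumar  89 white
101 pavan  90 white
102 lava kumar  89 white
;
run;
proc print data=demo6;
run;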
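~ (tilde) modifier:
The tilde modifier is used together with the dsd option to switch off the removal of quotation marks for the required variables: the value is stored with its quotation marks, and a delimiter inside the quotation marks is still treated as part of the value.
Ex:
data medi;
infile cards dsd;
input pid drug $ adevents ~ $20.;
cards;
100,"col 5 mg","fever,cold"
101,"col 5 mg","cold, headache"
;
run;
proc print data=medi;
run;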
Non-standard data. Ex: dates, times, and amounts (numeric data).
Informat:
The INFORMAT statement is used in the DATA step. Using an informat, we can read non-standard data into standard data, that is, into a number. For a date, this number is called the SAS date value.
Ex:
data medi;
input pid jdate;
informat jdate ddmmyy10.;
cards;
200 01/01/1960
201 01/01/1961
202 12/12/1970
203 23/03/1989
;
run;
proc print data=medi;
run;
SAS date value: the number of days between the SAS reference date, 1 January 1960, and the date being read.
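The arithmetic behind the SAS date value can be checked with a short DATA step. This is a minimal sketch; the variable names d0, d1 and d2 are made up for illustration.

```sas
/* Sketch: SAS date values count days from 01JAN1960. */
/* d0, d1 and d2 are hypothetical variable names.     */
data _null_;
  d0 = '01JAN1960'd;                     /* the reference date itself   */
  d1 = input('01/01/1961', ddmmyy10.);   /* one year later, via informat */
  d2 = '31DEC1959'd;                     /* the day before the reference */
  put d0= d1= d2=;
run;
```

The log should show d0=0 d1=366 d2=-1. The value 366 arises because 1960 is a leap year, and dates before 1 January 1960 are stored as negative numbers.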
Format:
A format is used to convert standard data back into non-standard data for reporting. The FORMAT statement can be written in the procedure block.
01/01/1960 -> informat -> standard (number) -> 0
0 -> format -> non-standard -> 01/01/1960 (reporting)
Ex:
data medi;
input pid jdate;
informat jdate ddmmyy10.;
cards;
200 01/01/1960
201 01/01/1961
202 12/12/1970
203 23/03/1989
;
run;
proc print data=medi;
format jdate ddmmyy10.;
run;
Syntax:
informat <variable name> <informat name>;
format <variable name> <format name>;
Date value | Format |
23/03/2003 | ddmmyy10. or ddmmyyS10. |
23-03-2003 | ddmmyyD10. |
23:03:2003 | ddmmyyC10. |
23.03.2003 | ddmmyyP10. |
23 03 2003 | ddmmyyB10. |
23032003 | ddmmyyN8. |
Ex:
data medi;
input pid jdate ldate;
informat jdate ldate ddmmyy10.;
cards;
200 01/01/1960 12.01.1961
201 01/01/1961 13:04:1961
202 12/12/1970 14:02:1971
203 23/03/1989 25111990
;
run;
proc print data=medi;
format jdate ddmmyyD10. ldate ddmmyyC10.;
run;
A format applies to a whole variable: we can report the required variables in the required format, but we cannot report individual data values of one variable in different formats.
Date value | Format |
23/03/03 | ddmmyy8. or ddmmyyS8. |
23-03-03 | ddmmyyD8. |
23:03:03 | ddmmyyC8. |
23.03.03 | ddmmyyP8. |
23 03 03 | ddmmyyB8. |
230303 | ddmmyyN6. |
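A date in month/day/year order, such as 03/23/2003, is read with mmddyy10. Names such as ddyymm10. and yyddmm10. are wrong applications: SAS has no informat for day/year/month, year/day/month or month/year/day order. The available orders are ddmmyy, mmddyy and yymmdd.
Ex:
data medi;
input pid drug $ sdate edate;
informat sdate mmddyy10. edate mmddyy8.;
cards;
200 5mg 03/23/2003 04/25/03
201 10mg 02/25/2003 05/26/03
;
run;
proc print data=medi;
format sdate ddmmyyD10. edate mmddyyC10.;
run;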
Date value | Informat | Format |
23oct2003 | date9. | date9. |
23oct03 | date7. | date7. |
dec2003 | monyy7. | monyy7. |
dec03 | monyy5. | monyy5. |
Ex:
data medi;
input pid drug $ sdate edate adsdate;
informat sdate date9. edate date7. adsdate monyy7.;
cards;
100 5mg 12oct2003 13dec03 jan2004
101 10mg 13jun2003 26dec03 feb2004
;
run;
proc print data=medi;
format sdate date9. edate date7. adsdate monyy7.;
run;
Reports can be made available as paper files or electronic files.
Another date value: Julian date value (or code-level date value)
It is also called the code-level date. In this code-level date the first 4 digits indicate the year and the next 3 digits indicate the day number within that year (1 to 365 in a normal year, 1 to 366 in a leap year), so the value is 7 digits wide and is read with julian7. For example, 2003032 is day 32 of 2003, that is, 01FEB2003.
Ex:
data adevent;
input pid ad $ sdate;
informat sdate julian7.;
cards;
100 headache 2003032
101 cold 2003145
102 skinprb 2003178
;
run;
proc print data=adevent;
format sdate date9.;
run;
Only formats (for reporting purposes): worddatew. and weekdatew. are available as formats only. The default width is 18 for worddate and 29 for weekdate; a smaller width abbreviates the day and month names.
Ex:
data adevent;
input pid ad $ sdate;
informat sdate julian7.;
cards;
100 headache 2003032
101 cold 2003145
102 skinprb 2003178
;
run;
proc print data=adevent;
format sdate weekdate24.;
run;
Amounts: represented by these informats and formats
Amount value | Informat | Format |
2,225,000 | comma9. | comma9. |
$2,225,000 | dollar10. | dollar10. |
Ex:
data emp;
input eid salary pf;
informat salary comma6. pf dollar10.;
cards;
100 23,000 $1,345,000
101 34,000 $1,234,678
;
run;
proc print data=emp;
format salary comma6. pf dollar10.;
run;
wordsw.: Using the wordsw. format we can report amounts in words.
Note: Based on client requirements, we can create our own informats and formats using the FORMAT procedure.
Times:
Time value | Informat | Format |
12-hour: 10:12:23AM, 02:23:52PM | time10. | timeampm12. |
24-hour: 22:10:20, 10:12:20 | time8. | time8. |
12oct2003:02:23:30pm | datetime20. | dateampm22. |
12oct2003:15:23:30 | datetime18. | datetime18. |
Because time values are stored as numbers, a duration can be calculated as etime - stime.
Ex:
data trt;
input pid stime etime;
informat stime time10. etime time8.;
cards;
200 10:23:34am 15:34:19
201 02:23:34pm 18:23:34
;
run;
proc print data=trt;
format stime timeampm12. etime time8.;
run;
Note: When the raw data contains only time values, SAS stores each one as the number of seconds since midnight, with no date attached. If such a value is displayed with a datetime format, it appears on the SAS reference date, 1 January 1960.
Ex:
data trt;
input pid stime etime;
informat stime datetime20. etime datetime18.;
cards;
200 12oct2003:10:23:34am 15dec2003:15:34:19
201 13dec2003:02:23:34pm 30dec2003:18:23:34
;
run;
proc print data=trt;
format stime dateampm22. etime datetime18.;
run;
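The numbers stored for time and datetime values can be checked in the same way as date values. This is a minimal sketch; the variable names t, dt and d are made up for illustration.

```sas
/* Sketch: time values count seconds from midnight,             */
/* datetime values count seconds from 01JAN1960:00:00:00.       */
/* t, dt and d are hypothetical variable names.                 */
data _null_;
  t  = '10:23:34't;                          /* 10*3600 + 23*60 + 34      */
  dt = '01JAN1960:00:01:00'dt;               /* one minute after reference */
  d  = datepart('12OCT2003:10:23:34'dt);     /* date part of a datetime    */
  put t= dt= d= date9.;
run;
```

The log should show t=37414 dt=60 d=12OCT2003. DATEPART returns the SAS date value contained in a datetime value, which is why d can be printed with date9.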
Ex:
data bank;
informat loantype $10. irate percent3.;
input loantype $ irate;
cards;
Personal 5%
House 20%
Education 2%
;
run;
proc print data=bank;
format irate percent5.;
run;
Programs written with older techniques run in newer versions of SAS, but programs that use newer techniques do not run in older versions.
Note: If the informat is written in the INPUT statement, then we should use the colon modifier.
Ex:
data bank;
input loantype : $10. irate : percent3.;
cards;
Personal 5%
House 20%
Education 2%
;
run;
Formats can be assigned in 2 ways:
1. Temporary
2. Permanent
If we write formats in the procedure block, these formats are called temporary. If we write formats in the DATA step, these formats are called permanent. Permanent formats can still be changed for reporting by writing a FORMAT statement in the procedure block.
Ex:
data sasuser.medicine;
input pid sdate : date9. edate : ddmmyy10.;
format sdate date9. edate ddmmyy10.;
cards;
100 12jan2003 13-10-2003
101 14nov2003 15-12-2003
;
run;
proc print data=sasuser.medicine;
run;
A data set can also be named by a quoted physical path instead of a library reference.
Ex:
data 'd:\demo';
input pid age;
cards;
100 89
234 34
;
run;
proc print data='d:\demo';
run;
A SAS application can be stored or saved inside or outside of the SAS environment.
If we save a SAS application outside the SAS environment (on the PC), it takes the extension .sas by default and can be reopened with Open Program.
If we want to save a SAS application inside the SAS environment, we first create a catalog in the required library. When we save the application there, the entry takes the type source by default.