Missing Values in SAS
Posted on Mar 28, 2015 in Computer Science
Things under legendu.net/outdated are outdated technologies that the author does not plan to update any more. Please look for better alternatives.
-
SAS uses a dot (`.`) to stand for a numeric missing value and any number of white spaces (e.g., `" "`; the count can be 0, which corresponds to the blank string `""`) for a character missing value. (This also means that you cannot store a string of pure white space in SAS.) However, when you enter values after `datalines` in the data step, you always use dots (not blanks/spaces) to stand for missing values, no matter whether a variable is numeric or character. You can use `where v is null` or `where v is missing` to check whether the variable `v` is null/missing; here `null` and `missing` have the same meaning. However, both `is null` and `is missing` can only be used in the `where` clause (in any procedure) and the `on` clause (in the SQL procedure). In other logical comparisons (e.g., `if`), you have to use `v = .` or `v = " "` according to whether `v` is a numeric variable or a character variable.
In Teradata SQL, `null` means a missing value, and you can use `null` (and only `null`) for both numeric and character variables and in any logical comparison. When SAS displays missing values, a numeric missing value is displayed as a dot and a character missing value is displayed as a blank/space. When Teradata SQL Assistant displays query results, `null` values are indicated by `?`.
It is suggested that you always use `is null` instead of `is missing`, `v = .`, or `v = " "` in `where` clauses in SAS. This makes your SAS SQL code more portable.
-
In the IML procedure (this also seems to be true for the data step), missing values and white space (no matter how many spaces) all have length 1, which is ridiculous. You have to be very careful when you work with strings in `proc iml`.
-
SAS treats the numeric missing value (`.`) as the smallest numeric value, so a comparison such as `v < 0` is true when `v` is missing. When you check whether a numeric value is negative, you have to exclude missing values first.
-
Most functions (e.g., `sum`, `min`, `max`, etc.) in SAS ignore missing values instead of propagating them. This is a little bit crazy, as propagating missing values sounds more reasonable. You had better filter out missing values (if any) before you do calculations.
-
`input("", 8.)` returns `.` (the numeric missing value), while `put(., 3.)` returns `"."` instead of `""` (the character missing value). The inconsistency is annoying.
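The points in the list above can be checked with a short SAS program. The following is only a minimal sketch and is not taken from the original post: the data set names `demo` and `checks` and the variables `id`, `name`, and `score` are made up for illustration, and the values in the comments are what the rules described above would give.

```sas
/* Hypothetical example data. With list input, a dot marks a missing
   value for both the character variable and the numeric variable. */
data demo;
    input id name $ score;
    datalines;
1 Ann 10
2 . -5
3 Bob .
;
run;

/* IS NULL works in a WHERE clause for both variable types. */
proc sql;
    select id from demo where score is null;  /* id 3 */
    select id from demo where name is null;   /* id 2 */
quit;

data checks;
    set demo;

    /* In other comparisons, test against . or " " depending on the type. */
    score_missing = (score = .);
    name_missing  = (name = " ");

    /* Missing compares as smaller than any number, so this naive test
       is also 1 for id 3, whose score is missing. */
    naive_negative = (score < 0);

    /* Exclude missing values before testing the sign. */
    true_negative = (score ne . and score < 0);

    /* SUM() skips missing arguments; the + operator propagates them. */
    total_fn = sum(score, 1);   /* 11, -4, 1 */
    total_op = score + 1;       /* 11, -4, . */
run;

proc print data=checks;
run;

/* Write the INPUT/PUT conversions of missing values to the log. */
data _null_;
    n = input("", 8.);
    c = put(., 3.);
    put n= c=;
run;
```

Comparing `naive_negative` with `true_negative`, and `total_fn` with `total_op`, in the printed output shows why it is safer to filter out missing values explicitly before doing comparisons or calculations.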
Questions
-
Numeric missing values (`.`) affect functions such as `lag` and `dif`.
-
The `missing(.)` function is strange; check it. I think it should be replaced by `x is missing` or `x is null` ...
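As a starting point for looking into these two questions, here is a small sketch. The data set names `series` and `diffs` and the variable `x` are hypothetical. It shows how a single missing value spreads through `lag` and `dif`, and what `missing()` returns when called in a data step.

```sas
/* Hypothetical series with one missing value in the second row. */
data series;
    input x;
    datalines;
1
.
4
7
;
run;

data diffs;
    set series;
    prev  = lag(x);   /* previous value of x; missing on the first row */
    delta = dif(x);   /* x - lag(x); missing if either operand is missing */
    x_is_missing = missing(x);   /* 1 if x is missing, 0 otherwise */
run;

proc print data=diffs;
run;
```

With this input, the one missing `x` makes `delta` missing on both the second and the third row (on top of the first row, where there is no previous value), so only the last row has a usable difference.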