Missing Values in SAS
Posted on Mar 28, 2015 in Computer Science
Things under legendu.net/outdated are outdated technologies that the author does not plan to update any more. Please look for better alternatives.
- SAS uses a dot (`.`) to stand for a numeric missing value and any number of white spaces (e.g., `" "`) for a character missing value. The number can be 0, which corresponds to the blank string `""`. This also means that you cannot store a value made up purely of white space in SAS. However, when you enter values after `datalines` in the data step (with the usual list input), you always use a dot (not a blank/space) to stand for a missing value, no matter whether the variable is numeric or character.

  You can use `where v is null` or `where v is missing` to check whether the variable `v` is null/missing. Here `null` and `missing` have the same meaning. However, both `is null` and `is missing` are only available in `where` expressions (in any procedure) and in the SQL procedure (for example in the `on` clause). In other logical comparisons (e.g., `if`), you have to use `v = .` or `v = " "` according to whether `v` is a numeric variable or a character variable.

  In Teradata SQL, `null` means missing value, and you can use `null` (and only `null`) for both numeric and character variables and in any logical comparison.

  When SAS displays missing values, a numeric missing value is displayed as a dot and a character missing value is displayed as a blank/space. When Teradata SQL Assistant displays query results, `null` values are indicated by `?`.

  It is suggested that you always use `is null` instead of `is missing`, `v = .` or `v = " "` in `where` clauses in SAS. This makes your SAS SQL code more portable.
- In the IML procedure, missing values and white space (no matter how many) all have length 1, which is ridiculous. The `length` function in the data step behaves the same way, while `lengthn` returns 0 for a blank string. You have to be very careful when you work with strings in `proc iml`.
- SAS treats the numeric missing value (`.`) as smaller than any number, so a comparison such as `x < 0` is true when `x` is missing. When you check whether a numeric value is negative, you have to get rid of missing values first.
- Most functions (e.g., `sum`, `min`, `max`, etc.) in SAS ignore missing values instead of propagating them. This is a little bit crazy, as propagating missing values sounds more reasonable. You had better filter out missing values (if any) before you do calculations.
- `input("", 8.)` returns `.` (numeric missing value) while `put(., 3.)` returns `"."` instead of `""` (character missing value). The inconsistency is annoying.
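
The points in the list above can be tried out with a minimal sketch like the one below. The data set names `demo` and `flags`, the variables `x` and `s`, and the derived column names are all made up for illustration, and `lengthn` is included only for contrast with `length`.

```sas
/* Hypothetical data set: with list input, a period marks a missing
   value for both the numeric x and the character s. */
data demo;
    input x s $;
    datalines;
1 a
. b
-2 .
;
run;

/* is null / is missing work in a where expression for either type. */
proc print data=demo;
    where x is null or s is missing;
run;

data flags;
    set demo;
    /* In data step logic, compare against . or " " instead. */
    x_missing = (x = .);
    s_missing = (s = " ");
    /* . is smaller than any number, so exclude it before testing the sign. */
    naive_negative = (x < 0);
    true_negative  = (x ne . and x < 0);
    /* sum() skips missing arguments, whereas + propagates them. */
    total_fn = sum(x, 10);
    total_op = x + 10;
run;

proc print data=flags;
run;

data _null_;
    n    = input("", 8.);   /* numeric missing */
    c    = put(., 3.);      /* a string containing a dot, not a blank */
    len  = length(" ");     /* 1 for a blank string */
    lenn = lengthn(" ");    /* 0 for a blank string */
    put n= c= len= lenn=;
run;
```

For the row where `x` is missing, `naive_negative` comes out as 1 while `true_negative` is 0, and `total_fn` is 10 while `total_op` is missing. That is why filtering out missing values first is the safer habit.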
Questions
- Numeric missing value (`.`) affects functions such as `lag` and `dif`.
- The `missing(.)` function is strange; check it. I think it should be replaced by `x is missing` or `x is null`...
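
A small sketch for exploring these two questions, assuming a made-up data set `series` with a single numeric variable `x`. It only shows the behavior and does not settle whether `missing()` should be replaced.

```sas
/* Hypothetical series with one missing value in the middle. */
data series;
    input x;
    datalines;
1
.
4
;
run;

data check;
    set series;
    prev    = lag(x);      /* value of x from the previous row */
    change  = dif(x);      /* x - lag(x); missing if either side is missing */
    is_miss = missing(x);  /* 1 when x is missing, 0 otherwise */
run;

proc print data=check;
run;
```

With this input, a single missing `x` makes `change` missing on both the row where it occurs and the row after it. Together with the first row, which has no previous value, every `change` here is missing.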