In SQL we use
NULL as a placeholder for missing values. It has at
least three different uses:
NULL. Here a
NULLmeans unknown, unknown because information that is needed for evaluating the expression is missing.
If a value in a column is
NULL, all you know is that it is missing. If
you need to know why, you can store that information in a separate
column. Another approach is to disallow NULLs and use an encoding
that has distinct values for missing data. An example is the standard
encoding for gender: ISO 5218.
The usage of
NULL is one of the big controversies in SQL and
relational systems. Some feels that it introduces unnecessary
complexity, and the general advice seems to be to use
NULLs only when
there is no better option.
NULLbehaves differently from similar non-value markers in most programming languages.
NULLwith the empty string, zero, or false.
NULLs introduces 3-valued logic. That is, in addition to
FALSEyou also have
UNKNOWN, a source of many misunderstandings and errors in SQL statements.
NULLin a special manner. To remember how
NULLis handled by different functions is painful. Some examples:
GROUP BYputs all rows with
NULLs in the same group, thous treating them equal when in fact the equality is unknown. The
SUMaggregate function ignores
NULLs when summarizing the values of a column.
NULLin a table, typically introduces a penalty on performance.
NULLwhen you need to mark a value as missing and there is no easy way handle missing values in any reasonable encoding for the column. Just be aware that you don’t fall in any of the traps mentioned above.