Postgres 15 improves UNIQUE and NULL
Postgres 15 beta 2 was released
recently! I enjoy Beta season... reviewing and testing new features
is a fun diversion from daily tasks. This post takes a look at an improvement
to UNIQUE
constraints on columns with NULL
values. While the nuances of unique constraints are not as flashy
as making sorts faster (that's exciting!),
improving the database developer's control over data quality is always a good benefit.
This email chain has the history behind this change. The Postgres 15 release notes summarize this improvement:
"Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)
Previously
NULL
values were always indexed as distinct values, but this can now be changed by creating constraints and indexes usingUNIQUE NULLS NOT DISTINCT
."
Two styles of UNIQUE
To take a look at what this change does, we create two tables.
The null_old_style
table has a 2-column UNIQUE
constraint
on (val1, val2)
. The val2
allows NULL
values.
CREATE TABLE null_old_style
(
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
val1 TEXT NOT NULL,
val2 TEXT NULL,
CONSTRAINT uq_val1_val2
UNIQUE (val1, val2)
);
The null_new_style
table uses the new option: UNIQUE NULLS NOT DISTINCT
.
The only difference from the previous table is the addition of the
new syntax for the unique constraint.
CREATE TABLE null_new_style
(
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
val1 TEXT NOT NULL,
val2 TEXT NULL,
CONSTRAINT uq_val1_val2_new
UNIQUE NULLS NOT DISTINCT (val1, val2)
);
Change in allowed data
In Postgres 14 and prior, unique constraints treated NULL
values as
not equal to other NULL
values. From the
Postgres 14 docs:
When an index is declared unique, multiple table rows with equal indexed values are not allowed. Null values are not considered equal.
This is consistent with the SQL Standard handling of NULL
in general,
where NULL
is unknown. It is impossible to determine if one unknown is
equal to another unknown. Because NULL
values are of unknown equality
to one another, they do not violate UNIQUE
constraints.
This is illustrated by adding 5 identical rows to the null_old_style
table.
INSERT INTO null_old_style (val1, val2)
SELECT 'Hello', NULL
FROM generate_series(1, 5)
;
SELECT * FROM null_old_style;
id|val1 |val2|
--+-----+----+
1|Hello| |
2|Hello| |
3|Hello| |
4|Hello| |
5|Hello| |
This behavior is documented and expected, and is even part of the ANSI SQL Standard. That said, I never really liked the above behavior because it is not restrictive enough.
With the new NULLS NOT DISTINCT
option, the unique constraint
is more restrictive by not allowing repeated NULL
values.
Add one row to start.
INSERT INTO null_new_style (val1, val2)
SELECT 'Hello', NULL;
SELECT * FROM null_new_style;
id|val1 |val2|
--+-----+----+
1|Hello| |
Attempting to add a second row with 'Hello'
in val1
and NULL
in val2
now results in a violation of the unique constraint.
INSERT INTO null_new_style (val1, val2)
SELECT 'Hello', NULL;
SQL Error [23505]: ERROR: duplicate key value violates unique constraint "uq_val1_val2_new"
Detail: Key (val1, val2)=(Hello, null) already exists.
Of course, a changing the value in val1
to a new value will allow
another NULL
to be added in val2
.
INSERT INTO null_new_style (val1, val2)
SELECT 'World', NULL;
id|val1 |val2|
--+-----+----+
1|Hello| |
3|World| |
This fits with my mental model of how I think unique constraints should work with null values.
Summary
I am happy to see the addition of UNIQUE NULLS NOT DISTINCT
coming
in Postgres 15. This adds an additional level of quality control to
Postgres. It's also a low-impact change as the default operation
will continue as it has been.
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!