RustProof Labs: blogging for education (logo)

Postgres 15 improves UNIQUE and NULL

By Ryan Lambert -- Published July 11, 2022

Postgres 15 beta 2 was released recently! I enjoy Beta season... reviewing and testing new features is a fun diversion from daily tasks. This post takes a look at an improvement to UNIQUE constraints on columns with NULL values. While the nuances of unique constraints are not as flashy as making sorts faster (that's exciting!), improving the database developer's control over data quality is always a good benefit.

This email chain has the history behind this change. The Postgres 15 release notes summarize this improvement:

"Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)

Previously NULL values were always indexed as distinct values, but this can now be changed by creating constraints and indexes using UNIQUE NULLS NOT DISTINCT."

Two styles of UNIQUE

To take a look at what this change does, we create two tables. The null_old_style table has a 2-column UNIQUE constraint on (val1, val2). The val2 allows NULL values.

CREATE TABLE null_old_style
(
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    val1 TEXT NOT NULL,
    val2 TEXT NULL,
    CONSTRAINT uq_val1_val2
        UNIQUE (val1, val2)
);

The null_new_style table uses the new option: UNIQUE NULLS NOT DISTINCT. The only difference from the previous table is the addition of the new syntax for the unique constraint.

CREATE TABLE null_new_style
(
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    val1 TEXT NOT NULL,
    val2 TEXT NULL,
    CONSTRAINT uq_val1_val2_new
        UNIQUE NULLS NOT DISTINCT (val1, val2)
);

Change in allowed data

In Postgres 14 and prior, unique constraints treated NULL values as not equal to other NULL values. From the Postgres 14 docs:

When an index is declared unique, multiple table rows with equal indexed values are not allowed. Null values are not considered equal.

This is consistent with the SQL Standard handling of NULL in general, where NULL is unknown. It is impossible to determine if one unknown is equal to another unknown. Because NULL values are of unknown equality to one another, they do not violate UNIQUE constraints. This is illustrated by adding 5 identical rows to the null_old_style table.

INSERT INTO null_old_style (val1, val2)
SELECT 'Hello', NULL
    FROM generate_series(1, 5)
;

SELECT * FROM null_old_style;

id|val1 |val2|
--+-----+----+
 1|Hello|    |
 2|Hello|    |
 3|Hello|    |
 4|Hello|    |
 5|Hello|    |

This behavior is documented and expected, and is even part of the ANSI SQL Standard. That said, I never really liked the above behavior because it is not restrictive enough.

With the new NULLS NOT DISTINCT option, the unique constraint is more restrictive by not allowing repeated NULL values. Add one row to start.

INSERT INTO null_new_style (val1, val2)
SELECT 'Hello', NULL;

SELECT * FROM null_new_style;

id|val1 |val2|
--+-----+----+
 1|Hello|    |

Attempting to add a second row with 'Hello' in val1 and NULL in val2 now results in a violation of the unique constraint.

INSERT INTO null_new_style (val1, val2)
SELECT 'Hello', NULL;

SQL Error [23505]: ERROR: duplicate key value violates unique constraint "uq_val1_val2_new"
  Detail: Key (val1, val2)=(Hello, null) already exists.

Of course, a changing the value in val1 to a new value will allow another NULL to be added in val2.

INSERT INTO null_new_style (val1, val2)
SELECT 'World', NULL;

id|val1 |val2|
--+-----+----+
 1|Hello|    |
 3|World|    |

This fits with my mental model of how I think unique constraints should work with null values.

Summary

I am happy to see the addition of UNIQUE NULLS NOT DISTINCT coming in Postgres 15. This adds an additional level of quality control to Postgres. It's also a low-impact change as the default operation will continue as it has been.

Need help with your PostgreSQL servers or databases? Contact us to start the conversation!

By Ryan Lambert
Published July 11, 2022
Last Updated July 11, 2022