Ignore_Dup_Key – the good the bad and the ugly - ProcureSQL Data Architect as a Service

What is ignore_dup_key anyway?

Simply put, it allows you to insert duplicate keys (ignore_dup_key) in an index without generating errors; the duplicate rows are merely ignored, and warnings are generated or not based on settings; more on that later.

I was working on a performance tuning exercise with a client when all the low-hanging fruit had already been taken care of. I was staring at an insert that was the number one user of resources on an instance. It looked roughly like this (trimmed down to be concise)

INSERT INTO dbo.Store (StoreId)
SELECT ids.Id
FROM @StoreIdsTable ids
    LEFT JOIN dbo.Store st ON st.StoreId = ids.Id
WHERE st.StoreId IS NULL

We want to insert all the records from the table variable where they don’t already exist.
The Store table is defined with a primary key clustered index on the StoreId Column.
This generates a plan that looks like so:

Execution plan of Insert with Left Join for dup checking.

Notice the scan of the table that is being inserted into, this is to remove potential duplicates before insert via the left join.

Ignore_Dup_Key Use Case

Enter IGNORE_DUP_KEY. Since we’re ignoring duplicates in the Store table anyway before they are inserted through the left join null condition. Why don’t we enable IGNORE_DUP_KEY so we can just run the insert straight away AKA “Just do it”. That generates the following plan:

Execution plan with ignore_dup_key enabled.

If you assumed that the ignore_dup_key is significantly faster, like I did, you’d be incorrect. While it is faster, I can only assume that the anticipated increase in speed is offset by which part of the engine is handling the duplicate exceptions.

Left Join Check Results

This is the performance outcome of the left outer join solution.

Ignore_Dup_Key Results

And here is the ignore_dup_key solution

The outcome is roughly
CPU: 47ms vs 63ms
Elapsed: 61ms vs 84ms
Logical Reads: 42907 vs 77287

This is not a normal place I’d go looking for performance increases in an application, but sometimes you need to use all the available tools in your tool belt to achieve success.

One note about IGNORE_DUP_KEY. In SQL 2017 and above, there is a new option SUPPRESS_MESSAGES = ON that instructs SQL not to return messages when using the “just do it” approach. In Azure SQL DB, however, this syntax is not available and seems to be enabled by default in current combat levels. All tests were performed in Azure.

To wrap this up, The performance of “Just Do it” is overall better if you expect very few duplicates; however, the more duplicates you have, the closer the performance will be to the typical left outer join. I intentionally set up the test harness for this with few duplicates as a % to show a typical use case for the rarely used option of IGNORE_DUP_KEY