check for duplicates - count, case, and exist in postgres

2019-07-16

 | 

~3 min read

 | 

504 words

Before I write to a database, I want to make sure that I don’t create a duplicate record based on the name and parent record id.

To accomplish that, I want to abort early and alert the client if the name they’ve provided is a duplicate.

The pseudocode looks something like:

const createNewRecord({name, parent_id}) => {

const duplicate = await checkForDuplicate({name, parent_id})
if (duplicate) throw new Error(`Oy! A duplicate. Try a different name. We already have a value for --> `, ${name})

const data = await this.pool.query(SQL`
-- insert query
`);
return data.rows;

I’ve written previously about how to insert values into tables with Postgres, but what I wasn’t sure about was how to write a query that would return a boolean value for duplicates (i.e. SQL query to use in checkForDupcliate) (Note: I’m writing in Javascript and interpolating SQL statements using sql-template-string)

Start With COUNT(*)

One way to figure out if any records exist would be to count the total that meet a condition using COUNT(*).

For example:

SELECT COUNT(*) FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name};

This is asking a different question, however. Instead of asking a true/false question, I’m now asking how many. My conditional in Javascript would no longer be if (duplicate) but more reasonably 0).

From a purely computational perspective, counting is more expensive than checking for the existence of something.

Move On To Conditionals

Instead of the count, we can use the CASE conditional. The CASE is a generic conditional expression in Postgres.

The Postgres documentation for CASE provides the following definition:

CASE WHEN condition THEN result
     [WHEN ...]
     [ELSE result]
END

Great, but what is my condition?

In this case, all I care about is if something exists… which is convenient, because Postgres’ EXIST checks for exactly that.

EXIST is one of several subquery expressions (similar to IN, NOT IN, ALL, etc.).

It takes a subquery and evaluates to true if rows are returned by the subquery and false otherwise.

The resulting query might look like:

SELECT CASE WHEN EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) THEN TRUE::bool
  ELSE FALSE::bool
END `;

Refactoring Time: Dropping Case

This CASE approach works. But, it’s unnecessary. If I wanted to return something other than a boolean I could use CASE. E.g.,

SELECT CASE WHEN EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) THEN 'exists'::text
  ELSE 'does not exist'::text
END `;

But that’s not what I’m looking for. I just need TRUE/FALSE which is what the EXIST returns natively.

As a result, I simplified the query to the following:

SELECT EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) AS exists `

Additional Reading



Hi there and thanks for reading! My name's Stephen. I live in Chicago with my wife, Kate, and dog, Finn. Want more? See about and get in touch!