Avoid NULL Pitfalls: Use NOT EXISTS Instead of NOT IN in PostgreSQL
- Mukilyavimali
- Jun 16
- 4 min read
Hey Everyone,
Happy to connect you all once again through my writing...! Recently I have come across a situation where I need to find the customer details who never rented any movie. This one seems to be very simple but I failed to produce the expected output. There I have learnt few things which is quite interesting and as a data viz we all should remember this kind of scenario. So, Let’s Start….
Prerequisite: Postgresql (PgAdmin 4) I have used Postgresql for the demo purpose.
Problem Statement:
"Find the Customers who never rented a movie from the given dataset."
Table Creation and Data Insertion:
Table 1: customer_temp
Here I have created a temporary table to explain the scenario that I had come across. So first I have created a temporary table named “customer_temp” with the following query. You can see the table with values below.

Table 2: rental_temp
Now I have created another table named “rental_temp” with the below values.

So, here our aim is to find the customer who never rented a movie. Means we have to find the customer id which is not present in the rental_temp table. To solve this query, We have few approaches. We could try each one and let’s see how each approach is producing the result.
Using NOT IN Subquery:
The first approach is to use "Subquery with NOT IN", which has to result the customer id that is not present in the rental_temp table. So, let’s see…

But above "Subquery with NOT IN" is not showing the expected output, what could be the issue? Does it mean that all the customers rented at least a movie? OR does the table has any invalid data? I would like to try an another approach and see the result.
Using NOT EXIST Subquery:
The next approach is to use "Subquery with NOT EXIST".

Above "Subquery with NOT EXIST" is showing the expected output that two persons who never rented a movie.
Why I am getting the desired output with "NOT EXIST" but not in when using ‘NOT IN" ?
As per my expectation, both would result the same but its not, so after my research I came to know that there could be some invalid values in the rental table which made NOT IN inner query silently failed to consider the invalid values. But in NOT EXIST case the inner query evaluation is different that is why it has produced the expected output.
So, How the "NOT EXIST" query is making the evaluation different ?
This query checks the rental_temp table for any rows where the customer_id matches the one from the outer customer_temp table.
In above query we used SELECT 1 because it doesn't find the exact values in the table , it's just consider whether a row exists or not.
If no such row is found, the outer query knows: “This customer has never rented a movie.”
This is how above query is executed and produced the expected output.
If we look at the above rental_temp table which has the invalid values such as NULL or Missing values thats why "Subquery with NOT IN" didn't produce the expected output. So, its advisable to use "Subquery with NOT EXIST" for this kind of scenario.
We could try another approach with JOIN...
Using LEFT JOIN Function:
Here I am trying LEFT JOIN to see the result.

Above LEFT JOIN query is also showing the same expected output.
We have found that "NOT IN" is not advisable approach for this type of scenarios rather "NOT EXIST" is the one safer to use. Once again, why are we speaking about JOINS here?
Let me show you....!
What is the advantage of JOINS over Subqueries?
Let’s analyze their execution and planning time.
For the Subquery,

Above Sub query shows 0.289ms as execution time.
Now we will analyze the Left Join query,

Above Left Join is showing 0.230ms which is lesser than the previous "Subquery with NOT EXIST".
Imagine if we have really a large data set which has millions of rows, JOINS will ultimately result the optimised query performance. This proves that Joins are always better than the Subqueries undoubtedly.
I have curated a comparison table for the above three approaches:
Method | Handles NULLs? | Returns Correct Result? | Recommended? |
NOT IN (subquery) | ❌ No | ❌ No rows returned | ❌ Not safe |
NOT EXISTS (subquery) | ✅ Yes | ✅ Yes | ✅ Best practice |
LEFT JOIN ... IS NULL | ✅ Yes | ✅ Yes | ✅ Good option |
Yes, Now we have a better understanding on these approaches and to write optimised query. If you are in line with me, you would also have the doubt in real time, how would I distinguish the situations that I need to use NOT EXIST or LEFT JOIN. Don’t worry I have documented that also for our better understanding. And I am damn sure you would also get better flow when you practice on different domain related problems.
Sample scenarios are,
📦 Customer & Orders / Rentals / Purchases
Customers who never rented
Customers who never placed an order
Customers who did not buy a specific product
🛒 Products / Inventory / Sales
Products never sold
Items not in stock (In inventory)
Products never ordered by a particular region
🧑💼 Employee / Task / Attendance
Employees who were never assigned a task
Employees who never logged attendance
Staff with no sales this month
🏫 Student / Course / Exam
Students who didn’t enroll in any course
Students who didn’t attend an exam
Students who never failed a course
🏦 Banking / Finance
Customers with no transactions
Accounts that were never debited
Users who never applied for a loan
📈 Marketing / Ads / Analytics
Users who never clicked an ad
Campaigns with no conversions
Visitors who didn't sign up
Conclusion:
So guys, Here comes the conclusion. For the above mentioned problem statements , always prefer to use either "Subquery with NOT EXIST" or "LEFT JOIN ...... IS NULL" for better query performance and to avoid unexpected behaviour of the query result. Hope my learning experience in this blog would help you in a way whenever the needful situation arise.
Happy Learning…!

