top of page

Welcome
to NumpyNinja Blogs

NumpyNinja: Blogs. Demystifying Tech,

One Blog at a Time.
Millions of views. 

SQL in Action: Tracking Customer Outstanding Loans in Banking Using "SELF JOIN"

Image from stock.adobe.com
Image from stock.adobe.com

In the previous blogs of this series, I explored how data analysis fits into real business scenarios.


Now, it’s time to take the next step in this journey, working with larger more complex datasets, the kind that banks deal with every day.


One of the challenges they face constantly is figuring out the loan outstanding for customers who have multiple loans.


Think about it! customers rarely just have one loan. Many of them juggle a home loan, a personal loan and maybe even a credit loan all at the same time. Each with different disbursement dates, repayment schedules, and balances.

When data reaches that scale and complexity, manual calculations or simple tools just don’t cut it. That’s exactly where SQL comes in to save the day.


Why SQL is a Game Changer?


I chose SQL for this analysis because it’s designed to handle large, structured datasets efficiently, while still letting you dig into the data and make sense of it logically.

SQL allows you to connect records within the same table, compare related rows, and uncover meaningful insights without duplicating data or getting lost in spreadsheets.


In this scenario, one technique stands out, the SELF JOIN. With a self-join, you can link a customer’s multiple loan records together, which makes it much easier to see their total outstanding loan amount. It’s a simple idea, but incredibly powerful when dealing with real world banking data.


SQL in Action: A Banking Case Study


In this blog, I’ll show you exactly how SQL can be used to track and analyze customer loan outstanding, even when customers have several active loans at the same time. This isn’t just theory I’m grounding everything in a practical banking scenario, similar to what analysts deal with every day.


The goal isn’t just to write SQL queries, it’s to understand how SQL can turn chunks of banking data into actionable insights. If you’re starting out in data analysis, or just curious about how SQL can handle more than simple queries, this blog will give you a clear look at why SQL is such a key skill for data analysts in banking and how self joins make sense of complex financial relationships in ways that are both practical and useful.


How Self-Joins Work in SQL?


If you’ve ever tried to compare data within the same table, you’ve probably realized that a normal join between two different tables won’t help. That’s exactly where a self-join comes in. Think of it like having a conversation between a table and itself. Each row can be compared or connected to other rows in the same table based on a condition you set.


Why is this useful? Imagine a table of employees, where each employee has a manager listed in the same table. With a self-join, you can pair each employee with their manager, even though all the data lives in one table. Self joins are also great when you need to compare, aggregate, or relate records that share a common key and in banking, that common key is often the customer ID.


Tracking Customer Loans with Self Joins


In this project, our goal is to figure out the loan outstanding for customers who have multiple loans. Each customer may appear several times in the loans table, with different loan types, disbursement dates, and balances. By using a self-join, we can link all these records for the same customer and analyze them together.


Essentially, the self-join lets us connect each loan record to the other loans of the same customer. This makes it possible to calculate the total outstanding, understand repayment timelines, and even identify patterns across different loan types all in a single query. In the next section, we’ll see how this works in practice with a real SQL example.



Setting Up the Customer Loans Table


To start, we need a table that holds our customer loan data. Let’s call it customer_loans. This table will include important details like the customer ID, customer name, loan type, and the outstanding amount. By setting up this table, we can simulate a real-world banking dataset and use it to demonstrate how a self-join works.

CREATE TABLE customer_loans 
(
    customer_id INT,
    customer_name VARCHAR (50),
    loan_type VARCHAR (20),
    outstanding_amount DECIMAL (10,2)
);

SQL Query to Create the Customer_loans Table
SQL Query to Create the Customer_loans Table

This structure allows us to store multiple loans for the same customer, which is exactly what we need to demonstrate the self-join technique.


Populating the Customer Loans Table


Next, we populate the table with some example data. This gives us multiple loans for the same customer, so the self-join can link these records effectively.

INSERT INTO customer_loans (customer_id, customer_name, loan_type, outstanding_amount) VALUES
(101, 'Ravi Kumar', 'Home Loan', 450000.00),
(102, 'Anita Sharma', 'Home Loan', 300000.00),
(103, 'Suresh Patel', 'Car Loan', 90000.00),
(101, 'Ravi Kumar', 'Car Loan', 120000.00),
(104, 'Priya Singh', 'Personal Loan', 60000.00),
(105, 'Amit Verma', 'Home Loan', 500000.00),
(102, 'Anita Sharma', 'Education Loan', 150000.00),
(104, 'Priya Singh', 'Credit Card Loan', 4000.00),
(105, 'Amit Verma', 'Car Loan', 110000.00),
(101, 'Ravi Kumar', 'Personal Loan', 80000.00);

Data populated into the customer_loans table
Data populated into the customer_loans table

After inserting the data, the query

 SELECT * FROM customer_loans; 

Will show the table with all loans listed for each customer. Now we have a dataset ready to apply the self-join.


Table Displaying the Inserted Data
Table Displaying the Inserted Data


Seeing the Self Join in Action


To determine each customer’s outstanding loans, we use a self-join on the customer_loans table. Here’s the query:


Self-join Query to get customer loan outstanding
Self-join Query to get customer loan outstanding

Here’s what’s happening:


  • The table is referenced multiple times using different aliases, each alias representing a different logical view of the same loan data.

  • A distinct customer list is used as the base view, ensuring each customer appears only once in the final result.

  • Each additional alias corresponds to one specific loan category rather than a loan pairing.

  • All aliases are linked using the common customer identifier, grouping related loans under the same customer.

  • LEFT JOINs are applied so that absence of a loan does not remove the customer from the output.

  • This technique reshapes multiple rows into columns without relying on pivot functions.

  • Flexible matching logic is used to handle inconsistencies in loan type values.

  • Each loan category is projected into its own column, improving readability for reporting.

  • Missing loans naturally appear as NULL (or zero if handled explicitly).

  • The final result presents a customer-centric view of loan outstanding amounts.



Breaking Down Customer’s Outstanding Loans


Result of customer's Outstanding Loan amount and its type after SELF-JOIN
Result of customer's Outstanding Loan amount and its type after SELF-JOIN

Looking at the above screenshot, we can see a clear, structured view of customers and their loans along with outstanding amounts. Each customer’s multiple loans are paired together, making it easy to assess total liabilities at a glance.

For example:

  • Amit Verma has both a Car Loan and a Home Loan, with balances shown in each category.

  • Anita Sharma has an Education Loan and a Home Loan, making her liabilities easy to understand.

  • Customers like Ravi Kumar appear in multiple combinations, reflecting Car Loans, Personal Loans, and Home Loans across different records.

  • Even customers with only one loan are displayed in a clear format, highlighting their individual loan details.

By using a self-join, we can connect all of a customer’s loans in a single view, making it much easier to compare, combine, analyze multiple records and avoid redundancy where something that would be harder to do manually or with simpler queries.


Key Insights and What’s Next


From this exercise, it’s clear that using SQL and self joins is a reliable, real-world method for analyzing customer loan data. It lets analysts quickly link multiple loan records for the same customer, providing a complete and organized view of outstanding amounts without manual calculations. This approach reflects the kind of challenges analysts face daily in the banking industry, making it both practical and powerful.


Looking ahead, in the final blog of this series, we’ll explore data visualization. Here, we’ll transform these SQL insights into clear, interactive charts and dashboards.

This is where the story comes alive helping data analysts communicate complex financial information effectively to end users and decision-makers.







 
 

+1 (302) 200-8320

NumPy_Ninja_Logo (1).png

Numpy Ninja Inc. 8 The Grn Ste A Dover, DE 19901

© Copyright 2025 by Numpy Ninja Inc.

  • Twitter
  • LinkedIn
bottom of page