2023-03-04

Self Join in SQL

What is Self Join in SQL

A self join in SQL joins a table to itself, so that rows of the table can be matched against other rows of the same table. It is most useful when a table refers to itself, for example hierarchical data in which each row stores the ID of its parent row.

In a self join, the same table appears twice in the query under two different aliases. Each alias behaves like an independent copy of the table, although no data is actually copied. Joining the two aliases on related columns (typically a column that holds the key of another row, matched against the key column itself) lets us compare rows within the same table as if they came from two separate tables.
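To make the aliasing concrete, here is a minimal sketch that runs a self join through Python's built-in sqlite3 module. The `staff` table, its columns, and its rows are hypothetical and exist only for this illustration. The query lists every pair of people on the same team.

```python
import sqlite3

# Hypothetical table for illustration only: staff(id, name, team).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, team TEXT);
    INSERT INTO staff VALUES
        (1, 'Ana', 'data'), (2, 'Ben', 'data'), (3, 'Cy', 'web'), (4, 'Di', 'data');
""")

# a and b are two aliases for the same table. The condition a.id < b.id
# keeps each pair once and stops a row from pairing with itself.
same_team_pairs = conn.execute("""
    SELECT a.name, b.name, a.team
    FROM staff AS a
    INNER JOIN staff AS b
        ON a.team = b.team AND a.id < b.id
    ORDER BY a.id, b.id
""").fetchall()

for row in same_team_pairs:
    print(row)
```

Cy is alone on the web team, so no pair contains Cy. Without the `a.id < b.id` condition, each pair would appear twice and every row would also match itself.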

A self join can use any join type: inner join, left join, right join, or full outer join. The choice decides what happens to rows without a match. An inner join returns only the rows that have a match, while a left join keeps every row from the first alias and fills the columns of the second alias with NULL when there is no match.
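The difference between join types is easiest to see on a row that has no match. The sketch below uses a hypothetical `category` table (not part of the examples in this article) and compares an inner and a left self join. Right and full outer joins are left out because older SQLite versions do not support them.

```python
import sqlite3

# Hypothetical table for illustration: each category may point to a parent.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO category VALUES
        (1, 'Electronics', NULL),
        (2, 'Laptops', 1),
        (3, 'Phones', 1);
""")

def child_parent_rows(join_type):
    # join_type only ever receives the fixed strings "INNER" or "LEFT" below.
    query = f"""
        SELECT c.name, p.name
        FROM category AS c
        {join_type} JOIN category AS p ON c.parent_id = p.id
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()

inner_rows = child_parent_rows("INNER")
left_rows = child_parent_rows("LEFT")

print(len(inner_rows), inner_rows)
print(len(left_rows), left_rows)
```

The inner join drops 'Electronics' because its `parent_id` is NULL. The left join keeps it and reports its parent as `None`, which is how sqlite3 represents NULL in Python.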

Examples of Self Join in SQL

Here are examples of self join in SQL, along with sample tables, code, and output.

Hierarchical Data Management

Consider a scenario where you have an employee table with columns for employee ID, name, and manager ID. You want to generate a report that displays the names of all employees along with the names of their respective managers. You can use a self join to accomplish this task.

Employee ID  Employee Name  Manager ID
1            John           2
2            Mary           3
3            Peter          NULL
4            Sarah          3
5            Tom            2

```sql
-- e is read as the employee row; m is the same table read as the manager row.
SELECT e.Employee_Name, m.Employee_Name AS Manager_Name
FROM Employee e
INNER JOIN Employee m ON e.Manager_ID = m.Employee_ID;
```

Output will be like this:

Employee Name  Manager Name
John           Mary
Sarah          Peter
Tom            Mary
Mary           Peter

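This output shows the name of each employee along with the name of their manager. Peter does not appear in the employee column: his Manager ID is NULL, so the INNER JOIN finds no matching manager row and drops him. A LEFT JOIN would keep him with a NULL manager name.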
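The employee example can be reproduced with Python's built-in sqlite3 module. This is a sketch under two assumptions: the column types are guesses, and an ORDER BY clause is added only so that the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Employee_ID INTEGER PRIMARY KEY,
        Employee_Name TEXT,
        Manager_ID INTEGER
    );
    INSERT INTO Employee VALUES
        (1, 'John', 2), (2, 'Mary', 3), (3, 'Peter', NULL),
        (4, 'Sarah', 3), (5, 'Tom', 2);
""")

# ORDER BY is added here only to make the row order deterministic.
pairs = conn.execute("""
    SELECT e.Employee_Name, m.Employee_Name AS Manager_Name
    FROM Employee e
    INNER JOIN Employee m ON e.Manager_ID = m.Employee_ID
    ORDER BY e.Employee_ID
""").fetchall()

# Employees that the inner join left out because they have no manager row.
all_names = {name for (name,) in conn.execute("SELECT Employee_Name FROM Employee")}
missing = all_names - {employee for employee, _ in pairs}

print(pairs)
print(missing)
```

The `missing` set contains only Peter, the one employee whose Manager ID is NULL.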

Finding Relationships Between Data

Consider a scenario where you have a customer table with columns for customer ID, name, and referral ID, where the referral ID holds the customer ID of the customer who made the referral. You want to generate a report that displays the names of all customers along with the names of the customers who referred them. You can use a self join to accomplish this task.

Customer ID  Customer Name  Referral ID
1            John           NULL
2            Mary           1
3            Peter          2
4            Sarah          3
5            Tom            1

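```sql
-- c is the referred customer; r is the same table read as the referrer.
-- LEFT JOIN keeps customers that nobody referred (Referral_ID is NULL).
SELECT c.Customer_Name, r.Customer_Name AS Referral_Name
FROM Customer c
LEFT JOIN Customer r ON c.Referral_ID = r.Customer_ID;
```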

Output will be like this:

Customer Name  Referral Name
John           NULL
Mary           John
Peter          Mary
Sarah          Peter
Tom            John

This output shows the name of each customer along with the name of the customer who referred them, if applicable.
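The referral report can be checked the same way with Python's built-in sqlite3 module. As before, the column types and the ORDER BY clause are assumptions added for the sake of a runnable sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        Customer_ID INTEGER PRIMARY KEY,
        Customer_Name TEXT,
        Referral_ID INTEGER
    );
    INSERT INTO Customer VALUES
        (1, 'John', NULL), (2, 'Mary', 1), (3, 'Peter', 2),
        (4, 'Sarah', 3), (5, 'Tom', 1);
""")

# ORDER BY is added here only to make the row order deterministic.
referrals = conn.execute("""
    SELECT c.Customer_Name, r.Customer_Name AS Referral_Name
    FROM Customer c
    LEFT JOIN Customer r ON c.Referral_ID = r.Customer_ID
    ORDER BY c.Customer_ID
""").fetchall()

for customer, referrer in referrals:
    # sqlite3 returns SQL NULL as None.
    print(customer, "was referred by", referrer if referrer is not None else "nobody")
```

All five customers appear because the LEFT JOIN keeps John even though nobody referred him.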

Analyzing Complex Data Structures

Consider a scenario where you have a sales table with columns for sales ID, date, product ID, and customer ID. You want to generate a report that displays the total number of sales for each customer for each product. You can use a self join to accomplish this task.

Sales ID  Date        Product ID  Customer ID
1         2022-01-01  1           1
2         2022-01-01  1           2
3         2022-01-01  2           1
4         2022-01-01  2           2
5         2022-01-01  1           3

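```sql
-- s2 is a second alias of Sales, matched on the same customer and product.
-- COUNT(DISTINCT ...) counts each matching sale once; COUNT(*) on the joined
-- rows would inflate the total for any pair with more than one sale.
-- Customer_ID and Product_ID are already columns of Sales, so no other tables are needed.
SELECT s.Customer_ID, s.Product_ID, COUNT(DISTINCT s2.Sales_ID) AS Total_Sales
FROM Sales s
INNER JOIN Sales s2 ON s.Product_ID = s2.Product_ID AND s.Customer_ID = s2.Customer_ID
GROUP BY s.Customer_ID, s.Product_ID;
```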

Output will be like this:

Customer ID  Product ID  Total Sales
1            1           1
1            2           1
2            1           1
2            2           1
3            1           1

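This output shows the total number of sales for each customer for each product. Every total is 1 because each customer and product pair appears only once in the sample data. Note that a plain GROUP BY on the Sales table alone yields the same totals. When a self join is used for this kind of count, a pair with more than one sale matches several rows of the second alias, so the duplicates must be accounted for, for example by counting distinct sale IDs.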
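A self join on Sales matches every sale with every sale of the same customer and product, so counting the joined rows overstates pairs that occur more than once. The sketch below, using Python's built-in sqlite3 module, adds one hypothetical repeat sale (row 6, which is not in the sample data) and compares three ways of counting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows 1 to 5 are the sample data; row 6 is a hypothetical repeat sale
# (customer 1 buys product 1 again) added to expose the difference.
conn.executescript("""
    CREATE TABLE Sales (
        Sales_ID INTEGER PRIMARY KEY,
        "Date" TEXT,
        Product_ID INTEGER,
        Customer_ID INTEGER
    );
    INSERT INTO Sales VALUES
        (1, '2022-01-01', 1, 1), (2, '2022-01-01', 1, 2),
        (3, '2022-01-01', 2, 1), (4, '2022-01-01', 2, 2),
        (5, '2022-01-01', 1, 3),
        (6, '2022-01-02', 1, 1);
""")

SELF_JOIN = """
    SELECT s.Customer_ID, s.Product_ID, {count_expr} AS Total_Sales
    FROM Sales s
    INNER JOIN Sales s2
        ON s.Product_ID = s2.Product_ID AND s.Customer_ID = s2.Customer_ID
    GROUP BY s.Customer_ID, s.Product_ID
    ORDER BY s.Customer_ID, s.Product_ID
"""

# Counting joined rows: a pair with n sales produces n * n rows.
naive = conn.execute(SELF_JOIN.format(count_expr="COUNT(*)")).fetchall()
# Counting distinct sale IDs removes the duplication.
distinct = conn.execute(
    SELF_JOIN.format(count_expr="COUNT(DISTINCT s2.Sales_ID)")
).fetchall()
# No self join at all: group the table directly.
plain = conn.execute("""
    SELECT Customer_ID, Product_ID, COUNT(*) AS Total_Sales
    FROM Sales
    GROUP BY Customer_ID, Product_ID
    ORDER BY Customer_ID, Product_ID
""").fetchall()

print(naive[0], distinct[0], plain[0])
```

For customer 1 and product 1, the row-counting self join reports 4 sales, while the other two queries report the correct total of 2. For a simple count like this one, grouping the table directly is usually the clearer choice.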

Ryusei Kakujo
