You are viewing a preview of this course. Sign in to start learning

Lesson 3: JOINs — Combining Tables

Learn how to combine data from multiple tables using INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN. Understand why databases split data across tables and how to bring it back together.

Lesson 3: JOINs — Combining Tables 🔗

Introduction

In Lesson 1, you learned to query a single table. In Lesson 2, you summarized data with aggregations. But real-world databases don't store everything in one giant table. Instead, they normalize data—splitting it across multiple related tables to avoid redundancy and maintain data integrity.

Imagine an online store. You might have:

  • A customers table with customer information
  • An orders table with order details
  • A products table with product information

Each order references a customer and contains products. To answer questions like "What did customer Sarah Johnson order?" or "Which products are most popular?", you need to JOIN these tables together.

💡 Think of JOINs as relationship matchmakers: They connect rows from different tables based on shared information (like customer IDs or product codes).

Why Tables Are Split: Database Normalization 📚

Before we dive into JOINs, let's understand why databases are organized this way.

❌ The Problem: One Giant Table

Imagine storing everything in a single order_data table:

+----------+---------------+-------+-------------+-----------+-------+
| order_id | customer_name | email | product     | price     | qty   |
+----------+---------------+-------+-------------+-----------+-------+
| 1        | Sarah Johnson | s@... | Laptop      | 999.99    | 1     |
| 2        | Sarah Johnson | s@... | Mouse       | 29.99     | 2     |
| 3        | Mike Chen     | m@... | Keyboard    | 79.99     | 1     |
| 4        | Sarah Johnson | s@... | USB Cable   | 9.99      | 3     |
+----------+---------------+-------+-------------+-----------+-------+

Problems:

  • Sarah's name and email are duplicated (wastes space, risks inconsistency)
  • If Sarah changes her email, you must update multiple rows
  • If you delete all of Sarah's orders, you lose her contact information

✅ The Solution: Normalized Tables

Split the data into related tables:

customers table:

+-------------+---------------+------------------+
| customer_id | name          | email            |
+-------------+---------------+------------------+
| 1           | Sarah Johnson | sarah@email.com  |
| 2           | Mike Chen     | mike@email.com   |
+-------------+---------------+------------------+

products table:

+------------+-------------+---------+
| product_id | name        | price   |
+------------+-------------+---------+
| 101        | Laptop      | 999.99  |
| 102        | Mouse       | 29.99   |
| 103        | Keyboard    | 79.99   |
| 104        | USB Cable   | 9.99    |
+------------+-------------+---------+

orders table:

+----------+-------------+------------+----------+
| order_id | customer_id | product_id | quantity |
+----------+-------------+------------+----------+
| 1        | 1           | 101        | 1        |
| 2        | 1           | 102        | 2        |
| 3        | 2           | 103        | 1        |
| 4        | 1           | 104        | 3        |
+----------+-------------+------------+----------+

Benefits:

  • Each piece of data stored once
  • Update Sarah's email in one place
  • Customer info persists even without orders

🧠 Mnemonic: Normalization Organizes Data Uniquely—"NO DUPLICATION"

Core Concepts: Types of JOINs 🔄

A JOIN combines rows from two or more tables based on a related column (usually a foreign key). The join condition specifies how rows should match.

Basic JOIN Syntax

SELECT columns
FROM table1
JOIN_TYPE table2
  ON table1.column = table2.column;

Components:

  • table1: The left table (first mentioned)
  • table2: The right table (being joined)
  • ON: Specifies the join condition (how to match rows)
  • JOIN_TYPE: Determines which rows are included

INNER JOIN: Only Matching Rows 🎯

An INNER JOIN returns only rows that have matches in both tables.

   Table A          Table B          Result
  +-------+        +-------+        +-------+
  |   1   |        |   2   |        |   2   |
  |   2   |   ∩    |   3   |   =    |   3   |
  |   3   |        |   4   |        |   3   |
  +-------+        +-------+        +-------+

Example:

SELECT customers.name, orders.order_id
FROM customers
INNER JOIN orders
  ON customers.customer_id = orders.customer_id;

This returns only customers who have placed orders. Customers without orders are excluded.

💡 Tip: INNER JOIN can be written as just JOIN—they're identical.

LEFT JOIN (LEFT OUTER JOIN): All Left + Matches 👈

A LEFT JOIN returns all rows from the left table, plus matching rows from the right table. When there's no match, right table columns are NULL.

   Table A          Table B          Result
  +-------+        +-------+        +-------+
  |   1   |        |   2   |        |   1   | NULL
  |   2   |   ∪←   |   3   |   =    |   2   |
  |   3   |        |   4   |        |   3   |
  +-------+        +-------+        +-------+

Example:

SELECT customers.name, orders.order_id
FROM customers
LEFT JOIN orders
  ON customers.customer_id = orders.customer_id;

This returns all customers, including those who haven't placed orders (their order_id will be NULL).

🔧 Use case: "Show me all customers and their orders, including customers with no orders."

RIGHT JOIN (RIGHT OUTER JOIN): All Right + Matches 👉

A RIGHT JOIN is the opposite of LEFT JOIN—it returns all rows from the right table, plus matching rows from the left table.

   Table A          Table B          Result
  +-------+        +-------+        +-------+
  |   1   |        |   2   |        |   2   |
  |   2   |   →∪   |   3   |   =    |   3   |
  |   3   |        |   4   |        |   4   | NULL
  +-------+        +-------+        +-------+

Example:

SELECT customers.name, orders.order_id
FROM customers
RIGHT JOIN orders
  ON customers.customer_id = orders.customer_id;

This returns all orders, even if customer data is missing (unlikely in a well-designed database).

💡 Tip: Most developers prefer LEFT JOIN and simply swap table order. A LEFT JOIN B equals B RIGHT JOIN A.

FULL JOIN (FULL OUTER JOIN): Everything 🌐

A FULL JOIN returns all rows from both tables. When there's no match, the missing side shows NULL.

   Table A          Table B          Result
  +-------+        +-------+        +-------+
  |   1   |        |   2   |        |   1   | NULL
  |   2   |   ∪    |   3   |   =    |   2   |
  |   3   |        |   4   |        |   3   |
  +-------+        +-------+        |   4   | NULL
                                    +-------+

Example:

SELECT customers.name, orders.order_id
FROM customers
FULL JOIN orders
  ON customers.customer_id = orders.customer_id;

This shows all customers and all orders, including unmatched rows from both sides.

⚠️ Note: Not all databases support FULL JOIN (notably MySQL). You can simulate it with UNION of LEFT and RIGHT JOINs.

Visual Summary: JOIN Types 📊

┌─────────────────────────────────────────────────────────┐
│                    JOIN TYPE MATRIX                     │
├─────────────┬───────────────────────────────────────────┤
│ INNER JOIN  │ Only matching rows from both tables       │
├─────────────┼───────────────────────────────────────────┤
│ LEFT JOIN   │ All left + matching right (NULL if none)  │
├─────────────┼───────────────────────────────────────────┤
│ RIGHT JOIN  │ All right + matching left (NULL if none)  │
├─────────────┼───────────────────────────────────────────┤
│ FULL JOIN   │ All rows from both tables                 │
└─────────────┴───────────────────────────────────────────┘

Detailed Examples 💻

Example 1: INNER JOIN with Customer Orders

Scenario: Find all orders with customer names and email addresses.

SELECT 
  orders.order_id,
  customers.name,
  customers.email,
  orders.quantity
FROM orders
INNER JOIN customers
  ON orders.customer_id = customers.customer_id;

Result:

+----------+---------------+------------------+----------+
| order_id | name          | email            | quantity |
+----------+---------------+------------------+----------+
| 1        | Sarah Johnson | sarah@email.com  | 1        |
| 2        | Sarah Johnson | sarah@email.com  | 2        |
| 3        | Mike Chen     | mike@email.com   | 1        |
| 4        | Sarah Johnson | sarah@email.com  | 3        |
+----------+---------------+------------------+----------+

Explanation:

  • We join orders and customers on the common customer_id column
  • Only orders with matching customer records appear
  • Notice Sarah appears three times (she has three orders)
  • We use table prefixes (orders.order_id, customers.name) to avoid ambiguity

Example 2: Multiple JOINs for Complete Order Information

Scenario: Show order details including customer names, product names, and prices.

SELECT 
  orders.order_id,
  customers.name AS customer_name,
  products.name AS product_name,
  products.price,
  orders.quantity,
  (products.price * orders.quantity) AS total
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id
INNER JOIN products ON orders.product_id = products.product_id
ORDER BY orders.order_id;

Result:

+----------+---------------+--------------+--------+----------+--------+
| order_id | customer_name | product_name | price  | quantity | total  |
+----------+---------------+--------------+--------+----------+--------+
| 1        | Sarah Johnson | Laptop       | 999.99 | 1        | 999.99 |
| 2        | Sarah Johnson | Mouse        | 29.99  | 2        | 59.98  |
| 3        | Mike Chen     | Keyboard     | 79.99  | 1        | 79.99  |
| 4        | Sarah Johnson | USB Cable    | 9.99   | 3        | 29.97  |
+----------+---------------+--------------+--------+----------+--------+

Explanation:

  • We chain two JOINs to connect three tables
  • First JOIN brings customer information
  • Second JOIN brings product information
  • We calculate total using data from both joined tables
  • Column aliases (AS customer_name) make results clearer

💡 Tip: You can join as many tables as needed—just add more JOIN clauses.

Example 3: LEFT JOIN to Find Customers Without Orders

Scenario: List all customers and show how many orders they've placed (including zero).

SELECT 
  customers.name,
  customers.email,
  COUNT(orders.order_id) AS order_count
FROM customers
LEFT JOIN orders
  ON customers.customer_id = orders.customer_id
GROUP BY customers.customer_id, customers.name, customers.email
ORDER BY order_count DESC;

Assume we have an additional customer:

+-------------+---------------+------------------+
| customer_id | name          | email            |
+-------------+---------------+------------------+
| 1           | Sarah Johnson | sarah@email.com  |
| 2           | Mike Chen     | mike@email.com   |
| 3           | Lisa Park     | lisa@email.com   |
+-------------+---------------+------------------+

Result:

+---------------+------------------+-------------+
| name          | email            | order_count |
+---------------+------------------+-------------+
| Sarah Johnson | sarah@email.com  | 3           |
| Mike Chen     | mike@email.com   | 1           |
| Lisa Park     | lisa@email.com   | 0           |
+---------------+------------------+-------------+

Explanation:

  • LEFT JOIN ensures all customers appear, even Lisa who has no orders
  • COUNT(orders.order_id) counts orders (returns 0 for NULL)
  • We GROUP BY customer to aggregate their orders
  • This pattern is crucial for finding "missing" relationships

🔧 Try this: Change to INNER JOIN and Lisa disappears—only customers with orders remain.

Example 4: Self JOIN for Hierarchical Data

Scenario: Find employees and their managers (both stored in the same employees table).

employees table:

+-------------+---------------+------------+
| employee_id | name          | manager_id |
+-------------+---------------+------------+
| 1           | Alice Chen    | NULL       |
| 2           | Bob Smith     | 1          |
| 3           | Carol Lee     | 1          |
| 4           | David Park    | 2          |
+-------------+---------------+------------+

Query:

SELECT 
  e.name AS employee,
  m.name AS manager
FROM employees e
LEFT JOIN employees m
  ON e.manager_id = m.employee_id
ORDER BY e.employee_id;

Result:

+-------------+-------------+
| employee    | manager     |
+-------------+-------------+
| Alice Chen  | NULL        |
| Bob Smith   | Alice Chen  |
| Carol Lee   | Alice Chen  |
| David Park  | Bob Smith   |
+-------------+-------------+

Explanation:

  • We join the employees table to itself
  • Table aliases (e for employee, m for manager) are essential here
  • Alice has no manager (top of hierarchy, manager_id is NULL)
  • This pattern works for any hierarchical or self-referential data

🧠 Mnemonic: Self JOIN = Same table, different roles

Common Mistakes to Avoid ⚠️

1. Forgetting the Join Condition

Wrong:

SELECT customers.name, orders.order_id
FROM customers
JOIN orders;

Problem: Without ON, you get a Cartesian product—every customer paired with every order (exponential rows!).

Correct:

SELECT customers.name, orders.order_id
FROM customers
JOIN orders
  ON customers.customer_id = orders.customer_id;

2. Ambiguous Column Names

Wrong:

SELECT name
FROM customers
JOIN orders
  ON customers.customer_id = orders.customer_id;

Problem: If orders also has a name column, database doesn't know which one you want.

Correct:

SELECT customers.name
FROM customers
JOIN orders
  ON customers.customer_id = orders.customer_id;

💡 Best practice: Always use table prefixes in JOINs, even when not ambiguous—makes queries clearer.

3. Using INNER JOIN When You Need LEFT JOIN

Problem: "Show all customers" with INNER JOIN excludes customers without orders.

-- This MISSES customers without orders
SELECT customers.name, COUNT(orders.order_id)
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id
GROUP BY customers.name;

Correct:

SELECT customers.name, COUNT(orders.order_id)
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id
GROUP BY customers.name;

🔍 Ask yourself: Do I need all rows from one table? Use LEFT/RIGHT JOIN.

4. Confusing Join Order with LEFT JOIN

-- These are DIFFERENT:
SELECT * FROM customers LEFT JOIN orders ...
SELECT * FROM orders LEFT JOIN customers ...

The first keeps all customers; the second keeps all orders. Table order matters with LEFT/RIGHT JOIN!

5. Not Using Aliases with Multiple JOINs

Hard to read:

SELECT orders.order_id, customers.name, products.name
FROM orders
JOIN customers ON orders.customer_id = customers.customer_id
JOIN products ON orders.product_id = products.product_id;

Clearer with aliases:

SELECT o.order_id, c.name, p.name AS product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id;

Key Takeaways 🎯

  1. Normalization splits data to eliminate redundancy—JOINs bring it back together
  2. INNER JOIN returns only matching rows from both tables
  3. LEFT JOIN returns all rows from the left table plus matches from the right
  4. RIGHT JOIN returns all rows from the right table plus matches from the left
  5. FULL JOIN returns all rows from both tables (not supported in all databases)
  6. Always specify the join condition with ON to avoid Cartesian products
  7. Use table prefixes or aliases to avoid ambiguity
  8. You can join multiple tables by chaining JOIN clauses
  9. Self JOIN joins a table to itself (useful for hierarchical data)
  10. Choose INNER vs. LEFT/RIGHT based on whether you need all rows from one side

🤔 Did you know? The term "JOIN" in SQL is inspired by set theory in mathematics—you're essentially performing set operations (intersection, union) on table data!

Advanced Concepts Preview 🚀

What's next:

  • JOIN with WHERE: Filter joined results
  • Subqueries in JOIN: Join to a query result
  • CROSS JOIN: Deliberate Cartesian products
  • Non-equi JOINs: Join conditions beyond equality
  • Join performance: Indexes and optimization

📚 Further Study

  1. PostgreSQL JOIN Documentation: https://www.postgresql.org/docs/current/tutorial-join.html
  2. SQL Joins Visualizer: https://sql-joins.leopard.in.ua/ (interactive diagrams)
  3. Use The Index, Luke - Joins: https://use-the-index-luke.com/sql/join (performance deep-dive)

📋 Quick Reference Card: JOINs Cheat Sheet

┌────────────────────────────────────────────────────────────────┐
│                        SQL JOINS                               │
├──────────────┬─────────────────────────────────────────────────┤
│ JOIN Type    │ Returns                                         │
├──────────────┼─────────────────────────────────────────────────┤
│ INNER JOIN   │ Only matching rows from both tables             │
│ LEFT JOIN    │ All left + matching right (NULL for missing)    │
│ RIGHT JOIN   │ All right + matching left (NULL for missing)    │
│ FULL JOIN    │ All rows from both tables                       │
├──────────────┴─────────────────────────────────────────────────┤
│ Basic Syntax:                                                  │
│   SELECT columns                                               │
│   FROM table1                                                  │
│   JOIN table2 ON table1.id = table2.id;                        │
├────────────────────────────────────────────────────────────────┤
│ Multiple Joins:                                                │
│   FROM orders o                                                │
│   JOIN customers c ON o.customer_id = c.id                     │
│   JOIN products p ON o.product_id = p.id;                      │
├────────────────────────────────────────────────────────────────┤
│ Self Join:                                                     │
│   FROM employees e                                             │
│   JOIN employees m ON e.manager_id = m.id;                     │
├────────────────────────────────────────────────────────────────┤
│ ⚠️ Always use ON clause to avoid Cartesian products           │
│ 💡 Use table aliases (a, b) for readability                   │
│ 🎯 LEFT JOIN when you need ALL from left table                │
└────────────────────────────────────────────────────────────────┘