Lesson 6: Table Design — CREATE and ALTER
Design robust database tables with proper data types, constraints, and indexes. Learn to create, modify, and optimize table structures.
Lesson 6: Table Design — CREATE and ALTER 🏗️
You've been querying existing tables for five lessons. Now it's time to build your own. Understanding table design is crucial—poor design leads to data inconsistencies, performance issues, and maintenance nightmares. Great design makes every query easier.
Why Table Design Matters 🎯
Think of a database table like a filing cabinet. You wouldn't throw papers randomly into drawers—you'd organize them with labels, dividers, and a logical structure. Similarly, proper table design ensures:
- Data integrity: The right constraints prevent bad data from entering
- Performance: Proper data types and indexes speed up queries
- Clarity: Well-named columns and meaningful constraints document your database
- Scalability: Good design handles growth without major rewrites
💡 Tip: Spend time designing tables upfront. Changing them later with live data is risky and complex.
Core Concepts: Building Tables from Scratch 🔨
1. CREATE TABLE — The Foundation
The CREATE TABLE statement defines a new table's structure:
CREATE TABLE table_name (
column_name data_type constraints,
column_name data_type constraints,
...
);
Every column needs:
- A name: Descriptive, lowercase, use underscores (e.g.,
first_name, notFirstName) - A data type: What kind of data it holds
- Optional constraints: Rules that govern the data
2. Data Types — Choosing the Right Container 📦
Data types tell the database what kind of data to expect. Using the right type is critical for storage efficiency and data validation.
Numeric Types:
+---------------+-------------------+------------------------+
| Type | Range | Use Case |
+---------------+-------------------+------------------------+
| SMALLINT | -32,768 to 32,767 | Small whole numbers |
| INTEGER (INT) | -2B to 2B | Standard whole numbers |
| BIGINT | Very large | IDs, large counts |
| DECIMAL(p,s) | Exact precision | Money, precise values |
| NUMERIC(p,s) | Same as DECIMAL | Financial data |
| REAL | 6 decimal digits | Scientific data |
| DOUBLE | 15 decimal digits | High-precision floats |
+---------------+-------------------+------------------------+
⚠️ Common Mistake: Using REAL or DOUBLE for money. Floating-point numbers have rounding errors! Always use DECIMAL(10,2) for currency.
Character/String Types:
+------------------+---------------------------+--------------------+
| Type | Description | Use Case |
+------------------+---------------------------+--------------------+
| CHAR(n) | Fixed length, padded | Country codes (US) |
| VARCHAR(n) | Variable length, max n | Names, emails |
| TEXT | Unlimited length | Descriptions, notes|
+------------------+---------------------------+--------------------+
💡 Tip: VARCHAR(255) is common because it's the maximum length that uses 1 byte for length storage in many databases.
Date/Time Types:
DATE: Calendar date (2024-01-15)TIME: Time of day (14:30:00)TIMESTAMP: Date and time combined (2024-01-15 14:30:00)INTERVAL: Time span (3 days, 2 hours)
Boolean Type:
BOOLEAN: TRUE or FALSE (some databases use TINYINT(1) instead)
Other Types:
UUID: Universally unique identifierJSON: Store JSON documents (PostgreSQL, MySQL 5.7+)ARRAY: Store arrays (PostgreSQL)
3. Constraints — The Rules of the Game 📜
Constraints enforce data integrity. They're your database's quality control.
PRIMARY KEY 🔑
Uniquely identifies each row. Every table should have one.
CREATE TABLE users (
user_id INTEGER PRIMARY KEY,
username VARCHAR(50)
);
Rules:
- Must be UNIQUE
- Cannot be NULL
- Only one per table (but can span multiple columns)
💡 Tip: Use SERIAL (PostgreSQL) or AUTO_INCREMENT (MySQL) for auto-generated integer IDs:
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
username VARCHAR(50)
);
FOREIGN KEY 🔗
Creates relationships between tables. Points to a PRIMARY KEY in another table.
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id),
order_date DATE
);
This ensures every user_id in orders exists in the users table. You can't create an order for a non-existent user!
users orders
+---------+ +-----------+
| user_id | ←─────────── | user_id | (FOREIGN KEY)
| name | | order_id |
+---------+ | amount |
+-----------+
ON DELETE and ON UPDATE actions:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
CASCADE: Delete/update related rows automaticallySET NULL: Set foreign key to NULLRESTRICT: Prevent deletion/update if related rows exist (default)NO ACTION: Similar to RESTRICT
NOT NULL ⛔
Prohibits NULL values. Use for required fields.
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL
);
⚠️ Common Mistake: Forgetting NOT NULL on obviously required fields like email addresses or product names.
UNIQUE ✨
Ensures all values in a column are different.
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
username VARCHAR(50) UNIQUE NOT NULL
);
DEFAULT 🎲
Provides a default value if none is specified.
CREATE TABLE posts (
post_id SERIAL PRIMARY KEY,
title VARCHAR(200) NOT NULL,
status VARCHAR(20) DEFAULT 'draft',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CHECK ✅
Enforces a custom condition.
CREACREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) CHECK (price > 0),
quantity INTEGER CHECK (quantity >= 0),
discount_percent INTEGER CHECK (discount_percent BETWEEN 0 AND 100)
);
4. ALTER TABLE — Modifying Existing Tables 🔧
Rarely is a table design perfect from day one. ALTER TABLE lets you modify existing tables.
Add a column:
ALTER TABLE users
ADD COLUMN phone VARCHAR(20);
Drop a column:
ALTER TABLE users
DROP COLUMN phone;
⚠️ Warning: Dropping a column deletes all its data permanently!
Modify a column:
-- PostgreSQL
ALTER TABLE users
ALTER COLUMN email TYPE VARCHAR(100);
-- MySQL
ALTER TABLE users
MODIFY COLUMN email VARCHAR(100);
Add constraints:
ALTER TABLE users
ADD CONSTRAINT email_unique UNIQUE (email);
ALTER TABLE orders
ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(user_id);
Drop constraints:
ALTER TABLE users
DROP CONSTRAINT email_unique;
5. Indexes — Speeding Up Queries 🚀
An index is like a book's index—it helps find data quickly without scanning every page.
CREATE INDEX idx_users_email ON users(email);
When to use indexes:
✅ Columns in WHERE clauses ✅ Columns in JOIN conditions ✅ Columns in ORDER BY ✅ Foreign key columns
When NOT to use indexes:
❌ Small tables (< 1000 rows) ❌ Columns with low cardinality (few unique values, like boolean) ❌ Columns frequently updated
💡 Tip: Indexes speed up reads but slow down writes. Each INSERT/UPDATE/DELETE must update the indexes too.
Types of indexes:
- B-tree (default): General purpose, handles ranges
- Hash: Exact matches only, very fast
- GiST/GIN: PostgreSQL, for full-text search and complex types
Composite indexes (multiple columns):
CREATE INDEX idx_orders_user_date ON orders(user_id, order_date);
Useful when you often query both columns together.
Examples: Building Real-World Tables 🌍
Example 1: E-commerce Product Catalog 🛒
CREATE TABLE categories (
category_id SERIAL PRIMARY KEY,
category_name VARCHAR(100) NOT NULL UNIQUE,
description TEXT
);
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(200) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
stock_quantity INTEGER DEFAULT 0 CHECK (stock_quantity >= 0),
category_id INTEGER NOT NULL REFERENCES categories(category_id),
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_products_active ON products(is_active);
Explanation:
SERIAL PRIMARY KEY: Auto-incrementing IDsDECIMAL(10,2): Exact precision for money (up to 99,999,999.99)CHECKconstraints: Ensure price and quantity aren't negativeDEFAULTvalues: New products start inactive with 0 stockREFERENCES: Products must belong to existing categories- Indexes on
category_id(for filtering by category) andis_active(for showing only active products)
Example 2: User Authentication System 🔐
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
email VARCHAR(255) NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
first_name VARCHAR(100),
last_name VARCHAR(100),
is_verified BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_login TIMESTAMP,
CONSTRAINT email_format CHECK (email LIKE '%@%')
);
CREATE TABLE user_sessions (
session_id UUID PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
ip_address VARCHAR(45),
user_agent TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP NOT NULL
);
CREATE INDEX idx_sessions_user ON user_sessions(user_id);
CREATE INDEX idx_sessions_expires ON user_sessions(expires_at);
Explanation:
UNIQUEon username and email: No duplicates allowedCHECKconstraint: Basic email validationON DELETE CASCADE: When a user is deleted, all their sessions are deleted tooUUID: Better for session IDs than integers (harder to guess)VARCHAR(45): Accommodates IPv6 addresses (39 chars max)- Index on
expires_at: For efficiently deleting expired sessions
Example 3: Blog with Comments 📝
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
bio TEXT,
email VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE posts (
post_id SERIAL PRIMARY KEY,
author_id INTEGER NOT NULL REFERENCES authors(author_id),
title VARCHAR(300) NOT NULL,
content TEXT NOT NULL,
status VARCHAR(20) DEFAULT 'draft' CHECK (status IN ('draft', 'published', 'archived')),
published_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE comments (
comment_id SERIAL PRIMARY KEY,
post_id INTEGER NOT NULL REFERENCES posts(post_id) ON DELETE CASCADE,
author_name VARCHAR(100) NOT NULL,
author_email VARCHAR(255) NOT NULL,
content TEXT NOT NULL CHECK (LENGTH(content) >= 10),
is_approved BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_posts_author ON posts(author_id);
CREATE INDEX idx_posts_status ON posts(status);
CREATE INDEX idx_comments_post ON comments(post_id);
CREATE INDEX idx_comments_approved ON comments(is_approved);
Explanation:
CHECK (status IN (...)): Limits status to specific values (like an enum)CHECK (LENGTH(content) >= 10): Comments must have at least 10 charactersON DELETE CASCADE: Deleting a post deletes all its comments- Multiple indexes for common query patterns (posts by author, published posts, comments per post)
Example 4: Modifying an Existing Table 🔄
Let's say we need to add tags to our blog posts:
-- First, create a tags table
CREATE TABLE tags (
tag_id SERIAL PRIMARY KEY,
tag_name VARCHAR(50) NOT NULL UNIQUE
);
-- Create a junction table (many-to-many relationship)
CREATE TABLE post_tags (
post_id INTEGER REFERENCES posts(post_id) ON DELETE CASCADE,
tag_id INTEGER REFERENCES tags(tag_id) ON DELETE CASCADE,
PRIMARY KEY (post_id, tag_id)
);
-- Add a view counter to posts
ALTER TABLE posts
ADD COLUMN view_count INTEGER DEFAULT 0 CHECK (view_count >= 0);
-- Add an index on the new column
CREATE INDEX idx_posts_views ON posts(view_count);
-- Add a slug for SEO-friendly URLs
ALTER TABLE posts
ADD COLUMN slug VARCHAR(350) UNIQUE;
Explanation:
- Junction table (
post_tags): Implements many-to-many relationship (a post can have many tags, a tag can be on many posts) - Composite primary key:
(post_id, tag_id)ensures each tag-post pair is unique ALTER TABLE ADD COLUMN: Adds new columns to existing table- All existing rows get
DEFAULT 0forview_countautomatically
Common Mistakes to Avoid ⚠️
1. Wrong Data Types for the Job
❌ Bad: Using VARCHAR(255) for everything
CREATE TABLE orders (
quantity VARCHAR(255), -- Should be INTEGER
price VARCHAR(255), -- Should be DECIMAL
order_date VARCHAR(255) -- Should be DATE
);
✅ Good: Use appropriate types
CREATE TABLE orders (
quantity INTEGER,
price DECIMAL(10,2),
order_date DATE
);
2. Forgetting Constraints
❌ Bad: No constraints, anything goes
CREATE TABLE users (
user_id INTEGER,
email VARCHAR(255),
age INTEGER
);
-- Can insert duplicate emails, negative ages, NULL user_ids!
✅ Good: Enforce data integrity
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
age INTEGER CHECK (age >= 0 AND age <= 150)
);
3. No Indexes on Foreign Keys
❌ Bad: Foreign key without index
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id)
-- No index on user_id!
);
Querying orders by user will be slow on large tables.
✅ Good: Index foreign keys
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id)
);
CREATE INDEX idx_orders_user ON orders(user_id);
4. Over-Indexing
❌ Bad: Index every column
CREATE INDEX idx1 ON products(product_id); -- Already PRIMARY KEY!
CREATE INDEX idx2 ON products(product_name);
CREATE INDEX idx3 ON products(description);
CREATE INDEX idx4 ON products(price);
CREATE INDEX idx5 ON products(stock);
-- Inserts will be very slow!
✅ Good: Index only commonly queried columns
-- Only index columns used in WHERE, JOIN, ORDER BY
CREATE INDEX idx_products_category ON products(category_id);
5. Using Reserved Words as Column Names
❌ Bad: Reserved keywords
CREATE TABLE items (
order INTEGER, -- 'order' is a reserved word!
select VARCHAR(50),
from DATE
);
✅ Good: Use descriptive, non-reserved names
CREATE TABLE items (
order_number INTEGER,
selection VARCHAR(50),
start_date DATE
);
Key Takeaways 🎯
✅ Choose data types carefully: Use INTEGER for whole numbers, DECIMAL for money, DATE/TIMESTAMP for dates, VARCHAR for variable-length text
✅ Always use PRIMARY KEY: Every table needs a unique identifier
✅ Enforce integrity with constraints: NOT NULL for required fields, UNIQUE for unique values, FOREIGN KEY for relationships, CHECK for validation
✅ Set meaningful DEFAULT values: Use CURRENT_TIMESTAMP for timestamps, sensible defaults for status columns
✅ Index strategically: Index columns used in WHERE, JOIN, and ORDER BY, but don't over-index
✅ Document with names: Clear column names and constraint names make databases self-documenting
✅ Use ALTER TABLE carefully: Always back up before modifying production tables
✅ Think about relationships: Use FOREIGN KEY with appropriate ON DELETE/UPDATE actions
🧠 Mnemonic Device: The CUPID Rules of Good Table Design
- Constraints: Use NOT NULL, UNIQUE, CHECK to enforce rules
- Unique identifiers: Every table needs a PRIMARY KEY
- Proper types: Match data types to data (DECIMAL for money, DATE for dates)
- Indexes: Add them for performance, but not everywhere
- Documentation: Clear names and comments explain intent
🤔 Did You Know?
The SERIAL type in PostgreSQL isn't actually a real data type—it's shorthand for creating an INTEGER column with a sequence that auto-generates values. Under the hood, PostgreSQL creates a sequence object and sets the column's default to nextval('sequence_name'). MySQL's AUTO_INCREMENT works similarly but is implemented differently!
📋 Quick Reference Card: Table Design Cheat Sheet
╔══════════════════════════════════════════════════════════╗
║ SQL TABLE DESIGN ESSENTIALS ║
╠══════════════════════════════════════════════════════════╣
║ CREATE TABLE ║
║ ──────────────────────────────────────────────────────── ║
║ CREATE TABLE name ( ║
║ col1 TYPE CONSTRAINTS, ║
║ col2 TYPE CONSTRAINTS ║
║ ); ║
║ ║
║ DATA TYPES ║
║ ──────────────────────────────────────────────────────── ║
║ INTEGER, BIGINT Whole numbers ║
║ DECIMAL(10,2) Money, precise values ║
║ VARCHAR(n), TEXT Strings ║
║ DATE, TIMESTAMP Dates and times ║
║ BOOLEAN TRUE/FALSE ║
║ ║
║ CONSTRAINTS ║
║ ──────────────────────────────────────────────────────── ║
║ PRIMARY KEY Unique identifier ║
║ FOREIGN KEY Reference to other table ║
║ NOT NULL Required field ║
║ UNIQUE No duplicates ║
║ DEFAULT value Default if not specified ║
║ CHECK (condition) Validation rule ║
║ ║
║ ALTER TABLE ║
║ ──────────────────────────────────────────────────────── ║
║ ADD COLUMN name TYPE; Add column ║
║ DROP COLUMN name; Remove column ║
║ ADD CONSTRAINT name ...; Add constraint ║
║ ║
║ INDEXES ║
║ ──────────────────────────────────────────────────────── ║
║ CREATE INDEX idx_name Speed up queries ║
║ ON table(column); ║
║ ║
║ INDEX ON: WHERE, JOIN, ORDER BY columns ║
║ DON'T INDEX: Small tables, low cardinality columns ║
╚══════════════════════════════════════════════════════════╝
📚 Further Study
- PostgreSQL Data Types: https://www.postgresql.org/docs/current/datatype.html — Comprehensive guide to all PostgreSQL data types
- SQL Constraints Tutorial: https://www.sqlitetutorial.net/sqlite-constraints/ — Detailed examples of all constraint types
- Database Indexing Best Practices: https://use-the-index-luke.com/ — Deep dive into how indexes work and when to use them
You now understand how to design tables from scratch, choose appropriate data types, enforce data integrity with constraints, and optimize performance with indexes. In the next lessons, you'll use these skills to build complex database schemas and learn advanced SQL techniques. The foundation you've built here will make everything else easier! 🚀