ANN Algorithms
Deep dive into Approximate Nearest Neighbor search methods and their trade-offs between speed and accuracy.
ANN Algorithms for Vector Search
Master Approximate Nearest Neighbor (ANN) algorithms with free flashcards and spaced repetition practice. This lesson covers graph-based methods like HNSW, tree-based structures like Annoy, locality-sensitive hashing, and quantization techniques: essential concepts for building scalable AI search systems and modern RAG applications.
Welcome 🎯
When you're searching through millions or billions of high-dimensional vectors in a production AI system, exact nearest neighbor search becomes impossibly slow. This is where Approximate Nearest Neighbor (ANN) algorithms shine. These clever techniques trade a small amount of accuracy for massive speed improvements, often delivering 100x to 1000x faster searches while maintaining 95%+ recall.
In this lesson, you'll discover the core algorithms powering modern vector databases: the hierarchical graph navigation of HNSW, the binary tree partitioning of Annoy, the hash-bucketing magic of LSH, and the compression brilliance of Product Quantization. By understanding these algorithms, you'll know how to choose the right approach for your specific use case, tune parameters for optimal performance, and debug issues when your vector search isn't performing as expected.
💡 Why This Matters: Every major vector database, from Pinecone to Weaviate to Qdrant, implements one or more of these algorithms under the hood. Understanding them transforms you from a database user into someone who can architect intelligent, high-performance search systems.
Core Concepts: The ANN Algorithm Landscape 🗺️
The Fundamental Trade-off
All ANN algorithms balance three competing factors:
| Factor | Description | Trade-off |
|---|---|---|
| Speed | Query latency (ms) | Faster → less accurate |
| Recall | % of true neighbors found | Higher → slower, more memory |
| Memory | Index size overhead | Lower → less accurate |
Recall is the key metric: if you want the top 10 nearest neighbors, and your ANN algorithm returns 9 that would be in the true top-10, that's 90% recall. Production systems typically target 95-99% recall.
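For example, recall@10 for a single query is just a set intersection. A minimal sketch (the two result lists below are made-up placeholder IDs):

true_top10 = [4, 17, 23, 42, 56, 61, 78, 80, 91, 99]   # exact nearest neighbors
ann_top10 = [4, 17, 23, 42, 56, 61, 78, 80, 91, 123]   # ANN result with one miss

recall_at_10 = len(set(true_top10) & set(ann_top10)) / 10
print(recall_at_10)  # 0.9 -> 90% recall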
Algorithm Categories
ANN algorithms fall into four main families:
ANN ALGORITHM TAXONOMY

- Graph-based: HNSW (Hierarchical Navigable Small World), NSG (Navigating Spreading-out Graph), DiskANN
- Tree-based: Annoy (Approximate Nearest Neighbors), KD-Tree variants, Ball Tree
- Hash-based: LSH (Locality-Sensitive Hashing), Multi-Probe LSH, Cross-Polytope LSH
- Quantization: Product Quantization (PQ), Scalar Quantization (SQ), Binary Quantization
1. HNSW: The Graph Navigation Champion 🕸️
Hierarchical Navigable Small World (HNSW) is currently the most popular ANN algorithm, thanks to its exceptional balance of speed, accuracy, and ease of use.
How HNSW Works
HNSW builds a multi-layer graph where:
- Each vector is a node
- Edges connect similar vectors
- Upper layers are sparse (skip lists for fast navigation)
- Bottom layer is dense (contains all vectors)
HNSW STRUCTURE (3 layers)

Layer 2 (sparse):
(A)-------------------------(F)
 |                           |
Layer 1 (medium):
(A)---(B)-------(D)---(E)---(F)
 |     |         |     |     |
Layer 0 (all vectors):
(A)-(B)-(C)-(D)-(E)-(F)-(G)-(H)
 |   |   |   |   |   |   |   |
(I)-(J)-(K)-(L)-(M)-(N)-(O)-(P)
Search Process:
- Start at the entry point in the top layer
- Navigate to the nearest neighbor using greedy search
- Drop down one layer
- Repeat until reaching the bottom layer
- Perform a local search to find k nearest neighbors
💡 Why It's Fast: By starting at sparse upper layers, HNSW quickly zooms in on the right region, then refines at lower layers. It's like using a map's zoom levels: start zoomed out, then drill down.
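Here is a minimal sketch of that layered greedy descent in plain Python. The graph layout (layers as adjacency dicts, an entry_point node, and a vectors array) is assumed for illustration only; real HNSW implementations such as FAISS or hnswlib are far more optimized:

import heapq
import numpy as np

def hnsw_search(query, layers, entry_point, vectors, ef=50, k=10):
    """layers[0] is the dense bottom layer; layers[-1] is the sparse top layer."""
    dist = lambda node: float(np.linalg.norm(vectors[node] - query))

    # Greedy descent: on each upper layer, move to any closer neighbor, then drop down
    current = entry_point
    for layer in reversed(layers[1:]):
        improved = True
        while improved:
            improved = False
            for neighbor in layer.get(current, []):
                if dist(neighbor) < dist(current):
                    current, improved = neighbor, True

    # Best-first search on the bottom layer, keeping ef candidates
    visited = {current}
    candidates = [(dist(current), current)]   # min-heap of (distance, node)
    best = [(-dist(current), current)]        # max-heap (negated) of the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                             # nothing left can improve the result set
        for neighbor in layers[0].get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            nd = dist(neighbor)
            if len(best) < ef or nd < -best[0][0]:
                heapq.heappush(candidates, (nd, neighbor))
                heapq.heappush(best, (-nd, neighbor))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, node) for d, node in best)[:k]   # (distance, node) pairs

Note how ef caps the candidate set on the bottom layer; this is exactly the ef_search knob discussed below.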
HNSW Parameters
| Parameter | Description | Impact |
|---|---|---|
| M | Max edges per node | Higher = better recall, more memory |
| ef_construction | Candidates during build | Higher = better quality, slower build |
| ef_search | Candidates during query | Higher = better recall, slower search |
Typical Production Values:
M = 16 # Good default balance
ef_construction = 200 # Build once, search many times
ef_search = 50 # Tune based on latency requirements
⚠️ Common Mistake: Setting ef_search < k (fewer candidates than results requested). Always ensure ef_search >= k.
2. Annoy: Binary Tree Partitioning 🌳
Annoy (Approximate Nearest Neighbors Oh Yeah) was developed by Spotify for music recommendations. It builds multiple random projection trees to partition the vector space.
How Annoy Works
Tree Construction:
- Pick two random points from your dataset
- Draw a hyperplane halfway between them
- Split all points based on which side they fall on
- Recursively partition each side
- Build multiple trees (typically 10-100) with different random splits
ANNOY RANDOM PROJECTION TREE

                Root
           (Hyperplane 1)
            /         \
   (H-plane 2)       (H-plane 3)
     /     \           /     \
  [A,B]   [C,D]     [E,F]   [G,H]

Leaves contain actual vectors
Search Process:
- Traverse each tree from root to leaf
- Collect all candidates from the reached leaves
- Compute exact distances to all candidates
- Return the k closest
💡 Why Multiple Trees: A single tree might unluckily separate similar vectors. Multiple trees with different random splits make it very likely you'll find neighbors in at least some of the trees.
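To make the construction steps concrete, here is a minimal NumPy sketch of a single random hyperplane split; it illustrates the idea rather than Annoy's actual C++ implementation:

import numpy as np

def random_split(points):
    """Split points by a hyperplane equidistant from two randomly chosen points."""
    i, j = np.random.choice(len(points), size=2, replace=False)
    a, b = points[i], points[j]
    normal = a - b                          # hyperplane normal vector
    midpoint = (a + b) / 2                  # the hyperplane passes through the midpoint
    side = (points - midpoint) @ normal     # sign tells which side of the plane each point is on
    return points[side >= 0], points[side < 0]

points = np.random.randn(1000, 128).astype("float32")
left, right = random_split(points)
print(len(left), len(right))                # roughly balanced split

Annoy applies this split recursively until each leaf holds only a few items, and builds n_trees such trees with different random choices.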
Annoy Parameters
from annoy import AnnoyIndex
import numpy as np

f = 128  # Vector dimensionality
vectors = np.random.randn(1000, f).astype('float32')     # example data
query_vector = np.random.randn(f).astype('float32')

t = AnnoyIndex(f, 'angular')  # or 'euclidean'

# Add vectors
for i, vector in enumerate(vectors):
    t.add_item(i, vector)

# Build with n_trees
t.build(10)  # n_trees=10: more trees = better recall, more memory
t.save('index.ann')

# Search for the 10 nearest neighbors, inspecting search_k candidates
neighbors = t.get_nns_by_vector(query_vector, 10, search_k=100)
| Parameter | Description | Typical Range |
|---|---|---|
| n_trees | Number of trees to build | 10-100 (more for higher recall) |
| search_k | Nodes to inspect during search | n_trees * k to n_trees * k * 10 |
Advantages:
- 📂 Memory-mapped: Can search without loading entire index into RAM
- 📦 Compact: Index size is close to raw vector data
- 🎯 Predictable: Performance scales smoothly with parameters
Disadvantages:
- 🔒 Static: Must rebuild entire index to add vectors
- 📉 Lower recall: Typically doesn't match HNSW at the same speed
3. Locality-Sensitive Hashing (LSH) #️⃣
LSH uses special hash functions that map similar vectors to the same hash buckets with high probability.
The LSH Magic Trick
Normal hash functions (like MD5) aim to spread similar inputs far apart. LSH does the opposite:
REGULAR HASH vs LSH

Regular Hash (MD5):
  Input: "hello" → Hash: a3f8c9...
  Input: "hallo" → Hash: 8d2e1f...  (totally different!)

LSH for vectors:
  Vector A: [0.9, 0.1, 0.2]    → Bucket: 10110
  Vector B: [0.85, 0.15, 0.25] → Bucket: 10110  (same!)
  Vector C: [0.1, 0.9, 0.8]    → Bucket: 01001  (different)
Random Hyperplane LSH
The most common LSH for cosine similarity uses random hyperplanes:
import numpy as np
class LSHIndex:
def __init__(self, dim, n_bits=16, n_tables=10):
self.dim = dim
self.n_bits = n_bits
self.n_tables = n_tables
# Create random hyperplanes for each table
self.hyperplanes = [
np.random.randn(n_bits, dim)
for _ in range(n_tables)
]
self.tables = [{} for _ in range(n_tables)]
def hash_vector(self, vector, table_idx):
# Project onto hyperplanes, check which side
projections = np.dot(self.hyperplanes[table_idx], vector)
# Convert to binary hash
bits = (projections > 0).astype(int)
return tuple(bits)
def insert(self, vector, item_id):
for i in range(self.n_tables):
hash_val = self.hash_vector(vector, i)
if hash_val not in self.tables[i]:
self.tables[i][hash_val] = []
self.tables[i][hash_val].append(item_id)
def query(self, vector, k=10):
candidates = set()
for i in range(self.n_tables):
hash_val = self.hash_vector(vector, i)
if hash_val in self.tables[i]:
candidates.update(self.tables[i][hash_val])
        # Rank candidates by actual distance in a full implementation
        # (see the SimpleLSH class in Example 3); here they are returned unranked
        return list(candidates)[:k]  # Simplified
LSH Parameters
| Parameter | Effect |
|---|---|
| n_bits | Hash length (more bits = finer buckets) |
| n_tables | Independent hash tables (more = better recall) |
💡 The LSH Insight: Similar vectors are very likely to land in the same bucket in at least one table. More tables = more chances to find neighbors.
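You can put numbers on this intuition. For random hyperplane LSH, two vectors separated by angle θ agree on any single hash bit with probability 1 - θ/π, so a quick back-of-the-envelope calculation shows why extra tables matter (the 30° angle is just an example value):

import numpy as np

theta = np.deg2rad(30)                    # angle between two fairly similar vectors
n_bits, n_tables = 16, 10

p_bit = 1 - theta / np.pi                 # P(the two vectors agree on one hash bit)
p_bucket = p_bit ** n_bits                # P(identical 16-bit hash in one table)
p_any = 1 - (1 - p_bucket) ** n_tables    # P(they collide in at least one table)

print(f"P(same bit) = {p_bit:.3f}")               # ~0.83
print(f"P(same bucket, 1 table) = {p_bucket:.3f}") # ~0.05
print(f"P(collide in >=1 of {n_tables} tables) = {p_any:.3f}")  # ~0.43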
Advantages:
- ⚡ Sub-linear search: O(n^ρ) where ρ < 1
- 💾 Memory efficient: Just store hash buckets
- ✅ Supports updates: Easy to add/remove vectors
Disadvantages:
- 🎯 Lower recall: Hard to tune for high recall
- 📉 Dimension dependent: Performance degrades in very high dimensions
4. Product Quantization (PQ) 🗜️
Product Quantization compresses vectors by breaking them into chunks and replacing each chunk with the nearest centroid from a learned codebook. Think of it as "lossy compression for vectors."
How PQ Works
Step 1: Split Vectors into Subvectors
## Original 128-dim vector
vector = [0.1, 0.5, 0.3, ..., 0.8] # 128 floats
## Split into 8 subvectors of 16 dimensions each
subvectors = [
vector[0:16], # Subvector 1
vector[16:32], # Subvector 2
vector[32:48], # Subvector 3
# ... 8 total
]
Step 2: Learn Codebooks
For each subvector position, cluster all training subvectors into k centroids (typically k = 256, so each centroid ID fits in one byte):
CODEBOOK for Subvector 1 (256 centroids)
  ID 0:   [0.12, 0.45, 0.32, ...]
  ID 1:   [0.89, 0.01, 0.56, ...]
  ID 2:   [0.34, 0.67, 0.11, ...]
  ...
  ID 255: [0.55, 0.23, 0.78, ...]
Step 3: Encode Vectors
Replace each subvector with its nearest centroid ID:
# Original: 128 floats × 4 bytes = 512 bytes
vector = [0.1, 0.5, 0.3, ..., 0.8]
# Encoded: 8 bytes (one ID per subvector)
encoded = [42, 17, 201, 88, 156, 3, 99, 211]
# Each ID is 0-255 (1 byte)
Compression Ratio: 512 bytes → 8 bytes = 64x compression!
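Putting the three steps together, here is a minimal NumPy/scikit-learn sketch of learning codebooks and encoding vectors. It mirrors the parameters above (8 subvectors, 256 centroids) but is only an illustration, not FAISS's implementation:

import numpy as np
from sklearn.cluster import KMeans

dim, m, n_centroids = 128, 8, 256
sub_dim = dim // m
train = np.random.randn(5_000, dim).astype("float32")

# Step 2: learn one codebook (256 centroids) per subvector position
codebooks = []
for i in range(m):
    sub = train[:, i * sub_dim:(i + 1) * sub_dim]
    km = KMeans(n_clusters=n_centroids, n_init=4).fit(sub)
    codebooks.append(km.cluster_centers_)        # shape: (256, 16)

# Step 3: encode each vector as m one-byte centroid IDs
def pq_encode(vectors, codebooks):
    codes = np.empty((len(vectors), m), dtype=np.uint8)
    for i, centroids in enumerate(codebooks):
        sub = vectors[:, i * sub_dim:(i + 1) * sub_dim]
        dists = np.linalg.norm(sub[:, None, :] - centroids[None, :, :], axis=2)
        codes[:, i] = dists.argmin(axis=1)       # nearest centroid for every subvector
    return codes

codes = pq_encode(train, codebooks)
print(codes.shape, codes.dtype)                  # (5000, 8) uint8 -> 8 bytes per vector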
PQ Search
To search without decompressing:
import numpy as np

def pq_search(query_vector, encoded_db, codebooks, k=10):
    """codebooks: list of m arrays, each of shape (256, sub_dim)."""
    m = len(codebooks)
    sub_dim = codebooks[0].shape[1]
    # 1. Split query into m subvectors
    query_subs = query_vector.reshape(m, sub_dim)
    # 2. Precompute distances from each query subvector to all 256 centroids
    distance_tables = [
        np.linalg.norm(codebooks[i] - query_subs[i], axis=1) ** 2
        for i in range(m)
    ]
    # 3. Score each database vector using only table lookups
    scores = []
    for encoded_vec in encoded_db:
        score = 0.0
        for i, centroid_id in enumerate(encoded_vec):
            # Just look up the precomputed distance!
            score += distance_tables[i][centroid_id]
        scores.append(score)
    # 4. Return the indices of the k smallest approximate distances
    return np.argsort(scores)[:k]
💡 The PQ Speedup: We compute distances to only 8 × 256 = 2048 centroids once per query, then score every database vector with table lookups and additions; no full distance computation per vector.
PQ Parameters
| Parameter | Typical Value | Trade-off |
|---|---|---|
| m (subvectors) | 8, 16, 32 | More = better accuracy, larger tables |
| k (centroids) | 256 (fits in 1 byte) | More = better accuracy, more memory |
PQ + IVFPQ: Often combined with inverted file index (IVF) for two-stage search:
- Coarse search: Find relevant clusters using IVF
- Fine search: Use PQ within selected clusters
import faiss

# Create IVF with PQ
dim = 768            # vector dimension (example value)
n_clusters = 1024    # IVF clusters
m = 8                # PQ subvectors
n_bits = 8           # 2^8 = 256 centroids per subvector

quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, n_clusters, m, n_bits)
5. Scalar Quantization (SQ) 📏
A simpler compression approach: convert float32 to int8.
import numpy as np

# Original float32 (4 bytes per dimension)
vector = np.array([0.123, 0.876, -0.432, 0.591], dtype=np.float32)
# Memory: 4 dimensions × 4 bytes = 16 bytes

# Find min/max
vmin, vmax = vector.min(), vector.max()

# Quantize to int8 (1 byte per dimension)
quantized = np.round(
    (vector - vmin) / (vmax - vmin) * 255
).astype(np.uint8)
# Memory: 4 dimensions × 1 byte = 4 bytes
# Result: [108, 255, 0, 199]
# Compression: 4x, but some precision lost
Advantages of SQ:
- 🎯 Better recall than PQ (less information loss)
- 💻 SIMD-friendly: Fast with modern CPUs
- 🔧 Simple: "training" is just finding the min/max range per dimension
Disadvantages:
- 📦 Lower compression: Only 4x vs PQ's 64x
- 📉 Dimension-tied: compressed size still grows linearly with the original dimension, unlike PQ, where you choose the code size
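To see the "some precision lost" part concretely, here is a short sketch that dequantizes the codes from the snippet above and measures the reconstruction error:

import numpy as np

vector = np.array([0.123, 0.876, -0.432, 0.591], dtype=np.float32)
vmin, vmax = vector.min(), vector.max()
quantized = np.round((vector - vmin) / (vmax - vmin) * 255).astype(np.uint8)

# Dequantize: map the integer codes back into the original [vmin, vmax] range
reconstructed = quantized.astype(np.float32) / 255 * (vmax - vmin) + vmin

print(reconstructed)                          # close to, but not exactly, the original values
print(np.abs(vector - reconstructed).max())   # worst-case error is about (vmax - vmin) / 255 / 2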
Examples: Choosing and Implementing ANN Algorithms 🔧
Example 1: Building an HNSW Index with FAISS
Scenario: You have 1 million product embeddings (384 dimensions) for semantic search.
import faiss
import numpy as np
## Generate sample data
n_vectors = 1_000_000
dim = 384
vectors = np.random.randn(n_vectors, dim).astype('float32')
## Normalize for cosine similarity
faiss.normalize_L2(vectors)
## Create HNSW index
M = 32 # Number of connections per layer
ef_construction = 200 # Quality during construction
index = faiss.IndexHNSWFlat(dim, M)
index.hnsw.efConstruction = ef_construction
## Build the index
print("Building HNSW index...")
index.add(vectors)
print(f"Index built with {index.ntotal} vectors")
## Search with different ef_search values
query = np.random.randn(1, dim).astype('float32')
faiss.normalize_L2(query)
for ef in [16, 32, 64, 128]:
index.hnsw.efSearch = ef
distances, indices = index.search(query, k=10)
print(f"ef_search={ef}: Found {len(indices[0])} neighbors")
Expected Output:
Building HNSW index...
Index built with 1000000 vectors
ef_search=16: Found 10 neighbors
ef_search=32: Found 10 neighbors # Better recall
ef_search=64: Found 10 neighbors # Even better
ef_search=128: Found 10 neighbors # Highest quality
💡 Tuning Insight: Monitor latency vs recall. Start with ef_search = 50, increase if recall is too low, decrease if latency is too high.
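A sketch of that tuning loop, measuring recall@10 and per-query latency against an exact brute-force baseline. It reuses dim, vectors, and index from the example above; the query count and ef values are arbitrary choices:

import time
import numpy as np
import faiss

# Ground truth from an exact (brute-force) index on a sample of queries
queries = np.random.randn(100, dim).astype('float32')
faiss.normalize_L2(queries)
exact = faiss.IndexFlatL2(dim)
exact.add(vectors)
_, true_ids = exact.search(queries, 10)

for ef in [16, 32, 64, 128]:
    index.hnsw.efSearch = ef
    start = time.perf_counter()
    _, ann_ids = index.search(queries, 10)
    latency_ms = (time.perf_counter() - start) / len(queries) * 1000
    recall = np.mean([
        len(set(true_ids[i]) & set(ann_ids[i])) / 10
        for i in range(len(queries))
    ])
    print(f"ef_search={ef}: recall@10={recall:.1%}, {latency_ms:.2f} ms/query")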
Example 2: Memory-Efficient Search with IVF + PQ
Scenario: You have 10 million vectors but limited RAM. Compress with Product Quantization.
import faiss
import numpy as np
n_vectors = 10_000_000
dim = 768 # Large language model embeddings
vectors = np.random.randn(n_vectors, dim).astype('float32')
## Create IVF + PQ index
n_clusters = 4096 # Number of IVF clusters
m = 64 # Split into 64 subvectors (768/64 = 12 dims each)
bits = 8 # 256 centroids per subvector
quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, n_clusters, m, bits)
## Train on sample data (PQ needs training)
print("Training index...")
training_sample = vectors[:100_000] # Use 100k for training
index.train(training_sample)
## Add all vectors
print("Adding vectors...")
index.add(vectors)
print(f"\nMemory comparison:")
print(f"Original: {n_vectors * dim * 4 / 1e9:.2f} GB")
print(f"With PQ: {n_vectors * m / 1e9:.2f} GB")
print(f"Compression: {(dim * 4) / m:.1f}x")
## Search
index.nprobe = 32 # Search top 32 clusters
query = np.random.randn(1, dim).astype('float32')
distances, indices = index.search(query, k=10)
Output:
Training index...
Adding vectors...
Memory comparison:
Original: 30.72 GB
With PQ: 0.64 GB
Compression: 48.0x
💡 Production Tip: Train PQ on representative data. If your distribution changes, retrain the codebooks.
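One way to notice that drift, sketched below, is to track the index's reconstruction error on fresh vectors and compare it to the error on the training sample. This assumes a FAISS build that exposes the standalone codec API (sa_encode / sa_decode) for IVF+PQ indices; the 1.5x threshold is an arbitrary example:

import numpy as np

def avg_reconstruction_error(index, sample):
    """Mean L2 error between vectors and their PQ-reconstructed versions."""
    codes = index.sa_encode(sample)     # compress with the trained codebooks
    recon = index.sa_decode(codes)      # decompress back to float32
    return np.linalg.norm(sample - recon, axis=1).mean()

baseline = avg_reconstruction_error(index, training_sample[:10_000])

# Later, on freshly arriving vectors (placeholder data here):
new_vectors = np.random.randn(10_000, dim).astype('float32')
if avg_reconstruction_error(index, new_vectors) > 1.5 * baseline:
    print("Quantization error has grown noticeably -- consider retraining the codebooks")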
Example 3: Dynamic Updates with LSH
Scenario: Real-time vector search where vectors are constantly added/removed.
import numpy as np
from collections import defaultdict
class SimpleLSH:
def __init__(self, dim, n_bits=16, n_tables=20):
self.dim = dim
self.n_bits = n_bits
self.n_tables = n_tables
# Random hyperplanes for each table
self.planes = [
np.random.randn(n_bits, dim)
for _ in range(n_tables)
]
# Hash tables: {hash -> [vector_ids]}
self.tables = [
defaultdict(list) for _ in range(n_tables)
]
# Store actual vectors
self.vectors = {}
def _hash(self, vector, table_idx):
"""Compute binary hash for a vector"""
projections = self.planes[table_idx] @ vector
bits = tuple((projections > 0).astype(int))
return bits
def insert(self, vector_id, vector):
"""Add a vector to the index"""
self.vectors[vector_id] = vector
for i in range(self.n_tables):
hash_val = self._hash(vector, i)
self.tables[i][hash_val].append(vector_id)
def remove(self, vector_id):
"""Remove a vector from the index"""
if vector_id not in self.vectors:
return
vector = self.vectors[vector_id]
for i in range(self.n_tables):
hash_val = self._hash(vector, i)
self.tables[i][hash_val].remove(vector_id)
del self.vectors[vector_id]
def query(self, query_vector, k=10):
"""Find k nearest neighbors"""
# Collect candidates from all tables
candidates = set()
for i in range(self.n_tables):
hash_val = self._hash(query_vector, i)
candidates.update(self.tables[i][hash_val])
# Rank by actual distance
distances = [
(vid, np.linalg.norm(self.vectors[vid] - query_vector))
for vid in candidates
]
distances.sort(key=lambda x: x[1])
return [vid for vid, _ in distances[:k]]
## Usage
lsh = SimpleLSH(dim=128, n_bits=16, n_tables=20)
## Add vectors dynamically
for i in range(10000):
vec = np.random.randn(128).astype('float32')
lsh.insert(f"vec_{i}", vec)
## Search
query = np.random.randn(128).astype('float32')
results = lsh.query(query, k=10)
print(f"Found neighbors: {results}")
## Remove vector (instant!)
lsh.remove("vec_5000")
Key Advantage: Updates are O(n_tables) = O(1) with respect to database size.
Example 4: Hybrid Approach - HNSW + SQ
Scenario: Balance speed, accuracy, and memory.
import faiss
import numpy as np
n_vectors = 5_000_000
dim = 512
vectors = np.random.randn(n_vectors, dim).astype('float32')
## Create HNSW with Scalar Quantization
M = 32
index = faiss.IndexHNSWSQ(dim, faiss.ScalarQuantizer.QT_8bit, M)
index.hnsw.efConstruction = 200
## Train scalar quantizer (learns min/max per dimension)
print("Training scalar quantizer...")
index.train(vectors[:100_000])
## Add vectors
print("Adding vectors...")
index.add(vectors)
print(f"\nMemory savings:")
print(f"Float32: {n_vectors * dim * 4 / 1e9:.2f} GB")
print(f"Int8 (SQ): {n_vectors * dim * 1 / 1e9:.2f} GB")
print(f"Savings: 4x compression")
## Search with high quality
index.hnsw.efSearch = 100
query = np.random.randn(1, dim).astype('float32')
distances, indices = index.search(query, k=10)
print(f"\nTop-10 neighbors: {indices[0]}")
Why This Works:
- HNSW provides excellent graph navigation
- SQ reduces memory by 4x with minimal recall loss
- Still much faster than exhaustive search
Common Mistakes ⚠️
1. Not Normalizing Vectors for Cosine Similarity
❌ Wrong:
index = faiss.IndexHNSWFlat(dim, 32)
index.add(vectors) # Vectors not normalized
✅ Right:
import faiss
faiss.normalize_L2(vectors) # Normalize first!
index = faiss.IndexHNSWFlat(dim, 32)
index.add(vectors)
Why: HNSW in FAISS uses L2 distance by default. For cosine similarity, you must normalize vectors first; on unit-length vectors, L2 distance and cosine similarity produce the same ranking.
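The underlying identity: for unit-length vectors, ||a - b||² = 2 - 2·cos(a, b), so sorting by L2 distance is the same as sorting by cosine similarity. A quick numerical check:

import numpy as np

a, b = np.random.randn(384), np.random.randn(384)
a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)   # normalize to unit length

l2_squared = np.sum((a - b) ** 2)
cosine = np.dot(a, b)
print(np.isclose(l2_squared, 2 - 2 * cosine))          # True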
2. Setting ef_search < k in HNSW
❌ Wrong:
index.hnsw.efSearch = 10
results = index.search(query, k=50) # Asking for more than candidates!
✅ Right:
index.hnsw.efSearch = 100 # Always >= k
results = index.search(query, k=50)
3. Not Training PQ/IVF Indices
❌ Wrong:
index = faiss.IndexIVFPQ(quantizer, dim, 1024, 8, 8)
index.add(vectors) # Error: index not trained!
✅ Right:
index = faiss.IndexIVFPQ(quantizer, dim, 1024, 8, 8)
index.train(training_vectors) # Train first
index.add(vectors)
4. Using Too Few nprobe in IVF Search
❌ Wrong:
index.nprobe = 1 # Only search 1 cluster - terrible recall!
✅ Right:
index.nprobe = 32 # Search 32 clusters - much better
💡 Rule of Thumb: Start with nprobe = sqrt(n_clusters), then tune based on recall.
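A sketch of that tuning step for the IVF+PQ index from Example 2, sweeping nprobe and measuring recall against an exact baseline (for 10M vectors the brute-force baseline is slow, so only a small sample of queries is used):

import numpy as np
import faiss

queries = np.random.randn(100, dim).astype('float32')
exact = faiss.IndexFlatL2(dim)
exact.add(vectors)
_, true_ids = exact.search(queries, 10)

start = int(np.sqrt(index.nlist))          # rule-of-thumb starting point (sqrt(4096) = 64)
for nprobe in [1, start, 2 * start, 4 * start]:
    index.nprobe = nprobe
    _, ann_ids = index.search(queries, 10)
    recall = np.mean([
        len(set(true_ids[i]) & set(ann_ids[i])) / 10
        for i in range(len(queries))
    ])
    print(f"nprobe={nprobe}: recall@10={recall:.1%}")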
5. Training IVF/PQ on Too Few Vectors
❌ Wrong:
quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, 1024, 8, 8)
index.train(vectors[:1000])  # Fewer training points than clusters
✅ Right:
quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, 1024, 8, 8)
index.train(vectors[:100_000])  # Give k-means plenty of points per cluster
Note that index.train() fits both the coarse quantizer's centroids and the PQ codebooks in one call; there is no separate quantizer.train() step.
6. Not Benchmarking Recall
Always measure recall to ensure your approximation is acceptable:
def measure_recall(index, queries, ground_truth, k=10):
"""Measure recall@k against ground truth"""
_, predicted = index.search(queries, k)
recalls = []
for i in range(len(queries)):
true_neighbors = set(ground_truth[i][:k])
pred_neighbors = set(predicted[i])
recall = len(true_neighbors & pred_neighbors) / k
recalls.append(recall)
return np.mean(recalls)
## Get ground truth with exact search
exact_index = faiss.IndexFlatL2(dim)
exact_index.add(vectors)
_, ground_truth = exact_index.search(queries, k=10)
## Measure your ANN index
recall = measure_recall(ann_index, queries, ground_truth, k=10)
print(f"Recall@10: {recall:.2%}")
Key Takeaways 🎯
📊 ANN Algorithm Quick Reference
| Algorithm | Best For | Updates | Recall | Memory |
|---|---|---|---|---|
| HNSW | High recall, static datasets | ⚠️ Slow | ★★★★★ | Medium |
| Annoy | Memory-mapped, read-heavy | ❌ Rebuild | ★★★★ | Low |
| LSH | Dynamic datasets, streaming | ✅ Fast | ★★★ | Low |
| IVF+PQ | Huge datasets, limited RAM | ⚠️ Retrain | ★★★★ | Very Low |
| HNSW+SQ | Balance speed/memory/recall | ⚠️ Slow | ★★★★★ | Low |
🧭 Decision Framework
Choose HNSW when:
- Recall is critical (95%+ required)
- Dataset fits in memory
- Updates are infrequent
- You need predictable performance
Choose Annoy when:
- Dataset exceeds RAM (use memory mapping)
- You can rebuild index offline
- You need compact disk storage
- Spotify-level recall (90-95%) is acceptable
Choose LSH when:
- Vectors arrive in real-time
- You need immediate updates/deletes
- Lower recall (85-90%) is acceptable
- Memory is very limited
Choose IVF+PQ when:
- Dataset is massive (10M+ vectors)
- RAM is constrained
- You can afford retraining periodically
- You need 50-100x compression
Parameter Tuning Priorities:
- Start simple: Default parameters often work well
- Measure baseline: Get recall/latency metrics
- Tune for recall: Increase ef_search, nprobe, search_k
- Optimize latency: Decrease parameters once recall is met
- Monitor memory: Add quantization if needed
📚 Further Study
- FAISS Documentation: https://github.com/facebookresearch/faiss/wiki - Comprehensive guide to Meta's vector search library
- HNSW Paper: https://arxiv.org/abs/1603.09320 - Original research on Hierarchical Navigable Small World graphs
- ANN Benchmarks: https://ann-benchmarks.com/ - Compare algorithm performance across datasets
🧠 Memory Device for Algorithm Selection:
- High recall → HNSW
- Annoy for Archived/static data
- LSH for Live updates
- PQ for Petabyte-scale compression
Ready to test your knowledge? Complete the practice questions below to reinforce your understanding of ANN algorithms! 🚀