
Sets & Dictionaries

Working with unique collections and key-value pairs

Sets and Dictionaries in Python

Master Python's most powerful data structures with free flashcards and spaced repetition practice. This lesson covers sets for unique collections, dictionaries for key-value storage, and practical operations on both: essential concepts for efficient Python programming and data manipulation.

Welcome to Sets & Dictionaries! 💻

After learning about lists and tuples, you're ready to explore two more fundamental data structures that solve different problems. Sets help you work with unique collections and perform mathematical operations, while dictionaries store data as key-value pairs for lightning-fast lookups. Together, they're indispensable tools in every Python programmer's toolkit.

Think of a set like a bag of unique items (no duplicates allowed) and a dictionary like a real dictionary where you look up words (keys) to find their definitions (values). These structures are optimized for different use cases and can dramatically improve your code's efficiency.

Core Concepts: Sets 🎯

What is a Set?

A set is an unordered collection of unique elements. Sets automatically eliminate duplicates and don't maintain any particular order. They're perfect when you need to:

  • Remove duplicates from a collection
  • Test membership quickly
  • Perform mathematical set operations (union, intersection, difference)

Creating Sets

There are several ways to create sets in Python:

## Using curly braces
fruits = {'apple', 'banana', 'orange'}

## Using the set() constructor
numbers = set([1, 2, 3, 4, 5])

## Creating an empty set (MUST use set(), not {})
empty_set = set()  # ✅ Correct
empty_dict = {}    # ❌ This creates an empty dictionary!

## Automatic duplicate removal
duplicates = {1, 2, 2, 3, 3, 3, 4}
print(duplicates)  # Output: {1, 2, 3, 4}

💡 Tip: You cannot create an empty set with {} because Python reserves that syntax for empty dictionaries. Always use set() for empty sets.

Set Operations

Sets support powerful mathematical operations:

Operation             Symbol/Method                  Description                        Example
Union                 | or .union()                  All elements from both sets        {1,2} | {2,3} → {1,2,3}
Intersection          & or .intersection()           Only common elements               {1,2} & {2,3} → {2}
Difference            - or .difference()             Elements in first but not second   {1,2} - {2,3} → {1}
Symmetric Difference  ^ or .symmetric_difference()   Elements in either but not both    {1,2} ^ {2,3} → {1,3}

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

## Union: all unique elements
print(set_a | set_b)  # {1, 2, 3, 4, 5, 6}

## Intersection: common elements
print(set_a & set_b)  # {3, 4}

## Difference: in A but not in B
print(set_a - set_b)  # {1, 2}

## Symmetric difference: in A or B but not both
print(set_a ^ set_b)  # {1, 2, 5, 6}
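
The table lists both a symbol and a method for each operation. One practical difference, sketched below with made-up values, is that the method form accepts any iterable, while the operator form expects sets on both sides.

```python
set_a = {1, 2, 3, 4}
list_b = [3, 4, 5, 6]  # a plain list, not a set (illustrative data)

# Method form: the argument can be any iterable
union_result = set_a.union(list_b)
common_result = set_a.intersection(list_b)

# Operator form: both operands must be sets, so this raises TypeError
try:
    set_a | list_b
    operator_worked = True
except TypeError:
    operator_worked = False

print(sorted(union_result))   # [1, 2, 3, 4, 5, 6]
print(sorted(common_result))  # [3, 4]
print(operator_worked)        # False
```

If you prefer the operators, convert the other collection first, for example set_a | set(list_b).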

Common Set Methods

my_set = {1, 2, 3}

## Adding elements
my_set.add(4)           # Add single element: {1, 2, 3, 4}
my_set.update([5, 6])   # Add multiple elements: {1, 2, 3, 4, 5, 6}

## Removing elements
my_set.remove(6)        # Remove element (raises KeyError if not found)
my_set.discard(10)      # Remove element (no error if not found)
popped = my_set.pop()   # Remove and return arbitrary element

## Checking membership
print(3 in my_set)      # True (very fast operation!)
print(10 in my_set)     # False

## Set size
print(len(my_set))      # Number of elements

## Clearing all elements
my_set.clear()          # Empty the set

🧠 Memory Aid: Think "ARUD" for set operations:

  • Add (add, update)
  • Remove (remove, discard, pop)
  • Union (combine sets)
  • Difference (subtract sets)

Core Concepts: Dictionaries 📚

What is a Dictionary?

A dictionary is a collection of key-value pairs. Each key maps to a specific value, allowing fast lookup by key. Dictionaries are like real-world dictionaries where you look up a word (key) to find its definition (value).

Key characteristics:

  • Keys must be unique and hashable (in practice immutable: strings, numbers, tuples of hashable items)
  • Values can be any type and can be duplicated
  • Extremely fast lookups (O(1) average case)
  • Insertion order is preserved (guaranteed since Python 3.7; an implementation detail in CPython 3.6)
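
Here is a small sketch of these characteristics in action. The inventory and grid names are made up for illustration, and the ordering shown assumes Python 3.7 or later.

```python
# Keys come back in the order they were first inserted (Python 3.7+)
inventory = {}
inventory['pears'] = 3
inventory['apples'] = 5
inventory['kiwis'] = 2
print(list(inventory))  # ['pears', 'apples', 'kiwis']

# Assigning to an existing key replaces its value but keeps its position
inventory['pears'] = 10
print(list(inventory.items()))  # [('pears', 10), ('apples', 5), ('kiwis', 2)]

# Values may repeat; keys may not
inventory['plums'] = 5
print(len(inventory))  # 4

# Tuples of hashable items work as keys
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])  # goal
```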

Creating Dictionaries

## Using curly braces with key-value pairs
student = {
    'name': 'Alice',
    'age': 20,
    'major': 'Computer Science'
}

## Using the dict() constructor
student2 = dict(name='Bob', age=21, major='Mathematics')

## Creating from sequences of pairs
pairs = [('a', 1), ('b', 2), ('c', 3)]
letter_dict = dict(pairs)

## Empty dictionary
empty_dict = {}  # or dict()

## Dictionary with various value types
mixed = {
    'string': 'hello',
    'number': 42,
    'list': [1, 2, 3],
    'nested_dict': {'inner': 'value'}
}

Accessing Dictionary Values

person = {'name': 'Charlie', 'age': 25, 'city': 'Boston'}

## Using square brackets (raises KeyError if key doesn't exist)
print(person['name'])      # 'Charlie'

## Using .get() method (returns None or default if key doesn't exist)
print(person.get('age'))           # 25
print(person.get('country'))       # None
print(person.get('country', 'USA'))  # 'USA' (default value)

## Checking if key exists
if 'city' in person:
    print(person['city'])

print('email' in person)   # False

💡 Tip: Use .get() when you're unsure if a key exists. Use square brackets [] when you expect the key to exist and want an error if it doesn't.
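
To see both access styles side by side, consider the sketch below. The describe function is a hypothetical helper that treats 'name' as required and 'city' as optional.

```python
person = {'name': 'Charlie', 'age': 25}

def describe(record):
    """Hypothetical helper: 'name' is required, 'city' is optional."""
    name = record['name']                  # raises KeyError if 'name' is missing
    city = record.get('city', 'unknown')   # falls back to a default instead
    return f"{name} ({city})"

print(describe(person))  # Charlie (unknown)

# A record without the required key fails loudly
try:
    describe({'age': 30})
except KeyError as err:
    missing = err.args[0]
    print(f"Missing required key: {missing}")  # Missing required key: name
```

Failing loudly on required keys tends to surface data problems early, while defaults keep optional fields from cluttering the code with checks.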

Modifying Dictionaries

scores = {'Alice': 85, 'Bob': 90}

## Adding or updating values
scores['Charlie'] = 88      # Add new key-value pair
scores['Alice'] = 87        # Update existing value

## Update multiple items at once
scores.update({'David': 92, 'Eve': 89})

## Removing items
del scores['Bob']           # Remove specific key
popped = scores.pop('Eve')  # Remove and return value
popped_item = scores.popitem()  # Remove and return the last-inserted (key, value) tuple (Python 3.7+)

## Clear all items
scores.clear()

Dictionary Methods

data = {'a': 1, 'b': 2, 'c': 3}

## Getting keys, values, and items
keys = data.keys()       # dict_keys(['a', 'b', 'c'])
values = data.values()   # dict_values([1, 2, 3])
items = data.items()     # dict_items([('a', 1), ('b', 2), ('c', 3)])

## Iterating through dictionaries
for key in data:
    print(key, data[key])

for key, value in data.items():
    print(f"{key}: {value}")

for value in data.values():
    print(value)

## Copying a dictionary (both forms make a shallow copy)
shallow_copy = data.copy()
shallow_copy2 = dict(data)  # Also shallow; use copy.deepcopy() for a deep copy

## Creating with default values
from collections import defaultdict
counts = defaultdict(int)  # New keys default to 0
counts['a'] += 1  # Works even though 'a' didn't exist
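
A note on copying: both data.copy() and dict(data) produce shallow copies, so nested lists or dictionaries are still shared with the original. The sketch below, using a made-up original dictionary, contrasts that with copy.deepcopy from the standard library.

```python
import copy

original = {'scores': [90, 85]}

shallow = original.copy()        # dict(original) behaves the same way
deep = copy.deepcopy(original)   # recursively copies nested objects

# Mutate the nested list through the original
original['scores'].append(70)

print(shallow['scores'])  # [90, 85, 70]  (shares the inner list)
print(deep['scores'])     # [90, 85]      (independent copy)
```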

Dictionary Operations Visualization

    KEY → VALUE Mapping
    ┌─────────────────────┐
    │  'name'  →  'Alice' │
    │  'age'   →    20    │
    │  'gpa'   →   3.8    │
    └─────────────────────┘
         ↓
    Hash Table (Fast O(1) lookup)
         ↓
    Memory addresses point directly
    to values

Detailed Examples 🔍

Example 1: Removing Duplicates with Sets

Problem: You have a list of email addresses from user registrations, and you need to find the unique emails and count how many duplicates were removed.

## Registration emails (with duplicates)
emails = [
    'alice@email.com',
    'bob@email.com',
    'alice@email.com',  # duplicate
    'charlie@email.com',
    'bob@email.com',    # duplicate
    'diana@email.com',
    'alice@email.com'   # duplicate
]

## Convert to set to get unique emails
unique_emails = set(emails)

print(f"Total registrations: {len(emails)}")
print(f"Unique users: {len(unique_emails)}")
print(f"Duplicates removed: {len(emails) - len(unique_emails)}")
print(f"\nUnique emails: {sorted(unique_emails)}")

Output:

Total registrations: 7
Unique users: 4
Duplicates removed: 3

Unique emails: ['alice@email.com', 'bob@email.com', 'charlie@email.com', 'diana@email.com']

Explanation: Sets automatically eliminate duplicates when you convert a list to a set. This is one of the most common uses of sets in data cleaning and processing tasks.
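
One caveat: converting to a set discards the original order. If the order of first appearance matters, a common alternative is dict.fromkeys, which relies on dictionaries preserving insertion order (Python 3.7+). The sketch below uses a shortened, made-up email list.

```python
emails = [
    'bob@email.com',
    'alice@email.com',
    'bob@email.com',    # duplicate
    'diana@email.com',
    'alice@email.com',  # duplicate
]

# Each email becomes a key; repeated keys collapse, first position is kept
unique_in_order = list(dict.fromkeys(emails))

print(unique_in_order)
# ['bob@email.com', 'alice@email.com', 'diana@email.com']
```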

Example 2: Finding Common Interests with Set Intersection

Problem: Two users have lists of interests. Find what interests they have in common to suggest mutual friends or group activities.

## User interests
alice_interests = {'reading', 'hiking', 'photography', 'cooking', 'yoga'}
bob_interests = {'gaming', 'cooking', 'hiking', 'music', 'photography'}

## Find common interests
common = alice_interests & bob_interests
print(f"Common interests: {common}")

## Find what Alice likes that Bob doesn't
alice_unique = alice_interests - bob_interests
print(f"Alice's unique interests: {alice_unique}")

## Find what Bob likes that Alice doesn't
bob_unique = bob_interests - alice_interests
print(f"Bob's unique interests: {bob_unique}")

## Find all interests between both users
all_interests = alice_interests | bob_interests
print(f"All interests combined: {all_interests}")

## Calculate similarity percentage
similarity = len(common) / len(all_interests) * 100
print(f"\nInterest similarity: {similarity:.1f}%")

Output:

Common interests: {'photography', 'cooking', 'hiking'}
Alice's unique interests: {'reading', 'yoga'}
Bob's unique interests: {'music', 'gaming'}
All interests combined: {'reading', 'gaming', 'photography', 'hiking', 'cooking', 'music', 'yoga'}

Interest similarity: 42.9%

Explanation: Set operations make it trivial to find commonalities and differences. This pattern is useful in recommendation systems, social networks, and data analysis.
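
The similarity calculation can be wrapped in a small reusable function. This is a sketch only: interest_similarity is a hypothetical name, and returning 0.0 for two empty sets is an assumption made here to avoid dividing by zero.

```python
def interest_similarity(a, b):
    """Hypothetical helper: shared items as a percentage of all items."""
    combined = a | b
    if not combined:  # both sets empty: avoid division by zero
        return 0.0
    return len(a & b) / len(combined) * 100

alice = {'reading', 'hiking', 'photography', 'cooking', 'yoga'}
bob = {'gaming', 'cooking', 'hiking', 'music', 'photography'}

print(f"{interest_similarity(alice, bob):.1f}%")  # 42.9%
print(interest_similarity(set(), set()))          # 0.0
```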

Example 3: Word Frequency Counter with Dictionaries

Problem: Count how many times each word appears in a text passage.

text = """
Python is amazing. Python is powerful. 
Python is easy to learn. Learning Python is fun.
"""

## Clean and split text into words
words = text.lower().replace('.', '').replace('\n', ' ').split()

## Count word frequencies
word_count = {}
for word in words:
    if word in word_count:
        word_count[word] += 1
    else:
        word_count[word] = 1

## Alternative using .get()
word_count_alt = {}
for word in words:
    word_count_alt[word] = word_count_alt.get(word, 0) + 1

print("Word frequencies:")
for word, count in sorted(word_count.items()):
    print(f"  {word}: {count}")

## Find most common word
most_common = max(word_count.items(), key=lambda x: x[1])
print(f"\nMost common word: '{most_common[0]}' ({most_common[1]} times)")

Output:

Word frequencies:
  amazing: 1
  easy: 1
  fun: 1
  is: 4
  learn: 1
  learning: 1
  powerful: 1
  python: 4
  to: 1

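Most common word: 'python' (4 times)

Explanation: Dictionaries excel at counting and grouping data. The .get(word, 0) pattern is a common idiom for initializing counters. Note that 'python' and 'is' both appear 4 times; max() returns the first maximal item it encounters, and 'python' was inserted first. Python also provides collections.Counter for this specific use case.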
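
For comparison, here is a minimal sketch of the same count using collections.Counter from the standard library. The text is the same passage written on one line.

```python
from collections import Counter

text = ("Python is amazing. Python is powerful. "
        "Python is easy to learn. Learning Python is fun.")

# Same cleanup as before: lowercase, drop periods, split on whitespace
words = text.lower().replace('.', '').split()

word_count = Counter(words)

print(word_count['python'])       # 4
print(word_count['missing'])      # 0 (absent keys count as zero, no KeyError)
print(word_count.most_common(2))  # [('python', 4), ('is', 4)]
```

Counter is a dict subclass, so everything you know about dictionaries (iteration, .items(), membership tests) still applies.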

Example 4: Student Grade Management System

Problem: Create a system to store student grades, calculate averages, and manage enrollment.

## Dictionary with nested data structures
student_grades = {
    'Alice': {'math': 95, 'science': 88, 'english': 92},
    'Bob': {'math': 78, 'science': 85, 'english': 80},
    'Charlie': {'math': 90, 'science': 92, 'english': 88}
}

## Calculate average for each student
for student, grades in student_grades.items():
    average = sum(grades.values()) / len(grades)
    print(f"{student}'s average: {average:.1f}")

## Add a new grade for a student
student_grades['Alice']['history'] = 94

## Calculate class average for a subject
subject = 'math'
math_scores = [grades[subject] for grades in student_grades.values()]
class_avg = sum(math_scores) / len(math_scores)
print(f"\nClass average in {subject}: {class_avg:.1f}")

## Find students with an average of at least 85
high_performers = set()
for student, grades in student_grades.items():
    if sum(grades.values()) / len(grades) >= 85:
        high_performers.add(student)

print(f"High performers (≥85): {high_performers}")

## Create a set of all subjects taught
all_subjects = set()
for grades in student_grades.values():
    all_subjects.update(grades.keys())

print(f"All subjects: {sorted(all_subjects)}")

Output:

Alice's average: 91.7
Bob's average: 81.0
Charlie's average: 90.0

Class average in math: 87.7
High performers (≥85): {'Alice', 'Charlie'}
All subjects: ['english', 'history', 'math', 'science']

Explanation: This example demonstrates combining dictionaries and sets for real-world data management. Nested dictionaries allow complex data structures, while sets help find unique values across all data.

Common Mistakes ⚠️

Mistake 1: Using Mutable Types as Dictionary Keys

❌ Wrong:

## Lists are mutable and cannot be dictionary keys
my_dict = {[1, 2]: 'value'}  # TypeError!

✅ Correct:

## Use immutable types like tuples instead
my_dict = {(1, 2): 'value'}  # Works!
my_dict = {'string_key': 'value'}  # Works!
my_dict = {42: 'value'}  # Works!

Why: Dictionary keys must be hashable, which for built-in types means immutable. Lists, sets, and other dictionaries cannot be keys because their contents can change.
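
You can check hashability directly with the built-in hash() function. The is_hashable helper below is a hypothetical name used only for this sketch.

```python
def is_hashable(obj):
    """Hypothetical helper: True if obj can be a dict key or set element."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable('text'))              # True
print(is_hashable(42))                  # True
print(is_hashable((1, 2)))              # True
print(is_hashable([1, 2]))              # False
print(is_hashable({'a': 1}))            # False
print(is_hashable((1, [2, 3])))         # False: a tuple holding a list
print(is_hashable(frozenset([1, 2])))   # True
```

The tuple case is worth remembering: a tuple is only hashable when everything inside it is hashable too.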

Mistake 2: Confusing {} with an Empty Set

❌ Wrong:

empty_set = {}  # This creates an empty dictionary, not a set!
print(type(empty_set))  # <class 'dict'>

✅ Correct:

empty_set = set()  # This creates an empty set
print(type(empty_set))  # <class 'set'>

Why: Python uses curly braces for both sets and dictionaries, but {} alone means an empty dictionary for historical reasons: dictionary literals were in the language before set literals.

Mistake 3: Modifying Dictionary While Iterating

❌ Wrong:

scores = {'Alice': 85, 'Bob': 70, 'Charlie': 90}

## Causes RuntimeError: dictionary changed size during iteration
for student in scores:
    if scores[student] < 75:
        del scores[student]  # Don't modify while iterating!

✅ Correct:

scores = {'Alice': 85, 'Bob': 70, 'Charlie': 90}

## Create a list of keys first
to_remove = [student for student, score in scores.items() if score < 75]
for student in to_remove:
    del scores[student]

## Or create a new dictionary
scores = {student: score for student, score in scores.items() if score >= 75}

Why: Adding or removing keys while iterating over a dictionary raises RuntimeError. Iterate over a copy of the keys or create a new dictionary instead.
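
A compact variant of the first fix is to iterate over list(scores), which takes a snapshot of the keys before the loop starts. The sketch below also shows that updating the value of an existing key during iteration is fine, since the dictionary's size does not change.

```python
scores = {'Alice': 85, 'Bob': 70, 'Charlie': 90}

# list(scores) copies the keys up front, so deleting is safe
for student in list(scores):
    if scores[student] < 75:
        del scores[student]

print(scores)  # {'Alice': 85, 'Charlie': 90}

# Updating values of existing keys does not change the size
for student in scores:
    scores[student] += 5

print(scores)  # {'Alice': 90, 'Charlie': 95}
```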

Mistake 4: Assuming Dictionary/Set Order (in older Python)

❌ Wrong (Python < 3.7):

my_dict = {'z': 1, 'a': 2, 'b': 3}
print(list(my_dict.keys()))  # Order was unpredictable!

✅ Correct:

## If order matters, be explicit
my_dict = {'z': 1, 'a': 2, 'b': 3}
sorted_keys = sorted(my_dict.keys())  # ['a', 'b', 'z']

## Or use OrderedDict (though not needed in Python 3.7+)
from collections import OrderedDict
ordered = OrderedDict([('z', 1), ('a', 2), ('b', 3)])

Why: Python 3.7+ guarantees insertion order for dictionaries, but sets have no guaranteed order in any version. Sort explicitly when a specific order matters, especially for code that must also run on older Python versions.

Mistake 5: Using .remove() Instead of .discard() on Sets

❌ Wrong:

my_set = {1, 2, 3}
my_set.remove(5)  # KeyError: 5 not in set!

✅ Correct:

my_set = {1, 2, 3}
my_set.discard(5)  # No error, even though 5 isn't in the set

## Or check first
if 5 in my_set:
    my_set.remove(5)

Why: .remove() raises an error if the element doesn't exist. .discard() silently does nothing if the element is absent, which is often what you want.

Mistake 6: Forgetting that Set Elements Must be Immutable

❌ Wrong:

my_set = {[1, 2], [3, 4]}  # TypeError: unhashable type: 'list'

✅ Correct:

## Use tuples instead of lists
my_set = {(1, 2), (3, 4)}  # Works!

## Or use frozenset for sets within sets
my_set = {frozenset([1, 2]), frozenset([3, 4])}  # Works!

Why: Like dictionary keys, set elements must be hashable. Use tuples or frozenset for complex elements.
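
Because frozenset is hashable, it also works as a dictionary key. One place this helps is storing a value for an unordered pair, as in the sketch below. The place names 'A', 'B', 'C' and the distances are made-up illustration data.

```python
# Distance between two places, where the order of the pair does not matter
distances = {
    frozenset(['A', 'B']): 5,
    frozenset(['B', 'C']): 8,
}

# Lookup works regardless of the order the pair is written in
print(distances[frozenset(['B', 'A'])])  # 5
print(distances[frozenset(['C', 'B'])])  # 8

# A regular set cannot be used the same way
try:
    distances[{'A', 'B'}]
    lookup_worked = True
except TypeError:
    lookup_worked = False

print(lookup_worked)  # False
```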

Key Takeaways 🎯

Sets: The Unique Collection

✅ Use sets when you need:

  • Automatic duplicate removal
  • Fast membership testing (in operator)
  • Mathematical set operations (union, intersection, difference)
  • Unique elements without caring about order

Remember:

  • Elements must be immutable (hashable)
  • No guaranteed order (unlike dictionaries, sets do not preserve insertion order)
  • Can't access elements by index
  • Create empty sets with set(), not {}

Dictionaries: The Key-Value Store

✅ Use dictionaries when you need:

  • Fast lookups by key
  • Associations between data (key-value pairs)
  • Counting, grouping, or categorizing data
  • Configuration or settings storage

Remember:

  • Keys must be unique and hashable (immutable)
  • Values can be any type
  • Use .get() for safe access
  • Order is preserved in Python 3.7+
  • Can nest dictionaries for complex data

Performance Insights ⚑

Operation              List   Set    Dictionary
Check membership (in)  O(n)   O(1)   O(1)
Add element            O(1)   O(1)   O(1)
Remove element         O(n)   O(1)   O(1)
Access by index/key    O(1)   N/A    O(1)
Iterate all elements   O(n)   O(n)   O(n)

🔥 Pro Tip: When checking if an item exists in a collection thousands of times, sets and dictionaries are dramatically faster than lists due to their hash table implementation.
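
You can measure the difference yourself with the standard timeit module. The sketch below uses arbitrary sizes chosen for illustration; exact timings depend on your machine, so no expected numbers are shown.

```python
import timeit

items = list(range(100_000))
as_list = items
as_set = set(items)
target = 99_999  # last element: the worst case for a list scan

# Time 100 membership checks against each collection
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```

On most machines the list figure should be far larger, because each list check scans element by element while the set check hashes the target and jumps straight to its slot.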

📋 Quick Reference Card

Task                Set Code        Dictionary Code
Create empty        s = set()       d = {}
Create with values  s = {1, 2, 3}   d = {'a': 1, 'b': 2}
Add element         s.add(4)        d['c'] = 3
Remove element      s.discard(4)    d.pop('c')
Check membership    4 in s          'c' in d
Get size            len(s)          len(d)
Iterate             for x in s:     for k, v in d.items():
Clear all           s.clear()       d.clear()
Combine two         s1 | s2         d1.update(d2)
Copy                s.copy()        d.copy()

Try This! 🔧

Before moving on, challenge yourself with these mini-exercises:

  1. Set Challenge: Create two sets of programming languages you know and want to learn. Find which ones are in both sets (you know AND want to learn more about).

  2. Dictionary Challenge: Create a dictionary representing a shopping cart with items as keys and quantities as values. Write code to calculate the total number of items.

  3. Combined Challenge: Given a list of words, create a dictionary where keys are words and values are the number of vowels in each word. Then create a set of all unique vowel counts.

🎉 Congratulations! You now understand Python's sets and dictionaries, two powerful tools for organizing and manipulating data efficiently. Practice using them in your projects, and you'll quickly see how they simplify complex problems. Next, you'll learn about advanced data structure operations and comprehensions that make working with these structures even more elegant!