Collections Framework
Work with Java's built-in data structures
Java Collections Framework
Master Java's Collections Framework with free flashcards and spaced repetition practice. This comprehensive lesson covers the core interfaces (List, Set, Map, Queue), concrete implementations (ArrayList, LinkedList, HashSet, TreeSet, HashMap), and advanced topics like performance characteristics, choosing the right collection, and common iteration patternsβessential knowledge for Java certification exams and professional development.
Welcome to the Collections Framework π»
The Java Collections Framework is one of the most powerful and frequently used components of the Java Standard Library. Think of it as a sophisticated toolkit for organizing and managing groups of objects. Just as a carpenter wouldn't use a hammer for every job, Java developers shouldn't use the same collection type for every data structure problem.
π€ Did you know? The Collections Framework was introduced in Java 1.2 (1998) and replaced the older Vector and Hashtable classes. It was designed using the principles of the Strategy Pattern and Template Method Pattern, making it both flexible and extensible.
Core Concepts
The Collection Hierarchy
The Collections Framework is built on a hierarchy of interfaces with multiple concrete implementations. Understanding this hierarchy is crucial:
Collection
β
ββββββββββββββββββΌβββββββββββββββββ
β β β
List Set Queue
β β β
βββββ΄ββββ βββββ΄ββββ βββββ΄ββββ
β β β β β β
ArrayList LinkedList HashSet TreeSet PriorityQueue Deque
β β
Vector ArrayDeque
Separately, we have Map<K,V> (which doesn't extend Collection):
Map
β
ββββββββββββββββΌβββββββββββββββ
β β β
HashMap TreeMap LinkedHashMap
β
Hashtable
The Core Interfaces
1. Collection
Defines basic operations like:
add(E element)- Add an elementremove(Object o)- Remove an elementcontains(Object o)- Check if element existssize()- Get number of elementsisEmpty()- Check if collection is emptyclear()- Remove all elementsiterator()- Get an iterator for traversal
2. List
Key characteristics:
- β Maintains insertion order
- β Allows duplicate elements
- β Positional access via index (0-based)
- π Additional methods:
get(int index),set(int index, E element),add(int index, E element)
3. Set
Key characteristics:
- β No duplicates allowed
- β No guaranteed order (except TreeSet and LinkedHashSet)
- π― Models mathematical set abstraction
- π No additional methods beyond Collection
4. Queue
Key characteristics:
- π― Typically FIFO (First-In-First-Out)
- π Additional methods:
offer(E e),poll(),peek() - β‘ Used for task scheduling, breadth-first search
5. Map<K,V> - Object that maps keys to values (not a true Collection)
Key characteristics:
- π Key-value pairs
- β No duplicate keys (values can duplicate)
- π Methods:
put(K key, V value),get(Object key),containsKey(Object key),keySet(),values(),entrySet()
Concrete Implementations
ArrayList - Resizable array implementation
When to use: Default choice for List, great for random access
| Operation | Time Complexity | Notes |
|---|---|---|
| get(index) | O(1) | Direct array access |
| add(element) | O(1) amortized | May need to resize |
| add(index, element) | O(n) | Shifts elements |
| remove(index) | O(n) | Shifts elements |
| contains(element) | O(n) | Linear search |
π‘ Tip: Initialize with capacity if you know the approximate size: new ArrayList<>(1000) to avoid multiple resizings.
LinkedList - Doubly-linked list implementation
When to use: Frequent insertions/deletions at beginning or middle
| Operation | Time Complexity | Notes |
|---|---|---|
| get(index) | O(n) | Must traverse list |
| add(element) | O(1) | Add to end |
| add(0, element) | O(1) | Add to beginning |
| remove(index) | O(n) | Must find element first |
| addFirst/addLast | O(1) | Deque operations |
π§ Memory Mnemonic: "LinkedList = Lots of Links" - Each element stores references to previous and next elements (more memory overhead than ArrayList).
HashSet - Hash table implementation of Set
When to use: Need fast lookups, don't care about order
| Operation | Time Complexity | Notes |
|---|---|---|
| add(element) | O(1) | Assuming good hash function |
| remove(element) | O(1) | Direct hash access |
| contains(element) | O(1) | Fast lookup |
β οΈ Important: Elements must properly implement hashCode() and equals() methods!
TreeSet - Red-black tree implementation of Set
When to use: Need sorted set, range operations
| Operation | Time Complexity | Notes |
|---|---|---|
| add(element) | O(log n) | Maintains sorted order |
| remove(element) | O(log n) | Tree rebalancing |
| contains(element) | O(log n) | Binary search |
| first/last | O(log n) | Access min/max |
π‘ Tip: Elements must be Comparable or you must provide a Comparator.
HashMap<K,V> - Hash table implementation of Map
When to use: Default choice for Map, fast key-value lookups
| Operation | Time Complexity | Notes |
|---|---|---|
| put(key, value) | O(1) | Average case |
| get(key) | O(1) | Average case |
| remove(key) | O(1) | Average case |
| containsKey(key) | O(1) | Average case |
π Real-world analogy: HashMap is like a dictionary - you look up a word (key) to find its definition (value). You don't read every entry; you jump directly to the word you need.
TreeMap<K,V> - Red-black tree implementation of Map
When to use: Need sorted map, range queries on keys
| Operation | Time Complexity | Notes |
|---|---|---|
| put(key, value) | O(log n) | Maintains sorted order |
| get(key) | O(log n) | Binary search |
| firstKey/lastKey | O(log n) | Access min/max keys |
Choosing the Right Collection: Decision Tree
ββββββββββββββββββββ
β Need key-value? β
ββββββββββ¬ββββββββββ
β
ββββββββββββββββ΄βββββββββββββββ
β β
ββββ΄βββ ββββ΄βββ
β YES β β NO β
ββββ¬βββ ββββ¬βββ
β β
βΌ βΌ
βββββββββββββββ ββββββββββββββββββββ
β Need sorted β β Need duplicates? β
β by keys? β ββββββββββ¬ββββββββββ
ββββββββ¬βββββββ β
β ββββββββββββ΄βββββββββββ
βββββββ΄ββββββ β β
β β ββββ΄βββ ββββ΄βββ
ββββ΄βββ ββββ΄βββ β YES β β NO β
β YES β β NO β ββββ¬βββ ββββ¬βββ
ββββ¬βββ ββββ¬βββ β β
β β βΌ βΌ
βΌ βΌ βββββββββββββββ βββββββββββββββ
βββββββββββ βββββββββββ βNeed order?β βNeed sorted? β
βTreeMap β βHashMap β βββββββ¬ββββββ ββββββββ¬βββββββ
βββββββββββ βββββββββββ β β
βββββββ΄ββββββ βββββββ΄ββββββ
β β β β
ββββ΄βββ ββββ΄βββ β ββββ΄βββ
β YES β β NO β β β NO β
ββββ¬βββ ββββ¬βββ β ββββ¬βββ
β β β β
βΌ βΌ βΌ βΌ
βββββββββββ βββββββββββ βββββββββββ βββββββββββ
βArrayListβ βHashSet β βTreeSet β βHashSet β
βor β βββββββββββ βββββββββββ βββββββββββ
βLinkedListβ
βββββββββββ
Iteration Patterns
1. Enhanced For Loop (for-each)
Simplest and most readable:
List<String> names = new ArrayList<>();
for (String name : names) {
System.out.println(name);
}
2. Iterator
Allows safe removal during iteration:
Iterator<String> iterator = names.iterator();
while (iterator.hasNext()) {
String name = iterator.next();
if (name.startsWith("A")) {
iterator.remove(); // Safe removal
}
}
β οΈ Common mistake: Don't modify collection directly while iterating:
// β WRONG - throws ConcurrentModificationException
for (String name : names) {
if (name.startsWith("A")) {
names.remove(name); // DANGER!
}
}
3. Stream API (Java 8+)
Functional approach:
names.stream()
.filter(name -> name.length() > 5)
.map(String::toUpperCase)
.forEach(System.out::println);
4. forEach Method (Java 8+)
With lambda expressions:
names.forEach(name -> System.out.println(name));
// Or with method reference
names.forEach(System.out::println);
Collections Utility Class
The Collections class (note the 's') provides static utility methods:
| Method | Purpose | Example |
|---|---|---|
| sort(List) | Sort list in natural order | Collections.sort(list) |
| reverse(List) | Reverse order | Collections.reverse(list) |
| shuffle(List) | Random permutation | Collections.shuffle(deck) |
| binarySearch(List, key) | Binary search (list must be sorted) | Collections.binarySearch(list, "apple") |
| max(Collection) | Maximum element | Collections.max(numbers) |
| min(Collection) | Minimum element | Collections.min(numbers) |
| frequency(Collection, Object) | Count occurrences | Collections.frequency(list, "x") |
| unmodifiableList(List) | Create immutable view | Collections.unmodifiableList(list) |
Examples with Detailed Explanations
Example 1: Building a Student Grade Manager
import java.util.*;
public class GradeManager {
// Map student name to their grades (multiple grades per student)
private Map<String, List<Integer>> studentGrades;
public GradeManager() {
// HashMap for fast student lookup
studentGrades = new HashMap<>();
}
public void addGrade(String studentName, int grade) {
// computeIfAbsent: if key doesn't exist, create new ArrayList
studentGrades.computeIfAbsent(studentName, k -> new ArrayList<>())
.add(grade);
}
public double getAverage(String studentName) {
List<Integer> grades = studentGrades.get(studentName);
if (grades == null || grades.isEmpty()) {
return 0.0;
}
// Stream API for clean average calculation
return grades.stream()
.mapToInt(Integer::intValue)
.average()
.orElse(0.0);
}
public Set<String> getTopStudents(int threshold) {
Set<String> topStudents = new TreeSet<>(); // Sorted by name
for (Map.Entry<String, List<Integer>> entry : studentGrades.entrySet()) {
double avg = getAverage(entry.getKey());
if (avg >= threshold) {
topStudents.add(entry.getKey());
}
}
return topStudents;
}
}
Key Points:
- Uses HashMap for O(1) student lookup
- Uses ArrayList for each student's grades (allows duplicates, maintains order)
- Uses TreeSet for sorted student names
computeIfAbsent()is a powerful method that simplifies null checkingMap.Entryallows iterating over key-value pairs
π§ Try this: Modify the code to use a TreeMap instead. What changes in behavior? (Answer: Students would be processed in alphabetical order)
Example 2: Implementing a Task Queue with Priority
import java.util.*;
class Task implements Comparable<Task> {
private String name;
private int priority; // Higher number = higher priority
public Task(String name, int priority) {
this.name = name;
this.priority = priority;
}
@Override
public int compareTo(Task other) {
// Reverse order for max-heap behavior
return Integer.compare(other.priority, this.priority);
}
@Override
public String toString() {
return name + "(priority:" + priority + ")";
}
}
public class TaskScheduler {
private Queue<Task> taskQueue;
public TaskScheduler() {
// PriorityQueue maintains heap order
taskQueue = new PriorityQueue<>();
}
public void addTask(String name, int priority) {
taskQueue.offer(new Task(name, priority));
}
public Task getNextTask() {
return taskQueue.poll(); // Removes and returns highest priority
}
public Task peekNextTask() {
return taskQueue.peek(); // Returns but doesn't remove
}
public static void main(String[] args) {
TaskScheduler scheduler = new TaskScheduler();
scheduler.addTask("Send email", 2);
scheduler.addTask("Fix critical bug", 5);
scheduler.addTask("Update docs", 1);
while (scheduler.peekNextTask() != null) {
System.out.println("Processing: " + scheduler.getNextTask());
}
// Output order: Fix critical bug, Send email, Update docs
}
}
Key Points:
- PriorityQueue automatically orders elements
- Elements must implement Comparable or provide a Comparator
offer()andpoll()are Queue interface methods (prefer over add/remove)peek()lets you look at next element without removing it- Priority Queue uses a binary heap internally (O(log n) insertion and removal)
π‘ Tip: Remember "PriorityQueue β Peek before Poll" to avoid null pointer exceptions!
Example 3: Removing Duplicates While Preserving Order
import java.util.*;
public class DuplicateRemover {
// Method 1: Using LinkedHashSet (elegant, preserves insertion order)
public static <T> List<T> removeDuplicatesV1(List<T> list) {
return new ArrayList<>(new LinkedHashSet<>(list));
}
// Method 2: Using Set for tracking (more explicit)
public static <T> List<T> removeDuplicatesV2(List<T> list) {
List<T> result = new ArrayList<>();
Set<T> seen = new HashSet<>();
for (T item : list) {
if (seen.add(item)) { // add() returns false if already present
result.add(item);
}
}
return result;
}
// Method 3: Using Stream API (Java 8+)
public static <T> List<T> removeDuplicatesV3(List<T> list) {
return list.stream()
.distinct()
.collect(Collectors.toList());
}
public static void main(String[] args) {
List<String> names = Arrays.asList(
"Alice", "Bob", "Alice", "Charlie", "Bob", "David"
);
System.out.println("Original: " + names);
System.out.println("No duplicates: " + removeDuplicatesV1(names));
// Output: [Alice, Bob, Charlie, David]
}
}
Key Points:
- LinkedHashSet combines HashSet's uniqueness with insertion order preservation
Set.add()returns boolean - false if element already exists- Stream.distinct() uses equals() method to determine uniqueness
- All three methods have O(n) time complexity
π€ Did you know? LinkedHashSet uses both a hash table AND a doubly-linked list internally. The hash table provides O(1) lookups, while the linked list maintains order. This dual structure uses more memory but provides the best of both worlds!
Example 4: Word Frequency Counter
import java.util.*;
import java.util.stream.Collectors;
public class WordFrequencyCounter {
public static Map<String, Integer> countWords(String text) {
Map<String, Integer> frequency = new HashMap<>();
String[] words = text.toLowerCase()
.split("\\W+"); // Split on non-word characters
for (String word : words) {
if (!word.isEmpty()) {
// getOrDefault avoids null check
frequency.put(word, frequency.getOrDefault(word, 0) + 1);
// Alternative using merge() method:
// frequency.merge(word, 1, Integer::sum);
}
}
return frequency;
}
public static List<Map.Entry<String, Integer>> getTopWords(Map<String, Integer> frequency, int n) {
return frequency.entrySet().stream()
.sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
.limit(n)
.collect(Collectors.toList());
}
public static void main(String[] args) {
String text = "the quick brown fox jumps over the lazy dog the fox";
Map<String, Integer> frequency = countWords(text);
System.out.println("Frequency map: " + frequency);
List<Map.Entry<String, Integer>> top3 = getTopWords(frequency, 3);
System.out.println("\nTop 3 words:");
for (Map.Entry<String, Integer> entry : top3) {
System.out.println(entry.getKey() + ": " + entry.getValue());
}
// Output: the: 3, fox: 2, ...
}
}
Key Points:
getOrDefault(key, defaultValue)simplifies null handlingmerge(key, value, remappingFunction)is even more elegant for countersMap.Entryis the way to access key-value pairs togethercomparingByValue()creates a Comparator for sorting by map values- Chaining stream operations creates readable data pipelines
β‘ Performance note: For very large text processing, consider using ConcurrentHashMap if you're working with multiple threads.
Common Mistakes
β Mistake 1: Using ArrayList When You Need Set Behavior
// β WRONG - inefficient duplicate checking
List<String> uniqueNames = new ArrayList<>();
for (String name : allNames) {
if (!uniqueNames.contains(name)) { // O(n) operation!
uniqueNames.add(name);
}
}
// β
RIGHT - O(1) duplicate checking
Set<String> uniqueNames = new HashSet<>(allNames);
Why it's wrong: contains() on ArrayList is O(n), making the entire loop O(nΒ²). HashSet's contains() is O(1).
β Mistake 2: Forgetting to Override hashCode() and equals()
class Person {
String name;
int age;
// β WRONG - using default Object.equals() and Object.hashCode()
}
Set<Person> people = new HashSet<>();
people.add(new Person("Alice", 25));
people.add(new Person("Alice", 25)); // Treated as different!
System.out.println(people.size()); // Output: 2 (should be 1!)
// β
RIGHT - properly override both methods
class Person {
String name;
int age;
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Person)) return false;
Person person = (Person) o;
return age == person.age && Objects.equals(name, person.name);
}
@Override
public int hashCode() {
return Objects.hash(name, age);
}
}
Why it matters: HashSet and HashMap rely on hashCode() for bucket placement. Without proper overrides, objects that should be equal won't be recognized as such.
β Mistake 3: Modifying Collection During Iteration
List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
// β WRONG - ConcurrentModificationException
for (Integer num : numbers) {
if (num % 2 == 0) {
numbers.remove(num);
}
}
// β
RIGHT - use Iterator.remove()
Iterator<Integer> it = numbers.iterator();
while (it.hasNext()) {
if (it.next() % 2 == 0) {
it.remove();
}
}
// β
ALSO RIGHT - use removeIf (Java 8+)
numbers.removeIf(num -> num % 2 == 0);
β Mistake 4: Using Wrong Collection for Performance-Critical Code
// β WRONG - LinkedList for random access
List<String> data = new LinkedList<>();
for (int i = 0; i < 1000; i++) {
data.add("item" + i);
}
String item = data.get(500); // O(n) - slow!
// β
RIGHT - ArrayList for random access
List<String> data = new ArrayList<>();
for (int i = 0; i < 1000; i++) {
data.add("item" + i);
}
String item = data.get(500); // O(1) - fast!
β Mistake 5: Assuming TreeSet/TreeMap Are Always Better
// β WRONG - unnecessary sorting overhead
Set<String> recentSearches = new TreeSet<>(); // O(log n) operations
recentSearches.add("java");
recentSearches.add("python");
// Don't need sorting, just uniqueness!
// β
RIGHT - use HashSet for better performance
Set<String> recentSearches = new HashSet<>(); // O(1) operations
Rule of thumb: Only use TreeSet/TreeMap if you actually need the sorted order. Otherwise, the O(log n) overhead isn't worth it.
Key Takeaways
β Choose the right collection based on requirements:
- Need index access? β ArrayList
- Frequent insertions at beginning? β LinkedList
- Need uniqueness? β HashSet
- Need sorted order? β TreeSet or TreeMap
- Need key-value pairs? β HashMap
- Need processing order? β Queue or Deque
β Know your time complexities:
- ArrayList: O(1) get, O(n) insert/remove in middle
- LinkedList: O(n) get, O(1) insert/remove at ends
- HashSet/HashMap: O(1) average for add/remove/contains
- TreeSet/TreeMap: O(log n) for all operations
β Remember to override hashCode() and equals() when using custom objects in HashSet or as HashMap keys
β Use Iterator or removeIf() when you need to remove elements during iteration
β Leverage the Collections utility class for common operations like sorting, searching, and shuffling
β Consider the Stream API (Java 8+) for functional-style collection processing
π Further Study
Oracle's Collections Tutorial: https://docs.oracle.com/javase/tutorial/collections/index.html - Official comprehensive guide with examples
Java Collections Framework Internals: https://www.baeldung.com/java-collections - Deep dive into how collections work under the hood
Effective Java by Joshua Bloch (Item 28-34): https://www.oreilly.com/library/view/effective-java/9780134686097/ - Best practices for using collections effectively
π Quick Reference Card
| Need | Use | Key Feature |
|---|---|---|
| Ordered, indexed access | ArrayList | O(1) get, dynamic sizing |
| Frequent front insertions | LinkedList | O(1) addFirst/addLast |
| No duplicates, fast lookup | HashSet | O(1) contains |
| No duplicates, sorted | TreeSet | O(log n), natural order |
| Key-value pairs, fast lookup | HashMap | O(1) get/put |
| Key-value pairs, sorted keys | TreeMap | O(log n), sorted navigation |
| FIFO queue | LinkedList/ArrayDeque | offer/poll operations |
| Priority-based processing | PriorityQueue | Heap-based ordering |
π§ Essential Methods to Remember
add(E e) | Add element (throws exception if fails) |
offer(E e) | Add element (returns false if fails) - Queue |
remove() | Remove & return (throws exception if empty) |
poll() | Remove & return (returns null if empty) - Queue |
peek() | View but don't remove - Queue |
computeIfAbsent(K, Function) | Compute value if key missing - Map |
getOrDefault(K, V) | Get or return default - Map |
removeIf(Predicate) | Remove matching elements safely |