One variable, many values
A variable holds one value. Real programs deal in many: a cart of items, a class of students, a feed of posts. Collections are the containers languages provide for groups of values — and choosing the right one is the first genuinely architectural decision you'll make daily for the rest of your career.
Four shapes cover ~95% of all code:
| Shape | Python | Java | C++ | One-liner |
|---|---|---|---|---|
| Ordered sequence | list | ArrayList | std::vector | Items in a row, by position |
| Key → value lookup | dict | HashMap | std::unordered_map | A label on every value |
| Unique items | set | HashSet | std::unordered_set | Membership, no duplicates |
| Fixed group | tuple | record | std::pair/struct | A few values that travel together |
Lists — the workhorse
An ordered, growable sequence, accessed by index (position, counting from 0):
# Python
cart = ["bread", "milk"]
cart.append("eggs") # add at the end
print(cart[0]) # "bread" — first item
print(cart[-1]) # "eggs" — negative = from the end
cart[1] = "oat milk" # replace by index
print(len(cart)) # 3
for item in cart: # iterate (Control Flow page)
print(item)
// Java
List<String> cart = new ArrayList<>();
cart.add("bread");
cart.add("milk");
System.out.println(cart.get(0));
cart.set(1, "oat milk");
// C++
std::vector<std::string> cart = {"bread", "milk"};
cart.push_back("eggs");
std::cout << cart[0];
Slicing (Python superpower): cart[0:2] is a new list of the first two
items; cart[::-1] is the reverse. Java/C++ do the same with explicit loops
or library calls.
Use a list when order matters or you'll iterate everything: a feed, a queue of steps, last-10 search results.
Dictionaries — label every value
A dictionary (also called map or hash map — same idea) stores key → value pairs and answers "what's the value for this key?" essentially instantly:
# Python
prices = {"bread": 40, "milk": 60, "eggs": 90}
print(prices["milk"]) # 60 — lookup by key
prices["butter"] = 120 # add or update
if "tofu" in prices: # membership check
print(prices["tofu"])
print(prices.get("tofu", 0)) # 0 — default instead of an exception
for item, price in prices.items():
print(item, price)
// Java
Map<String, Integer> prices = new HashMap<>();
prices.put("bread", 40);
prices.getOrDefault("tofu", 0);
// C++
std::unordered_map<std::string, int> prices = {{"bread", 40}, {"milk", 60}};
prices["butter"] = 120;
The analogy: a list is lockers in a row (find by position); a dict is a phone book (find by name). Asking "is bread in this list of 1 million items?" requires checking up to all million; asking a dict takes one step regardless of size. Why that's true — hashing — is exactly hash tables in Level 2; for now, trust the speed.
Use a dict when you look things up by identity: user by id, price by product, count by word. The moment you write a loop to find something in a list repeatedly, you probably wanted a dict.
Sets — membership and uniqueness
A set holds unique values, unordered, with instant membership checks:
# Python
visitors = {"asha", "rahul", "asha"} # duplicates collapse
print(len(visitors)) # 2
visitors.add("meera")
print("rahul" in visitors) # True — instant, like dict keys
admins = {"asha", "meera"}
print(visitors & admins) # intersection: both
print(visitors - admins) # difference: visitors who aren't admins
Use a set for "have I seen this before?" (dedupe, visited pages, already-processed ids) and for set algebra (common followers, missing permissions). A set is essentially a dict that stores only keys.
Tuples — small fixed groups
A tuple is a small, immutable (unchangeable) bundle of values that belong together:
point = (3, 7) # x, y — forever
x, y = point # unpacking
locations = {(28.6, 77.2): "Delhi"} # immutable ⇒ usable as dict keys
def min_max(nums):
return (min(nums), max(nums)) # return two values at once
Use tuples for coordinates, RGB colors, (key, value) pairs, multiple return
values. The immutability is a feature: a function can hand one out without
fearing the caller will mutate it
(remember aliasing). When a tuple grows fields or
needs names, graduate to a class (OOP Basics) —
point.x beats point[0].
Choosing: the decision in four questions
And the performance intuition to carry into Level 2 (full version: complexity cheat sheet):
| Operation | list | dict / set |
|---|---|---|
| Get by index/key | instant | instant |
x in collection | scans everything — O(n) | instant — O(1) |
| Add at end / insert pair | instant | instant |
| Insert at front/middle | shifts everything — O(n) | — |
That one row — membership: O(n) vs O(1) — explains half of all real-world slow code, and half of all DSA interview optimizations.
Nesting: collections of collections
Real data is nested — and you already know this shape from JSON:
orders = [ # a list of dicts
{"id": 101, "customer": "Asha", "items": ["bread", "eggs"]},
{"id": 102, "customer": "Rahul", "items": ["milk"]},
]
print(orders[0]["items"][1]) # "eggs"
word_count = {} # THE classic pattern: counting
for word in text.split():
word_count[word] = word_count.get(word, 0) + 1
That counting loop is worth memorizing — it's the accumulator pattern from Control Flow wearing a dict, it appears in countless interview problems, and it's your first real hash table algorithm.
Common beginner mistakes
- List when you meant dict. Looping through
usersto find an id, inside another loop — accidental O(n²) (nested-loop multiplication). Buildusers_by_idonce; look up forever. - Mutating a list while iterating it — skipped items, subtle chaos. Build a new list with the items you keep.
- The aliasing classic:
b = acopies the reference (Functions) —b.append(x)changesatoo. Copy explicitly:b = a.copy()(andcopy.deepcopyfor nested data). dict[key]on a maybe-missing key →KeyError(Exceptions). Use.get(key, default)or checkinfirst.- Relying on order in a set. Sets (and Java's HashMap keys) have no useful order; if order matters, you wanted a list (or sort at the end).
Interview perspective
Practice
Beginner
- Keep a shopping list: loop reading commands
add x,remove x,show,quit— implement each (list + while loop). - Given
[3, 7, 3, 1, 7, 3], produce: the unique values, the count of each value, and the list reversed — pick the right collection for each.
Intermediate
- Two lists: students enrolled in Math, students in Physics. Produce: in both, in either, only in Math. (Sets — then re-do it without sets and appreciate them.)
- Build a gradebook
dictof student → list of marks. Write functionsadd_mark,average(student),topper(). Handle the missing-student case both ways:.getdefault and try/except.
Advanced
- Group a list of words into anagram families:
["eat","tea","tan","ate"]→{"aet": ["eat","tea","ate"], "ant": ["tan"]}. Hint: what key makes anagrams collide? Why must that key be a string or tuple, not a list? (You just invented the canonical-form trick — see it again in the problem bank.)
Next: File Handling — making data outlive the program.