Hash tables are one of the most powerful and widely used data structures in computer science.
A hash table lets you store and retrieve data in constant average time — O(1). That’s incredibly fast compared to arrays or linked lists, where searching might take linear time.
Hash tables are everywhere in the real world.
They power Python’s dict and Java’s HashMap. In this chapter, I’ll cover what a hash table is, how it works under the hood, how collisions are handled, and the core operations that make it fast.
So, what exactly is a hash table?
A hash table is a data structure that stores information as key–value pairs.
Here’s how it works: each key is passed through a hash function that converts it into an index, and the value is stored at that index in an internal array.
So instead of searching through an entire list to find what you need, you can jump straight to it.
That’s what makes hash tables so fast: insertions, lookups, and deletions all happen in O(1) time on average.
A hash table gives you direct access through hashing, instead of scanning through every entry one by one.
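Python’s built-in dict is itself a hash table, so the direct-access idea can be seen in a few lines (the prices data here is just an illustrative example):

```python
# Python's built-in dict is a hash table: keys are hashed to find
# their storage slot, so lookups don't scan the whole collection.
prices = {"apple": 1.50, "orange": 0.75, "banana": 0.25}

# Direct access by key -- no scanning through entries.
print(prices["apple"])     # 1.5

# Membership tests are also hash-based, not linear scans.
print("orange" in prices)  # True
print("grape" in prices)   # False
```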
Now that you know what a hash table is, let’s see how it actually works under the hood.
1. Hash Function
Everything starts with a hash function.
It takes your key (which could be a string like "apple" or a number) and converts it into an integer index.
That index decides where the value will be stored inside an internal array.
For example:
Hashing "apple" might produce 5, so the hash table stores "apple"’s value at index 5 in the array. A good hash function is fast and spreads keys evenly across the array.
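As a toy illustration (this character-sum scheme is invented here for teaching, not how real libraries hash), a hash function might look like:

```python
def toy_hash(key: str, table_size: int) -> int:
    """Toy hash function: sum the character codes of the key,
    then wrap the sum into the array bounds with modulo."""
    return sum(ord(ch) for ch in key) % table_size

# With a 7-slot array, "apple" lands at index 5.
print(toy_hash("apple", 7))  # 5
```

Real implementations use far better-mixing functions, but the shape is the same: key in, array index out.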
But what happens if two keys map to the same index? This is called a collision.
And no matter how good your hash function is, collisions are unavoidable.
So we need a strategy to handle them.
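A quick way to see why collisions are unavoidable: with a toy hash that sums character codes (an illustrative scheme, not a real library’s), any two anagrams collide no matter how large the table is:

```python
def toy_hash(key: str, table_size: int) -> int:
    # Toy hash: sum of character codes, wrapped by modulo.
    return sum(ord(ch) for ch in key) % table_size

# Anagrams have identical character sums, so they always
# map to the same index under this hash.
print(toy_hash("listen", 7) == toy_hash("silent", 7))  # True
```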
There are two main approaches:
- Chaining: if "apple" and "orange" both hash to index 5, they’re stored in a small linked list or collection at that spot.
- Open addressing: if a slot is already taken, the table probes other slots until it finds an empty one.

A key concept that determines how likely collisions are is called the Load Factor.
It measures how full a hash table is: the number of stored entries divided by the number of slots in the internal array. When the load factor climbs past a threshold, the table grows its internal array and rehashes every key into the larger one.
This resizing keeps the hash table efficient, ensuring that operations like insertion, deletion, and lookup remain close to O(1) on average.
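The load factor and resize trigger can be sketched in a few lines (the exact threshold varies by implementation; Java’s HashMap defaults to 0.75, which is borrowed here):

```python
# Threshold is implementation-specific; 0.75 is Java HashMap's default.
THRESHOLD = 0.75

def load_factor(num_entries: int, num_slots: int) -> float:
    # Load factor = stored entries / available slots.
    return num_entries / num_slots

def needs_resize(num_entries: int, num_slots: int) -> bool:
    # Once the table is more than ~75% full, grow and rehash.
    return load_factor(num_entries, num_slots) > THRESHOLD

print(needs_resize(7, 8))  # True  (0.875 > 0.75)
print(needs_resize(5, 8))  # False (0.625 <= 0.75)
```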
Now let’s look at the core operations that make hash tables so powerful and how fast each one really is.
When you insert something, the key goes through the hash function to find its index. The value is then stored at that index in the internal array.
If another key already maps to that same index (that’s a collision), the hash table resolves it with its collision-handling strategy, chaining or open addressing, as we discussed earlier.
Time Complexity for insert is O(1) on average, but it can go up to O(n) in the worst case if many keys map to the same spot.
To search for a value, you hash the key and jump directly to the index where it should be.
If the key exists, you return its value immediately. If there’s a collision chain, you simply walk through that list until you find the match.
Time Complexity for search is O(1) on average, but O(n) in the worst case if all keys collide.
Deleting works in a similar way: first hash the key to locate its index, then remove the entry.
Time Complexity for delete is O(1) on average and O(n) in the worst case.
Updating is really just insertion with an existing key. If the key already exists, you simply overwrite the old value with the new one.
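Putting the four operations together, here is a minimal sketch of a chaining-based hash table (illustrative only: it uses Python’s built-in hash and skips resizing for brevity):

```python
class HashTable:
    """Minimal hash table using chaining; omits resizing for brevity."""

    def __init__(self, size: int = 8):
        self.buckets = [[] for _ in range(size)]  # one chain per slot

    def _index(self, key) -> int:
        # Hash the key, then wrap into the array bounds.
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # existing key: overwrite (update)
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: append to the chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:                  # walk the chain until a match
                return v
        raise KeyError(key)

    def delete(self, key) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return
        raise KeyError(key)
```

Usage mirrors the operations described above: put hashes and stores, get hashes and walks the chain, a second put with the same key overwrites, and delete removes the entry.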
Choosing a good hash function is key to how efficient your hash table actually is.