Hash Tables

November 18, 2025
#hash-tables #maps #dictionaries

Introduction

Hash tables (also called hash maps or dictionaries) are one of the most important data structures you will use as a developer. They provide near constant-time lookups, inserts, and deletes on average, making them a go-to choice for many real-world problems.

By the end of this tutorial, you’ll understand how hash tables work, what a hash function is, how collisions are handled, and when to prefer a hash table over other structures like arrays or linked lists.

What is a hash table?

A hash table stores key–value pairs. Instead of finding values by numeric index (like an array), you use a key (e.g., a string like “username”). A hash function converts the key into an array index (a bucket). The value is stored in that bucket.

Core components

  1. Key: The identifier you use to access a value (e.g., “email”).
  2. Value: The data you want to store (e.g., a user object).
  3. Hash function: Maps a key to a bucket index.
  4. Buckets: Where key–value pairs are placed; multiple pairs can land in the same bucket if they collide.

Collision handling strategies

Even great hash functions can map two different keys to the same bucket. That’s called a collision. Common strategies to handle collisions:

  1. Separate chaining
    • Each bucket stores a small list (often a linked list) of key–value pairs.
    • On insert, you append to the bucket’s list; on lookup, you scan that small list to find your key.
  2. Open addressing (probing)
    • If a bucket is taken, probe to another bucket based on a probe sequence (linear probing, quadratic probing, double hashing).
    • Keeps everything in the same underlying array, but requires careful load factor management.

Load factor and resizing

The load factor is number_of_items / number_of_buckets. As the load factor grows, collisions become more frequent. Hash tables typically resize (allocate a larger array and rehash items) when the load factor passes a threshold (e.g., 0.7).

When to use a hash table

Use a hash table when you need:

  • Fast average-time lookup by key (e.g., look up a user’s record by email).
  • Fast membership tests (e.g., “have we seen this ID before?”).
  • Counting, grouping, or deduplicating items.

Prefer other structures when:

  • You need to maintain sort order by key (use a tree map or sort the keys when needed).
  • You need fast predecessor/successor queries (balanced BSTs are better).
  • Memory is extremely constrained and predictable iteration order matters.

Basic operations

Implementing a tiny hash table (from scratch)

To really understand hash tables, let’s implement a toy version. We’ll use separate chaining with arrays for buckets. This is intentionally simplified for learning.

Complexity recap

  • Average case: O(1) for get, set, delete.
  • Worst case: O(n) if many keys collide into the same bucket.
  • Space: O(n) to store n elements, plus overhead for buckets.

Practice ideas

  • Implement your own HashSet using the TinyHashTable above (store only keys, no values).
  • Measure performance as you vary capacity and load factor.
  • Compare lookups versus a sorted array + binary search from our Binary Search tutorial.
Thanks for visiting
We are actively updating content to this site. Thanks for visiting! Please bookmark this page and visit again soon.
Sponsor