Hash Tables

November 18, 2025

Introduction

Hash tables (also called hash maps or dictionaries) are one of the most important data structures you will use as a developer. They provide near constant-time lookups, inserts, and deletes on average, making them a go-to choice for many real-world problems.

By the end of this tutorial, you’ll understand how hash tables work, what a hash function is, how collisions are handled, and when to prefer a hash table over other structures like arrays or linked lists.

What is a hash table?

A hash table stores key–value pairs. Instead of finding values by numeric index (as in an array), you look them up by key (e.g., a string like “username”). A hash function converts the key into an integer, which is then reduced (typically with a modulo) to an index into an array of buckets. The key–value pair is stored in that bucket.
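
The key-to-bucket step can be sketched in a few lines of Python. The names `toy_hash` and `bucket_index` are made up for this illustration, and the hash is deliberately naive (it only sums character codes); real hash functions mix bits far more thoroughly.

```python
def toy_hash(key: str) -> int:
    """Naive illustrative hash: the sum of the character codes."""
    return sum(ord(ch) for ch in key)


def bucket_index(key: str, num_buckets: int) -> int:
    """Reduce the hash to a valid index into the bucket array."""
    return toy_hash(key) % num_buckets


num_buckets = 8
buckets = [None] * num_buckets

# Store the pair in the bucket chosen by the key's hash.
idx = bucket_index("username", num_buckets)
buckets[idx] = ("username", "alice")

print(toy_hash("username"))  # 864
print(idx)                   # 0, because 864 % 8 == 0
print(buckets[idx])          # ('username', 'alice')
```

This sketch ignores what happens when two keys produce the same index; it only shows how a key becomes a position in an array.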

Core components

Key: The identifier you use to access a value (e.g., “email”).
Value: The data you want to store (e.g., a user object).
Hash function: Maps a key to a bucket index.
Buckets: Where key–value pairs are placed; multiple pairs can land in the same bucket if they collide.

On average, hash tables offer O(1) time for get, set, and delete. In the worst case (many collisions), operations can degrade to O(n). If you’re unfamiliar with these notations, see Big-O notation.
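
To see why collisions drive the worst case, the sketch below counts how many keys land in each bucket under two hypothetical hash functions: a degenerate one that returns a constant, and a simple polynomial one. Both functions and the `bucket_sizes` helper are assumptions made for this illustration.

```python
def bucket_sizes(keys, num_buckets, hash_fn):
    """Count how many keys land in each bucket under hash_fn."""
    sizes = [0] * num_buckets
    for key in keys:
        sizes[hash_fn(key) % num_buckets] += 1
    return sizes


def constant_hash(key):
    """Degenerate hash: every key gets the same value."""
    return 42


def poly_hash(key):
    """Toy polynomial hash over the character codes."""
    h = 0
    for ch in key:
        h = h * 31 + ord(ch)
    return h


keys = [f"user{i}" for i in range(16)]

worst = bucket_sizes(keys, 8, constant_hash)
better = bucket_sizes(keys, 8, poly_hash)

print(max(worst))   # 16: one bucket holds every key, so a lookup may scan all of them
print(max(better))  # smaller than 16: the keys are spread over several buckets
```

With the constant hash, finding a key means scanning a single bucket that contains all n items, which is the O(n) case described above.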

Collision handling strategies

Even great hash functions can map two different keys to the same bucket. That’s called a collision. Common strategies to handle collisions:

Separate chaining
- Each bucket stores a small list (often a linked list) of key–value pairs.
- On insert, you append to the bucket’s list; on lookup, you scan that small list to find your key.

Open addressing (probing)
- If a bucket is taken, probe to another bucket based on a probe sequence (linear probing, quadratic probing, double hashing).
- Keeps everything in the same underlying array, but requires careful load factor management.
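
Open addressing with linear probing can be sketched as follows. This is a minimal illustration, not a production design: the class name `LinearProbingTable` is hypothetical, the capacity is fixed, and deletion is left out because removing a slot outright would break probe chains (real implementations typically mark deleted slots with a tombstone).

```python
class LinearProbingTable:
    """Toy fixed-capacity open-addressing table (no resizing, no deletion)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        # Each slot is either None (empty) or a (key, value) tuple.
        self.slots = [None] * capacity

    def _probe(self, key):
        """Yield slot indices from the home slot onward, wrapping around once."""
        start = hash(key) % self.capacity
        for step in range(self.capacity):
            yield (start + step) % self.capacity

    def set(self, key, value):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None or slot[0] == key:
                self.slots[idx] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None:
                return default  # an empty slot ends the probe chain
            if slot[0] == key:
                return slot[1]
        return default


# Small integers hash to themselves in Python, so 1, 9, and 17
# all have home slot 1 when the capacity is 8.
table = LinearProbingTable()
table.set(1, "a")
table.set(9, "b")   # slot 1 is taken, so this lands in slot 2
table.set(17, "c")  # slots 1 and 2 are taken, so this lands in slot 3

print(table.slots[1:4])  # [(1, 'a'), (9, 'b'), (17, 'c')]
print(table.get(9))      # b
print(table.get(25))     # None
```

Because colliding keys spill into neighboring slots, long runs of occupied slots form as the table fills, which is why open addressing is sensitive to the load factor.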

Load factor and resizing

The load factor is number_of_items / number_of_buckets. As the load factor grows, collisions become more frequent. Hash tables typically resize (allocate a larger array and rehash items) when the load factor passes a threshold (e.g., 0.7).
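
As a worked example of the arithmetic, the sketch below traces when a table that starts with 8 buckets would resize as items are added, assuming the capacity doubles whenever the load factor exceeds 0.7. The `needs_resize` helper and the doubling policy are assumptions for illustration; real implementations choose their own thresholds and growth factors.

```python
def needs_resize(num_items, num_buckets, threshold=0.7):
    """True when the load factor exceeds the threshold."""
    return num_items / num_buckets > threshold


capacity = 8
resizes = []  # (item count that triggered the resize, new capacity)

for count in range(1, 21):
    if needs_resize(count, capacity):
        capacity *= 2
        resizes.append((count, capacity))

print(resizes)  # [(6, 16), (12, 32)]
```

The sixth item pushes the load factor to 6/8 = 0.75, so the table grows to 16 buckets; the twelfth pushes it to 12/16 = 0.75, so it grows again to 32.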

When to use a hash table

Use a hash table when you need:

Fast average-time lookup by key (e.g., look up a user’s record by email).
Fast membership tests (e.g., “have we seen this ID before?”).
Counting, grouping, or deduplicating items.
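
The three uses above might look like this in Python, using the standard library's `Counter`, `defaultdict`, and `set`, all of which are hash-based. The sample data is invented for the example.

```python
from collections import Counter, defaultdict

words = ["apple", "avocado", "banana", "apple", "blueberry", "cherry", "banana"]

# Counting: how many times does each word occur?
counts = Counter(words)
print(counts["apple"])   # 2
print(counts["cherry"])  # 1

# Grouping: collect words by their first letter.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(by_letter["b"])    # ['banana', 'blueberry', 'banana']

# Deduplicating while keeping first-seen order, using a set for membership tests.
seen = set()
unique = []
for word in words:
    if word not in seen:
        seen.add(word)
        unique.append(word)
print(unique)  # ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
```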

Prefer other structures when:

You need to maintain sort order by key (use a tree map or sort the keys when needed).
You need fast predecessor/successor queries (balanced BSTs are better).
Memory is extremely constrained, or you need a predictable iteration order (not every hash table implementation guarantees one).
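
To illustrate the predecessor/successor case, here is a small sketch of ordered queries. Python's standard library has no balanced BST, so this example substitutes a sorted list with the `bisect` module as a stand-in: queries take O(log n), although inserting into the list is O(n). The `predecessor` and `successor` helpers are hypothetical names for this illustration.

```python
import bisect

sorted_keys = [10, 20, 30, 40]


def predecessor(keys, target):
    """Largest key strictly less than target, or None if there is none."""
    i = bisect.bisect_left(keys, target)
    return keys[i - 1] if i > 0 else None


def successor(keys, target):
    """Smallest key strictly greater than target, or None if there is none."""
    i = bisect.bisect_right(keys, target)
    return keys[i] if i < len(keys) else None


print(predecessor(sorted_keys, 25))  # 20
print(successor(sorted_keys, 25))    # 30
print(predecessor(sorted_keys, 10))  # None
print(successor(sorted_keys, 40))    # None
```

A hash table cannot answer these queries without examining every key, because hashing deliberately scatters neighboring keys across unrelated buckets.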

Basic operations

// JavaScript: Map
const ages = new Map();

// Insert / set
ages.set('alice', 30);
ages.set('bob', 28);

// Lookup / get
console.log(ages.get('alice')); // 30

// Update
ages.set('alice', 31);

// Check existence
console.log(ages.has('carol')); // false

// Delete
ages.delete('bob');

# Python: dict
ages = {}

# Insert / set
ages['alice'] = 30
ages['bob'] = 28

# Lookup / get
print(ages.get('alice'))  # 30

# Update
ages['alice'] = 31

# Check existence
print('carol' in ages)  # False

# Delete
del ages['bob']

// Java: HashMap
import java.util.HashMap;

// The wrapper class name is arbitrary; Java statements must live inside a method.
public class AgesExample {
  public static void main(String[] args) {
    HashMap<String, Integer> ages = new HashMap<>();

    // Insert / set
    ages.put("alice", 30);
    ages.put("bob", 28);

    // Lookup / get
    System.out.println(ages.get("alice")); // 30

    // Update
    ages.put("alice", 31);

    // Check existence
    System.out.println(ages.containsKey("carol")); // false

    // Delete
    ages.remove("bob");
  }
}

Implementing a tiny hash table (from scratch)

To really understand hash tables, let’s implement a toy version. We’ll use separate chaining with arrays for buckets. This is intentionally simplified for learning.

// JavaScript: TinyHashTable
class TinyHashTable {
  constructor(capacity = 8) {
    this.capacity = capacity;
    this.buckets = Array.from({ length: capacity }, () => []);
    this.size = 0;
  }

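  // Simple hash function for strings (djb2 variant).
  // The bitwise operators keep h within the signed 32-bit range.
  hash(key) {
    let h = 5381;
    for (let i = 0; i < key.length; i++) {
      h = ((h << 5) + h) ^ key.charCodeAt(i); // (h * 33) XOR char code
    }
    return Math.abs(h) % this.capacity; // fold negatives into a valid index
  }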

  set(key, value) {
    const idx = this.hash(key);
    const bucket = this.buckets[idx];
    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i][0] === key) { // update
        bucket[i][1] = value;
        return;
      }
    }
    bucket.push([key, value]);
    this.size++;
    if (this.size / this.capacity > 0.7) this.resize(this.capacity * 2);
  }

  get(key) {
    const idx = this.hash(key);
    const bucket = this.buckets[idx];
    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i][0] === key) return bucket[i][1];
    }
    return undefined;
  }

  delete(key) {
    const idx = this.hash(key);
    const bucket = this.buckets[idx];
    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i][0] === key) {
        bucket.splice(i, 1);
        this.size--;
        return true;
      }
    }
    return false;
  }

  resize(newCapacity) {
    const old = this.buckets;
    this.capacity = newCapacity;
    this.buckets = Array.from({ length: newCapacity }, () => []);
    this.size = 0;
    for (const bucket of old) {
      for (const [k, v] of bucket) this.set(k, v);
    }
  }
}

// Usage
const table = new TinyHashTable();
table.set('alice', 30);
table.set('bob', 28);
console.log(table.get('alice')); // 30
table.delete('bob');

# Python: TinyHashTable
class TinyHashTable:
  def __init__(self, capacity=8):
    self.capacity = capacity
    self.buckets = [[] for _ in range(capacity)]
    self.size = 0

  def _hash(self, key: str) -> int:
    # Simple djb2 variant for demonstration (strings only).
    # Masking to 32 bits keeps h from growing without bound on long keys.
    h = 5381
    for ch in key:
      h = (((h << 5) + h) ^ ord(ch)) & 0xFFFFFFFF
    return h % self.capacity

  def set(self, key: str, value):
    idx = self._hash(key)
    bucket = self.buckets[idx]
    for i, (k, v) in enumerate(bucket):
      if k == key:
        bucket[i] = (k, value)
        return
    bucket.append((key, value))
    self.size += 1
    if self.size / self.capacity > 0.7:
      self._resize(self.capacity * 2)

  def get(self, key: str):
    idx = self._hash(key)
    for k, v in self.buckets[idx]:
      if k == key:
        return v
    return None

  def delete(self, key: str) -> bool:
    idx = self._hash(key)
    bucket = self.buckets[idx]
    for i, (k, v) in enumerate(bucket):
      if k == key:
        del bucket[i]
        self.size -= 1
        return True
    return False

  def _resize(self, new_capacity: int):
    old = self.buckets
    self.capacity = new_capacity
    self.buckets = [[] for _ in range(new_capacity)]
    self.size = 0
    for bucket in old:
      for k, v in bucket:
        self.set(k, v)

# Usage
t = TinyHashTable()
t.set('alice', 30)
t.set('bob', 28)
print(t.get('alice'))  # 30
t.delete('bob')

// Java: TinyHashTable
import java.util.ArrayList;
import java.util.List;

class TinyHashTable {
  private static class Pair {
    String key; Object value;
    Pair(String k, Object v) { key = k; value = v; }
  }

  private int capacity;
  private List<List<Pair>> buckets;
  private int size;

  public TinyHashTable(int capacity) {
    this.capacity = capacity;
    this.buckets = new ArrayList<>();
    for (int i = 0; i < capacity; i++) buckets.add(new ArrayList<>());
    this.size = 0;
  }

  public TinyHashTable() { this(8); }

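  private int hash(String key) {
    long h = 5381;
    for (int i = 0; i < key.length(); i++) {
      h = ((h << 5) + h) ^ key.charAt(i); // (h * 33) XOR char; may overflow and go negative
    }
    // floorMod never returns a negative index here, unlike Math.abs,
    // which stays negative for Long.MIN_VALUE.
    return (int) Math.floorMod(h, (long) capacity);
  }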

  public void set(String key, Object value) {
    int idx = hash(key);
    List<Pair> bucket = buckets.get(idx);
    for (int i = 0; i < bucket.size(); i++) {
      if (bucket.get(i).key.equals(key)) {
        bucket.get(i).value = value;
        return;
      }
    }
    bucket.add(new Pair(key, value));
    size++;
    if ((double) size / capacity > 0.7) resize(capacity * 2);
  }

  public Object get(String key) {
    int idx = hash(key);
    for (Pair p : buckets.get(idx)) {
      if (p.key.equals(key)) return p.value;
    }
    return null;
  }

  public boolean delete(String key) {
    int idx = hash(key);
    List<Pair> bucket = buckets.get(idx);
    for (int i = 0; i < bucket.size(); i++) {
      if (bucket.get(i).key.equals(key)) {
        bucket.remove(i);
        size--;
        return true;
      }
    }
    return false;
  }

  private void resize(int newCapacity) {
    List<List<Pair>> old = buckets;
    capacity = newCapacity;
    buckets = new ArrayList<>();
    for (int i = 0; i < newCapacity; i++) buckets.add(new ArrayList<>());
    size = 0;
    for (List<Pair> bucket : old) {
      for (Pair p : bucket) set(p.key, p.value);
    }
  }
}

// Usage
// TinyHashTable t = new TinyHashTable();
// t.set("alice", 30);
// t.set("bob", 28);
// System.out.println(t.get("alice")); // 30
// t.delete("bob");

Complexity recap

Average case: O(1) for get, set, delete.
Worst case: O(n) if many keys collide into the same bucket.
Space: O(n) to store n elements, plus overhead for buckets.

Practice ideas

Implement your own HashSet using the TinyHashTable above (store only keys, no values).
Measure performance as you vary capacity and load factor.
Compare lookups versus a sorted array + binary search from our Binary Search tutorial.