Database Introduction
Every application you use — from social media to banking to the notes app on your phone — stores data somewhere. That somewhere is a database. A database is an organized collection of data that software can efficiently read, write, and query. Understanding how databases work is a foundational skill for any developer, regardless of what you’re building.
In this tutorial, we’ll cover what databases are, why they exist, the major types you’ll encounter, and how to think about choosing one. No setup required — this is conceptual groundwork that’ll make everything else in this series click.
Why Not Just Use Files?
You could store data in plain files — JSON, CSV, text. For small scripts and personal projects, that works fine. But files fall apart when things get real:
- Concurrent access: What happens when two users try to update the same file at the same time? Without locking, the last write silently overwrites the other.
- Querying: Finding all users over age 25 in a JSON file means reading the entire file and looping through it. A database with an index on age can go straight to the matching rows.
- Data integrity: Files don’t enforce rules. Nothing stops you from saving a user without an email, or storing a string where a number should be.
- Scale: A 10GB JSON file is painful to work with. Databases are designed to handle terabytes efficiently.
Databases solve all of these problems with purpose-built engines for storage, retrieval, and data integrity.
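To make the querying point concrete, here is a small sketch that answers the same question both ways. It uses Python's built-in sqlite3 module with an in-memory database; the sample people and the variable names are made up for illustration.

```python
import json
import sqlite3

# Hypothetical sample data; names and ages are made up for illustration.
people = [
    {"name": "Alice", "age": 31},
    {"name": "Bob", "age": 22},
    {"name": "Charlie", "age": 45},
]

# File-style approach: parse the whole JSON payload, then loop over every record.
raw = json.dumps(people)  # stands in for the contents of a users.json file
over_25_from_file = [p["name"] for p in json.loads(raw) if p["age"] > 25]

# Database approach: describe the result you want and let the engine find it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT NOT NULL, age INTEGER NOT NULL)")
conn.executemany("INSERT INTO users (name, age) VALUES (:name, :age)", people)
rows = conn.execute("SELECT name FROM users WHERE age > ? ORDER BY name", (25,))
over_25_from_db = [name for (name,) in rows]

print(over_25_from_file)
print(over_25_from_db)
```

With three records the two approaches are equally fast. The difference shows up as the data grows, because the file version always has to read everything.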
Key Concepts
Before we look at database types, here are the core concepts that apply to all of them.
CRUD Operations
Almost everything you do with a database falls into four operations:
- Create — insert new data
- Read — retrieve existing data
- Update — modify existing data
- Delete — remove data
You’ll see this pattern everywhere, regardless of the database type.
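As a rough sketch, here are all four operations against a throwaway in-memory SQLite database. The notes table and its contents are hypothetical, chosen only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Create: insert a new row.
cur = conn.execute("INSERT INTO notes (body) VALUES (?)", ("buy milk",))
note_id = cur.lastrowid

# Read: fetch it back.
created = conn.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()

# Update: change the row in place.
conn.execute("UPDATE notes SET body = ? WHERE id = ?", ("buy oat milk", note_id))
updated = conn.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()

# Delete: remove the row.
conn.execute("DELETE FROM notes WHERE id = ?", (note_id,))
remaining = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

print(created, updated, remaining)
```

In SQL the four operations map to INSERT, SELECT, UPDATE, and DELETE. Other database types expose the same ideas under different names.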
Schemas
A schema defines the structure of your data — what fields exist, what types they are, and what constraints apply. Some databases enforce schemas strictly (relational databases), while others are more flexible (document databases).
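Here is a minimal sketch of a strictly enforced schema, again using SQLite from Python. The column names and constraints are assumptions picked for illustration; the point is that the database itself refuses rows that break the rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL UNIQUE,
        age   INTEGER CHECK (age >= 0)
    )
    """
)

insert_sql = "INSERT INTO users (name, email, age) VALUES (?, ?, ?)"
conn.execute(insert_sql, ("Alice", "a@example.com", 31))

rejected = []
bad_rows = [
    ("Bob", None, 22),                  # missing email violates NOT NULL
    ("Alice 2", "a@example.com", 40),   # duplicate email violates UNIQUE
    ("Charlie", "c@example.com", -5),   # negative age violates CHECK
]
for row in bad_rows:
    try:
        conn.execute(insert_sql, row)
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(user_count, rejected)
```

Only the first insert succeeds. Each of the three bad rows is rejected with an IntegrityError before it ever reaches the table.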
Indexes
An index is a data structure that speeds up queries. Without an index, the database has to scan every row to find what you’re looking for (a “full table scan”). With an index on the right column, it can jump directly to the matching rows.
Think of it like a book index — instead of reading every page to find “binary trees,” you look up the term in the index and go straight to page 142.
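You can watch an index change the database's strategy by asking SQLite for its query plan before and after creating one. This is a sketch with made-up data; the helper name query_plan and the index name idx_users_email are our own, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
params = ("user500@example.com",)

def query_plan(sql, args):
    """Return SQLite's human-readable plan text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description

plan_without_index = query_plan(query, params)
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(query, params)

print(plan_without_index)  # describes a scan of the whole table
print(plan_with_index)     # describes a search that uses idx_users_email
```

The query text never changes. Only the plan does, which is why adding an index is often the first thing to try when a query is slow.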
Transactions
A transaction groups multiple operations into a single unit that either fully succeeds or fully fails. The classic example is a bank transfer: debit one account and credit another. If the credit fails, the debit should be rolled back. Transactions guarantee this with four properties known as ACID:
- Atomicity — all operations succeed or none do
- Consistency — the database moves from one valid state to another
- Isolation — concurrent transactions don’t interfere with each other
- Durability — once committed, the data survives crashes
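The bank transfer can be sketched in a few lines with SQLite. The accounts table, the account names, and the transfer function are hypothetical; the part worth noticing is that a failed credit also undoes the debit that came before it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Debit src and credit dst as one all-or-nothing unit."""
    # `with conn:` commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        credited = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        if credited.rowcount == 0:
            raise ValueError(f"no such account: {dst}")

def balances():
    """Return the current balances as a plain dict."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

transfer("alice", "bob", 30)
after_success = balances()

try:
    transfer("alice", "nobody", 30)  # the credit fails, so the debit is undone
except ValueError:
    pass
after_failure = balances()

print(after_success, after_failure)
```

After the failed transfer, Alice's balance is exactly what it was after the successful one. That is atomicity in practice.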
Relational Databases (SQL)
Relational databases store data in tables with rows and columns, similar to a spreadsheet. Tables are linked by relationships — a column in one table references a column in another.
Here’s what a simple relational schema looks like:
users                            orders
┌────┬─────────┬──────────┐      ┌────┬─────────┬────────┬───────┐
│ id │ name    │ email    │      │ id │ user_id │ total  │ date  │
├────┼─────────┼──────────┤      ├────┼─────────┼────────┼───────┤
│ 1  │ Alice   │ a@ex.com │      │ 1  │ 1       │ 59.99  │ 04-01 │
│ 2  │ Bob     │ b@ex.com │      │ 2  │ 1       │ 34.00  │ 04-02 │
│ 3  │ Charlie │ c@ex.com │      │ 3  │ 2       │ 124.50 │ 04-03 │
└────┴─────────┴──────────┘      └────┴─────────┴────────┴───────┘
The user_id column in the orders table references the id column in the users table. This is a foreign key — it creates a relationship between the two tables.
You interact with relational databases using SQL (Structured Query Language):
-- Find all orders for Alice
SELECT orders.id, orders.total, orders.date
FROM orders
JOIN users ON orders.user_id = users.id
WHERE users.name = 'Alice';
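If you want to try the schema and query above end to end, the following sketch builds both tables in an in-memory SQLite database from Python and runs the same join. The column types are assumptions, since the diagram only shows names and values, and the ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id),
        total   REAL NOT NULL,
        date    TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'Alice', 'a@ex.com'), (2, 'Bob', 'b@ex.com'), (3, 'Charlie', 'c@ex.com');
    INSERT INTO orders VALUES
        (1, 1, 59.99, '04-01'), (2, 1, 34.00, '04-02'), (3, 2, 124.50, '04-03');
    """
)

alice_orders = conn.execute(
    """
    SELECT orders.id, orders.total, orders.date
    FROM orders
    JOIN users ON orders.user_id = users.id
    WHERE users.name = ?
    ORDER BY orders.id
    """,
    ("Alice",),
).fetchall()

# The foreign key rejects an order that points at a user who does not exist.
try:
    conn.execute("INSERT INTO orders VALUES (4, 99, 10.00, '04-04')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(alice_orders, orphan_rejected)
```

The join returns Alice's two orders, and the attempt to insert an order for the nonexistent user 99 fails, which is the foreign key doing its job.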
Popular relational databases:
- PostgreSQL — feature-rich, great for most applications
- MySQL — widely used, especially in web development
- SQLite — lightweight, runs as a single file (great for learning and small apps)
- SQL Server — Microsoft’s enterprise database
NoSQL Databases
“NoSQL” is a broad category covering any database that doesn’t use the traditional table-and-SQL model. There are several types, each designed for different use cases.
Document Databases
Store data as flexible JSON-like documents. Each document can have a different structure, which makes them good for data that doesn’t fit neatly into rows and columns.
{
  "_id": "user_001",
  "name": "Alice",
  "email": "a@example.com",
  "orders": [
    { "total": 59.99, "date": "2026-04-01" },
    { "total": 34.00, "date": "2026-04-02" }
  ]
}
Notice how the orders are nested inside the user document — no separate table, no joins. This is great for read-heavy workloads where you usually need the user and their orders together.
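The sketch below works with that document as a plain Python dictionary, which is roughly what a document database driver hands back to your code. No real database is involved here, and the second document is a made-up example of a differently shaped record.

```python
import json

# The document from above, as it might come back from a document database.
user_doc = json.loads(
    """
    {
      "_id": "user_001",
      "name": "Alice",
      "email": "a@example.com",
      "orders": [
        { "total": 59.99, "date": "2026-04-01" },
        { "total": 34.00, "date": "2026-04-02" }
      ]
    }
    """
)

# One lookup returns the user and their orders together; no join is needed.
order_dates = [order["date"] for order in user_doc["orders"]]
order_total = round(sum(order["total"] for order in user_doc["orders"]), 2)

# Documents are schema-flexible: another user can carry fields this one lacks.
other_doc = {"_id": "user_002", "name": "Bob", "newsletter": True}
has_orders = "orders" in other_doc

print(order_dates, order_total, has_orders)
```

The flexibility cuts both ways: because nothing forces every document to have an orders field, your code has to check for it.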
Popular: MongoDB, CouchDB
Key-Value Stores
The simplest model — every piece of data is stored as a key-value pair. Extremely fast for lookups by key, but limited for complex queries.
session:abc123 → { userId: 1, expires: "2026-04-14T12:00:00Z" }
cache:homepage → "<html>...</html>"
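To get a feel for the model, here is a toy in-memory key-value store with optional expiry. It is an illustration only, not how Redis or DynamoDB are implemented, and the class and method names are our own.

```python
import time

class TinyKeyValueStore:
    """In-memory key-value store with optional per-key expiry (illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        value, expires_at = self._data.get(key, (default, None))
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value

store = TinyKeyValueStore()
store.set("session:abc123", {"userId": 1}, ttl_seconds=3600)
store.set("cache:homepage", "<html>...</html>")

session = store.get("session:abc123")
missing = store.get("session:does-not-exist")

store.set("temp", "x", ttl_seconds=0)  # expires immediately
expired = store.get("temp")

print(session, missing, expired)
```

Everything goes through the key. There is no way to ask for "all sessions belonging to user 1" without scanning every entry, which is the trade-off this model makes for speed.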
Popular: Redis, DynamoDB
Graph Databases
Designed for data with complex relationships — social networks, recommendation engines, fraud detection. Data is stored as nodes (entities) and edges (relationships).
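A quick way to picture nodes and edges is a plain adjacency map. The sketch below uses made-up people and a hypothetical friends_of_friends helper to answer a typical graph question; a real graph database would express this in its own query language and handle far larger graphs.

```python
# Nodes are people; edges are "follows" relationships (hypothetical sample data).
follows = {
    "alice": {"bob", "charlie"},
    "bob": {"dana"},
    "charlie": {"dana", "erin"},
    "dana": set(),
    "erin": set(),
}

def friends_of_friends(graph, start):
    """People two hops away from start, excluding start and direct connections."""
    direct = graph.get(start, set())
    two_hops = set()
    for person in direct:
        two_hops |= graph.get(person, set())
    return two_hops - direct - {start}

suggestions = sorted(friends_of_friends(follows, "alice"))
print(suggestions)
```

In a relational database the same question needs a self-join for every hop, which is why relationship-heavy queries are where graph databases tend to earn their keep.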
Popular: Neo4j, Amazon Neptune
Wide-Column Stores
Store data in rows identified by a key, with columns grouped into column families. Each row can hold a different set of columns. Optimized for high write throughput and key-based reads across very large, distributed datasets.
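As a loose mental model, you can picture a wide-column table as nested maps: row key, then column family, then individual columns. The sketch below uses hypothetical sensor data and says nothing about how Cassandra or HBase lay data out on disk.

```python
# row key -> column family -> column -> value (hypothetical sensor data)
table = {
    "sensor-1": {
        "meta": {"location": "roof"},
        "readings": {"2026-04-01T00:00": 12.5, "2026-04-01T01:00": 12.1},
    },
    "sensor-2": {
        "meta": {"location": "basement", "model": "T-100"},  # an extra column
        "readings": {"2026-04-01T00:00": 18.0},
    },
}

def read_family(row_key, family):
    """Fetch one column family for one row, leaving the other families untouched."""
    return table[row_key][family]

sensor_1_readings = read_family("sensor-1", "readings")
sensor_2_columns = sorted(table["sensor-2"]["meta"])

print(sensor_1_readings, sensor_2_columns)
```

Note that the two rows do not share the same set of columns, and that a read names the row key and the family it wants rather than pulling the whole row.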
Popular: Cassandra, HBase
SQL vs NoSQL: How to Choose
There’s no universal winner. The right choice depends on your data and how you’ll use it.
| Factor | SQL | NoSQL |
|---|---|---|
| Data structure | Fixed schema, structured | Flexible, varied |
| Relationships | Strong (joins, foreign keys) | Usually embedded or handled in application code; graph databases are the exception |
| Consistency | Strong (ACID) | Varies (some offer eventual consistency) |
| Scaling | Usually vertical (bigger server); read replicas and sharding are possible but add complexity | Usually horizontal (more servers) |
| Query language | SQL (standardized) | Varies by database |
| Best for | Transactions, complex queries, structured data | High throughput, flexible schemas, real-time apps |
In practice, many applications use both. A common pattern is a relational database for core business data (users, orders, payments) and Redis for caching and sessions.
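That pattern is often called cache-aside. Here is a minimal sketch of the read path, with a Python dictionary standing in for Redis and SQLite standing in for the relational database. The function name, key format, and stats counter are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")

cache = {}                # stands in for Redis in this sketch
stats = {"db_reads": 0}   # counts how often we fall through to the database

def get_user_name(user_id):
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    key = f"user:{user_id}:name"
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[key] = name
    return name

first = get_user_name(1)   # cache miss: reads the database
second = get_user_name(1)  # cache hit: the database is not touched

print(first, second, stats["db_reads"])
```

A real setup also needs a plan for stale entries, typically an expiry time on each key or an explicit delete whenever the underlying row changes.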
How Databases Fit Into an Application
In a typical web application, the database sits behind the backend server. The flow looks like this:
Browser → HTTP Request  → Backend Server → Database Query → Database
Browser ← HTTP Response ← Backend Server ← Query Results  ← Database
The backend server (built with Node.js, Python, Java, etc.) connects to the database, runs queries, and sends the results back to the client. The browser never talks to the database directly — that would be a massive security risk.
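The backend's role in that flow can be sketched as a single function: take something from the request, run a query, and shape the result into a response. This is not tied to any web framework; the handle_get_user function, the /users/ path, and the status codes chosen are assumptions for illustration.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
)
db.execute("INSERT INTO users VALUES (1, 'Alice', 'a@ex.com')")

def handle_get_user(path):
    """Turn a request path like '/users/1' into an (HTTP status, JSON body) pair."""
    prefix = "/users/"
    user_id = path[len(prefix):] if path.startswith(prefix) else ""
    if not user_id.isdecimal() or len(user_id) > 9:
        return 400, json.dumps({"error": "bad request"})
    # The client's input is passed as a parameter, never pasted into the SQL string.
    row = db.execute(
        "SELECT id, name, email FROM users WHERE id = ?", (int(user_id),)
    ).fetchone()
    if row is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps({"id": row[0], "name": row[1], "email": row[2]})

ok_status, ok_body = handle_get_user("/users/1")
missing_status, _ = handle_get_user("/users/42")
bad_status, _ = handle_get_user("/users/1; DROP TABLE users")

print(ok_status, ok_body, missing_status, bad_status)
```

Because the server decides which queries run, the client only ever sees the narrow slice of data the handler chooses to return.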
Most backend frameworks provide libraries or ORMs (Object-Relational Mappers) that make database interaction easier:
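// Example using node-postgres ("pg"), one option among many; connection settings come from PG* environment variables
import pg from "pg";
const pool = new pg.Pool();
const { rows: users } = await pool.query("SELECT * FROM users WHERE age > $1", [25]);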
What’s Next
Now that you understand what databases are and the major types, the next step is learning SQL — the language that powers relational databases. Head to SQL Basics to start writing queries.