#embeddings

Embeddings & Vector Search

So far in this series, we’ve focused on LLMs that generate text. But there’s another fundamental capability that powers many AI applications: embeddings. Embeddings let you convert text into numbers that capture meaning, making it possible to search, compare, and cluster content based on what it means rather than what words it contains. This is the technology behind semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) — which we’ll cover later in this series. Read more →

March 28, 2026
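The comparison the excerpt describes is typically done with cosine similarity between embedding vectors. A minimal sketch, using tiny made-up 3-dimensional vectors in place of real model embeddings (which have hundreds or thousands of dimensions):

```javascript
// Cosine similarity: the standard way to compare two embedding vectors.
// The vectors below are illustrative stand-ins, not real model output.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Texts with similar meaning get nearby vectors, so their similarity is higher.
const cat = [0.9, 0.1, 0.2];
const kitten = [0.85, 0.15, 0.25];
const invoice = [0.1, 0.9, 0.7];

console.log(cosineSimilarity(cat, kitten) > cosineSimilarity(cat, invoice)); // true
```

Semantic search is this operation at scale: embed every document once, embed the query at search time, and return the documents whose vectors score highest.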

Introduction to RAG

LLMs are trained on public data up to a cutoff date. They don’t know about your company’s documentation, your product’s API, or the email you received this morning. RAG (Retrieval-Augmented Generation) solves this by fetching relevant information at query time and feeding it to the model as context. RAG is the most important pattern in applied GenAI. It’s how you build chatbots that answer questions about your docs, search engines that understand intent, and assistants that stay grounded in facts instead of hallucinating. Read more →

March 28, 2026
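The retrieve-then-generate shape the excerpt describes can be sketched in a few lines. This toy version scores documents by word overlap instead of embeddings, and stops at prompt construction rather than calling a model; the documents and question are invented for illustration:

```javascript
// Minimal RAG shape: retrieve relevant context, then prepend it to the prompt.
// Word-overlap scoring stands in for real embedding-based vector search.
const docs = [
  "Our API rate limit is 100 requests per minute.",
  "Refunds are processed within 5 business days.",
];

function retrieve(question, documents, topK = 1) {
  const qWords = new Set(question.toLowerCase().split(/\W+/));
  return documents
    .map((text) => ({
      text,
      score: text.toLowerCase().split(/\W+/).filter((w) => qWords.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((d) => d.text);
}

function buildPrompt(question, documents) {
  const context = retrieve(question, documents).join("\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
}

console.log(buildPrompt("What is the API rate limit?", docs));
```

In a real system the prompt would go to an LLM, which is how the answer stays grounded in retrieved facts rather than the model's training data.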

Building a RAG Pipeline

In Introduction to RAG, we built a minimal RAG system with an in-memory store and simple documents. Now let’s build something closer to production: loading real documents, chunking them intelligently, using a vector database, and evaluating the results. We’ll build a documentation Q&A system — the most common RAG use case — using JavaScript, OpenAI, and Chroma as our vector database.

Architecture:

Markdown files → Chunker → Embeddings → ChromaDB
User question → Embedding → Vector search → Top chunks → LLM → Answer

Setup:

mkdir rag-pipeline && cd rag-pipeline
npm init -y
npm install openai chromadb

Add "type": "module" to package. Read more →

March 28, 2026
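The "chunking them intelligently" step above can be sketched with a naive fixed-size chunker with overlap, a common starting point before smarter splitting by heading or paragraph. The sizes here are illustrative defaults, not values from the article:

```javascript
// Split text into overlapping fixed-size chunks. Overlap keeps sentences
// that straddle a boundary visible in both neighboring chunks.
function chunkText(text, chunkSize = 200, overlap = 50) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored in ChromaDB, so that vector search returns passages small enough to fit several into the LLM's context window.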
