#genai
What is Generative AI?
You’ve probably used ChatGPT, asked Copilot to write some code, or generated an image with DALL-E. But what’s actually happening under the hood? In this tutorial, we’ll break down what Generative AI is, how it differs from traditional AI, and why it matters for developers.
AI, Machine Learning, and Generative AI
Before diving into Generative AI specifically, it helps to understand where it sits in the broader AI landscape. Artificial Intelligence (AI) is the broad field of building systems that can perform tasks that typically require human intelligence — things like recognizing speech, making decisions, or translating languages. Read more →
March 28, 2026
How LLMs Work
In What is Generative AI?, we covered the big picture of GenAI and where Large Language Models fit in. Now let’s go a level deeper. How does an LLM actually generate text? What’s happening when ChatGPT or Claude “thinks” about your question? You don’t need a PhD in machine learning to understand this. We’ll walk through the core ideas — transformers, attention, and next-token prediction — with enough depth to make you a more effective developer when working with these models. Read more →
March 28, 2026
Tokens, Context Windows & Model Parameters
When you work with LLM APIs, three concepts come up constantly: tokens, context windows, and model parameters like temperature. These aren’t abstract theory — they directly affect your costs, the quality of responses, and what you can build. This tutorial covers all three with practical examples. If you haven’t already, read How LLMs Work first for the underlying architecture.
Tokens
LLMs don’t read text the way humans do. They break input into tokens — chunks that are roughly word fragments. Read more →
March 28, 2026
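The teaser above notes that tokens directly affect cost. A quick sketch of the idea, using the common rule of thumb that English text averages roughly four characters per token; the per-1K price below is a made-up placeholder, not any provider's real rate:

```python
# Rough token/cost estimator. The 4-chars-per-token ratio is a common
# heuristic for English, not an exact tokenizer, and the price is a
# hypothetical placeholder.

def estimate_tokens(text: str) -> int:
    """Approximate token count: ~4 characters per token."""
    return max(1, len(text) // 4)

def estimate_cost(prompt: str, price_per_1k_tokens: float = 0.01) -> float:
    """Estimate input cost for a prompt at a hypothetical rate."""
    return estimate_tokens(prompt) / 1000 * price_per_1k_tokens

prompt = "Summarize the following article in three bullet points."
print(estimate_tokens(prompt))  # 13
```

For exact counts you would use the provider's tokenizer; the heuristic is only good enough for ballpark budgeting.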
Embeddings & Vector Search
So far in this series, we’ve focused on LLMs that generate text. But there’s another fundamental capability that powers many AI applications: embeddings. Embeddings let you convert text into numbers that capture meaning, making it possible to search, compare, and cluster content based on what it means rather than what words it contains. This is the technology behind semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) — which we’ll cover later in this series. Read more →
March 28, 2026
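The core operation behind the semantic search described above is comparing vectors. A toy illustration with hand-made 3-D vectors (real embeddings come from a model API and have hundreds or thousands of dimensions):

```python
# Toy vector-similarity demo. The 3-D vectors are invented for
# illustration; real embedding vectors are produced by a model.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
invoice = [0.0, 0.2, 0.95]

# Semantically related texts end up with nearby vectors.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```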
Fine-Tuning vs RAG vs Prompt Engineering
You’ve built a prototype with an LLM and it works pretty well, but the model doesn’t know about your company’s products, it sometimes gets the tone wrong, or it hallucinates facts about your domain. How do you fix that? There are three main approaches to customizing LLM behavior: prompt engineering, RAG (Retrieval-Augmented Generation), and fine-tuning. Each solves different problems, and choosing the wrong one wastes time and money. This tutorial breaks down when to use each. Read more →
March 28, 2026
Prompt Engineering Fundamentals
Prompt engineering is the practice of crafting inputs to get better outputs from LLMs. It’s not magic — it’s about understanding how these models process text and structuring your requests accordingly. A well-written prompt can be the difference between a vague, unhelpful response and exactly what you need. This tutorial covers the foundational techniques. If you’re new to GenAI, start with What is Generative AI? and How LLMs Work first — understanding how models predict tokens will make these techniques more intuitive. Read more →
March 28, 2026
Zero-Shot, Few-Shot & Chain-of-Thought Prompting
In Prompt Engineering Fundamentals, we covered the basics of writing effective prompts. Now let’s look at three specific techniques that can dramatically improve the quality and accuracy of LLM responses: zero-shot, few-shot, and chain-of-thought prompting. These aren’t just academic concepts — they’re practical tools you’ll use every day when working with LLMs, whether you’re building applications or just getting better answers from a chat interface.
Zero-Shot Prompting
Zero-shot means asking the model to perform a task without giving it any examples. Read more →
March 28, 2026
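The zero-shot vs few-shot distinction above can be sketched as two message lists. The chat-message shape mirrors common chat-completion APIs; the sentiment task and example texts are invented for illustration:

```python
# Zero-shot: just the task. Few-shot: worked examples first, which
# teach the model the exact output format you expect.

zero_shot = [
    {"role": "user", "content": "Classify the sentiment of: 'The update broke my build.'"},
]

few_shot = [
    {"role": "user", "content": "Classify sentiment: 'I love this library!'"},
    {"role": "assistant", "content": "positive"},
    {"role": "user", "content": "Classify sentiment: 'Docs are outdated and confusing.'"},
    {"role": "assistant", "content": "negative"},
    {"role": "user", "content": "Classify sentiment: 'The update broke my build.'"},
]

print(len(zero_shot), len(few_shot))  # 1 5
```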
System Prompts & Role Design
When you build an application on top of an LLM, you don’t want to repeat the same instructions in every user message. That’s what system prompts are for. A system prompt is a set of instructions that defines how the model should behave across an entire conversation — its role, tone, constraints, and output format. This tutorial covers how system prompts work, how to design effective ones, and common patterns you’ll use when building LLM-powered applications. Read more →
March 28, 2026
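The pattern described above, one set of instructions applied to every turn, looks roughly like this. The product name and helper function are hypothetical; only the message list is assembled here, sending it to a model is left out:

```python
# Minimal system-prompt sketch. "AcmeDB" is a made-up product name.
SYSTEM_PROMPT = (
    "You are a support assistant for a hypothetical product called AcmeDB. "
    "Answer concisely, in plain English, and refuse questions unrelated to AcmeDB."
)

def build_messages(history: list[dict], user_input: str) -> list[dict]:
    """Prepend the system prompt once; only the user messages vary per turn."""
    return [{"role": "system", "content": SYSTEM_PROMPT}, *history,
            {"role": "user", "content": user_input}]

msgs = build_messages([], "How do I create an index?")
print(msgs[0]["role"])  # system
```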
Structured Output & JSON Mode
When you’re building applications with LLMs, you usually need the output in a specific format — JSON, CSV, a particular schema. But LLMs are text generators by nature, and they don’t always cooperate. They might add explanatory text around your JSON, produce invalid syntax, or miss required fields. This tutorial covers the techniques and tools for getting reliable structured output from LLMs. You should be familiar with Prompt Engineering Fundamentals and System Prompts & Role Design before reading this. Read more →
March 28, 2026
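One defensive technique for the "explanatory text around your JSON" problem above is to strip the response down to the outermost braces before parsing. This is a common workaround, not any provider's official API:

```python
# Defensive JSON extraction: models sometimes wrap JSON in prose or
# code fences, so locate the first { and last } and parse that span.
import json

def extract_json(raw: str) -> dict:
    """Pull the first {...} object out of a model response and parse it."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON object found in response")
    return json.loads(raw[start:end + 1])

reply = 'Sure! Here is the data:\n```json\n{"name": "Ada", "role": "engineer"}\n```'
print(extract_json(reply)["name"])  # Ada
```

Validating the parsed object against a schema (required fields, types) is the natural next step once parsing succeeds.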
Calling LLM APIs with JavaScript
Time to write some code. In this tutorial, we’ll go from zero to making LLM API calls in JavaScript — setting up a project, making your first completion request, managing conversations, handling errors, and building a simple interactive chatbot. We’ll use the OpenAI API as the primary example since it’s the most widely used, but the patterns apply to any LLM provider. You should be familiar with the concepts from What is Generative AI? Read more →
March 28, 2026
Calling LLM APIs with Python
This tutorial covers calling LLM APIs using Python — the same concepts from Calling LLM APIs with JavaScript, but with Python’s SDK and idioms. If you’ve already read the JavaScript version, this will feel familiar. If Python is your primary language, start here. You should be familiar with What is Generative AI? and Tokens, Context Windows & Model Parameters.
Setup
Prerequisites
Python 3.9+
An OpenAI API key (sign up at platform. Read more →
March 28, 2026
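The "managing conversations" part of these API tutorials boils down to appending each turn to a running message list. A sketch with the actual network call stubbed out so it runs offline; swap `fake_complete` for a real client call in practice:

```python
# Conversation-management pattern with the API call stubbed out.
# `fake_complete` is a stand-in, not a real SDK function.

def fake_complete(messages: list[dict]) -> str:
    """Stand-in for a chat-completion call; echoes the last user message."""
    return f"You said: {messages[-1]['content']}"

history: list[dict] = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_input: str) -> str:
    history.append({"role": "user", "content": user_input})
    reply = fake_complete(history)
    history.append({"role": "assistant", "content": reply})  # keep context for next turn
    return reply

print(chat("Hello!"))  # You said: Hello!
print(len(history))    # 3
```

Because the full history is resent on every call, long conversations grow in token cost, which is where the context-window tutorial above becomes relevant.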
Streaming Responses
When you make a standard LLM API call, you wait for the entire response to be generated before you see anything. For short answers that’s fine, but for longer responses the user stares at a blank screen for seconds. Streaming fixes this by sending tokens to the client as they’re generated, creating the “typing” effect you see in ChatGPT and other AI chat interfaces. This tutorial covers streaming in depth — how it works under the hood, implementation in both JavaScript and Python, and how to integrate streaming into web applications. Read more →
March 28, 2026
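The "typing" effect described above comes from rendering each chunk as it arrives while also accumulating the full transcript. A sketch with a generator standing in for the real network stream (real SDKs yield chunks from an HTTP response):

```python
# Streaming consumption pattern. `fake_stream` simulates a token
# stream; a real one would come from an SDK's streaming response.

def fake_stream(text: str):
    """Yield the response a few characters at a time, like an SSE stream."""
    for i in range(0, len(text), 4):
        yield text[i:i + 4]

buffer = []
for chunk in fake_stream("Streaming shows tokens as they arrive."):
    buffer.append(chunk)              # accumulate for the final transcript
    print(chunk, end="", flush=True)  # render immediately: the typing effect
print()
full = "".join(buffer)
```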
Error Handling & Rate Limits
LLM API calls fail. Servers go down, rate limits get hit, tokens exceed context windows, and networks time out. If your application doesn’t handle these failures gracefully, your users get cryptic errors or broken experiences. This tutorial covers the common failure modes, how to detect them, and how to build retry logic that keeps your application running. You should have read Calling LLM APIs with JavaScript or Calling LLM APIs with Python first. Read more →
March 28, 2026
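The standard shape of the retry logic mentioned above is exponential backoff with jitter. A generic sketch; which exceptions count as retryable (429s, timeouts) depends on your SDK, and here any exception is retried for simplicity:

```python
# Retry with exponential backoff plus jitter. The flaky function below
# simulates a rate-limited endpoint that succeeds on the third attempt.
import random
import time

def with_retries(fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Call fn, retrying failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")  # simulated 429
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```

The jitter term keeps many clients from retrying in lockstep after a shared outage.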
Comparing LLM Providers
There are now dozens of LLM providers and hundreds of models to choose from. This tutorial cuts through the noise and compares the major options — what they’re good at, how they differ, and how to decide which one to use for your project. This isn’t an exhaustive benchmark. Models improve constantly, and today’s rankings will shift. Instead, we’ll focus on the factors that matter when making practical decisions.
The Major Providers
OpenAI
The company behind GPT-4, GPT-4o, and the model that started the GenAI wave. Read more →
March 28, 2026
Introduction to RAG
LLMs are trained on public data up to a cutoff date. They don’t know about your company’s documentation, your product’s API, or the email you received this morning. RAG (Retrieval-Augmented Generation) solves this by fetching relevant information at query time and feeding it to the model as context. RAG is the most important pattern in applied GenAI. It’s how you build chatbots that answer questions about your docs, search engines that understand intent, and assistants that stay grounded in facts instead of hallucinating. Read more →
March 28, 2026
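The fetch-then-augment loop described above can be shown end to end with a toy retriever. Word overlap stands in for the embedding search here, and the documents are invented; production RAG uses embeddings and a vector store:

```python
# Bare-bones RAG sketch: score documents by word overlap with the
# question, then paste the best match into the prompt as context.

docs = [
    "Our API rate limit is 100 requests per minute per key.",
    "Refunds are processed within 5 business days.",
    "The SDK supports Python 3.9 and later.",
]

def retrieve(question: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

question = "What is the API rate limit?"
context = retrieve(question, docs)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(context)  # Our API rate limit is 100 requests per minute per key.
```

The "using only this context" instruction is what keeps the model grounded in the retrieved facts rather than its training data.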
Building a RAG Pipeline
In Introduction to RAG, we built a minimal RAG system with an in-memory store and simple documents. Now let’s build something closer to production: loading real documents, chunking them intelligently, using a vector database, and evaluating the results. We’ll build a documentation Q&A system — the most common RAG use case — using JavaScript, OpenAI, and Chroma as our vector database.
Architecture
Markdown files → Chunker → Embeddings → ChromaDB
↓
User question → Embedding → Vector search → Top chunks → LLM → Answer
Setup
mkdir rag-pipeline && cd rag-pipeline
npm init -y
npm install openai chromadb
Add "type": "module" to package. Read more →
March 28, 2026
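The chunking step in the pipeline above is often a fixed-size window with overlap. The tutorial's pipeline is in JavaScript; here is the idea sketched in Python for illustration, with sizes in characters for simplicity (real pipelines often chunk by tokens or by markdown structure):

```python
# Fixed-size chunker with overlap: ideas spanning a chunk boundary
# appear intact in at least one chunk.

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of `size` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

doc = "x" * 500
print([len(c) for c in chunk_text(doc)])  # [200, 200, 200, 50]
```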
Generative AI Tutorials
Learn Generative AI from the ground up. Understand LLMs, prompt engineering, RAG, AI agents, and how to build AI-powered applications. Read more →
March 28, 2026