I built RAG search on a headless CMS. The CMS did more work than the vector database.
Semantic search is live on fieldnotes-ai.com. Voyage AI, Supabase pgvector, Claude Sonnet. But the interesting part is what Contentful contributed.
Click ASK ✦ in the nav, or press Cmd+K. Ask anything about this site in plain English. It finds the relevant field notes and streams back a cited answer.
The stack: Voyage AI voyage-3.5-lite for embeddings, Supabase pgvector for vector storage, Claude Sonnet 4.6 for synthesis and streaming. 108 chunks across 18 field notes.
Most RAG pipelines start with a scraper. This one starts with an API call.
Most people building RAG on a content site crawl the published pages, strip the nav and footer, parse the HTML, and try to recover structure that rendering already destroyed. It works. But you are always fighting the content.
This pipeline never touches the webpage.
The embed script calls the Contentful Delivery API directly and gets back structured JSON. The Rich Text body is a typed document tree with explicit node types: heading-2, paragraph, hyperlink. The chunking logic splits on heading-2 nodes. Not <h2> tags. Not regex patterns. Not heuristics. The document tells you exactly where the sections are.
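Here is a minimal sketch of that walk, not the production script. The node shape (nodeType, content, and value on text nodes) follows Contentful's Rich Text JSON. The local RichTextNode and Section types, the function names, and the "Introduction" label for content before the first heading are my own assumptions for illustration.

```typescript
// Minimal local types mirroring the shape of Contentful Rich Text JSON.
interface RichTextNode {
  nodeType: string;
  value?: string; // present on "text" nodes
  content?: RichTextNode[]; // present on block and inline nodes
}

interface Section {
  section: string; // heading-2 text, or "Introduction" (assumed label) before the first heading
  text: string;
}

// Recursively flatten a node to plain text.
function plainText(node: RichTextNode): string {
  if (node.nodeType === "text") return node.value ?? "";
  const children = node.content ?? [];
  // Paragraphs, headings, and hyperlinks hold inline children: join them directly.
  // Containers such as lists hold block children: separate those with newlines.
  const inlineParent =
    node.nodeType === "paragraph" ||
    node.nodeType === "hyperlink" ||
    node.nodeType.startsWith("heading-");
  return children.map(plainText).join(inlineParent ? "" : "\n");
}

// Start a new section at every heading-2; everything else (including
// heading-3) is appended to the section that is currently open.
function splitAtHeading2(doc: RichTextNode): Section[] {
  const sections: Section[] = [];
  let current: Section = { section: "Introduction", text: "" };
  for (const block of doc.content ?? []) {
    if (block.nodeType === "heading-2") {
      // Sections with no body text are dropped rather than emitted empty.
      if (current.text.trim()) sections.push(current);
      current = { section: plainText(block).trim(), text: "" };
      continue;
    }
    const text = plainText(block).trim();
    if (text) current.text += (current.text ? "\n\n" : "") + text;
  }
  if (current.text.trim()) sections.push(current);
  return sections;
}
```

The only decision the code makes is "is this block a heading-2?". Everything else is plain concatenation, which is the point: the structure comes from the document, not from the parser.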
Contentful Delivery API → typed Rich Text JSON → walk nodes, split at heading-2 → clean text chunks with slug, title, section metadata → Voyage AI voyage-3.5-lite → 1024-dim embeddings in Supabase pgvector → Claude Sonnet 4.6 streams a cited answer
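The "clean text chunks with slug, title, section metadata" step could look something like the sketch below. The ChunkRecord shape, the id format, and the choice to prefix the title and section name onto the text sent for embedding are all assumptions for illustration, not a description of the live schema.

```typescript
interface Section {
  section: string;
  text: string;
}

// Hypothetical record shape: one row per chunk, ready to embed and store.
interface ChunkRecord {
  id: string; // assumed stable key: slug plus chunk index
  slug: string;
  title: string;
  section: string;
  content: string; // the text returned to the reader as a citation
  embeddingInput: string; // the text sent to the embedding model
}

function toChunkRecords(slug: string, title: string, sections: Section[]): ChunkRecord[] {
  return sections.map((s, i) => ({
    id: `${slug}#${i}`,
    slug,
    title,
    section: s.section,
    content: s.text,
    // Assumption: putting the title and section name in front of the body
    // gives the embedding model some context for short chunks.
    embeddingInput: `${title}\n${s.section}\n\n${s.text}`,
  }));
}
```

Keeping slug and section on every row is what lets the answer cite a specific field note and section instead of an anonymous blob of text.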
The webpage is one rendering of the content. The search index is another.
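There is a timing property worth noting. The embedding pipeline is completely independent of deployment. Contentful is the source of truth. The webpage only exists after Vercel builds and deploys. The search index does not care about Vercel at all. You can update the index the moment an entry is published, without waiting for a build. Point the same script at the Contentful Preview API, which returns unpublished entries where the Delivery API returns only published ones, and you could even index a draft before anyone hits publish.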
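To make the draft-versus-published distinction concrete: Contentful serves published entries from cdn.contentful.com and drafts from preview.contentful.com, each with its own access token. The sketch below only builds the request URL. The function name, the space ID, and the fieldNote content type are hypothetical placeholders.

```typescript
type ContentfulApi = "delivery" | "preview";

// Delivery returns published entries only; Preview also returns drafts.
const HOSTS: Record<ContentfulApi, string> = {
  delivery: "cdn.contentful.com",
  preview: "preview.contentful.com",
};

// Build the entries URL for one content type.
// The access token is not part of the URL here; send it in an
// "Authorization: Bearer <token>" header, using the token that matches the API.
function entriesUrl(
  api: ContentfulApi,
  spaceId: string,
  environment: string,
  contentType: string,
): string {
  const base = `https://${HOSTS[api]}/spaces/${encodeURIComponent(spaceId)}`;
  const path = `/environments/${encodeURIComponent(environment)}/entries`;
  return `${base}${path}?content_type=${encodeURIComponent(contentType)}`;
}

// Example with placeholder values:
// entriesUrl("preview", "space123", "master", "fieldNote")
```

Everything downstream of the fetch stays the same, because both APIs return the same entry and Rich Text structure.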
This is what headless CMS architecture actually means for AI pipelines. The "write once, publish everywhere" framing undersells it. The more useful framing: your content exists as structured data in an API, and every consumer, whether a browser, a mobile app, or an embedding pipeline, talks to that API directly. Nobody is scraping someone else's rendered output.
If your content lives in a structured CMS, go to the API first.
The structure you need for clean chunking is already there. You do not need to reverse-engineer it from HTML. You just need to call the right endpoint.
For Contentful specifically: the Rich Text response gives you a document tree you can walk node by node. Split at heading-2. Keep H3s inside their parent chunk. Merge anything under 50 words with the next section. That is the entire chunking logic. It took about 30 lines of TypeScript.
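The merge rule can be sketched like this. The 50-word threshold comes from the rule above. The function names are my own, and two details are assumptions the rule does not settle: the merged chunk keeps the first section's title, and a short final section with nothing after it stays as its own chunk.

```typescript
interface Section {
  section: string;
  text: string;
}

const MIN_WORDS = 50;

function wordCount(text: string): number {
  const trimmed = text.trim();
  return trimmed ? trimmed.split(/\s+/).length : 0;
}

// Fold any section under minWords into the section that follows it.
function mergeShortSections(sections: Section[], minWords: number = MIN_WORDS): Section[] {
  const merged: Section[] = [];
  let carry: Section | null = null; // short section waiting for the next one

  for (const s of sections) {
    // Assumption: the merged chunk keeps the earlier title, and the later
    // heading is kept inline so its wording still reaches the embedding.
    const current: Section = carry
      ? { section: carry.section, text: `${carry.text}\n\n${s.section}\n\n${s.text}` }
      : s;

    if (wordCount(current.text) < minWords) {
      carry = current; // still too short, keep merging forward
    } else {
      merged.push(current);
      carry = null;
    }
  }

  // Assumption: a short final section has no next section, so keep it as is.
  if (carry) merged.push(carry);
  return merged;
}
```

Run it on the output of the heading-2 split and the result is the list of chunks to embed.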
The ✦ button in the nav is the visible part. Contentful is why it works cleanly.