Prequel AI Compliance Discovery

Built a multi-agent LLM system with RAG achieving 100% recall on critical requirements—from whiteboard to working prototype.

Overview

Three years of building healthcare products at Dentsply Sirona taught me something about compliance: even with a fully staffed Quality Assurance & Regulatory Affairs team, we as builders didn't always have what we needed. From navigating FDA 510(k) validation for the AI Aligner Fit Scan to shipping through Dentsply Sirona's regulatory pipeline, each launch surfaced gaps. The QARA team's expertise had blind spots, especially around AI, where the regulatory landscape was evolving faster than anyone could keep up.

There were times when regulatory would flag a requirement mid-build: "if you want to build it out like this, you'll need to do xyz for compliance." Those things could take months, pushing back a launch we'd already committed to.

That experience was my discovery phase. I wasn't running formal user research yet, but I was living the problem every sprint.

Prequel is the product I built to solve that. It's an AI-powered compliance co-pilot that uses a multi-agent LLM system with RAG to surface regulatory and documentation requirements in plain English, before launch. I led the discovery, defined the product strategy, and built (okay, vibe-coded) the MVP.

THE INSIGHT

Most Compliance Questions Are Routine

Builders don't need a $500/hour consultant to learn that their telehealth product triggers HIPAA and state licensing requirements. They need that information surfaced proactively, at the moment they're scoping the build—not weeks before launch when it becomes a blocker.

THE PROBLEM

Blindsided by Compliance

Builders in regulated industries routinely get blindsided by compliance requirements weeks before launch. The regulatory language is dense, requirements vary by market, and there's no systematic way to discover what you need until it blocks you.

The typical fallback is a regulatory consultant at $250-500/hour. That makes sense for complex edge cases. But for the 80% of questions that are routine? Teams are overpaying for information a well-informed system could surface on its own.

And the problem is growing. AI is lowering the barrier to building healthcare products, which means more builders are entering regulated spaces without regulatory backgrounds. They need a way to close the knowledge gap early—before they've committed to an architecture or promised a launch date.

"Byte paid between $25,000 and $30,000 for an FDA compliance contractor. That's a substantial business expense, especially for early-stage companies. That's where there's a market need."
— Sienna, PM at Daybreak

PRODUCT STRATEGY

What I Built For vs. What I Didn't

Existing compliance tools are either enterprise monitoring platforms (Norm AI, Regology) that track when regulations change, or consulting firms (Emergo, BSI) that charge premium rates for expert guidance.

I chose to build for neither use case. Both serve ongoing compliance. Prequel is purpose-built for the product launch moment—the point when a builder needs to know "what do I need before I ship this?" That's a different job entirely, and no one was solving it.

Why startups, not enterprise

Enterprise companies have internal regulatory teams. Regulated startups (Series A-C) in digital health, fintech, and medical devices are moving fast, launching across markets, and often don't have a regulatory expert on speed dial. They feel the pain most acutely.

Why freemium

AI costs run $1-6 per discovery, with gross margins of 87-96%. A free tier is sustainable as top-of-funnel acquisition, not a loss leader. Paid tiers ($149/mo, $299/mo) unlock PRD upload, document composition, multi-market comparison, and team features.

Why multi-agent

Multiple specialized agents rather than one monolithic prompt. Each handles a distinct job, stays focused, and is independently testable. I can swap models per agent based on cost and performance.
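To make the idea concrete, here is a minimal sketch of what a per-agent model registry could look like. The agent names and model strings are hypothetical, not Prequel's actual implementation; the point is that each agent is a small, focused unit whose model is a config value, not a prompt rewrite.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    """One specialized agent: a focused job plus a model chosen for that job."""
    name: str
    model: str          # swappable per agent based on cost and performance
    system_prompt: str

# Hypothetical registry: each agent handles one distinct job and is testable alone.
AGENTS = {
    "discovery": Agent("Quill-Discovery", "sonnet", "Generate a requirements checklist..."),
    "planning":  Agent("Quill-Planning",  "haiku",  "Convert a checklist into a 90-day plan..."),
    "deep_dive": Agent("Quill-DeepDive",  "haiku",  "Explain one requirement in plain English..."),
}

def swap_model(agent_key: str, new_model: str) -> None:
    """Swapping a model is a one-line config change, isolated to one agent."""
    AGENTS[agent_key].model = new_model
```

Because the registry isolates each agent, upgrading one job (say, planning) to a stronger model never touches the prompts or evaluation of the others.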

BUILDING THE AI

Scoping the MVP

The full product vision includes multiple AI agents: Quill (the compliance co-pilot), Dixon (document composer), Monte B (compliance copy editor), plus multi-market comparison, change detection, and Slack integration. For the MVP, I focused entirely on Quill.

The filter was straightforward: which agent delivers the core value loop (describe your product → get your requirements)? That's Quill. Everything else makes the product better, but the product works without them.

🔍 Discovery

Takes product details and generates a structured requirements checklist with priorities, explanations, and timelines.

📋 Planning

Converts the checklist into a 90-day action plan tied to the user's launch date.

🤿 Deep Dive

Users highlight any requirement for a plain-English breakdown.

📅 Date Parsing

Interprets natural language dates into actionable timeline entries.

Discovery Agent Architecture: a three-step pipeline from Product Analyzer to Search Orchestrator to Requirements Generator

GROUNDING THE AI

Real Regulation, Not Search Results

The first version of Quill relied entirely on live web search. The Product Analyzer would generate search queries, Tavily would fetch results, and the Requirements Generator would synthesize what came back. It worked—but accuracy was dependent on what the search engine returned for each run. For well-known regulations like HIPAA and FDA 510(k), results were consistent. For niche requirements like DEA e-prescribing rules or state telehealth licensing, they were hit-or-miss.

For a compliance tool, that's not acceptable. A content recommendation that's slightly off is an inconvenience. A compliance requirement that's missing is a liability. The accuracy bar is fundamentally different.

I built a RAG (Retrieval-Augmented Generation) pipeline to solve this: a curated corpus of 30 authoritative regulatory documents—FDA guidance, HIPAA rules, EU MDR, GDPR, FTC health app requirements, NIST frameworks—collected, chunked into semantically meaningful segments, and embedded as vectors in a Postgres database (Supabase pgvector).
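The chunking step can be sketched as follows. This is a simplified overlapping word-window chunker standing in for the semantic segmentation described above; the embedding call and pgvector write are elided, and the parameter values are illustrative.

```python
def chunk_document(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a regulatory document into overlapping word windows.
    (Stand-in for semantic chunking; each chunk would then be embedded
    and stored as a vector in pgvector.)"""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += max_words - overlap  # overlap keeps context across boundaries
    return chunks
```

The overlap matters for regulatory text in particular: a requirement and its applicability conditions often straddle a paragraph boundary, and overlapping windows keep them retrievable together.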

Now when a user runs discovery, Quill searches both sources in parallel: the curated corpus (high trust, always available) and live web search (broad coverage for anything not yet in the corpus). The Search Orchestrator merges and scores both—results from authoritative .gov sources auto-score at the highest tier regardless of where they came from.
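The merge-and-score logic might look like the sketch below. The tier names, score values, and field names are assumptions for illustration; the invariant it demonstrates is the one described above: authoritative .gov sources score at the top tier regardless of which search surfaced them.

```python
from urllib.parse import urlparse

# Hypothetical trust tiers; .gov sources always land in the top tier.
TIER_SCORES = {"authoritative": 1.0, "curated": 0.9, "web": 0.5}

def score_result(result: dict) -> float:
    host = urlparse(result["url"]).hostname or ""
    if host.endswith(".gov"):
        return TIER_SCORES["authoritative"]   # regardless of which search found it
    return TIER_SCORES["curated"] if result["source"] == "corpus" else TIER_SCORES["web"]

def merge_results(corpus_hits: list[dict], web_hits: list[dict]) -> list[dict]:
    """Merge both searches, dedupe by URL, rank by trust score."""
    seen, merged = set(), []
    for r in corpus_hits + web_hits:
        if r["url"] not in seen:
            seen.add(r["url"])
            merged.append({**r, "score": score_result(r)})
    return sorted(merged, key=lambda r: r["score"], reverse=True)
```

Deduplicating by URL before ranking means a document in the curated corpus keeps its high-trust score even when web search returns the same page.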

MODEL SELECTION

A Product Decision, Not a Technical One

I benchmarked Claude Haiku, Sonnet, and Opus against a telehealth + e-prescriptions scenario:

Haiku

100% recall on critical requirements (5/5). 11 total requirements. ~190s, ~$0.05/query.

Sonnet

100% recall on critical requirements (5/5). 15 total requirements. ~285s, ~$0.25/query.

Opus

Timed out (>300s, ~$1.25/query). Too slow and expensive for interactive use.

Both Haiku and Sonnet hit 100% recall on the requirements that matter most. Sonnet surfaced 4 additional requirements at 5x the cost. I selected Sonnet for the MVP and designed a tiered architecture: Haiku for the free tier, Sonnet for paid tiers. That's a product decision about what level of depth each tier deserves and what unit economics each tier can sustain.
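The tiered architecture reduces to a small config plus a margin check. The per-query costs below come from the benchmark above; the tier names, prices-per-run, and helper functions are hypothetical, shown only to make the unit-economics reasoning concrete.

```python
# Per-query costs from the benchmark; the tier mapping is the product decision.
MODEL_COST = {"haiku": 0.05, "sonnet": 0.25}
TIER_MODEL = {"free": "haiku", "pro": "sonnet", "team": "sonnet"}

def pick_model(tier: str) -> str:
    """Free tier gets the cheaper model; paid tiers get the deeper one."""
    return TIER_MODEL[tier]

def gross_margin(price_per_run: float, tier: str) -> float:
    """Margin a tier sustains at its model's cost per query."""
    cost = MODEL_COST[pick_model(tier)]
    return (price_per_run - cost) / price_per_run
```

The same framing generalizes: any time a stronger model surfaces marginal extra value at a cost multiple, the question is which tier's price can absorb that multiple.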
Prequel compliance checklist UI showing structured requirements output

RESULTS

Live and Learning

100%

Recall on critical requirements

87-96%

Gross margin per discovery run

$0.05

Cost per free-tier query

The prototype is live at buildprequel.com. The core flow works end-to-end: describe your product, get a structured compliance checklist, generate an action plan, and deep-dive into any requirement.

User research is ongoing. I've been talking to designers and engineers alongside PMs—they're builders too, and their decisions are directly shaped by compliance requirements.

"A company that isn't scared of AI would pay so much money for this. I think there's a real need for a product like this. This is a phenomenal product."
— Geppi, former VP of Product, Dentsply Sirona

REFLECTION

What I Took Away

What I learned: Vibe coding changed how I think about what's possible. Being able to build a working prototype, run my own test cases, and evaluate models myself—it felt like I could build anything. Running the evaluations was especially valuable. Testing Haiku vs. Sonnet vs. Opus against real scenarios taught me firsthand why model selection is a product decision. It depends on your users, your pricing, and what tradeoffs your product can tolerate. Building the RAG pipeline reinforced something similar: the decision to curate a regulatory corpus wasn't an engineering decision—it was a product decision about what accuracy standard a compliance tool has to meet.

What's next: The natural extension is ongoing compliance audits. Prequel currently solves the launch moment: "what do I need before I ship?" But requirements don't stop at launch. Regulations change, products evolve, and teams need periodic verification that they still meet the requirements Prequel surfaced. Moving from one-time discovery into continuous compliance monitoring is the logical next step.

What I'd do differently: Start user research earlier. Three years of lived experience gave me strong conviction, but talking to builders outside my own context sooner would have pressure-tested my assumptions and shaped the MVP scope faster.
