Chatbot as Intent Classifier
No-code chatbot builders collapse under their own complexity—here’s how treating messaging flows as intent classification problems makes them readable, testable, and scalable.
The Spider Web Problem
Most chatbots today come with no-code builders that let users chain messaging flows using conditionals, buttons, and external actions. On paper this sounds convenient. In practice, as more elements are added to the chain, the nodes and edges interleave until the whole thing resembles a complex spider web.
No branch carries a validation test to guard against regressions. The core chatbot logic quickly becomes an unmaintainable mess that nobody dares to touch. This is the canonical trap of visual flow builders: they lower the floor for entry, but they also lower the ceiling for maintainability.
Flattening the Graph
One solution is to flatten the graph into adjacency lists. This presents the same information in a compact, linear format with contextual data visible at a glance. It is genuinely easier to read and edit than a visual node graph.
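As a sketch, a flattened flow might read like the listing below, one edge per line (the state and trigger names are illustrative):

```
welcome        -> ask_email        [button: "Get started"]
ask_email      -> confirm_email    [input looks like an email]
ask_email      -> ask_email        [anything else: re-prompt]
confirm_email  -> done             [button: "Yes, that's right"]
```

Source state, trigger, and target are all visible on a single line, which is what makes this format easy to scan.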
The problem is cycles. A real conversation is rarely a directed acyclic graph. A user who reaches step 5 may want to correct a mistake they made at step 2. Cycles are common in practice and adjacency lists do not represent them cleanly. Worse, if you implement this as a state machine, the current conversation state must be stored in a database and kept in sync with the graph structure. The operational surface area grows fast.
Intent Classification as a First-Class Primitive
A more robust alternative is to treat messaging flows as an intent classification problem.
Each message template can define an optional frontmatter block that describes the source state and the triggering condition. When classifying an incoming user message, fetch the conversation history starting from the source state and forward it to the LLM. Since each intent is expressed as a normalised (source, condition, messages, target) tuple, the model has no difficulty matching incoming messages against the full state transition table at a configurable confidence threshold.
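A minimal sketch of this representation, assuming a simple `key: value` frontmatter syntax and hypothetical state names (`ask_email`, `confirm_email`):

```python
import re
from dataclasses import dataclass

@dataclass
class Intent:
    source: str      # state the conversation must be in
    condition: str   # natural-language trigger for the transition
    target: str      # state to transition to
    messages: str    # reply template body

# Split a template into its frontmatter header and message body.
FRONTMATTER = re.compile(r"^---\n(.*?)\n---\n(.*)$", re.DOTALL)

def parse_template(text: str) -> Intent:
    header, body = FRONTMATTER.match(text).groups()
    fields = dict(line.split(": ", 1) for line in header.splitlines())
    return Intent(fields["source"], fields["condition"], fields["target"], body)

template = """---
source: ask_email
condition: user provides an email address
target: confirm_email
---
Thanks! Just to confirm, your email is {email}?"""

intent = parse_template(template)

def transition_table(intents: list[Intent]) -> str:
    # Render all intents as the state transition table sent to the LLM.
    return "\n".join(
        f"{i}. ({it.source}) --[{it.condition}]--> ({it.target})"
        for i, it in enumerate(intents)
    )
```

The rendered transition table is what the classifier prompt would contain; the LLM's job reduces to picking a row number and reporting a confidence score.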
To the author of the template, this format is much easier to read and update. All relevant context for a given intent lives in one file. There is no graph to maintain, no visual editor to navigate, and no implicit state hidden across dozens of connected nodes.
Structured Data Extraction
The same representation extends naturally to structured data extraction. Rather than building a standalone form to collect user input, we can leverage the LLM to parse free-form responses into JSON based on the surrounding conversation context.
A low confidence score indicates ambiguity. Instead of silently accepting a bad parse, the system asks the user to confirm before proceeding. Extracted data can be forwarded to an external endpoint or stored for future reference.
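One way to sketch this in code, with the model call abstracted behind a callable (the `{"confidence": ..., "data": ...}` response shape is an assumption of this sketch, not a fixed API):

```python
import json

def extract_fields(schema: dict, history: list[str], llm, threshold: float = 0.8):
    """Ask the LLM to parse the conversation into JSON matching `schema`.

    `llm` is any callable that takes a prompt string and returns a JSON
    string of the form {"confidence": float, "data": {...}} -- a stand-in
    for a real model call.
    """
    prompt = (
        "Extract the following fields as JSON with a confidence score.\n"
        f"Fields: {json.dumps(schema)}\n"
        "Conversation:\n" + "\n".join(history)
    )
    result = json.loads(llm(prompt))
    if result["confidence"] < threshold:
        # Ambiguous parse: ask the user to confirm rather than accept it.
        return None, f"Just to confirm: did you mean {json.dumps(result['data'])}?"
    return result["data"], None
```

With a stub in place of the model, a high-confidence parse returns the data directly, while a low-confidence one returns a confirmation question instead.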
This process is resistant to prompt injection because the system prompt is fixed and contains no sensitive information. It can, however, be more susceptible to hallucination when pushed outside its intended use cases, so the confidence threshold deserves careful tuning.
Verifying Against External Data
Another common chatbot requirement is validating user input against an external data source. A password reset flow, for example, may need to check that the email address a user provides matches the one on their registered account.
This is handled by declaring data connectors in the admin interface. Each connector stores an API token and a base URL. Message templates reference connectors by name, and the runtime fetches and injects the relevant data at classification time. Users never write API call logic directly; they only declare what connector to use and what field to check.
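A sketch of the connector registry, assuming a hypothetical `crm` connector and bearer-token authentication (the URL and token are placeholders):

```python
from dataclasses import dataclass
import urllib.request

@dataclass
class Connector:
    """A declared data connector: name, base URL, and API token."""
    name: str
    base_url: str
    token: str

    def request(self, path: str) -> urllib.request.Request:
        # Build the authenticated request the runtime would send.
        return urllib.request.Request(
            self.base_url.rstrip("/") + "/" + path.lstrip("/"),
            headers={"Authorization": f"Bearer {self.token}"},
        )

REGISTRY: dict[str, Connector] = {}

def register(conn: Connector) -> None:
    REGISTRY[conn.name] = conn

def lookup(name: str, path: str) -> urllib.request.Request:
    """Resolve a connector by the name a template references."""
    return REGISTRY[name].request(path)

register(Connector("crm", "https://crm.example.com/api", "secret-token"))
```

A template would then reference `crm` by name, and the runtime resolves it to an authenticated request at classification time.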
Scaling with RAG
When the number of intents grows into the tens of thousands, sending the entire state transition table to the LLM in a single API call becomes impractical. Retrieval-Augmented Generation (RAG) is the natural solution here.
We use an embedding model to convert the incoming user query—combined with recent message history—into a single vector, then run a cosine similarity search against an indexed database of template intents. The closest matching intent in vector space is returned as the chat response. This approach scales to millions of intents and is ultimately bounded only by the quality of the embedding model.
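The retrieval step can be sketched with plain cosine similarity over precomputed embeddings; the toy two-dimensional vectors below stand in for real embedding-model output:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_intent(query_vec: list[float], index: list[tuple[str, list[float]]]):
    """Return the (intent, score) pair closest to the query in vector space.

    `index` is a list of (intent_name, embedding) pairs; a production
    system would use an indexed vector database instead of a linear scan.
    """
    name, vec = max(index, key=lambda item: cosine(query_vec, item[1]))
    return name, cosine(query_vec, vec)
```

In production the linear scan would be replaced by an approximate nearest-neighbour index, but the contract is the same: one query vector in, one best-matching intent out.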
Putting It Together
The combination of intent classification, structured data extraction, external data connectors, and vector retrieval gives us a chatbot that is both intelligent in its responses and tightly controlled in its behaviour. The templates remain readable, the logic stays auditable, and the system can grow without becoming the spider web it was designed to replace.