When we set out to build VectorAutomate, we made a non-negotiable architectural decision: every technical claim the system makes must be traceable to a specific passage in a specific version of a specific document. No exceptions.
This post is a deep dive into how we built that citation engine and why it matters for enterprise service organizations.
The Problem With Uncited AI
Most large language models generate text by predicting the next token in a sequence. They’re excellent at producing fluent, plausible-sounding responses. But “plausible-sounding” is not the same as “correct.”
In a field service context, an uncited answer is worse than no answer at all. A technician acting on incorrect troubleshooting guidance can damage equipment, void warranties, or create safety risks. And without a citation trail, there’s no way to audit what went wrong.
Our Architecture
VectorAutomate’s citation engine operates in three stages:
Stage 1: Retrieval — When a query arrives, we run a hybrid search across the customer’s document corpus. This combines dense vector similarity (for semantic matching) with sparse keyword matching (for exact model numbers, error codes, and part IDs). The retrieval stage returns ranked document passages, each tagged with its source document, page, section, and version.
Stage 2: Generation with Grounding — The retrieved passages are fed to the language model as context, with strict instructions to only generate claims that can be directly supported by the provided passages. We use a constrained generation approach where the model must indicate which passage supports each claim.
Stage 3: Citation Verification — After generation, a separate verification module checks each citation. It confirms that the cited passage actually exists, that the passage supports the claim being made, and that the document version is current. Claims that fail verification are either rewritten or flagged as unverifiable.
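To make Stages 1 and 3 concrete, here is a minimal Python sketch. It is an illustration under assumptions, not VectorAutomate’s actual code: the `Passage` fields, the `alpha` weighting between dense and sparse scores, and the token-overlap support check are all hypothetical stand-ins. In particular, the overlap threshold is only a crude placeholder for whatever support check a production verifier would use.

```python
import math
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    # Hypothetical record: each passage carries its provenance tags.
    doc_id: str
    version: str
    page: int
    section: str
    text: str
    embedding: tuple  # toy dense vector; a real system would use a learned embedding


def tokens(text):
    # Keep hyphens so identifiers such as "E-104" survive as a single token.
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_embedding, corpus, alpha=0.5, k=2):
    """Stage 1: rank passages by a weighted mix of dense and sparse scores."""
    q = tokens(query)

    def score(p):
        sparse = len(q & tokens(p.text)) / len(q) if q else 0.0
        return alpha * cosine(query_embedding, p.embedding) + (1 - alpha) * sparse

    return sorted(corpus, key=score, reverse=True)[:k]


def verify_citation(claim, cited, corpus, current_versions, min_overlap=0.5):
    """Stage 3: return 'ok' or the reason the citation fails."""
    if cited not in corpus:
        return "missing passage"  # the cited passage does not exist
    if current_versions.get(cited.doc_id) != cited.version:
        return "stale version"  # the document has been superseded
    c = tokens(claim)
    overlap = len(c & tokens(cited.text)) / len(c) if c else 0.0
    # Placeholder support check (assumption): share of claim tokens found in the passage.
    return "ok" if overlap >= min_overlap else "unsupported"
```

A caller would route any result other than "ok" to the rewrite-or-flag path described in Stage 3.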
Version Locking
Documents change. Manuals get updated, service bulletins are revised, procedures are deprecated. A citation that was valid last month might be invalid today.
VectorAutomate maintains a version-locked index of every document. When a citation is created, it references a specific version of a specific document. Even if the document is later updated, the original citation remains traceable to the exact text that existed at the time the response was generated.
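One way to picture version locking is an append-only store in which a citation pins a document ID, a version number, and a character span. The sketch below assumes that design; the `VersionedIndex` and `Citation` names and the SHA-256 integrity check are illustrative choices, not a description of the production index.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    doc_id: str
    version: int
    start: int
    end: int
    sha256: str  # hash of the cited text at citation time


class VersionedIndex:
    """Hypothetical append-only store: old versions are never overwritten."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of texts; position i holds version i + 1

    def publish(self, doc_id, text):
        """Store a new version and return its version number."""
        self._versions.setdefault(doc_id, []).append(text)
        return len(self._versions[doc_id])

    def latest(self, doc_id):
        return len(self._versions[doc_id])

    def cite(self, doc_id, version, start, end):
        span = self._versions[doc_id][version - 1][start:end]
        digest = hashlib.sha256(span.encode("utf-8")).hexdigest()
        return Citation(doc_id, version, start, end, digest)

    def resolve(self, c):
        """Return the exact text that was cited, verifying it is unchanged."""
        span = self._versions[c.doc_id][c.version - 1][c.start:c.end]
        if hashlib.sha256(span.encode("utf-8")).hexdigest() != c.sha256:
            raise ValueError("cited text no longer matches the stored version")
        return span
```

Because `publish` only appends, a citation created against version 1 still resolves to the version 1 text after version 2 is published, which is the property an audit depends on.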
This is critical for audit scenarios. When a regulator or warranty team needs to verify what information was available to a technician at the time of service, the citation trail provides an exact answer.
Scaling Challenges
The biggest engineering challenge wasn’t the citation logic itself — it was scaling it. Enterprise customers have millions of pages of technical documentation across thousands of documents. The retrieval stage needs to return results in under 200 milliseconds. The verification stage needs to complete before the response is delivered to the technician.
We solved this with a tiered indexing architecture. Frequently accessed documents (active manuals, recent service bulletins) are kept in a hot index with sub-50ms retrieval times. Less frequently accessed documents (archived manuals, legacy procedures) live in a warm index. We also use predictive pre-fetching: as soon as the equipment model is identified in the query, we start warming up the documents relevant to that model, before the full query has been processed.
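The tiering idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: "warming up" is interpreted here as promoting documents from the warm tier into a fixed-size hot tier, eviction is least-recently-used, and the `TieredIndex` class and its method names are hypothetical. The post does not specify how the real index promotes or evicts documents.

```python
from collections import OrderedDict


class TieredIndex:
    """Toy two-tier store; assumes hot_capacity >= 1."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()  # doc_id -> text, least recently used first
        self.warm = {}            # everything not currently hot
        self.hot_capacity = hot_capacity
        self.docs_by_model = {}   # equipment model -> list of doc ids

    def add(self, doc_id, text, model):
        # New documents start in the warm tier.
        self.warm[doc_id] = text
        self.docs_by_model.setdefault(model, []).append(doc_id)

    def _promote(self, doc_id):
        if doc_id in self.hot:
            self.hot.move_to_end(doc_id)  # mark as most recently used
            return
        self.hot[doc_id] = self.warm.pop(doc_id)
        while len(self.hot) > self.hot_capacity:
            # Demote the least recently used document back to the warm tier.
            old_id, old_text = self.hot.popitem(last=False)
            self.warm[old_id] = old_text

    def prefetch(self, model):
        """Promote documents for an equipment model before the full query runs."""
        for doc_id in self.docs_by_model.get(model, []):
            self._promote(doc_id)

    def get(self, doc_id):
        """Return (text, tier the document was served from)."""
        tier = "hot" if doc_id in self.hot else "warm"
        self._promote(doc_id)
        return self.hot[doc_id], tier
```

In this model, calling `prefetch` with the equipment model as soon as it is parsed from the query means the later `get` calls for that model’s documents are served from the hot tier.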
Why This Matters
Citation-level traceability isn’t a nice-to-have feature. For regulated industries — medical devices, aerospace, industrial machinery — it’s a compliance requirement. Every technical decision in the field needs a paper trail.
VectorAutomate provides that trail automatically, for every interaction, without adding any burden to the technician’s workflow.
