Build AI Citation in WebViewer
AI Citation connects an LLM answer to exact locations in the PDF so users can verify where each claim comes from.
The core API is text.locateSource(), which maps quoted source text to page coordinates you can highlight in the viewer.
Why this matters
In document-heavy workflows (legal, finance, compliance, research), users need traceability, not just fluent answers.
AI Citation solves that by:
- taking LLM-provided source quotes
- locating those quotes in the document text layer
- rendering highlight rects directly in WebViewer
No additional LLM call is needed for the locate/highlight step.
Prerequisites
- A loaded PDF in WebViewer
- LLM output that includes quote citations (not plain prose only)
- High-quality Markdown/text extracted from the same PDF (for quote grounding)
text.locateSource() is designed for high-quality model output. Avoid passing noisy, manually typed snippets.
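One way to keep noisy quotes out of the locate step is to check each quote against the Markdown you sent to the model before calling text.locateSource(). The sketch below is an assumption about how your app might do that: normalizeWhitespace and isQuoteGrounded are hypothetical helper names, not WebViewer APIs, and the whitespace normalization is only a pre-filter in your own pipeline.

```javascript
// Hypothetical helpers: these are NOT part of the WebViewer API.

// Collapse whitespace runs so line wrapping in the Markdown does not cause false misses.
function normalizeWhitespace(value) {
  return String(value ?? '').replace(/\s+/g, ' ').trim()
}

// Returns true when the quote appears verbatim (ignoring whitespace differences)
// in the Markdown that was extracted from the same PDF.
function isQuoteGrounded(quote, sourceMarkdown) {
  const needle = normalizeWhitespace(quote)
  if (!needle) {
    return false
  }
  return normalizeWhitespace(sourceMarkdown).includes(needle)
}
```

Quotes that fail this check were probably paraphrased by the model, so they can be skipped or flagged before any lookup runs.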
Flow
Your app: PDF -> Markdown -> LLM answer with quote citations
MuPDF WebViewer APIs: text.locateSource() -> viewer.highlight() + viewer.scrollTo()
- You implement: PDF-to-Markdown pipeline, prompt/response contract for quoted citations, and when citation actions should run (for example on answer render, hover, or click).
- WebViewer provides: quote-to-coordinate mapping via text.locateSource(), plus rendering/navigation APIs (viewer.highlight() and viewer.scrollTo()).
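If you run the citation action on click rather than on answer render, the same three calls can be scoped to a single quote. The following is a minimal sketch under that assumption: showCitation is a hypothetical name, and the result shape (pageIndex, words[].rects) and call arguments simply mirror the full example in this guide.

```javascript
// Sketch of an on-click citation action. `showCitation` is a hypothetical name;
// the WebViewer calls and result shape mirror the example in this guide.
async function showCitation(webViewer, quote) {
  const text = String(quote ?? '').trim()
  if (!text) {
    return null
  }
  let located = null
  try {
    located = await webViewer.text.locateSource({ text })
  } catch {
    // Treat lookup errors the same as "source not found".
    return null
  }
  if (!located || !Array.isArray(located.words) || !Number.isFinite(located.pageIndex)) {
    return null
  }
  // Flatten every word rect into the shape passed to viewer.highlight().
  const rects = located.words.flatMap((word) =>
    (Array.isArray(word?.rects) ? word.rects : []).map((rect) => ({
      color: '#ff00ff',
      pageIndex: located.pageIndex,
      rect,
    })),
  )
  if (rects.length === 0) {
    return null
  }
  await webViewer.viewer.highlight({ rects })
  await webViewer.viewer.scrollTo({
    type: webViewer.refs.scroll.type.PAGE,
    value: located.pageIndex,
  })
  // Return the page that was shown so the caller can update its own UI state.
  return located.pageIndex
}
```

A click handler can call showCitation(webViewer, citation.quote) and show a "source not found" hint when it returns null.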
Recommended answer contract
Use a structured response where each paragraph carries one or more citations with verbatim quote text.
{
  "paragraphs": [
    {
      "text": "MuPDF minimizes global state by isolating context and document lifetimes.",
      "citations": [
        {
          "chunk_id": 12,
          "quote": "MuPDF has no global variables and therefore no hidden dependencies..."
        }
      ]
    }
  ]
}
The important field for AI Citation is quote. That quote is what you pass to text.locateSource().
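Because model output does not always follow the contract, it can help to parse it defensively before any lookup. The sketch below assumes the JSON shape shown above; extractQuotes is a hypothetical helper name, not a WebViewer API.

```javascript
// Hypothetical helper: parses an answer in the contract above and returns
// the unique, non-empty quote strings in first-seen order.
function extractQuotes(rawAnswer) {
  let answer
  try {
    answer = typeof rawAnswer === 'string' ? JSON.parse(rawAnswer) : rawAnswer
  } catch {
    // Malformed JSON: treat it as an answer without citations.
    return []
  }
  const paragraphs = Array.isArray(answer?.paragraphs) ? answer.paragraphs : []
  const quotes = new Set()
  for (const paragraph of paragraphs) {
    const citations = Array.isArray(paragraph?.citations) ? paragraph.citations : []
    for (const citation of citations) {
      // Only accept real strings; skip numbers, nulls, and blank quotes.
      if (typeof citation?.quote === 'string' && citation.quote.trim()) {
        quotes.add(citation.quote.trim())
      }
    }
  }
  return [...quotes]
}
```

Deduplicating here means each distinct quote reaches text.locateSource() at most once.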
Example
This example resolves all quotes from an answer, highlights every matched rect, and scrolls to the first resolved page.
async function highlightAnswerSources(webViewer, paragraphs) {
  // Reuse locateSource results when the same quote appears multiple times.
  const quoteCache = new Map()
  // Collect every rect first, then draw highlights in a single call.
  const highlightRects = []
  // Remember the first match so we can navigate users to it.
  let firstResolvedPage = null

  for (const paragraph of paragraphs) {
    const citations = Array.isArray(paragraph?.citations) ? paragraph.citations : []
    for (const citation of citations) {
      // Normalize AI output and skip empty/non-string-like quote values.
      const quote = String(citation?.quote ?? '').trim()
      if (!quote) {
        continue
      }

      let located = quoteCache.get(quote)
      if (located === undefined) {
        try {
          // Map a quote to page/word rectangles in the loaded PDF.
          // Store a missing result as null so it is cached like a failure.
          located = (await webViewer.text.locateSource({ text: quote })) ?? null
        } catch {
          // Cache failures too, so repeated bad quotes do not re-query.
          located = null
        }
        quoteCache.set(quote, located)
      }

      // Ignore unresolved matches or unexpected result shapes.
      if (!located || !Array.isArray(located.words)) {
        continue
      }

      if (firstResolvedPage === null && Number.isFinite(located.pageIndex)) {
        firstResolvedPage = located.pageIndex
      }

      for (const word of located.words) {
        const rects = Array.isArray(word?.rects) ? word.rects : []
        for (const rect of rects) {
          highlightRects.push({
            color: '#ff00ff',
            pageIndex: located.pageIndex,
            rect,
          })
        }
      }
    }
  }

  if (highlightRects.length > 0) {
    // Batch highlight for better performance and fewer UI updates.
    await webViewer.viewer.highlight({ rects: highlightRects })
  }

  if (Number.isFinite(firstResolvedPage)) {
    // Move viewport to the first citation hit for immediate context.
    await webViewer.viewer.scrollTo({
      type: webViewer.refs.scroll.type.PAGE,
      value: firstResolvedPage,
    })
  }
}
Production checklist
- Cache quote -> locateSource result to remove duplicate lookups.
- Handle unresolved quotes gracefully (skip or show “source not found”).
- Batch highlight rects in one viewer.highlight() call when possible.
- Optionally use pageRange in text.locateSource({ text, pageRange }) for faster scoped lookup.
- Keep citation click targets in the UI so users can jump between answer and source pages.
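For the unresolved-quote item, one option is to split quotes into resolved and unresolved groups after the lookups finish, so the UI can label the latter as "source not found". This sketch assumes a Map of quote -> locateSource result like the cache in the example above; partitionByResolution is a hypothetical helper name.

```javascript
// Hypothetical helper: splits quotes by whether their cached lookup result
// contains at least one highlightable rect.
function partitionByResolution(quotes, locatedByQuote) {
  const resolved = []
  const unresolved = []
  for (const quote of quotes) {
    const located = locatedByQuote.get(quote)
    const hasRects =
      Array.isArray(located?.words) &&
      located.words.some((word) => Array.isArray(word?.rects) && word.rects.length > 0)
    if (hasRects) {
      resolved.push(quote)
    } else {
      // Covers failed lookups (null), quotes never looked up, and empty results.
      unresolved.push(quote)
    }
  }
  return { resolved, unresolved }
}
```

The unresolved list can drive a muted citation style or a tooltip instead of a jump target.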
Next steps