PDF Workbench — Viewer, Bookmarks & Links
Regulatory PDFs Reviewers Can Actually Navigate
FDA and EMA reviewers navigate dossiers by bookmarks and hyperlinks. DNXT PDF Workbench is the only browser-based toolset that understands eCTD document structure — AI-guided bookmarks, 100% hyperlink surface coverage, and a 21 CFR Part 11-compliant audit trail on every decision. No Acrobat license. No plugin. No desktop chaos.
Every page scanned, no missed links
vs. regulatory reviewer expectations
100% browser-based, any OS
Every accept/reject decision logged
Who This Is Built For
Three teams who spend too much time fighting desktop PDF tools the week before every submission.
You run a 2-day QC sprint before every submission where the team manually opens every PDF in Acrobat, checks bookmarks against the eCTD outline, and clicks through hyperlinks to verify they resolve. When someone finds a broken cross-reference on the morning of submission day, the scramble to fix, repaginate, and re-QC a single document can delay filing by 48 hours. There's no audit trail, no record of who checked what, and the process resets from zero for every submission cycle.
- Eliminates the 2-day manual QC process before every submission cycle
- Automated hyperlink scan catches all broken cross-document references before packaging
- AI bookmark suggestions cut bookmark build time from hours to minutes per document
- Full audit trail satisfies QA review without manual documentation
- Single browser-based tool replaces fragmented Acrobat workflows across the team
Your last FDA technical rejection cited two broken hyperlinks in Module 5 study reports — documents that had been reviewed three times by the publishing team using Acrobat. The FDA query took 14 days to resolve and pushed your PDUFA clock. You can't explain to your CMO or board why a hyperlink problem delayed a critical milestone, and you have no systemic fix beyond telling the team to "be more careful." Every submission is an undocumented risk.
- Eliminates the #1 cause of FDA technical queries: broken or missing hyperlinks
- Documented proof of hyperlink review for every submission — defensible at inspection
- Reduces exposure to technical rejection delays that disrupt PDUFA timelines
- AI bookmark advisor aligns documents to reviewer navigation patterns, improving review experience
- Centralised workbench sessions mean the Director can see submission readiness status in real time
You juggle PDFs across six active clients, each with different Acrobat versions, different bookmark naming conventions, and different opinions on what constitutes a valid hyperlink. You've personally spent 6-hour sessions clicking through 400-page clinical study reports checking every blue-text instance by hand. When a client's scanned legacy document arrives with no searchable text, you either push back on scope, charge for manual re-work, or absorb the cost of reformatting. None of those options feel right.
- Handles scanned documents with scan enhancement — no manual reformatting required
- Automated blue-text detection replaces the 6-hour manual hyperlink click-through process
- Workbench import/export sessions are client-portable — work transfers without rework
- No software installation on client systems — browser-based access eliminates version conflicts
- Per-session audit trails mean CRO work is defensible and billable with clear deliverables
How It Works
From raw PDF upload to submission-ready document — the technical process behind the Workbench.
When a PDF is uploaded to the Workbench, the platform parses the document's internal object structure — extracting the page tree, existing bookmark hierarchy (PDF outline dictionary), annotation objects, and embedded metadata. This parsing happens server-side in a sandboxed processing container and typically completes within 4–12 seconds for documents up to 500 pages. The result is a structured internal representation of the document that powers every downstream tool in the Workbench without requiring the user to wait for full page rendering.
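As an illustrative sketch only — the class and field names below are hypothetical, not the DNXT schema — the structured internal representation produced by the parser could look something like this:

```python
from dataclasses import dataclass, field

@dataclass
class Bookmark:
    """One node in the document's outline tree (hypothetical layout)."""
    title: str
    page: int                        # zero-based anchor page
    children: list = field(default_factory=list)

@dataclass
class ParsedDocument:
    """Parser output that downstream Workbench tools consume."""
    page_count: int
    outline: list                    # tree of Bookmark nodes
    annotations: list                # raw annotation records, per page
    metadata: dict                   # embedded document metadata

# Example: a parsed 412-page clinical study report
doc = ParsedDocument(
    page_count=412,
    outline=[Bookmark("1 Synopsis", 2, [Bookmark("1.1 Objectives", 3)])],
    annotations=[],
    metadata={"Title": "Clinical Study Report"},
)
```

Because every tool reads from this one representation, the viewer, bookmark editor, and link engine all stay consistent with each other without re-parsing the PDF.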
The AI Bookmark Advisor sends the document's extracted text structure and existing outline to the DNXT regulatory language model, which has been fine-tuned on eCTD document hierarchies, ICH M4 module structures, and agency reviewer navigation patterns drawn from FDA guidance on electronic submissions. The model compares the current bookmark set against expected navigation points for the document type — clinical study report, SmPC, investigator brochure, etc. — and generates a ranked list of bookmark suggestions with confidence scores and proposed anchor page numbers. Suggestions flagged above 85% confidence are pre-selected; lower-confidence items are presented for manual review.
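The 85% confidence threshold splits the ranked suggestion list into a pre-selected bucket and a manual-review bucket. A minimal sketch of that split (the dictionary keys are illustrative assumptions):

```python
def preselect(suggestions, threshold=0.85):
    """Split ranked bookmark suggestions by confidence score:
    items at or above the threshold are pre-selected, the rest
    are queued for manual review."""
    auto = [s for s in suggestions if s["confidence"] >= threshold]
    manual = [s for s in suggestions if s["confidence"] < threshold]
    return auto, manual

suggestions = [
    {"title": "2 Synopsis", "page": 4, "confidence": 0.97},
    {"title": "9.1 Discussion", "page": 210, "confidence": 0.62},
]
auto, manual = preselect(suggestions)
```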
The blue-text detection engine performs a pixel-level and text-layer dual scan of every page. The text-layer pass uses regular expression patterns tuned to regulatory cross-reference formats — section references, appendix citations, module path notation, and external URL patterns. The pixel-layer pass independently identifies rendered blue-coloured text regions using colour space thresholding, catching hyperlink candidates in scanned documents or image-heavy PDFs where the text layer is absent or unreliable. The results of the two passes are merged and de-duplicated, producing a comprehensive map of all potential hyperlink locations with page number and bounding-box coordinates. This dual-pass approach is why DNXT achieves 100% surface coverage where single-pass text extraction routinely misses 8–15% of link candidates.
Each detected hyperlink candidate is resolved against the submission's file index. For internal cross-document links, the platform checks whether the target file exists in the dossier, whether the target page number falls within the document's page count, and whether the target bookmark anchor is present in the destination document. For external URLs, the platform performs a HEAD request to check reachability (where policy permits) and flags any URLs pointing to non-persistent resources, which are inappropriate in a regulatory submission. Each link receives a status: Resolved, Broken Target, Missing Anchor, or External — surfaced visually in the viewer with colour coding so the publishing specialist sees the problem directly on the page where it appears.
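The resolution checks above amount to a small decision function. This is a simplified sketch under assumed data shapes (a `file_index` mapping filename to page count and known anchors), not the production resolver:

```python
def resolve_link(link, file_index):
    """Classify one detected link candidate against the dossier's file
    index, returning one of the four Workbench statuses."""
    if link.get("url"):
        return "External"
    target = file_index.get(link["target_file"])
    if target is None or link["target_page"] > target["pages"]:
        return "Broken Target"
    anchor = link.get("anchor")
    if anchor and anchor not in target["anchors"]:
        return "Missing Anchor"
    return "Resolved"

# Hypothetical dossier index: one 412-page CSR with one named anchor
index = {"m5-csr-001.pdf": {"pages": 412, "anchors": {"sec-9-1"}}}
```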
The publishing specialist reviews each flagged link directly in the browser viewer — clicking Accept confirms the link annotation will be written to the output PDF, clicking Reject removes it, and Defer holds it for escalation. Every decision is written to the Workbench session audit log with a UTC timestamp, user identity, document name, page number, link text, and the decision made. This event stream is tamper-evident and exportable as a signed PDF audit report. For teams operating under 21 CFR Part 11, this provides the complete electronic record of who reviewed what and when — a record that desktop Acrobat workflows are structurally incapable of producing.
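One common way to make an event stream tamper-evident is hash chaining: each entry includes a hash over the previous entry, so any retroactive edit breaks every hash after it. The sketch below illustrates the principle — it is an assumed mechanism, not a description of DNXT's actual implementation:

```python
import hashlib
import json
from datetime import datetime, timezone

def append_decision(log, user, document, page, link_text, decision):
    """Append one review decision to a hash-chained audit log."""
    prev = log[-1]["hash"] if log else "0" * 64
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user, "document": document, "page": page,
        "link_text": link_text, "decision": decision, "prev": prev,
    }
    # The hash covers the previous hash plus this entry's canonical JSON,
    # so rewriting any earlier entry invalidates the whole chain after it.
    entry["hash"] = hashlib.sha256(
        (prev + json.dumps(entry, sort_keys=True)).encode()).hexdigest()
    log.append(entry)
    return entry

log = []
append_decision(log, "a.kumar", "m5-csr-001.pdf", 57, "see Section 9.1", "Accept")
append_decision(log, "a.kumar", "m5-csr-001.pdf", 58, "Appendix 16.1", "Reject")
```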
When a scanned document is detected — identified by the absence of a text layer covering more than 60% of pages — the Workbench automatically routes the document through the scan enhancement pipeline. This applies adaptive deskewing, noise reduction, and contrast normalisation before running optical character recognition at the page level. The resulting searchable text layer is embedded into the PDF without altering the visual rendering of the original scan, making the document compliant with FDA's requirement for searchable text in electronic submissions. The original scan pixel data is preserved in a separate archival layer so the enhancement is non-destructive and the output is auditable as a derivative document.
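The routing heuristic is a simple coverage check — if more than 60% of pages lack a text layer, the document goes through enhancement. A sketch of that gate, under our reading of the stated threshold:

```python
def needs_scan_enhancement(pages_without_text, total_pages, threshold=0.60):
    """Route a document to the scan enhancement pipeline when more than
    60% of its pages have no searchable text layer (assumed reading of
    the documented heuristic)."""
    return pages_without_text / total_pages > threshold

# A 400-page legacy scan with 300 image-only pages gets routed;
# a mostly-digital document with 40 image-only pages does not.
route_a = needs_scan_enhancement(300, 400)
route_b = needs_scan_enhancement(40, 400)
```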
When the publishing specialist completes the review, the Workbench writes all accepted bookmarks, confirmed link annotations, and embedded text layers into a new PDF/A-compliant output file. The export also packages a Workbench Session Report — a structured JSON and human-readable PDF document listing every change made, every link reviewed, and the full audit trail — which travels with the submission package for QA sign-off. The output PDF is validated against the platform's eCTD technical specification checker before download, catching issues like invalid PDF version flags or non-conforming bookmark structures before the document ever reaches the submission authoring tool.
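To make the structured half of the Session Report concrete, here is a hypothetical sketch of its assembly — the field names are illustrative assumptions, not the DNXT report schema:

```python
import json

def build_session_report(session_id, decisions, bookmarks_added, text_layers):
    """Assemble the machine-readable (JSON) half of a Workbench
    Session Report for QA sign-off."""
    report = {
        "session": session_id,
        "summary": {
            "links_reviewed": len(decisions),
            "links_accepted": sum(d["decision"] == "Accept" for d in decisions),
            "bookmarks_added": bookmarks_added,
            "text_layers_embedded": text_layers,
        },
        "decisions": decisions,       # full per-link audit detail
    }
    return json.dumps(report, indent=2)

report = build_session_report(
    "WB-2024-0117", [{"page": 57, "decision": "Accept"}], 42, 3)
```

Because the report is structured JSON alongside the human-readable PDF, QA tooling can validate it programmatically rather than reading it by eye.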
Workbench Features
Every tool in the PDF Workbench is designed specifically for regulatory publishing — not adapted from a generic PDF library.
Interactive PDF Viewer
The Workbench viewer renders PDFs natively in the browser using a high-fidelity rendering engine with no reliance on the browser's built-in PDF handler, which lacks annotation layer visibility. The viewer exposes all annotation objects as interactive overlays — hyperlinks are highlighted in blue, bookmark anchors are shown as margin flags, and the current reading position is synchronised with the bookmark panel in real time. For large clinical study reports exceeding 1,000 pages, the viewer uses progressive page loading so the document is navigable within two seconds of opening, without waiting for the full document to render. Regulatory-sensitive content such as redacted blocks is preserved and rendered with its original visual treatment intact.
Bookmark Editor
The Bookmark Editor provides a fully interactive tree view of the document's PDF outline dictionary, allowing publishing specialists to add, delete, rename, reorder, and reparent bookmarks without leaving the browser. Each bookmark can be assigned a destination type — page-level, named anchor, or fit-to-window — which maps directly to the PDF specification's destination object types and determines how the bookmark renders in agency reviewer software such as Adobe Reader and the FDA's ESG toolkit. Bulk operations allow the entire bookmark hierarchy to be imported from a structured CSV template, enabling teams to enforce corporate or submission-specific bookmark naming standards across all documents in a dossier in minutes rather than hours.
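A flat CSV with a level column is one natural template for bulk import. The sketch below parses such a template into a nested outline tree — the column layout (`level,title,page`) is a hypothetical example, not the documented DNXT format:

```python
import csv
import io

def bookmarks_from_csv(text):
    """Parse a flat CSV (level,title,page) into a nested bookmark tree,
    where level 1 is top-level and level N+1 nests under the most
    recent level-N row."""
    root = []
    stack = [root]                   # stack[d] = children list at depth d
    for row in csv.DictReader(io.StringIO(text)):
        node = {"title": row["title"], "page": int(row["page"]), "children": []}
        depth = int(row["level"])    # 1 = top level
        del stack[depth:]            # pop back to this row's parent depth
        stack[-1].append(node)
        stack.append(node["children"])
    return root

csv_text = """level,title,page
1,1 Synopsis,2
2,1.1 Objectives,3
1,2 Introduction,10
"""
outline = bookmarks_from_csv(csv_text)
```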
AI Bookmark Advisor
The AI Bookmark Advisor is the only regulatory-specific bookmark intelligence tool available in a cloud publishing platform. The underlying model was trained on the structural conventions of ICH M4 CTD sections, FDA reviewer guidance on navigating electronic submissions, and corpus data derived from thousands of accepted regulatory documents across Module 2 through Module 5 content types. When invoked, it analyses the document's heading hierarchy, section numbering, and content signals to propose a complete bookmark set aligned to how a reviewer at FDA or EMA would expect to navigate that document type. Users can accept individual suggestions, bulk-accept high-confidence items above a set threshold, and provide rejection feedback that is logged as a training signal for model improvement. The advisor is explicitly not a replacement for human review — it is an accelerator that eliminates the blank-page problem of building bookmarks from scratch.
Hyperlink Detection Engine
The hyperlink detection engine performs the dual-pass scan described in the technical architecture — text-layer pattern matching combined with pixel-level blue-text region detection — to surface every potential hyperlink candidate in the document regardless of whether it carries an existing annotation object. This is critical because many source documents produced from Word templates or LaTeX carry visible blue-underlined text that was never encoded as a proper PDF link annotation, meaning the reviewer sees something that looks like a link but clicking it does nothing. The engine surfaces these "orphan blue text" instances as candidates requiring annotation before submission, which is the exact scenario that generates FDA technical queries. The internal link map view gives publishing leads a full cross-document link graph showing which documents reference which, making it easy to identify missing destination anchors before packaging.
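The cross-document link graph is conceptually an adjacency map built from resolved internal links. A minimal sketch, with assumed record fields:

```python
from collections import defaultdict

def link_graph(links):
    """Collapse per-link records into a document-level reference graph:
    source file -> set of target files it links into. External URLs
    are excluded — only internal cross-document references matter here."""
    graph = defaultdict(set)
    for link in links:
        if link.get("target_file"):
            graph[link["source_file"]].add(link["target_file"])
    return dict(graph)

links = [
    {"source_file": "m2-summary.pdf", "target_file": "m5-csr-001.pdf"},
    {"source_file": "m2-summary.pdf", "target_file": "m5-csr-002.pdf"},
    {"source_file": "m5-csr-001.pdf", "url": "https://clinicaltrials.gov"},
]
graph = link_graph(links)
```

Reading the graph top-down shows a publishing lead exactly which destination documents must carry anchors before the dossier is packaged.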
Link Accept / Reject Workflow
The accept/reject workflow is the operational heart of the Workbench's compliance model. Each detected link candidate appears as a review card in the side panel, paired with the in-viewer highlight on the relevant page, so the specialist never loses context between the review queue and the document content. The workflow supports three decisions — Accept (encode as annotation), Reject (mark as not a hyperlink, suppress annotation), and Flag for Expert Review (escalate to a named colleague with a comment) — and all three decisions are immediately written to the tamper-evident audit log. Batch operations allow specialists to accept all links of a given type — for example, all internal cross-document references that resolved successfully — with a single action, reducing per-document review time without bypassing the audit requirement. The completed decision log is the documentary evidence that hyperlink review was performed, which is what QA requires before a submission is signed.
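The key property of batch accept is that the shortcut applies one decision to many links while still writing one audit entry per link. An illustrative sketch (record fields and the `via` marker are assumptions, not DNXT's internals):

```python
def batch_accept(candidates, status, log):
    """Accept every undecided candidate with the given resolution status,
    appending one audit entry per link so the batch shortcut still
    produces a complete per-link record."""
    for c in candidates:
        if c["status"] == status and c["decision"] is None:
            c["decision"] = "Accept"
            log.append({"page": c["page"], "link_text": c["text"],
                        "decision": "Accept", "via": "batch"})
    return log

queue = [
    {"page": 12, "text": "Section 9.1", "status": "Resolved", "decision": None},
    {"page": 30, "text": "Appendix 16", "status": "Broken Target", "decision": None},
]
audit = batch_accept(queue, "Resolved", [])
```

Only the resolved link is accepted; the broken one stays in the queue for individual review, so the batch action can never silently approve a known-bad link.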
Scan Enhancement
Legacy regulatory dossiers frequently contain scanned documents — historic preclinical reports, paper-era manufacturing records, or third-party documents received as image-only PDFs. These documents fail FDA's technical submission requirements for searchable text and are routinely rejected at technical screening. The Workbench scan enhancement module applies a four-stage processing pipeline: deskewing to correct page rotation up to ±15°, binarisation and noise reduction to clean degraded scan quality, optical character recognition tuned for scientific and pharmaceutical text including Latin nomenclature and numeric tables, and text layer embedding into the original PDF container. The visual appearance of the original scan is completely unchanged — the enhancement adds a transparent text layer that enables search, copy, and bookmark anchoring without altering the evidentiary record of the original document. Enhancement processing is tracked in the session audit trail with before/after metrics on text coverage percentage.
DNXT vs The Alternative
The tools your team is currently using for this job were not designed