Document Intelligence · How It Works

Overview

Document Intelligence is an AI-powered extraction system that reads documents using vision models and returns structured data with confidence scores. Upload a PDF, image, Word document, or ZIP archive, define what you want to extract using plain English, and the system builds a strict JSON extraction prompt for the selected model.

The design is inspired by Salesforce MuleSoft IDP, but it aims to be simpler and faster, and it is wired into a live model registry covering OpenRouter free models, direct SDK models, and AWS Bedrock.
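The prompt-building step mentioned in the overview could look roughly like the sketch below. The function name `build_extraction_prompt` and the output schema (keys `fields`, `value`, `confidence`, `evidence`, `summary`) are assumptions made for illustration; the service's actual prompt and schema are not documented here.

```python
import json


def build_extraction_prompt(fields: dict[str, str]) -> str:
    """Build a prompt asking a vision model to answer each field as strict JSON.

    `fields` maps a field name to its plain-English question. The schema
    below is hypothetical, not the service's real response format.
    """
    schema = {
        "fields": {
            name: {
                "value": "string or null",
                "confidence": "number from 0.0 to 1.0",
                "evidence": "verbatim quote or null",
            }
            for name in fields
        },
        "summary": "string",
    }
    questions = "\n".join(f"- {name}: {question}" for name, question in fields.items())
    return (
        "Read the attached document and answer each question.\n"
        f"{questions}\n"
        "Respond with JSON only, matching this shape exactly:\n"
        f"{json.dumps(schema, indent=2)}"
    )


prompt = build_extraction_prompt({"expiry_date": "What is the expiry date?"})
```

Embedding the expected shape in the prompt is one common way to push a model toward machine-parseable output. It does not guarantee valid JSON, so the reply still needs to be validated.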

The Extraction Process

Here's what happens when you upload a document:

1. Upload Document

You upload a PDF, image, Word document, or ZIP archive (up to 10 MB). PDFs and images are sent as multimodal content; ZIPs are unpacked for supported documents.

2. Select Fields

Define extraction fields using plain English prompts (e.g., "What is the expiry date?").

3. Choose Model

Pick from currently selectable models. Anonymous users only get tested free OpenRouter models; signed-in users may also use healthy SDK and Bedrock models.

4. Vision Processing

The selected model reads the document content and returns strict JSON, including extracted fields, a summary, and verbatim evidence where available.

5. Confidence Scoring

The model returns each field with a confidence score (0.0–1.0) indicating extraction reliability.

6. Export Results

Get results as JSON or CSV, save reusable templates, call the template API, or provision matching Salesforce metadata.
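On the consuming side, the confidence scores from the steps above can be used to route low-confidence fields to manual review. This is a minimal sketch, assuming a response shaped like `{"fields": {name: {"value": ..., "confidence": ...}}}`; that shape, the `parse_extraction` name, the 0.8 threshold, and the sample values are all illustrative assumptions.

```python
import json


def parse_extraction(raw: str, min_confidence: float = 0.8):
    """Parse a model reply and split fields into accepted and needs-review."""
    data = json.loads(raw)  # raises ValueError if the reply is not valid JSON
    accepted, review = {}, {}
    for name, field in data.get("fields", {}).items():
        confidence = float(field.get("confidence", 0.0))
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range for {name!r}: {confidence}")
        target = accepted if confidence >= min_confidence else review
        target[name] = field.get("value")
    return accepted, review


# Made-up sample reply, for illustration only
sample = json.dumps({
    "fields": {
        "expiry_date": {"value": "2030-01-31", "confidence": 0.97},
        "id_number": {"value": "X123", "confidence": 0.41},
    },
    "summary": "Sample identity document.",
})
accepted, review = parse_extraction(sample)
```

The threshold is a policy choice: a stricter value sends more fields to review, a looser one accepts more automatically.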

Model Registry & Health Checks

The model dropdown is not a hard-coded list. It is generated from live provider state and filtered by health checks:

OpenRouter: only free, vision-capable models are fetched. Paid OpenRouter models are suppressed because Bedrock is the preferred paid path.
SDK models: Anthropic, OpenAI, and Gemini direct models are tested individually before they are selectable.
AWS Bedrock: enabled Bedrock models are shown to signed-in users, with ap-southeast-2 as the configured region.
Anonymous access: unauthenticated users can only select tested, passing OpenRouter free models. Everything else is greyed out.
Failure handling: if an OpenRouter model fails during normal extraction, it is removed from rotation and a background refresh is triggered.
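The gating rules above can be summarised as a small predicate. This is a sketch under assumptions: the `ModelEntry` type, its field names, the provider labels, and the model IDs are hypothetical, and runtime failure handling (removal from rotation and background refresh) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class ModelEntry:
    model_id: str
    provider: str   # "openrouter", "sdk", or "bedrock" (labels assumed)
    is_free: bool
    healthy: bool   # outcome of the latest health check


def is_selectable(model: ModelEntry, signed_in: bool) -> bool:
    """Apply the selection rules: health first, then provider and sign-in state."""
    if not model.healthy:
        return False
    if model.provider == "openrouter":
        return model.is_free  # paid OpenRouter models are suppressed
    return signed_in          # SDK and Bedrock models need a signed-in user


registry = [
    ModelEntry("free-vision-model", "openrouter", True, True),
    ModelEntry("paid-vision-model", "openrouter", False, True),
    ModelEntry("sdk-model", "sdk", False, True),
    ModelEntry("bedrock-model", "bedrock", False, False),
]
anonymous = [m.model_id for m in registry if is_selectable(m, signed_in=False)]
signed_in = [m.model_id for m in registry if is_selectable(m, signed_in=True)]
```

With this sample registry, an anonymous user sees only the free OpenRouter model, and a signed-in user additionally sees the healthy SDK model.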

Tested OpenRouter Free Models (OpenRouter): Public default path. Refreshed and retested on a schedule and after runtime failures.
Claude Sonnet / Haiku / Opus 4.x (Anthropic SDK): Signed-in only. Greyed out automatically when credit, quota, or sample tests fail.
GPT-4o / GPT-4o Mini (OpenAI SDK): Signed-in only. Tested with both sample image and sample PDF before selection is allowed.
Gemini 2.x / 2.5 (Google Gemini SDK): Signed-in only. Quota and sample failures are surfaced in model health metadata.
Amazon Nova Lite / Pro (AWS Bedrock): Preferred paid option for high-volume or data-residency-sensitive extraction.
Bedrock Claude / Other Vision Models (AWS Bedrock Discovery): Admin-configured from Bedrock model discovery and enabled-model settings.

Health probes use one image sample and one PDF sample. Models that cannot process both are greyed out until a later scan proves they are working again.
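The two-sample probe rule can be sketched as follows. The `probe_model` function and the `run_sample` callback are hypothetical names; in a real system the callback would send the sample image or PDF to the provider and check the reply.

```python
from typing import Callable


def probe_model(run_sample: Callable[[str], bool]) -> dict:
    """Run one image sample and one PDF sample; healthy only if both pass."""
    results = {}
    for kind in ("image", "pdf"):
        try:
            results[kind] = bool(run_sample(kind))
        except Exception as exc:  # a provider error counts as a failed probe
            results[kind] = False
            results[f"{kind}_error"] = str(exc)
    results["healthy"] = results["image"] and results["pdf"]
    return results


def image_only(kind: str) -> bool:
    """Stand-in for a model that handles images but rejects PDFs."""
    if kind == "pdf":
        raise RuntimeError("PDF input not supported")
    return True


report = probe_model(image_only)
```

Recording the error text alongside the pass/fail flags is one way to surface the kind of health metadata described above.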

Templates & Reusability

Save your extraction configurations as templates to reuse them. Each template stores:

Field definitions: the prompts that describe what to extract
Vision model choice: which model processes these documents
Shareable link: anyone can load your template with one click
Optional Salesforce mapping: saved field metadata can be used to generate Salesforce custom objects and fields

Templates are retained for 365 days. Share templates with colleagues so they use the exact same extraction logic, API token, and model policy you defined.
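As a rough mental model, a saved template can be pictured as a small record. The `Template` class and its attribute names below are assumptions for illustration, not the service's storage format; only the 365-day retention figure comes from the text above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

RETENTION_DAYS = 365  # retention period stated above


@dataclass
class Template:
    name: str
    fields: dict[str, str]  # field name -> plain-English prompt
    model_id: str           # vision model chosen for this template
    created: date
    salesforce_mapping: dict[str, str] = field(default_factory=dict)  # optional

    def expires_on(self) -> date:
        """Date on which the retention period ends."""
        return self.created + timedelta(days=RETENTION_DAYS)


template = Template(
    name="Utility Bill",
    fields={"due_date": "What is the payment due date?"},
    model_id="example-model",  # placeholder, not a real model ID
    created=date(2025, 1, 1),
)
```

Here `template.expires_on()` evaluates to `date(2026, 1, 1)`, since 2025 is not a leap year.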

Programmatic Access (API)

Once you save a template, you get a personal API endpoint to submit documents programmatically. No browser needed — perfect for automation, batch processing, or integrating into your application.

Three submission options are available:

Synchronous: submit a file and wait for the results (best for single documents with turnaround under 10 seconds)
Asynchronous (polling): submit a file, receive a job ID, then poll for results (best for batch processing and concurrent submissions)
JSON submission: send base64-encoded file content to /api/idp/submit/{token}/json (useful for Salesforce Flow and other platforms that cannot send multipart forms)

Sync Example (cURL)
curl -X POST "https://www.danscodellaro.com/api/idp/submit/{token}" \
  -F "file=@doc.pdf"

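Async Example (Python)
import time

import requests

TOKEN = "your-template-token"  # replace with the token from your saved template
BASE_URL = "https://www.danscodellaro.com/api/idp/submit"

# Submit the document; the context manager closes the file afterwards
with open("doc.pdf", "rb") as f:
    r = requests.post(f"{BASE_URL}/{TOKEN}/async", files={"file": f}, timeout=60)
r.raise_for_status()
job = r.json()  # {"job_id": "...", "poll_url": "..."}

# Poll for completion, giving up after about two minutes (arbitrary limit)
deadline = time.monotonic() + 120
while True:
    status = requests.get(job["poll_url"], timeout=30).json()
    if status["status"] in ("done", "error"):
        break
    if time.monotonic() > deadline:
        raise TimeoutError("extraction job did not finish in time")
    time.sleep(1)

# "result" is only expected when the job succeeded
if status["status"] == "done":
    print(status["result"])
else:
    print("Extraction failed:", status)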
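For the JSON submission route, the request body has to carry the file as base64 text. The sketch below only builds that body. The key names `filename` and `file_base64` are assumptions, since the expected field names are not given here, so check the template's API details before relying on them.

```python
import base64
import json


def build_json_submission(filename: str, content: bytes) -> str:
    """Encode file bytes as base64 inside a JSON body.

    The key names are hypothetical; the endpoint may expect different ones.
    """
    payload = {
        "filename": filename,
        "file_base64": base64.b64encode(content).decode("ascii"),
    }
    return json.dumps(payload)


body = build_json_submission("doc.pdf", b"%PDF-1.4 example bytes")

# Sending it needs the third-party `requests` package and a real token:
# requests.post(f"https://www.danscodellaro.com/api/idp/submit/{token}/json",
#               data=body, headers={"Content-Type": "application/json"}, timeout=60)
```

Base64 adds roughly a third to the payload size, which is worth keeping in mind next to the 10 MB upload limit.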
    

Access, Rate Limits & Costs

Anonymous users can test the system using only the currently passing OpenRouter free models. Signed-in users can access additional SDK and Bedrock models when those models are healthy.

Free tier: 3 extractions per day per IP address. Admin users get unlimited extractions for testing. The system also caps global daily processing to prevent abuse.
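A per-IP daily quota like the one described above can be sketched with an in-memory counter. This is an illustration only: the `DailyLimiter` class is hypothetical, it omits the global daily cap, and a deployed service would need shared, persistent storage. The IP address used is from a documentation-reserved range.

```python
from collections import defaultdict
from datetime import date


class DailyLimiter:
    """Allow a fixed number of extractions per IP per calendar day."""

    def __init__(self, per_ip_limit: int = 3):
        self.per_ip_limit = per_ip_limit
        self._counts: dict[tuple[str, date], int] = defaultdict(int)

    def allow(self, ip: str, today: date, is_admin: bool = False) -> bool:
        if is_admin:
            return True  # admins are not limited
        key = (ip, today)
        if self._counts[key] >= self.per_ip_limit:
            return False
        self._counts[key] += 1
        return True


limiter = DailyLimiter()
day = date(2025, 1, 1)
attempts = [limiter.allow("203.0.113.7", day) for _ in range(4)]
```

The first three attempts succeed and the fourth is refused. Keying the counter on the date means the quota resets naturally the next day.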

Cost depends on the selected provider, model, document size, and output length:

OpenRouter free models — public test path, only if the model passes live image and PDF tests.
Direct SDK models — useful when provider credits are available, but automatically disabled on quota or billing failures.
AWS Bedrock — preferred paid path, especially for AU-region/data-residency workflows.
Bedrock pricing — the admin Bedrock page discovers available multimodal models and refreshes AWS pricing metadata.

Features & Capabilities

📄

Multi-format support: PDFs, Word documents, JPG, PNG, WebP, TIFF, BMP, GIF, HEIC/HEIF, AVIF, and ZIP archives

🎯

Confidence scores: Each extracted field includes a confidence value (0.0–1.0) so you know how reliable each extraction is

🔗

Reusable templates: Save extraction configs and share them through links that never expire

⚙️

API access: Both sync and async endpoints for programmatic submission and result retrieval

📊

Multiple export formats: JSON (with prompts included) and CSV for easy integration

🧠

Health-gated model choice: Only models that pass current health rules are selectable; failed models are greyed out until the next scan passes
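To illustrate how the JSON and CSV exports listed above relate, the sketch below flattens a JSON-style result into CSV rows. The column names and the input shape are assumptions, and the sample values are invented; the service's real export layout may differ.

```python
import csv
import io


def fields_to_csv(fields: dict[str, dict]) -> str:
    """Flatten extracted fields into CSV rows: field, prompt, value, confidence."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "prompt", "value", "confidence"])
    for name, item in fields.items():
        writer.writerow([
            name,
            item.get("prompt", ""),
            item.get("value", ""),
            item.get("confidence", ""),
        ])
    return buffer.getvalue()


csv_text = fields_to_csv({
    "amount_due": {
        "prompt": "What is the amount due?",
        "value": "120.50",
        "confidence": 0.93,
    },
})
```

Using the `csv` module rather than joining strings by hand means values that contain commas or quotes are escaped correctly.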

Built-in Templates

Built-in templates and saved samples are available to get you started:

Identity Document — extracts passport/ID info: document type, expiry date, country, ID number
Utility Bill — captures account number, due date, amount due, service address
Medical Report — extracts patient name, date, diagnosis, medications, results
Website Screenshot — OCR and structured extraction from webpage screenshots
Saved samples — demo files can be filtered by template and loaded directly into the processor

You can customize any template or start from scratch with your own extraction fields.

Common Use Cases

🏦

Finance & Banking: Extract loan amounts, terms, borrower info from loan documents

🏥

Healthcare: Parse medical records, lab results, prescriptions, insurance info

📋

HR & Recruitment: Extract resume data, education, experience, certifications

🛂

Government & Compliance: ID verification, passport scanning, visa document processing

🏢

Real Estate: Extract property details, prices, addresses from listing documents

📦

Logistics & Supply Chain: Extract tracking numbers, dates, carrier info from shipping documents

Limitations & Best Practices

What works well:

Clear, legible documents with good contrast (photos or scans)
Structured or semi-structured formats (forms, invoices, IDs, certificates)
Documents in major languages (English, Spanish, French, German, Chinese, etc.)
Handwritten fields that are clear and legible

What doesn't work well:

Extremely blurry or rotated images
Heavily redacted or obscured documents
Documents with very small or pixelated text
Images with significant background noise or watermarks

Best practices:

Use clear, well-lit photos of documents (natural lighting preferred)
Write extraction prompts as specific questions, not generic labels
Test with a few sample documents before batch processing
Use tested OpenRouter free models for public experimentation and demos
Use Bedrock Nova or Bedrock Claude models for paid, repeatable production-style workflows
If a model is greyed out, wait for the next health scan or choose another selectable model
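Before a batch run, a local pre-flight check can catch files that are likely to be rejected. This is a minimal sketch: the extension list is inferred from the supported formats named earlier (the Word extensions in particular are a guess), and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # 10 MB upload limit; binary megabytes assumed
SUPPORTED = {
    ".pdf", ".doc", ".docx", ".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif",
    ".bmp", ".gif", ".heic", ".heif", ".avif", ".zip",
}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems


ok = check_upload("passport.JPG", 2_000_000)
bad = check_upload("notes.txt", 11 * 1024 * 1024)
```

Returning every problem at once, rather than stopping at the first, makes it easier to clean up a whole folder in one pass.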