
The Full Process

From messy PDF to clean structured data

Here's exactly what happens from the moment you submit your documents to the moment you receive a clean, ready-to-use spreadsheet — every technical step, fully explained.

  1. Choose Your Plan
  2. Upload Documents
  3. AI Processing
  4. Human QA Review
  5. You Receive Output


1

Step One

Choose your plan & complete checkout

Select the plan that matches your document volume and turnaround needs. Payment takes under 60 seconds via Stripe — no account required for Starter plans.

How to choose the right plan

Our plans are organized by page volume and turnaround time. Before you buy, here's how to estimate your needs:

  • Count your pages: Each side of a document counts as one page. A 10-page PDF is 10 pages. A double-sided scanned form is 2 pages per physical sheet.
  • Consider your deadline: Starter plans have 24–48h turnaround. If you need same-day processing, contact us for an Enterprise rush quote.
  • Not sure of your volume? Choose the next size up — unused page capacity doesn't expire on monthly plans.
  • Have a very large archive? Contact us before purchasing. Projects over 5,000 pages get custom Enterprise pricing that's significantly cheaper per page.
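
The counting rules above can be sketched in a few lines of Python. This is only an illustration for estimating volume before purchase: the function names and the `ENTERPRISE_PAGE_THRESHOLD` constant are hypothetical and not part of any OnyorAI tool. Only the "one side equals one page" rule and the 5,000-page figure come from the text.

```python
ENTERPRISE_PAGE_THRESHOLD = 5000  # projects above this get custom pricing, per the text


def billable_pages(sheets: int, double_sided: bool = False) -> int:
    """Return the page count for a stack of sheets.

    Each scanned side counts as one page, so a double-sided sheet is 2 pages.
    """
    if sheets < 0:
        raise ValueError("sheets must be non-negative")
    return sheets * (2 if double_sided else 1)


def needs_enterprise_quote(total_pages: int) -> bool:
    """True when the project is over the 5,000-page Enterprise threshold."""
    return total_pages > ENTERPRISE_PAGE_THRESHOLD


# Example: a 10-page PDF plus 40 double-sided scanned forms.
total = billable_pages(10) + billable_pages(40, double_sided=True)
print(total)                          # 90
print(needs_enterprise_quote(total))  # False
```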

💡 Free Sample First

Not ready to commit? Email us a 5–10 page sample at contact@onyorai.com — we'll process it free and show you the output quality before you buy.

📋 What happens after payment

  • Confirmation email sent: Instantly. Includes order number, plan details, and upload portal link.
  • Upload portal unlocked: Secure, encrypted portal specific to your order, valid for 7 days.
  • Submission form opened: Specify output format, column preferences, and any special requirements.
  • Processing clock starts: Turnaround timer begins once all files are confirmed received and readable.

2

Step Two

Submit your documents securely

Upload your files through our encrypted portal, share a cloud folder link, or send via WeTransfer for very large archives. All uploads are TLS 1.3 encrypted in transit.

OnyorAI Secure Upload Portal (example upload queue)

  • invoices_Q3_2024.pdf · 14.2 MB · ✓ Uploaded
  • patient_forms_batch.zip · 87.5 MB · ✓ Uploaded
  • contracts_archive.pdf · 3.1 MB · ↻ Uploading…

Accepted formats & submission methods

  • Direct portal upload: Up to 2 GB per batch. Supports PDF, JPG, PNG, TIFF, BMP, and ZIP archives.
  • Cloud folder link: Share a Google Drive, Dropbox, or OneDrive folder link for larger archives.
  • WeTransfer: For archives of 2–10 GB, send a WeTransfer link to support@onyorai.com with your order number.
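
As a rough pre-flight check, the format list and size limits above could be expressed like this. The helper names are hypothetical, and two details are assumptions rather than stated policy: that "GB" means binary gigabytes, and that a cloud folder link is the fallback for anything above 10 GB. Only the extensions named in the list are accepted here, so variants such as `.jpeg` or `.tif` would be rejected by this sketch.

```python
from pathlib import Path

GB = 1024 ** 3  # assumption: binary gigabytes; the page does not specify the unit
ACCEPTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".tiff", ".bmp", ".zip"}


def is_accepted_format(filename: str) -> bool:
    """Check the file extension against the formats listed for portal upload."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def suggest_submission_method(total_bytes: int) -> str:
    """Pick a submission method from the total archive size."""
    if total_bytes <= 2 * GB:
        return "portal"
    if total_bytes <= 10 * GB:
        return "wetransfer"
    return "cloud_folder_link"  # assumption: fallback for anything above 10 GB


print(is_accepted_format("invoices_Q3_2024.pdf"))  # True
print(suggest_submission_method(5 * GB))           # wetransfer
```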

Document quality tips for best results

  • Minimum 200 DPI for scans — 300 DPI or higher recommended for handwritten documents.
  • Pages should be flat and fully in frame — Avoid photographs at an angle. We can de-skew up to ~15 degrees.
  • Include all pages, in order — Combine multi-page documents into a single PDF per document type.
  • No password protection — Remove password locks before uploading, or provide the password in submission notes.
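
If you want to check scan resolution yourself before uploading, the arithmetic is simply pixels divided by physical inches. The sketch below assumes you already know the image's pixel width and the paper width. The function names and status strings are illustrative only.

```python
MIN_DPI = 200                      # minimum for scans, per the tips above
RECOMMENDED_HANDWRITING_DPI = 300  # recommended for handwritten documents


def effective_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Resolution implied by an image's pixel width and the paper's physical width."""
    if page_width_inches <= 0:
        raise ValueError("page_width_inches must be positive")
    return pixel_width / page_width_inches


def scan_quality(dpi: float, handwritten: bool = False) -> str:
    """Classify a scan against the 200 DPI minimum and 300 DPI recommendation."""
    if dpi < MIN_DPI:
        return "below minimum"
    if handwritten and dpi < RECOMMENDED_HANDWRITING_DPI:
        return "acceptable, 300 DPI recommended"
    return "ok"


# A US Letter page (8.5 in wide) scanned to an image 1,700 pixels wide.
dpi = effective_dpi(1700, 8.5)
print(dpi)                                  # 200.0
print(scan_quality(dpi, handwritten=True))  # acceptable, 300 DPI recommended
```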

🔒 Security reminder

Your files are encrypted in transit with TLS 1.3 from the moment they leave your device and stored with AES-256 encryption at rest. Only the project team assigned to your order has access. Source documents are permanently deleted within 72 hours of delivery.

3

Step Three

AI pipeline extracts & structures your data

Our hybrid OCR + AI Vision pipeline processes your documents in stacked layers, reaching accuracy that neither technology achieves alone.

How the hybrid pipeline works

Traditional OCR reads documents character-by-character without understanding context. Our pipeline adds a second intelligence layer on top:

  • Layer 1 — OCR Pre-processing: Nanonets OCR extracts all machine-readable text and structures it into a preliminary schema, catching 95%+ of clearly printed content.
  • Layer 2 — AI Vision Review: GPT-4 Vision processes the document as an image, handling ambiguous handwriting, crossed-out corrections, degraded text, tables, checkboxes, and multi-language content.
  • Layer 3 — Schema Mapping: Our Make.com pipeline maps all extracted fields to your specified output schema — normalizing dates, cleaning whitespace, and deduplicating records.
  • Layer 4 — Confidence Scoring: Every field is assigned a confidence score. Anything below 95% is automatically routed to human QA review.
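
To make Layers 3 and 4 concrete, here is a minimal plain-Python sketch of the same ideas: whitespace cleanup, date normalization, record deduplication, and routing by a 95% confidence threshold. It is not the production pipeline (which, per the text, runs on Make.com). All function names, the assumed DD/MM/YYYY input format, and the field dictionary layout are assumptions for illustration.

```python
import re
from datetime import datetime

QA_THRESHOLD = 0.95  # fields below this confidence go to human QA, per Layer 4


def clean_whitespace(value: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", value).strip()


def normalize_date(value: str, input_format: str = "%d/%m/%Y") -> str:
    """Convert a date string to ISO 8601 (assumes DD/MM/YYYY input by default)."""
    return datetime.strptime(value.strip(), input_format).date().isoformat()


def deduplicate(records: list) -> list:
    """Drop records whose field/value pairs are all identical, keeping first occurrences."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def route_fields(fields: dict) -> tuple:
    """Split {name: (value, confidence)} into accepted values and values flagged for QA."""
    accepted = {n: v for n, (v, c) in fields.items() if c >= QA_THRESHOLD}
    flagged = {n: v for n, (v, c) in fields.items() if c < QA_THRESHOLD}
    return accepted, flagged


print(normalize_date("03/09/2024"))  # 2024-09-03
accepted, flagged = route_fields({"total": ("€2,847.50", 0.98), "dob": ("03/07/1991", 0.71)})
print(flagged)                       # {'dob': '03/07/1991'}
```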

📊 Accuracy by document type — OnyorAI vs. Traditional OCR

  • Clean digital PDFs: 99.4%
  • Printed scanned forms: 98.1%
  • Mixed print + handwriting: 96.4%
  • Primarily handwritten: 94.7%
  • Degraded / old documents: 88.9%

By comparison, traditional OCR reaches 73–84% on the same document types. Figures are based on an internal benchmark of 5,000 documents.

4

Step Four

Human QA specialist reviews flagged records

Every field flagged as low-confidence by the AI is manually reviewed by a trained QA specialist against the original document image. We never silently pass a questionable record.

🔍 QA Review Queue — Sample

  • Invoice #4821, Total Amount · AI read: €2,847.50 · QA confirmed · Confidence: 98%
  • Form 0047, Date of Birth · AI read: 03/07/1991 or 07/03/1991 · QA corrected to 07/03/1991 · Flagged in report
  • Contract Ref, Company Name · AI read: “Rousseau & Partners” · QA confirmed vs. letterhead · Confidence: 99%
  • Patient Form, Medication Field · Ambiguous handwriting · QA cross-referenced diagnosis field · Corrected to Metformin
  • Invoice #4825, All Fields · 100% confidence on all 14 extracted fields · Passed directly to output
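
The Date of Birth entry in the sample queue shows one plausible reason a field gets flagged: a value such as 03/07/1991 is a valid date in both day-first and month-first order. The sketch below detects that kind of ambiguity. It is an illustration only, and the function names are hypothetical rather than part of the in-house QA tool.

```python
from datetime import date, datetime


def date_interpretations(text: str) -> list:
    """Return every distinct date the text could mean under DD/MM/YYYY or MM/DD/YYYY."""
    results = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            results.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date in this order
    return sorted(results)


def is_ambiguous_date(text: str) -> bool:
    """True when day-first and month-first readings give different valid dates."""
    return len(date_interpretations(text)) > 1


print(is_ambiguous_date("03/07/1991"))  # True: 7 March or 3 July
print(is_ambiguous_date("25/12/1991"))  # False: only day-first is valid
```

A reviewer still has to decide which reading is right, for example by checking other dates on the same form. The code only identifies that a decision is needed.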

What our QA team actually does

  • Opens every flagged field alongside the original document image — not just the extracted text
  • Manually confirms, corrects, or escalates each low-confidence record, documented with a reason code
  • Cross-references context — verifying a medication name against a diagnosis field, or confirming an invoice total against line items
  • Marks unresolvable fields as “unreadable” rather than guessing — you always know what's uncertain
  • Generates a low-confidence report — a separate sheet listing all QA-reviewed fields with page references

🧑 Real humans, every project

Our QA team consists of trained document specialists, not crowdsourced workers. Healthcare projects are reviewed by HIPAA-trained staff. Legal documents are handled by team members with legal document experience.

5

Step Five

You receive your clean, structured output

A notification email is sent the moment your project is ready. Download your structured spreadsheet, open it in Excel or Google Sheets, and start using your data immediately.

What's included in every delivery

  • Main data file — One row per document, one column per extracted field. Fully clean, normalized, and formatted to your specifications.
  • Summary sheet — Totals, record count, date range, and key aggregates where applicable.
  • QA report — All fields that went through human review, with original AI reading, corrected value, confidence score, and document page reference.
  • Delivery receipt — Timestamped confirmation and scheduled data deletion date (72 hours after delivery).
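
For readers who want to picture the QA report, the sketch below models one row with the attributes described above (original AI reading, corrected value, confidence score, page reference) plus the reason code mentioned in the QA section. The column names, the `AMBIGUOUS_DATE` code, the page number, and the 0.62 confidence value are all hypothetical. The actual report layout may differ.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class QAReportRow:
    document: str
    page: int
    field_name: str
    ai_reading: str
    corrected_value: str
    confidence: float
    reason_code: str  # hypothetical code values, e.g. "AMBIGUOUS_DATE"


def write_qa_report(rows: list) -> str:
    """Serialize QA rows to CSV text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(QAReportRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Illustrative row based on the Date of Birth example; page and confidence are made up.
report = write_qa_report([
    QAReportRow("Form 0047", 1, "Date of Birth",
                "03/07/1991 or 07/03/1991", "07/03/1991", 0.62, "AMBIGUOUS_DATE"),
])
print(report)
```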

After delivery

  • Source documents automatically deleted within 72 hours
  • Output files remain in the portal for 30 days
  • Free revisions within 14 days if you spot any errors
  • Monthly subscribers: processing allocation resets on your billing date
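
The three time windows above are easy to compute from a delivery timestamp. This sketch assumes each window is counted from the moment of delivery, which the text implies but does not state outright for the 14-day and 30-day periods. The function and key names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retention_schedule(delivered_at: datetime) -> dict:
    """Return the key deadlines that follow a delivery."""
    return {
        "source_deleted_by": delivered_at + timedelta(hours=72),
        "revisions_until": delivered_at + timedelta(days=14),
        "portal_files_until": delivered_at + timedelta(days=30),
    }


delivered = datetime(2024, 9, 14, 10, 0, tzinfo=timezone.utc)
schedule = retention_schedule(delivered)
for name, deadline in schedule.items():
    print(name, deadline.isoformat())
# source_deleted_by 2024-09-17T10:00:00+00:00
# revisions_until 2024-09-28T10:00:00+00:00
# portal_files_until 2024-10-14T10:00:00+00:00
```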

invoices_Q3_extracted.xlsx

847 records · Delivered 48h after submission · QA: 12 fields reviewed

Invoice # · Date · Vendor · Total · Status
INV-4821 · 2024-09-03 · Rousseau & Co · €2,847.50 · ✓ Paid
INV-4822 · 2024-09-05 · Tech Supplies SA · €188.00 · ⏳ Due
INV-4823 · 2024-09-07 · Office Depot · €54.99 · ✓ Paid
INV-4824 · 2024-09-09 · CloudServ Inc. · €1,200.00 · ✓ Paid
INV-4825 · 2024-09-12 · Dupont Partners · €3,450.75 · ⏳ Due
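
Once the output is in hand, working with it takes only a few lines. The sketch below assumes the five sample rows above were delivered as CSV with these exact column names and plain "Paid" / "Due" status values. That layout is an assumption, since the real export follows your own specifications. It totals the invoice amounts by status using exact decimal arithmetic.

```python
import csv
import io
from decimal import Decimal

# Assumed CSV rendering of the sample rows shown above.
SAMPLE_CSV = """Invoice #,Date,Vendor,Total,Status
INV-4821,2024-09-03,Rousseau & Co,"€2,847.50",Paid
INV-4822,2024-09-05,Tech Supplies SA,€188.00,Due
INV-4823,2024-09-07,Office Depot,€54.99,Paid
INV-4824,2024-09-09,CloudServ Inc.,"€1,200.00",Paid
INV-4825,2024-09-12,Dupont Partners,"€3,450.75",Due
"""


def parse_amount(text: str) -> Decimal:
    """Turn a string like '€2,847.50' into an exact Decimal."""
    return Decimal(text.replace("€", "").replace(",", ""))


def totals_by_status(csv_text: str) -> dict:
    """Sum the Total column for each distinct Status value."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row["Status"]
        totals[status] = totals.get(status, Decimal("0")) + parse_amount(row["Total"])
    return totals


totals = totals_by_status(SAMPLE_CSV)
print(totals["Paid"])  # 4102.49
print(totals["Due"])   # 3638.75
```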

📦 How would you like your output?

  • Excel (.xlsx): With summary + QA sheets included (Most Popular)
  • CSV: Clean and import-ready for any system (Free)
  • Airtable Base: Pre-configured with views & filters (Free)
  • Google Sheets: Shared directly to your account (Free)

The technology stack that powers everything

OnyorAI's pipeline is purpose-built for document intelligence — combining best-in-class tools at each layer, integrated into a seamless, automated workflow.

GPT-4 Vision

Handwriting & context intelligence — understands document meaning, not just characters

Nanonets OCR

High-speed machine-print extraction — processes 1,000+ pages per hour

Adobe Acrobat AI

Structured PDF parsing — tables, form fields, and embedded text extraction

Make.com

Workflow automation — schema mapping, normalization, and delivery orchestration

AWS Frankfurt

EU-based encrypted storage — GDPR compliant, AES-256 at rest, TLS 1.3 in transit

Internal QA Tool

Confidence scoring, flagging, and human review interface built in-house

Document Type · Traditional OCR · OnyorAI
Digital PDFs · 96–98% · 99.4%
Printed scans · 88–93% · 98.1%
Mixed handwriting · 71–76% · 96.4%
Fully handwritten · 62–69% · 94.7%
Old / degraded · 55–64% · 88.9%
Multi-language · 70–82% · 95.2%
Tables & forms · 84–89% · 97.8%

Ready to start your first project?

Most clients go from first upload to finished spreadsheet in under 48 hours. Choose a plan and submit your documents — we handle everything from there.

  1. You: Choose a plan & upload documents. Takes ~5 minutes. Specify your output format and any special requirements.
  2. Us: AI processes + human QA reviews. 24–48 hours for most projects. Track status by email or by contacting us.
  3. You: Receive email with download link. Click to download your structured spreadsheet instantly.
  4. Us: Source documents deleted within 72h. Automatic, irreversible, and confirmed by email. Your privacy is protected.