Invoices are among the most common PDFs in business workflows. They flow from suppliers to customers, get matched against purchase orders, route to approvers, and end up in accounting systems. The right PDF-aware workflow can compress invoice processing from days to hours. This guide walks through the practical pieces.
The invoice lifecycle
A typical B2B invoice journey:
- Generation, supplier creates from their billing system
- Delivery, emailed, uploaded to a portal, or sent through EDI
- Ingestion, receiving organization captures the PDF
- Extraction, invoice data is pulled from the PDF
- Validation, checked against PO, expected amounts, vendor records
- Routing, sent to appropriate approver
- Approval, approver signs off (digitally or otherwise)
- Posting, sent to ERP / accounting for payment
- Payment, invoice paid; matched to a payment record
- Archive, stored for the legally-required retention period
PDFs sit at the center of all of this. Each step has its own tools and best practices.
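One way to make the lifecycle concrete is to model it as an ordered set of stages with a rule for which moves are allowed. The sketch below is illustrative only: the `Stage` names mirror the list above, while `BACKWARD_MOVES` and `can_move` are hypothetical names, and the two "step back" loops are an assumption rather than something this guide prescribes.

```python
from enum import Enum


class Stage(Enum):
    GENERATION = 1
    DELIVERY = 2
    INGESTION = 3
    EXTRACTION = 4
    VALIDATION = 5
    ROUTING = 6
    APPROVAL = 7
    POSTING = 8
    PAYMENT = 9
    ARCHIVE = 10


# Assumption: a failed validation goes back to extraction and a rejected
# approval goes back to routing. Real workflows may loop differently.
BACKWARD_MOVES = {
    Stage.VALIDATION: {Stage.EXTRACTION},
    Stage.APPROVAL: {Stage.ROUTING},
}


def can_move(current: Stage, target: Stage) -> bool:
    """Allow the next stage in order, or an explicitly listed step back."""
    if target.value == current.value + 1:
        return True
    return target in BACKWARD_MOVES.get(current, set())
```

Keeping the allowed moves in one table makes it easier to spot invoices that skipped a step, such as a payment with no recorded approval.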
Generation
Invoice PDFs are typically generated programmatically:
- ERP systems (SAP, Oracle, Workday, NetSuite) generate invoices automatically
- Billing platforms (Stripe Billing, Recurly, Chargebee) produce invoices for SaaS billing
- Accounting software (QuickBooks, Xero, Sage) generates an invoice PDF per transaction
- Custom systems generate via libraries and command-line tools (ReportLab, iText, wkhtmltopdf), see how to convert HTML to PDF
Best practices at generation:
- Include all required fields per local invoice law (varies by country)
- Embed both human-readable visual layout and machine-readable structured data
- Use PDF/A for long-term archival
- Consider ZUGFeRD or Factur-X for embedded XML invoice data
- Include unique invoice numbers and dates
- Sign or certify for integrity
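A required-field check before rendering the PDF is one simple way to act on the first practice above. This is a minimal sketch: the field names, the `"vat"` and `"none"` regime labels, and the function `missing_fields` are all hypothetical, and the sets do not represent the law of any particular country.

```python
# Hypothetical field sets for illustration only. Real requirements come from
# the tax law of each jurisdiction and should be confirmed with an accountant.
BASE_FIELDS = {"invoice_number", "issue_date", "seller_name", "buyer_name", "total"}
EXTRA_FIELDS_BY_REGIME = {
    "vat": {"seller_tax_id", "tax_rate", "tax_amount"},
    "none": set(),
}


def missing_fields(invoice: dict, regime: str) -> list:
    """Return the required field names that are absent or empty, sorted."""
    required = BASE_FIELDS | EXTRA_FIELDS_BY_REGIME[regime]
    return sorted(name for name in required if invoice.get(name) in (None, ""))
```

Running the check at generation time means an incomplete invoice never leaves the billing system, which is cheaper than having the buyer reject it later.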
Hybrid invoices (PDF + XML)
European workflows increasingly use hybrid PDF/XML invoices:
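- ZUGFeRD (German standard), PDF/A-3 with embedded structured XML in the UN/CEFACT Cross Industry Invoice syntax
- Factur-X (French counterpart, aligned with ZUGFeRD 2.x), same idea
- National XML formats, Italy's FatturaPA and Spain's Facturae are pure-XML formats with their own schemas rather than hybrid PDFs, but they serve the same machine-readable purpose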
The PDF view satisfies humans; the embedded XML satisfies automated systems. One file, two purposes.
For the underlying concept, see hybrid PDF explained.
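To show what the machine-readable half looks like to a receiving system, here is a small sketch that reads three header values from Cross Industry Invoice XML using only the standard library. `SAMPLE_XML` is a heavily trimmed, made-up sample rather than a schema-valid invoice, `read_header` is a hypothetical helper, and pulling the XML attachment out of the PDF itself (which needs a PDF library) is not shown.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the UN/CEFACT Cross Industry Invoice (CII) syntax.
NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100",
}

# Trimmed, made-up sample; a real file has many more mandatory elements.
SAMPLE_XML = """\
<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100">
  <rsm:ExchangedDocument>
    <ram:ID>INV-2024-0042</ram:ID>
  </rsm:ExchangedDocument>
  <rsm:SupplyChainTradeTransaction>
    <ram:ApplicableHeaderTradeSettlement>
      <ram:InvoiceCurrencyCode>EUR</ram:InvoiceCurrencyCode>
      <ram:SpecifiedTradeSettlementHeaderMonetarySummation>
        <ram:GrandTotalAmount>119.00</ram:GrandTotalAmount>
      </ram:SpecifiedTradeSettlementHeaderMonetarySummation>
    </ram:ApplicableHeaderTradeSettlement>
  </rsm:SupplyChainTradeTransaction>
</rsm:CrossIndustryInvoice>
"""


def read_header(xml_text: str) -> dict:
    """Read invoice number, currency, and grand total as plain strings."""
    root = ET.fromstring(xml_text)
    settlement = "rsm:SupplyChainTradeTransaction/ram:ApplicableHeaderTradeSettlement"
    summation = f"{settlement}/ram:SpecifiedTradeSettlementHeaderMonetarySummation"
    return {
        "invoice_number": root.findtext("rsm:ExchangedDocument/ram:ID", namespaces=NS),
        "currency": root.findtext(f"{settlement}/ram:InvoiceCurrencyCode", namespaces=NS),
        "grand_total": root.findtext(f"{summation}/ram:GrandTotalAmount", namespaces=NS),
    }
```

Because the values come from structured XML, no OCR or layout guessing is involved, which is the main reason to prefer this path when the XML is present.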
Delivery
Invoices reach buyers via:
- Email, most common; usually PDF attached
- Portals, supplier uploads to buyer's portal
- EDI, structured data exchange; PDFs may accompany or follow
- PEPPOL, an electronic invoicing network that originated in Europe and is now used in other regions as well
- Paper to scan, still happens; needs OCR
For buyers, the variety of channels means a unified ingestion layer is essential.
Ingestion
Inbound invoice handling:
- Email-to-folder, invoices arrive at a dedicated mailbox, then auto-route to processing
- Portal monitoring, automated scrapers pull from supplier portals
- Manual upload, for paper invoices that get scanned
- EDI integration, direct system-to-system
Once ingested, invoices need to be normalized: PDFs converted to consistent format, scanned ones OCR'd, embedded XML extracted if present.
For OCR-heavy ingestion, see PDF OCR explained and how to make a PDF searchable OCR.
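The normalization step can be pictured as a small decision function. In this sketch, `looks_like_pdf` relies on the fact that well-formed PDF files begin with the `%PDF-` marker; `plan_normalization`, its two boolean inputs, and the step names are hypothetical, and in practice a PDF library would determine whether a file has a text layer or an embedded XML attachment.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Well-formed PDF files begin with the '%PDF-' header marker."""
    return data.startswith(b"%PDF-")


def plan_normalization(has_embedded_xml: bool, has_text_layer: bool) -> list:
    """Return the normalization steps, in order, for one inbound PDF."""
    steps = []
    if has_embedded_xml:
        steps.append("extract_embedded_xml")  # structured data beats OCR
    if not has_text_layer:
        steps.append("run_ocr")  # scanned image with no selectable text
    steps.append("convert_to_standard_pdf")
    return steps
```

Checking the header bytes first helps reject attachments that only carry a `.pdf` extension, such as renamed archives.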
Extraction
Pulling structured data out of invoice PDFs:
- Embedded XML, if present (ZUGFeRD, Factur-X), extract directly
- AI / cloud document AI, Amazon Textract (AnalyzeExpense API), Google Document AI, and specialized tools like Rossum, Klippa, Hypatos
- Template-based, for high-volume single-vendor flows where layout is known
- Manual entry, fallback for low-confidence extractions
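Modern AI extraction handles arbitrary invoice layouts without per-vendor templates. Field accuracy is commonly quoted in the 90-99% range, but it depends heavily on scan quality and on the field: header fields such as invoice number and total are usually more reliable than line items. See AI data extraction from PDFs.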
Key fields extracted:
- Invoice number, date
- Vendor name, address, tax ID
- Customer reference (PO number)
- Line items (description, quantity, unit price, line total)
- Subtotal, tax, total
- Payment terms, due date
- Bank details
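The manual-entry fallback mentioned above usually hangs on a confidence score per field. The sketch below assumes the extraction service reports a score between 0 and 1 for each field; `ExtractedField`, `triage`, and the `REVIEW_THRESHOLD` value of 0.90 are hypothetical and would need tuning against real review data.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction service


# Hypothetical threshold; tune it against your own review data.
REVIEW_THRESHOLD = 0.90


def triage(fields: list) -> tuple:
    """Split fields into auto-accepted values and names needing human review."""
    accepted = {}
    needs_review = []
    for item in fields:
        if item.confidence >= REVIEW_THRESHOLD:
            accepted[item.name] = item.value
        else:
            needs_review.append(item.name)
    return accepted, needs_review
```

Sending only the low-confidence fields to a person, rather than the whole invoice, keeps the review queue short.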
Validation
Before routing for approval:
- 3-way match, invoice vs PO vs receiving record
- Duplicate detection, has this invoice number from this vendor been seen?
- Vendor verification, is this a known approved vendor?
- Math check, line totals sum to the subtotal; tax equals subtotal times the stated rate; subtotal plus tax equals the total
- Compliance checks, required fields present, format correct
- Anomaly detection, unusual amounts trigger review
Automated validation catches most errors; failures route to AP for review.
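Two of the checks above, the math check and duplicate detection, are easy to sketch. The example uses `Decimal` to avoid floating-point rounding on money. The one-cent tax tolerance and the normalization rule in `duplicate_key` are assumptions, and both function names are hypothetical.

```python
from decimal import Decimal


def math_check(line_totals, subtotal, tax_rate, tax, total) -> list:
    """Return a list of problems; an empty list means the arithmetic agrees."""
    problems = []
    if sum(line_totals, Decimal("0")) != subtotal:
        problems.append("line totals do not sum to subtotal")
    expected_tax = (subtotal * tax_rate).quantize(Decimal("0.01"))
    # Assumption: allow one cent of difference for vendor-side rounding.
    if abs(expected_tax - tax) > Decimal("0.01"):
        problems.append("tax does not match rate")
    if subtotal + tax != total:
        problems.append("subtotal plus tax does not equal total")
    return problems


def duplicate_key(vendor_id: str, invoice_number: str) -> tuple:
    """Normalize so 'INV-0042' and 'inv 0042' from one vendor collide."""
    cleaned = "".join(ch for ch in invoice_number.upper() if ch.isalnum())
    return (vendor_id.strip().upper(), cleaned)
```

Storing `duplicate_key(...)` in a set or a unique database index lets the second copy of an invoice fail fast, even when the vendor reformatted the number.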
Routing for approval
Invoices route based on:
- Amount, small invoices auto-approve; mid-sized ones need one approver; large ones need several
- Cost center, different departments have different approvers
- Category, different commodity codes route differently
- PO existence, PO-backed invoices route differently than ad-hoc
Workflow systems handle the routing. See document approval workflows.
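The routing rules above amount to a small decision table. Here is a minimal sketch, assuming two amount thresholds and a per-cost-center manager; `AUTO_APPROVE_LIMIT`, `SINGLE_APPROVER_LIMIT`, the role names, and `route` itself are hypothetical, and category-based routing is left out for brevity.

```python
from decimal import Decimal

# Hypothetical thresholds; each organization sets its own approval limits.
AUTO_APPROVE_LIMIT = Decimal("500")
SINGLE_APPROVER_LIMIT = Decimal("10000")


def route(amount: Decimal, cost_center: str, has_po: bool) -> list:
    """Return the approval steps, in order, for one validated invoice."""
    if has_po and amount <= AUTO_APPROVE_LIMIT:
        return []  # PO-backed and small: no human approval needed
    steps = [f"manager:{cost_center}"]
    if amount > SINGLE_APPROVER_LIMIT:
        steps.append("finance_controller")
    return steps
```

For example, `route(Decimal("120"), "MKT", False)` still goes to the marketing manager, because in this sketch an invoice without a PO never auto-approves.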
Approval
Approvers see the invoice (the PDF) plus extracted data:
- Click to view the PDF for context
- Confirm amounts and line items
- Approve or reject
- Optionally add notes
The PDF should display inline or open quickly. A poorly-optimized invoice that loads slowly hurts throughput.
For digital signature on approval, see how to sign a PDF online and digital signatures vs electronic signatures.
ERP posting
Approved invoices flow to the ERP:
- Vendor code matched
- GL coding applied
- Posted to accounts payable
- Payment terms applied (Net 30, 2/10 Net 30, etc.)
The PDF is linked to the AP record for reference.
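Payment terms such as 2/10 Net 30 translate directly into dates and amounts: a 2% discount if paid within 10 days, otherwise the full amount is due in 30 days. The sketch below assumes the clock starts on the invoice date (some contracts count from receipt instead); the function name `two_ten_net_thirty` is hypothetical.

```python
from datetime import date, timedelta
from decimal import Decimal


def two_ten_net_thirty(invoice_date: date, amount: Decimal) -> dict:
    """2/10 Net 30: 2% off if paid within 10 days, else full amount in 30."""
    discount = (amount * Decimal("0.02")).quantize(Decimal("0.01"))
    return {
        "discount_deadline": invoice_date + timedelta(days=10),
        "discounted_amount": amount - discount,
        "due_date": invoice_date + timedelta(days=30),
    }
```

For a 1,000.00 invoice dated 1 March 2024, paying by 11 March costs 980.00; after that the full 1,000.00 is due by 31 March.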
Payment
When the invoice comes due:
- ACH, wire, check, or other method
- Remittance advice sometimes sent back to the vendor (often as a PDF)
- Payment record linked to the original invoice
Archive
Post-payment, invoices archive:
- Stored for the legally-required period (often 7-10 years for tax records, though the exact period depends on the jurisdiction)
- Indexed by invoice number, vendor, date, amount
- Available for audit retrieval
- Stored as PDF/A so the files stay readable over the long term
See how to archive PDFs long-term and document retention policies.
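Retention rules can be encoded as a date calculation so that deletion is never a manual judgment call. This sketch assumes the retention clock starts at the end of the calendar year in which the invoice was issued, which is a common convention but not a universal one; `retention_until` and `may_delete` are hypothetical names, and the number of years must come from your own legal requirements.

```python
from datetime import date


def retention_until(invoice_date: date, years: int) -> date:
    """Assumption: the clock starts at the end of the invoice's calendar year."""
    return date(invoice_date.year + years, 12, 31)


def may_delete(invoice_date: date, years: int, today: date) -> bool:
    """True only once the whole retention period has passed."""
    return today > retention_until(invoice_date, years)
```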
Tooling
Tools across the lifecycle:
Generation:
- ERP-integrated invoice generation
- Programmatic libraries (ReportLab, iText)
Ingestion:
- Email integrations (Microsoft Power Automate, n8n)
- Document capture platforms (Kofax, ABBYY FlexiCapture)
Extraction:
- Cloud document AI (Amazon Textract, Google Document AI, Azure AI Document Intelligence)
- Specialized invoice tools (Rossum, Klippa, Hypatos)
Workflow:
- Coupa, SAP Ariba, Basware, full AP automation suites
- Bill.com, SMB-focused
- Tipalti, payments-focused
ERP integration:
- Direct ERP connectors
- iPaaS platforms (MuleSoft, Workato)
Archive:
- Document management systems (M-Files, OpenText, SharePoint)
- Cloud storage with retention policies
For small businesses
Smaller organizations have lighter setups:
- QuickBooks / Xero with attachment support for PDF invoices
- Email folder + manual processing
- Bill.com for AP automation
- Dext (formerly Receipt Bank) / Hubdoc for capture and extraction
For a few invoices per week, manual processing is fine. For dozens daily, even small businesses benefit from AP automation tools.
For freelancers and consultants
For sending invoices:
- Generate from Wave / Stripe / FreshBooks, auto-PDF
- Word / Google Docs template + Save as PDF
- Hand-built Word document with a signature added, exported to PDF
For receiving:
- Drop into email folder
- Track in spreadsheet
- File at year-end for tax
Compliance and regulation
Specific concerns:
- Country-specific invoice laws. Required fields, languages, tax breakdowns vary.
- VAT / sales tax compliance. Real-time reporting in some jurisdictions (Mexico, Italy, Spain, Hungary).
- Electronic invoicing mandates. In the EU, delivery over PEPPOL and the ViDA (VAT in the Digital Age) reforms; similar mandates are emerging in many other countries.
- GDPR for B2C invoices. Personal data on invoices needs protection, see GDPR and PDF documents.
- Anti-fraud. Invoice fraud is a major risk; detection is critical.
- Audit retention. Tax authority audit windows determine retention.
Common gotchas
Duplicate invoices. Same invoice from same vendor processed twice. Strong duplicate detection is essential.
Vendor fraud. Spoofed emails with fraudulent invoices. Verify bank details against records.
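Currency mismatches. International invoices need an explicit currency. "USD" and "$ AUD" are unambiguous; a bare "$" is not, because it could mean US, Australian, Canadian, or another dollar. Prefer ISO 4217 codes.
Tax handling. Reverse charge, tax exemptions, and multi-jurisdiction rules are complex. Get accounting input.
Line items with non-numeric quantities. "1 service" vs "10 hours" vs "1 monthly subscription" mix a quantity with a unit, which confuses extractors that expect a plain number.
Embedded approvals. An approval can be recorded by signing the PDF or by signing off in a workflow tool. Decide which one is the authoritative record and track it consistently.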
Lost invoices. A PDF that lands in spam, gets stuck in approval, or has no AP owner. Workflow visibility is essential.
Late payments. Tracking due dates is critical for both maintaining vendor relationships and managing working capital.
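The currency gotcha above lends itself to an automated check: flag any amount whose currency is not stated as an explicit code. This is a minimal sketch; `KNOWN_CODES` is a small illustrative subset of ISO 4217 rather than the full list, and `explicit_currency` is a hypothetical helper.

```python
import re
from typing import Optional

# A small illustrative subset of ISO 4217 codes, not the full list.
KNOWN_CODES = {"USD", "AUD", "CAD", "EUR", "GBP"}


def explicit_currency(amount_text: str) -> Optional[str]:
    """Return the ISO 4217 code found in the text, or None if ambiguous."""
    for token in re.findall(r"\b[A-Z]{3}\b", amount_text.upper()):
        if token in KNOWN_CODES:
            return token
    return None
```

With this check, "USD 1,200.00" and "$ AUD 1,200.00" resolve to a code, while a bare "$1,200.00" returns `None` and can be routed for review.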
Practical recipes
Send an invoice (freelancer):
- Generate in Wave / FreshBooks / Word template
- Save as PDF with embedded metadata (your business name)
- Sign if appropriate
- Email to client with clear subject and message
Process an inbound invoice (small business):
- Receive in invoice email folder
- Save PDF; back up to cloud
- Match against PO (if applicable)
- Enter into accounting software
- Schedule payment
- File for retention
Process an inbound invoice (medium business):
- Invoice arrives in dedicated mailbox
- Workflow automation captures and routes
- AI extracts fields; matches against PO
- Routes to approver
- Approver signs off in workflow tool
- Posts to ERP
- Payment scheduled
- Archived
Takeaway
Invoice management with PDFs is the operational backbone of B2B commerce. The right tools (AI extraction, workflow automation, ERP integration) turn a paper-laden, multi-day process into one that takes minutes for most invoices. For browser-based PDF operations alongside invoice workflows (combining attachments, signing, watermarking), Docento.app handles common tasks. For specific operations, see how to convert HTML to PDF (for generation), AI data extraction from PDFs (for ingestion), and PDF/A archival format explained (for archive).