How to Edit PDF Metadata: Author, Title, Keywords, and Beyond

May 4, 2026·8 min read

Almost every PDF carries metadata: information about the file rather than its visible content. That includes the title, author, subject, keywords, creation and modification dates, the tool that produced the file, and sometimes far more. Editing this metadata matters for organization, searchability, branding, privacy, and compliance. The mechanics are simple; the choices about what to include or strip are more interesting.

What metadata lives in a PDF

A typical PDF has two distinct metadata stores:

Document Information Dictionary (DocInfo). The classic PDF metadata: Title, Author, Subject, Keywords, Creator, Producer, CreationDate, ModDate. It is a dictionary object referenced by the /Info entry in the file trailer. Readers have supported it since the earliest versions of the format.

XMP metadata. An XML-based metadata standard used in Adobe products and many modern PDFs. It is embedded as a metadata stream referenced from the document catalog (the /Metadata entry). It is richer than DocInfo: it supports custom schemas, rights management, version history, and more.

Modern PDFs typically have both, with the same values mirrored in each. Some tools maintain both; others only update one, leading to inconsistencies.
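As a rough illustration of the two stores, the sketch below scans raw PDF bytes for the marker each one leaves behind: an /Info reference for DocInfo and an <?xpacket begin= header for XMP. The function name detect_metadata_stores and the sample bytes are hypothetical. The byte scan is only a heuristic: an XMP stream that is compressed or encrypted will not match, so a real PDF library remains the reliable way to inspect a file.

```python
import re

# A trailer (or cross-reference stream dictionary) points at DocInfo
# with an indirect reference such as "/Info 12 0 R".
INFO_REF = re.compile(rb"/Info\s+\d+\s+\d+\s+R")

# An XMP packet begins with an "<?xpacket begin=" processing instruction.
XMP_HEADER = b"<?xpacket begin="


def detect_metadata_stores(pdf_bytes: bytes) -> dict:
    """Heuristically report which metadata stores appear in raw PDF bytes."""
    return {
        "docinfo": INFO_REF.search(pdf_bytes) is not None,
        "xmp": XMP_HEADER in pdf_bytes,
    }


# Hypothetical fragment, not a complete PDF: it only carries the markers.
sample = (
    b"1 0 obj\n<< /Title (Q4 2026 Financial Report) >>\nendobj\n"
    b"trailer\n<< /Root 2 0 R /Info 1 0 R >>\n"
)
print(detect_metadata_stores(sample))
```

For a real file, read it in binary mode and pass the bytes in: detect_metadata_stores(open("file.pdf", "rb").read()).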

Why metadata matters

  • Search and indexing. File systems, document management systems, and search engines use Title and Keywords to surface documents.
  • Display. A reader shows the Title in the window title bar and tab name. "Untitled" PDFs are a poor user experience.
  • Citation. Academic papers, legal filings, and reports get cited by their metadata, not their filename.
  • Branding. A consistent Author and Producer across a corporate document set is part of professional polish.
  • Privacy. Metadata can leak sensitive information: internal usernames, tool versions, and file paths from the producer's machine. See hidden data in PDFs explained.
  • Compliance. Some regulations require specific metadata (e.g., document retention category, classification level).

Tools that edit PDF metadata

Adobe Acrobat Pro. File → Properties → Description tab. Edit Title, Author, Subject, Keywords. Save the file. For deeper XMP edits, click "Additional Metadata".

Foxit PDF Editor. File → Properties → Description.

PDF-XChange Editor. File → Properties → Description.

ExifTool. The CLI Swiss army knife. Despite the name, ExifTool handles PDF metadata well, for example: exiftool -Title="My Document" file.pdf. It reads and writes both DocInfo and XMP, and it is free, open source, and scriptable, which is why it is used in many automated pipelines.

qpdf and pikepdf. Programmatic access for batch operations. See qpdf introduction.

Browser-based. Docento.app lets you edit document metadata in the browser without installing anything.

Standard fields and conventions

The classic DocInfo fields and what they typically contain:

  • Title. A human-readable document title: "Q4 2026 Financial Report", not "Microsoft Word - draft7.docx".
  • Author. The human or organization that created the content, often a person ("Jane Doe") or a company ("Acme Corp").
  • Subject. A brief description, such as "Quarterly financial results for Q4 2026".
  • Keywords. Comma-separated tags, such as "financial, quarterly, Q4, 2026, earnings".
  • Creator. The tool that authored the source (e.g., "Microsoft Word for Microsoft 365").
  • Producer. The tool that produced the PDF (e.g., "Microsoft® Word for Microsoft 365"). Sometimes the same as Creator, sometimes different (when the source was authored in one tool and converted to PDF by another).
  • CreationDate and ModDate. Timestamps in the PDF date format (D:YYYYMMDDHHmmSS plus a UTC offset) for when the file was created and last modified.

Author and Title are the two fields most worth investing in. Subject and Keywords help search. The rest are usually set by tools automatically.

Setting metadata at PDF creation

The cleanest place to set metadata is when generating the PDF in the first place:

  • Microsoft Word. File → Info → Properties → Advanced Properties. Set Title, Author, etc. before saving as PDF.
  • LibreOffice. File → Properties → Description.
  • Google Docs. File → Document Details.
  • Adobe InDesign. File → File Info.
  • LaTeX. Use the hyperref package with \hypersetup{pdftitle={...}, pdfauthor={...}}.
  • Programmatic generators. ReportLab, iText, PDFKit, and PDFlib all expose metadata fields in their APIs.

Setting metadata once at generation is much easier than fixing it across many files later.

Setting metadata after the fact

For existing PDFs:

exiftool -Title="Q4 2026 Financial Report" \
         -Author="Acme Corp Finance Team" \
         -Subject="Quarterly results, Q4 2026" \
         -Keywords="financial, quarterly, Q4, 2026" \
         file.pdf

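When a tag is given without a group, ExifTool writes it to its preferred location and also updates same-named tags that already exist elsewhere, so DocInfo and XMP usually stay aligned. To check, run exiftool -a -G1 file.pdf, which lists every value with the group it was read from. The original file is backed up as file.pdf_original unless you pass -overwrite_original.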

For batch jobs:

for f in *.pdf; do
  exiftool -Author="Acme Corp" "$f"
done

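This gives every PDF in the folder the same Author value. Passing all the files to a single invocation (exiftool -Author="Acme Corp" *.pdf) does the same job without starting the tool once per file.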
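When the batch job runs from a script rather than a shell, one option is to build the ExifTool argument list in code and pass it to the process as a list, which sidesteps shell quoting problems in titles and filenames. This is a minimal sketch: build_exiftool_args and tag_folder are hypothetical helper names, and it assumes exiftool is installed and on the PATH. Only the -TAG=VALUE and -overwrite_original options shown earlier are used.

```python
import subprocess
from pathlib import Path


def build_exiftool_args(fields: dict, overwrite: bool = False) -> list:
    """Turn {"Author": "Acme Corp"} into ["exiftool", "-Author=Acme Corp"]."""
    args = ["exiftool"]
    if overwrite:
        # Skip the file.pdf_original backup that ExifTool writes by default.
        args.append("-overwrite_original")
    for tag, value in fields.items():
        args.append(f"-{tag}={value}")
    return args


def tag_folder(folder: str, fields: dict, overwrite: bool = False) -> int:
    """Apply the same fields to every PDF in a folder with one exiftool call."""
    # Absolute paths avoid a filename starting with "-" being read as an option.
    pdfs = sorted(str(p.resolve()) for p in Path(folder).glob("*.pdf"))
    if not pdfs:
        return 0
    # A list argument is passed to the process as-is, with no shell involved.
    result = subprocess.run(build_exiftool_args(fields, overwrite) + pdfs)
    return result.returncode
```

Calling tag_folder(".", {"Author": "Acme Corp"}) would mirror the loop above and return ExifTool's exit code.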

Stripping metadata for privacy

A common workflow is to strip metadata before sharing a PDF externally:

exiftool -all= file.pdf

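This removes all metadata fields from the current view of the document. ExifTool's documentation warns that its PDF edits are reversible: changes are appended as an incremental update, so the old values remain in the file until it is rewritten, for example with qpdf --linearize in.pdf out.pdf. Use this when: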

  • Sharing a PDF outside your organization
  • Distributing a document where the internal usernames or tool versions should not leak
  • Anonymizing a document for review

For more thorough hidden-data removal (not just metadata, but also annotations, hidden layers, JavaScript), see how to strip metadata from PDF and how to anonymize PDF documents.

XMP metadata: when you need it

XMP supports richer metadata than DocInfo. Use cases:

  • Custom fields. Track project codes, classification levels, expiration dates that do not fit DocInfo's standard schema.
  • Rights management. Copyright owner, license terms, usage rights.
  • Provenance. Who edited the file when, with what software.
  • Domain-specific schemas. Many industries (publishing, healthcare, legal) have XMP extensions for their needs.

To edit XMP:

  • Acrobat Pro: Properties → Additional Metadata → Advanced → load or edit XMP directly.
  • ExifTool: exiftool -XMP-dc:Description="My description" file.pdf. The -XMP-dc: prefix accesses the Dublin Core XMP namespace.
  • Programmatic: pikepdf exposes DocInfo through the .docinfo property and XMP through the .open_metadata() API.

For most documents, DocInfo is sufficient. XMP becomes necessary for compliance workflows and asset management.

Document properties beyond DocInfo and XMP

A PDF can also expose:

  • Page count. Derived automatically from the page tree.
  • PDF version. Set at creation.
  • File size. Derived from the file itself.
  • Encryption status. Read from the encryption dictionary.
  • Form field status. Whether the document contains interactive forms.
  • Number of words. Some tools count this automatically.

These are typically displayed in File → Properties but not directly editable. You change them indirectly (e.g., by re-saving at a different PDF version).

Common gotchas

DocInfo and XMP get out of sync. Edit Title in DocInfo only, and XMP still has the old Title. Some viewers prefer one over the other for display. Always update both, then confirm with exiftool -a -G1 file.pdf, which lists each value alongside the store it came from.
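A sync check can be sketched as a comparison between the two stores using the usual DocInfo-to-XMP correspondence (the same crosswalk PDF/A relies on). In this sketch the two stores are plain dictionaries; in practice they might be filled from pikepdf's .docinfo and .open_metadata(), but that loading step is left out here. The names DOCINFO_TO_XMP and find_mismatches are hypothetical, and dates are skipped because the two stores write them in different formats.

```python
# DocInfo key -> equivalent XMP property.
DOCINFO_TO_XMP = {
    "/Title": "dc:title",
    "/Author": "dc:creator",
    "/Subject": "dc:description",
    "/Keywords": "pdf:Keywords",
    "/Creator": "xmp:CreatorTool",
    "/Producer": "pdf:Producer",
}


def _as_text(value) -> str:
    """Flatten a value to text; list values (e.g. dc:creator) are joined."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return ", ".join(str(v) for v in value)
    return str(value)


def find_mismatches(docinfo: dict, xmp: dict) -> dict:
    """Return {docinfo_key: (docinfo_value, xmp_value)} where the stores differ."""
    mismatches = {}
    for info_key, xmp_key in DOCINFO_TO_XMP.items():
        info_value = _as_text(docinfo.get(info_key)).strip()
        xmp_value = _as_text(xmp.get(xmp_key)).strip()
        if info_value != xmp_value:
            mismatches[info_key] = (info_value, xmp_value)
    return mismatches


docinfo = {"/Title": "Q4 2026 Financial Report", "/Author": "Acme Corp"}
xmp = {"dc:title": "Draft", "dc:creator": ["Acme Corp"]}
print(find_mismatches(docinfo, xmp))
```

An empty result means the fields covered by the mapping agree; a field missing from one store and present in the other is reported as a mismatch against an empty string.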

Metadata strips when re-saving. Some tools rewrite the entire file on save and drop custom metadata. After heavy edits, verify metadata is still intact.

Date format inconsistencies. PDF dates use a specific format: D:YYYYMMDDHHmmSS followed by a UTC offset written as +HH'mm', -HH'mm', or Z (for example, D:20260504093000+02'00'). Tools that write dates in other formats may produce metadata that breaks parsers.
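To make the date format concrete, here is a small sketch that converts between Python datetime objects and PDF date strings. The function names are hypothetical. Two assumptions are worth flagging: the formatter always emits the trailing apostrophe after the offset minutes (some PDF versions omit it, so the parser accepts both), and a date with no offset at all is treated as UTC even though the format strictly leaves it unspecified.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY then optional MM DD HH mm SS, then optional Z or +HH'mm' / -HH'mm'.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Z+-])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def format_pdf_date(dt: datetime) -> str:
    """Format a timezone-aware datetime as a PDF date string."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    offset = dt.utcoffset()
    stamp = dt.strftime("D:%Y%m%d%H%M%S")
    if offset == timedelta(0):
        return stamp + "Z"
    sign = "+" if offset > timedelta(0) else "-"
    minutes = abs(offset) // timedelta(minutes=1)
    return f"{stamp}{sign}{minutes // 60:02d}'{minutes % 60:02d}'"


def parse_pdf_date(text: str) -> datetime:
    """Parse a PDF date string; missing fields take their lowest value."""
    match = _PDF_DATE.match(text.strip())
    if not match:
        raise ValueError(f"not a PDF date: {text!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = match.groups()
    if sign in (None, "Z"):
        tz = timezone.utc  # assumption: no offset is treated as UTC
    else:
        delta = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        tz = timezone(delta if sign == "+" else -delta)
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0), tzinfo=tz,
    )
```

A quick round trip such as parse_pdf_date(format_pdf_date(some_datetime)) == some_datetime is a cheap way to check that a pipeline writes dates other tools can read back.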

Unicode in metadata fields. Older PDF tools sometimes mangle non-ASCII characters in metadata. Test with curly quotes, em dashes, and accented characters.
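One common source of mangled characters is the byte encoding of the string itself. PDF text strings are stored either in the single-byte PDFDocEncoding or as UTF-16BE with a leading byte order mark (FE FF). The sketch below shows that choice in isolation. The function names are hypothetical, the decoder uses Latin-1 as an approximation of PDFDocEncoding (the two agree on ASCII but not on every byte), and escaping of parentheses and backslashes for a literal string is deliberately left out.

```python
UTF16_BOM = b"\xfe\xff"


def encode_pdf_text_string(text: str) -> bytes:
    """Encode metadata text: plain ASCII if possible, else UTF-16BE with BOM."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return UTF16_BOM + text.encode("utf-16-be")


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode bytes produced above (Latin-1 stands in for PDFDocEncoding)."""
    if raw.startswith(UTF16_BOM):
        return raw[len(UTF16_BOM):].decode("utf-16-be")
    return raw.decode("latin-1")


title = "Caf\u00e9 \u201cmenu\u201d"  # accented letter plus curly quotes
encoded = encode_pdf_text_string(title)
print(encoded[:2] == UTF16_BOM, decode_pdf_text_string(encoded) == title)
```

A tool that drops the byte order mark, or reads the UTF-16 bytes as single-byte text, produces exactly the kind of garbage this gotcha describes.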

Document title falls back to filename. Some readers show the filename in the title bar if the Title metadata is empty. Setting Title explicitly avoids the "Untitled-document-2.pdf" experience.

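Old metadata left behind in the file. A PDF edited by incremental update keeps its earlier objects, so a stale XMP packet or Info dictionary can still be found by tools that walk the raw bytes, even though viewers no longer display it. ExifTool edits PDFs this way, and its documentation warns that the changes are reversible. If privacy matters, rewrite the file after stripping, for example with qpdf --linearize in.pdf out.pdf, so the old data is actually discarded.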

Bookmarks and metadata confusion. Bookmarks (PDF outlines) are sometimes thought of as metadata, but they are not; they live in a separate part of the file. See how to add bookmarks to PDF.

Privacy-first metadata workflow

For documents shared externally:

  1. Author the document with full metadata for internal use
  2. Before external sharing, strip the metadata from a copy: exiftool -all= file.pdf
  3. Re-add only the externally appropriate fields: Title, Author (organization name, not individual), Subject
  4. Rewrite the file so the stripped values cannot be recovered from an earlier revision: qpdf --linearize file.pdf external.pdf
  5. Verify with exiftool external.pdf that only the intended fields appear

This pattern keeps internal-only information out of external copies.
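The workflow above can be sketched as an allowlist plus a command plan. The helper names external_fields and plan_commands, and the file names, are hypothetical; the sketch assumes exiftool and qpdf are installed. It adds a qpdf --linearize rewrite between re-adding and verifying, because ExifTool's documentation notes that its PDF edits are reversible until the file is rewritten. The functions only build the commands; running them is left to the caller.

```python
ALLOWED_EXTERNAL = ("Title", "Author", "Subject")


def external_fields(internal: dict, organization: str) -> dict:
    """Keep only allowlisted fields and replace the individual author."""
    kept = {k: v for k, v in internal.items() if k in ALLOWED_EXTERNAL}
    kept["Author"] = organization
    return kept


def plan_commands(fields: dict, work_copy: str, output: str) -> list:
    """Return argv lists for strip, re-add, rewrite, and verify, in order."""
    readd = ["exiftool", "-overwrite_original"]
    readd += [f"-{tag}={value}" for tag, value in fields.items()]
    return [
        ["exiftool", "-overwrite_original", "-all=", work_copy],  # strip
        readd + [work_copy],                                      # re-add
        ["qpdf", "--linearize", work_copy, output],               # rewrite
        ["exiftool", output],                                     # verify
    ]


internal = {
    "Title": "Q4 2026 Financial Report",
    "Author": "jdoe",
    "Subject": "Quarterly results, Q4 2026",
    "Keywords": "internal, draft",
}
fields = external_fields(internal, "Acme Corp")
for command in plan_commands(fields, "work.pdf", "external.pdf"):
    print(command)
```

Each planned command could then be executed with subprocess.run(command, check=True) against a copy of the internal file, never the original.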

Quick recipes

Set Title and Author on a single file:

exiftool -Title="My Title" -Author="Jane Doe" file.pdf

Or open in Acrobat Pro, File → Properties → Description, edit, save.

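Strip all metadata before sharing, then rewrite the file so the old values are not left behind in an earlier revision:

exiftool -all= file.pdf
qpdf --linearize file.pdf clean.pdf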

Read current metadata:

exiftool file.pdf

Batch-update Author across a folder:

exiftool -Author="Acme Corp" *.pdf

Takeaway

Editing PDF metadata is one of the cheapest improvements you can make to a document. Setting a meaningful Title and Author affects how the file is displayed, searched, and cited. Stripping metadata before external sharing keeps internal information out of the wrong hands. ExifTool is the right CLI tool for both reading and writing; Acrobat Pro and similar editors handle the GUI workflow; Docento.app handles browser-based metadata edits. For deeper hidden-data concerns beyond DocInfo and XMP, see hidden data in PDFs explained. Set metadata once at PDF creation when you can; fix it after the fact when you must.
