pdftk Introduction: The Friendly PDF Toolkit

April 21, 2026·7 min read

pdftk, the PDF Toolkit, is the friendly CLI tool that lets you manipulate PDFs with intuitive commands. While Ghostscript is the heavyweight engine and qpdf is the structural surgeon, pdftk is the everyday utility: split, merge, rotate, stamp, fill forms, encrypt, repair. This guide is an introduction to what pdftk does and how to use it.

What pdftk is

pdftk was created by Sid Steward as a command-line front end to the iText library. The original was a native build: C++ code linked against iText compiled with GCJ, GNU's Java compiler. It became the long-running standard, and it is still distributed as PDFtk Server, alongside a paid Windows GUI called PDFtk Pro. A modern port, pdftk-java, reimplements the tool in Java, runs on current Java runtimes, and keeps the command syntax compatible.

For Linux users in 2026, pdftk-java (often packaged as just pdftk) is the standard install.

Installing pdftk

Debian / Ubuntu:

sudo apt install pdftk-java

(Sometimes packaged as just pdftk.)

Fedora:

sudo dnf install pdftk

macOS:

brew install pdftk-java

Windows:

Install Java (if not already present), then download pdftk-java from GitHub releases and add to PATH. Or use Chocolatey: choco install pdftk.

After installation, the command is pdftk.

Basic command structure

pdftk input.pdf [operation] output output.pdf

The output keyword is required before the output filename. For example:

pdftk in.pdf cat 1-3 5-end output out.pdf
pdftk in1.pdf in2.pdf cat output combined.pdf
pdftk in.pdf burst output page-%d.pdf

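pdftk's command syntax is more readable than Ghostscript's. Operations are verbs such as cat, shuffle, burst, rotate, stamp, background, fill_form, and dump_data; options such as flatten, user_pw, owner_pw, and allow follow the output filename. There is no separate merge or encrypt verb: merging is cat with several inputs, and encryption is set through the password options.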

Common operations

Combine PDFs:

pdftk file1.pdf file2.pdf file3.pdf cat output combined.pdf

Split into individual pages:

pdftk input.pdf burst output page-%04d.pdf

This creates page-0001.pdf, page-0002.pdf, ...

Extract a page range:

pdftk input.pdf cat 1-5 output pages-1-5.pdf

Extract specific pages:

pdftk input.pdf cat 1 3 5 7-9 output selected.pdf

Reorder pages:

pdftk input.pdf cat 1 3 2 4 output reordered.pdf

Reverse pages:

pdftk input.pdf cat end-1 output reversed.pdf

Rotate pages:

pdftk input.pdf cat 1-3 4east 5-end output rotated.pdf

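4east sets page 4's rotation to 90° clockwise. The compass keywords are absolute: north (0°), east (90°), south (180°), west (270°). For a rotation relative to the page's current orientation, use right (+90°), left (-90°), or down (+180°).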

See how to rotate a PDF page.

Merge multiple PDFs with handles:

pdftk A=file1.pdf B=file2.pdf cat A1-3 B1 A4-end output combined.pdf

Take pages 1-3 from file1, page 1 of file2, then pages 4 onward from file1.
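Handles also work with pdftk's shuffle operation, which takes one page from each range in turn. The following is a minimal sketch, assuming a duplex scan split into one file of front sides (in order) and one file of back sides (scanned last page first). The collate_duplex helper and the filenames are hypothetical:

```shell
# collate_duplex FRONTS BACKS OUT
# Interleaves pages as A1, B(last), A2, B(last-1), ...
# Assumes BACKS was scanned in reverse order; use "shuffle A B" if it was not.
collate_duplex() {
  pdftk A="$1" B="$2" shuffle A Bend-1 output "$3"
}

# Example call (hypothetical filenames):
# collate_duplex odd.pdf even.pdf collated.pdf
```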

Stamp and watermark

pdftk's stamp commands are uniquely useful:

Background stamp (puts a template behind the content):

pdftk input.pdf background template.pdf output stamped.pdf

Foreground stamp (puts a template over the content):

pdftk input.pdf stamp template.pdf output stamped.pdf

Multi-stamp (one template per page):

pdftk input.pdf multistamp templates.pdf output stamped.pdf

Each page of the input gets stamped with the corresponding page of the templates file.

See how to add a watermark to PDF and how to add a header and footer to PDF.

Form operations

pdftk's form-handling capabilities are why many people install it.

Extract form data:

pdftk filled.pdf dump_data_fields

Lists every form field with its name, type, and value.

pdftk filled.pdf dump_data_fields_utf8 > data.txt

Saves to a text file for parsing.

See how to export PDF form data.
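For parsing, the field dump can be reduced to one name/value pair per line. This is a rough sketch that assumes the usual record layout (records separated by --- lines, with FieldName: and FieldValue: entries); fields_to_tsv is a hypothetical helper name, and unfilled fields come out with an empty value:

```shell
# fields_to_tsv: read dump_data_fields output on stdin and
# print "name<TAB>value" for each field record.
fields_to_tsv() {
  awk '
    /^---/          { if (name != "") print name "\t" value
                      name = ""; value = ""; next }
    /^FieldName: /  { name  = substr($0, 12) }   # text after "FieldName: "
    /^FieldValue: / { value = substr($0, 13) }   # text after "FieldValue: "
    END             { if (name != "") print name "\t" value }
  '
}

# Example:
# pdftk filled.pdf dump_data_fields_utf8 | fields_to_tsv > data.tsv
```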

Fill a form from data:

pdftk template.pdf fill_form data.fdf output filled.pdf

Where data.fdf is an FDF file with field/value pairs. See how to import data into a PDF form.
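If you do not already have an FDF file, one can be generated from plain name/value pairs. The sketch below writes a minimal FDF of the kind commonly used with fill_form; it assumes ASCII-only names and values, and the make_fdf and fdf_escape helpers are hypothetical names, not part of pdftk:

```shell
# fdf_escape STRING: backslash-escape the characters that are special
# inside a PDF literal string (backslash and parentheses).
fdf_escape() {
  printf '%s' "$1" | sed 's/[\\()]/\\&/g'
}

# make_fdf: read "name<TAB>value" lines on stdin, write a minimal FDF on stdout.
make_fdf() {
  tab=$(printf '\t')
  printf '%%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n'
  while IFS=$tab read -r name value; do
    printf '<< /T (%s) /V (%s) >>\n' "$(fdf_escape "$name")" "$(fdf_escape "$value")"
  done
  printf '] >> >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%%%EOF\n'
}

# Example:
# make_fdf < data.tsv > data.fdf
```

Field names in the input must match the names reported by dump_data_fields exactly.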

Flatten a filled form (so values become page content, not editable fields):

pdftk filled.pdf output flat.pdf flatten

See how to flatten a PDF.

Encryption and decryption

Encrypt:

pdftk input.pdf output encrypted.pdf user_pw user_password owner_pw owner_password

Decrypt (supply the owner password; the user password is enough only when the file has no owner password):

pdftk input.pdf input_pw owner_password output decrypted.pdf

Set permissions:

pdftk input.pdf output protected.pdf \
      owner_pw owner_password \
      allow Printing CopyContents

Allowed values: Printing, DegradedPrinting, ModifyContents, Assembly, CopyContents, ScreenReaders, ModifyAnnotations, FillIn, AllFeatures. See PDF permissions explained.

Metadata operations

Dump metadata:

pdftk input.pdf dump_data

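Output includes the Info entries (title, author, and so on), the page count, bookmarks (outline), page labels, and per-page media sizes. Form fields are reported separately by dump_data_fields.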

Update metadata:

pdftk input.pdf update_info_utf8 metadata.txt output updated.pdf

Where metadata.txt contains the updated metadata. See how to edit PDF metadata.
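The metadata file uses the same record layout that dump_data prints. As a small sketch, assuming the InfoBegin / InfoKey / InfoValue layout used by pdftk 2.x and pdftk-java, a helper can emit one record per key. The info_record name and the sample values are hypothetical:

```shell
# info_record KEY VALUE: print one Info record in dump_data's layout.
info_record() {
  printf 'InfoBegin\nInfoKey: %s\nInfoValue: %s\n' "$1" "$2"
}

# Example (placeholder title and author):
# { info_record Title "Quarterly Report"; info_record Author "Jane Doe"; } > metadata.txt
```

Keys that are not listed in the file should be left as they are in the PDF, so the file only needs the entries you want to change.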

Bookmarks (outline)

pdftk lets you read and write the document outline:

pdftk input.pdf dump_data | grep -A 3 BookmarkBegin

To add bookmarks, edit the dump_data output and re-import:

pdftk input.pdf update_info data.txt output with-bookmarks.pdf

See how to add bookmarks to PDF.
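Bookmark records can also be generated rather than typed by hand. This sketch assumes the four-line record layout that dump_data prints (BookmarkBegin, BookmarkTitle, BookmarkLevel, BookmarkPageNumber); the bookmarks_from_tsv helper and the toc.tsv input file are hypothetical:

```shell
# bookmarks_from_tsv: read "level<TAB>page<TAB>title" lines on stdin
# and print one Bookmark record per line.
bookmarks_from_tsv() {
  tab=$(printf '\t')
  while IFS=$tab read -r level page title; do
    printf 'BookmarkBegin\nBookmarkTitle: %s\nBookmarkLevel: %s\nBookmarkPageNumber: %s\n' \
      "$title" "$level" "$page"
  done
}

# Example:
# bookmarks_from_tsv < toc.tsv > data.txt
```

Level 1 is a top-level entry and level 2 nests under the preceding level 1 entry. For non-ASCII titles, update_info_utf8 is the safer import operation.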

Attachments

Attach a file to a PDF:

pdftk input.pdf attach_files document.docx output with-attachment.pdf

Extract attachments:

pdftk input.pdf unpack_files output extracted_directory

Repair damaged PDFs

pdftk damaged.pdf output repaired.pdf

Reading the file and writing it back out makes pdftk rebuild the cross-reference table and stream lengths, which often fixes minor corruption. For severe damage, see how to recover a corrupted PDF.

Compress

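pdftk's compress and uncompress output options only apply or remove Flate compression on page streams; they do not downsample images or otherwise shrink a file the way an optimizer does. For real size reduction, pipe through Ghostscript: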

pdftk input.pdf output - | gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
                              -dPDFSETTINGS=/ebook \
                              -sOutputFile=compressed.pdf -

See reduce PDF file size and Ghostscript introduction.

Common workflows

Bulk merging:

pdftk *.pdf cat output combined.pdf
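The glob above also matches combined.pdf if it already exists from an earlier run, which would feed the output file back in as an input. A slightly more defensive sketch follows; merge_all is a hypothetical helper name, and inputs are merged in the shell's glob order:

```shell
# merge_all OUT: merge every *.pdf in the current directory into OUT,
# skipping OUT itself.
merge_all() {
  out=$1
  set --                            # start with an empty input list
  for f in *.pdf; do
    [ -e "$f" ] || continue         # glob matched nothing
    [ "$f" = "$out" ] && continue   # never use the output as an input
    set -- "$@" "$f"
  done
  [ "$#" -gt 0 ] || { echo "no input PDFs found" >&2; return 1; }
  pdftk "$@" cat output "$out"
}

# Example:
# merge_all combined.pdf
```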

Batch form filling:

mkdir -p filled
for r in customers/*.fdf; do
  name=$(basename "$r" .fdf)   # customers/acme.fdf -> acme
  pdftk template.pdf fill_form "$r" output "filled/$name.pdf"
done

Stamp every PDF in a folder with a watermark:

mkdir -p stamped
for f in *.pdf; do
  [ "$f" = watermark.pdf ] && continue   # do not stamp the watermark itself
  pdftk "$f" stamp watermark.pdf output "stamped/$f"
done

Burst a multi-page PDF then OCR each page:

mkdir -p ocr
pdftk input.pdf burst output page-%04d.pdf
for p in page-*.pdf; do
  ocrmypdf "$p" "ocr/$p"
done

pdftk's strengths and weaknesses

Strengths:

  • Intuitive verb-based syntax
  • Strong form support
  • Excellent stamp/watermark capabilities
  • Good multi-PDF handling
  • Decent damage repair

Weaknesses:

  • No size-reducing compression (the compress option only Flate-encodes page streams; use Ghostscript)
  • Limited image manipulation
  • The original GCJ-based pdftk is no longer packaged by most distros; pdftk-java is the maintained successor
  • Less powerful than qpdf for structural operations

When to use pdftk vs alternatives

A typical Linux PDF toolchain uses pdftk + qpdf + Ghostscript together. Each tool has a sweet spot.

Common gotchas

pdftk vs pdftk-java. The original pdftk depends on GCJ, which was removed from GCC, so most current distros can no longer build or ship it. Install pdftk-java (often packaged as pdftk on modern distros) instead.

Java runtime. pdftk-java requires Java. If startup feels slow, the JVM is initializing, and that cost is paid on every invocation. For high-volume jobs, do more work per pdftk call or consider a native tool such as qpdf.

Form field names. Field names are case-sensitive and must match exactly. Use dump_data_fields to verify.

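FDF encoding. FDF has its own string-encoding rules, and non-ASCII values are a common source of garbled fields. fill_form also accepts XFDF, an XML format that can be saved as UTF-8. The _utf8 variants apply to dump_data, dump_data_fields, and update_info; there is no fill_form_utf8.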

Stamp position. Stamps are placed at the same coordinates as the template's content. If the template has a footer at the bottom, the stamp goes at the bottom. Adjust the template, not pdftk.

Output overwrite. pdftk does not warn before overwriting. Use distinct output filenames or backup before running.
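One way to guard against accidental overwrites is a small check in front of each call. This is only a sketch; require_new is a hypothetical helper, not a pdftk feature:

```shell
# require_new PATH: fail if PATH already exists, so a pdftk call
# chained with && is skipped.
require_new() {
  if [ -e "$1" ]; then
    echo "refusing to overwrite existing file: $1" >&2
    return 1
  fi
}

# Example (hypothetical filenames):
# require_new range.pdf && pdftk input.pdf cat 1-5 output range.pdf
```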

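Memory on large PDFs. Java's default heap may run out on very large PDFs. Raise it with JAVA_TOOL_OPTIONS=-Xmx4G pdftk ..., which the JVM reads directly; whether JAVA_OPTS has any effect depends on the launcher script your package installs.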

Practical recipe

For a typical pdftk workflow:

  1. Inspect: pdftk input.pdf dump_data | less
  2. Operate: pdftk input.pdf cat 1-5 output range.pdf
  3. Verify: open range.pdf and check
  4. Iterate
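The verify step can be partly automated by reading the page count back from dump_data. A minimal sketch, assuming the NumberOfPages: line that dump_data prints; page_count is a hypothetical helper name:

```shell
# page_count FILE: print the NumberOfPages value reported by dump_data.
page_count() {
  pdftk "$1" dump_data | awk '/^NumberOfPages:/ { print $2 }'
}

# Example: after "cat 1-5", range.pdf should report 5 pages.
# [ "$(page_count range.pdf)" -eq 5 ] && echo "page count OK"
```

A matching page count does not prove the right pages were selected, so opening the file is still worthwhile.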

Takeaway

pdftk is the friendly verb-based PDF toolkit that handles split, merge, rotate, stamp, fill forms, encrypt, and dump data. The syntax is more readable than Ghostscript's, the form support is among the strongest of any CLI tool, and the stamp commands are uniquely useful. For Linux and macOS PDF pipelines, pdftk is an essential complement to qpdf and Ghostscript. For browser-based one-off operations, Docento.app covers many similar tasks without installing software. For the broader CLI toolkit, see Ghostscript introduction, qpdf introduction, poppler-utils introduction, and MuPDF introduction.
