Docento.app

How to Convert a PDF to EPUB for Comfortable Reading on Any Device

April 24, 2026 · 7 min read

PDF and EPUB are both document formats, but they serve completely different purposes. A PDF locks the layout: every page is a fixed canvas. An EPUB flows. Text resizes, lines re-wrap, font choice belongs to the reader, and the document adapts to whatever screen it lives on. If you want to actually read a long document on a phone, e-reader, or tablet, EPUB beats PDF in nearly every dimension. This guide walks through converting PDFs to EPUB cleanly.

Why convert PDF to EPUB at all

Three reasons that come up over and over:

  • Comfortable reading on small screens. A PDF on a phone is a zoom-and-pan exercise. An EPUB reflows to fit the screen, with adjustable font sizes and a night-mode background.
  • Accessibility. EPUB is inherently reflowable, supports text-to-speech well, and integrates cleanly with screen readers (assuming the conversion preserves structure).
  • E-readers. Kindle, Kobo, Boox, and other e-ink devices are designed for reflowable formats. PDFs work, but EPUB is the native experience.

For a comparison of the formats themselves, see PDF vs EPUB and EPUB vs MOBI ebook formats.

The core challenge

PDF was designed to preserve layout: the same content on page 12 always renders the same way at the same coordinates. EPUB was designed to do the opposite and render whatever way the user's device prefers. Converting between them is not a one-to-one mapping. You are throwing away layout in exchange for flow.

Done well, the EPUB output has clean headings, paragraphs, lists, and inline images. Done poorly, it has every line as its own paragraph, hyphens stuck mid-word, columns interleaved into nonsense, and figures floating at random positions.

The quality of conversion depends almost entirely on the source PDF. A tagged PDF with logical structure converts cleanly. An untagged PDF requires heuristic reconstruction and produces uneven results. A scanned PDF requires OCR first and the output is rougher still.
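
A quick way to guess which of these three cases you are in is to look for a structure tree in the file. The sketch below is a rough first pass only: the function name classify_source is hypothetical, and it assumes the relevant PDF names appear uncompressed in the file. Many PDFs store these objects in compressed streams, so an "unknown" result proves nothing and a proper PDF library is the reliable route.

```python
from pathlib import Path


def classify_source(pdf_bytes: bytes) -> str:
    """Guess the kind of source PDF from names visible in the raw bytes.

    Heuristic only: objects inside compressed streams are invisible here,
    so "unknown" may still be a tagged or native-text PDF.
    """
    if b"/StructTreeRoot" in pdf_bytes:
        # A structure tree root is what makes a PDF "tagged".
        return "tagged"
    if b"/Font" in pdf_bytes:
        # Font resources suggest real text, but no logical structure was seen.
        return "untagged"
    # No fonts seen: possibly a scan, possibly just compressed objects.
    return "unknown"


def classify_file(path: str) -> str:
    """Convenience wrapper that reads the file from disk."""
    return classify_source(Path(path).read_bytes())
```

Treat the answer as a hint for which section of this guide to read next, not as a verdict.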

Tools that produce decent EPUBs

Calibre. The dominant open-source ebook tool. Free, runs on Windows, macOS, and Linux. To convert:

  1. Open Calibre
  2. Add the PDF
  3. Click "Convert books" and set output format to EPUB
  4. Tune the conversion settings (more on this below) and convert

Calibre uses several heuristics (heading detection, paragraph reflow, hyphenation removal), all controllable in the conversion dialog. It is the right starting point for nearly every conversion.
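
Calibre also ships a command-line tool, ebook-convert, which picks the output format from the file extension. If you prefer scripting to clicking, the same default conversion can be sketched as below. The helper names are hypothetical, and the sketch assumes ebook-convert is on your PATH.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_convert_command(pdf_path: str, epub_path: Optional[str] = None) -> List[str]:
    """Return the ebook-convert argument list for a default PDF-to-EPUB run."""
    source = Path(pdf_path)
    # Default to the same name with an .epub extension next to the PDF.
    target = Path(epub_path) if epub_path else source.with_suffix(".epub")
    return ["ebook-convert", str(source), str(target)]


def convert(pdf_path: str, epub_path: Optional[str] = None) -> None:
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(build_convert_command(pdf_path, epub_path), check=True)
```

Calling convert("book.pdf") would write book.epub next to the source file.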

Adobe Acrobat Pro. Export As → EPUB. Good for well-tagged PDFs because it uses the tag tree to build EPUB structure. Poor for untagged PDFs.

Sigil + manual cleanup. If you have a great PDF and want a beautiful EPUB, convert with Calibre, then open the result in Sigil (an open-source EPUB editor) and clean up the HTML and CSS by hand. Time-consuming but produces publishable quality.

Pandoc. Pandoc writes EPUB well but cannot read PDF, so pandoc input.pdf -o output.epub fails with an error. It is the right tool when you still have the Markdown, HTML, or LaTeX source. For an existing PDF, use Calibre instead.

Online converters. Many free services convert PDF to EPUB. Convenient for one-off jobs, but check the terms of service and avoid them for sensitive content; see are online PDF editors safe.

Calibre conversion settings that matter

The default Calibre conversion is okay; the tuned conversion is much better. The important settings:

  • Structure Detection → Remove first image. Some PDFs start with a cover. Calibre may treat it as a content image and place it again on page 1. Removing it cleans the result.
  • Heuristic Processing → Enable heuristic processing. Calibre's heuristics merge wrongly broken paragraphs, recognize headings by font size, and dedupe headers and footers. This is often the difference between "garbage EPUB" and "decent EPUB".
  • Search & Replace. For known artifacts in your PDF (e.g., "Page X of Y" footers), set up regex replacements to delete them.
  • Page Setup → Input profile / Output profile. Match the output profile to the target device (Generic e-ink for Kindle/Kobo, iPad for tablets). Affects margins and image sizing.
  • EPUB Output → Preserve cover aspect ratio. Keeps the cover from being squashed.
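
Most of these settings have ebook-convert equivalents, so a tuned conversion can be scripted. The sketch below is one possible mapping, not Calibre's recommended set: the function names and the footer pattern are assumptions for illustration, and the output profile should be changed to suit your device. Calibre's search and replace uses Python regular expressions, so the pattern can be tried out in Python first.

```python
import re
from typing import List

# Assumed footer format for illustration; adjust to what your PDF contains.
FOOTER_PATTERN = r"Page \d+ of \d+"


def strip_footers(text: str) -> str:
    """Preview what the search-and-replace rule would delete."""
    return re.sub(FOOTER_PATTERN, "", text)


def build_tuned_command(pdf_path: str, epub_path: str,
                        output_profile: str = "generic_eink") -> List[str]:
    """Return an ebook-convert argument list with the settings discussed above."""
    return [
        "ebook-convert", pdf_path, epub_path,
        "--enable-heuristics",            # Heuristic Processing
        "--remove-first-image",           # drop a duplicated cover image
        "--sr1-search", FOOTER_PATTERN,   # Search & Replace, rule 1
        "--sr1-replace", "",              # replace matches with nothing
        "--output-profile", output_profile,
        "--preserve-cover-aspect-ratio",  # EPUB Output
    ]
```

Keeping the pattern in one constant means the rule you tested is exactly the rule Calibre receives.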

After conversion, open the EPUB in Calibre's E-book viewer. Skim for major problems: misnested headings, broken tables, missing images.

When the PDF is scanned

A scanned PDF has no selectable text. You need OCR first.

  1. Run OCR using OCRmyPDF (open source) or ABBYY FineReader (commercial). This embeds a text layer in the PDF.
  2. Convert the OCR'd PDF to EPUB with Calibre.
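
The two steps chain naturally in a script. The sketch below assumes both ocrmypdf and ebook-convert are installed and on your PATH. The helper names and the ".ocr.pdf" naming scheme are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ocr_pipeline(scanned_pdf: str, language: str = "eng") -> List[List[str]]:
    """Return the two commands: OCR the scan, then convert the result to EPUB."""
    source = Path(scanned_pdf)
    ocr_pdf = source.with_name(source.stem + ".ocr.pdf")
    epub = source.with_suffix(".epub")
    return [
        # --skip-text leaves pages that already contain text untouched.
        ["ocrmypdf", "--language", language, "--skip-text", str(source), str(ocr_pdf)],
        ["ebook-convert", str(ocr_pdf), str(epub), "--enable-heuristics"],
    ]


def run_pipeline(scanned_pdf: str, language: str = "eng") -> None:
    """Run both steps in order, stopping at the first failure."""
    for command in build_ocr_pipeline(scanned_pdf, language):
        subprocess.run(command, check=True)
```

Keeping the OCR'd PDF as a separate file lets you re-run the EPUB conversion with different settings without repeating the slow OCR step.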

Quality is bounded by OCR accuracy. Run a spell-checker over the EPUB and fix obvious OCR errors (commonly: "rn" misread as "m", "0" vs "O", "1" vs "l"). For more on OCR, see PDF OCR explained and how to convert a scanned PDF to text.
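
A spell-checker finds the errors; a small script can propose fixes for the confusions listed above. This is a minimal sketch, assuming you supply your own vocabulary (a word list for your language). The names SWAPS, single_swaps, and suggest_fixes are hypothetical, and it only tries one substitution per word.

```python
import re
from typing import Dict, Iterator, Set

TOKEN = re.compile(r"[A-Za-z0-9]+")

# (what OCR produced, what was probably meant)
SWAPS = [("m", "rn"), ("rn", "m"), ("0", "o"), ("1", "l")]


def single_swaps(word: str) -> Iterator[str]:
    """Yield every variant of word with exactly one confusable sequence swapped."""
    for wrong, right in SWAPS:
        start = word.find(wrong)
        while start != -1:
            yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


def suggest_fixes(text: str, vocabulary: Set[str]) -> Dict[str, str]:
    """Map each unknown token to a known word that is one swap away, if any."""
    suggestions = {}
    for token in TOKEN.findall(text):
        word = token.lower()
        if word in vocabulary:
            continue
        for candidate in single_swaps(word):
            if candidate in vocabulary:
                suggestions[token] = candidate
                break
    return suggestions
```

Review the suggestions by hand before applying them. An OCR error that happens to produce a real word ("modem" for "modern") is invisible to this approach.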

Multi-column PDFs

Academic papers, magazines, and old technical books are often two-column or three-column layouts. Naive conversion interleaves the columns badly. Two approaches:

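  • Calibre with heuristic processing enabled. Sometimes good enough for simple layouts, but Calibre's PDF input does not handle multi-column documents reliably, so check the reading order carefully.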
  • Pre-process the PDF. Run the PDF through a column-aware OCR tool (ABBYY, Tesseract with LSTM and layout analysis) that detects columns and outputs text in correct reading order. Then convert that text or the cleaned PDF.
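
To see why naive conversion interleaves columns and what "correct reading order" means, consider text blocks with page coordinates. The sketch below is a toy illustration, not how ABBYY or Tesseract work internally. The Block type is hypothetical, y is assumed to increase downward, and columns are assumed to be equally wide, with no full-width headings or figures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    x: float   # left edge of the text block
    y: float   # top edge, increasing downward (an assumption of this sketch)
    text: str


def naive_order(blocks: List[Block]) -> List[str]:
    """Top-to-bottom, then left-to-right: this interleaves the columns."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


def reading_order(blocks: List[Block], page_width: float, columns: int = 2) -> List[str]:
    """Assign each block to a column by its left edge, then read column by column."""
    column_width = page_width / columns

    def key(block: Block):
        column = min(int(block.x // column_width), columns - 1)
        return (column, block.y)

    return [b.text for b in sorted(blocks, key=key)]
```

On a two-column page, naive_order alternates between the left and right columns line by line, while reading_order finishes the left column before starting the right one.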

If a multi-column PDF is critical, this is the case for paying for ABBYY FineReader or for using an AI-based document understanding service.

Equations, formulas, and code

PDFs with mathematics, code snippets, or specialized typography are the hardest to convert. Some practical advice:

  • Math: Modern Calibre and recent academic publishers increasingly produce MathML in EPUBs. If your PDF was produced from LaTeX, you may have better luck converting the LaTeX source to EPUB (via Pandoc) than converting the PDF.
  • Code: Monospace fonts and indentation often get mangled. After conversion, search for code blocks and re-wrap with <pre> tags in the EPUB.
  • Footnotes: Should become pop-up notes in the EPUB. Calibre handles this with the "Footnote handling" setting if footnotes are well-marked in the PDF.
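
Two of these fixes are easy to script. The sketch below assumes you have the LaTeX source for the math case and pandoc on your PATH, and that you have already collected the lines of a mangled code block for the code case. Both function names are hypothetical.

```python
from html import escape
from typing import List


def build_pandoc_command(tex_path: str, epub_path: str) -> List[str]:
    """Return a pandoc command that converts LaTeX source to EPUB with MathML."""
    return ["pandoc", tex_path, "-o", epub_path, "--mathml"]


def wrap_code(lines: List[str]) -> str:
    """Wrap code lines in <pre><code>, escaping characters that are special in HTML."""
    body = "\n".join(escape(line) for line in lines)
    return "<pre><code>" + body + "</code></pre>"
```

Escaping matters because a bare < or & inside code would otherwise be read as markup and could make the EPUB's XHTML invalid.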

Images and figures

Figures usually survive the conversion intact but may end up in odd positions. After conversion:

  • Open the EPUB in Sigil
  • Move figures to logical positions in the flow
  • Add captions if Calibre stripped them
  • Set image dimensions to fit reading devices (typically max 800 px wide for e-ink)
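
For the last step, the arithmetic is a simple proportional scale. A minimal sketch, using the 800 px figure above as the default cap (the function name is hypothetical):

```python
from typing import Tuple


def fit_width(width: int, height: int, max_width: int = 800) -> Tuple[int, int]:
    """Scale dimensions down to max_width, preserving aspect ratio.

    Images that already fit are returned unchanged; nothing is ever enlarged.
    """
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, round(height * scale)
```

A 1600 × 1200 figure becomes 800 × 600, while a 640 × 480 figure is left alone.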

Quick reference recipe

A repeatable recipe that works for most native-text PDFs:

  1. Open the PDF in Calibre
  2. Right-click → Convert books → Convert individually
  3. Output format: EPUB (set the EPUB version to 3 under EPUB Output for richer features)
  4. Heuristic Processing: enable, default settings
  5. Search & Replace: remove known headers/footers
  6. Click OK
  7. Open the resulting EPUB in Calibre's E-book viewer
  8. Spot-check the first few chapters, table of contents, and one image
  9. If acceptable, send it to your device. If not, tune settings and re-convert.
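
Before the manual spot-check in step 8, a quick structural check can catch a broken file. An EPUB is a ZIP archive whose first entry is a file named mimetype containing application/epub+zip, plus a META-INF/container.xml. The sketch below counts content documents and images using only the Python standard library. The function name and the extension lists are assumptions, and it is no substitute for actually reading a few chapters.

```python
import zipfile
from typing import Dict, Union

HTML_EXTENSIONS = (".xhtml", ".html", ".htm")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".svg")


def summarize_epub(source) -> Dict[str, Union[bool, int]]:
    """Summarize an EPUB given a path or a binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        mimetype_ok = (
            bool(names)
            and names[0] == "mimetype"
            and archive.read("mimetype").strip() == b"application/epub+zip"
        )
    lowered = [name.lower() for name in names]
    return {
        "mimetype_ok": mimetype_ok,
        "has_container": "META-INF/container.xml" in names,
        "content_docs": sum(name.endswith(HTML_EXTENSIONS) for name in lowered),
        "images": sum(name.endswith(IMAGE_EXTENSIONS) for name in lowered),
    }
```

If images is 0 for a PDF that had figures, or content_docs is 1 for a twenty-chapter book, that points at the settings to revisit in step 9.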

Common gotchas

  • Hyphens at line breaks. PDFs often hyphenate words at line breaks. Under Heuristic Processing, keep "Unwrap lines" and "Remove unnecessary hyphens" enabled to re-join them.
  • Wrong table of contents. Calibre detects headings to build a TOC. If detection misfires, manually edit chapter detection rules.
  • DRM. Some PDFs are encrypted with DRM (commercial ebook PDFs). EPUB conversion of DRM'd content is generally not legal and Calibre will not do it by default.
  • Reading order on tagged PDFs. Trust the tags; they get reading order right. Untagged PDFs need heuristics that may fail.
  • Page breaks. EPUB does not have pages in the PDF sense. Hard page breaks in your PDF (every chapter starts on a new page) translate to EPUB chapter breaks. Soft page breaks disappear, which is usually what you want.
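
The hyphen gotcha is the easiest one to picture in code. The sketch below shows the basic idea on plain text extracted from a PDF. It is a simplification of what a real dehyphenation heuristic does, and the function name is hypothetical.

```python
import re

# A word character, a hyphen, a line break, then another word character.
HYPHEN_BREAK = re.compile(r"(\w)-\n(\w)")


def rejoin_hyphenated(text: str) -> str:
    """Join words that were split with a hyphen at the end of a line."""
    return HYPHEN_BREAK.sub(r"\1\2", text)
```

The catch is that a genuine compound such as "well-known" also loses its hyphen if it happened to break across lines, which is why real tools check the joined word against the rest of the document before removing the hyphen.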

Takeaway

Converting PDF to EPUB is a layout-to-flow translation that depends heavily on the source PDF's quality. Tagged, native-text PDFs convert cleanly with Calibre and a tuned set of heuristics. Scanned PDFs need OCR first. Multi-column and equation-heavy PDFs may need specialized tools or manual cleanup. For most reading workflows, a 30-minute Calibre tuning pass produces an EPUB that is far more pleasant on a phone or e-reader than the original PDF ever was. If you first need to extract specific pages (say, just one chapter), Docento.app lets you isolate them before conversion.
