How to Fix PDF Text Not Selectable

May 15, 2026·7 min read

You open a PDF, try to select a passage to copy, and the text refuses to highlight. Or it selects as one big block. Or you can copy but the paste is gibberish. PDFs where text is not selectable are common, and the cause determines the fix. This guide walks through the diagnostics and solutions.

Why text might not be selectable

Several common reasons:

  1. Scanned PDF. The "text" is actually pixels in an image; no text layer exists.
  2. Text converted to paths. Text was flattened to vector shapes; no character data remains.
  3. Custom encoding. Text is selectable but copies as gibberish because the font has no Unicode mapping.
  4. Reader limitation. Some readers do not enable selection by default.
  5. Permission restriction. PDF permissions disable text copying.
  6. Form field overlay. Text is in a form field; you need to click into the field.
  7. Background image hiding text. Text exists under an image and cannot be clicked.

The diagnosis usually takes seconds; the fix may take minutes.

Diagnosis 1: Is it a scanned PDF?

Quick check:

  • Press Ctrl+F and search for a word you can see on the page. If search finds nothing, there is probably no text layer.
  • Run pdftotext file.pdf - in a terminal. If the output is empty or only blank lines, there is no text layer. See poppler-utils introduction.
  • Zoom in. Scanned text often has visible image artifacts.
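
The pdftotext check can be scripted. The following is a minimal Python sketch, assuming poppler-utils is installed and pdftotext is on the PATH. The function names and the min_chars threshold are illustrative choices, not part of any library.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run pdftotext and return its output; '-' sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


def has_text_layer(extracted: str, min_chars: int = 20) -> bool:
    """Heuristic: the PDF has a text layer if extraction yields at least
    min_chars non-whitespace characters (arbitrary threshold for this sketch).
    Form feeds, which pdftotext emits between pages, count as whitespace."""
    visible = "".join(extracted.split())
    return len(visible) >= min_chars
```

Calling has_text_layer(extract_text("file.pdf")) then gives a yes/no answer for a single file. A threshold above zero avoids treating a stray page number as a real text layer.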

If scanned: run OCR. See PDF OCR explained and how to make a PDF searchable OCR.

A typical command:

ocrmypdf scanned.pdf with-text.pdf

After OCR, text becomes selectable.

Diagnosis 2: Is text converted to paths?

Sometimes text in a PDF has been "converted to outlines": each character is replaced by the vector shapes that draw it. The page looks identical, but the underlying data is geometric paths, not text.

Check:

  • Find (Ctrl+F) returns nothing
  • pdftotext returns nothing or only fragments
  • pdffonts file.pdf lists no fonts, or only a few unexpected ones
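
The pdffonts check can be automated by parsing its table. This is a sketch that assumes the usual pdffonts layout: a header line, a dashed rule, then one row per font whose last columns are emb, sub, uni and a two-number object ID. FontRow and parse_pdffonts are hypothetical names used only here.

```python
from typing import List, NamedTuple


class FontRow(NamedTuple):
    name: str
    embedded: bool
    subset: bool
    has_tounicode: bool


def parse_pdffonts(output: str) -> List[FontRow]:
    """Parse the table printed by `pdffonts file.pdf`.
    Columns are read from the right because the type column
    can contain spaces (for example "Type 1C" or "CID TrueType")."""
    lines = output.splitlines()
    start = None
    for i, line in enumerate(lines):
        if line.startswith("---"):  # dashed rule that ends the header
            start = i + 1
            break
    if start is None:
        return []
    rows = []
    for line in lines[start:]:
        tokens = line.split()
        if len(tokens) < 6:
            continue
        emb, sub, uni = tokens[-5], tokens[-4], tokens[-3]
        rows.append(FontRow(tokens[0], emb == "yes", sub == "yes", uni == "yes"))
    return rows
```

An empty result combined with empty pdftotext output points to a scan or outlined text. Rows where has_tounicode is False are candidates for the encoding problem covered in Diagnosis 3.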

Fix: this is generally irreversible. The character data was discarded when the text was converted to paths. Workarounds:

  • OCR the file. Although it looks like text, it is effectively an image, and OCR creates a real text layer.
  • Get the original from the producer

To OCR a vectorized PDF, render each page as an image first and then OCR it. OCRmyPDF rasterizes pages internally before recognition, so it handles this without a separate rendering step.

Diagnosis 3: Custom encoding (text copies as gibberish)

You can select the text, but what you paste is ~&*( !@#$%^&* nonsense. The font uses a custom encoding, and the PDF lacks a proper ToUnicode map, the table that translates glyph codes back to Unicode characters.

Check:

  • Select a word you can read on the page
  • Paste it somewhere
  • Compare the paste to what you see. If they differ, it is an encoding issue.
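
The paste-and-compare test can be approximated in code. The sketch below is a heuristic, not a definitive detector: copies_correctly is the known-word comparison, and garbled_ratio flags text dominated by control characters, private-use code points, or U+FFFD. Both names are made up for this example. Gibberish made of ordinary ASCII symbols will not raise the ratio, which is why the known-word comparison remains the primary check.

```python
import unicodedata


def copies_correctly(extracted: str, expected_word: str) -> bool:
    """True if a word you can read on the page appears in the extracted text."""
    return expected_word.casefold() in extracted.casefold()


def garbled_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are control characters (Cc),
    private-use (Co), unassigned (Cn), or the replacement character U+FFFD."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    suspicious = sum(
        1
        for c in chars
        if c == "\ufffd" or unicodedata.category(c) in {"Cc", "Co", "Cn"}
    )
    return suspicious / len(chars)
```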

Fix:

  • OCR the PDF to create a proper text layer
  • Open it in Acrobat Pro, which sometimes recovers the encoding mapping
  • Try a different reader. Foxit or PDF-XChange may handle the font differently.

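Forcing OCR over existing text:

ocrmypdf --force-ocr input.pdf output.pdf

By default OCRmyPDF refuses to process pages that already contain text. --force-ocr rasterizes each page and runs OCR on the result, so the broken text is replaced by a fresh layer; the trade-off is that the pages become images and the file may grow. --redo-ocr is the lighter option, but it only replaces a hidden text layer left by an earlier OCR pass and leaves visible text untouched, so it does not help when the visible text itself carries the bad encoding.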
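
Which OCRmyPDF flag applies depends on what text the file already contains. The sketch below maps four situations to flags as the OCRmyPDF documentation describes them; the situation labels and the build_ocr_command helper are hypothetical names for illustration.

```python
from typing import List

# Flags as described in the OCRmyPDF documentation; the situation labels
# on the left are invented for this sketch.
_FLAGS = {
    "no-text": [],                # plain scan: the default behaviour is enough
    "mixed": ["--skip-text"],     # leave pages that already have text alone
    "stale-ocr": ["--redo-ocr"],  # replace a hidden layer from an earlier OCR pass
    "garbled": ["--force-ocr"],   # rasterize every page, then OCR it
}


def build_ocr_command(src: str, dst: str, situation: str = "no-text") -> List[str]:
    """Return the argument list for subprocess.run; nothing is executed here."""
    if situation not in _FLAGS:
        raise ValueError(f"unknown situation: {situation!r}")
    return ["ocrmypdf", *_FLAGS[situation], src, dst]
```

Returning a list rather than a string lets the result go straight to subprocess.run(cmd, check=True) without shell quoting concerns.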

Diagnosis 4: Reader limitation

Some readers default to "hand tool" or "pan mode" that does not allow text selection:

  • In Adobe Acrobat Reader: switch to the Select tool (Tools menu or right-click → Selection Tool)
  • In Preview on Mac: click the text icon in the toolbar
  • In Foxit: Tools → Selection
  • In browsers: the default cursor allows selection

If switching tools enables selection, the problem is reader UI, not the file.

Diagnosis 5: Permission restriction

PDF permissions can disable copying:

  • Check in Acrobat: File → Properties → Security tab. "Content Copying" should be "Allowed" if you want to select.
  • Programmatic check: qpdf --show-encryption file.pdf shows permissions.

If copying is restricted:

  • If you have the owner password: decrypt with qpdf --decrypt --password=owner file.pdf out.pdf, replacing owner with the actual password
  • If you do not: ask the document owner. Do not bypass restrictions without authorization.
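
The programmatic check can be turned into a yes/no answer by parsing the qpdf output. This sketch assumes that qpdf --show-encryption prints one "name: allowed" or "name: not allowed" line per permission, including "extract for any purpose", and prints "File is not encrypted" for unprotected files; the exact wording may vary between qpdf versions, and the function names are illustrative.

```python
from typing import Dict, Optional


def parse_permissions(show_encryption_output: str) -> Dict[str, bool]:
    """Collect every 'name: allowed' / 'name: not allowed' line into a dict."""
    perms = {}
    for line in show_encryption_output.splitlines():
        key, sep, value = line.partition(": ")
        value = value.strip()
        if sep and value in ("allowed", "not allowed"):
            perms[key.strip()] = value == "allowed"
    return perms


def copying_allowed(show_encryption_output: str) -> Optional[bool]:
    """True or False when the output states it; None when it cannot be determined."""
    if "File is not encrypted" in show_encryption_output:
        return True
    return parse_permissions(show_encryption_output).get("extract for any purpose")
```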

See PDF permissions explained and how to remove a password from a PDF.

Diagnosis 6: Form field overlay

A form field pre-filled with default text can look like body text:

  • Click directly on the text. Does the cursor change?
  • Press Tab. Does focus jump from one field to the next?
  • View → Show/Hide → Tools → "Prepare Form" reveals form fields visually

If text is in a form field, click into the field to interact with it.

Diagnosis 7: Background image hiding text

Sometimes the text exists, but an image laid over it prevents selection. This is rare, but it happens.

Check:

  • Acrobat Pro: Edit PDF mode. See if there are stacked objects.
  • mutool show file.pdf trailer/Root prints the document catalog, a starting point for inspecting the structure

Fix: edit the PDF to remove the overlay, or use pdftotext, which extracts text regardless of visual stacking.

Specific scenarios

Scanned book. Run OCR, and plan for the time: OCR of a 300-page book takes minutes, not seconds.

PDF with redactions. Redactions may make the surrounding text behave oddly. Verify the redaction did not affect non-redacted text.

PDF generated from LaTeX. Usually selectable; if not, the LaTeX source may have used fonts without ToUnicode. Regenerate with proper fonts.

PDF from a creative tool (InDesign, Illustrator). If text was converted to outlines, the characters cannot be recovered from the file. Get the original source, or OCR the PDF.

PDF from old software. Some old PDF producers use non-standard encodings. OCR fixes this.

Mixed-content PDF. Some pages are selectable (native text), others are not (scanned). OCR the scanned pages selectively.
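
Finding the scanned pages in a mixed file can be done from a single pdftotext run, because pdftotext ends each page with a form feed character. A minimal sketch, with pages_without_text as a hypothetical helper name:

```python
from typing import List


def pages_without_text(pdftotext_output: str) -> List[int]:
    """Return 1-based numbers of pages whose extracted text is empty."""
    pages = pdftotext_output.split("\f")
    # Each page ends with a form feed, so the final chunk after the
    # last form feed is not a page and is dropped when empty.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return [n for n, page in enumerate(pages, start=1) if not page.strip()]
```

Once the list confirms a mix, ocrmypdf --skip-text mixed.pdf out.pdf runs OCR only on the pages that lack text and leaves the native-text pages alone.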

Tools that fix text selection

OCR tools:

  • OCRmyPDF: free, scriptable, the default choice
  • Adobe Acrobat Pro: Tools → Scan & OCR
  • ABBYY FineReader: paid, known for high accuracy
  • Cloud OCR: AWS Textract or Google Document AI for hard cases

Repair tools:

  • qpdf: general PDF repair
  • Ghostscript: re-rendering
  • mutool clean: content stream cleanup

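For non-OCR cases, rewrite the file with one of these. Each is an independent alternative, so each writes its own output file:

qpdf input.pdf output-qpdf.pdf
gs -o output-gs.pdf -sDEVICE=pdfwrite input.pdf
mutool clean -ggg input.pdf output-mutool.pdf

qpdf and mutool clean rebuild the file structure, while Ghostscript re-interprets every page and writes a new PDF. Any of them can fix subtle text issues caused by a damaged or unusual file.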

Verifying the fix

After fixing:

  • Try selecting in your reader
  • Run pdftotext and verify that the text comes out correctly
  • Search (Ctrl+F) for known words
  • Spot-check the output for OCR errors if you OCR'd
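
The known-word check can be scripted for batches of files. The sketch below takes text already extracted with pdftotext and reports which expected phrases are missing; missing_words is an illustrative name, and matching ignores case and collapses whitespace so that line breaks inside a phrase do not cause false alarms.

```python
from typing import Iterable, List


def _normalize(text: str) -> str:
    """Lower-case the text and collapse every whitespace run to one space."""
    return " ".join(text.casefold().split())


def missing_words(extracted: str, expected: Iterable[str]) -> List[str]:
    """Return the expected words or phrases that do not appear in the text."""
    haystack = _normalize(extracted)
    return [w for w in expected if _normalize(w) not in haystack]
```

An empty list means every phrase was found. A non-empty list points at passages worth checking by eye for OCR errors.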

Common gotchas

OCR introduces typos. OCR is not perfect. Verify accuracy on important documents.

Mixed encoding. Different parts of the same PDF can use different encodings, so some passages copy fine while others come out as gibberish. OCR the whole file for consistency.

Reader-specific quirks. Acrobat may select text fine while Chrome's viewer mishandles the same file. Standardize on a reader.

Hidden content. Some PDFs have invisible text layers (e.g., for accessibility) that interact oddly with selection. Usually fine but occasionally surprising.

Rights-restricted PDFs. Even after decryption, some readers honor an internal "no copy" flag. Use Acrobat Pro with the owner password.

Right-to-left text. Selection in Arabic or Hebrew text works, but the highlight may look odd. The underlying text is correct.

Multi-column documents. Selection in multi-column PDFs may interleave incorrectly. Tagged PDFs help.

Prevention

For PDFs you generate:

  • Don't convert text to outlines unless absolutely necessary
  • Use standard fonts with proper Unicode mappings
  • Embed fonts correctly
  • Tag the PDF for accessibility; see tagged PDF vs untagged PDF
  • Test selection after generating

Practical recipe

For a non-selectable PDF you receive:

  1. Try search (Ctrl+F). If nothing, no text layer.
  2. Run pdftotext file.pdf -. Same check.
  3. If no text: run ocrmypdf to add a text layer.
  4. If text exists but copies as gibberish: run ocrmypdf --force-ocr to rasterize the pages and OCR them again.
  5. Verify with selection and paste test.
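
The recipe can be sketched end to end in Python. This assumes pdftotext and ocrmypdf are on the PATH and that you know one word that is visibly present in the document; triage and fix_pdf are hypothetical names. Note that a missing known word can also mean the word is hyphenated or split across lines, so treat the "garbled-text" verdict as a prompt to look, not as proof.

```python
import subprocess


def triage(extracted: str, known_word: str) -> str:
    """Classify a PDF from its pdftotext output and one word known to be in it."""
    if not extracted.strip():
        return "no-text-layer"   # steps 1-3: nothing extractable, add OCR
    if known_word.casefold() not in extracted.casefold():
        return "garbled-text"    # step 4: text exists but does not match
    return "ok"                  # step 5: extraction already works


def fix_pdf(src: str, dst: str, known_word: str) -> str:
    """Run the recipe on one file and return the verdict that was acted on."""
    text = subprocess.run(
        ["pdftotext", src, "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    ).stdout
    verdict = triage(text, known_word)
    if verdict == "no-text-layer":
        subprocess.run(["ocrmypdf", src, dst], check=True)
    elif verdict == "garbled-text":
        # --force-ocr rasterizes pages so the broken visible text is replaced
        subprocess.run(["ocrmypdf", "--force-ocr", src, dst], check=True)
    return verdict
```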

For PDFs you generate, prevent the issue at source.

Takeaway

PDFs with non-selectable text are typically scans without OCR, have text converted to paths, or use custom font encodings. The fix for most cases is OCR, and OCRmyPDF handles the majority of them for free. For more stubborn issues, rewriting the file with Ghostscript, qpdf, or mutool helps. For permission-restricted PDFs, you need authorization to lift restrictions. For browser-based OCR and text-related PDF operations, Docento.app covers common tasks. For related topics, see PDF OCR explained, how to make a PDF searchable OCR, and how to convert a PDF to text.