Open a 200-page PDF in your browser. It appears almost instantly, first page rendered, scrollable, searchable, even though the file is 80 megabytes and you obviously have not downloaded all 80 megabytes yet. That trick is called linearization, and the resulting file is a linearized PDF, also marketed as "PDF Optimized for Fast Web View". This article explains what is happening, why it matters, and how to make sure your PDFs are linearized.
The problem with a normal PDF over the network
A PDF file ends with a structure called the cross-reference table (xref), which is the index of every object in the file. When a reader opens a PDF, it has to:
- Locate the xref (which is at the end)
- Use the xref to find every object on the page you want to view
- Resolve fonts, images, and other resources
If the file lives on your local disk, this is fast. The reader seeks to the end, then jumps around freely. Done.
If the file lives on a web server and the user has only downloaded the first 100 kilobytes of a 50-megabyte PDF, the reader cannot find the xref (it has not arrived yet). So the browser has to download the entire file before showing anything. On a fast connection that is a brief annoyance; on a phone tethered to a flaky network, it is the difference between "useful" and "abandoned".
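To make the dependency on the end of the file concrete, here is a minimal sketch of the first thing a reader does: look for the startxref keyword in the tail of the file. The toy byte strings and the function name find_startxref are illustrative assumptions, not taken from any real PDF library.

```python
import re

def find_startxref(tail: bytes) -> int:
    """Return the xref byte offset recorded after the last 'startxref' keyword."""
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref found; the end of the file has not arrived")
    return int(matches[-1])

# A toy file: objects, then the xref table, trailer, startxref pointer and %%EOF.
complete = (
    b"...objects...\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n14\n%%EOF\n"
)
partial = complete[:20]  # simulate a download that stopped early

print(find_startxref(complete))  # prints 14, the offset of the xref keyword
```

Calling find_startxref(partial) raises ValueError, which is the position a reader is in when only the front of a non-linearized file has arrived.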
What linearization does
A linearized PDF is structured so the parts the reader needs first appear first in the byte stream. Specifically:
- A linearization dictionary appears at the very start of the file (the PDF specification requires it within the first 1,024 bytes). It gives the reader the key numbers up front: the total file length, the page count, where the first page's data ends, and where to find the hint tables and the main xref.
- The first page appears very early, typically within the first 64 KB.
- A first-page xref section sits near the front, covering just the objects needed for the first page.
- The remainder of the file is laid out roughly in page order, so as the user scrolls, the bytes they need are likely already on their way.
A linearization-aware reader, paired with a web server that supports HTTP range requests, can show the first page almost immediately and then fetch additional pages on demand as the user scrolls. The byte order matches the usage order.
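As a rough illustration of what that dictionary carries, the sketch below pulls its standard keys out of the first 1,024 bytes with regular expressions. The sample header and its numbers are made up for the example, and a real parser would use a proper PDF tokenizer rather than regexes.

```python
import re
from typing import Optional

def parse_linearization_dict(head: bytes) -> Optional[dict]:
    """Extract linearization parameters from the start of a PDF, or None if absent."""
    m = re.search(rb"<<[^>]*?/Linearized\b[^>]*>>", head[:1024])
    if m is None:
        return None
    body = m.group(0)
    info = {}
    # /L file length, /O first page's object number, /E end offset of first page,
    # /N page count, /T offset of the main xref table.
    for key in (b"L", b"O", b"E", b"N", b"T"):
        found = re.search(rb"/" + key + rb"\s+(\d+)", body)
        if found:
            info[key.decode()] = int(found.group(1))
    # /H holds the offset and length of the hint stream.
    hint = re.search(rb"/H\s*\[\s*(\d+)\s+(\d+)", body)
    if hint:
        info["H"] = (int(hint.group(1)), int(hint.group(2)))
    return info

# Hypothetical header for an 80 MB, 200-page file (all numbers invented).
sample = (
    b"%PDF-1.7\n"
    b"1 0 obj\n"
    b"<< /Linearized 1 /L 83886080 /H [ 1200 480 ] /O 12 /E 61440 /N 200 /T 83880000 >>\n"
    b"endobj\n"
)
print(parse_linearization_dict(sample))
```

With these invented numbers, the reader learns from the first kilobyte that the first page ends at byte 61,440, so it can render page one after fetching about 60 KB of an 80 MB file.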
How linearization differs from compression
Linearization is about layout, not size. A linearized PDF and its non-linearized twin can be the same number of bytes; the bytes are just arranged differently. Compression and image downsampling reduce size; linearization reduces time-to-first-render over a network.
You typically want both. Compress first (or reduce PDF file size) so the user is downloading fewer bytes, then linearize so the bytes they get arrive in a useful order.
How to tell if a PDF is linearized
A few ways:
- Adobe Acrobat / Reader. File → Properties → Description tab. Look at "Fast Web View: Yes/No".
- qpdf --check file.pdf: open-source CLI that reports linearization status. See our qpdf introduction.
- pdftk file.pdf dump_data | grep -i linearized: older but still works on many systems.
- Look at the first few bytes. A linearized file's header contains a linearization dictionary very early. If you run head -c 1024 file.pdf and see /Linearized near the top, you have one.
How to linearize a PDF
Most production tools can do this on export or as a post-processing step:
- Ghostscript: gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dFastWebView=true -sOutputFile=out.pdf in.pdf. See our Ghostscript introduction for more.
- qpdf: qpdf --linearize in.pdf out.pdf. Usually the cleanest and fastest option.
- Adobe Acrobat Pro: File → Save As Other → Optimized PDF. Make sure "Fast Web View" is checked.
- PDF libraries: pikepdf in Python (pdf.save("out.pdf", linearize=True)), iText in Java, PDFlib's linearize option.
- Hosting platforms. Some content delivery networks linearize PDFs at the edge if you give them the option.
If you generate PDFs server-side and serve them to web users, run linearization in the same step as compression. A one-shot pipeline produces small, fast-rendering files for download.
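One way such a pipeline could look is sketched below: Ghostscript compresses, then qpdf linearizes the result. The file names are placeholders, and the -dPDFSETTINGS=/ebook preset is an assumption chosen for illustration, not a recommendation from this article; pick whatever compression settings suit your documents.

```python
import subprocess
from typing import List

def build_pipeline(src: str, compressed: str, dst: str) -> List[List[str]]:
    """Return the two commands: compress with Ghostscript, then linearize with qpdf."""
    compress = [
        "gs", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/ebook",           # assumed preset; adjust to taste
        f"-sOutputFile={compressed}", src,
    ]
    linearize = ["qpdf", "--linearize", compressed, dst]
    return [compress, linearize]

def run_pipeline(src: str, compressed: str, dst: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_pipeline(src, compressed, dst):
        subprocess.run(cmd, check=True)

# Show the commands without executing them.
for cmd in build_pipeline("in.pdf", "tmp-compressed.pdf", "out.pdf"):
    print(" ".join(cmd))
```

Linearizing last matters here: any later rewrite of the file would rearrange the bytes again. Calling run_pipeline("in.pdf", "tmp-compressed.pdf", "out.pdf") requires both gs and qpdf on the PATH.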
Server-side requirements
Linearizing the file alone is not enough. The server has to support HTTP range requests. It almost certainly does; every modern web server and CDN handles them. The reader sends Range: bytes=... headers asking for specific byte segments, and the server returns those segments with status 206 Partial Content.
The combination, linearized file plus range-aware server, is what makes pages appear before the rest of the file arrives. If your server returns the whole file regardless of range headers (unusual but possible with some legacy setups), even a linearized PDF acts non-linearized in practice.
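If you want to check a server yourself, a small probe along these lines may help. It is a sketch using only the Python standard library; the function names and the example URL in the usage note are hypothetical.

```python
import urllib.request
from typing import Optional

def build_range_probe(url: str, first: int = 0, last: int = 1023) -> urllib.request.Request:
    """Build a GET request asking only for bytes first..last of the resource."""
    return urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})

def honours_range(status: int, content_range: Optional[str]) -> bool:
    """True if the response looks like a partial-content answer to a range request."""
    return status == 206 and bool(content_range) and content_range.startswith("bytes ")

def probe(url: str) -> bool:
    """Send the probe and report whether the server answered with a byte range."""
    with urllib.request.urlopen(build_range_probe(url)) as resp:
        return honours_range(resp.status, resp.headers.get("Content-Range"))

# What the two outcomes look like, without touching the network:
print(honours_range(206, "bytes 0-1023/52428800"))  # range-aware server
print(honours_range(200, None))                     # server sent the whole file
```

Calling probe("https://example.com/manual.pdf") against your own host performs the real check. A 200 response means the server ignored the Range header and is sending the whole file.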
Trade-offs to know about
Re-linearization after every save. If you edit a linearized PDF and save it back, most editors write an "incremental update", appending changes at the end, which destroys the linearization. The file is still valid, just no longer streamable. If you want it linearized again, re-run the linearization step.
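One cheap way to spot this situation is to compare the /L value in the linearization dictionary, which records the file length at the time of linearization, with the actual file size. The sketch below does that on toy data; the helper names are illustrative, and a tool such as qpdf performs a far more thorough check.

```python
import re

def linearization_is_current(data: bytes) -> bool:
    """True if a linearization dictionary exists and its /L matches the real length."""
    d = re.search(rb"<<[^>]*?/Linearized\b[^>]*>>", data[:1024])
    if d is None:
        return False
    length = re.search(rb"/L\s+(\d+)", d.group(0))
    return length is not None and int(length.group(1)) == len(data)

def make_toy_file(total_len: int) -> bytes:
    """Build a fake 'linearized' file padded with spaces to exactly total_len bytes."""
    head = b"%%PDF-1.7\n1 0 obj\n<< /Linearized 1 /L %d >>\nendobj\n" % total_len
    return head.ljust(total_len, b" ")

original = make_toy_file(200)
# An incremental update appends bytes, so /L no longer matches the file size.
updated = original + b"\n% appended incremental update\n"

print(linearization_is_current(original))  # prints True
print(linearization_is_current(updated))   # prints False
```

A mismatch is a signal to re-run the linearization step before publishing the file again.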
Encrypted PDFs. Linearization is compatible with encryption, but with some catches. The first page's stream still has to be readable by an authorized client, and the linearization dictionary itself has to be unencrypted in places. Most production tools handle this correctly; less-mature ones produce broken files.
Signed PDFs. A digital signature locks the bytes of the file. Re-linearizing afterwards changes the bytes and invalidates the signature. Linearize before you sign.
Truly huge PDFs. Linearization helps the first page render instantly. For a 5,000-page document, scrolling to page 4,999 still requires fetching the bytes for that page, which takes time. There is no magic, just a much better experience for the common case (read from the front).
For more on optimizing PDFs for delivery, also see reduce PDF file size and open-source PDF tools.
When linearization matters most
- Public-facing documents: datasheets, manuals, government forms, reports. The visitor's first impression depends on time-to-first-render.
- Mobile: slow networks magnify every saved second.
- CDN-hosted assets: linearized files cache well and serve fast.
- Inline previews: viewer widgets embedded on web pages often rely on range requests to show only the first few pages.
For private files, internal archives, or single-user downloads, linearization is a nice-to-have rather than a requirement.
Quick checklist for a web-ready PDF
Before you upload a public PDF:
- Compress images and downsample where appropriate
- Embed only the font subsets you need
- Linearize the file (qpdf --linearize, or your tool of choice)
- Confirm the server returns Accept-Ranges: bytes for the file
- Validate with Acrobat's "Fast Web View: Yes" check
That is a 60-second pipeline, and it gives a noticeably better experience to everyone who opens the file over the internet.
Takeaway
Linearization is invisible until you do not have it, at which point every page load drags. A linearized PDF puts the first page near the front of the byte stream and tells the reader where everything else lives, so a browser can start rendering after a small download. Combine linearization with sensible compression and a CDN that supports range requests, and a 50-megabyte PDF feels as snappy as a web page. For documents you edit and re-host with Docento.app, remember to re-linearize after editing: saving usually breaks the linearized layout, and one extra step puts it back.