Your PDF Docs Look Beautiful To Your Boss, But Your AI Thinks They're A Crime 🫆 Scene

Discover why allowing PDF tech docs to power AI-answer engines might not be your best idea — 💡

May 06, 2026

There’s a quaint kind of optimism that appears when some of us start talking about AI and tech docs. It usually starts like this: “We already have thousands of pages of documentation in PDF, so the model can just read those and answer questions from them. Voila!”

I get the impulse. Really, I do. I ❤️ me a good PDF, but only when I need one.

PDFs are familiar. They look finished, feel official, and are what many tech docs teams ship, archive, email, and point to when someone asks where the user assistance lives.

To a human reader, a well-made PDF can seem perfectly clear.

The heading introduces the procedure
The steps appear in order
The warning box is seriously hard to miss
The diagram sits exactly where it belongs, quietly earning its keep

But when a large language model encounters that same file, it often isn’t getting the polished experience you imagine. It’s getting a reconstruction of the content through whatever extraction method sits between the PDF and the model, and that reconstruction can be a hot 🔥 mess.
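To make that concrete, here is a minimal sketch of how a reconstruction can go sideways. It is not any real extraction tool: the fragment coordinates, the sample text, and the `naive_extract` and `column_aware_extract` functions are all made up for illustration. It assumes the page stores text as positioned fragments and that a simple extractor reads them strictly top to bottom, which is one common way a two-column layout gets scrambled.

```python
# Hypothetical text fragments, stored as (x, y, text).
# y grows down the page; a procedure sits in the left column
# and a warning box sits in the right column.
FRAGMENTS = [
    (50, 100, "1. Power off the unit."),
    (50, 120, "2. Remove the cover."),
    (320, 100, "WARNING: High voltage."),
    (320, 120, "Wait 5 minutes first."),
]


def naive_extract(fragments):
    """Read fragments top to bottom, then left to right, ignoring columns."""
    ordered = sorted(fragments, key=lambda f: (f[1], f[0]))
    return " ".join(text for _, _, text in ordered)


def column_aware_extract(fragments, column_split=300):
    """Read the whole left column first, then the whole right column.

    column_split is an assumed x position separating the two columns.
    """
    left = sorted((f for f in fragments if f[0] < column_split), key=lambda f: f[1])
    right = sorted((f for f in fragments if f[0] >= column_split), key=lambda f: f[1])
    return " ".join(text for _, _, text in left + right)


if __name__ == "__main__":
    print(naive_extract(FRAGMENTS))
    print(column_aware_extract(FRAGMENTS))
```

In this toy case the naive pass splices the warning into the middle of the procedure, while the column-aware pass keeps each block together. A human looking at the page would never notice the difference; a model reading the extracted text sees only the scrambled version.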

Continue reading this post for free, courtesy of Scott Abel.

The Content Wrangler
