SEO For PDFs

January 10, 2013

As my partner in crime Travis recently pointed out, misconceptions abound in the SEO industry. Here’s another misconception: “PDF pages are so SEO-unfriendly that you can’t rank for any halfway competitive keywords with them.”

Some SEOs are still so set against Portable Document Format pages that they don’t think PDFs should be landing pages at all. They recommend replacing all PDFs with HTML pages, or building additional HTML landing pages that target the same keywords as the PDFs.

The truth is: the biggest reason PDF pages often rank so horribly is that they are rarely properly optimized.

Don’t get me wrong. In an overall SEO showdown, I’d still pick HTML over PDFs any day of the week, and you’re not likely to catch me creating brand new web content for my clients in Adobe Acrobat. The real reason HTML is SEO-superior in 2013 is the user experience. Most people are more comfortable with HTML and experience less freezing and slow loading with it. It’s easier to incorporate interactivity and social functionality into HTML pages. People also link to HTML pages and share them more frequently than PDFs (this is big).

Why Use PDFs Then?

Don’t get me wrong twice, though: there are still reasons to keep PDFs as SEO landing pages. Below are a few common use cases:

  • When you already have many PDF pages on your site that people consider valuable. Before replacing PDFs, check whether they have backlinks, decent engagement metrics, and good traffic.
  • When you have really sexy PDFs that would be difficult to turn into an equivalently sexy and user-friendly HTML page.
  • When you have content that is meant to be printed or downloaded, like spec sheets, MSDSs, product manuals, brochures, forms meant to be printed and filled out by hand, etc.
  • When the cost-benefit ratio just isn’t in favor of replacing PDFs. This might be the case if you only have a few PDFs and you don’t want to spend the upfront time or money converting the pages into HTML and redirecting the URLs. (That said, a good PDF-to-HTML converter may be worth the investment if you’ve got a lot of un-uploaded PDFs lying around.)

The Best Practices in SEO for PDF Files

The claim that search engines can’t digest PDF content was true years ago, but the search engines have come a long way, baby. So if you have reason to stick with your PDFs, just follow the simple tips below. I’ve listed the important stuff first.

Always use text-based PDFs

Search engines understand text waaay better than images (though the engines do have rudimentary optical character recognition capabilities), so make sure the words in your PDF are basic copy-and-paste-able text, not pictures of words. Most of the big PDF creators, like those in Adobe Creative Suite, have your back here. If you happen to have a scanned document you want to turn into a solid SEO landing page, you’ll need to use a little OCR yourself and convert the document into text.

Set your title in the document properties

This is such a common and easy-to-fix error that it drives me wild. It’s common knowledge that the title tag is a huge ranking factor. A PDF has no title tag, so you set the title in the document properties instead. Almost all PDF creators support this, including Adobe applications such as InDesign. As usual, you want to work your keywords into the title smartly.

Set an SEO-friendly URL/filename

Typically, the PDF filename will become part of the URL, so give your document a good keyword-rich filename. Search engines often fall back on the filename/URL snippet for the title when the title is not set. Also, some document creators default the title to the filename. So please set a descriptive title and filename. I’m sick of search results that show a raw filename where a title should be.
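
To make the filename and title advice concrete, here is a minimal Python sketch. The function names `seo_filename` and `title_issues` are hypothetical helpers invented for illustration, not part of any PDF tool, and the slug rules (lowercase, hyphens, ASCII letters and digits only) are an assumption about what counts as a clean filename.

```python
import re
from pathlib import PurePosixPath


def seo_filename(title: str) -> str:
    """Turn a human-readable title into a lowercase, hyphenated PDF filename."""
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    # Note: accented letters are treated as separators in this simple sketch.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}.pdf"


def title_issues(title: str, url_path: str) -> list[str]:
    """Return reasons a document title would make a poor search result title."""
    issues = []
    stem = PurePosixPath(url_path).stem  # "doc1" for "/files/doc1.pdf"
    cleaned = title.strip()
    if not cleaned:
        issues.append("title is empty")
    elif cleaned.lower() in (stem.lower(), f"{stem}.pdf".lower()):
        issues.append("title is just the filename")
    return issues
```

For example, `seo_filename("SEO For PDFs: A Checklist")` gives `seo-for-pdfs-a-checklist.pdf`, and `title_issues("Doc1", "/files/Doc1.pdf")` flags the title as a copy of the filename.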

Do good SEO

What do keyword-rich title tags and descriptive URLs have in common, besides being PDF SEO best practices? They’re also standard SEO best practices. The rest of your usual SEO basics apply to PDFs too, including:

  • internal linking to the PDF page to give it some link juice and authority (I see high-potential PDFs unnecessarily buried too deep in many websites). Speaking of internal linking and common pitfalls, please link from your PDF page to your other pages when relevant. It helps your SEO efforts and the user experience, and it isn’t done enough (seriously, I cringe when I have to copy and paste a URL from a PDF into the browser).
  • good keyword selection
  • keywords in body copy
  • image optimization (note: you can set alt text in many PDF tools)
  • human optimizing (Good content is good SEO, friend)

Keep the file size light

Huge files load more slowly, which hurts both the user experience and the search engines’ crawl. Adobe has the “PDF Optimizer” function, which lets you reduce file size, and you’ll want to use it for heavy PDFs. Learn the nitty-gritty on reducing PDF file size here.
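
If you have a folder full of PDFs, a quick script can tell you which ones deserve a trip through the optimizer. This is only a sketch: `find_heavy_pdfs` is a hypothetical helper, and the 5 MB default is an arbitrary assumption, not a limit published by any search engine.

```python
import os


def find_heavy_pdfs(root: str, max_bytes: int = 5 * 1024 * 1024) -> list[tuple[str, int]]:
    """List (path, size_in_bytes) for PDFs under root above max_bytes, largest first."""
    heavy = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".pdf"):
                path = os.path.join(folder, name)
                size = os.path.getsize(path)
                if size > max_bytes:
                    heavy.append((path, size))
    # Biggest offenders first, so you know where to start.
    return sorted(heavy, key=lambda item: item[1], reverse=True)
```

Point it at your local copy of the site, pick a threshold that suits your audience, and work down the list.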

Avoid duplicate content

Having both HTML and PDF versions of the same content can sometimes be a wise choice, but only if you take measures to prevent the duplicate content issue. Also, if you tweak a PDF and re-upload it, don’t create a duplicate by accidentally changing the filename and, with it, the URL.
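
One measure worth knowing about, since a PDF has no head section to hold a canonical tag, is sending rel="canonical" in an HTTP `Link` header that points at the version you want indexed. The sketch below only builds the header; `canonical_link_header` is a hypothetical helper and the example.com URL is a placeholder. How you attach the header to the PDF response depends on your server, which is not covered here.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Build the (name, value) pair for a rel="canonical" HTTP Link header."""
    # Angle brackets or whitespace inside the URL would corrupt the header.
    if any(ch in canonical_url for ch in "<>\r\n "):
        raise ValueError("URL must be percent-encoded, with no <, > or whitespace")
    return ("Link", f'<{canonical_url}>; rel="canonical"')
```

Calling `canonical_link_header("https://www.example.com/guide.html")` returns the pair `("Link", '<https://www.example.com/guide.html>; rel="canonical"')`, which you would send along with the PDF version of that guide.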

Set the other document properties too

Hey, while you’re in there (setting the title)… you might as well complete the other properties, such as Author, Subject, and Keywords. I couldn’t honestly tell you how much impact this will have, but I keep reading on the Internets that it’s worth it. So fill out all the properties you can; I just wouldn’t spend all day on it. Some sources say the Subject will become the meta description (but I have yet to verify this).

Touchup the Reading Order

“Touchup” the Reading Order and set alternate text as well as headings. The headings are said to be handled by the search engines similarly to how header tags are handled in plain HTML.

Don’t save as the latest Acrobat version

Many readers might not have the latest Reader version (and no one wants to download it just for your stupid page). Search engines sometimes fall behind the times too, so save your PDF in an older version.
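
To see which version a file was actually saved as, you can read the `%PDF-x.y` marker at the start of the file. A minimal sketch, with the hypothetical helper name `pdf_header_version`; note that a PDF can also declare a newer version inside the document itself, which this check ignores.

```python
import re
from typing import Optional


def pdf_header_version(path: str) -> Optional[str]:
    """Return the version from a PDF's "%PDF-x.y" header, e.g. "1.4", or None."""
    with open(path, "rb") as handle:
        head = handle.read(1024)  # the header sits at or near the very start
    match = re.search(rb"%PDF-(\d\.\d)", head)
    return match.group(1).decode("ascii") if match else None
```

A file saved for Acrobat 5 compatibility, for instance, should report `1.4`.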

Write-protect your document

If you don’t write-protect your document, then someone can upload the whole file to their site and change it however they want (including editing out your links).

--TL;DR--

Ok, ok. Look, PDF SEO ain’t too hard. Just follow this checklist:

  • Always use text-based PDFs
  • Set your title in the document properties
  • Set an SEO-friendly URL/filename
  • Do good SEO
  • Keep the file size light
  • Avoid duplicate content
  • Set the other document properties too
  • Touchup the Reading Order
  • Don’t save as the latest Acrobat version
  • Write-protect your document

Let me know in the comments if I missed any PDF SEO FAQs.

–TTFN.