PDF Metadata Architecture

A deep dive into the Info Dictionary structure and XMP streams.
Learn why a title like "Microsoft Word - Document1" may be hurting your SEO.

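⚠️ The "Untitled" Problem: When you export a PDF from Word or Canva, the internal Title field often ends up holding a placeholder or the working file name (for example, "Microsoft Word - Document1"). Browsers that display the Title show that string in the tab, which looks unprofessional.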

🔧 How PDF Stores Data: The "Info Dictionary"

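A PDF file is not just flat text. Near the end of the file sits the trailer, a dictionary whose /Info entry points to the document information dictionary (the "Info Dictionary"). This set of key-value pairs holds the document's descriptive metadata: its title, author, originating software, and dates.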
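The trailer-to-Info lookup described above can be sketched in a few lines. This is a minimal illustration, not a real PDF parser: the sample bytes are hand-written, the function name `read_info_dictionary` is hypothetical, and the regular expressions assume a classic (uncompressed, unencrypted) trailer with simple literal strings.

```python
import re

# Hand-written fragment for illustration only. Real files may use
# cross-reference streams, object streams, or encryption, none of which
# this sketch handles.
SAMPLE = b"""1 0 obj
<< /Title (Quarterly Report) /Author (Jane Doe) /Creator (Microsoft Word) >>
endobj
trailer
<< /Size 2 /Root 2 0 R /Info 1 0 R >>
"""


def read_info_dictionary(pdf_bytes: bytes) -> dict[str, str]:
    """Follow trailer -> /Info -> object and return its simple string entries."""
    text = pdf_bytes.decode("latin-1")

    # Step 1: find the last trailer and the indirect reference stored under /Info.
    trailer_pos = text.rfind("trailer")
    if trailer_pos == -1:
        return {}
    ref = re.search(r"/Info\s+(\d+)\s+(\d+)\s+R", text[trailer_pos:])
    if ref is None:
        return {}
    obj_num, gen_num = ref.group(1), ref.group(2)

    # Step 2: locate the indirect object that the reference points to.
    obj = re.search(
        rf"(?<!\d){obj_num}\s+{gen_num}\s+obj(.*?)endobj", text, re.DOTALL
    )
    if obj is None:
        return {}

    # Step 3: collect /Key (literal string) pairs.
    # Nested parentheses and escape sequences are not handled here.
    return dict(re.findall(r"/(\w+)\s*\(([^()]*)\)", obj.group(1)))


info = read_info_dictionary(SAMPLE)
print(info["Title"])
```

For production use, a maintained PDF library is a safer choice than hand-rolled parsing; the sketch only shows how the pieces point at each other.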

Title (Not Filename)

Browsers typically show this value in the tab bar, and search engines such as Google may use it as the link text in search results. It is separate from the file name (e.g., report.pdf).
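One way to catch a leftover Title before publishing is a simple pattern check. The sketch below is a heuristic under stated assumptions: the function name `is_placeholder_title` and the pattern list are hypothetical and illustrative, not an exhaustive catalogue of what export tools write.

```python
import re

# Hypothetical patterns, chosen for illustration; extend them for your own tools.
PLACEHOLDER_PATTERNS = [
    r"^\s*$",                       # empty or whitespace-only title
    r"^untitled",                   # "Untitled", "Untitled-1"
    r"^microsoft word - ",          # "Microsoft Word - Document1"
    r"\.(docx?|pptx?|xlsx?|pdf)$",  # a file name left in the Title field
]


def is_placeholder_title(title: str) -> bool:
    """Return True when the title looks like an export default, not a real title."""
    return any(re.search(p, title, re.IGNORECASE) for p in PLACEHOLDER_PATTERNS)


for candidate in ["Microsoft Word - Document1", "report.pdf", "2024 Annual Report"]:
    print(candidate, "->", is_placeholder_title(candidate))
```

A check like this fits naturally into a publishing script that flags files for manual review before they go live.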

Author & Creator

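/Author names the person or organization that wrote the document; /Creator names the application the original was authored in (e.g., "Microsoft Word"). A related key, /Producer, names the software that converted it to PDF. Changing these values matters when white-labeling documents.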

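Why Metadata Matters for SEO

Search engines like Google index PDF files alongside web pages. A PDF without proper metadata can still be indexed from its text, but it may show up in results under a meaningless title such as a file name, which makes it easy to overlook.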

  • Keywords: Stored in the metadata, these describe the topic in machine-readable form, so indexing tools can read them without OCR.
  • Subject: Acts as a short description of the document, similar to a meta description on a web page.
  • Creation Date: Records when the document was created, which helps readers and tools judge how current it is.
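To make the fields above concrete, here is a minimal sketch of how such entries could be serialized into PDF dictionary syntax. The helper names `pdf_literal` and `build_info_dictionary` are hypothetical, and the sketch assumes plain ASCII values; text outside that range needs a different string encoding, which is not shown.

```python
def pdf_literal(value: str) -> str:
    """Wrap a value as a PDF literal string, escaping backslashes and parentheses."""
    # Backslashes must be escaped first so the escapes added next are not doubled.
    escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def build_info_dictionary(fields: dict[str, str]) -> str:
    """Serialize key-value pairs as a PDF dictionary body (ASCII values assumed)."""
    entries = " ".join(f"/{key} {pdf_literal(val)}" for key, val in fields.items())
    return f"<< {entries} >>"


print(build_info_dictionary({
    "Title": "Q3 Report (Final)",
    "Subject": "Quarterly financial results",
    "Keywords": "finance, quarterly, report",
}))
```

Writing the result back into a file also requires updating the cross-reference table, which is why a dedicated library or editor is usually the practical route.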

Technical Field Structure

Property Key | Data Type    | Function
/Title       | String       | Browser Tab Label & Google Link Text
/Author      | String       | Citation & Ownership
/Keywords    | String       | Search Indexing (Comma Separated)
/ModDate     | Date (ASN.1) | Last Modified Timestamp
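The /ModDate value in the table is stored as a date string of the form D:YYYYMMDDHHmmSSOHH'mm', where everything after the year is optional and O is +, -, or Z. The sketch below converts such a string to a Python datetime; the function name `parse_pdf_date` is hypothetical, and the regular expression is a simplified reading of the format rather than a full validator.

```python
import re
from datetime import datetime, timedelta, timezone

# D: prefix, year, then optional month/day/hour/minute/second and UTC offset.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)(?:\d{2}'?(?:\d{2}'?)?)?|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string; missing fields default to the earliest value."""
    m = _PDF_DATE.match(value.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, zulu, sign, tz_h, tz_m = m.groups()

    tz = None  # no offset given: return a naive datetime
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(tz_h), minutes=int(tz_m or 0))
        tz = timezone(offset if sign == "+" else -offset)

    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


print(parse_pdf_date("D:20240315093000+05'30'").isoformat())
```

Returning a naive datetime when no offset is present keeps the sketch from guessing a time zone the file never stated.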
