When the file lies about itself: AI-generated metadata hallucination

Most coverage of AI “hallucination” concerns the visible content of a document: a fabricated citation, an imaginary statute, a precedent that never existed. A quieter failure mode has drawn far less attention. Generative systems can also fabricate the metadata — the embedded, largely invisible layer that records who authored a file, what organization produced it, and when it was created or last modified. When that layer is wrong, the consequences are different in kind, because metadata is precisely the data we reach for when we want to know whether the visible content can be trusted.

Metadata is evidence, not trivia

Metadata is often described as “data about data”: the hidden information embedded in every digital file, organized into fields such as Author, Company, Created, and Last Modified. For the file types most people handle daily — Office documents, PDFs, images — metadata can reveal the author and organization, the creation and modification dates, version history and tracked changes, and embedded comments. Critically, these fields have historically been written automatically by the software and operating system as a user creates, saves, and shares a file — not typed in by hand. That provenance is exactly why metadata has been treated as a quiet, trustworthy audit trail.

In litigation that trust is formalized. Courts and counsel rely on metadata to authenticate documents, establish a chain of custody, and prove when and by whom a file was created. The Federal Rules of Evidence allow certain electronic records to be self-authenticating when accompanied by a certification from a qualified person: Rule 902(13) covers a record “generated by an electronic process or system that produces an accurate result,” and Rule 902(14) covers copied data authenticated “by a process of digital identification,” which in practice means a matching hash value.^[1] The Sedona Conference, the leading vendor-neutral authority on electronically stored information, treats metadata as a central pillar of ESI admissibility.^[2] The whole edifice assumes the fields mean what they say.

Why a model would invent a provenance

Metadata hallucination follows from how large language models work. They do not “know” facts; they predict the most probable next token from patterns in training data. Recent OpenAI research argues that hallucination is not a mysterious glitch but a predictable product of training and evaluation regimes that “reward guessing over acknowledging uncertainty” — models are optimized to be confident test-takers, and a confident guess scores better than an admission of ignorance.^[3]

That pressure does not stop at the visible text. When a system assembles a document, a “complete” file of that type is expected to carry author, date, and history fields. Faced with no real value to place there, a model does what it was trained to do: it produces something plausible. The output looks like a normal author name or timestamp — and is entirely invented. The danger is structural: these guesses live in the hidden layer, where almost no one looks.

Two places it can mislead you

The risk operates at two depths. On the surface, fabricated values appear in a file's ordinary Properties pane: the Author or Company that a lawyer or reviewer reads with a couple of clicks and reasonably presumes to be authoritative. Deeper down, forensic and e-discovery tools report whatever is embedded in the file as if it were ground truth. Those tools are built to show what the file contains, not to adjudicate whether the contained values are true. So a hallucinated timestamp or author can pass through forensic review and come out carrying the authority of that review.
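A minimal sketch of why the embedded layer is only a claim, assuming an Office Open XML style package in which the core properties sit in a `docProps/core.xml` part. The helper names `build_package` and `read_claimed_metadata` are hypothetical, the package is deliberately stripped down (it is not a valid .docx), and the author and date are invented for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# XML namespaces used by the OOXML core-properties part.
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
NS = {"cp": CP, "dc": DC, "dcterms": DCTERMS}


def build_package(author: str, created: str) -> bytes:
    """Write a ZIP holding only a core-properties part (hypothetical helper).

    Whatever strings are passed in become the file's claimed provenance.
    """
    core = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<cp:coreProperties xmlns:cp="{CP}" xmlns:dc="{DC}" xmlns:dcterms="{DCTERMS}">'
        f"<dc:creator>{escape(author)}</dc:creator>"
        f"<dcterms:created>{escape(created)}</dcterms:created>"
        "</cp:coreProperties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("docProps/core.xml", core)
    return buf.getvalue()


def read_claimed_metadata(package: bytes) -> dict:
    """Report the embedded author and creation date exactly as stored."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))
    return {
        "author": root.findtext("dc:creator", namespaces=NS),
        "created": root.findtext("dcterms:created", namespaces=NS),
    }


# Invented values: nothing in the format checks them against reality.
package = build_package("Jane Example", "2019-03-04T09:00:00Z")
print(read_claimed_metadata(package))
```

The reader faithfully returns whatever the writer put there. Nothing in this round trip touches the question of whether "Jane Example" exists or whether the file existed in 2019.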

The figure below states the core problem: the claimed provenance a file asserts about itself is not the same thing as a cryptographic fact about that file. The first can be guessed, copied, or invented. The second can only be verified.

Fig. 1 — Claimed metadata vs. cryptographic truth: a three-step verification flow.

How to verify provenance instead of trusting it

The remedy is to stop treating self-asserted metadata as proof and start treating it as a claim to be tested. Three layers, in order, move a file from “claimed” to “verified.”

1. Hash the bytes. A cryptographic hash (SHA-256) is computed from the file's actual content, so any change, including a doctored field, produces a different value. NIST's guidance on integrating forensic techniques into incident response builds its collection and integrity model on exactly this property: hashing and documented handling preserve evidence so it stays admissible.^[4] A matching hash is the digital identification that Rule 902(14) contemplates.^[1]
2. Check for a signed Content Credential. The C2PA standard, the open specification behind Content Credentials, binds provenance to an asset in a tamper-evident, cryptographically signed manifest. If the content or its credential is altered after signing, verification fails and says so.^[5] Unlike an ordinary Author field, a signed credential cannot be silently guessed into existence by a model.
3. Reconcile to a chain of custody. Cross-check the file's asserted dates and authorship against independent records: collection logs, system timestamps, custodian interviews. The Sedona Conference frames metadata authenticity as a question answered by corroboration, not by reading a field at face value.^[2]
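The hashing and reconciliation layers can be sketched in a few lines of Python using the standard `hashlib` module. This is an illustration under stated assumptions, not a forensic procedure: `CustodyRecord` and `reconcile` are hypothetical names, the collection log is reduced to a single record, and the signed-credential check is left out because it needs a C2PA validator rather than a few lines of standard-library code.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the file's bytes (not of its claimed fields)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class CustodyRecord:
    """Hypothetical entry from an independent collection log."""
    sha256: str             # digest recorded when the file was collected
    collected_at: datetime  # when the collector took the copy


def reconcile(data: bytes, claimed_created: datetime,
              record: CustodyRecord) -> List[str]:
    """Return a list of discrepancies; an empty list means none were found."""
    findings = []
    if sha256_hex(data) != record.sha256:
        findings.append("hash mismatch: bytes differ from the collected copy")
    if claimed_created > record.collected_at:
        findings.append("claimed creation date is later than the collection date")
    return findings


# Illustrative values only.
original = b"contents of the collected file"
record = CustodyRecord(
    sha256=sha256_hex(original),
    collected_at=datetime(2024, 1, 10, tzinfo=timezone.utc),
)

# Unaltered bytes and a plausible claimed date: no findings.
print(reconcile(original, datetime(2024, 1, 5, tzinfo=timezone.utc), record))

# Altered bytes and a claimed date after collection: two findings.
print(reconcile(original + b"!", datetime(2024, 2, 1, tzinfo=timezone.utc), record))
```

An empty result does not prove the claimed author or date is true. It only shows that the bytes match what was collected and that the claim does not contradict the independent record, which is why the custody comparison matters as much as the hash.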

None of these steps is exotic, and that is the point. The arrival of generative tools does not break provenance; it removes the luxury of assuming provenance. A field that a model can invent is not evidence. A value that can be recomputed from the bytes, or verified against a signature, is. The discipline that distinguishes the two has existed for years — see /provenance for the full verification toolkit.
