April 29th, 2011

A few days ago, the White House released a copy of President Obama’s long-form birth certificate (something that normally isn’t done, but they made an exception for the President). Immediately people started analyzing it and trying to determine its validity, and one of the biggest concerns is that “layers” were found in the document that split the text into various groups (as shown here).

I’ve been working with the Adobe Creative Suite (including Photoshop and Illustrator) for over a decade, and I’m quite offended by some of the claims made by people who choose to label themselves “experts”. I’ll leave the other points of interest (smudges in various boxes, “African” vs. “Negro” as race, etc.) for others, but I can speak to the graphic qualities. Let’s break it down:

There are layers in the PDF! It must be altered, since if it were just scanned, it would be a single, flat image!

First of all, there is a terminology issue here: there are no layers in the PDF. To prove this, open the PDF in Acrobat Reader (free; you can get it yourself here) and take a look at the “Layers” palette:

There's no true "layers" in the PDF

The Layers palette is empty! The PDF has no true layers in it. Period.
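
If you’d rather check this from a script than from the Layers palette: true PDF layers are stored as “optional content groups”, and a document that uses them carries an /OCProperties entry in its catalog. Here is a rough sketch of that check, not a real PDF parser. The function name `has_true_layers` and the two sample fragments are my own inventions for illustration, and a plain byte search like this can miss an entry that sits inside a compressed object stream.

```python
def has_true_layers(pdf_bytes: bytes) -> bool:
    """Crude check for real PDF layers (optional content groups).

    Assumption: the /OCProperties key appears uncompressed in the file.
    A proper check would parse the document catalog instead.
    """
    return b"/OCProperties" in pdf_bytes


# Two tiny hand-made catalog fragments standing in for real files
# (illustrative only, not taken from any actual document).
flat_catalog = b"<< /Type /Catalog /Pages 2 0 R >>"
layered_catalog = (
    b"<< /Type /Catalog /Pages 2 0 R "
    b"/OCProperties << /OCGs [5 0 R] >> >>"
)

print(has_true_layers(flat_catalog))
print(has_true_layers(layered_catalog))
```

To try it on a real file, read the file in binary mode and pass the bytes to the function.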

Okay, so there’s no layers, but what about the separate pieces in Illustrator?

Okay, yes, it’s semantics, but it’s been bugging me that people have been calling them “layers” when they’re not. What they are is separate “pieces” in the document that Acrobat Reader doesn’t view as layers, but that Illustrator can be made to show separately. So what happens when you open the PDF in Illustrator? You get an object that is bound by a Clipping Path. The clipping path is used by the PDF format to define the printable area of the document. In Illustrator we can safely remove it, since Illustrator has an “artboard” that defines what gets printed (if we were to print it from Illustrator). Removing the clipping path (“Releasing” the contents) exposes the internal pieces of the document. So what created those pieces?

Let’s assume for a moment that the document is genuine and not manipulated in Photoshop/Illustrator to add/remove anything. What created the separate pieces?

Here’s my theory. Thinking through how this document was probably made, I suspect it went something like this: A staffer at the Hawaii hospital was told to go make a copy of the bound book page with President Obama’s birth certificate on it. They laid that bound book on a photocopier and printed out a copy of the page. For security reasons, the paper in the copier has a watermark on it (the green hashing in the background of the final image). Because the original certificate probably doesn’t have the security hashing, the photocopier printed just the black image it perceived onto the already-patterned paper, which is why the hash pattern doesn’t curve at the left edge of the certificate like the lines on the certificate do. The certificate is also a lot smaller than a letter-sized piece of paper, so there’s a decent border of green hash pattern all the way around the certificate.

The staffer takes it to their superior, who verifies that the copy looks like the original (Alvin T. Onaka, Ph.D stamped and signed the document on April 25th, in the unused bottom margin of the document). So now the staffer has a printed document that was pre-printed with a green security pattern, then was printed with the certificate from the bound book, and has a real signature from Dr. Onaka on it.

Now to deliver this document to President Obama: President Obama sent a personal lawyer to Hawaii to pick up the certificate, so it’s likely that this printed document, which carries the real, original signature of the State Registrar, Dr. Onaka, was handed over and flown back to Washington D.C. But how to distribute this document to the country?

So, President Obama probably handed off the original document to a White House staffer and told them to scan the document and put it online. On hand for the staffer to scan the document is probably a modern office copy machine that also does scanning to PDF. And here’s where the kicker comes in… Modern copier/scanners are getting smarter by the day, and often try to “help” you make the best scan possible. One such feature is the option to “enhance text characters”.

Like this Epson GT-20000: “The Epson Scan software included with the GT-20000 offers new text enhancement features that sharpen text characters, and enhance recognition when scanning documents.”

Or this Pixma wireless printer/scanner: “Remove backgrounds and sharpen text before sending scanned documents with Auto Document Fix.”

Or this Kodak i1840: “Easy brightness, contrast and color controls to adjust tones, sharpen text and optimize photos”

In all these cases, note that the scanner is sharpening text, not just sharpening the overall image. So the scanner is attempting to determine which parts of the document are text and which are not. If you are scanning an old yearbook photo of your great-aunt Mildred, you want the text of her name under the photo to be sharpened, but not the photo itself, since that would ruin the appearance of the scanned photograph.

I propose that the hypothetical White House staffer left some sort of “auto enhance text” feature enabled when they scanned the document in. The scanner attempted to determine what was text and what was not, and whatever it decided was text, it pulled out into a separate piece, grouped with close-by text bits, and sharpened at the edges. The PDF has metadata inside it that claims it was created on April 27th, two days after the April 25th stamp on the document itself. Having the PDF created in Washington two days after the document was signed makes sense in this scenario.
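
To make the idea concrete, here is a toy sketch of how a process might pull “text” off a background: call every sufficiently dark pixel ink, then treat each clump of touching dark pixels as one piece. I have no idea what algorithm any real scanner uses, so everything here (the function name `find_dark_pieces`, the threshold, the tiny sample page) is an assumption made up purely for illustration.

```python
from collections import deque


def find_dark_pieces(image, threshold=80):
    """Return a list of pieces; each piece is a set of (row, col) pixels.

    Assumption: "text" is simply any pixel darker than `threshold`, and
    dark pixels that touch (up/down/left/right) belong to the same piece.
    """
    rows, cols = len(image), len(image[0])
    seen = set()
    pieces = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold or (r, c) in seen:
                continue
            # Flood fill outward from this dark pixel.
            piece = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                piece.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and image[ny][nx] < threshold):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            pieces.append(piece)
    return pieces


# 0 = black ink, 200 = light background. Two dark strokes that do not touch.
page = [
    [200, 200, 200, 200, 200, 200],
    [200,   0,   0, 200,   0, 200],
    [200,   0, 200, 200,   0, 200],
    [200, 200, 200, 200, 200, 200],
]
pieces = find_dark_pieces(page)
print(len(pieces), [len(p) for p in pieces])
```

The point of the sketch is only that a purely mechanical rule like this will happily split a page into many small pieces without any human deciding where the boundaries go.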

But then why were some characters missed or grouped with different letters? Let’s take a look at a few examples:

Enlargement: Kenya

Here’s an enlargement of the “Birthplace of Father” box. The “K” in “Kenya” was not sharpened and separated from the background, while the rest of the word was. In this case, the “K” is overlapping the vertical line at the edge of the box, which likely fooled the image processor into thinking it was not a letter, and so treated it like an image.

Enlargement: None

Here’s the word “None” from the “Type of occupation outside home during pregnancy” box. In Illustrator, the “Non” at the beginning of the word is a separate text grouping from the “e”. Why would that be when they’re so close together? The answer is in the color: due to the green hash pattern in the background, the “Non” overlaps one of the horizontal green bars, while the “e” is in the gap between it and the next bar. When scanned, the image processor averaged the color, and as a result the “Non” is a dark green rather than true black, and so it’s assumed to be a different grouping (written in a green pen rather than a black one?).
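
The averaging effect is easy to reproduce with a little arithmetic. The sketch below blends black ink with whatever is behind it and then asks whether the result leans green. The specific RGB values, the 80% ink coverage, and the helper names `average_color` and `looks_greenish` are all assumptions I picked for illustration; they are not measured from the actual scan.

```python
def average_color(ink, background, ink_coverage):
    """Blend ink and background the way an averaging sample might."""
    return tuple(
        round(ink_coverage * i + (1 - ink_coverage) * b)
        for i, b in zip(ink, background)
    )


def looks_greenish(rgb, margin=10):
    """True when green clearly dominates the red and blue channels."""
    r, g, b = rgb
    return g - max(r, b) > margin


BLACK = (0, 0, 0)
GREEN_BAR = (120, 190, 130)  # assumed color of a hash-pattern bar
PAPER_GAP = (235, 240, 235)  # assumed color of the gap between bars

# Same black ink, same coverage, different background behind it.
non_color = average_color(BLACK, GREEN_BAR, ink_coverage=0.8)
e_color = average_color(BLACK, PAPER_GAP, ink_coverage=0.8)

print(non_color, looks_greenish(non_color))
print(e_color, looks_greenish(e_color))
```

With these made-up numbers, the ink over the green bar comes out as a dark green while the ink over the gap comes out as a neutral dark gray, which is all a color-based grouping rule would need to put them in different pieces.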

Enlargement: Blob?

This enlargement may not look like much, but to me it’s pretty convincing proof that an automated process made these separations of text from the background, and not an artist trying to cover something up. There are two pieces that Illustrator identifies when the PDF is broken apart that don’t appear to be text at all. This is one of the two, near the top of the page; the other is between the date stamp and the certificate.

Some people have conjectured that these two pieces were “sample selections” used to recreate the security pattern where text was removed from the image. This seems to imply they think that the blue bounding box is selecting part of the green background pattern. It is not. There is image data inside the bounding box, but when first broken apart, this “text” grouping is colored white (so it looks like there’s nothing there at first glance). I’ve changed that to black so it’s easier to see. It is a sharpened bit of color that looks very similar to the sharp, jagged edges of the other pieces of text in this document. So everything that’s black in the enlargement above is something that some process thought was white text and pulled off of the background.

Clearly, this is not text. I propose that it is an artifact of the text-enhancement process making a mistake, which shows the separation was automated and not guided intentionally by a human. If a human designer were trying to sharpen all text on the document, there’s no chance they’d flag this as “text” and enhance it. And if this piece was placed here by a conspiracy supporter trying to hide evidence by drawing white over the top of something incriminating, what incriminating details would be so small and scattered, in just two areas, that they would merit covering up?

Metadata Details:

There are a few more bits of evidence (or lack thereof) to indicate that this document was not edited in Photoshop or Illustrator prior to being released. In Acrobat Reader, you can find some metadata about the document, including when it was created (April 27th) and what program created it (Preview, on a Mac running OS X 10.6.7). “Preview” is a simple image viewing application on the Mac platform, which (incidentally) has the means to capture images from a scanner, and the ability to save PDFs.

Metadata

The creation date and producer fields cannot be edited in Acrobat’s document properties, even with the full version of Acrobat (which is what I have the file open in for the screenshot above). If this image were doctored in Photoshop and then exported as a PDF (“forgetting” to flatten the “layers”), the “PDF Producer” field would clearly list Photoshop, and it doesn’t.
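
If you want to look at these fields without Acrobat, they live in the PDF’s document information dictionary under keys such as /Creator and /Producer. Here is a minimal sketch that pulls one of them out of raw bytes. The helper name `read_info_field` is my own, the sample dictionary is hand-written with invented values, and the approach assumes the value is stored uncompressed as a simple parenthesised string with no escaped or nested parentheses, which will not hold for every file.

```python
import re


def read_info_field(pdf_bytes: bytes, key: str):
    """Return the literal-string value of /key, or None if not found.

    Assumption: the value is a plain (parenthesised) string, stored
    uncompressed, with no nested or escaped parentheses.
    """
    pattern = rb"/" + re.escape(key.encode("ascii")) + rb"\s*\(([^)]*)\)"
    match = re.search(pattern, pdf_bytes)
    return match.group(1).decode("latin-1") if match else None


# Hand-written Info dictionary for illustration; these values are invented
# and are not copied from any real document.
sample = b"<< /Creator (Example App) /Producer (Example Producer 1.0) >>"

print(read_info_field(sample, "Creator"))
print(read_info_field(sample, "Producer"))
print(read_info_field(sample, "Title"))
```

For a real file you would read it in binary mode and pass the bytes in; a proper PDF library is the better choice for anything beyond a quick look.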

Conclusion:

So, applying Occam’s Razor, what is more likely: that a White House staffer hit an “Easy Scan” button that included some text enhancements that got foiled by the security pattern and the fact that letters ran into line elements? Or that the two-day gap between stamping and creation of the PDF was spent by a graphic artist painstakingly separating some of the text from the background so text could be changed? I’m going with the former.