A few weeks ago, Richard Smith – whom many of you will recall as a tireless defender of personal privacy – sent me an interesting bit of mail:
“Microsoft Word documents are notorious for containing private information in file headers which people would sometimes rather not share. The British government of Tony Blair just learned this lesson the hard way.
Last week, Alastair Campbell, Blair’s Director of Communications and Strategy, was in the hot seat in British Parliament hearings explaining what roles four of his employees played in the creation of a plagiarized dossier on Iraq which the UK government published in February 2003. The names of these four employees were found hidden inside of a Microsoft Word file of the Iraq dossier which was posted on the 10 Downing Street Web site for use by the press. The “dodgy dossier,” as it became known in the British press, raised serious questions about the quality of British intelligence before the second Iraq war.
I wrote an article for my Web site about how a bit of computer forensics analysis played a role in this controversy.”
I strongly recommend that you take a look at Richard’s article. You’ll discover that Word saves the names of the last ten people who edited a document – as well as the file name and full path of the document at the time it was edited. You’ll also discover that this information can be a bit, uh, more revealing than you might imagine.
It turns out there’s no way to switch this off. I repeat: no matter which options you set, every Word document contains the names of the last ten people who edited the document, and the path and file name of the document when it was edited.
Those ten names appear whether Revision Tracking is on or off. They appear even if you’ve told Word 2002 to “Remove personal information from this file on save” (Tools | Options | Security – the option doesn’t exist in Word 2000 and earlier).
Microsoft’s response? Tough luck. That’s the way Word was designed. Knowledge Base article 290945 says: “Word stores the names of the last 10 people who worked on a document in the document. This is an automatic feature that you cannot turn off. However, you can remove the names of the last 10 authors from a document by saving the document in a format that does not retain such information. For example, if you save the document in either RTF (Rich Text Format) or HTML format, the authoring information is lost. You can then close and reopen the RTF or HTML document, and then save it in Word format.”
It’s remarkably easy to get at that information. Richard sent me a program several months ago that examines any Word document and, in most cases, coughs up the entire revision history in a matter of seconds.
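To give a feel for how little effort this takes, here is a minimal sketch in Python. It is not Richard’s program, and it does not parse Word’s file format. It assumes the names and paths sit in the file as UTF-16LE text (an assumption on my part, not something the Knowledge Base article states), pulls out every run of printable characters, and keeps the pieces that look like Windows paths. The function names and the four-character minimum are made up for illustration.

```python
import re
from typing import List

# Runs of 4+ printable ASCII characters stored as UTF-16LE,
# i.e. each character byte is followed by a NUL byte.
# The 4-character minimum is an arbitrary choice to cut down on noise.
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")

# Heuristic for Windows-style paths: a drive letter (C:\...) or a
# UNC prefix (\\server\share), followed by the rest of the run.
PATH_LIKE = re.compile(r"(?:[A-Za-z]:\\|\\\\).*")


def utf16_strings(data: bytes) -> List[str]:
    """Return every run of printable UTF-16LE text found in data."""
    return [m.group().decode("utf-16-le") for m in UTF16_RUN.finditer(data)]


def path_like_strings(data: bytes) -> List[str]:
    """Return the path-looking portion of each run that contains one."""
    found = []
    for text in utf16_strings(data):
        match = PATH_LIKE.search(text)
        if match:
            found.append(match.group())
    return found


def scan_file(path: str) -> List[str]:
    """Read a file as raw bytes and return its path-like strings."""
    with open(path, "rb") as handle:
        return path_like_strings(handle.read())
```

Calling scan_file on a .doc file of your choosing returns the list of path-like strings it found. Because this is a blind scan, stray bytes that happen to be printable can get glued onto a run, so treat the output as a hint and not as a reliable record of who edited what.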
Makes me wonder about Word 2003. Is the entire revision history stored away in plain text when you save in XML format?
Also, don’t forget: if you tell Word 2002 to “Remove personal information from this file on save”, and then you send the document to someone else using Outlook 2002, Outlook will brand the file with personally identifiable information (details here).
Between the Word “feature” that saves details about the last ten edits inside a document and the Outlook “feature” that stamps Word documents with unique numbers that identify the originating PC, I’d say that the “Remove personal information from this file on save” terminology is a bit, uh, disingenuous. A more cynical soul might even judge the phrase highly deceptive.
A Word to the wise. If you give a Word document to someone else, or post it on the Web, don’t be too surprised if it contains all sorts of embarrassing details. To be safe, use PDF format.
- Another victim of Word ‘metadata’
- Outlook 2002 Privacy Busting “Feature”
- The Case for PDF
- Hidden information in Microsoft documents