How Excel’s hidden CalcChain can catch data cheats

A hidden part of Excel can be used to catch ‘data cheats’: people who tweak workbooks to get the results they want.

There’s a news story about research into honesty in which the academic is, ironically, accused of dishonesty in their own findings. The professor is suspected of tampering with source data to change the conclusion.

What caught our attention was one of the ways this kind of tampering is detected: a little-known, hidden part of Excel called calcChain.xml. CalcChain is one of the XML files that make up an Excel xlsx/xlsm file.

Sometimes people ‘tweak’ their data lists to change the result. Unhelpful data might be removed or moved. The Excel workbook won’t show those alterations directly but calcChain.xml will.

About CalcChain

CalcChain.xml is there to help Excel; we humans don’t normally need to worry about it.

The “Calculation Chain” file tells Excel the order in which to calculate cells. A side effect is that calcChain.xml can show if a cell has been moved or changed from the original order.
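As a rough illustration of that side effect, here is a minimal Python sketch. It assumes you already have the cell references in the order they appear in calcChain.xml, and it flags places where two neighbouring entries in the same column are not adjacent rows. The function names and the sample chain are made up for this example, and this is only a simple heuristic, not the procedure the Data Colada authors used. Excel can reorder the chain for its own reasons, so a flagged cell is a hint to look closer, not proof of tampering.

```python
import re

def split_ref(ref):
    """Split an A1-style reference such as 'C12' into ('C', 12)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    return match.group(1), int(match.group(2))

def non_adjacent_neighbours(chain_refs, column):
    """Return (previous_ref, ref) pairs for one column where two
    consecutive chain entries are not in adjacent rows (heuristic only)."""
    breaks = []
    previous = None  # (ref, row) of the last entry seen in this column
    for ref in chain_refs:
        col, row = split_ref(ref)
        if col != column:
            continue
        if previous is not None and abs(row - previous[1]) != 1:
            breaks.append((previous[0], ref))
        previous = (ref, row)
    return breaks

# Hypothetical chain: the formula now in C7 sits between C3 and C4.
sample_chain = ["C2", "C3", "C7", "C4", "C5", "C6", "C8"]
print(non_adjacent_neighbours(sample_chain, "C"))
```

In the made-up chain above, the breaks cluster around C7, which is the kind of pattern that suggests a row was moved after its formula was first entered.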

Source: https://datacolada.org/109

Data Colada (datacolada.org) has an explanation of calcChain.xml and how it helps detect data cheating (gotta love the domain name).

CalcChain does NOT need Track Changes to be turned on.

If you want to see CalcChain.xml in an Excel file, see How to look inside an Office document.

CalcChain.xml is in the /xl folder along with other Excel components like styles.xml and workbook.xml. Use any text editor (Notepad, Notepad++ etc) to view the XML contents.
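If you would rather not unzip the workbook by hand, a short Python script can pull the same information out. This is a minimal sketch using only the standard library. It assumes the entry is named xl/calcChain.xml and that each cell is a `c` element with an `r` (cell reference) attribute and an optional `i` (sheet id) attribute. The function name `read_calc_chain` is made up for this example, and carrying the sheet id forward when `i` is missing is an assumption about how Excel writes the file.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main SpreadsheetML parts of an xlsx/xlsm file.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

def read_calc_chain(source):
    """Return [(sheet_id, cell_ref), ...] in calculation-chain order.

    source is a path or file-like object for an .xlsx/.xlsm workbook.
    """
    with zipfile.ZipFile(source) as workbook:
        if "xl/calcChain.xml" not in workbook.namelist():
            return []  # no calcChain part found in this workbook
        root = ET.fromstring(workbook.read("xl/calcChain.xml"))

    chain = []
    sheet_id = None
    for cell in root.iter(NS + "c"):
        # Assumption: a missing 'i' means "same sheet as the previous cell".
        sheet_id = cell.get("i", sheet_id)
        chain.append((sheet_id, cell.get("r")))
    return chain
```

Calling `read_calc_chain` with the path to a workbook returns the cells in the order Excel stored them, ready to scan for rows that look out of sequence.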

More Info

Data Falsificada (Part 1): “Clusterfake” is a detailed explanation of how the evidence of fraud was found.

The New York Times – The Harvard Professor and the Bloggers

The New Yorker – They Studied Dishonesty. Was Their Work a Lie?

The Atlantic – An Unsettling Hint at How Much Fraud Could Exist in Science

The Harvard Crimson details Prof. Gino’s rebuttal

Not the first time ….

Office Watch has covered academics’ problems with Excel before:

A decade ago there was the “Excel Depression” when a faulty worksheet led to some bad policies.

Geneticists have been using Excel badly, leading to gene names being converted into dates. Gene names have been changed to avoid the problem and Microsoft improved text importing.

About this author

Office-Watch.com

Office Watch is the independent source of Microsoft Office news, tips and help since 1996. Don't miss our famous free newsletter.
