Which version of alphabetical order does Microsoft Word use? I thought that alphabetical order was, well, alphabetical order. That’s what I was taught at school and apparently I was taught wrong. There are at least three versions, two used by dictionaries and another by Microsoft, and that’s for English alone!
Reading Sue Butler’s delightful book ‘The Aitch Factor’ I found that there’s more than one alphabetical order! Naturally, that got me wondering which alphabetical order is used by Microsoft Word and Office when sorting.
Look at these two word lists. Both have the same words and both are in alphabetical order.
The left-hand list is in the more common order. Spaces and hyphens are ignored, then the entries are sorted letter by letter.
The right-hand list is another alphabetical order, used by some dictionaries. The key or head word (‘bush’) is followed by the key word compounds (‘bush wire’).
The left-hand list is more common these days. Ms Butler suggests that’s because spacing and hyphenation can vary too much (we’ve seen ‘bushbaby’, ‘bush-baby’ and ‘bush baby’). But we think it’s also because it’s easier and faster to program on a computer. Replicating the other method takes more work: splitting the entries up, sorting them, then putting the entries back together.
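To make the difference concrete, here is a minimal Python sketch of the two dictionary orders. The word list and the function names are made up for illustration and are not taken from Ms Butler's book or from any dictionary's actual rules.

```python
import re

def letter_by_letter_key(entry):
    # Left-hand method: drop spaces and hyphens, then compare letter by letter.
    return re.sub(r"[\s-]", "", entry).lower()

def word_by_word_key(entry):
    # Right-hand method: split into words so the head word is compared first,
    # then each following word in turn.
    return tuple(re.split(r"[\s-]+", entry.lower()))

# Hypothetical sample entries.
words = ["bushfire", "bush wire", "bushbaby", "bush", "bush breakfast"]

letter_by_letter = sorted(words, key=letter_by_letter_key)
word_by_word = sorted(words, key=word_by_word_key)

print(letter_by_letter)
# ['bush', 'bushbaby', 'bush breakfast', 'bushfire', 'bush wire']
print(word_by_word)
# ['bush', 'bush breakfast', 'bush wire', 'bushbaby', 'bushfire']
```

The second key does the extra "split, sort, reassemble" work by turning each entry into a tuple of words, so 'bush' and all its compounds stay together ahead of 'bushbaby'.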
Sorting in Word
Now we turn to Microsoft Word. You might expect the same words to be sorted using the left-hand or common method. But no, there’s another ‘alphabetical order’ from Word.
- bush breakfast
- bush wire
Why is the Word ‘alphabetical order’ different?
The usual answer
Word, and most computer programs, don’t really sort alphabetically. They sort numerically, comparing the ASCII/Unicode value of each character from left to right. A lower character value comes before a higher one.
‘b’ has the value 98, so words starting with ‘b’ come before words starting with ‘c’, which is 99. A space has ASCII code 32, so it sorts ‘above’ any letter. That explains why ‘bush wire’, with a space in the fifth position, is above ‘bushbaby’, where ‘b’ (98) is the fifth character. For alphabetical sorting, upper and lower case letters are treated the same, despite having different ASCII values.
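This purely numeric model is easy to try out. The sketch below uses Python's built-in `sorted()`, which compares strings by code point; it illustrates the "usual answer" only and makes no claim about what Word does internally. The sample words are our own.

```python
words = ["bushbaby", "bush wire", "bush"]

# sorted() compares strings code point by code point, left to right.
by_code_point = sorted(words)

print(ord(" "), ord("b"), ord("c"))  # 32 98 99
print(by_code_point)                 # ['bush', 'bush wire', 'bushbaby']

# Upper and lower case letters have different codes: 'Z' is 90, 'a' is 97.
mixed_case = ["bush", "Zebra", "apple"]
case_sensitive = sorted(mixed_case)                      # ['Zebra', 'apple', 'bush']
case_insensitive = sorted(mixed_case, key=str.casefold)  # ['apple', 'bush', 'Zebra']
```

Folding case before comparing is one simple way to treat upper and lower case letters the same.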
The usual answer is wrong.
Modern software and a multi-lingual world mean that things are a lot more complicated. The ASCII or Unicode character value does NOT necessarily determine the ordering of a list. You can try this in Excel by comparing a sorted list of characters with the CODE() or UNICODE() values for the sorted cells; the two lists won’t always be in the same order.
The above example is sorted ‘A to Z’ by the first column but the code values are somewhat mixed up.
Another example, what about ‘bush-bash’ in the Word sorting example above? A hyphen is ASCII 45 so that should sort below spaces but above any letters – but it doesn’t. In a test of single characters (same settings as above), the sort order is space, apostrophe, hyphen, many other characters, digits and finally letters.
What’s going on?
Sorting isn’t done by ASCII/Unicode values anymore. That numerical order doesn’t always match the accepted sorting order for a language. For example, accented characters like Á Å Æ Ç É Ó Ö have high character codes (all above 127, the top of the plain ASCII range) but in English are normally sorted with their base letter (A, C, E or O).
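Here is a rough Python sketch of the accented-letter problem. It strips accents with the standard `unicodedata` module before comparing, which only approximates what a real collation does, and the sample words and the `base_letter_key` name are ours. Note that a ligature such as Æ has no accent to strip, so this shortcut would not move it next to A.

```python
import unicodedata

def base_letter_key(text):
    # Decompose each accented letter into base letter + combining mark,
    # then drop the marks so that É compares as E and Ö as O.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Hypothetical sample: \u00c9 is É (code 201), \u00d6 is Ö (code 214).
words = ["Zebra", "\u00c9clair", "Apple", "\u00d6l"]

by_code_point = sorted(words)
by_base_letter = sorted(words, key=base_letter_key)

print(by_code_point)   # ['Apple', 'Zebra', 'Éclair', 'Öl']
print(by_base_letter)  # ['Apple', 'Éclair', 'Öl', 'Zebra']
```

Sorting by raw code value pushes every accented word past 'Z', while the accent-stripping key files them with their base letters.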
Each language has a collation order that defines how to sort text in that language. It allows sorting in any order, regardless of the letter/symbol ASCII/Unicode numeric value.
Collation orders are part of the language setting; they can’t be set manually.
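One way to picture a collation order is as a lookup table that ranks each character, independent of its code value. The toy table below is entirely hypothetical and is not Word's real collation data; it only mimics the single-character result reported earlier (space, apostrophe, hyphen, other symbols, digits, letters), and placing '!' among the "other symbols" is our assumption.

```python
# Hypothetical ranking: space, apostrophe, hyphen, a few other symbols,
# then digits, then letters.
TOY_ORDER = " '-" + "!#$%&" + "0123456789" + "abcdefghijklmnopqrstuvwxyz"
RANK = {ch: position for position, ch in enumerate(TOY_ORDER)}

def toy_collation_key(text):
    # Rank each character by the table; anything not listed sorts last.
    return [RANK.get(ch, len(TOY_ORDER)) for ch in text.lower()]

chars = ["a", "7", "!", "-", "'", " "]

by_code_point = sorted(chars)
by_toy_collation = sorted(chars, key=toy_collation_key)

print(by_code_point)     # [' ', '!', "'", '-', '7', 'a']
print(by_toy_collation)  # [' ', "'", '-', '!', '7', 'a']
```

The exclamation mark has code 33, so a numeric sort puts it straight after the space, but the table is free to place it anywhere.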
The Regional Settings affect the collation order (aka sort). The usual advice is to change the Regional Settings in Windows to change the sort order but that’s not necessary in modern Word.
Go to Sort | Options to choose the Sorting language.
Word will automatically select the sorting language based on the language of the selected paragraph. You could change the sorting language to something different from the proofing language for the text.
Curiously, Excel doesn’t have an equivalent setting and you have to rely on the Windows Regional setting.