“Bush hid the facts” bug
"Bush hid the facts" is a common name for a bug present in some versions of Microsoft Windows, which causes text encoded in ASCII to be interpreted as if it were UTF-16LE, resulting in garbled text. When the string "Bush hid the facts", without a trailing newline or quotes, was typed into a new Notepad document that was then saved, closed, and reopened, the nonsensical sequence of Chinese characters "畂桳栠摩琠敨映捡獴" appeared in its place.
While "Bush hid the facts" is the sentence most commonly presented on the Internet to induce the error, the bug can be triggered by many strings with letters and spaces in the same positions, for example "hhhh hhh hhh hhhhh". Other sequences trigger the bug as well, including even the text "a ".
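The garbling can be reproduced outside Windows by deliberately decoding ASCII bytes as UTF-16LE. The following is a minimal sketch: the helper name misread_as_utf16le is made up for illustration, and the code only mimics the misinterpretation; it does not call any Windows API.

```python
# Reinterpret ASCII bytes as UTF-16LE: every pair of bytes becomes one
# 16-bit code unit, low byte first (e.g. "B" 0x42 + "u" 0x75 -> U+7542).
# misread_as_utf16le is a hypothetical helper, not part of any library.

def misread_as_utf16le(text: str) -> str:
    raw = text.encode("ascii")      # the bytes that end up in the saved file
    return raw.decode("utf-16-le")  # the wrong interpretation of those bytes

samples = ["Bush hid the facts", "hhhh hhh hhh hhhhh", "a "]

for s in samples:
    garbled = misread_as_utf16le(s)
    code_points = " ".join(f"U+{ord(c):04X}" for c in garbled)
    print(f"{s!r} -> {garbled!r} ({code_points})")
```

Each sample has an even number of bytes, so it splits cleanly into 16-bit units. For the two longer strings those units fall in the CJK ideograph range, which is why the result looks like Chinese text.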
The bug occurs when the string is passed to the Win32 charset detection function IsTextUnicode. IsTextUnicode sees that the bytes match the UTF-16LE encoding of valid (if nonsensical) Chinese Unicode characters, concludes that the text is valid UTF-16LE Chinese and returns true, and the application then incorrectly interprets the text as UTF-16LE.
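To see why a validity check alone cannot settle the question, consider a toy detector. This sketch is not the algorithm IsTextUnicode uses; the function names naive_looks_like_utf16le and is_plain_ascii are hypothetical and exist only to show that the same bytes are well-formed under both encodings.

```python
# Toy illustration only: this is NOT how IsTextUnicode is implemented.
# It shows that "decodes cleanly as UTF-16LE" is too weak a test.

def naive_looks_like_utf16le(raw: bytes) -> bool:
    """Return True if raw is non-empty, even-length, and valid UTF-16LE."""
    if len(raw) == 0 or len(raw) % 2 != 0:
        return False  # UTF-16 text always occupies a whole number of 16-bit units
    try:
        raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return False  # e.g. an unpaired surrogate
    return True

def is_plain_ascii(raw: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in raw)

raw = b"Bush hid the facts"
print(naive_looks_like_utf16le(raw))  # the bytes are well-formed UTF-16LE
print(is_plain_ascii(raw))            # and they are also well-formed ASCII
```

Both checks succeed on the same 18 bytes, so a detector that stops at the first plausible answer can pick the wrong encoding. Under this toy check, the 17-byte string "Bush hid the fact" would be rejected as UTF-16LE simply because its length is odd.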
The bug had existed since IsTextUnicode was introduced with Windows NT 3.5 in 1994, but was not discovered until early 2004. Many text editors and tools exhibit this behavior on Windows because they use IsTextUnicode to determine the encoding of text files. As of Windows Vista, Notepad has been modified to use a different detection algorithm that does not exhibit the bug, but IsTextUnicode remains unchanged in the operating system, so any other tools that use the function are still affected.