
Microsoft stumbles too: when will the "Millennium Bug" ever end?

Author | Chu Xingjuan, Nuclear Coke

Integer overflow is an old problem. Why are we still making this mistake?

At the start of the new year, a date-check error hit Microsoft Exchange Server 2016 and 2019, leaving servers unable to handle 2022 timestamps correctly. Some people therefore call it the Y2K22 bug, that is, the 2022 edition of the Millennium Bug.

Reportedly, Microsoft's mail software stores this date-and-time value as a signed 32-bit integer, whose maximum value is 2147483647, that is, 2^31 - 1. Microsoft uses the first two digits of the update's version number to indicate its release year, so as long as the year was 2021 or earlier, everything worked. However, when Microsoft released version 2201010001 on New Year's Eve, a number larger than that maximum, on-premises servers failed to parse the value correctly, and messages awaiting delivery froze in the transport queue.
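The arithmetic behind this failure can be sketched in a few lines of Python. This is only an illustration, not Exchange's actual code, which is not public: the helper names are made up, the 2021-style version number is a hypothetical value used for comparison, and whether the real code wraps around or raises an error is not stated in the reports.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit integer


def fits_in_int32(value: int) -> bool:
    """Return True if value can be stored in a signed 32-bit integer."""
    return -2**31 <= value <= INT32_MAX


def wrap_to_int32(value: int) -> int:
    """Simulate two's-complement wraparound into a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


old_version = 2112310001  # hypothetical 2021-style number, for comparison only
new_version = 2201010001  # the version number cited in the article

print(fits_in_int32(old_version))  # True
print(fits_in_int32(new_version))  # False
print(wrap_to_int32(new_version))  # -2093957295
```

Any version number that starts with 22 and has ten digits is larger than 2147483647, which is why the failure appeared exactly when the year changed.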

Administrators around the world scrambled to troubleshoot the problem, missing precious time to welcome the New Year with friends and family. "What the hell is Microsoft doing? It's almost New Year, and if the forum hadn't shown that everyone was having the same problem, we would have had to run back to work," one administrator wrote in a Reddit thread.

Microsoft released a fix the next day: an automated PowerShell script, plus a manual procedure for cases where the script cannot be run. Either way, administrators need to carry out the remediation separately on each affected on-premises Exchange 2016 and 2019 server. Fortunately, the automated script can be run in parallel on multiple servers. Microsoft stressed that the script "may take a while to run" and asked administrators to be patient.

The date and time check happens when Exchange checks the version of FIP-FS, a scanning engine that is part of Exchange's antimalware protection. Once the FIP-FS version starts with the number 22, the check cannot complete and messages in transit come to an abrupt halt. Microsoft's fix stops the Microsoft Filtering Management service and the Microsoft Exchange Transport service, removes the existing antivirus engine files, and then installs and starts the corrected antivirus engine.

Most of the affected organizations are now back to normal. It is unclear how long the bug has existed, but judging by the affected versions, it likely dates back to the development of Exchange Server 2016.

The same mistake, over and over

Fundamentally, the Millennium Bug is a bug in date handling. It is not a hard technical problem, yet it is a mistake that businesses keep making.

In November 2019, it emerged that some HP SSDs stop working after 32,768 hours of operation, with all data on the drive lost and unrecoverable. All the drives in a given system may come from the same batch, carry the same firmware and therefore the same bug, and if they all fail at once, even a RAID array cannot survive such a "collective strike".

HP did not give a detailed explanation and simply released a firmware update to fix the issue. Judging from the symptoms, though, the problem is most likely tied to a 16-bit value in the code: a signed 16-bit integer can hold nothing lower than -32768 and nothing higher than 32767.
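A minimal sketch of what a signed 16-bit power-on-hours counter would do at that boundary is shown below. HP did not publish the firmware logic, so the counter and the helper names here are assumptions made purely for illustration.

```python
INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   # 32767


def wrap_to_int16(value: int) -> int:
    """Simulate two's-complement wraparound into a signed 16-bit integer."""
    return (value - INT16_MIN) % 2**16 + INT16_MIN


def years_until_overflow(limit_hours: int = 2**15) -> float:
    """Convert a power-on-hours limit into years of continuous operation."""
    return limit_hours / (24 * 365)


hours = INT16_MAX                      # hypothetical power-on-hours counter
print(wrap_to_int16(hours + 1))        # -32768
print(round(years_until_overflow(), 2))  # 3.74
```

At roughly three and three-quarter years of continuous operation, the limit lies far beyond any normal test cycle.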

Numeric overflow is one of the most common programming errors. Any code can fail once a value reaches its limit and no overflow or underflow check catches it. Many developers therefore default to very wide integer types: as long as the range is large enough, they do not have to worry about accidental overflow.
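The kind of overflow check mentioned above could look like the following sketch. The function name and the default width are illustrative choices, not taken from any particular codebase; Python integers never overflow on their own, so the bounds are enforced explicitly.

```python
def checked_add(a: int, b: int, bits: int = 16) -> int:
    """Add two integers, raising instead of silently wrapping past the signed range."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    result = a + b
    if not lowest <= result <= highest:
        raise OverflowError(f"{a} + {b} does not fit in a signed {bits}-bit integer")
    return result


print(checked_add(32766, 1))  # 32767

try:
    checked_add(32767, 1)
except OverflowError as err:
    print("caught:", err)
```

Failing loudly at the boundary turns a silent corruption into an error that can be logged and handled.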

This approach does not always work, however. Many microcontrollers can only work with 8-bit or 16-bit integers. Because these values are often tied to peripheral controllers, it is important to define explicit range limits so that developers and code reviewers know exactly what each value can hold.

Overflow conditions like these also tend to produce hard-to-find bugs. In the HP SSD case, a drive has to run for years before it reaches the limit, so an error that only triggers under such rare conditions is genuinely hard to detect. If the SSD happened to be serving a self-driving car, the vehicle could well cause a serious accident the moment the drive stopped working.

Besides the HP SSD incident, the failure of the first Ariane 5 test launch also came down to this kind of "small" mistake. On June 4, 1996, the Ariane 5 launch vehicle made its first test flight. It veered off course about 37 seconds after liftoff and broke up and self-destructed roughly 40 seconds into the flight. The $500 million launch was wiped out in an instant.

Part of Ariane 5's control software directly reused code from the Ariane 4 rocket. One value held in 64 bits was converted into a 16-bit variable to save storage space. On the faster Ariane 5 the value grew too large to fit, the conversion overflowed, the navigation system could no longer control the rocket, and the program entered its exception-handling path and triggered self-destruct. The failure became one of the most notorious and expensive software bugs in history.
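The narrowing conversion described above can be sketched as follows. This is not the Ariane flight code; the function names and the input values are hypothetical and serve only to show how a value that fits comfortably in 64 bits can fall outside the range of a signed 16-bit integer.

```python
INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   # 32767


def to_int16_unchecked(value: float) -> int:
    """Truncate and wrap, as a conversion with no range check might behave."""
    return (int(value) - INT16_MIN) % 2**16 + INT16_MIN


def to_int16_checked(value: float) -> int:
    """Truncate, but refuse values outside the signed 16-bit range."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return truncated


slow_rocket_reading = 20000.0   # hypothetical value that fits in 16 bits
fast_rocket_reading = 40000.0   # hypothetical larger value from a faster vehicle

print(to_int16_checked(slow_rocket_reading))    # 20000
print(to_int16_unchecked(fast_rocket_reading))  # -25536
```

A check alone is not enough: the caller also has to decide what a safe reaction to an out-of-range value is, which is the part that reused code tends to get wrong when its operating conditions change.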

It once terrified the whole world

The roots of the Millennium Bug go back to the 1960s. Computer memory was very expensive at the time, and representing the year with four digits would have taken up more memory and raised costs. To save storage space, programmers therefore used two digits to represent the year.

This saved resources, but it also introduced a hidden danger. When the date rolls over from 1999 to 2000, what happens when 99 becomes 00? Some worried that computers would not know how to interpret such a value, producing invalid dates and causing computing facilities around the world to fail.
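The rollover from 99 to 00 can be illustrated with a short sketch. The function names and the pivot value of 50 are illustrative assumptions; the pivot ("windowing") variant is one commonly used style of repair, not something the article attributes to any specific system.

```python
def expand_two_digit_year(yy: int) -> int:
    """Legacy assumption: every two-digit year belongs to the 1900s."""
    return 1900 + yy


def expand_with_pivot(yy: int, pivot: int = 50) -> int:
    """Windowing repair: years below the pivot are read as 20xx, the rest as 19xx."""
    return (2000 if yy < pivot else 1900) + yy


def age_in_years(birth_yy: int, current_yy: int, expand=expand_two_digit_year) -> int:
    """Compute an age from two-digit years using the given expansion rule."""
    return expand(current_yy) - expand(birth_yy)


print(age_in_years(90, 99))                            # 9    (1999 - 1990)
print(age_in_years(90, 0))                             # -90  (1900 - 1990)
print(age_in_years(90, 0, expand=expand_with_pivot))   # 10   (2000 - 1990)
```

Windowing only postpones the ambiguity to the pivot year, which is why storing four-digit years is the more durable repair.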

To get safely from "December 31, 1999" to "January 1, 2000", the world reportedly invested about $300 billion to $600 billion in fixing the Millennium Bug. The effort largely paid off, but some problems still slipped through, a few of them almost comical.

The United States was particularly proactive about the Millennium Bug. Of the at least $100 billion spent across the country, about $9 billion was spent by the federal government. The Pentagon's intelligence and defense systems were a major destination for that funding (about $3.5 billion in total). Yet despite months of costly software repairs and hardware upgrades, the government suffered a serious spy satellite failure during the first three days of 2000. Only after the system was restarted could the satellite once again send back usable information.

Three days doesn't sound like a long time, but a Pentagon official still classified the incident as "significant." Ironically, it wasn't the Millennium Bug that caused the crash, but the software patch that addressed the bug.

The U.S. Naval Observatory also slipped up briefly because of the Millennium Bug. The observatory has one main job: keeping time. Founded in 1830, the agency was originally responsible for the nation's nautical instruments and gradually became the official timekeeper of the United States. That status made it all the more embarrassing when the Naval Observatory announced the date as "January 1, 19100" on the first day of the millennium, although the problem was fixed less than an hour after it was reported.


Inside the U.S. Naval Observatory in Washington, D.C., December 29, 1999
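A date such as "January 1, 19100" is the typical result of gluing the text "19" in front of a counter of years since 1900. The article does not describe the Naval Observatory's actual code, so the sketch below, with made-up function names, only shows that common pattern.

```python
def format_year_buggy(years_since_1900: int) -> str:
    """Prefix '19' as text, which only works while the counter has two digits."""
    return "19" + str(years_since_1900)


def format_year_fixed(years_since_1900: int) -> str:
    """Add 1900 arithmetically before formatting."""
    return str(1900 + years_since_1900)


print(format_year_buggy(99))    # 1999
print(format_year_buggy(100))   # 19100
print(format_year_fixed(100))   # 2000
```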

Beyond the United States, Japan's nuclear power plants were also affected by the Millennium Bug. Two minutes after the New Year's bell rang, an alarm sounded at Japan's Onagawa Nuclear Power Plant when computers detected a problem with a device that measures the temperature of the surrounding seawater. Fortunately, the failure lasted only about 10 minutes, after which everything returned to normal and no serious consequences were found.

A similar incident occurred at the Shiga Nuclear Power Plant in Japan, where a Millennium Bug malfunction took down part of the station's alarm system. To make matters worse, a power station monitoring computer in a government office went down along with its alarm system. In short, similar minor problems occurred throughout Japan that day, but they were quickly contained and corrected. Japanese officials did not say whether the incidents were related to the Millennium Bug.

Y2K errors also hit the financial sector. At the Hong Kong Futures Exchange, the computer system that controls the pricing of Hang Seng Index option contracts miscalculated the number of days between the trade date and the expiry date for many options trades. The Federal Reserve Bank of Chicago was unable to complete a $700,000 tax transfer. A bank in Chicago lost its electronic medical insurance payment function for some hospitals, and the insurance company that processed and paid the claims had to send a floppy disk containing the processed claim information to the bank by FedEx to guarantee on-time payment.

There were also some stories that are equal parts funny and sad:

Newborns registered as centenarians. Denmark's first "millennium baby" had only just been born when the hospital computer registered the child as 100 years old. Deutsche Oper's computer system jumped back to 1900 on January 1, 2000, dramatically changing the ages of all employees and their children. Children born in 1990 were suddenly treated as 90 years old, and many employees could not receive the government child benefit that was normally paid with their salaries.

The completely useless, non-returnable "Millennium Bug Survival Pack". As fear of a Millennium Bug catastrophe spread around the world, many companies launched "Millennium Bug Survival Packs" months in advance. The business quickly grew into a multimillion-dollar market. One company, Preparedness Resources, earned $16 million selling survival kits that included dehydrated food, water purifiers, battery-free flashlights, blankets and waterproof matches. Its clear-headed president, Scott Sperry, also set a strict "no returns" policy early on.

A surprise taste of "getting rich overnight". The Millennium Bug let a man in Germany feel like a rich man on the first day of the new century. That day, about $6 million was inexplicably credited to his bank account, with the transaction dated December 30, 1899. Officials at the time were not sure whether the unusual transfer had anything to do with the Millennium Bug. The only certainty was that the man would not actually get rich overnight.

From today's point of view, the world's fears about the Millennium Bug may look overblown, but that is largely because countries had invested hundreds of billions of dollars in fixing it years in advance.

Bill Gates stressed in an interview that the Millennium Bug "did not make any waves in the end because all parties really spared no effort to repair it. Without such efforts, the world would have been greatly affected."

Can the Millennium Bug be avoided?

Right up to 1999, governments and businesses around the world were searching for fixes for Y2K. Yet this class of bug has still not been eliminated, and it may well reappear.

Similar to the Millennium Bug, the time overflow problem on 32-bit Unix and Linux systems is known as the "Year 2038 problem", and every program that uses POSIX time to represent time is affected. The problem stems from how C, the language Unix and Linux are written in, represents time.

C uses the type time_t to represent dates and times. It records the number of seconds elapsed since 00:00:00 UTC on January 1, 1970 and, on the affected systems, is stored as a 32-bit signed integer. The first bit is the sign bit and the remaining 31 bits hold the value, so the largest number that can be stored is 2147483647, which corresponds to 03:14:07 UTC on January 19, 2038.

One second later the number cannot increase any further. Instead it wraps around to -2147483648, which corresponds to 20:45:52 UTC on December 13, 1901. This can cause many programs to malfunction or even crash.
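These two boundary dates can be checked with a short Python sketch that simulates a signed 32-bit seconds counter. Python's own integers do not overflow, so the wraparound is emulated by hand; the helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # 2147483647


def wrap_to_int32(value: int) -> int:
    """Simulate two's-complement wraparound into a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


def posix_to_utc(seconds: int) -> datetime:
    """Convert seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_valid = posix_to_utc(INT32_MAX)
after_wrap = posix_to_utc(wrap_to_int32(INT32_MAX + 1))

print(last_valid.isoformat())  # 2038-01-19T03:14:07+00:00
print(after_wrap.isoformat())  # 1901-12-13T20:45:52+00:00
```

The conversion adds a timedelta to the epoch rather than calling datetime.fromtimestamp, because some platforms reject negative timestamps.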

The 2038 problem is both better hidden and more destructive than the Millennium Bug. The Millennium Bug mainly affected application-level programs such as credit card payment systems or management systems, whereas the 2038 problem affects the system's lowest-level timekeeping functions.

Linux kernel 5.6, released in March 2020, is said to solve this problem, so that 32-bit systems can keep running past 2038. Linux developer Arnd Bergmann says user space can be built with a 64-bit time_t using GNU C Library 2.32 and musl libc 1.2.

While systemic issues like the "2038 problem" may take a long time to explore and solve, a millennium bug like Microsoft's is completely avoidable.
