Beating the world's No.1 system and covering 80+ countries, Google's flood prediction model is back in Nature

Author: HyperAI

The "Canon of Yao" chapter of the Shangshu (Book of Documents) records: "The surging flood spreads destruction everywhere; its vast waters encircle the mountains and overtop the hills, swelling up to the heavens, and the people below lament." In the era of Yao and Shun, floods brought misery to the people, and the rulers resolved to find someone to tame the waters. Gun was appointed first but failed. His son Yu the Great then took up his father's task, giving rise to the legend that "Yu spent thirteen years taming the floods and passed his own door three times without entering."

In July 2023, a rare torrential downpour triggered by Typhoon Doksuri hit Beijing, causing record-breaking peak flood flows in the Daqing River Basin. According to People's Daily Online, more than 1.29 million people in Beijing were affected by the floods, with more than 59,000 houses collapsed, more than 147,000 severely damaged, and more than 225,000 acres of crops affected.

Source: China News Service

From ancient times to the present, people have often been vulnerable in the face of natural disasters such as floods. In the paper, Google research scientist Grey Nearing notes that an effective flood forecasting system can reduce flood-related deaths by up to 43% and economic losses by 35%-50%. Building such systems is therefore an important way for people to cope with flood disasters.

Most current global flood forecasting systems rely on streamflow gauging stations along rivers. Because of deployment costs, low- and middle-income countries tend to have few gauges installed, which makes it hard for them to prepare for floods in advance. The World Bank estimates that upgrading flood forecasting systems in developing countries to the level of developed countries could save about 23,000 lives each year. A flood forecasting system for ungauged watersheds is therefore urgently needed.

Fortunately, advances in artificial intelligence (AI) offer new hope for flood defense in ungauged watersheds. Grey Nearing and his team at Google Research have developed a machine learning-based river forecasting model that reliably predicts floods up to five days in advance. In predicting events with a 5-year return period, it performs as well as or better than the current state-of-the-art system does on 1-year return period events, and it covers more than 80 countries.

Research Highlights:

* The river forecasting model predicts floods more reliably than GloFAS, the world's most advanced flood forecasting system

* It offers better support for flood warnings in ungauged watersheds

Paper address:

https://www.nature.com/articles/s41586-024-07145-1

Dataset download address:

https://hyper.ai/dataset/30647

Dataset: From 5,680 watersheds

The study's full dataset includes model inputs and (runoff) target values from 5,680 watersheds, based on which the researchers trained and tested the model.

Locations of 5,680 runoff monitoring stations used to train the model

This study uses 3 types of publicly available data as inputs, mainly from governments:

* Static watershed data representing geographic and geophysical variables: from the HydroATLAS project, including long-term climate indicators (precipitation, temperature, snow cover), land cover, anthropogenic attributes, etc.

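* Historical meteorological time series data: from NASA IMERG, NOAA CPC Global Unified Gauge-Based Analysis of Daily Precipitation, and ECMWF ERA5-Land reanalysis. Variables include daily total precipitation, air temperature, thermal radiation, snowfall, and surface pressure.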

* Forecast weather time series over the seven-day forecast horizon: from the ECMWF HRES atmospheric model, with the same meteorological variables as above.

Model architecture: Construct a river forecasting model based on LSTM

LSTM-based river forecasting model architecture

The study builds its river forecasting model from two long short-term memory (LSTM) networks applied in sequence, arranged as an encoder-decoder model. The hindcast LSTM receives historical weather data and the forecast LSTM receives forecast weather data. The model outputs the parameters of a probability distribution at each forecast time step, giving a probabilistic prediction of the volumetric flow of a particular river at a particular time.

In addition, the researchers trained the model for 50,000 minibatches, with all input data normalized beforehand. To give the model enough learning capacity, the encoder and decoder LSTMs each have a hidden size of 256 cell states, and state is passed from the encoder to the decoder through a linear cell-state transfer network and a nonlinear hidden-state transfer network.
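The data flow described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the class name `RiverForecastSketch`, the tiny hidden size of 8 (the study uses 256), the `tanh` activation in the hidden-state transfer, the two output parameters (a location and a scale), and the random untrained weights are all hypothetical choices made to keep the example short and runnable. Static watershed attributes are left out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_weights(n_out, n_in, rng):
    # One row per output unit; the extra column holds the bias.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def dense(weights, vec, activation=None):
    # Affine layer: appending 1.0 to the input multiplies the bias column.
    out = [sum(w * v for w, v in zip(row, vec + [1.0])) for row in weights]
    return [activation(v) for v in out] if activation else out


class LSTMCell:
    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # Stacked weights for the input, forget, candidate and output gates.
        self.weights = make_weights(4 * n_hidden, n_in + n_hidden, rng)

    def step(self, x, h, c):
        n = self.n_hidden
        pre = dense(self.weights, x + h)
        i = [sigmoid(v) for v in pre[:n]]
        f = [sigmoid(v) for v in pre[n:2 * n]]
        g = [math.tanh(v) for v in pre[2 * n:3 * n]]
        o = [sigmoid(v) for v in pre[3 * n:]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


class RiverForecastSketch:
    """Hypothetical encoder-decoder: hindcast LSTM -> state transfer -> forecast LSTM."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.hindcast = LSTMCell(n_features, n_hidden, rng)
        self.forecast = LSTMCell(n_features, n_hidden, rng)
        self.cell_transfer = make_weights(n_hidden, n_hidden, rng)    # used linearly
        self.hidden_transfer = make_weights(n_hidden, n_hidden, rng)  # followed by tanh (assumed)
        self.head = make_weights(2, n_hidden, rng)  # assumed: location and log-scale

    def predict(self, past_weather, future_weather):
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in past_weather:  # encoder reads historical weather
            h, c = self.hindcast.step(x, h, c)
        c = dense(self.cell_transfer, c)               # linear cell-state transfer
        h = dense(self.hidden_transfer, h, math.tanh)  # nonlinear hidden-state transfer
        outputs = []
        for x in future_weather:  # decoder reads forecast weather
            h, c = self.forecast.step(x, h, c)
            location, log_scale = dense(self.head, h)
            outputs.append((location, math.exp(log_scale)))  # scale kept positive
        return outputs


rng = random.Random(1)
past = [[rng.random() for _ in range(5)] for _ in range(30)]   # 30 past days, 5 variables
future = [[rng.random() for _ in range(5)] for _ in range(7)]  # 7 forecast days
model = RiverForecastSketch(n_features=5)
forecast = model.predict(past, future)
for day, (location, scale) in enumerate(forecast, start=1):
    print(f"forecast step {day}: location={location:.3f}, scale={scale:.3f}")
```

Because the weights are random, the printed numbers carry no hydrological meaning. The point is the structure: one pair of distribution parameters per forecast step, produced by a decoder whose starting state comes from the encoder.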

Model optimization: Cross-validation reduces prediction error

The researchers trained and tested the river forecasting model on the 5,680 gauging stations using cross-validation, so that every test was out of sample. This allows the model's ability to generalize to be evaluated properly and makes the reported reliability more trustworthy.

First, in the time dimension, cross-validation folds are designed so that, for any monitoring station, the test data do not fall within one year of the training data. In the spatial dimension, k-fold cross-validation (k = 10) splits the stations evenly into folds. The two splits are applied together to avoid data leakage between training and testing.

Second, to examine how the model performs across geographic regions and environmental conditions, the researchers ran further cross-validation experiments, including non-random spatial splits by continent (k = 6), by climate zone (k = 13), and by hydrologically separated watershed group (k = 8).

* K-fold cross-validation: the dataset is divided into k subsets, of which one is used for validation and the remaining k-1 are used for training. This is repeated k times so that each subset serves as the validation set exactly once, and the k results are averaged to give the final evaluation of the model.
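The spatial splits can be illustrated with a short sketch. It assumes stations are identified by simple string IDs and that group labels such as continent are available in a dictionary. The function names `spatial_k_fold` and `leave_one_group_out`, the ID format, and the example labels are hypothetical, and the time-based separation is not shown.

```python
import random


def spatial_k_fold(gauge_ids, k=10, seed=0):
    """Yield (train_ids, test_ids) pairs; every gauge is held out exactly once."""
    ids = list(gauge_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        train_ids = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield train_ids, folds[i]


def leave_one_group_out(gauge_to_group):
    """Non-random split: hold out all gauges of one group (e.g. one continent) at a time."""
    for held_out in sorted(set(gauge_to_group.values())):
        test_ids = [g for g, grp in gauge_to_group.items() if grp == held_out]
        train_ids = [g for g, grp in gauge_to_group.items() if grp != held_out]
        yield held_out, train_ids, test_ids


# Random spatial split over 5,680 hypothetical gauge IDs with k = 10.
gauges = [f"gauge_{n:04d}" for n in range(5680)]
splits = list(spatial_k_fold(gauges, k=10))
print(len(splits), "folds,", len(splits[0][1]), "test gauges in the first fold")

# Group-based split with made-up continent labels.
continent_of = {"g1": "Africa", "g2": "Asia", "g3": "Africa", "g4": "Europe"}
held = list(leave_one_group_out(continent_of))
for name, train_ids, test_ids in held:
    print(name, "held out:", test_ids)
```

Holding out whole stations, rather than individual days, is what makes the evaluation speak to ungauged watersheds: the model is always scored on rivers it never saw during training.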

Experimental conclusion: The performance is better than the most advanced flood forecasting system in the world

To evaluate the reliability of flood event predictions, the researchers compared river forecasting models with the Global Flood Awareness System (GloFAS), the world's most advanced flood forecasting system.

Difference in F1 scores between the river forecasting model and GloFAS when predicting 2-year return period events in real time

* Red indicates a difference between -0.2 and 0

* Green indicates a difference between 0 and 0.2

First, the researchers compared the distribution of F1 scores of the river forecasting model and GloFAS when predicting 2-year return period events in real time (zero days in advance) over the period 1984 to 2021.

The results showed that the river forecasting model outperformed GloFAS at 70% of the stations evaluated (3,673 in total).
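For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision (the share of predicted events that actually occurred) and recall (the share of actual events that were predicted). A minimal sketch follows, assuming each day at a station is labeled simply as event or no event. The function name and the toy series are hypothetical, and the study's own event-matching rules are not reproduced here.

```python
def precision_recall_f1(observed, predicted):
    """Compute precision, recall and F1 from two equal-length boolean sequences."""
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)      # hits
    fp = sum(1 for o, p in zip(observed, predicted) if p and not o)  # false alarms
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: three observed events, two of them predicted, plus one false alarm.
observed = [False, True, False, False, True, False, True, False]
predicted = [False, True, True, False, False, False, True, False]
precision, recall, f1 = precision_recall_f1(observed, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Computing one such score per station for each model, then subtracting, gives the kind of per-station F1 difference that the comparison above summarizes.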

Real-time predictions: distributions of precision and recall for events of different return periods

* The blue dotted line is the reference baseline

* N is the number of monitoring stations

Second, the researchers analyzed the distributions of precision and recall for events of different return periods under real-time prediction.

The results show that the river forecasting model is more reliable across all return periods. For 1-year return period events, the precision of the river forecasting model did not differ significantly from that of GloFAS, while its recall was higher. Moreover, the precision of the river forecasting model on 5-year return period events was better than or equivalent to that of GloFAS on 1-year return period events. In other words, it predicts rarer, larger floods as reliably as the current state-of-the-art model predicts 1-year return period floods.

* Return period: the average number of years between floods whose peak flow reaches or exceeds a given level. The longer the return period, the larger the flood; the shorter the return period, the smaller the flood.
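As a rough illustration of the concept, return periods can be estimated empirically from a record of annual peak flows using the Weibull plotting position, T = (n + 1) / rank, where rank 1 is the largest peak in n years. This is a simple textbook estimate and is not claimed to be the method used in the study. The function names and the flow values (in cubic metres per second) are made up for the example.

```python
def empirical_return_periods(annual_peaks):
    """Return (peak, return_period_years) pairs, largest peak first."""
    n = len(annual_peaks)
    ranked = sorted(annual_peaks, reverse=True)
    return [(peak, (n + 1) / rank) for rank, peak in enumerate(ranked, start=1)]


def chance_within(years, return_period):
    """Probability of at least one event of this return period within `years` years."""
    return 1.0 - (1.0 - 1.0 / return_period) ** years


# Nine years of hypothetical annual peak flows.
peaks = [120, 340, 210, 95, 180, 400, 150, 260, 130]
periods = empirical_return_periods(peaks)
for peak, years in periods:
    print(f"peak {peak} m3/s: about a {years:.1f}-year event")

print(f"chance of a 10-year flood within 10 years: {chance_within(10, 10):.2f}")
```

A 2-year event therefore has a 50% chance of being reached or exceeded in any given year, which is why events with longer return periods are both larger and harder to predict: there are far fewer of them in the record.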

Distribution of F1 scores for events of different return periods when predicting 0-7 days in advance. The blue dotted line is the reference baseline.

Third, the researchers analyzed the distribution of F1 scores for different return period events when predicted 0-7 days in advance.

The results showed that for 1-year (a), 2-year (b), 5-year (c), and 10-year (d) return period events, the F1 scores of the river forecasting model up to 5 days in advance were either higher than or not significantly different from those of GloFAS's real-time predictions. In other words, up to 5 days in advance the river forecasting model forecasts floods as well as or better than GloFAS does in real time.

F1 scores by geographic location and return period

Fourth, the researchers analyzed the distribution of F1 scores when predicting events across geographic locations and return periods.

The results show that the reliability of both models varies significantly by geographic location. In addition, the F1 scores of the river forecasting model were higher than or not significantly different from those of GloFAS for 1-year (a), 2-year (b), 5-year (c), and 10-year (d) return period events.

From Europe's EFAS to China's Xin'anjiang model, AI has become an intelligent line of defense

In fact, as early as 2021, when Google presented its AI research at the "Inventors@Google" event, it mentioned Google Flood Hub, a machine learning-based flood forecasting system then used mainly in India to show local residents the flood situation visually. After three years of development, Google's latest flood forecasting system now extends to ungauged watersheds elsewhere, covering more than 80 countries.

Similarly, the European Flood Awareness System (EFAS) combines advanced weather forecasts and hydrological models with machine learning algorithms to forecast floods across Europe reliably at least ten days in advance, and sends early warnings to national and local flood centres in member states.

China is also one of the world's flood-prone countries, with about two-thirds of its land area exposed to some degree of flood risk. According to statistics, between 1991 and 2020 floods left more than 2,000 people dead or missing per year on average in China, more than 60,000 in total, and caused average annual direct economic losses of about 160.4 billion yuan.

Source: Map of China

To meet this threat, China developed the Xin'anjiang model. Built on long-term practical experience and in-depth study of hydrological processes, it divides a basin into multiple sub-basin units and accounts for the influence of topography, soil, vegetation, and other factors on hydrological processes to produce accurate hydrological predictions. It is widely used in flood prevention and disaster reduction.

People have never stopped searching for more effective flood prevention measures. Floods cannot be eliminated altogether, but advanced forecasting systems that predict disasters early enough for action to be taken can greatly reduce their impact on society. Today, AI-based flood forecasting systems are no longer limited to a single region, and in the future they may cover the whole world, protecting more people from flooding.

Resources:

1. http://bj.people.com.cn/no2/2023/0809/c14540-40525241.html

2. https://www.sohu.com/a/766008856_473283

3. https://www.sohu.com/a/745381603_121687414

4. https://european-flood.emergency.copernicus.eu/en/european-flood-awareness-system-efas

5. https://developer.baidu.com/article/details/3096974

6. https://blog.research.google/2024/03/using-ai-to-expand-global-access-to.html

7. https://m.gmi.com/article/6809946.html
