
The FVCOM model using the OpenACC framework achieves a 100-fold computational speedup

Author: NVIDIA China

The team of Professor Ge Jianzhong at the State Key Laboratory of Estuarine and Coastal Studies, East China Normal University, is a core member of the development team of FVCOM, an internationally advanced ocean numerical model. As FVCOM has been developed and applied more widely, and as the industry has demanded ever higher forecast accuracy and timeliness, the demand for computing power has increased dramatically. Accelerating FVCOM with the OpenACC framework speeds up computation by a factor of about 100. It also provides key tools and methods for generating the basic data needed to move ocean model systems toward artificial intelligence models and to develop AI oceanography, an important cornerstone for the further application of AI in the marine field.

Computational load of ocean forecasting models increases sharply

As natural disasters become more frequent, the numerical prediction systems that provide technical support for disaster defense face growing demands for accuracy, timeliness, efficiency, and stability. The development and application of ensemble forecasting in particular has sharply increased the computational volume of numerical models: in ensemble forecasting, the amount of computation is proportional to the number of ensemble members, and is dozens of times that of a single model run. This heavy load puts great pressure on operational forecasting units and supercomputing centers. Because forecasting systems must be highly timely, the amount of computation has to be kept as low as possible. At the same time, estuarine ecological and biogeochemical process models involve many variables and complex processes, and their computational cost is generally more than 10 times that of dynamical models. Models of tidal flat wetland vegetation patches and tidal creek systems, offshore engineering projects, and offshore wind farms generally require a spatial resolution finer than 5 meters, which also significantly increases computational cost.

Faced with this rapid growth in computation, the laboratory's current computing architecture relies mainly on scaling out CPU-based multi-core compute nodes, that is, adding more cores and more nodes. This places higher demands on the construction, operation, and maintenance of high-performance clusters, and further raises the barrier to applying and extending numerical models.

Accelerating the FVCOM model with the OpenACC framework

To address the rapidly growing computing load of numerical models, the team of Professor Ge Jianzhong at the State Key Laboratory of Estuarine and Coastal Studies, East China Normal University, surveyed the major current GPU-accelerated computing technologies, including CUDA, OpenACC, stdpar, Kokkos, and OpenCL, and discussed and analyzed them in detail with the NVIDIA technical team. The team started the code migration at the beginning of the year and took part in the GPU Hackathon held by NVIDIA in August 2023, where it received professional technical support and resolved a number of key technical difficulties. It completed the migration, testing, and verification of the main code by the end of 2023.

To lower the barrier to using large-scale numerical models, the migration and testing of the model code were completed on a desktop computer equipped with NVIDIA GeForce RTX 40 series GPUs, and the CPU comparison runs were carried out on supercomputing center compute nodes deployed in early 2023. The migrated model also supports efficient transfer of offline flow-field and nesting files, and free switching between single-precision and double-precision computation. The migration changes none of the model's input, output, or control files, so the files used with the original FVCOM can be used as they are.

The speedup comparison used models with horizontal mesh sizes of 100,000, 350,000, 1,000,000, 1,500,000, and 2,000,000. Each was computed in single-precision mode on an RTX GPU, and the same model was run on a single thread of a CPU compute node. Relative to the single-threaded CPU speed, the FVCOM model with OpenACC achieved speedups of 88x, 181x, 194x, 195x, and 198x, respectively (Figure 1). In addition, a compiler option switches the same code base between CPU and GPU modes, and the CPU and GPU versions have been verified to produce consistent simulation results. For single-precision FVCOM, the computing power of one RTX GPU is equivalent to 3.5 64-core compute nodes of a supercomputing cluster if inter-node network communication is ignored, and to 5 nodes once network latency between nodes is taken into account.
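The reported speedups can be turned into rough runtime estimates. The following is a minimal sketch and not part of FVCOM: the speedup table and the node equivalences are taken from the text above, while the function names and the 48-hour single-thread runtime are hypothetical values chosen purely for illustration.

```python
# Reported single-precision speedups (GPU vs. one CPU thread), keyed by
# horizontal mesh size, as quoted in the article.
SINGLE_PRECISION_SPEEDUP = {
    100_000: 88,
    350_000: 181,
    1_000_000: 194,
    1_500_000: 195,
    2_000_000: 198,
}

CORES_PER_NODE = 64  # node size quoted in the article


def gpu_runtime_hours(cpu_single_thread_hours, mesh_size):
    """Estimate GPU wall time from a single-thread CPU wall time."""
    return cpu_single_thread_hours / SINGLE_PRECISION_SPEEDUP[mesh_size]


def equivalent_cores(nodes):
    """Convert a node count into a CPU core count."""
    return nodes * CORES_PER_NODE


# Hypothetical example: a run that would take 48 h on one CPU thread.
hours = gpu_runtime_hours(48.0, 2_000_000)
print(f"Estimated GPU runtime: {hours * 60:.1f} minutes")
print(f"Cores matched without network cost: {equivalent_cores(3.5):.0f}")
print(f"Cores matched with network latency: {equivalent_cores(5):.0f}")
```

Under these assumptions, the 3.5-node and 5-node equivalences correspond to 224 and 320 CPU cores, respectively.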


Figure 1: Results of a single-precision GPU-FVCOM acceleration experiment

The model also scales well within the NVIDIA accelerated computing framework. When the 100,000, 350,000, 1 million, and 1.5 million mesh models were switched to double-precision mode and run on a single NVIDIA Ampere Tensor Core GPU, they achieved speedups of 48x, 77x, 139x, and 135x, respectively (Figure 2), showing that the acceleration is also effective in double precision. Where multiple GPU compute nodes are available, MPI+OpenACC can be used for multi-GPU parallel computing.
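To put the two sets of results side by side, the sketch below tabulates the quoted single-precision and double-precision speedups for the four mesh sizes they share. The helper name is hypothetical. Note that the article reports the single-precision runs on an RTX GPU and the double-precision runs on an NVIDIA Ampere Tensor Core GPU, so the ratio mixes hardware and precision effects and should not be read as a pure precision penalty.

```python
# Speedups quoted in the article, keyed by horizontal mesh size.
SINGLE = {100_000: 88, 350_000: 181, 1_000_000: 194, 1_500_000: 195}
DOUBLE = {100_000: 48, 350_000: 77, 1_000_000: 139, 1_500_000: 135}


def double_to_single_ratio(mesh_size):
    """Double-precision speedup as a fraction of the single-precision one."""
    return DOUBLE[mesh_size] / SINGLE[mesh_size]


for size in sorted(SINGLE):
    ratio = double_to_single_ratio(size)
    print(f"{size:>9,} mesh: single {SINGLE[size]:>3}x, "
          f"double {DOUBLE[size]:>3}x, ratio {ratio:.2f}")
```

For these figures the ratio ranges from roughly 0.43 to 0.72.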


Figure 2: Results of a double-precision GPU-FVCOM acceleration experiment

More than 100 times faster computing for the benefit of ocean forecasting

At present, the FVCOM model is widely used in ocean forecasting, ocean engineering, and marine operations. In coastal forecasting services at home and abroad, for example, FVCOM has become the main model of choice for marine early warning and forecasting departments at all levels in the coastal provinces, municipalities, and districts of mainland China for their operational forecasting work. The trend in operational marine forecasting is toward ever higher forecast accuracy and timeliness, both of which require enormous computing power, and GPU acceleration of FVCOM is an effective way to meet this rapidly growing demand in practical applications.

GPU-accelerated forecasting models can cut forecast turnaround time from hours to minutes or even seconds. The large efficiency gain also makes it feasible for models to adopt higher mesh resolutions and thereby improve simulation accuracy.

Operational departments are also paying increasing attention to ensemble forecasting of events such as typhoon storm surges. Ensemble forecasting computes multiple possible future scenarios from perturbed initial conditions or forcing (for example, the typhoon's evolution) in order to account for forecast uncertainty. This places even greater demands on the model's computational speed, which GPU acceleration handles well.
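The idea of ensemble forecasting can be sketched in a few lines. The example below is only an illustration under stated assumptions: `toy_surge_model` is a hypothetical stand-in for a full model run and contains no FVCOM physics, and the perturbation sizes, base wind speed, and member count are made-up values. It shows why cost grows linearly with ensemble size: every member is a complete, independent run.

```python
import random
import statistics


def toy_surge_model(max_wind_ms, landfall_offset_km):
    """Hypothetical stand-in for one model run; returns a peak surge in metres.

    This is NOT FVCOM physics, only a placeholder so the loop can execute.
    """
    wind_term = 0.002 * max_wind_ms ** 2  # surge grows with wind speed squared
    distance_term = 1.0 / (1.0 + abs(landfall_offset_km) / 50.0)
    return wind_term * distance_term


def run_ensemble(members, base_wind_ms=45.0, seed=0):
    """Run `members` perturbed forecasts and summarise their spread."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(members):
        wind = base_wind_ms + rng.gauss(0.0, 3.0)  # perturbed intensity
        offset = rng.gauss(0.0, 30.0)              # perturbed track position
        peaks.append(toy_surge_model(wind, offset))
    return statistics.mean(peaks), statistics.stdev(peaks), len(peaks)


mean_peak, spread, runs = run_ensemble(members=50)
print(f"{runs} model runs, mean peak {mean_peak:.2f} m, spread {spread:.2f} m")
```

The spread across members is what expresses forecast uncertainty; in an operational setting each call to the stand-in function would be replaced by a full model simulation.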

In hydraulic engineering, the FVCOM model is also widely used for engineering feasibility analysis and evaluation. In the preliminary research stage of a project in particular, numerical models are needed to simulate and evaluate the effects of various construction schemes, and computing these many scenarios is a great challenge for traditional models. With GPU acceleration, the results that support the feasibility assessment arrive sooner, which makes projects more efficient and saves time.

In addition, the results of this project are instructive and worth promoting more widely. For example, the OpenACC approach can also be applied to other coastal and ocean numerical model systems. For models with structured meshes (e.g., ROMS, ECOM, and POM), it may achieve even better speedups. This work also shows that GPU acceleration can greatly lower the hardware barrier to numerical simulation in estuarine, coastal, and marine research and engineering, which is of great help to both the development of the discipline and operational applications.

Ocean numerical modeling is currently undergoing the biggest transformation in its history: a shift from traditional models based on dynamical mechanisms and equations to artificial intelligence models based on machine learning (deep learning) and other methods. AI models, however, depend heavily on data, and their training usually requires massive, reliable datasets. Measured data in ocean systems are always scarce relative to the enormous spatial scale of the ocean and the time span of the problem of interest. Numerical models can supply large amounts of basic training data for AI models and are one of the most effective ways to ensure the coverage and quality of that data. For example, Professor Ge Jianzhong's team used the GPU-accelerated FVCOM model system to compute the ocean flow field and ecosystem dynamics of China's coastal waters from 1960 to 2023, using the 3D high-resolution model to generate an assimilated data product of more than 20 TB. By training the AFNO-based FourCastNet model developed by NVIDIA on this dataset, they were able to rapidly derive and analyze estuarine and coastal dynamics. They also used the GPU-accelerated FVCOM model to compute more than 1,000 typhoon storm surge samples efficiently, in order to train a deep-learning storm surge forecasting model. Building these two datasets with the traditional, unaccelerated numerical model would have taken more than 100 times as long.

In summary, FVCOM with the OpenACC framework accelerates a traditional dynamical numerical model by more than 100 times. This efficiency gain directly benefits specific application fields such as ocean forecasting and hydraulic engineering. It also provides key tools and methods for generating the basic data needed to move ocean model systems toward artificial intelligence models and to develop AI oceanography, an important cornerstone for the further application of AI in the marine field.

Meet the team

The team of Professor Ge Jianzhong at the State Key Laboratory of Estuarine and Coastal Studies, East China Normal University, has long been committed to the development and application of ocean numerical models, and is a core member of the development team of FVCOM, an internationally advanced ocean numerical model. The team has also established a multi-scale coupled physical-biogeochemical numerical simulation system for the China Sea and the Yangtze River Estuary.

Building on the FVCOM framework, Professor Ge Jianzhong's team focuses mainly on high sediment concentrations, coupled physical-biogeochemical processes, and typhoon storm surges, and conducts applied research on typical estuarine and coastal areas of China, such as the Yangtze River estuary, the Yellow Sea, the coasts of Zhejiang and Fujian, the Pearl River estuary, and the Beibu Gulf. The team has also carried out cooperative and applied research in Germany's Elbe Estuary and Port of Hamburg, Da Nang in Vietnam, and other regions. Its results have provided technical support to the national ocean and water conservancy departments for the prevention and control of green algae blooms in the Yellow Sea, storm surge forecasting, and the prevention of saltwater intrusion.

*Note: The figures in this article are from the team of Professor Ge Jianzhong, State Key Laboratory of Estuarine and Coastal Studies, East China Normal University.
