OceanBase 14 years: Difficult start, self-development, and integrated thinking

author:Tech Walker
Author|Jin Wang

2024 is a particularly critical year for OceanBase.

On the one hand, Ant Group officially announced this year that OceanBase would operate independently.

On the other hand, and more importantly, many leading enterprises have moved their core business onto the OceanBase distributed database, and some have even explicitly committed to going "All in OceanBase".

Yang Bing, CEO of OceanBase, said, "Distributed databases are becoming the standard architecture of modern databases, and at the same time, integrated databases are gradually maturing and moving toward commercial use."

Driven by these two mainstream trends, OceanBase will have more room to grow in 2024.

Of course, the company also has new ideas and goals.

A tough decade for databases

In November 2014, at the AWS re:Invent conference, Amazon officially announced Amazon Aurora, which raised the curtain on the era of self-developed databases.

At that time, China's commercial database market was still in its infancy. Oracle had entered the Chinese market in 1989 and had driven the informatization of large state-owned enterprises in sectors such as railways, finance, and telecom operators, but its high prices and operation and maintenance costs meant these enterprises faced huge expenses every year. It was against this background that self-developed databases began to emerge in China.

In 2010, 44-year-old Yang Zhenkun joined Alibaba and led the team that set Alibaba on the path of developing its own database.

That year, relational databases were still the mainstream, but the industry was already discussing whether NoSQL might replace them, and distributed databases were still a niche route that few were optimistic about.

Yang Bing recalled, "More than ten years ago, distributed database technology was still very immature. Even using middleware to shard databases and tables was a very complicated matter, so it was a very niche technical route."

However, this was not the biggest problem the team faced when Alibaba began developing its own database. The biggest problem at that time was actually a shortage of talent.

Although some IT engineers had already started working with databases, the high complexity of the technology and China's late start made it difficult to recruit excellent database talent.

This is one of the reasons why OceanBase holds an annual developer conference, continuously invests in the developer community, and even cooperates directly with universities to cultivate talent.

Of course, all of that came later. When Alibaba decided to develop its own database, the first question it faced was whether to take the open source route or the purely self-developed route.

Choosing the open source route would have been equivalent to standing on the shoulders of giants, without having to endure the lonely and painful journey from 0 to 1. The problem was that many of the issues domestic enterprises encountered in real application scenarios at the time could no longer be fundamentally solved with open source databases.

For example, as enterprises' demand for massive data volumes and high-speed writes grew, the LSM-Tree data structure was well suited to such requirements.

However, within a traditional database architecture, an LSM-Tree is not friendly to the most basic requirements, such as building indexes and querying data.
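The trade-off described above can be illustrated with a toy example. The sketch below is not OceanBase's storage engine; it is a minimal, assumed model of an LSM-Tree in which the class name TinyLSM, the memtable_limit parameter, and the (value, runs_probed) return shape are all hypothetical. Writes only touch an in-memory table that is periodically flushed as an immutable sorted run, which is why writes are cheap, while a read may have to probe several runs, which is why queries and indexes are harder to serve.

```python
import bisect


class TinyLSM:
    """Toy LSM-Tree: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # absorbs every write
        self.runs = []                        # (keys, values) pairs, newest last

    def put(self, key, value):
        # A write only updates the memtable; no existing run is modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted, immutable run.
        if self.memtable:
            items = sorted(self.memtable.items())
            keys = [k for k, _ in items]
            values = [v for _, v in items]
            self.runs.append((keys, values))
            self.memtable = {}

    def get(self, key):
        """Return (value, runs_probed); value is None if the key is absent."""
        if key in self.memtable:
            return self.memtable[key], 0
        probed = 0
        # Newer runs shadow older ones, so search from newest to oldest.
        for keys, values in reversed(self.runs):
            probed += 1
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i], probed
        return None, probed


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("a", 10)]:
    db.put(k, v)

print(db.get("a"))  # newest value, served from the memtable
print(db.get("b"))  # older key, found only after probing two runs
```

In this toy model the cost of a read grows with the number of runs, which real systems limit through compaction and auxiliary structures; those are outside the scope of this sketch.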

Choosing the purely self-developed route made it possible to break with traditional rules at the level of the underlying architecture and design a new architecture better suited to application needs, which naturally made it easier to reconcile the seemingly conflicting business requirements the database industry faced at the time.

But achieving such a technological breakthrough from scratch is a painful and arduous process.

In the end, the OceanBase team chose the "road of no return" of pure self-development.

OceanBase has adhered to this purely self-developed technical route, which seemed very niche at the time, for 10 years.

Yang Bing said, "At present, OceanBase has achieved 100% self-development."

"From the first line of code, we know how each line of code is implemented and how the NAS CPU is scheduled, so we have made many technical attempts and self-developed innovations across various proprietary cloud and public cloud scenarios."

In 2014, OceanBase, a native distributed database, replaced the original centralized database to support Alipay's core transaction system and began to carry 10% of the transaction traffic of "Double 11".

In 2016, OceanBase 1.0 was officially released, and all of Alipay's payment and transaction data chains ran on OceanBase during "Double 11", taking the lead within Alibaba in replacing the database behind a core business.

In 2021, OceanBase released its HTAP hybrid engine and officially opened its source code to the public. With more than 400 customers, it began to truly become a general-purpose, enterprise-grade distributed database.

“All in OceanBase”

When Yang Bing stood on the podium of the OceanBase Database City Tour in 2024, the distributed database was no longer the niche technical route of ten years earlier. It had truly become a mainstream route in the database field.

Statistics from IDC show that by 2022, the share of distributed transactional databases in China had risen to 16.2% of relational databases.

In addition, IDC predicts that through 2027, the compound annual growth rate of China's overall distributed transactional database market will reach 28.5%, and the growth rate on the public cloud will reach 32.8%.

This growth rate exceeds that of the public cloud itself.

Yang Bing also learned from exchanges with institutional analysts that, at the current growth rate, the proportion of domestic enterprises and scenarios using distributed databases is expected to exceed 50% by 2025.

As the most representative distributed database in China, OceanBase now has more than 1,000 customers. Leading enterprises such as China Mobile, Bank of Communications, and Li Auto, after testing and preparation, are moving their core business to the OceanBase distributed database.

The distributed transformation of Bank of Communications' credit card system is a key step in the bank's move to a fully distributed architecture, with OceanBase used at the underlying layer.

With OceanBase, Bank of Communications has greatly improved its data processing efficiency and system availability: financial TPS (transactions processed per second) has increased by 6 times, and batch processing efficiency has increased by more than 7 times.

According to the data released by Yang Bing at the conference, OceanBase has served 70% of China's top banks, 75% of top securities firms, 45% of top fund companies, 20% of provincial mobile operators, and 25% of provincial human resources and social security departments.

In this process, distributed databases have gradually become standard for modern databases, and more and more enterprises have begun to choose distributed databases and go "All in OceanBase".

The new trend of "integration"

On November 16, 2023, OceanBase 4.2.1 LTS was officially released at the OceanBase 2023 annual press conference.

OceanBase 4.2.1 LTS stands out as the first long-term support (LTS) version of OceanBase's all-in-one database.

What is a distributed all-in-one database?

In the traditional view, distributed architecture is the counterpart of centralized architecture. However, an enterprise tends to develop along a continuous path and may need either type of database at different stages, which means it is difficult for database vendors to completely separate the two types of products in engineering and product design.

"Distributed and centralized are not opposites in themselves." This is what Yang Bing and the OceanBase team have learned firsthand from database product development and engineering practice over the past few years.

This is the origin of OceanBase's design idea of integrating distributed and stand-alone databases.

Yan Nan, director of the DBA group in the IT department of vivo's System and Process Department, pointed out, "vivo's internal business systems have grown from a dozen or so database instances to thousands, including both business systems that use commercial databases and business systems that use open source databases."

In this process, vivo adopted OceanBase 4.2.1, OceanBase's integrated single-node and distributed product.

Based on this version, OceanBase was rolled out to 15 production business systems within vivo in half a year. After vivo replaced its original MySQL database and table sharding architecture with OceanBase, total resource usage fell by 80%, greatly reducing the vivo team's operation and maintenance costs.

Integration is a database technology trend that Yang Bing firmly believes in, and it has been a consistent direction throughout OceanBase's 14 years of R&D.

Over the past 14 years, the OceanBase team has achieved engineering integration, TP/AP integration, on-cloud and off-cloud integration, and single-node and distributed integration in its own database.

Not long ago, Ant Group officially announced that its subsidiaries Ant International, OceanBase, and Ant Digital had established boards of directors and begun to face the market independently.

At the same time, at its conference on March 20, OceanBase officially announced an upgrade to the Coral Plan released in 2022, increasing the proportion of proprietary cloud partner contracts to 70% and the proportion of independent partner delivery to 30%.

As a result, the large-scale commercial rollout of the OceanBase distributed database has accelerated.

OceanBase has been commercializing independently as a technical team since 2020. As it officially begins operating independently and accelerates its push into the market in 2024, a new cycle in the database industry is quietly beginning.