
Huang Dongxu: Reflections on the value of basic software products

In recent years, many domestic projects have emerged in China in the field of basic software. How should we understand the product value of basic software? In this article, PingCAP founder and CTO Huang Dongxu shares his understanding from the perspective of a TiDB designer.

Over the past few years I have occasionally written articles interpreting TiDB's product features, and with TiDB 5.0 out for quite a while now, it is time to write another one. I have said on many occasions how much weight I place on 5.0: this release may be to TiDB what MySQL 5.x was to MySQL. Friends familiar with the MySQL ecosystem will know what I mean; the important MySQL 5.x releases, especially 5.5 through 5.7, laid a solid foundation for MySQL's rapid expansion and cultivated a large number of MySQL users and talent.

For me, TiDB is an excellent sample. Before it, there were few open source basic software products in China built from zero to one; most engineers and product managers were "users" of such software, mostly building business systems on top of it. TiDB let us participate for the first time from the perspective of a "designer": the thinking behind each feature, and the way a basic software product presents its value, feel very different from the user side, so I am writing down some reflections here. This article is compiled from a presales and PM training I gave inside PingCAP before the Spring Festival; it is not necessarily correct and is for reference only.

1

What does what we do mean to users?

To talk about the product value of basic software, the first hurdle is learning to empathize. Each TiDB release ships with dozens of features and fixes, but most of the time our release notes just faithfully record "what we did":


Screenshot of the release notes for TiDB 4.0 GA

Please don't misunderstand me: this type of record is absolutely necessary, but it alone is far from enough. For example, across TiDB 5.0 to 5.5 we introduced a great many new features: clustered indexes, the async-commit transaction model, an improved SQL optimizer, CTE support, lock views and continuous performance diagnostic tools, a better hotspot scheduler, lower latency for acquiring TSOs, Placement Rules in SQL... These names read fine to TiDB's developers, but the more important question is: what do they mean to users (customers)?

There are two ways to answer this question, and I will talk about them separately:

Show value through a hypothetical target scenario, and then satisfy this scenario through the product.

Solve the most annoying problems of existing solutions (including their older versions) to show value.

The first approach usually applies to relatively new features, especially things that have never been done before. A more relatable example: if you invent the car while everyone is still driving carriages, pitching the car as the solution to the problem of feeding horses is obviously absurd; it makes far more sense to sell the convenience of fast commuting. There are two key points to this approach:

First, the scenario is originally imagined by the product manager (backed, of course, by plenty of interviews and fieldwork), so how do you ensure the scenario is both "high value" and "universal"? This is especially important for successful basic software: catching such a point early in the project is equivalent to half the battle won. It places very high demands on the product manager, who usually needs strong vision and driving force, which is why the CEOs of many product companies were their own early chief product managers; an early-stage CEO needs both. At the extreme there is something like Jobs's reality distortion field, boldly and visionarily conjuring the iPod/iPhone out of nothing and changing the whole world (I believe Jobs could already picture today's world when he conceived the iPhone). There is no formula for this; it basically comes down to the person.

Second, whether the value of your product is reflected in that scenario in the most direct way. The most direct appeal usually speaks straight to the heart, a "feeling" people can experience for themselves. For developer products, the anchor I usually choose is "user experience", because a good experience is self-evident: the car versus the carriage is a complete win on commuting comfort and efficiency, just as TiDB versus MySQL sharding is a complete win in the experience of elastic scalability. There are many methods for this, and those interested can refer to my earlier article on user experience.

The first line of thought is essentially storytelling, and the benefits of this approach are:

It is easy to verify: once you have thought the story through, a typical user journey naturally falls out of it, and walking through that journey yourself as a hypothetical user is the verification. This is also how I usually test the work of our own product managers.

Users are receptive, and the reason is simple: everyone likes to hear stories, everyone likes to watch demos, and everyone likes to copy a proven homework answer.

The second approach usually applies to improvements of existing features, and its key question is: how painful is the problem being solved? No software is perfect; heavy users will run into all kinds of problems, and these are usually hard for the developers of the feature to appreciate. What to do is simple: bend down, understand, use, and feel it yourself. I often chat with frontline colleagues on our customer delivery team, and before giving this training the conversation went roughly like this:

Me: Regarding our SQL optimizer, what is the biggest headache for you in daily work?

Colleague: Execution plans changing unexpectedly.

Me: Aren't hints enough for that? And didn't 3.0 introduce SQL Binding? Did that help?

Colleague: For some stubborn cases it is very hard to pin down a specific execution plan with hints (they then showed an example from a real business scenario, a hundred-plus lines of SQL with no obvious place to even start), and the problem with SQL Binding is: after I bind an execution plan, if a better plan appears later, do I have to start all over again?

Me: Didn't we introduce SQL Plan Management in 4.0? Isn't the auto-evolution feature there exactly meant to solve this?

Colleague: True, but we don't dare turn it on in production. For extremely important OLTP scenarios we cannot tolerate the risk of jitter from automatic changes to execution plans.

Me: What could our product do to make your life a little easier?

Colleague: 1. For complex SQL, let me pick the target execution plan and bind it directly, instead of constructing it with hints; 2. When SPM finds a better execution plan, just notify me in time and let me make the change myself, rather than deciding and changing automatically.

The two requests in that last answer were eye-opening for me. Both are very pertinent, the development cost is not large, and they would save DBAs a great deal of time and mental burden.
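
To make the workflow under discussion concrete, here is a minimal sketch of how plan binding looks today, including the hint-constructed binding that the colleague finds painful. The syntax follows TiDB's SQL Binding feature as I understand it; the table, index, and query are invented for illustration:

```sql
-- The optimizer keeps regressing to a bad plan for this query. Instead of
-- editing application SQL to add hints, the DBA binds a chosen plan to the
-- normalized statement; note that the USING clause still has to be hand-built
-- from hints, which is exactly the painful part for hundred-line SQL.
CREATE GLOBAL BINDING FOR
    SELECT * FROM orders WHERE user_id = 1 AND status = 'PENDING'
USING
    SELECT /*+ USE_INDEX(orders, idx_user_status) */ *
    FROM orders WHERE user_id = 1 AND status = 'PENDING';

-- Review existing bindings, and drop one once a better plan is adopted.
SHOW GLOBAL BINDINGS;
DROP GLOBAL BINDING FOR
    SELECT * FROM orders WHERE user_id = 1 AND status = 'PENDING';
```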

There are many similar examples, but the point is: find the heavy users of the product and dig deep into their biggest headaches; sometimes you get unexpected gains (for example, going on site during an oncall to watch how people actually operate). Solving this kind of problem usually brings an immediately perceptible improvement, and several of the observability improvements in recent TiDB versions came from exactly this kind of observation.

However, the second approach only shows its value in front of the right audience. For example, the problems we solve for application developers (database users) are usually not the same as those we solve for operators (DBAs); pitched at the wrong audience, the conversation turns into chickens talking with ducks, with the two sides simply talking past each other.

2

When a user says "I want this feature," what are they actually saying?

Good product managers and solution engineers for basic software are hard to find in China, and I think there is a historical reason. As I mentioned above, for a long time we usually looked at software from the perspective of a "user", which means the path from problem to solution is usually obvious. For example, suppose I need a high-performance User Profile system with sub-millisecond read/write latency, a modest amount of data, and tolerance for data loss; then I will just use Redis. But as a product manager at Redis, you can hardly design Redis's features around one such specific user-profile scenario.


Good basic software product managers tend to pick generic capability points, covering the widest possibilities with the smallest possible set of functions (this kind of flexibility is worth encouraging; think of UNIX). This in turn places higher demands on the vendor's presales and solution engineers: many "features" the business needs have to be assembled from several "technical points", or delivered by steering the customer toward the right problem. Let me illustrate with a few examples.

As a first example, users often ask: does TiDB support multi-tenancy? My answer is never simply "yes" or "no"; instead I dig into what the user really wants to solve, and what the subtext is. For multi-tenancy, it usually boils down to one of the following situations:

Subtext 1: "Deploying a separate TiDB cluster for every business line is too expensive." Value point: cost savings.

Subtext 2: "I really do have many businesses using TiDB; machine cost is not the problem, but configuration management is too troublesome, every cluster has to be upgraded one by one, and monitoring cannot be reused." Value point: lower operations and maintenance complexity.

Subtext 3: "Some of my workloads are particularly important and some are not, and they need to be treated differently: the unimportant ones can share resources, but the important ones must be isolated." Value point: resource isolation plus unified control.

Subtext 4: "I have regulatory requirements, such as per-tenant encryption and auditing." Value point: compliance.

Once the situation is clear, let me take one of these, cost savings, as the example to walk through. The next step is to look at what ingredients we have on hand.

For TiDB 5.x, the technical points relevant to the needs above are roughly the following:

Placement Rules in SQL

TiDB Operator on K8s

XX (a new PingCAP product, not yet released, stay tuned; roughly a visual multi-cluster management platform)

TiDB Managed Cloud Service

For the cost-saving demand, the usual cause is a large proportion of cold data. We observe that most large clusters follow the 80/20 rule: 20% of the data carries 80% of the traffic. For financial businesses in particular, data often can never be deleted, which means users keep paying for the storage of cold data. Deploying everything on a single uniform hardware spec is simply not cost-effective, so from the user's point of view this is a very reasonable appeal.

The next thing to think about: there is nothing new under the sun, so how do users solve this problem today?

In hot/cold separation scenarios, the pattern I have seen most often is: cold data goes into HBase or some other low-cost store (such as sharded MySQL on spinning disks), hot data stays in the OLTP database, and data is periodically moved into the cold cluster by hand, keyed on a time index (or partition). For the application layer this means knowing which store to query for which data, effectively integrating two data sources, and such an architecture usually struggles with sudden read/write hotspots on cold data (especially for consumer-facing services, where occasional bursts of digging up old data do happen).

The next question is: what difference does our product make for the user in solving this problem? If users still have to move data by hand, or run two TiDB clusters with different configurations, the difference is not big. In this scenario, if TiDB can support heterogeneous clusters, automatically pin hot and cold data to machines with specific hardware specs, and automatically promote cold data back to hot, the experience is optimal: a single database means the lowest cost of business change and maintenance. TiDB 5.4 released a feature called Placement Rules in SQL, which lets users declare data placement policies in SQL, and therefore naturally also the placement of hot and cold data. Going further, more complex placement requirements for multi-tenancy, such as putting different tenants' data on different physical machines while managing everything through a single TiDB cluster, can also be achieved with Placement Rules in SQL.
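
As a minimal sketch of what this looks like in practice (placement syntax as I understand the 5.4 feature; the policy names, store labels, table, and partition are all invented, and the table is assumed to be partitioned by year):

```sql
-- Assume TiKV stores are labeled disk=ssd (hot tier) and disk=hdd (cold tier).
CREATE PLACEMENT POLICY hot_tier  CONSTRAINTS = "[+disk=ssd]";
CREATE PLACEMENT POLICY cold_tier CONSTRAINTS = "[+disk=hdd]";

-- Recent data stays on the SSD machines by default...
ALTER TABLE orders PLACEMENT POLICY = hot_tier;
-- ...while an old partition is declared cold and migrates to the HDD machines.
ALTER TABLE orders PARTITION p2018 PLACEMENT POLICY = cold_tier;
```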

3

Meta Feature: A treasure trove for solution architects

Speaking of which, I would like to expand on a concept: some features differ from the rest in that they can serve as building blocks for other functions and be combined into new capabilities. I call this kind of feature a Meta Feature, and the Placement Rule mentioned above is a very typical one. For example: Placement Rule + Follower Read combine into the traditional one-writer-many-readers setup (but more flexible and fine-grained, well suited to ad-hoc or exploratory queries that need fresh data without affecting the online business); Placement Rule + a user-defined permission system = multi-tenancy with physical isolation; Placement Rule + local transactions + cross-datacenter ownership = geo-distributed multi-active deployment (WIP); Placement Rule can also express carefully crafted data placement strategies that let TiDB avoid distributed transactions (simulating sharded tables) and improve OLTP performance.
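
For instance, here is a minimal sketch of the "one writer, many readers" combination, with invented policy, zone label, and table names; the placement options and the follower-read session variable are written as I recall them from the documentation, so treat the exact syntax as an assumption:

```sql
-- Keep a follower replica of the reporting table in a dedicated zone
-- (assumes some TiKV stores carry the label zone=report).
CREATE PLACEMENT POLICY report_readers FOLLOWER_CONSTRAINTS = "{+zone=report: 1}";
ALTER TABLE sales PLACEMENT POLICY = report_readers;

-- An analyst session reads from follower replicas: fresh, strongly consistent
-- data, with the scan load kept away from the leaders serving OLTP traffic.
SET SESSION tidb_replica_read = 'follower';
SELECT region, SUM(amount) FROM sales GROUP BY region;
```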


Meta Features are usually not exposed directly to end users, because they are too flexible and carry a learning cost and barrier to entry (unless wrapped in carefully designed UX). But this kind of capability is especially important for architects, solution providers, and ecosystem partners: the more Meta Features a system has, the more "playable" it is and the more differentiated solutions can be built on it. Here we often fall into a trap: does flexibility equal product value? I don't think so. Engineers (especially geeks) have a natural affinity for this kind of openness, but I am skeptical that such a story sells well to end users; just compare the end-user market share of Windows and UNIX. I recently heard a great illustration of this and would like to share it: you cannot convince an Americano lover that latte is better just because the milk content is flexibly adjustable, and with the milk dialed down to zero the latte "includes" the Americano.

Let's look at another scenario: batch processing. Friends familiar with TiDB's history know that the original motivation of the project was to replace MySQL sharding, but over time many users discovered: my data is already in TiDB anyway, why not run the computation directly on it? Others had complex SQL-based data transformation jobs hitting the ceiling of single-machine compute, and because of business requirements these computations still needed strong consistency and even ACID transaction support; a typical case is a bank's clearing and settlement business.

When I was younger I did not quite get it: why not just run this kind of batch workload on Hadoop? After understanding the situation I realized I had been naive. For banks, many traditional settlement jobs run directly on the core database, and the logic is not simple: a single job with hundreds of lines of SQL is commonplace, the developer of that job may have long since left, and nobody dares casually rewrite it into MapReduce jobs. Moreover, the batch results often have to be backfilled into the database, the whole pipeline must finish within a few hours, and missing the window is a production incident. When data volumes were modest, running on Oracle or DB2 minicomputers was fine, but in recent years, with the rise of mobile payments and e-commerce, data volumes have grown larger and faster, and scale-up was bound to become a bottleneck sooner or later. TiDB hits exactly two high-value points here:

SQL compatibility (especially after CTE support in 5.0 and the temporary table feature in 5.3, the compatibility and performance of complex SQL improved greatly), together with financial-grade consistent transactions; see the sketch after this list.

Scale-out compute (especially after 5.0 added TiFlash MPP mode, unlocking distributed computation on columnar storage); in theory, as long as there are enough machines, capacity can keep growing.
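
As promised in the first point, here is a small, purely illustrative sketch of the settlement-style SQL these features target; the tables and columns are invented, and the statements use the temporary table (5.3) and CTE (5.0) syntax as I understand it:

```sql
-- Stage today's raw records in a session-level temporary table;
-- it disappears automatically when the session ends.
CREATE TEMPORARY TABLE txn_today (account_id BIGINT, amount DECIMAL(20, 2));
INSERT INTO txn_today
    SELECT account_id, amount FROM transactions WHERE txn_date = CURDATE();

-- A CTE keeps the multi-step logic in one readable statement
-- instead of a pile of nested subqueries.
WITH per_account AS (
    SELECT account_id, SUM(amount) AS net
    FROM txn_today
    GROUP BY account_id
)
SELECT account_id, net, (net < 0) AS overdrawn
FROM per_account
ORDER BY net;
```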

For a bank's batch business, the headache of an architectural overhaul thus becomes the simple matter of buying more machines. But in TiDB's earlier designs there were still several pain points:

Bulk data import

Distributed computing

For the first pain point: a typical TiDB batch job goes roughly file delivery (the day's transactions are delivered as files) -> write those records into TiDB in batches -> compute (via SQL) -> backfill the results into TiDB tables. The delivered records may be a pile of text files (such as CSV), and the simplest way to load them is of course row-by-row SQL INSERTs, which is fine for small volumes but not cost-effective at scale; after all, most of these imports are offline. Although TiDB offers large transactions (up to 10 GB per transaction), from the user's point of view there are still several problems:

Batch writes are usually offline, and the user's core appeal in this scenario is speed; a full distributed-transaction protocol is unnecessary here.

Even with the 10 GB ceiling, it is hard for users to split their data precisely along it.

Writing a large transaction requires a correspondingly large memory buffer, which is often overlooked.

A better approach is physical import: generate the underlying storage engine's data files directly, distribute them to the storage nodes, and ingest the physical files, which is exactly what TiDB Lightning does. In a recent real user scenario, we observed Lightning using 3 machines to transcode and import roughly 30 TB of raw data in about 72 hours, an import throughput of roughly 380 GB/h. So in batch scenarios, if you can use Lightning's physical import mode, it is usually the faster and more stable option.

The other pain point is the compute bottleneck (which sounds ironic for a distributed database, I know). Before TiDB supported MPP, it only supported one layer of operator pushdown: results from the Coprocessors on TiKV could only be aggregated on a single TiDB compute node, and if the intermediate result exceeded that node's memory it would OOM. This is why TiDB used to bring in Spark (via TiSpark) for more complex distributed computation, especially joins between large tables; for complex batch jobs you still had to add a fleet of Spark machines to supplement TiDB's compute. After TiDB 5.0, however, TiFlash's MPP mode can spread the computation across multiple nodes, so compute is no longer the bottleneck. That likely means that in some TiDB batch scenarios, 5.0 lets you retire a fleet of Spark machines, which means a simpler technology stack and higher performance.
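
A minimal sketch of turning this on, with invented table names; the statements follow the TiFlash documentation as I recall it, so the exact variable names should be double-checked:

```sql
-- Give each table in the batch job a columnar TiFlash replica.
ALTER TABLE orders      SET TIFLASH REPLICA 1;
ALTER TABLE orders_hist SET TIFLASH REPLICA 1;

-- Allow MPP execution for this session and inspect the plan.
SET SESSION tidb_allow_mpp = ON;
EXPLAIN
SELECT o.region, SUM(o.amount)
FROM orders o
JOIN orders_hist h ON o.order_id = h.order_id
GROUP BY o.region;
-- Exchange operators (ExchangeSender/ExchangeReceiver) in the plan indicate
-- that the join and aggregation run distributed across TiFlash nodes.
```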

Going further, another reason Spark was introduced was the result-backfill stage: because of TiDB's transaction size limits, and to improve concurrent write throughput, we used Spark to perform distributed inserts into TiDB. This step can in theory also be replaced by Lightning: TiFlash MPP can write its results out as CSV, and Lightning supports importing data in CSV format.

So the pipeline theoretically becomes: Lightning imports data into TiDB -> TiDB computes with TiFlash MPP and outputs the results as CSV -> Lightning writes the CSV results back into TiDB. This is likely to be faster, more resource-efficient, and more stable than the TiSpark plus large-transaction scheme.

Thinking about this scheme a bit more carefully: the optimization above really just exploits Lightning's bulk-write capability, so in theory any import scenario with heavy write pressure can be improved along the same lines. Here is real feedback from a TiDB user: after moving their business system to TiDB, they have a recurring scenario of importing a huge table. They want to first pin the big empty table to specific idle hosts via Placement Rules and then import the data quickly with Lightning; that way, even without throttling, the impact on the rest of the cluster is reduced and the import is fast. Conversely, if TiDB lacked this scheduling capability, the customer could only keep the cluster stable by throttling, and the import would be slow. This example, online bulk writes via Placement Rule + Lightning, nicely echoes the earlier discussion of Meta Features.

The offline sharing also included an example contrasting database-and-table sharding with TiDB; I won't expand on it here for reasons of length, but interested readers can think it through along the same lines.

4

A less visible but larger and longer-term value: observability and troubleshooting capabilities

You may have noticed in the previous part that I have been trying to convey this message lately: for a basic software product, an important source of long-term competitiveness and product value is observability and troubleshooting capability. There is no perfect software; being able to quickly identify and locate problems is a must-have skill for experienced developers, and service-support efficiency and self-service are fundamental to scaling the commercialization of basic software. This is just as important in a cloud environment. Let me describe some of the new things we have been doing lately, and the challenges ahead.

TiDB Clinic (tiup diag)

Why build this? In the past, fault diagnosis was a painful process. Besides the veteran-driver experience I mentioned in my earlier article on observability, experience that lives only in the veterans' heads, I observed that most of the time was actually spent collecting information, especially for deployments in the user's own environment where the user is unfamiliar with diagnosing the system. When they asked our service support for help, the dialogue often went like this:

Service support: Please run this command xxx and then tell me the result

Customer: (sends the result two hours later)

Service support: Sorry, could you also look at the chart for a certain metric on your monitoring dashboard?

Customer: Here are the screenshots

Service support: Sorry, the time range is wrong... please adjust the Grafana query and do it again

Customer: !@#¥%...

Service support (a different person on duty a few days later): Please run this command xxx and then tell me the result

Customer: Didn't I already send that?

This asynchronous, inefficient way of diagnosing problems is a source of great pain, and one of the core reasons oncall cannot scale. The user's pain points are:

"Can't you grab all the information you need in one go? I don't know what to give you."

"The information is too big and too complicated, how can I give it to you?"

"My dashboard is in the intranet, you can't see, I can only take screenshots"

"I can't expose business information, but I can submit diagnostic information"

But conversely, the pain points for TiDB's service support staff are:

"The original guess was not quite right and needed some other metric to verify"

"Unable to fully reproduce the metrics and system state at the fault site, I want to operate freely Grafana"

"Context sharing between different service support staff for the same user"

Hence the Clinic product, which, with the user's consent, can:

Automatically collect the various metrics needed for system diagnosis with one tiup command

Automatically diagnose some common errors through a continuously improving rules engine

Store and replay diagnostic information per tenant on a SaaS-like platform

If you are familiar with AskTUG (the TiDB user forum), you may have seen links like this: https://clinic.pingcap.net/xxx (e.g. this case: https://asktug.com/t/topic/573261/13)

The user only needs to run one simple command in the cluster to generate such a link and share the key diagnostic information with PingCAP's professional support staff, who then see something like this in the backend:

(Screenshots: the collected diagnostic data as displayed in the Clinic backend)

TiDB Clinic is also a new attempt at basic software maintenance: turning diagnostic capability into SaaS, coupling fault diagnosis and repair suggestions, driven by a rules engine continuously strengthened in the cloud, with locally deployed and operated clusters. Such a capability will become a new reason for users to choose TiDB, and a strong ecological moat for it.

Profiling in TiDB Dashboard

I have a personal yardstick for whether a basic software product is good: anything that ships with a built-in profiler is basically a product with a conscience, and anything that also polishes the profiling UX is the conscience among the conscientious. Golang's pprof, for example, is a joy once you have used it. This is actually not hard to build, but at critical moments it can save your life, and it is usually precisely during an incident that you can no longer profile; if at that moment the system tells you it already saved a profile from the time of the failure, that kind of help in a time of need feels wonderful.


This feature actually came out of several oncall cases we handled, problems that metrics could not cover. There is a large class of failures caused by hardware bottlenecks, usually either CPU or disk. Disk bottlenecks are relatively easy to check: look for heavy IO (Update/Delete/Insert) or RocksDB compaction. CPU bottlenecks are much blurrier, and a profiler is almost the only way to answer two questions:

What do the call stacks on the CPU's critical path look like?

What do those function calls on the critical path imply? This second question is usually the key to locating the problem and points to an optimization direction; for example, if we find that SQL parsing and optimization consume a disproportionate amount of CPU, that implies a mechanism such as plan caching should be used to improve CPU utilization.

In 5.x, TiDB currently provides two kinds of profiling: manual profiling and continuous (automatically persisted) profiling. Their use cases differ: manual profiling is usually for targeted performance optimization, while continuous profiling is usually for looking back after a system problem has occurred.

5

The challenge

As this article winds down, let me talk about the challenges. PingCAP was founded in 2015 and is about to turn seven, and over those seven years the industry has gone through some important shifts:

Database technology has been moving from distributed systems to cloud native. Many people will object that these two terms are not on the same level, since cloud-native systems are themselves built as distributed systems, but I believe cloud native fundamentally changes how we think about system design; I have covered this in several other articles and won't dwell on it here.

Open source database companies have found a model for commercialization at scale: managed services on the cloud.

Basic software worldwide is moving from the "usable" stage to the "easy to use" stage.

These trends demand changes in thinking along two directions:

Technically, moving from depending on a single machine's operating system and hardware to depending on cloud services is an enormous challenge. For example: with EBS, is data replication still a must? With serverless, can we break free of the limits of fixed compute resources? The problem gets even more complex when layered onto an existing system with a large installed base. Cloud-native technology does not mean public cloud only, of course, but how do you design a path to transition gradually to a new architecture built on cloud-native technology? That can be a huge challenge for the R&D and product teams.

The second shift is the bigger challenge, because the business model itself is changing. For traditional open source database companies, the mainstream model is a people-intensive support-and-services business, and the more advanced version is an insurance-like business in the style of Oracle, but neither model answers two questions well:

a. What is the value difference between the commercial edition and the open source edition?

b. How do you scale? You cannot scale by adding people.

The SaaS model answers both questions well, and combining basic software with the SaaS model has an even greater amplifying effect, which I discussed in "The Cathedral Will Eventually Fall, but the Bazaar Will Live Forever". The real challenge is: how does the organization of a company built around traditional software sales plus service support transform into an operations-oriented online service company? From the R&D side alone, two small examples: 1. Release cadence: a traditional software company does well to ship one or two releases a year, while an online service may upgrade every week. Don't underestimate this difference in rhythm; it determines the shape of the entire R&D and quality assurance system. 2. Offering the service on the cloud requires a supporting operations system (billing, auditing, troubleshooting, and so on) and a corresponding SRE team, which a traditional software development organization may simply not have, and the focus on user experience and developer experience becomes especially important.

Of course, the challenges go beyond these, and there are no standard answers, but I remain full of confidence in the future. These trends are essentially accelerating the conversion of technology into social and commercial value and lowering the threshold for it, a positive and pragmatic change, which is of course good for a company like PingCAP. Ahead of us lies a sea of stars; everything depends on what we make of it, and the prospects are bright. I originally intended to write a short piece and did not expect it to run this long, so let me stop here.
