Introduction: Improving performance starts with small habits. Here is the right approach to code management and commits.

This is the right posture for code management and Commit! | R&D efficiency increased by 36

Columnist | Yachun

Volunteer Editor | Zhang Sheng

Software delivery is a code-centric process, and code plays several roles in it: first, it must clearly describe what the final artifact to be delivered is; second, it defines how the system and software work; third, it defines how the system is operated. Everything revolves around code.

So how should we approach code management and software configuration management?

Let's start with an example. The following diagram shows how one team organizes its code. What is wrong with this structure?

Issue 1: Code groups are named in a confusing way

The topmost directory is called risk-managenment: it is a system, the risk management system. But one subdirectory is called "qinglong", and it is unclear whether "qinglong" is an application or a team. Below it there is a Xuanwu, and then an ATeam. Chinese and English names are mixed together, which makes the naming very confusing.

Issue 2: External binaries are stored in the code base

The android-sdks directory stores many SDK files, and these files are very large. Storing such large external binaries directly in the code base makes the whole code base consume a great deal of resources.

Issue 3: Code that belongs together is saved in different code groups

There is a data-model in the aTeam directory, but the other related repositories (data-console, data-task, and data-ui) are under Xuanwu. We may not know exactly what they are, but we can tell that they belong to the same application or product, so it is unreasonable to keep them in two different groups.

Issue 4: Public libraries are saved in subcode groups

Next is common-lib. The name suggests a public library, but where it sits, it looks as if only the Xuanwu subcode group uses it.

Issue 5: The documentation (or tests) of the application is stored separately from the application

Finally, there is a docs directory containing risk-docs and data-docs, one for the risk control system and one for the data system. The documentation (and likewise the tests) lives in its own code base, separate from the application it describes, which is also unreasonable.

How should a good code base be organized?

Question: Assuming that all code is kept in one code base and accessible to everyone, how should the code base be organized?

We think code bases can be grouped: code groups (+ subcode groups) + code bases = one large library.

Based on this logic, let's look at what a reasonable code group structure for the example should look like.

As shown in the image above, the entire code base is one system with two applications: risk and data. Each application contains a number of services plus its documentation. Both depend on a shared module called common-lib. So we put the Git repositories that belong to the same application together and move the shared code to where it belongs. Grouping by application rather than by team makes the structure clearer. Here are some practical suggestions we have summarized.

Contents of the code base:

- The source code of the software (production code);

- Place the documentation (and test) Git repositories under the relevant application group;

- Do not save artifacts (such as system binary packages) in the code base; if you really need them, store them in LFS or similar;

(Editor's recommendation: Cloud code management Codeup provides free unlimited LFS storage for enterprises)
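
To find files that should move out of the repository, a short script can scan a working tree. The following is a minimal Python sketch, not a Codeup feature: the function name `find_lfs_candidates`, the 5 MB threshold, and the extension list are all assumptions chosen for illustration.

```python
import os

# Both values are assumptions for this sketch; tune them for your team.
SIZE_LIMIT_BYTES = 5 * 1024 * 1024
BINARY_EXTENSIONS = {".jar", ".zip", ".so", ".apk", ".exe", ".tar", ".gz"}


def find_lfs_candidates(root, size_limit=SIZE_LIMIT_BYTES,
                        extensions=BINARY_EXTENSIONS):
    """Return paths under root that look like artifacts or large binaries."""
    candidates = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            if ext in extensions or os.path.getsize(path) > size_limit:
                candidates.append(os.path.relpath(path, root))
    return sorted(candidates)
```

Anything the function reports is a candidate for LFS or an artifact repository rather than proof of a problem; a team would still review the list by hand.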

Organizational structure of the code base:

- Organize the code base according to the hierarchy of systems, applications and modules;

- Everything at the same system/application level is under the same code group;

Visibility of the code base:

- Place a shared code base at the level where everything that shares it can access it;

- Except for a few code bases such as core algorithms, it is recommended that access to the code base be made public to all relevant personnel under the same system/application;
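
The organization and visibility suggestions above can be sketched in Python. This is only an illustration, not the team's actual tooling: the repository names come from the example, while the prefix rule (`risk-`, `data-`) and the function name `group_by_application` are assumptions made for this sketch.

```python
from collections import defaultdict

# Repository names taken from the example; the prefix-based rule is an
# assumption made for illustration only.
REPOS = ["data-model", "data-console", "data-task", "data-ui",
         "data-docs", "risk-docs", "common-lib"]
APPLICATIONS = ("risk", "data")


def group_by_application(repos, applications=APPLICATIONS):
    """Group repositories by application; shared ones stay at system level."""
    groups = defaultdict(list)
    for repo in repos:
        prefix = repo.split("-", 1)[0]
        # Repos that belong to no single application (e.g. common-lib)
        # are placed at the system level so every application can reach them.
        key = prefix if prefix in applications else "(system level)"
        groups[key].append(repo)
    return dict(groups)


if __name__ == "__main__":
    for group, members in group_by_application(REPOS).items():
        print(group, "->", ", ".join(members))
```

The point of the sketch is the grouping key: the application, not the team, decides where a repository lives, and documentation repositories land next to the services they describe.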

Once the code is organized, developers can collaborate around the code base. All of that collaboration happens through commits: whether you rebase or merge, the result is a commit.

So what should we pay attention to when committing?

What makes a good commit

We have summarized 3 suggestions for you:

1. Small

Git repositories should be as small as possible. With today's infrastructure you can put multiple applications in one repository, but the maintenance cost will be very high. On the management side, don't store build artifacts and other binaries in Git. Committing build artifacts to the repository may be convenient for others, but it becomes hard to tell whether an artifact was built from the current code or from an earlier version, so it is hard to trace. If binary files are really necessary (such as game footage), we recommend saving them with LFS.

2. Linear

First, avoid meaningless merges and use rebase where possible. Second, avoid invalid commits. Many code bases have very long commit histories in which 80% of the commits are invalid, such as "fix1" and "fix2", which say nothing about what they actually do. This is clearly unreasonable. For such a lengthy commit list, you can sometimes squash the commits when merging.

3. Atomic

Atomicity means that each operation is atomic. What are the benefits? One commit solves one specific problem: fixing a unit test (UT) case, adding a UT, adding a function, or adding an API. Each such well-defined problem corresponds to one commit, which is easy to trace. The problem being solved can't be too large: writing 2000 lines of code for one feature and committing them all at once is very dangerous. A developer works best by producing quick, incremental results, getting continuous feedback, and moving steadily toward the goal. Otherwise the experience is poor for the developer and for collaborators alike, because others don't know how much you have done and are likely to run into merge conflicts with you.

Here are some commit anti-patterns:

1. Invalid commit

For example: Merge branch 'develop' of https://codeup.aliyun.com/abc/xyz into develop. This first problem shows up in almost every company: someone casually pulls code while both the local and the remote branch have new commits. Something a rebase should have handled instead produces many invalid commits, and can seriously hurt commit traceability.

2. Giant commit

A commit contains a large number of code changes that serve several different purposes. In code review, for example, someone opens a merge request with more than 3000 lines of code all at once. As a reviewer, you have no idea what it does, which is very dangerous.

3. Semi-finished products

For example, a commit that contains code with basic syntax problems or implementation errors. It's time for dinner, so never mind, just commit whatever is there. Such code can't even be compiled, which is obviously bad and makes no sense.

4. Mutual merge between branches

The last one is mutual merging between branches: from develop to master, and from master back to develop. Once there are many such merges, commits become hard to trace because their source is unknown. We suggest that the code base have a single trunk, that merges go in one direction only, toward the trunk, and that reverse merges be avoided wherever possible.

(Editor's recommendation: The trunk-based development mode of Cloud Code Management Codeup advocates lightweight review and trunk-based development to help enterprises avoid complex merges between branches.)
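
Some of these anti-patterns can be caught mechanically. The sketch below is a hypothetical Python check, not part of Git or Codeup: the `Commit` record, the regular expressions, and the 2000-line threshold are assumptions made for illustration. It covers only invalid and giant commits; semi-finished commits and mutual merges need a build and the branch history to detect.

```python
import re
from dataclasses import dataclass

# Thresholds and patterns are assumptions for this sketch, not fixed rules.
MAX_CHANGED_LINES = 2000
AUTO_MERGE = re.compile(r"^Merge (remote-tracking )?branch '.+'")
VAGUE_SUBJECT = re.compile(r"^(fix|update|wip|tmp)\s*\d*$", re.IGNORECASE)


@dataclass
class Commit:
    subject: str        # first line of the commit message
    changed_lines: int  # added plus deleted lines


def lint_commit(commit):
    """Return the list of anti-patterns a commit appears to match."""
    problems = []
    if AUTO_MERGE.match(commit.subject):
        problems.append("invalid commit: auto-generated merge")
    if VAGUE_SUBJECT.match(commit.subject.strip()):
        problems.append("invalid commit: subject does not say what changed")
    if commit.changed_lines > MAX_CHANGED_LINES:
        problems.append("giant commit: split it into atomic commits")
    return problems
```

A check like this could run before a merge request is accepted, so that subjects such as "fix2" are reworded or squashed before they reach the trunk.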

Software configuration management

Question: Software configuration is often modified and released. Is it code?

Software configuration is really just code in a different form. In practice the configuration may not live in a Git repository; it could be in a configuration center or some similar system. Wherever it lives, we can essentially treat configuration as a kind of code.

The following figure shows the common division into static and dynamic configuration, or startup-related and run-related configuration.

Startup-related configuration

Startup-related configuration is built into the image or passed in as startup parameters.
After startup it is no longer modified, and there is no need to listen for changes dynamically.
Modifying this type of configuration generally requires the container to be re-created or restarted.

So which configurations are startup-related? Examples include the DB connection string, the container CPU specification, and the startup mode (for example, some stress-testing applications distinguish between master mode and worker mode at startup). We also consider configurations such as DNS service addresses to be startup-related.
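
As a rough illustration of "read once, never modified", the following Python sketch loads startup-related configuration from environment variables into an immutable object. The variable names `DB_URL` and `APP_MODE` and the class `StartupConfig` are hypothetical, chosen only for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: values cannot change after startup
class StartupConfig:
    db_url: str
    mode: str  # "master" or "worker"


def load_startup_config(env=os.environ):
    """Read startup-related configuration once, when the process starts."""
    mode = env.get("APP_MODE", "worker")
    if mode not in ("master", "worker"):
        raise ValueError(f"unknown startup mode: {mode}")
    # DB_URL is required; a missing value should fail fast at startup.
    return StartupConfig(db_url=env["DB_URL"], mode=mode)
```

Because the object is frozen, changing the connection string or the mode means changing the environment and restarting the container, which matches the behavior described above.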

Run-related configuration

Typically obtained and updated by listening to a service or a file. For example, to find out what the current whitelist is, the application reads the whitelist.
Updating the configuration does not require modifying containers or pods.
Running containers need to listen for configuration changes continuously; when a change occurs, it takes effect automatically without a restart.

Let's illustrate with a few scenarios:

Log level: during a major sales promotion, adjust the log level to record only ERROR-level logs.
Service blacklists and whitelists: blacklist certain IP addresses to restrict their access.
Feature switches: turn a feature on or off.
Monitoring sampling frequency: change from sampling once per minute to sampling once every 5 minutes.

These configurations do not need to be, and should not be, redeployed every time they are modified. They are all run-related configurations.
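
One simple way to pick up such changes without a restart is to poll a configuration file and reload it when it changes. The Python sketch below assumes the run-related configuration lives in a JSON file; the class `RuntimeConfig` and the `log_level` key in the usage note are hypothetical, and a real system might listen to a configuration center instead.

```python
import json
import os


class RuntimeConfig:
    """Reloads a JSON config file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self.values = {}

    def refresh(self):
        """Reload if the file changed; return True when values were updated."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._mtime:
            return False
        with open(self.path, encoding="utf-8") as f:
            self.values = json.load(f)
        self._mtime = mtime
        return True
```

A running service could call `refresh()` on a timer and, whenever it returns True, apply `values["log_level"]` to its logger, so switching to ERROR-level logging needs no redeployment.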

Let's look at a demo to see which configurations are startup-related and which are run-related.

This is a parameter that is required at startup.

We inject the secret file into the Deployment, and the app automatically picks up the value of the secret from the file without restarting, so it is run-related configuration.

The closer to the inner layer a configuration sits, the higher its modification cost.

Looking at configuration from another perspective, it lives in different layers: code, image, pod, and system. Configuration in the code is the innermost layer and has the highest modification cost: a code-level change has to go through every stage before it can go live. A change made at the runtime stage does not need to touch any of the earlier stages.

Finally, a question for everyone: which kind of configuration does the runtime environment belong to? Feel free to leave a comment.

The end goal of software delivery is a stable and predictable system. To achieve this, two things must be ensured: 1. the consistency of the operating environment; 2. the consistency of the software artifacts. In the next article we will share how to ensure the consistency of the operating environment, along with common environment pain points and countermeasures. Stay tuned!

This article is the original content of Alibaba Cloud and may not be reproduced without permission.
