
The growth path of a big data testing novice

Author: JD Cloud developer

Introduction

I joined JD.com in 2022 and have been doing test development in the data middle office testing department ever since. Since graduation, the documents I have written most are test plans and test reports, with few opportunities to look back and write up my own growth. With the help of the "UP Technician" column, I finally took time after work to look back and put together a small summary of my past two years.

This article is a growth summary written by a big data testing novice after entering the workplace. It covers the confusion I felt as a new hire as well as the experience I accumulated bit by bit. I hope it can help newcomers who feel lost and students who are interested in big data testing.

1. Starting out in the workplace: standing at the crossroads of confusion

I majored in computer science and technology for both my undergraduate and master's degrees, and my graduate research direction was network embedding, which had almost nothing to do with big data testing. My leader only said to me: "It won't feel natural at first. You probably never touched these things in school. You'll get started little by little. It's okay, take your time."

When I first started the job, I honestly could not understand most of what was going on.

The field of big data is full of proper nouns and English abbreviations, such as "cluster", "queue", "RSS", "NN", "DN", and "NS". Faced with these unfamiliar concepts, I was genuinely a little panicked, so I chose the most "student-like" method: reading books.

Reading books is undeniably useful, but it is far too slow. In real work, many of the modifications to big data storage and compute engines are developed in-house against a specific practical background, so professional books alone cannot really carry our testing work.

Compared with professional books, team documentation and bold questioning are the real way into the workplace.

Team documentation not only records historical requirement and test records that help us understand the product background; with good use of the search function, it also lets us quickly look up unfamiliar terms and concepts and communicate more efficiently. In particular, the product, R&D, and testing teams have some self-named microservices and small tools; without searching the team docs, or simply asking, we might spend a lot of unnecessary effort figuring out what they are. So be proactive: ask more, communicate more, and you will integrate into the team's work faster.

Looking back now, I still miss the "rookie period". Every day, if you did not go find your mentor, your mentor would come ask you, "Any questions today?", and even the simplest questions would be answered patiently. In addition, the newcomer's monthly report and the department's 1v1 conversations are all good opportunities: with the identity of a "newborn calf", you can raise any doubts or suggestions directly with the leader. Although that period had its difficulties and challenges, it was precisely that bit-by-bit accumulation that set me on a career path in big data testing.

2. The road to advancement: fight monsters and level up step by step

2.1 Step 1: Submit a big data computing task

Compared with traditional software testing, the core of big data testing is to verify the accuracy and reliability of data analysis and processing, and to ensure that the big data system can process massive data efficiently and stably. Big data testing has a real entry threshold: it requires not only basic software testing skills but also familiarity with the big data platform itself. So the first step might be to submit a computing task on a big data platform.

That is easy to say, but there is a lot to prepare:

1. Pass the big data platform exam: the exam comes with training courses that serve as a guide to the platform and give us an overall, preliminary understanding of it

2. Apply for permissions: including data permissions and the account permissions needed to operate on that data

3. Create a new big data task: once the permission application is approved, you can use your account to read data, which by itself is already a simple big data task (see the sketch below)
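To make step 3 concrete, here is a minimal sketch of such a "simple big data task", assuming the platform exposes a Spark SQL entry point; the database, table, and column names are hypothetical placeholders:

```python
# A minimal "first big data task": read a table you now have permission
# for and run a trivial aggregation. All names below are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("first-big-data-task")  # this name appears in the platform's task list
    .enableHiveSupport()             # read tables managed by the platform
    .getOrCreate()
)

# A trivial read-and-aggregate query over a hypothetical table.
df = spark.sql("SELECT dt, COUNT(*) AS cnt FROM demo_db.demo_table GROUP BY dt")
df.show(10)

spark.stop()
```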

At this point, you have successfully submitted a task from the user's perspective. For the big data platform, and for you as a tester, there is still a lot of work behind the scenes.

2.2 Step 2: Light up the big data product map

From the process of submitting a big data task, it is not hard to see that the big data platform provides a wide variety of services: not only user-facing data permission management, account management, and the process center, but also the compute engine, scheduling engine, and storage involved once the task is submitted.

When following up on requirements in the early days, there were always endless questions:

"Can't find the computing environment for the task?"
"Why don't you have permission to read the meter?"
"Where can I see the metadata for the table?"
......

There is only one main service under test, but many related services are involved, and at first you may not even know where or how to look up the background knowledge you need. Although a given change is only one link in the long chain of data processing, sorting out the test scenarios is inseparable from familiarity with the whole chain. If you are not clear about the basic services and functional features of the big data platform, you cannot complete the quality assurance work.

Beyond what daily requirements accumulate for us, we also need to explore the big data platform proactively. For a big data test development engineer, exploring and lighting up our own big data product map is a must. The platform's products all revolve around data and the tasks that process data, so it is natural to think about the map from these two starting points.


Familiarity with the services of the big data platform is a basic requirement for big data testing; it lets us better help the product and R&D teams conduct risk assessments. In addition, the big data platform itself provides users with a large set of data management tools, which can also help us in our own work. Take metadata queries as an example: compared with writing a script to inspect a table, you can directly query the table's structure, access, storage, and other details on the platform.
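For contrast, here is a rough sketch of the "write a script yourself" route, assuming a Spark SQL entry point with Hive support; the table name is hypothetical, and the platform condenses all of this into a few clicks:

```python
# Pulling table metadata by hand through Spark SQL, instead of using the
# platform's metadata query page. Table name is a hypothetical placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("table-metadata").enableHiveSupport().getOrCreate()

# Structure and storage: columns, partition keys, location, file formats.
spark.sql("DESCRIBE FORMATTED demo_db.demo_table").show(50, truncate=False)

# Storage layout per partition (for partitioned, Hive-style tables).
spark.sql("SHOW PARTITIONS demo_db.demo_table").show(truncate=False)

spark.stop()
```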

2.3 Step 3: Into the preparations for a major promotion

Major promotions often bring a sharp increase in traffic and a surge in data processing demand. To ensure services run stably during the event, the big data platform takes key preparatory measures such as stress testing, emergency drills, and emergency plans.

When I went through Double 11 preparation for the first time, I was in charge of a new big data service, so much of its promotion preparation had to be built from scratch. Thanks to my lack of experience, I also pulled my first all-nighter at JD.com.

• No stress testing during core hours

Although the existing stress test tools could support stress testing at the interface level, there were still open questions: how to schedule the stress test, how to determine its duration and traffic volume, and where the stress test data would come from. Because of my inexperience, preparation dragged on, and the actual run did not start until the day of the code freeze. Worse, when I was about to start, I was unaware of the core-hours problem; only after an R&D colleague warned me that the current time slot was risky did I stop in time. Stress test environments are often not completely isolated from the online environment, and since we were stress testing a new service, we had to avoid core hours.
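As an illustration of the lesson, a stress test driver can refuse to start during core hours; the window below is a hypothetical one, and a real plan would take it from the promotion preparation document:

```python
# A guard against exactly the mistake described above: refuse to start a
# stress test inside core hours. The window is a hypothetical placeholder.
import datetime

CORE_HOURS = (datetime.time(9, 0), datetime.time(23, 0))

def outside_core_hours(now: datetime.datetime) -> bool:
    start, end = CORE_HOURS
    return not (start <= now.time() <= end)

def run_stress_test() -> None:
    now = datetime.datetime.now()
    if not outside_core_hours(now):
        # Fail loudly instead of quietly loading a shared environment.
        raise RuntimeError(f"Refusing to stress test at {now:%H:%M}: inside core hours")
    print("stress test started")  # ...fire load at the target interface here...

run_stress_test()
```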

• Read interfaces can also generate dirty data

When sorting out the interfaces to stress test, we distinguished read interfaces from write interfaces in order to better understand and control the data consistency problems that might arise during the test. But this classification also misled us: because an interface was labeled as a read interface, and the stress test data was constructed independently, we did not consider that the interface might contain audit-related writes. It was not until near the end of the stress test, when a downstream service called to alert us that the test was affecting them, that we realized a read interface can also generate dirty data.
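A toy model of that trap, with all names hypothetical: an interface classified as "read-only" that still writes an audit record on every call, so independently constructed stress-test traffic silently pollutes a downstream store:

```python
# An interface labeled "read" that hides a write: every call appends an
# audit record, so stress-test reads become dirty data downstream.
audit_log = []  # stands in for a downstream audit service

def query_table_info(table_name: str, caller: str) -> dict:
    result = {"table": table_name, "owner": "demo_user"}  # the visible "read"
    audit_log.append({"caller": caller, "table": table_name})  # the hidden "write"
    return result

# 10,000 stress-test calls produce 10,000 dirty audit records downstream.
for _ in range(10_000):
    query_table_info("demo_db.demo_table", caller="stress-test-bot")

print(len(audit_log))  # records that should never have reached the real audit store
```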

• The emergency plan cannot be a plan without action

An emergency plan is the emergency response to an online problem, and executing it is itself risky. I remember that during the plan review phase, because high-risk operations were involved, we originally planned only to apply for resources and rehearse in the pre-release environment. However, our LDR raised a key question: if the plan is never actually executed during the preparation period, what happens if a problem arises during the promotion itself?

So far I have taken part in preparations for three major promotions, and I can clearly feel the growing maturity of both the preparation plans and the execution process. Even so, we still need to follow the preparation plan strictly, make sure key operational steps are coordinated with the product and R&D teams, and announce relevant information in advance so that upstream and downstream services and platform users can anticipate the risks.

In addition, on top of the company's existing platforms, promotion preparation is gradually becoming routine work: the preparation tasks have been progressively institutionalized and automated, forming a reliable playbook. This set of online service assurance measures not only provides solid support for major promotions, but also allows a risk assessment for every service launch, so problems can be detected in time and resolved earlier.

3. Competency review: A few tips for newbies

For students interested in big data testing, the following four points are worth your attention (a tiny example tying them together follows the list):

1. Master the basics of big data: be familiar with the core concepts of big data processing frameworks such as Hadoop and Spark, and with their applications in practical scenarios

2. Programming and scripting skills: be proficient in at least one programming language (such as Java or Python) and comfortable with basic shell commands

3. Professional testing ability: have a solid grounding in software testing fundamentals and understand basic quality assurance methods

4. Learning and problem-solving ability: be able to pick up new technologies quickly, stay problem-solving oriented, and analyze and simplify complex problems efficiently
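As that tiny example, here is a sketch of one of the most common big data checks, a row-count comparison between a source table and the table produced by the task under test; table names are hypothetical, and a real check would also diff key columns:

```python
# A minimal data-consistency check: compare row counts between the source
# table and the result table of the task under test. Names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("row-count-check").enableHiveSupport().getOrCreate()

src_cnt = spark.sql("SELECT COUNT(*) AS c FROM demo_db.source_table").first()["c"]
dst_cnt = spark.sql("SELECT COUNT(*) AS c FROM demo_db.result_table").first()["c"]

assert src_cnt == dst_cnt, f"row count mismatch: source={src_cnt}, result={dst_cnt}"
print(f"row counts match: {src_cnt}")

spark.stop()
```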

P.S. These points read a lot like recruitment requirements, so you can also keep an eye on job postings for positions you are interested in and build your professional skills against the job requirements~

4. The future is in hand: the surging tide of technology

Across all kinds of emerging applications, application-layer testing tools and quality assurance methods are maturing and advancing. As many applications have proven themselves, the benchmarking, launch, monitoring, and emergency self-healing required to bring a new app or web application to market have become increasingly standardized and systematic. By contrast, testing big data products still depends more heavily on individual professional ability and usually carries a higher professional threshold. As a result, test coverage for big data tends to be lower than at the application layer. That leaves many potential opportunities for us to explore.

1. Functional testing -> quality assurance: testing has gradually shifted from functional testing to quality assurance. This requires testing work to focus not only on the product itself but also on the quality and stability of the entire platform. Before entering this job, we might have described testing as "clicking around"; in the context of quality assurance, however, testing work also covers dimensions such as building efficiency tools, ensuring security compliance, and setting process standards.

2. Improving efficiency with technology: one of the main responsibilities of a test development engineer is to maintain and improve efficiency tools. On big data platforms, although automated testing tools are crucial, test data generation and end-to-end monitoring are still largely manual. Automating these processes is our challenge (a small sketch of the data generation side follows).
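On the data generation side, the gap looks roughly like this: instead of preparing test data by hand, a small generator produces it on demand. The schema and value ranges below are hypothetical:

```python
# Generating synthetic test data programmatically instead of by hand.
# Schema and value ranges are hypothetical placeholders.
import csv
import datetime
import random

def generate_rows(n: int):
    base = datetime.date(2024, 6, 18)
    for i in range(n):
        yield {
            "order_id": f"o{i:08d}",
            "dt": (base + datetime.timedelta(days=random.randint(0, 6))).isoformat(),
            "amount": round(random.uniform(1, 999), 2),
        }

with open("test_orders.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "dt", "amount"])
    writer.writeheader()
    writer.writerows(generate_rows(1000))

print("wrote 1000 synthetic order rows for the task under test")
```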

Every year, JD Stars like me join JD.com and the data middle platform. Maybe you know more about big data than I did, or maybe you are starting from scratch like me, but I believe you will not be disappointed here. Whether you run into difficulties or want to show your strengths, there are teams standing by your side to help you and far-sighted seniors to guide you. We are waiting for you at JD.com.
