
"Trusted AI" battle: the "life and death" moment of the Internet giant

"Trusted AI" battle: the "life and death" moment of the Internet giant

In recent years, "trusted AI" has gradually become one of the hot topics in artificial intelligence, and the Internet technology companies whose businesses are built on AI have taken the lead.

In June this year, Ant Group unveiled its "Trusted AI" technical architecture for the first time at the Global Artificial Intelligence Conference. In July, JD Explore Academy released China's first Trustworthy AI white paper at the World Artificial Intelligence Conference. Both companies regard privacy protection, robustness/stability, explainability, and fairness as the four basic principles of "trusted AI".

Looking more closely, the call for privacy protection and fairness predates the term "trusted AI" itself. Fintech companies such as WeBank and Tongdun Technology, for example, began investing in data privacy long ago, applying technologies such as federated learning and differential privacy to protect the data that their models depend on.

What is "Trusted AI"? Why do Internet giants frequently go down, so that the digital jianghu will set off such a vigorous research boom?

More importantly, academia has joined in alongside industry. In October this year, for example, the director of Columbia University's Data Science Institute, a Fellow of both ACM and IEEE, published an article entitled "Trustworthy AI" in Communications of the ACM, detailing the origins, core principles, and research significance of "trusted AI".

Different from the "AI ethics", "trusted AI" not only calls for the development of technology to be people-oriented, but also starts from the artificial intelligence technology itself, emphasizing the robustness and interpretability of artificial intelligence algorithms and models. In other words, if "AI ethics" is the moral code of artificial intelligence society, then "trusted AI" is equivalent to the legal means of the era of artificial intelligence, and there will be opportunities to restrict the drawbacks of artificial intelligence technology from the root.

But why did research on "trusted AI" start in industry and only later spread to academia? And why, at present, are the main actors in "trusted AI" research the Internet technology giants?

The reason is simple: large-scale applications of artificial intelligence have run into a series of "trust crises", and both ordinary users and authoritative scholars have raised concerns about AI algorithms. As the main force applying AI technology, Internet companies that do not actively solve the trust problem are likely to face elimination.

01. A technological revolution "forced" by users

The big players have had to confront a problem: the public's trust in artificial intelligence is declining.

As is well known, today's neural-network-based AI suffers from common flaws: it is hard to explain, lacks robustness, and depends excessively on data. This untamed "wildness" means that, alongside unprecedented convenience, it also brings many potential hazards to human society.

Such examples are not uncommon.

According to the British Daily Mail, in February 2015 the United Kingdom carried out its first robotic heart valve repair surgery, using the Da Vinci surgical robot, known as the "Boston Dynamics of surgical robotics". What was meant to be a demonstration of cutting-edge medical AI ended in tragic failure. During the operation, blood splashed onto the camera and "blinded" the machine, the robot operated on the wrong position of the patient's heart and punctured the aorta, and the patient died a week after the procedure.

In September 2016, CCTV's "Rule of Law Online" program reported a serious accident on the Handan section of a highway in Hebei Province. A young man driving a Tesla sedan failed to avoid a road sweeper working ahead of him in time and rear-ended it, dying in the crash. Analysis of the dashcam footage showed that at the time of the accident the Tesla was in cruise mode and crashed into the vehicle ahead because it failed to recognize and avoid it. This is believed to be the first fatal crash involving Tesla's Autopilot function disclosed in China.

Real tragedies like these have greatly reduced the public's trust in AI. Even though numerous studies show that vehicles using autonomous driving systems have a far lower accident rate than conventionally driven vehicles, voices of doubt persist:

"In autonomous driving, it is not the 99% accuracy that determines the success or failure of the traffic innovation, but the 1% error rate."

The current development of artificial intelligence is dominated by deep learning, and the most common advertising claim for deep learning algorithms is "accuracy as high as 99.99%". Yet because of the "black box" nature of deep learning models, even the "Big Three of deep learning", who won the 2018 Turing Award (the "Nobel Prize of computing") for their work on deep learning and neural networks, cannot confidently claim that an algorithm achieves 100% accuracy.

"Trusted AI" battle: the "life and death" moment of the Internet giant

The big three of deep learning, from left to right, are Yann LeCun, Geoffrey Hinton and Yoshua Bengio

Seen from this angle, an algorithm whose real-world accuracy reaches only 99% can cause problems that should not be underestimated. For example, if a city has 100,000 self-driving cars in the future, then even at 99% accuracy there are still 1,000 vehicles that may pose a hidden threat to travel safety.

In addition to "unexplainability", artificial intelligence systems have also presented many "user trust crises" caused by unfair design, unstable model conclusions and privacy violations in real life. The companies and users involved are not only the autonomous driving industry, but also more of a "national crisis".

AI technology has become one of the indispensable driving forces of the Internet industry. But as AI empowers the digital economy, the drawbacks of AI algorithms surface frequently, making users worry about the AI products that enterprises launch and prompting an endless stream of doubt.

For example, some e-commerce platforms use big data to charge loyal customers higher prices; recommendation algorithms on content platforms homogenize the information users receive; last year, People magazine reported that food-delivery algorithms trap riders inside the system; social platforms have exposed personal data through lax oversight; and in financial scenarios, AI-driven ratings behind loans and insurance raise fairness concerns.

A more direct example again comes from transportation. Not everyone will buy a self-driving car that is only 99% accurate, but daily travel is now almost inseparable from ride-hailing platforms. These platforms use AI systems for driver-passenger matching, route planning, and automatic dispatch, which greatly improve convenience but also bring problems caused by imperfect AI, such as unreasonable routes that lead to overcharging, or displayed pick-up and arrival times that diverge sharply from reality.

Behind algorithmic fairness are two basic factors: the human and the non-human. If an enterprise using AI to price-discriminate against loyal customers is a controllable corporate ethics problem (the enterprise is simply chasing profit), then routinely dispatching a driver with "many complaints" to a passenger, thereby threatening the passenger's safety, is more likely to stem from "uncontrollable" technical defects in the AI system itself. "Uncontrollable" here means that the traditional AI model's decision-making is a "black box": its reasoning process is not transparent.

The contradiction between the convenience AI brings and its "untrustworthiness" is gradually becoming the core problem of applying AI at scale in real life.

In a medical scenario, for instance, a patient who does not trust AI will not follow the diagnosis and medical advice an AI system gives, even when that advice is correct and beneficial. Similarly, no matter how great self-driving technology is claimed to be, without a universal guarantee we dare not hand the wheel to AI; and however convenient online payment platforms such as Alipay are, if their AI algorithms cause users to lose money, we will not use them again.

Therefore, increasing the public's trust in AI has become crucial.

02. Why are companies entering "Trusted AI"?

In response to the negative impacts of AI deployment, government bodies around the world have issued nearly 150 AI governance principles and consensus documents, such as the 2019 "G20 AI Principles" emphasizing the "promotion of innovation in trusted AI" and the European Union's 2019 "Ethics Guidelines for Trustworthy AI"; industry and academia have likewise awakened and begun to actively advocate research on "trusted AI".

So why are JD.com, Ant, Tencent and other companies entering "trusted AI"? Why has even Google set up an "AI ethics team"?

One immediate reason is that trust is the cornerstone of business. In January 2002, Bill Gates, then at the helm of Microsoft, wrote in his "Trustworthy Computing" memo to employees and shareholders that the four pillars of trustworthiness are security, privacy, reliability and business integrity.

As the AI trust crisis ferments, ordinary users, from a consumer's perspective, grow ever more cautious about AI products; scholars, from a research perspective, worry about the real-world consequences of defects in AI models themselves; and enterprises, from an operational perspective, must face the user trust, technical risks, and peer competition that come with applying AI to the digital economy.

In recent years, national policies have also placed great emphasis on being "people-oriented" when AI lands in practice; in other words, users are the core group that policy protects. In May 2018, for example, the European Union's General Data Protection Regulation (GDPR), known as the "strictest privacy and data protection law in history", came into effect, and the French data protection authority (CNIL) later imposed a record fine of 50 million euros on Google for violating it, firing a warning shot at companies worldwide that use technologies such as AI to empower the economy.

Beyond user trust, there are two further reasons why enterprises want to enter "trusted AI":

First, enterprises face their own risk-control problems. Unlike an AI glitching in a game, the loopholes and mistakes of AI systems in medical, financial, and transportation scenarios can cause irreparable losses of money and safety.

Internet payment platforms such as Alipay face hundreds of millions of attacks from the online black market every day, a race in which "if you don't run, the black market runs ahead of you". If the platform is slower than the attackers, the funds of vast numbers of Alipay users are threatened. Here the robustness of Alipay's risk-control models and algorithms becomes crucial.

Second, AI's impact on human society is deepening, and it is replacing human labor in more and more scenarios. A company that does not prepare its defenses in advance may be eliminated in a new round of market competition.

Take ride-hailing platforms: whichever company first develops a more powerful and stable AI dispatch system, lowering the rate of poor driver-passenger matches, matching passengers to the nearest driver more often to cut waiting time, and automatically offering an optimal and fair price, will be best placed to reduce costs, improve efficiency and gain a competitive advantage in the market.

In online payments, likewise, if Alipay did not actively improve the robustness and interpretability of its AI algorithms but relied on manual screening to identify fraudulent calls, labor costs would rise sharply; and if its existing models could not outperform the black market in screening and defense, the losses would be immeasurable. If, at that point, competitors took the lead in developing more robust "trusted AI" systems and algorithms, Alipay would lose its existing ground or face elimination. This may also be the core motivation for Ant Group to start trusted-AI research as early as 2015.

In September 2018, the McKinsey Global Institute released a 60-page report analyzing AI's impact on the global economy, stating clearly that AI will generate about 13 trillion US dollars in economic value worldwide by 2030, raising global GDP by roughly 1.2% per year. It also noted that AI adoption may widen the gap between businesses: AI leaders are expected to double their returns by 2030, while companies that delay adopting AI will fall far behind.

03. How do companies respond?

Pressed by tightening policy, user trust, and the pursuit of the black market, manufacturers at home and abroad have had to devote themselves, actively or passively, to "trusted AI" research, using concrete action to control the negative impact AI may have on human society.

After the GDPR penalty, for example, Google launched a "forgetting" mechanism in 2019, allowing users to delete their personal data from Google pages or YouTube and promising to automatically delete users' location information, browsing history and similar data after a set period (18 months).

In the Internet era, besides integrity, people also had to consider data security, and data leaks were the main source of trust crises. In the AI era, on top of even more serious data security issues, the uncontrollability introduced by the statistical nature of algorithms, the self-learning of AI systems, and the unexplainability of deep learning "black box" models have become new sources of user distrust.

At this point the trust problem no longer depends solely on the company's own intentions; it also depends on the company's understanding and control of AI technology (data, algorithms, and so on). From the enterprise's perspective, the willingness to promote "trusted AI" is therefore only the first step toward solving the user trust problem. The key question is whether the company can make AI trustworthy at the level of the underlying technology.

A common prejudice holds that domestic manufacturers lag far behind Europe and the United States in awareness of "trusted AI". In fact, as early as February 2015 Ant Group launched a research project on phone-loss risk based on device-side features, taking the first step in device-cloud collaborative risk control aimed at protecting user privacy and security. In June 2017, Ant released AlphaRisk, its first-generation intelligent risk-control engine with intelligent attack-and-defense capabilities, for risk prevention and control on the user-facing Alipay side. By disclosing its technical framework for the first time in 2021, Ant Group completed six years of "trusted AI" technology accumulation. According to the report on key technologies for safe and trustworthy AI released by IPR Daily in June 2021, Alipay, under Ant Group, ranks first in the world in the number of patent applications and grants in this field.

In general, enterprises' work on "trusted AI" falls into three parts: documents and advocacy, corporate governance, and technology research.

On the documentation side, the best known are the Trustworthy AI white paper released this year by JD Explore Academy and the federated learning white paper led by WeBank. Of course, trusted AI cannot stop at principles and consensus; it must also be implemented in technology and organizational culture.

On the governance side, SenseTime established an AI Ethics Governance Committee in January last year and launched an ethics review system in the first half of this year, building a risk management system that spans the entire life cycle of an AI system and tracing and reviewing every system before deployment, setting an example for domestic technology companies.

At the level of technology research and development, "trusted AI" is mainly pursued along two lines: data and algorithms. The data issues center on privacy and security, data bias, and the unfairness it produces; the algorithm issues center on explainability and robustness.

Data, algorithms and computing power are often called the "troika" of AI research. As users grow more aware of protecting their private data and the risks of data breaches increase, finding a way to pursue "data protection" and "data-driven AI research" at the same time will become one of the popular research directions of "trusted AI".
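
As a concrete illustration of one technique often used to balance the two, below is a minimal sketch of differential privacy via the classic Laplace mechanism; the query, the sensitivity, and the epsilon value are assumptions chosen for the example, not parameters of any particular company's system.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Return a differentially private estimate of a numeric query.

    Adds Laplace noise with scale sensitivity / epsilon, the standard
    construction for epsilon-differential privacy on numeric queries.
    """
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Example: release the count of users flagged by a risk model.
# Adding or removing one user changes the count by at most 1, so sensitivity = 1.
raw_count = 10_482
private_count = laplace_mechanism(raw_count, sensitivity=1.0, epsilon=0.5)
print(f"raw count: {raw_count}, private release: {private_count:.0f}")
```

The released value is close to the true count but carries enough calibrated noise that no single user's presence in the data can be inferred from it.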

Beyond data privacy protection, achieving "trusted AI" confronts researchers in industry and academia with open problems, and future directions for concentrated effort, along the following dimensions:

1) Fairness of data

An AI's training results depend directly on its input data. But because of limitations in data collection, different groups are unevenly represented: today's NLP training corpora are mostly English and Chinese, and more than 8,000 other minority languages struggle to enter the AI world; and because of problems in the training corpus, AI resume screening often automatically filters out candidates with specific keywords, turning them into people who are "invisible to AI", as the sketch below illustrates.
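
As a rough sketch of how such imbalance can be detected, along with one simple mitigation, the example below measures per-group representation in a toy dataset and derives inverse-frequency sample weights; the groups and counts are invented for illustration, and reweighting is only one of many possible remedies.

```python
import pandas as pd

# Hypothetical screening dataset; "group" could be language, gender,
# or any attribute whose under-representation we want to detect.
df = pd.DataFrame({"group": ["A"] * 900 + ["B"] * 80 + ["C"] * 20})

# 1) Measure representation: a heavily skewed distribution is a warning sign.
counts = df["group"].value_counts()
print(counts / len(df))          # A: 0.90, B: 0.08, C: 0.02

# 2) Inverse-frequency sample weights so minority groups are not drowned out
#    during training (pass these as sample_weight to most learners).
weights = df["group"].map(len(df) / (len(counts) * counts))
print(weights.groupby(df["group"]).first())
```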

2) Stability of the algorithm

AI models face multiple attack methods targeting their data and systems, such as poisoning attacks, adversarial attacks, and backdoor attacks. For example, feeding malicious comments into a model can degrade the accuracy of a recommendation system, and specially designed patterns on traffic signs can mislead an autonomous driving system into misidentifying them.

Moreover, the form of interference is spreading from the digital world to the physical world, for example by printing adversarial examples to physically interfere with autonomous driving and face recognition systems.
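
The following minimal sketch shows the basic mechanics of a digital-world adversarial attack of this kind: nudge the input in the direction that most increases the model's loss so the prediction flips while the change stays small. A toy linear classifier in plain NumPy stands in for a real model; the weights, input, and perturbation budget are invented for illustration.

```python
import numpy as np

# A toy linear classifier standing in for a trained model: score = w.x + b.
w = np.array([1.5, -2.0, 0.7])
b = 0.1

def predict_proba(x: np.ndarray) -> float:
    """Sigmoid probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def fgsm_perturb(x: np.ndarray, y_true: int, eps: float = 0.1) -> np.ndarray:
    """Fast-gradient-sign-style perturbation against the cross-entropy loss.

    For this linear model the gradient of the loss w.r.t. the input is
    (p - y_true) * w, so we step eps along the sign of that gradient.
    """
    p = predict_proba(x)
    grad_x = (p - y_true) * w
    return x + eps * np.sign(grad_x)

x = np.array([0.2, 0.1, 0.4])            # benign input, true label 1
x_adv = fgsm_perturb(x, y_true=1, eps=0.3)
print("clean prob:", round(predict_proba(x), 3))        # about 0.62
print("adversarial prob:", round(predict_proba(x_adv), 3))  # about 0.31, prediction flips
```

Defenses such as adversarial training work by folding examples like `x_adv` back into the training set so the model learns to resist them.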

3) Interpretability of algorithms

Machine learning algorithms represented by deep learning are essentially end-to-end black boxes. On one hand, it is unclear why a trained AI model performs so well; on the other, it is unclear what factors the AI system relies on when making decisions. For example, when an experimenter asked GPT-3 (a natural language processing model) "When will the COVID-19 pandemic end?", it answered "December 31, 2023". What is the answer based on? Researchers have no way to explain it, and naturally no way to guarantee its accuracy.

"Trusted AI" battle: the "life and death" moment of the Internet giant

Stills from Person of Interest
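
One widely used, model-agnostic way to probe what a black-box model relies on is permutation importance: shuffle one input feature at a time and measure how much performance drops. The sketch below applies the idea to a fabricated dataset with a scikit-learn classifier; the data and feature names are assumptions for illustration, and this is only one of many explainability techniques (saliency maps, SHAP values, and so on).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Fabricated data: feature 0 drives the label, features 1-2 are noise.
X = rng.normal(size=(1000, 3))
y = (X[:, 0] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
baseline = model.score(X, y)

# Permutation importance: a feature whose shuffling hurts accuracy a lot
# is a feature the model actually relies on when it makes decisions.
for j, name in enumerate(["feature_0", "feature_1", "feature_2"]):
    X_shuffled = X.copy()
    perm = rng.permutation(len(X_shuffled))
    X_shuffled[:, j] = X_shuffled[perm, j]
    drop = baseline - model.score(X_shuffled, y)
    print(f"{name}: accuracy drop = {drop:.3f}")
```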

In response to the above problems, major domestic manufacturers have begun to lay out their research and look for feasible technical means to clear the "roadblocks" on the road to "trusted AI".

Take data privacy protection. Emerging technologies such as distributed computing and federated learning, which allow data to be "usable but not visible", are popular in industry, especially with companies such as Ant Group, WeBank, and Tongdun Technology. In November 2017, Ant and UC Berkeley launched the open-source AI project Ray, which provides developers with computing resources and task scheduling through a distributed platform and continues to offer open-source support to developer communities around scenarios such as privacy protection, intelligent risk control, and intelligent search; in 2019, WeBank open-sourced FATE, the first industrial-grade federated learning framework.
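
To make the idea of "usable but not visible" concrete, here is a minimal federated-averaging sketch in plain NumPy: each party trains a local model on data that never leaves it, and only the model parameters are averaged. This is a conceptual toy under simplified assumptions (two parties, a linear model, synthetic data), not the API of FATE or Ray.

```python
import numpy as np

def local_update(w: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, epochs: int = 5) -> np.ndarray:
    """A few steps of local gradient descent on a least-squares loss."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Two parties hold their own private data; raw rows are never exchanged.
parties = []
for _ in range(2):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    parties.append((X, y))

# Federated averaging: broadcast global weights, train locally, average back.
w_global = np.zeros(2)
for _ in range(10):
    local_weights = [local_update(w_global, X, y) for X, y in parties]
    w_global = np.mean(local_weights, axis=0)

print("recovered weights:", np.round(w_global, 2))  # close to [2.0, -1.0]
```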

On model robustness, domestic manufacturers are actively investing in adversarial learning research. In September 2017, Ant filed its first text-adversarial patent, "A Method for Identifying Risky Variants of Text Content Based on Pinyin-Extended Features", and over the following three years continued to explore intelligent adversarial techniques for content-security scenarios, filing 31 patents in total. Baidu proposed and open-sourced the adversarial example toolbox Advbox, which uses advanced generation methods to build adversarial example datasets, gathers statistics on their features, attacks new AI applications, and hardens business AI models through adversarial attacks to improve their security.

In applied research on interpretability, domestic companies are also prominent. In September 2018, Ant launched an intelligent anti-money-laundering reporting system that automatically generates report content containing the risk-control rationale and handling plan required for regulatory compliance; in 2020, Ant developed an interpretable graph algorithm, Suchk-alike, which can proactively probe scenarios in fraud cases.

Looking across the governance of AI by major domestic manufacturers, it is easy to see that companies focused on perception intelligence, represented by SenseTime, concentrate more on applying and controlling "AI for good", while "trusted AI" has sharper stakes in risk-sensitive scenarios, represented by finance, where the research goes deeper and more thoroughly.

In the foreseeable future, enterprises in areas closely tied to everyday life, such as finance, healthcare, and transportation, may become the main force of "trusted AI" research, producing a stream of key technological achievements that guide AI toward benefiting society from the technical side.

04. Corporate responsibility

A company that makes breakthrough AI technology will move very fast, but it can only go further if its AI becomes trustworthy. "AI breakthrough" and "AI trustworthiness" are like conquering territory and holding it: the former makes people ambitious, but the latter guarantees a stable life.

In building "trusted AI", enterprises are a force not to be underestimated. On the one hand, they are the main force of technology research; on the other, they are the ones who discover the problems as AI lands in practice. The problems enterprises find while pushing AI toward commercialization are fed back to academia, accelerating the solution of the many problems in AI deployment.

In addition, enterprises are the pioneers in realizing the value of AI in human society. Ultimately, AI innovations achieved in the lab, for better or worse, must be turned into products by enterprises, delivered to users, and allowed to affect individuals.

In this process, protecting users is also protecting the enterprise itself. Although this is a technological revolution forced on companies by the black market and by users, the "trusted AI" crisis does not stem from the public's ignorance of technology or stubbornness of opinion; it stems from the fact that AI technology itself is not yet mature. As noted above, many technical problems in "trusted AI" research remain to be solved.

Enterprises charge ahead, academia provides cover, and all forces unite. Only when more and more researchers get involved will AI achieve "trustworthiness" in the near future.

Reference Links:

1. Bughin, J., Seong, J., Manyika, J., Chui, M., & Joshi, R. (2018). Notes from the AI frontier: Modeling the impact of AI on the world economy. McKinsey Global Institute, Brussels, San Francisco, Shanghai, Stockholm.

2. Trustworthy AI

https://cacm.acm.org/magazines/2021/10/255716-trustworthy-ai/fulltext

3. Bill Gates: Trustworthy Computing

https://www.wired.com/2002/01/bill-gates-trustworthy-computing/

4. Google Will Delete Your Data by Default—in 18 Months

https://www.wired.com/story/google-auto-delete-data/

5. ETHICS GUIDELINES FOR TRUSTWORTHY AI

https://ec.europa.eu/newsroom/dae/document.cfm?doc_id=60419
