The music business is attracting growing attention from the tech giants. In the past, music platforms competed on the number of tracks and the number of artists on board, and later on the exclusivity of their copyrights. Now that AI technology, represented by deep neural networks, is gradually approaching practical deployment, an AI war among music platforms around the world is about to break out.
In 2016, Google Brain launched the Magenta project, which leans more toward academic research; its work ranges from the early NSynth neural audio synthesis algorithm to today's Coconet, a machine learning model that restores Bach's music from fragments.
Sony, one of the world's three major music copyright owners, has a great advantage in music content itself. In 2016, Sony Computer Science Laboratories unveiled Flow Machines, a system built on a large database of songs and styles, and used it to create a melody in the style of the Beatles.
Both Google and Sony own corresponding streaming products, such as the YouTube music service and Sony's Hi-Res. Both face the possibility that streaming media will upend the status of traditional record labels and songwriters, and both are bound to compete to become the bellwether of a new model of industry collaboration. The difference is that Sony needs to find its next growth point now that revenue growth in its music copyright business has slowed.
With smart speakers growing popular, Google can also build a smart home ecosystem around its own smart speaker, Home. However, the strategy of attracting consumers with subsidized low prices is no longer working well, and in the future smart speakers will have to rely on intelligent interaction and coverage of every usage scenario.
Microsoft's "Xiaoice" for the Chinese market has now reached its seventh generation. Built on the Avatar Framework artificial intelligence framework, it offers intelligent dialogue and voice interaction, and it also focuses on simulating a real human voice, writing lyrics, and composing. In 2018, Microsoft Xiaoice also proposed its Dual AI semi-open ecosystem strategy and reached platform partnerships with a number of domestic companies, but it has still not outlined a clear business logic.
ByteDance, which is intent on overseas markets, completed its acquisition of the startup Jukedeck and acquired the music rights of T-Series and Times Music, two major record companies in India, in an attempt to bring neural-network-synthesized music to its TikTok short video product. Short videos offer AI composers an effective channel for producing music at scale, and may ease ByteDance's music copyright pressure.
At present, Google, Sony, Microsoft Xiaoice, and ByteDance are all investing heavily in AI, but their level in AI music is uneven. Early researchers mostly had computers imitate existing music fragments and produce melodies by analyzing the patterns in them. AI music creation differs in that the computer learns from a large number of music fragments and then truly "automatically" creates relatively complex music with a narrative quality. In this direction, Google and Sony began exploring AI music creation early; in contrast, Jukedeck, which ByteDance acquired, remained largely at the imitation stage and could only serve as a mass production tool on the music assembly line.
In a sense, technological progress has driven every advance in the music industry. From the earliest CD records to today's AI music, the production, distribution, and consumption of music have been visibly upgraded with each iteration. According to the International Federation of the Phonographic Industry (IFPI), global music market revenue increased by 9.7% year-on-year to $19.1 billion in 2018. For tech giants seeking deeper business evolution, whether they ultimately win will depend on how well they seize the opportunity. More importantly, as the giants make frequent moves, the AI-driven revolution in the music market will set off a new round of competition.
AI is impacting the competitive landscape of global companies, and the music industry has entered a critical period of diversified value activated by AI.
Today, the strength of China's Ping An in AI music far exceeds what people imagine.
On October 11, in honor of the 70th anniversary of the founding of the People's Republic of China, the Shenzhen Symphony Orchestra gave the world premiere of "My Motherland and Me", the world's first AI symphonic variation, created by Ping An Artificial Intelligence Research Institute.

The AI symphonic variation "My Motherland and Me" is based on the modern and contemporary history of China, and contains five major movements, including the Opium War, the founding of New China, the tortuous development of the Republic, reform and opening up, and national rejuvenation, showing a sequence of historical changes. Accompanied by audio performances, a series of historical stories are presented to express deep feelings for the motherland.
Arguably, this is a first in the history of the symphonic variation. Musically, it goes beyond the single-dimensional, short-form, entertainment-oriented scope of earlier works and breaks through to a multi-dimensional, long-form classical symphony. More importantly, on the AI side, Ping An Technology developed its own AVM automatic variation model training system, which uses deep learning to learn and extract musical features and combines it with reinforcement learning so that the machine learns variation technique.
Broadly speaking, AI composition is not a new term. From the earliest stochastic statistical models to today's deep neural networks, scientists around the world have widely explored the use of AI for intelligent creation. Still, at the level of research methods, many questions keep recurring: how can data-driven algorithms avoid homogeneous musical styles? How can AI better "understand" music?
With these questions in mind, Lei Feng Network conducted an exclusive interview with the technical leader of the team behind the AI symphonic variation "My Motherland and Me".
"In addition to technical measures in algorithms and data labeling, we are also considering analyzing the audio of music directly, with the aim of enabling AI to understand and recognize music itself more and more deeply," the technical leader said.
Strong technology paves the way into the no man's land of AI music
In fact, as early as a year ago, Ping An's AI composition won first place in the World AI Composition International Competition organized by the Swiss Federal Institute of Technology in Lausanne (EPFL). In February this year, Ping An Technology again took first place, this time in the Global AI Art Competition (GAAC) jointly organized by the Center for Art and Scientific Research of Tsinghua University, with the AI-created pop song "Youth Memory".
These frequent breakthroughs are inseparable from the Ping An AI team's exploration and accumulation in intelligent creation over the past two years or so. As early as 2017, Ping An Technology launched three major directions of music development, namely music portrait mask, music popularity prediction, and artificial intelligence composition, in an effort to bring AI into the music field. At present, the team has accumulated a large amount of annotated analysis data, independently developed generative models that can complete specific tasks, and built an evaluation system consistent with music theory.
Preparing the work took nearly two months, of which the core task of model training took nearly a month and a half.
The technical leader of the team behind the AI symphonic variation "My Motherland and Me" explains: "Usually, the entire creation cycle of a symphony is as long as one year. This creation actually took only one and a half months, but behind it are two years of technical reserves, model learning, and data accumulation." From a technical point of view, composing a symphonic variation with AI remains a great challenge, especially creating a work that a human conductor is willing to recognize and perform. "You know, a symphonic variation is different from the general music generation process. It has a story line and a strong demand for emotional expression." To this end, the project team advanced its technology on the following three levels:
Self-developed AVM auto variation model
An expert variation rule library covering rhythm, harmony, texture, orchestration, and other aspects is established for basic model training. Deep learning and reinforcement learning are then used for multi-dimensional feature learning and extraction from musical works, and an AVM automatic variation model capable of style fusion is trained.
Training on a dataset of more than 700,000 songs and building a massive multidimensional music label system
To let the machine learn and understand the important features of music, the team trained on more than 700,000 songs, including classical works, red songs, folk songs, and so on. The music labels follow music theory: in addition to mood and style, they cover musical elements such as theme, development technique, harmony, melody, counterpoint, instrumentation, tonality, mode, and time signature.
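As a rough illustration of what one record in such a label system might look like, here is a minimal Python sketch. The class name, field names, and sample pieces are all hypothetical; the article does not describe Ping An's actual schema, so this only mirrors the label categories listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MusicLabels:
    """Hypothetical label record for one piece; field names are illustrative."""
    title: str
    mood: str
    style: str
    tonality: str          # tonal centre, e.g. "C"
    mode: str              # e.g. "major" or "minor"
    time_signature: str    # e.g. "4/4"
    instrumentation: List[str] = field(default_factory=list)
    development_techniques: List[str] = field(default_factory=list)


def select(corpus: List[MusicLabels], **criteria: str) -> List[MusicLabels]:
    """Return the pieces whose scalar labels match every given criterion."""
    return [
        piece for piece in corpus
        if all(getattr(piece, key) == value for key, value in criteria.items())
    ]


# Made-up sample data, only to exercise the query.
corpus = [
    MusicLabels("Piece A", "solemn", "classical", "D", "minor", "3/4",
                ["strings"], ["sequence"]),
    MusicLabels("Piece B", "joyful", "folk", "G", "major", "2/4",
                ["erhu", "dizi"], ["repetition"]),
    MusicLabels("Piece C", "joyful", "classical", "C", "major", "4/4",
                ["full orchestra"], ["augmentation", "inversion"]),
]

joyful_major = select(corpus, mood="joyful", mode="major")
print([piece.title for piece in joyful_major])
```

Structured labels of this kind make it possible to pull training subsets by musical property, for example all joyful pieces in a major mode, instead of treating the corpus as undifferentiated audio.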
Flexible use of music evaluation models and expert rules
The music evaluation model is also trained with deep learning and reinforcement learning; it is an evaluation network built by learning from the works of a large number of composers. Its principle is to follow mainstream aesthetics while taking into account the evaluation criteria of expert composers. At the same time, to keep AI composition from being too free, Ping An incorporated expert rules, including harmony constraints, counterpoint constraints, and song structure constraints, into the AI composition process.
In general, the adaptation of "My Motherland and Me" uses the original melody at the beginning and end and places the AI-created variations in between. In applying AI, the team combined deep learning, reinforcement learning, and transfer learning to build an automatic variation model, a music evaluation model, and an expert rule system. Drawing on a database of massive historical music works and a systematic music labeling project, the system decomposed the space of note combinations and selected the best music fragments to complete the work.
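The overall flow described here (propose variations, discard those that break expert rules, rank the rest with an evaluation model, keep the best) can be sketched in a few lines of Python. This is a toy under stated assumptions, not Ping An's system: hand-written transformations stand in for the learned variation model, a simple smoothness score stands in for the learned evaluation network, and the theme, rule thresholds, and function names are all invented for illustration.

```python
from typing import List

Melody = List[int]  # MIDI pitch numbers (60 = middle C)

# Made-up seven-note theme in C major, not taken from the actual work.
THEME: Melody = [60, 64, 67, 65, 64, 62, 60]


def transpose(m: Melody, semitones: int) -> Melody:
    """Shift every pitch by a fixed number of semitones."""
    return [p + semitones for p in m]


def invert(m: Melody) -> Melody:
    """Mirror every pitch around the first note."""
    return [2 * m[0] - p for p in m]


def retrograde(m: Melody) -> Melody:
    """Play the melody backwards."""
    return m[::-1]


def fill_leaps(m: Melody) -> Melody:
    """Insert a passing tone inside every leap larger than a whole step."""
    out = [m[0]]
    for a, b in zip(m, m[1:]):
        if abs(b - a) > 2:
            out.append((a + b) // 2)
        out.append(b)
    return out


def passes_expert_rules(m: Melody, tonic_pc: int = 0,
                        low: int = 48, high: int = 84,
                        max_leap: int = 12) -> bool:
    """Stand-in constraints: playable range, no leap over an octave, end on the tonic."""
    in_range = all(low <= p <= high for p in m)
    leaps_ok = all(abs(b - a) <= max_leap for a, b in zip(m, m[1:]))
    ends_on_tonic = m[-1] % 12 == tonic_pc
    return in_range and leaps_ok and ends_on_tonic


def evaluate(m: Melody) -> float:
    """Stand-in for a learned evaluation network: prefer smoother motion."""
    steps = [abs(b - a) for a, b in zip(m, m[1:])]
    return -sum(steps) / len(steps)


# Stand-in for the variation model: a fixed set of rule-based candidates.
candidates = [
    transpose(THEME, 7),   # ends on G, so the tonic rule rejects it
    transpose(THEME, 12),
    invert(THEME),
    retrograde(THEME),
    fill_leaps(THEME),
]

valid = [m for m in candidates if passes_expert_rules(m)]
best = max(valid, key=evaluate)
print(len(valid), best)
```

Keeping the hard constraints separate from the scoring step reflects the division described above: the rules stop the generator from being "too free", while the evaluation score only ranks candidates that are already admissible.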
The unique DNA of Ping An AI+
So why has Ping An, which outsiders see as an integrated financial services group working in finance, medical care, and smart cities, ventured into the seemingly unrelated art of music?
Looking at Ping An Group's leapfrog development over the past 30 years, it is not difficult to find the driving force behind it. At present, Ping An uses technology to empower finance, focuses on platform construction, and has built five major ecosystems: finance, medical care, automobiles, real estate, and smart cities. With the overall business layout taking shape, Ping An Artificial Intelligence Research Institute is an important part of the group's underlying technology reserves and applications.
The technical leader of the team behind the AI symphonic variation "My Motherland and Me" said: "Ping An Artificial Intelligence Research Institute was established mainly to do two things: one is in-depth research and capability building in underlying technology; the other is to combine that technology with current enterprise application scenarios." In his view, AI intelligent creation is an important part of the institute's project portfolio. Although the path to commercial deployment is not yet clear and the work is still at a relatively early stage of exploration, demonstration, and verification, the underlying technology that supports it is general-purpose.
Previously, in combination with financial, medical, health and other businesses, Ping An launched smart flash compensation, Ping An voiceprint, Ping An bill OCR recognition, Ping An speech recognition, Ping An speech synthesis, Ping An medical imaging and other products.
In Lei Feng Network's view, the success of Ping An's exploration of the "AI + music" field can be attributed mainly to three factors:
The first is not only its reserve of deep learning technology but, more importantly, the team's deep understanding of music.
The AI composition project team of Ping An Artificial Intelligence Research Institute has a large number of compound talents who understand both music theory and computer algorithms, and can integrate cutting-edge AI technology with flexible musical emotions, continuously break through the boundaries of artificial intelligence technology, tap the potential of AI technology in the music field, and achieve the optimal development of AI composition.
The second is having application scenarios that are relatively ready to deploy, such as music therapy and intelligent composition, and knowing how to develop them.
From a formal point of view, after the AI variations, Ping An Technology will also make more attempts and breakthroughs in classical music, pop music, lyrics, composition and singing. The integration of AI into artistic creation has greatly reduced the creative threshold of the general public, allowing more people to join in music creation, explore more forms of music, and greatly enrich people's lives.
Third, the continuous accumulation of data and scenarios will feed back into the group's technical depth in other industry chains. In a sense, this will be a "dimensionality reduction" breakthrough.
In fact, Ping An Technology is already trying to uncover interesting scenarios and to release more, richer, and more personalized works of art through AI. At present, across the industry, there is already considerable demand for AI composition in scenarios such as soundtracks for short videos, games, and film and television. In the future, it plans to use AI technology to create many application products, achieve product and business output, and build multi-angle integrated solutions that help its main business and ecosystems develop in a more diversified and deeper direction.
Perhaps, in Ping An Technology's thinking about building a differentiated advantage, AI music creation is only a small first attempt, but that does not stop it from drawing on its own technological accumulation, its main business, and the strengths of its industry. Beyond commercial factors, enterprises also take on different levels of social value and will therefore define AI art from different perspectives, which will affect the development of AI art to different degrees.
Unexpectedly, AI gives us a new understanding of ourselves
In the future, Ping An will further expand the scenarios and fields of AI music applications, such as music appreciation, music education, and music therapy. In addition, artificial intelligence technology will reach further into other dimensions of human thought and expression, such as painting and poetry.
It's not hard to imagine that AI has not only changed our ability to create, but also raised questions that call for key technological breakthroughs. In the future, how can AI expand human creativity? How can technology be used to expand the boundaries of art and enrich its diversity? AI can draw and arrange music, but can its work be as moving as a human creation?
Speaking about how AI is transforming the art industry, the project's technical leader said that using AI for intelligent creation can actually help composers and artists create more efficiently and attempt works and styles that would otherwise be out of reach. Even so, the human factor remains the most central and important link in artistic creation.
This answer points to the largest space for the future development of artificial intelligence: the more AI achieves, the higher the standard for human creativity becomes. Coming as close as possible to human thought, with its richness of thinking and leaps of imagination, is the greatest difficulty facing AI technology. In more and more art fields, AI technology has greatly lowered the threshold for access to art and has allowed art to enter life and industry in more diverse forms, even advancing, to some extent, the development of the human spiritual world.
In fact, AI has long been carrying history forward: the Palace Museum has become an Internet celebrity thanks to AI technology; AI, 3D, and VR technology let the history in "Along the River During the Qingming Festival" "really" flow before viewers' eyes; and Notre Dame Cathedral, damaged by fire, will also find a new "self" through AI technology.
The AI symphonic variation "My Motherland and Me" is likewise a perfect fusion of romantic art and rigorous science.
We also see that, across the long scroll of the 70 years since the founding of New China, many of the country's technologies have achieved revolutionary breakthroughs. From early nuclear technology based on cybernetics, to supercomputer technology that broke through the blockade, to manned spaceflight and satellite technology, each marked a significant leap in national strength.
After 70 years of industrialization and informatization, we have entered a new era of intelligence. Ping An's AI symphonic variation "My Motherland and Me" has overcome technical barriers and, to some extent, brought artificial intelligence technology into new territory, and it is bound to leave a strong mark on this great historical moment.
Source | Lei Feng Network. Please credit the source when reprinting.