Shanghai, July 7, 2023 – SenseTime, a strategic partner of the 2023 World Artificial Intelligence Conference (WAIC), held its "Boundless Love, Daily Innovation" AI Forum, where it unveiled a comprehensive, multi-faceted upgrade of the SenseNova large model system, along with a series of product updates and deployment results built on it. SenseTime also showcased how its large-model technology has been applied with industry partners since the system's official release, including newly developed intelligent cockpit products and a vehicle-road-cloud collaborative transportation system, as well as real-world deployments in finance, healthcare, e-commerce, mobile devices, industrial parks and other industries.
Xu Li, Chairman and CEO of SenseTime, said at the product launch: "The breakthrough of large models has set off a new round of technological revolution in artificial intelligence, followed by explosive growth in industrial demand, with new application scenarios and application models emerging rapidly. Through 'large model + large device', SenseTime hopes to keep advancing AI infrastructure capabilities: not only building foundation models with more powerful general capabilities, but also efficiently integrating the expertise of different vertical fields to create professional large models that understand each industry better. This fundamentally lowers the downstream cost and barriers of applying large models, and lets their industrial value blossom across thousands of industries."
The name SenseNova signifies that "the speed of model iteration and the ability to solve problems are renewed day by day", and the model system is iterating rapidly under SenseTime's AGI strategy of "large model + large device". SenseChat 2.0, a natural language model with hundreds of billions of parameters, breaks through the input-length limits of large language models and is offered in versions of different parameter scales, so it can adapt to the requirements of different terminals and scenarios, such as mobile and cloud, while reducing deployment costs. SenseMirage 3.0, SenseTime's self-developed text-to-image model, has grown from 1 billion parameters at its initial release this April to 7 billion, enabling professional photography-grade image detail.
The SenseAvatar 2.0 digital human generation platform improves voice and lip-sync fluency by more than 30% over version 1.0, achieves 4K HD video output, and adds AIGC image generation and digital-human singing features. SenseSpace 2.0 improves spatial reconstruction efficiency by 20% and rendering performance by 50%; mapping a 100-square-kilometer scene can now be completed in only 38 hours with 1,200 TFLOPS of computing power. SenseThings 2.0 achieves millimeter-level fidelity in restoring the texture and material of small objects, and overcomes the difficulty of capturing highly reflective and specular objects.
Renewed day by day: multimodal capabilities empower industrial upgrading
Relying on the rapid iteration of the SenseNova large model system at the underlying technology level, SenseTime is actively empowering industrial upgrading through combinations of the system's multimodal capabilities, delivering many industry-leading breakthroughs.
In finance, SenseTime works with banks, insurers, securities firms and other customers, using digital humans for intelligent customer service and smart marketing, and adding new functions such as investment research analysis and research report writing through access to large language model capabilities, achieving cost reduction and efficiency gains. With a financial knowledge base attached, the system can also answer questions based entirely on the customer's own product documentation and keep that information up to date.
In medical scenarios, SenseTime has built "Big Doctor", a Chinese-language medical model trained on massive medical knowledge and clinical data. It provides multi-scenario, multi-turn dialogue capabilities such as triage guidance, consultation, health advice and decision support, and will soon support multimodal analysis across medical images, text and structured data, continuously improving medical language understanding and reasoning to raise hospitals' diagnosis-and-treatment efficiency and patient service quality.
Combining the capabilities of SenseChat 2.0 and SenseMirage 3.0, SenseTime also brings a variety of intelligent interaction solutions to mobile-device customers, including Q&A interaction for information retrieval, knowledge interaction for everyday scenarios, and content interaction for language and image generation; the lightweight versions of the models can be easily deployed and run on mobile devices. In addition, in "Three-Body: Beyond Gravity", the immersive sci-fi experience space SenseTime created based on Liu Cixin's award-winning novel "The Three-Body Problem", SenseTime uses large-model capabilities to push the boundaries of imagination and present a futuristic sci-fi voyage.
For offline scenarios, SenseTime applies large-model capabilities to power-grid inspection, offering intelligent solutions such as long-tail fault identification and complex defect judgment. Based on the spatial reconstruction of SenseSpace 2.0, SenseTime has built digital twins of real spaces for the regional development of Mashan Town in Jinan, China Vision Park in Hefei, and Shanghai Ruijin Hospital, improving operational and management efficiency. In the jewelry industry, SenseTime uses SenseThings 2.0 to digitally reproduce jewelry pieces for brands, showcasing craftsmanship in fine detail and enhancing the customer shopping experience.
SenseTime has also established strategic channel partnerships with a number of leading enterprises to build a "cloud + AIGC + short-video livestreaming" ecosystem, bringing more efficient, low-cost and easy-to-use AI video and marketing tools to the industry.
In the intelligent vehicle field, SenseTime's intelligent cockpit, intelligent driving, vehicle-road coordination and other industry applications have also pushed the boundaries of innovation with the support of large models. Inside the cockpit, SenseTime perceives user needs comprehensively through multimodal fusion of vision, hearing and other signals, records user habits and preferences as tagged data, and provides personalized services. At the same time, SenseTime uses the large model's environmental understanding, logical reasoning and content generation capabilities to build a "cabin brain" that better understands the user, along with digital humans whose appearance and voice can be rapidly customized for human-like interaction, delivering an intelligent cockpit experience that combines safety, entertainment, education and efficiency.
Outside the cabin, relying on the capabilities of "large model + large device", SenseTime deploys device-cloud collaboration with a unified traffic entry point, supporting private deployment and applications serving tens of millions of requests. At CVPR 2023, SenseTime and its joint laboratory proposed UniAD, a general autonomous-driving model that integrates perception and decision-making in a single architecture oriented toward the overall driving task; the work won the Best Paper Award and points to a new direction for autonomous-driving technology and the industry. Building on this, SenseTime is constructing a vehicle-road-cloud collaborative transportation system: it develops a roadside visual perception model from a multimodal, multi-task general large model, combines SenseSpace 2.0 and Lether 2.0 to build intelligent traffic digital twins and simulations, and uses the perceptual reasoning and human-computer interaction capabilities of SenseChat 2.0 to evolve vehicle-road-cloud systems toward conversational, large-model-driven interaction.
Amid the new wave of emergent intelligence, SenseTime has made large computing power and large models the cornerstone of its long-term competitiveness and innovation in the AGI era, not only launching multi-task general large models for different fields, but also laying a path for long-term innovation in basic scientific research and the large-scale application of generative AI. Looking ahead, the fundamental value of large models is to reshape the model of productivity and bring paradigm innovation to the industrial adoption of artificial intelligence. SenseTime is committed to moving beyond the limits of existing thinking, embracing change and innovating actively, and winning the future of the AGI era through efficient day-to-day technology R&D and scenario empowerment.