
ChatGPT ignites the vector database race: Zilliz Cloud's cloud service is now released!

Author: CSDN

Since OpenAI released ChatGPT in November last year, the AI market has exploded: technology companies at home and abroad have released their own large AI models, domestic large models have seen unprecedented opportunities, and the "battle of a hundred models" is in full swing. At the just-concluded World Artificial Intelligence Conference, companies from China and abroad joined in, more than 30 large models were unveiled together, the "national team" lineup was settled, and work on national standards for large-model testing was officially launched, setting off a new wave of large-model enthusiasm. As the "memory of large models" and a key part of the new paradigm of AIGC application development, vector databases have evolved to an unprecedented level of attention.

Since it was officially open-sourced in 2019, Milvus has grown into the world's largest and most active open-source vector database project and developer community. As the developer and operator behind Milvus, Zilliz has stayed at the forefront of the field, committed to providing developers with easy-to-use, cost-effective vector database services. After five years of continuous refinement, Zilliz Cloud, a fully managed vector database cloud service built on Milvus, has now launched in China.


After continuous development and upgrades, Zilliz Cloud has become a leader in the vector database race, and its full availability as a cloud service in China opens a new era of rapid vector database development. For the launch of this service in China, Zilliz's mission and goals are especially clear:

  • Provide the world's most professional fully managed vector database cloud service.
  • End the awkward situation in which vector database services are concentrated in North America with none available in China.
  • Meet multi-cloud requirements for vector database services and avoid being locked into a single cloud environment.
  • Make a unified vector database service and architecture feasible for cross-border businesses.
  • Unify interface standards across open-source Milvus, SaaS, and PaaS, enable seamless migration between on-premises and cloud deployments, and greatly reduce the overall cost of hybrid deployment.
  • Provide products and solutions that are more cost-effective and offer more stable service support than open-source Milvus.

Mature and stable: the world's first vector database service to support billion-scale vectors

Since going open source, Milvus has been the first choice for enterprise users building their own vector data platforms. Its full technical stack has been adopted by tens of thousands of enterprises, and leading companies such as Baidu, Sina, Li Auto, Huatai Securities, Walmart, LINE, and BIGO have repeatedly validated it in practice and successfully put it into production.


Vector databases are an important complement to large AIGC models and a key carrier of accurate, reliable, and highly scalable long- and short-term "memory". Vector database projects have sprung up over the past year; however, most support vector data only on the order of tens of millions and lack the capacity to serve production environments at larger scales.

In contrast, Milvus' customer applications over the past five years span a wide range of industries, and as early as 2021 it achieved stable support for billion-scale online services. Today, Zilliz Cloud's vector database service comfortably supports more than a billion vectors with up to 99.9% availability.

In addition, behind the products and technology, Zilliz has one of the world's most experienced teams of vector database experts and can assign four technical support engineers to each enterprise user. "No one knows vector databases better than we do" is the team's commitment to the open-source community and commercial users.

High performance and high cost-effectiveness: performance far beyond comparable products

Mainstream vector indexing algorithms today are in-memory or memory/SSD hybrids, and their computational core is mainly matrix computation (similar to HPC); large-scale vector retrieval and analysis is a compute- and memory-intensive task. This means that vector databases, as infrastructure, are especially sensitive to performance and cost.
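To see why retrieval cost grows with both data scale and vector dimensionality, here is a minimal pure-Python sketch of brute-force similarity search; the function names are illustrative, not part of any Milvus or Zilliz Cloud API. Every query scans all n stored vectors of dimension d, i.e. O(n * d) multiply-adds, which is exactly the cost that ANN indexes and hardware-tuned kernels exist to reduce.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, top_k=3):
    # Score every stored vector against the query, then sort.
    # This is O(n * d) similarity work per query, the baseline that
    # approximate (ANN) indexes trade a little recall to avoid.
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:top_k]

# Example: the exact match ranks first.
hits = brute_force_search([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]], top_k=2)
```

At billions of vectors this linear scan becomes infeasible, which is why production systems rely on index structures and custom kernels instead.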

In terms of performance, Zilliz Cloud far outperforms comparable products in both QPS and query latency. We compared four common options, Zilliz Cloud, Milvus, Pinecone, and Elastic Cloud, with identical resources across six sets of vector query tasks. (Elastic Cloud is not strictly a vector database, but it has vector capabilities and the widest audience in traditional text retrieval, so it can stand in for traditional databases that support vector search.) The test framework is open source; see VectorDBBench and its Leaderboard for details.

The results of the comparison are as follows:

In terms of query throughput, Zilliz Cloud outperformed Pinecone across all six query tasks, with more than twice the throughput on average, and nearly doubled the throughput of Milvus, an eye-catching result. For Elastic Cloud, a representative of traditional text retrieval services whose vector querying is a supplementary capability, QPS stayed below 50 on all six tasks.


In terms of query latency, Zilliz Cloud stays below 10 ms overall, Milvus below 20 ms, and Pinecone between 20 and 40 ms, while Elastic Cloud lags further behind.


In terms of cost-effectiveness, the main metric is queries per dollar: the number of query requests that a unit of cost can sustain under high concurrency. Zilliz Cloud's advantage over Pinecone and Elastic is clear: on tasks Q1 and Q2 the metric is up to an order of magnitude higher than second-place Pinecone, and roughly three times higher on the remaining four tasks. (Because Milvus is an open-source solution and hard to compare with commercial services under the same standard, it was excluded from this set of tests.)
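The queries-per-dollar metric is simple arithmetic, sketched below; the function name and the sample figures in the usage line are hypothetical illustrations, not numbers from the benchmark above.

```python
def queries_per_dollar(qps, hourly_cost_usd):
    # Sustained queries served per dollar of instance cost:
    # (queries per hour) / (dollars per hour).
    return qps * 3600 / hourly_cost_usd

# Hypothetical example: an instance sustaining 100 QPS at $3.60/hour
# serves 100,000 queries per dollar.
qpd = queries_per_dollar(100, 3.60)
```

Because the metric divides throughput by price, a system can win either by raising QPS at the same cost or by delivering the same QPS on cheaper resources.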


Backed by deep optimization, software and hardware performance soars and the new kernel fires on all cylinders

Zilliz Cloud uses a commercial engine whose overall performance is more than double that of the open-source Milvus engine. The engine is deeply optimized for typical scenarios, where it can improve performance by 3-5x.

At the hardware level, Zilliz has long-term, stable partnerships with leading hardware vendors such as NVIDIA and Intel, and its vector algorithm kernels have been custom-optimized for x86, ARM, and GPU.

At the software level, Zilliz Cloud has launched AutoIndex. This intelligent index continuously and automatically tunes itself based on the user's vector dimensionality, data scale, data distribution, and query characteristics, sparing users the pain of index type selection and parameter tuning. In Zilliz's internal tests, AutoIndex reached 84% of the effect achieved by vector database experts tuning by hand, far exceeding what an average user achieves. In the next stage, AutoIndex will be greatly enhanced, allowing users to specify a recall target for optimization and ensuring the index runs at its optimal operating point for the specified query accuracy.
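The article does not disclose how AutoIndex works internally, but the core idea of recall-targeted tuning can be sketched as a selection over measured candidate configurations: keep those that meet the recall target, then pick the fastest. Everything below (function name, candidate tuples) is an illustrative toy, not Zilliz's implementation.

```python
def auto_tune(candidates, recall_target):
    # candidates: list of (params, measured_recall, measured_latency_ms)
    # for index configurations already benchmarked on the user's data.
    feasible = [c for c in candidates if c[1] >= recall_target]
    if not feasible:
        # No configuration meets the target: fall back to highest recall.
        return max(candidates, key=lambda c: c[1])[0]
    # Among configurations meeting the recall target, pick the fastest.
    return min(feasible, key=lambda c: c[2])[0]

# Toy candidates: (params label, recall, latency in ms).
choice = auto_tune(
    [("hnsw_a", 0.90, 5.0), ("hnsw_b", 0.97, 12.0), ("ivf_c", 0.80, 2.0)],
    recall_target=0.95,
)
```

A real auto-indexer would also have to estimate recall and latency cheaply from data statistics rather than exhaustively benchmarking, which is where the difficulty actually lies.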

Of course, Zilliz Cloud has also launched dedicated feature support for the recent surge of AIGC applications:

  • Dynamic schema: flexibly extend vector features or label fields as AIGC applications iterate.
  • Partition key: a powerful tool for multi-tenant knowledge bases in AIGC applications, reducing overall cost by 2-3 orders of magnitude compared with creating a separate table per tenant.
  • JSON type support: combine the two capabilities of JSON and embeddings to achieve mixed data representations and complex business logic based on JSON plus embedding vectors.
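The partition-key idea behind multi-tenant knowledge bases can be illustrated with a toy in-memory store; this is a conceptual sketch, not the Milvus/Zilliz Cloud API. One shared collection holds all tenants' data, records are grouped by a partition key, and each search touches only one tenant's partition instead of requiring a separate table per tenant.

```python
from collections import defaultdict

class PartitionedVectorStore:
    """Toy multi-tenant store: one shared collection, records grouped
    by a partition key (tenant id) instead of one table per tenant."""

    def __init__(self):
        # partition key -> list of (record id, vector) pairs
        self._partitions = defaultdict(list)

    def insert(self, tenant_id, rec_id, vector):
        self._partitions[tenant_id].append((rec_id, vector))

    def search(self, tenant_id, query, top_k=2):
        # Only the tenant's partition is scanned, so query cost scales
        # with one tenant's data rather than the whole collection.
        def dot(v):
            return sum(x * y for x, y in zip(query, v))
        hits = [(rid, dot(vec)) for rid, vec in self._partitions[tenant_id]]
        hits.sort(key=lambda t: t[1], reverse=True)
        return hits[:top_k]
```

Because all tenants share one schema and one set of indexes, the fixed per-table overhead (metadata, index structures, background compaction) is paid once rather than once per tenant, which is where the claimed cost reduction comes from.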

Break the "CAP" impossible triangle and give users flexible choices

Vector database technology is not yet perfect: a business usually has to trade off among cost, query accuracy, and query performance, the "CAP" problem of vector databases. For now CAP remains an impossible triangle, and Zilliz's approach is to offer a locally optimal solution at each typical point and let users choose flexibly.

In practice, common user scenarios can be grouped into performance-demanding, capacity-demanding, and cost-sensitive. Accordingly, Zilliz Cloud offers three types of vector database instance: performance-optimized, capacity-optimized, and cost-effective. The instance types combine different algorithms and hardware resources and suit different business scenarios.

  • Performance-optimized instances suit vector similarity retrieval scenarios that require low latency and high throughput, guaranteeing millisecond-level responses.

Applicable scenarios for performance-optimized instances include, but are not limited to: generative AI, recommendation systems, search engines, chatbots, content moderation, LLM-augmented knowledge bases, and financial risk control.

  • Capacity-optimized instances support five times the data volume of performance-optimized instances at slightly higher query latency, making them suitable for scenarios that need large amounts of storage, especially those handling more than ten million vectors.

Applicable scenarios for capacity-optimized instances include, but are not limited to: large-scale unstructured data (such as text, images, audio, video, and medicinal chemical structures), infringement detection, and biometric authentication.

  • Cost-effective instances support the same data scale as capacity-optimized instances at roughly 70% of the price, with slightly reduced performance; they suit budget-sensitive scenarios that prioritize cost-effectiveness.

Applicable scenarios for cost-effective instances include, but are not limited to: data labeling, data clustering, data deduplication, data anomaly detection, and balancing class distribution in training sets.

Full ecosystem coverage for large models and unstructured data processing

No single system can meet all of a user's business needs, and vector databases are no exception. Services backed by a vector database often involve several processes, including:

  • Semantic structuring of business data, such as extracting title embeddings, paragraph embeddings, primary and secondary topics, and reading time from text;
  • Model selection for end-to-end results, such as finding the embedding model that yields the best outcome;
  • Integration of the model with the vector database, such as recalling raw data via vector database queries and then having an LLM summarize or rewrite the recalled content.
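The three processes above form the skeleton of a retrieval-augmented pipeline, sketched below in pure Python. The `embed` function is a deterministic stand-in for a real embedding model (an assumption made only so the example runs); in production it would be a call to an embedding service, and the recall step would be a vector database query.

```python
def embed(text, dim=8):
    # Stand-in for a real embedding model: a toy hash-style vector,
    # deterministic and normalized, used only to make the sketch runnable.
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch)
    norm = sum(x * x for x in vec) ** 0.5 or 1.0
    return [x / norm for x in vec]

def retrieve(query, docs, top_k=1):
    # Vector-database-style recall: rank stored passages by similarity
    # to the query embedding.
    q = embed(query)
    scored = [(d, sum(a * b for a, b in zip(q, embed(d)))) for d in docs]
    scored.sort(key=lambda t: t[1], reverse=True)
    return [d for d, _ in scored[:top_k]]

def build_prompt(query, docs):
    # The recalled passages become context for the downstream LLM call,
    # which summarizes or rewrites them into the final answer.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Swapping the toy `embed` for different real models is exactly the "model selection" step: the pipeline shape stays fixed while the embedding model is varied and end-to-end quality is measured.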

To further reduce the cost of building applications and provide standardized components, Zilliz Cloud offers developers support on two fronts:

  • Large-model ecosystem integration. In March 2023, Zilliz, as OpenAI's first vector database partner, completed plug-in integration of Milvus and Zilliz Cloud and was included in the list of officially recommended vector database plugins. Beyond that, Zilliz is deeply integrated with popular projects such as LangChain, Cohere, LlamaIndex, Auto-GPT, and BabyAGI. Integration with domestic large models such as Wenxin Yiyan, Tongyi Qianwen, Zhipu AI, MiniMax, and 360 Zhinao is under way, with more results to be released soon.
  • Unstructured data processing pipelines. Zilliz Cloud provides Towhee, an open-source tool framework. Developers can write their own pipelines in a familiar Python environment using Spark-like operator syntax and easily handle ETL for unstructured data such as text, images, audio, video, and compound structures. Towhee also provides automated orchestration tools that package pipelines into service images with one click, based on Triton, TensorRT, ONNX, and a series of hardware-accelerated algorithms, targeting typical scenarios such as approximate text search, intelligent question answering, and knowledge bases. Towhee additionally offers deeply optimized standard pipelines.

Zilliz Cloud currently offers both SaaS and PaaS services: SaaS covers AWS, GCP, and Alibaba Cloud, while PaaS covers AWS, GCP, Azure, Alibaba Cloud, Baidu Intelligent Cloud, Tencent Cloud, and Kingsoft Cloud. The Chinese official website has launched simultaneously; for more details and case studies, visit https://zilliz.com.cn (overseas site and cloud service portal: https://zilliz.com).

To accelerate the refinement of industry best practices, we will soon launch the "Looking for CVP Practice Stars in the AIGC Era" campaign. Zilliz will jointly select application scenarios with leading large-model vendors, and both sides will provide vector databases and top large-model experts to help users polish their applications, improve deployment results, and empower the business itself. If your application fits the CVP framework and you are concerned about implementation or real-world results, you can apply directly to receive professional help and guidance. (Note: C stands for LLMs as represented by ChatGPT, V for vector databases, P for prompt engineering; for contact information see [email protected].)

Half of 2023, the year of the AGI and LLM explosion, has already passed, and accelerating the path to large-model deployment is urgent. Broad industry consensus is hastening the arrival of the AI singularity: large models will rebuild enterprise applications and reshape the direction of the artificial intelligence industry. Zilliz says it will continue to focus on the frontier of vector database development, aim at the intelligent evolution of every industry, and provide enterprises and developers with the most competitive "large-model memory" for the era of large models.
