Retrieval Augmentation Technology of Knowledge Graph | May 7 TF131 registration

author:CCFvoice

In this session, we have invited leaders of retrieval augmentation research from leading companies, including 360 Artificial Intelligence Research Institute, Tencent AI Lab, Alibaba Tongyi Lab, and NetEase Youdao QAnything, to discuss the latest developments in retrieval augmentation technology and the opportunities and challenges of integrating knowledge graphs with it, and to take a closer look at the technical breakthroughs, corresponding solutions, and real-world cases. Join the discussion.

Provide a top-level communication platform for engineers

CCF TF No. 131

Time: May 7, 2024, 19:00-21:30

(Online Meeting)

Topic: Retrieval Augmentation Technology of Knowledge Graph

Please scan the QR code for more information and register for the online conference

Registration link: https://ccf.org.cn/TF131

Retrieval-Augmented Generation (RAG) significantly improves the performance of large language models on knowledge-intensive tasks by retrieving document content related to the question through text-similarity computation and supplying it to the model. However, the naive RAG approach still has many limitations. For example: 1) Weak complex reasoning: similarity retrieval alone struggles with tasks that require multi-hop or otherwise complex reasoning. 2) Isolated textual representations: similarity only indicates that texts are relevant, and cannot capture the specific relationships between them or why they are related. 3) Susceptibility to noise: retrieving incorrect or noisy information may lead to incorrect answers. 4) Lack of multimodal data fusion: images, tables, and other data in documents are not fully used.
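The similarity-based retrieval step described above, and the first limitation in particular, can be illustrated with a minimal sketch. It assumes a bag-of-words cosine similarity in place of a real embedding model, and the corpus, function names, and prompt template are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents ranked by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine_similarity(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Place the retrieved documents in front of the question (assumed template)."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical toy corpus, for illustration only.
DOCS = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "The Eiffel Tower is located in Paris.",
]

if __name__ == "__main__":
    print(build_prompt("Which country was Marie Curie born in?", DOCS))
```

With this toy corpus, the two-hop question "Which country was Marie Curie born in?" retrieves the Marie Curie sentence first, but the second slot goes to the Eiffel Tower sentence (it shares the word "in" with the question) rather than to the Warsaw sentence needed for the second hop. A real embedding model would behave differently in detail, but the sketch shows why similarity alone does not guarantee that the retrieved passages connect into a reasoning chain.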

As a structured knowledge base that has been deeply processed and verified, a knowledge graph (KG) can provide timely, reliable information and clear logical reasoning paths. Combining KG with RAG has great potential to mitigate the limitations of existing RAG approaches and to provide more accurate, context-aware, and nuanced responses.
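As a rough illustration of what a "clear logical reasoning path" can look like, the sketch below stores a few facts as (head, relation, tail) triples and uses breadth-first search to find an explicit chain of relations between two entities. The triples, relation names, and function names are hypothetical, and this is only one simple way to query a graph, not a method attributed to any of the speakers.

```python
from collections import deque

# Hypothetical toy triples (head, relation, tail), for illustration only.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Eiffel Tower", "located_in", "Paris"),
]


def build_graph(triples):
    """Index triples by head entity: head -> list of (relation, tail)."""
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))
    return graph


def find_path(graph, start, goal, max_hops=3):
    """Breadth-first search for a chain of triples from start to goal.

    Returns a list of (head, relation, tail) edges, or None if no path
    with at most max_hops edges exists.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


def path_to_context(path):
    """Render a path as text lines that could be added to a prompt."""
    return "\n".join(f"{h} --{r}--> {t}" for h, r, t in path)


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    path = find_path(graph, "Marie Curie", "Poland")
    if path is not None:
        print(path_to_context(path))
```

Unlike a list of similar passages, the returned path states which relations link the two entities, so the model receives both the facts and the reason they are connected.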

Against this background, this session has invited RAG technical leaders from several well-known companies to share cutting-edge progress and enterprise deployment practice, covering current development trends in RAG technology, RAG research paradigms, and the optimization of RAG's key techniques. The event aims to build a high-level, diverse communication platform that offers a useful reference for RAG researchers and application developers at all levels.

2. Meeting agenda

TF131: Retrieval Augmentation Technology of Knowledge Graph

Moderator: Wang Haofen, Chairman of CCF TF Knowledge Graph SIG, Distinguished Researcher of the Hundred Talents Program of Tongji University

Time | Topic | Speaker
19:00-19:05 | Event introduction
19:05-19:35 | Document comprehension and knowledge base construction practice in RAG implementation | Liu Huanyong, Senior Algorithm Expert, 360 Artificial Intelligence Research Institute
19:35-20:05 | Retrieval-Augmented Generation? Retrieval Is Generation! | Cai Deng, Senior Researcher, Tencent AI Lab
20:05-20:35 | GTE-Embedding/Ranking: Unified Text Representation and Ranking Model | Zhang Yanzhao, Algorithm Engineer, Alibaba Tongyi Laboratory
20:35-21:05 | Sharing Youdao QAnything's deployment experience | Lin Hui, Technical Director, NetEase Youdao
21:05-21:20 | Audience Q&A and discussion
21:20-21:30 | Closing summary

3. SIG

CCF TF Knowledge Graph SIG

4. Invited speakers

Liu Huanyong

Senior algorithm expert of 360 Artificial Intelligence Research Institute

Topic: Document comprehension and knowledge base construction practice in RAG implementation

Topic Introduction:

Retrieval-augmented generation (RAG) for large-model Q&A has become an important paradigm for putting large models into production. It is widely used but still faces many challenges. This talk will focus on our team's exploration and practice in document understanding and knowledge base construction, including an end-to-end OCR-free approach and a pipeline-based integrated approach, and will share some experience in using knowledge graph structures to organize documents.

Biography:

Liu Huanyong leads the knowledge graph and document cross-modal algorithm direction at the 360 Artificial Intelligence Research Institute and previously worked at the Institute of Software, Chinese Academy of Sciences. His main research interests are document understanding and knowledge enhancement. In recent years, he has participated in the research, development, and deployment of projects such as the 360 document model, the 360 intelligent brain model, the 360 encyclopedia knowledge graph, the commercial advertising knowledge graph, and right recommendation. He has applied for more than 10 invention patents, published several papers, and released more than 70 open-source projects.

Cai Deng

Senior Researcher at Tencent AI Lab

Topic: Retrieval-Augmented Generation? Retrieval Is Generation!

Topic Introduction:

The combination of retrieval-augmented generation (RAG) and large language models (LLMs) continues to attract attention from academia and industry. This talk will introduce CoG, a language model architecture that directly replaces generation with retrieval. Like traditional language models, CoG produces text in a left-to-right autoregressive manner. The difference is that a traditional language model repeatedly picks the next token from a fixed vocabulary (next-token prediction), whereas CoG retrieves the next phrase from an explicit external memory at each step (next-phrase retrieval). Our analysis shows that, compared with traditional language models, CoG has advantages in accuracy, interpretability, scalability, and efficiency, and our prototype experiments also verify its effectiveness. As a new language model architecture, CoG is worth further exploration.
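To make the contrast with next-token prediction concrete, here is a toy sketch of a left-to-right loop that appends a whole retrieved phrase at each step. It is not the CoG architecture: the abstract does not say how phrases are scored, so the word-overlap heuristic, the memory layout, the `<eos>` marker, and all names below are assumptions made purely for illustration.

```python
# Hypothetical external memory. Each entry pairs the context a phrase
# followed in some source text with the phrase itself.
MEMORY = [
    ("", "Retrieval-augmented generation"),
    ("Retrieval-augmented generation", "supplies a language model"),
    ("supplies a language model", "with external documents"),
    ("with external documents", "<eos>"),
]


def overlap(a: str, b: str) -> int:
    """Number of distinct lowercase words shared by two strings."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def next_phrase(context: str, memory) -> str:
    """Retrieve the phrase whose stored context best matches the last
    four words of the current context (toy scoring, assumed)."""
    tail = " ".join(context.split()[-4:])
    best = max(memory, key=lambda entry: overlap(tail, entry[0]))
    return best[1]


def generate(prompt: str, memory, max_steps: int = 10) -> str:
    """Autoregressive loop: each step appends a multi-word phrase
    retrieved from memory instead of a single predicted token."""
    text = prompt
    for _ in range(max_steps):
        phrase = next_phrase(text, memory)
        if phrase == "<eos>":
            break
        text = f"{text} {phrase}".strip()
    return text


if __name__ == "__main__":
    print(generate("", MEMORY))
```

In this toy run, three retrieval steps produce a nine-word sentence, whereas a token-by-token loop would need one step per token. Because every emitted phrase comes from a specific memory entry, each part of the output can also be traced back to its source.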

Biography:

Cai Deng is a senior researcher at Tencent AI Lab and was selected for the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology. He received his Ph.D. from The Chinese University of Hong Kong in 2022. His research interests include natural language processing and machine learning, especially the fusion of deep learning models with explicit external memory, and semantic symbolic representation and inference. He has published more than 30 papers at top international conferences and in journals such as ACL, EMNLP, NAACL, NeurIPS, ICLR, and AAAI, with more than 2,000 citations on Google Scholar. He received the ACL Outstanding Paper Award as first author and has presented tutorials on cutting-edge topics at top international conferences such as IJCAI and SIGIR.

Zhang Yanzhao

Algorithm Engineer of Alibaba Tongyi Laboratory

Topic: GTE-Embedding/Ranking: Unified Text Representation and Ranking Model

Topic Introduction:

Retrieval-augmented generation (RAG) is an effective way to address problems of large models such as hallucination and outdated knowledge. Within a RAG system, the text representation model and the deep ranking model are important modules for improving the accuracy and generalization of retrieval. This talk will share the exploration, ideas, and experience behind Alibaba's open-source GTE series of general text representation and ranking models. It will also discuss the influence of large models on text representation and ranking models, along with the open problems and future directions of both.
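The division of labor between a representation model and a ranking model can be sketched as a two-stage pipeline: a cheap first stage compares independently computed vectors to recall candidates, and a second stage scores each query-document pair jointly to reorder them. The sketch below uses bag-of-words vectors and a bigram-match score as stand-ins for the two models. It does not use or reproduce the GTE models, and every function name and document in it is hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Stand-in for a text representation model: the query and each
    document are encoded independently (here, as word counts)."""
    return Counter(tokens(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rerank_score(query: str, doc: str) -> float:
    """Stand-in for a ranking model: looks at query and document
    together and returns the fraction of query bigrams found in the doc."""
    q, d = tokens(query), tokens(doc)
    q_bigrams = list(zip(q, q[1:]))
    if not q_bigrams:
        return 0.0
    d_bigrams = set(zip(d, d[1:]))
    return sum(1 for b in q_bigrams if b in d_bigrams) / len(q_bigrams)


def search(query: str, docs: list[str], recall_k: int = 2, final_k: int = 2) -> list[str]:
    """Stage 1: recall recall_k candidates by vector similarity.
    Stage 2: reorder those candidates with the pairwise scorer."""
    q_vec = embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:recall_k]
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:final_k]


# Hypothetical toy corpus, for illustration only.
DOCS = [
    "A ranking model reorders candidate text passages.",
    "This text ranking model scores query and passage pairs.",
    "The weather is sunny today.",
]

if __name__ == "__main__":
    for doc in search("text ranking model", DOCS):
        print(doc)
```

In this toy corpus the first stage prefers the shorter first sentence, while the second stage promotes the sentence that contains the query words in order. The design point is that the second stage can afford a more expensive pairwise comparison because it only sees the few candidates recalled by the first.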

Biography:

Zhang Yanzhao is an algorithm engineer at the Machine Intelligence Laboratory of Alibaba Tongyi Laboratory. He joined Alibaba after graduating from Beihang University with a master's degree in 2022 and has been engaged in natural language processing research and its industrial application. He has ranked first multiple times on leaderboards such as MS MARCO, TREC, and MTEB.

Lin Hui

Technical Director of NetEase Youdao

Topic: Sharing Youdao QAnything's deployment experience

Topic Introduction:

At the beginning of 2024, Youdao open-sourced its self-developed RAG engine QAnything, which has attracted considerable attention and has nearly 10,000 stars so far. Its embedding and rerank models achieve state-of-the-art (SOTA) performance in cross-lingual settings and have accumulated millions of downloads on Hugging Face. Before being open-sourced, QAnything had already been applied in a series of projects, including document Q&A in Youdao Translation, Youdao Speed Reading, internal sales assistance, college counseling, and Teacher P. After the open-source release, we also took on a number of B2B projects. In this talk, I will introduce some of QAnything's deployments and share our understanding of RAG and our experience in improving accuracy.

Biography:

Lin Hui graduated from the Institute of Computing Technology, Chinese Academy of Sciences in 2011 and then joined NetEase Youdao, where he is one of the founding members of Youdao AI Lab and the Youdao Zhiyun Department. He has been responsible for the research and development of Youdao's computational advertising recommendation algorithms, image recognition, speech recognition, speech synthesis, image translation, document analysis and document translation, Youdao digital human, QAnything, and other projects. He currently heads the Youdao Zhiyun Department, which runs the B2B business for Youdao AI solutions and explores the productization and commercialization of LLMs. In 2023, he initiated the research and development of the Youdao QAnything RAG engine and led a series of deployment projects.

5. Chair of the SIG

Wang Haofen

Chairman of the CCF TF Knowledge Graph SIG and Distinguished Researcher at Tongji University

Profile: Wang Haofen is a Distinguished Researcher and doctoral supervisor at Tongji University and one of the initiators of OpenKG, the world's largest Chinese open knowledge graph alliance. He has led or participated in a number of national AI-related projects and has published more than 100 high-level papers in the field of AI, which have been cited more than 3,100 times, with an H-index of 28. He built the world's first interactive virtual idol, "Amber Void Face", and the intelligent customer service robot he built has served more than 1 billion users. He currently serves as deputy director of the Terminology Working Committee of the China Computer Federation (CCF), secretary-general of its Natural Language Processing Committee, chairman of TF SIG KG, and secretary-general of its Shanghai branch; as a director of the Chinese Information Processing Society of China and deputy secretary-general of its Language and Knowledge Computing Committee; as deputy director of the Natural Language Processing Committee of the Shanghai Computer Federation; and as secretary-general of the AI Alumni Association of Shanghai Jiao Tong University.

6. Upcoming Events

Session | Date | SIG | Topic | Format
TF132 | May 16 | Architecture | Cloud-native architecture in the AI era | Online
TF133 | May 23 | Intelligent front-end | The front end of the intelligent era: new productivity and new experience | Online
TF134 | May 30 | Smart manufacturing | Discussion on the application scenarios of large models in industrial intelligence | Online

7. Participation notes

1. If you are unable to participate after registration, please send an email to apply for cancellation before the start of the event (contact email: [email protected]), as unexcused absence will affect your participation in the next event.

2. The event will be held online in Tencent Meeting and will also be broadcast live on the CCF video account "China Computer Federation". (Note: Tencent Meeting is limited to 100 participants. If you are unable to enter the meeting, you can participate through the live broadcast.)

3. The meeting link and password will be notified by email and SMS on the day of the event. You can click on the Tencent Meeting link and enter your password to participate.

4. Please complete the registration before 15:00 the day before the event and get the conference link in time.

5. CCF members participate for free, and non-members can also join online events for free. Members can join online events free of charge throughout the year.

8. Membership Benefits

Members can participate in 20 CCF TF online events for free throughout the year and in 14 offline events at a discounted price. Membership is a cost-effective investment in your own technical growth and a good way to gain professional knowledge.

  • Professional Member/Senior Member/Distinguished Member/Fellow: 360 RMB/year
  • Student Member: 50 RMB/year
  • For specific benefits, please click to view: CCF Individual Membership Benefits
  • Apply for corporate membership to enjoy more free places, brand promotion, and other benefits. Please click to view: CCF Corporate Membership Benefits, or call 0512-65900856 ext. 27

9. How to register

May 7, 2024 (Tuesday) 19:00-21:30

Registration link: https://ccf.org.cn/TF131

Contact

E-mail: [email protected]

Tel: 0512-6590 0856 ext. 27