
Commit 8c80cdc

Readme optimize (#65)
* update readme for cn,en
* update readme for cn,en
* update readme for cn,en
* minor format tweak

---------

Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
1 parent 421090e commit 8c80cdc

3 files changed

Lines changed: 129 additions & 85 deletions


README.md

Lines changed: 69 additions & 43 deletions
@@ -1,60 +1,85 @@
11
# KAG: Knowledge Augmented Generation
22

3-
[中文版文档](./README_cn.md)
4-
[日本語版ドキュメント](./README_ja.md)
5-
6-
## 1. What is KAG
7-
8-
Retrieval Augmentation Generation (RAG) technology promotes the integration of domain applications with large language models. However, RAG has problems such as a large gap between vector similarity and knowledge reasoning correlation, and insensitivity to knowledge logic (such as numerical values, time relationships, expert rules, etc.), which hinder the implementation of professional knowledge services.
9-
10-
On October 24, 2024, OpenSPG released v0.5, officially releasing the professional domain knowledge service framework of knowledge augmented generation (KAG). The goal of KAG is to build a knowledge-enhanced LLM service framework in professional domains, supporting logical reasoning, factual Q&A, etc. KAG fully integrates the logical and factual characteristics of the KGs. Meanwhile, it uses OpenIE to lower the threshold for knowledgeization of domain documents and alleviates the sparsity problem of the KG through hybrid reasoning. As far as we know, KAG is the only RAG framework that supports logical reasoning and multi-hop factual Q&A. Its core features include:
11-
12-
* Knowledge and Chunk Mutual Indexing structure to integrate more complete contextual text information
13-
* Knowledge alignment using conceptual semantic reasoning to alleviate the noise problem caused by OpenIE
14-
* Schema-constrained knowledge construction to support the representation and construction of domain expert knowledge
15-
* Logical form-guided hybrid reasoning and retrieval to support logical reasoning and multi-hop reasoning Q&A
16-
17-
KAG is significantly better than NaiveRAG, HippoRAG and other methods in multi-hop Q&A tasks. The F1 score on hotpotQA is relatively increased by 19.6%, and the F1 score on 2wiki is relatively increased by 33.5%. We have successfully applied KAG to Ant Group's professional knowledge Q&A tasks, such as e-government Q&A and e-health Q&A, and the professionalism has been significantly improved compared to the traditional RAG method.
3+
<div align="center">
4+
<a href="https://spg.openkg.cn/en-US">
5+
<img src="./_static/images/OpenSPG-1.png" width="520" alt="openspg logo">
6+
</a>
7+
</div>
8+
9+
<p align="center">
10+
<a href="./README.md">English</a> |
11+
<a href="./README_cn.md">简体中文</a> |
12+
<a href="./README_ja.md">日本語版ドキュメント</a>
13+
</p>
14+
15+
<p align="center">
16+
<a href='https://arxiv.org/pdf/2409.13731'><img src='https://img.shields.io/badge/arXiv-2409.13731-b31b1b'></a>
17+
<a href="https://github.com/OpenSPG/KAG/releases/latest">
18+
<img src="https://img.shields.io/github/v/release/OpenSPG/KAG?color=blue&label=Latest%20Release" alt="Latest Release">
19+
</a>
20+
<a href="https://github.com/OpenSPG/KAG/blob/main/LICENSE">
21+
<img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
22+
</a>
23+
</p>
24+
25+
# 1. What is KAG?
26+
27+
KAG is a logical reasoning and Q&A framework built on the [OpenSPG](https://github.com/OpenSPG/openspg) engine and large language models. It is used to build logical reasoning and Q&A solutions for vertical domain knowledge bases. KAG mitigates the ambiguity of vector similarity calculation in traditional RAG and the noise that OpenIE introduces in GraphRAG. It supports logical reasoning and multi-hop factual Q&A, and performs significantly better than current SOTA methods.
28+
29+
The goal of KAG is to build a knowledge-enhanced LLM service framework in professional domains, supporting logical reasoning, factual Q&A, etc. KAG fully integrates the logical and factual characteristics of the KGs. Its core features include:
30+
31+
- Knowledge and Chunk Mutual Indexing structure to integrate more complete contextual text information
32+
- Knowledge alignment using conceptual semantic reasoning to alleviate the noise problem caused by OpenIE
33+
- Schema-constrained knowledge construction to support the representation and construction of domain expert knowledge
34+
- Logical form-guided hybrid reasoning and retrieval to support logical reasoning and multi-hop reasoning Q&A
1835

1936
⭐️ Star our repository to stay up-to-date with exciting new features and improvements! Get instant notifications for new releases! 🌟
2037

2138
![Star KAG](./_static/images/star-kag.gif)
2239

23-
## 2 Core Features
40+
# 2. Core Features
2441

25-
### 2.1 Knowledge Representation
42+
## 2.1 Knowledge Representation
2643

27-
In the context of private knowledge bases, unstructured data, structured information, and business expert experience often coexist. KAG references the DIKW hierarchy to upgrade SPG to a version that is friendly to LLMs. For unstructured data such as news, events, logs, and books, as well as structured data like transactions, statistics, and approvals, along with business experience and domain knowledge rules, KAG employs techniques such as layout analysis, knowledge extraction, property normalization, and semantic alignment to integrate raw business data and expert rules into a unified business knowledge graph.
44+
In the context of private knowledge bases, unstructured data, structured information, and business expert experience often coexist. KAG references the DIKW hierarchy to upgrade SPG to a version that is friendly to LLMs.
45+
46+
For unstructured data such as news, events, logs, and books, as well as structured data like transactions, statistics, and approvals, along with business experience and domain knowledge rules, KAG employs techniques such as layout analysis, knowledge extraction, property normalization, and semantic alignment to integrate raw business data and expert rules into a unified business knowledge graph.
2847

2948
![KAG Diagram](./_static/images/kag-diag.jpg)
3049

31-
This makes it compatible with schema-free information extraction and schema-constrained expertise construction on the same knowledge type (e.g., entity type, event type), and supports the cross-index representation between the graph structure and the original text block. This mutual index representation is helpful to the construction of inverted index based on graph structure, and promotes the unified representation and reasoning of logical forms.
50+
This makes it compatible with schema-free information extraction and schema-constrained expertise construction on the same knowledge type (e.g., entity type, event type), and supports the cross-index representation between the graph structure and the original text block.
51+
52+
This mutual index representation helps build an inverted index based on the graph structure, and promotes the unified representation of, and reasoning over, logical forms.
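As a rough illustration of the mutual index idea, the sketch below keeps a two-way mapping between graph entities and text chunks. The class name `MutualIndex`, its methods, and the sample identifiers are hypothetical and are not part of the KAG API; they only show how an entity lookup can lead back to the source text and vice versa.

```python
from collections import defaultdict


class MutualIndex:
    """Hypothetical two-way index between graph entities and text chunks."""

    def __init__(self):
        # entity name -> ids of chunks that mention it (inverted index)
        self.entity_to_chunks = defaultdict(set)
        # chunk id -> entities extracted from that chunk
        self.chunk_to_entities = defaultdict(set)

    def link(self, entity, chunk_id):
        """Record that `entity` was extracted from `chunk_id`."""
        self.entity_to_chunks[entity].add(chunk_id)
        self.chunk_to_entities[chunk_id].add(entity)

    def chunks_for(self, entity):
        """Return the chunks that mention an entity, in sorted order."""
        return sorted(self.entity_to_chunks.get(entity, set()))

    def entities_in(self, chunk_id):
        """Return the entities found in a chunk, in sorted order."""
        return sorted(self.chunk_to_entities.get(chunk_id, set()))


# Illustrative data only.
index = MutualIndex()
index.link("KAG", "chunk-1")
index.link("OpenSPG", "chunk-1")
index.link("KAG", "chunk-2")
```

With this structure, a graph hit on an entity can be expanded to its supporting text via `chunks_for`, and a text hit can be mapped back to graph nodes via `entities_in`.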
3253

33-
### 2.2 Mixed Reasoning Guided by Logic Forms
54+
## 2.2 Mixed Reasoning Guided by Logic Forms
3455

3556
![Logical Form Solver](./_static/images/kag-lf-solver.png)
3657

37-
KAG proposes a logically formal guided hybrid solution and inference engine. The engine includes three types of operators: planning, reasoning, and retrieval, which transform natural language problems into problem solving processes that combine language and notation. In this process, each step can use different operators, such as exact match retrieval, text retrieval, numerical calculation or semantic reasoning, so as to realize the integration of four different problem solving processes: Retrieval, Knowledge Graph reasoning, language reasoning and numerical calculation.
58+
KAG proposes a hybrid solving and reasoning engine guided by logical forms.
59+
60+
The engine includes three types of operators (planning, reasoning, and retrieval), which transform natural language problems into problem-solving processes that combine language and symbols.
3861

62+
In this process, each step can use a different operator, such as exact match retrieval, text retrieval, numerical calculation, or semantic reasoning. This integrates four problem-solving processes: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.
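The following toy sketch shows, under simplifying assumptions, how a plan can route each step to a different operator. The operator names, the `$name` reference convention, and the sample facts are all hypothetical and do not reflect the actual kg-solver implementation; in KAG the plan would be produced from a natural language question rather than written by hand.

```python
# Hypothetical toy knowledge graph: (subject, predicate) -> object.
KG = {
    ("Alice", "born_in"): 1990,
    ("Bob", "born_in"): 1985,
}


def exact_match(subject, predicate):
    """Retrieval operator: look up a fact by exact key."""
    return KG[(subject, predicate)]


def subtract(a, b):
    """Numerical calculation operator."""
    return a - b


# Operator registry; a fuller system would add text retrieval and semantic reasoning.
OPERATORS = {"retrieve": exact_match, "calc": subtract}


def solve(plan):
    """Run each step in order; arguments like "$a" refer to earlier results."""
    results = {}
    for name, op, args in plan:
        resolved = [
            results[a[1:]] if isinstance(a, str) and a.startswith("$") else a
            for a in args
        ]
        results[name] = OPERATORS[op](*resolved)
    return results


# Hand-written plan for "How many years older is Bob than Alice?"
PLAN = [
    ("a", "retrieve", ["Alice", "born_in"]),
    ("b", "retrieve", ["Bob", "born_in"]),
    ("diff", "calc", ["$a", "$b"]),
]

answer = solve(PLAN)["diff"]
```

The point of the sketch is the dispatch: retrieval and calculation steps share one plan format, so intermediate results flow from one operator type to another.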
3963

40-
## 3. Release Notes
64+
# 3. Release Notes
4165

42-
### 3.1 Released Versions
43-
* 2024.11.21 : Support docs upload, model invoke concurrency setting, User experience optimization, etc.
44-
* 2024.10.25 : KAG release
66+
## 3.1 Latest Updates
4567

46-
### 3.2 Future Plan
47-
* 2024.12 : domain knowledge injection, domain schema customization, QFS tasks support, Visual query analysis, etc.
48-
* 2025.01 : Logical reasoning optimization, conversational tasks support
49-
* 2025.02 : kag-model release, kag solution for event reasoning knowledge graph and medical knowledge graph
50-
* 2025.03 : Kag front-end open source, distributed build support, mathematical reasoning optimization
68+
* 2024.11.21 : Support for Word document upload, model invocation concurrency settings, user experience optimization, etc.
69+
* 2024.10.25 : KAG initial release
5170

71+
## 3.2 Future Plans
5272

53-
## 4. How to use it
73+
* domain knowledge injection, domain schema customization, QFS tasks support, Visual query analysis, etc.
74+
* Logical reasoning optimization, conversational tasks support
75+
* kag-model release, kag solution for event reasoning knowledge graph and medical knowledge graph
76+
* kag front-end open source, distributed build support, mathematical reasoning optimization
5477

55-
### 4.1 product-based (for ordinary users)
78+
# 4. Quick Start
5679

57-
#### 4.1.1 Engine & Dependent Image Installation
80+
## 4.1 product-based (for ordinary users)
81+
82+
### 4.1.1 Engine & Dependent Image Installation
5883

5984
* **Recommended System Version:**
6085

@@ -81,19 +106,20 @@ curl -sSL https://raw.githubusercontent.com/OpenSPG/openspg/refs/heads/master/de
81106
docker compose -f docker-compose.yml up -d
82107
```
83108

84-
#### 4.1.2 Use the product
109+
### 4.1.2 Use the product
85110

86111
Navigate to the default URL of the KAG product in your browser: <http://127.0.0.1:8887>
87112

88113
See the [Product](https://openspg.yuque.com/ndx6g9/wc9oyq/rgd8ecefccwd1ga5) guide for a detailed introduction.
89114

90-
### 4.2 toolkit-based (for developers)
115+
## 4.2 toolkit-based (for developers)
91116

92-
#### 4.2.1 Engine & Dependent Image Installation
117+
### 4.2.1 Engine & Dependent Image Installation
93118

94119
Refer to section 4.1.1 to complete the installation of the engine and dependent image.
95120

96-
#### 4.2.2 Installation of KAG
121+
### 4.2.2 Installation of KAG
122+
97123

98124
**macOS / Linux developers**
99125

@@ -117,33 +143,33 @@ Refer to the 3.1 section to complete the installation of the engine & dependent
117143
# Install KAG: cd KAG && pip install -e .
118144
```
119145

120-
#### 4.2.3 Use the toolkit
146+
### 4.2.3 Use the toolkit
121147

122148
Please refer to the [Quick Start](https://openspg.yuque.com/ndx6g9/wc9oyq/owp4sxbdip2u7uvv) guide for a detailed introduction to the toolkit. You can then use the built-in components to reproduce the performance results on the built-in datasets, and apply those components to new business scenarios.
123149

124-
## 5. Technical Architecture
150+
# 5. Technical Architecture
125151

126-
![Figure 1 KAG technical architecture](./_static/images/kag-arch.png)
152+
![KAG technical architecture](./_static/images/kag-arch.png)
127153

128154
The KAG framework includes three parts: kg-builder, kg-solver, and kag-model. This release covers only the first two; kag-model will be open-sourced gradually in future releases.
129155

130156
kg-builder implements a knowledge representation that is friendly to large language models (LLMs). Based on the DIKW hierarchy (data, information, knowledge, and wisdom), it upgrades the knowledge representation ability of SPG and is compatible with both schema-free information extraction and schema-constrained professional knowledge construction on the same knowledge type (such as entity type and event type). It also supports the mutual index representation between the graph structure and the original text block, which enables efficient retrieval during the reasoning and Q&A stage.
131157

132158
kg-solver uses a logical symbol-guided hybrid solving and reasoning engine that includes three types of operators (planning, reasoning, and retrieval) to transform natural language problems into a problem-solving process that combines language and symbols. In this process, each step can use a different operator, such as exact match retrieval, text retrieval, numerical calculation, or semantic reasoning. This integrates four problem-solving processes: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.
133159

134-
## 6. Contact us
160+
# 6. Contact us
135161

136162
**GitHub**: <https://github.com/OpenSPG/KAG>
137163

138164
**OpenSPG**: <https://spg.openkg.cn/>
139165

140166
<img src="./_static/images/openspg-qr.png" alt="Contact Us: OpenSPG QR-code" width="200">
141167

142-
# Differences between KAG, RAG, and GraphRAG
168+
# 7. Differences between KAG, RAG, and GraphRAG
143169

144170
**KAG introduction and applications**: <https://github.com/orgs/OpenSPG/discussions/52>
145171

146-
# Cite
172+
# 8. Citation
147173

148174
If you use this software, please cite it as below:
149175
