KongLongGeFDU/TransferTOD

TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities

Paper EMNLP 2024 Dataset Model

Note: For the Chinese version of this README, please refer to README_zh.md.

🔔 News

  • 🏆 [2024-09] Our paper has been accepted at EMNLP 2024 (Main).
  • 🤖 [2024-08] TransferTOD-7B model released on ModelScope.
  • 🎉 [2024-07] Our paper is released on arXiv: arXiv:2407.21693.

📚 Overview

TransferTOD is a generalizable Chinese multi-domain Task-Oriented Dialogue (TOD) system with strong transfer capabilities to unseen domains. The dataset and the released TransferTOD-7B model are designed to handle real-world TOD use cases — slot filling, intent reasoning, and graceful out-of-domain generalization — within a unified framework.

The dataset spans 30 domains (27 in-domain + 3 held-out OOD: Water Delivery, Sanitation, Courier), and is paired with a two-stage fine-tuning recipe that first injects general TOD ability and then sharpens transfer to specific deployments.

✨ Highlights

  • 🌐 30 domains with 188 slot types in total — one of the largest publicly available Chinese multi-domain TOD datasets
  • 💬 35,965 turns across 5,460 dialogues, with separate In-Domain and Out-of-Domain test splits
  • 🤖 TransferTOD-7B model open-sourced on ModelScope, ready for downstream deployment
  • 🔁 Two-stage fine-tuning recipe balancing general dialogue ability with task-specific transfer
  • 🔬 Strong generalization to unseen domains (OOD test on Water Delivery, Sanitation, Courier)

📊 Dataset Statistics

| 📌 Statistics | Train | ID Test | OOD Test |
|---|---|---|---|
| 🌐 # Domains | 27 | 27 | 3 |
| 🎯 # Slots | 188 | 188 | 27 |
| 💬 # Dialogues | 4,320 | 540 | 600 |
| 🔁 # Turns | 28,680 | 3,585 | 3,700 |
| 📦 # Slots / Dialogue | 10.3 | 10.3 | 9.7 |
| 📏 # Tokens / Turn | 66.4 | 66.4 | 76.8 |

Table: Overall statistics of the TransferTOD dataset.

ID Test = In-Domain test set. OOD Test = Out-of-Domain test set, covering three held-out domains: Water Delivery, Sanitation, and Courier.
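
The per-split counts can be cross-checked against the headline totals quoted in the Highlights (30 domains, 5,460 dialogues, 35,965 turns). A minimal sketch, with the numbers copied from the statistics table; the dictionary layout and variable names are illustrative only and are not part of the repository:

```python
# Per-split counts copied from the dataset statistics table.
splits = {
    "train":    {"domains": 27, "dialogues": 4320, "turns": 28680},
    "id_test":  {"domains": 27, "dialogues": 540,  "turns": 3585},
    "ood_test": {"domains": 3,  "dialogues": 600,  "turns": 3700},
}

total_dialogues = sum(s["dialogues"] for s in splits.values())
total_turns = sum(s["turns"] for s in splits.values())
# Train and ID test cover the same 27 domains; the OOD split adds 3 held-out ones.
total_domains = splits["train"]["domains"] + splits["ood_test"]["domains"]

print(total_dialogues, total_turns, total_domains)  # prints: 5460 35965 30
```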

📂 Project Structure

TransferTOD/
├── data/                                   # 📦 All TOD data
│   ├── raw_data/                           # Raw collected data (incl. BELLE 950k)
│   ├── fine_tune_1/                        # Stage-1 fine-tuning data
│   ├── fine_tune_2/                        # Stage-2 fine-tuning data
│   ├── data_generate_template.ipynb        # Data generation template
│   ├── gpt_generate.ipynb                  # GPT-based data generation
│   └── data_process.py                     # Data processing utilities
├── fine_tune/                              # 🛠️ Training scripts
│   ├── fine-tune.py                        # Main training entry
│   ├── ds_config.json                      # DeepSpeed config
│   └── scripts/                            # Full / LoRA fine-tuning launchers
└── inference/                              # 🚀 Inference & evaluation
    ├── inference.py                        # Run inference on the test sets
    ├── eval.py                             # Compute evaluation metrics
    ├── examples.json                       # Example prompts
    └── inference_and_eval.sh               # End-to-end pipeline

🛠️ Usage Guide

1. Prepare Data

All data used in two-stage fine-tuning, along with the raw TransferTOD data, is provided under data/. For each stage, train.json is a mixture of:

  • train_slot.json — TOD-specific data, and
  • An equivalent amount of data/raw_data/belle_data/belle_filtered_950k_train.jsonl — general instruction data.

This balanced mixture preserves general instruction-following ability while injecting strong TOD competence.
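
The mixing step described above can be sketched as follows. This is an illustration under assumptions, not the repository's own script: it assumes train_slot.json holds a single JSON array and that the BELLE file is JSON Lines, and the helper names `load_json_list`, `load_jsonl`, and `mix_equal` are hypothetical.

```python
import json
import random
from pathlib import Path


def load_json_list(path):
    """Load a file holding a single JSON array of examples (assumed format)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def load_jsonl(path):
    """Load a JSON Lines file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def mix_equal(tod_examples, general_examples, seed=0):
    """Combine all TOD examples with an equally sized random sample of general data."""
    rng = random.Random(seed)
    # Never ask for more general examples than are available.
    k = min(len(tod_examples), len(general_examples))
    mixed = list(tod_examples) + rng.sample(general_examples, k)
    rng.shuffle(mixed)
    return mixed
```

A call might look like `mix_equal(load_json_list(tod_path), load_jsonl("data/raw_data/belle_data/belle_filtered_950k_train.jsonl"))`, where `tod_path` points at the stage's train_slot.json. Fixing the seed keeps the sampled general subset reproducible across runs.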

You can also download the released data from Hugging Face:

from datasets import load_dataset

dataset = load_dataset("konglongge/TransferTOD")

2. Two-Stage Fine-tuning

Full fine-tuning:

bash fine_tune/scripts/finetune_full.sh

LoRA fine-tuning:

bash fine_tune/scripts/finetune_lora.sh

Before launching, set model_name_or_path and data_path, and adjust the DeepSpeed settings in fine_tune/ds_config.json to match your hardware.
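
If you prefer to patch the DeepSpeed settings programmatically rather than by hand, a small helper along these lines may be convenient. This is a sketch: `override_config` is a hypothetical helper, and the example dictionary only uses standard DeepSpeed keys (`train_micro_batch_size_per_gpu`, `zero_optimization.stage`) for illustration; the shipped ds_config.json may contain different keys and values.

```python
import copy
import json


def override_config(config, overrides):
    """Return a copy of `config` with dotted-key overrides applied."""
    updated = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        node = updated
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            # Create intermediate sections if they are missing.
            node = node.setdefault(key, {})
        node[leaf] = value
    return updated


# Example values only; they are not taken from the repository's config.
example = {"train_micro_batch_size_per_gpu": 4, "zero_optimization": {"stage": 2}}
patched = override_config(example, {
    "train_micro_batch_size_per_gpu": 2,
    "zero_optimization.stage": 3,
})
print(json.dumps(patched, indent=2))
```

In practice you would read the real file with `json.load`, apply the overrides, and write the result to a new file so the original config stays untouched.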

3. Inference & Evaluation

End-to-end inference and evaluation on the TransferTOD test sets:

bash inference/inference_and_eval.sh

This will:

  1. Run inference.py on the ID and OOD test sets.
  2. Run eval.py to compute slot-level and dialogue-level metrics.
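
To make the distinction between slot-level and dialogue-level scoring concrete, here is one common way to compute such metrics. This is an assumption-laden sketch and not necessarily what eval.py does: it assumes each turn is a dict mapping slot names to string values and each dialogue is a list of such turns, and the function names `slot_f1` and `dialogue_accuracy` are hypothetical.

```python
def slot_f1(predicted_turns, gold_turns):
    """Micro-averaged F1 over (slot, value) pairs across all turns."""
    tp = fp = fn = 0
    for pred, ref in zip(predicted_turns, gold_turns):
        pred_pairs, ref_pairs = set(pred.items()), set(ref.items())
        tp += len(pred_pairs & ref_pairs)
        fp += len(pred_pairs - ref_pairs)
        fn += len(ref_pairs - pred_pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def dialogue_accuracy(predicted_dialogues, gold_dialogues):
    """Fraction of dialogues in which every turn's slots match exactly."""
    if not gold_dialogues:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted_dialogues, gold_dialogues))
    return correct / len(gold_dialogues)


# Toy data: the first dialogue has one wrong phone number, the second is perfect.
gold = [[{"name": "Li", "phone": "123"}, {"address": "Road 1"}], [{"name": "Wang"}]]
pred = [[{"name": "Li", "phone": "124"}, {"address": "Road 1"}], [{"name": "Wang"}]]

flat_gold = [turn for dialogue in gold for turn in dialogue]
flat_pred = [turn for dialogue in pred for turn in dialogue]
print(slot_f1(flat_pred, flat_gold))   # prints: 0.75
print(dialogue_accuracy(pred, gold))   # prints: 0.5
```

A single wrong slot lowers the slot-level score only slightly but fails the whole dialogue, which is why the two levels are usually reported together.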

4. Use the Released Model

The fine-tuned TransferTOD-7B is available on ModelScope:

🤖 Mee1ong/TransferTOD-7B

from modelscope import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Mee1ong/TransferTOD-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Mee1ong/TransferTOD-7B", trust_remote_code=True)

📝 Citation

If you find this project useful in your research, please cite us:

@inproceedings{zhang-etal-2024-transfertod,
    title     = "{T}ransfer{TOD}: A Generalizable {C}hinese Multi-Domain Task-Oriented
                 Dialogue System with Transfer Capabilities",
    author    = "Zhang, Ming and Huang, Caishuang and Wu, Yilong and Liu, Shichun and
                 Zheng, Huiyuan and Dong, Yurui and Shen, Yujiong and Dou, Shihan and
                 Zhao, Jun and Ye, Junjie and Zhang, Qi and Gui, Tao and Huang, Xuanjing",
    editor    = "Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural
                 Language Processing",
    month     = nov,
    year      = "2024",
    address   = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url       = "https://aclanthology.org/2024.emnlp-main.710/",
    doi       = "10.18653/v1/2024.emnlp-main.710",
    pages     = "12750--12771"
}

🔗 Related Projects

| Project | Description | Link |
|---|---|---|
| PFDial (ACL 2025) | Structured dialogue instruction tuning based on UML flowcharts | GitHub |
| LLMEval-Med (EMNLP 2025) | Real-world clinical benchmark for medical LLMs | GitHub |
| LLMEval-Fair (ACL 2026) | Robust & fair evaluation, 200K+ questions | GitHub |

📞 Contact Us

For questions or collaboration, please:


TransferTOD | Fudan NLP Lab

About

The code repository of paper "TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities"
