1. Clone the repository
git clone https://github.com/pnborchert/FLARE
Ensure you have the required dependencies installed. Check the requirements.txt file for details.
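Putting step 1 together, a minimal setup sketch (assumes pip and a suitable Python environment; adapt to your own environment manager):

```bash
# Clone the repository and install the dependencies listed in requirements.txt
git clone https://github.com/pnborchert/FLARE
cd FLARE
pip install -r requirements.txt
```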
2. Task Fine-tuning on English Data
Fine-tune a pretrained language model (e.g., Gemma 2) on English task data such as XNLI:
bash train_task_ft.sh
3. Machine Translation
Translate the XNLI training and evaluation data from English to the target language (e.g., Spanish), and translate the test data from Spanish to English.
bash run_translate.sh
4. Cross-lingual Transfer with FLARE
Adapt the fine-tuned Gemma 2 model from English XNLI to Spanish:
bash train_flare.sh
📚 Supported Datasets
| Datasets | Links |
|---|---|
| XNLI | Paper, Data |
| NusaX | Paper, Data |
| TyDiQA | Paper, Data |
🤖 Supported Models
| Model | Size | Links |
|---|---|---|
| XLMR Large | 0.6 B | Paper, Model Card |
| Llama 3.1 | 8 B | Paper, Model Card |
| Gemma 2 | 9 B | Paper, Model Card |
By default, Llama 3.1 and Gemma 2 are loaded with 4-bit quantization and trained with LoRA adapters injected into all linear layers (Dettmers et al., 2023).
Note
🧩 Want to use FLARE with a different model? Follow the step-by-step guide to add new models.
We provide an overview of the supported input parameters in run_task_ft.py for the initial task fine-tuning on English data; an example invocation is sketched after the list.
- `task`: Specifies the task/dataset, e.g., "xnli", "nusax", or "tydiqa".
- `lang`: Sets the language (default: "en").
- `output_dir`: Directory to save checkpoints (default: "checkpoints").
- `path_data`: Data directory.
- `plm`: Identifies the pretrained language model to use; options include "gemma2-9b" and "llama3.1-8b".
- `adapter`: Adapter type, e.g., "lora".
- `lora_r` and `lora_alpha`: LoRA configuration parameters (r=8, alpha=16 by default).
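For illustration, a hedged example invocation of run_task_ft.py using the parameters above; the exact flag syntax and the data directory are assumptions, so refer to train_task_ft.sh for the authoritative call:

```bash
# Fine-tune Gemma 2 on English XNLI with LoRA (r=8, alpha=16).
# Flag syntax and the data directory are assumptions based on the parameter overview.
python run_task_ft.py \
  --task xnli \
  --lang en \
  --plm gemma2-9b \
  --adapter lora \
  --lora_r 8 \
  --lora_alpha 16 \
  --path_data data \
  --output_dir checkpoints
```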
We provide an overview of the supported input parameters in run.py for cross-lingual transfer of the task fine-tuned model with FLARE (the script also supports FLARE FuseMT, regular LoRA, and input-level fusion); an example call is sketched after the list:
- `task`: Specifies the task/dataset, e.g., "xnli", "nusax", or "tydiqa".
- `source_lang` and `target_lang`: Set the source and target language, e.g., "en" and "es".
- `output_dir`: Directory to save checkpoints (default: "checkpoints").
- `path_data`: Data directory, default "translations".
- `path_mt`: Path to the machine-translated data, default "translations".
- `load_ckpt`: Path to the checkpoint directory, e.g., "{output_dir}/{task}-{plm}-{source_lang}-{seed}".
- `plm`: Identifies the pretrained language model to use; options include "gemma2-9b" and "llama3.1-8b".
- `adapter`: Adapter type, e.g., "lora".
- `lora_r` and `lora_alpha`: LoRA configuration parameters (r=8, alpha=16 by default).
- `translate-test`: Sets the evaluation mode to "translate-test".
- `translate-train`: Sets the evaluation mode to "translate-train" if specified.
- `eval_zs`: Sets the evaluation mode to "zero-shot" if specified.
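As an illustration, a hedged example call to run.py for adapting the English XNLI checkpoint to Spanish in the translate-train setting; the flag syntax, the seed in the checkpoint path, and the directory names are assumptions based on the overview above (see train_flare.sh for the actual invocation):

```bash
# Cross-lingual transfer en -> es on XNLI with FLARE.
# Flag syntax and the checkpoint/seed naming are assumptions.
python run.py \
  --task xnli \
  --source_lang en \
  --target_lang es \
  --plm gemma2-9b \
  --adapter lora \
  --lora_r 8 \
  --lora_alpha 16 \
  --path_data translations \
  --path_mt translations \
  --load_ckpt checkpoints/xnli-gemma2-9b-en-42 \
  --translate-train
```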
FLARE parameters (an example call follows the list):
- `fusion_fn`: Selects a supported fusion function, including "add", "add_relu", "mul", or "cross-attention".
- `fuse_mt`: Sets the fusion mode to "FLARE MT", with MT encoder representations used as source-language inputs.
- `mt_model`: Selects a supported MT model, including "nllb-600m" and "nllb-3.3b".
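A hedged sketch of the additional FLARE MT flags on top of the transfer call above (exact syntax is an assumption):

```bash
# FLARE MT: fuse NLLB encoder representations of the source-language input.
python run.py \
  --task xnli \
  --source_lang en \
  --target_lang es \
  --plm gemma2-9b \
  --load_ckpt checkpoints/xnli-gemma2-9b-en-42 \
  --fuse_mt \
  --fusion_fn cross-attention \
  --mt_model nllb-600m
```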
Input-level Fusion (an example call follows the list):
- `input_fusion`: Sets the fusion mode to "input-level fusion", with source and target language inputs concatenated. Note: ensure `max_length` is adjusted accordingly.
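A hedged example for input-level fusion; the `max_length` value is illustrative and should be sized to fit the concatenated source and target sequences:

```bash
# Input-level fusion: concatenate source and target language inputs.
python run.py \
  --task xnli \
  --source_lang en \
  --target_lang es \
  --plm gemma2-9b \
  --load_ckpt checkpoints/xnli-gemma2-9b-en-42 \
  --input_fusion \
  --max_length 512
```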
@inproceedings{borchert-etal-2025-language,
title = "Language Fusion for Parameter-Efficient Cross-lingual Transfer",
author = "Borchert, Philipp and
Vuli{\'c}, Ivan and
Moens, Marie-Francine and
De Weerdt, Jochen",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.1255/",
doi = "10.18653/v1/2025.acl-long.1255",
pages = "25848--25868",
ISBN = "979-8-89176-251-0",
}