First Workshop on NLP and LLMs for the Iranian Language Family

Co-located with EACL 2026 · Rabat, Morocco · March 2026
This repository hosts the accepted papers of the SilkRoadNLP 2026 workshop, dedicated to advancing NLP research across the Iranian language family — including Persian, Kurdish, Pashto, Balochi, Luri, Ossetian, Tajik, Shughni, and related languages.
For more information, visit silkroadnlp.org
| # | | Paper | Resources | Poster |
|---|---|---|---|---|
| 1 | 📄 | Unmasking the Factual-Conceptual Gap in Persian Language Models<br>Alireza Sakhaeirad, Ali Ma'manpoosh, Arshia Hemmat | — | |
| 2 | 📄 | Benchmarking Offensive Language Detection in Persian and Pashto<br>Zahra Bokaei, Bonnie Webber, Walid Magdy | — | — |
| 3 | 📄 | Do Large Language Models Understand Double Mismatches? Evidence from Farsi<br>Maryam Mohammadi | — | — |
| 4 | 📄 | TajPersLexon: A Tajik–Persian Lexical Resource and Hybrid Model for Cross-Script Low-Resource NLP<br>Mullosharaf Kurbonovich Arabov | — | — |
| 5 | 📄 | A Computational Approach to Language Contact — A Case Study of Persian<br>Ali Basirat, Danial Namazifard, Navid Baradaran Hemmati | — | — |
| 6 | 📄 | Online Polarization Detection in Persian (Farsi) Social Media<br>Saeedeh Davoudi, Nazli Goharian | — | |
| 7 | 📄 | ParsCORE: The Persian Corpus of Online Registers<br>Alireza Razzaghi, Erik Henriksson, Veronika Laippala | Dataset & Code on GitHub (link TBD) | — |
| 8 | 📄 | PMWP: A Benchmark for Math Word Problem Solving in Persian<br>Marzieh Abdolmaleki, Mehrnoush Shamsfard, Veronique Hoste, Els Lefever | — | |
| 9 | 📄 | APARSIN: A Multi-Variety Sentiment and Translation Benchmark for Iranic Languages<br>Sadegh Jafari, Tara Azin, Farhad Roodi, Zahra Dehghani Tafti, Mehrdad Ghadrdan, Elham Vatankhahan Esfahani, Aylin Naebzadeh, Mohammadhadi Shahhosseini, Ghafoor Khan, Kazem Forghani, Danial Namazi, Seyed Mohammad Hossein Hashemi, Farhan Farsi, Mohammad Osoolian, Maede Mohammadi, Mohammad Erfan Zare, Muhammad Hasnain Khan, Muhammad Hussain, Nooreen Zaki, Joma Mohammadi, Shayan Bali, Mohammad Javad Ranjbar, Els Lefever, Veronique Hoste | — | |
| 10 | 📄 | One Language, Three of Its Voices: Evaluating Multilingual LLMs Across Persian, Dari, and Tajiki on Translation and Understanding Tasks<br>Noor Mairukh Khan Arnob, Abu Bakar Siddique Mahi | — | — |
| 11 | 📄 | PersianPunc: A Large-Scale Dataset and BERT-Based Approach for Persian Punctuation Restoration<br>Mohammad Javad Ranjbar Kalahroodi, Heshaam Faili, Azadeh Shakery | Dataset & Model publicly available (link TBD) | — |
| 12 | 📄 | Shughni Machine Translation Enhanced by Donor Languages<br>Dmitry Novokshanov, Innokentiy S. Humonen, Ilya Makarov | — | |
| 13 | 📄 | Segmentation Strategy Matters: Benchmarking Whisper on Persian YouTube Content<br>Reihaneh Iranmanesh, Rojin Ziaei, Joe Garman | — | |
| 14 | 📄 | Multi-modal Neural Machine Translation for Low-Resource Classical Persian Poetry: A Culture-Aware Evaluation<br>Soheila Ansari, Mounir Boukadoum, Fatiha Sadat | — | |

Paper 2 — Unmasking the Factual-Conceptual Gap in Persian Language Models
We introduce DivanBench, a manually curated benchmark of 315 questions across three task types — factual retrieval, paired scenario verification, and situational reasoning — designed to probe cultural and conceptual knowledge in Persian LLMs. Our evaluation reveals a consistent "acquiescence trap" where models default to agreement, and highlights gaps between factual recall and deeper conceptual reasoning.
Paper 3 — Benchmarking Offensive Language Detection in Persian and Pashto
This paper provides a systematic benchmark of offensive language detection across Persian and Pashto, evaluating a range of transformer-based models including multilingual and language-specific variants. It examines cross-lingual transfer, the impact of script similarity, and limitations of existing datasets.
Paper 4 — Do Large Language Models Understand Double Mismatches? Evidence from Farsi
This paper investigates whether LLMs can handle double mismatch constructions in Farsi — syntactic configurations where two grammatical features simultaneously deviate from their canonical agreement patterns. Results reveal systematic failures in current LLMs, pointing to gaps in morphosyntactic reasoning for morphologically rich languages.
Paper 5 — TajPersLexon: A Tajik–Persian Lexical Resource and Hybrid Model for Cross-Script Low-Resource NLP
We present TajPersLexon, a bilingual lexical resource of 40,112 Tajik–Persian word/phrase pairs bridging Cyrillic and Perso-Arabic scripts. A hybrid transliteration and alignment model is introduced to support cross-script NLP tasks for these closely related but orthographically distant language varieties.
Paper 6 — A Computational Approach to Language Contact — A Case Study of Persian
This paper proposes a computational framework for detecting and quantifying language contact effects in Persian, modeling lexical borrowing and phonological adaptation through diachronic corpus analysis. The methodology is validated against historical linguistic accounts of Arabic, French, and English influence on Persian.
Paper 7 — Online Polarization Detection in Persian (Farsi) Social Media
We investigate political polarization on Persian-language social media using NLP techniques. Fine-tuned transformer models are evaluated on the POLAR dataset, and the paper analyzes the impact of pre-training language specificity on polarization classification performance.
Paper 8 — ParsCORE: The Persian Corpus of Online Registers
ParsCORE v0.1 is a corpus of 2,000 human-annotated Persian web documents spanning diverse online registers, developed within the Universal Register framework. Initial experiments on automatic register identification show performance comparable to high-resource languages, establishing a foundation for Persian web-language research.
💻 Dataset & Code: GitHub (link TBD)
Paper 10 — PMWP: A Benchmark for Math Word Problem Solving in Persian
We introduce PMWP, the first dataset of 15,000 elementary-level Persian math word problems for training and evaluating mathematical reasoning in LLMs. Systematic evaluation shows Gemini-2.5-Flash achieves the highest accuracy (72.02%), while LoRA fine-tuning of open-weight models (LLaMA-3-8B, Qwen-2.5-7B) reaches over 91% exact equation match.
💻 Dataset: github.com/marzieh-abdolmaleki/PMWP
Paper 11 — APARSIN: A Multi-Variety Sentiment and Translation Benchmark for Iranic Languages
APARSIN is a large-scale benchmark covering 14 Iranic languages and dialects for sentiment analysis and machine translation. It provides standardized evaluation protocols and baselines using state-of-the-art LLMs, addressing the critical lack of multi-variety resources for the Iranian language family.
💻 Benchmark: github.com/SilkRoadAparsin
Paper 12 — One Language, Three of Its Voices: Evaluating Multilingual LLMs Across Persian, Dari, and Tajiki
This paper evaluates multilingual LLMs on translation and understanding tasks across the three major varieties of Persian (Persian/Farsi, Dari, Tajiki), using a dataset of over 240,000 processed samples. Results highlight variety-specific performance gaps and the challenges of treating these as a single language.
Paper 13 — PersianPunc: A Large-Scale Dataset and BERT-Based Approach for Persian Punctuation Restoration
PersianPunc is a dataset of 17 million samples for Persian punctuation restoration, constructed by aggregating and filtering diverse textual resources. A fine-tuned ParsBERT model achieves 91.33% macro-F1, outperforming large generative models in efficiency and accuracy. The full dataset and fine-tuned model are publicly released.
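
For readers unfamiliar with the macro-F1 metric quoted above, the following is a minimal sketch of how it is commonly computed for token-level punctuation labels. It is not the authors' evaluation code: the function name `macro_f1` and the label set (`O`, `COMMA`, `PERIOD`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g

    scores = []
    for label in sorted(set(gold) | set(pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels: one tag per token, naming the punctuation that follows it.
gold = ["O", "COMMA", "O", "PERIOD"]
pred = ["O", "O", "O", "PERIOD"]
print(f"{macro_f1(gold, pred):.3f}")  # prints 0.600
```

Because every class contributes equally to the average, rare punctuation marks weigh as much as the frequent "no punctuation" class, which is why macro-F1 is a common choice for imbalanced label sets of this kind.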
Paper 15 — Shughni Machine Translation Enhanced by Donor Languages
This paper presents a machine translation system for Shughni, an endangered Iranian language of the Pamirs with fewer than 100,000 speakers and limited digital resources. By leveraging Russian and English as pivot/donor languages within an NLLB-200 framework, the system achieves significant improvements over baseline MT for this extremely low-resource language.
🤗 Demo: huggingface.co/spaces/Novokshanov/Shughni-Translator
Paper 16 — Segmentation Strategy Matters: Benchmarking Whisper on Persian YouTube Content
We benchmark OpenAI's Whisper on 10 hours of Persian YouTube audio with gold-standard transcripts, systematically evaluating the impact of audio segmentation strategies on ASR performance. Results show that segmentation choices have a substantial effect on WER, with practical implications for Persian speech processing pipelines.
💻 Dataset & Code: github.com/ri164-bolleit/persian-youtube-whisper-benchmark
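
As background for the WER figures mentioned above, here is a minimal sketch of word error rate computed as word-level edit distance divided by the reference length. It is an illustration only, not the benchmark's scoring code: the function name `word_error_rate` is hypothetical, and the sketch assumes whitespace tokenization with no Persian-specific text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Rolling-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution (b -> x) and one deletion (d).
print(word_error_rate("a b c d", "a x c"))  # prints 0.5
```

Under this definition, how the audio is segmented can change which words the recognizer drops or duplicates at segment boundaries, and each such error counts directly toward the numerator.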
Paper 18 — Multi-modal Neural Machine Translation for Low-Resource Classical Persian Poetry
We introduce the first multi-modal NMT system for translating classical Persian poetry (Masnavi-ye-Ma'navi), combining text with audio recitations. A new parallel Persian–English corpus of 26,571 aligned verse pairs with recitations is released, alongside a culture-specific evaluation framework for idiomatic and poetic translation quality.
💻 Corpus: github.com/amnghd/Persian_poems_corpus
| Field | Details |
|---|---|
| Full Name | First Workshop on NLP and LLMs for the Iranian Language Family (SilkRoadNLP) |
| Venue | Co-located with EACL 2026, Rabat, Morocco |
| Workshop Date | March 28–29, 2026 |
| Website | silkroadnlp.org |
| Languages Covered | Persian · Dari · Tajiki · Kurdish · Pashto · Balochi · Luri · Ossetian · Shughni · and more |

| Milestone | Date |
|---|---|
| Call for Papers | October 20, 2025 |
| Direct Submission Deadline | January 8, 2026 |
| Notification of Acceptance | January 26, 2026 |
| Camera-ready Papers Due | February 3, 2026 |
| Workshop Date | March 28–29, 2026 |
If you use these proceedings in your research, please cite the workshop:
@proceedings{silkroadnlp2026,
title = {Proceedings of the First Workshop on NLP and LLMs for the Iranian Language Family (SilkRoadNLP)},
year = {2026},
address = {Rabat, Morocco},
publisher = {Association for Computational Linguistics},
url = {https://silkroadnlp.org}
}

SilkRoadNLP 2026 — Bridging languages along the Silk Road