MolecularIQ

Characterizing Chemical Reasoning Capabilities Through Symbolic Verification on Molecular Graphs

🎯 Overview

Large Language Models are increasingly applied to chemistry, tackling tasks such as molecular name conversion, captioning, text-guided generation, and property or reaction prediction. A molecule's properties are fundamentally determined by its composition and structure, encoded in its molecular graph; thus, reasoning about molecular properties requires understanding and reasoning over the molecular structure.

Yet, most existing benchmarks emphasize general chemical knowledge, rely on literature or surrogate labels that risk leakage or bias, or reduce evaluation to multiple-choice questions.

MolecularIQ is a molecular structure reasoning benchmark focused exclusively on symbolically verifiable tasks. It enables fine-grained evaluation of reasoning over molecular graphs and produces capability fingerprints that localize model failures to specific tasks and molecular regimes.

🗝️ Key features

Reasoning tasks. MolecularIQ covers three categories of reasoning:

Counting: Feature and substructure counting on molecular graphs
Indexing: Index-based attribution of atoms, bonds, or substructures
Constrained generation: Generation of valid molecules under structural constraints
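To illustrate what "symbolically verifiable" means for these three categories, here is a minimal sketch in plain Python. It is not the MolecularIQ verifier code: the graph for ethanol is built by hand, and all function names (`count_element`, `indices_of_element`, `satisfies_constraints`, `verify_indices`) are hypothetical. A real verifier would parse SMILES with a cheminformatics toolkit and would also check chemical validity, which this sketch skips.

```python
from collections import Counter

# Hand-built heavy-atom graph for ethanol (SMILES "CCO"); an illustrative
# assumption, not data taken from the benchmark.
ATOMS = ["C", "C", "O"]        # atom index = position in the SMILES string
BONDS = [(0, 1), (1, 2)]       # single bonds as (atom_index, atom_index)


def count_element(atoms, symbol):
    """Counting task: how many atoms carry the given element symbol."""
    return Counter(atoms)[symbol]


def indices_of_element(atoms, symbol):
    """Indexing task: sorted atom indices carrying the given element symbol."""
    return [i for i, atom in enumerate(atoms) if atom == symbol]


def satisfies_constraints(atoms, constraints):
    """Constrained generation: does a candidate match the required counts exactly?"""
    counts = Counter(atoms)
    return all(counts[symbol] == n for symbol, n in constraints.items())


def verify_indices(model_answer, atoms, symbol):
    """Compare a model's index list to the ground truth, ignoring order."""
    return sorted(model_answer) == indices_of_element(atoms, symbol)


if __name__ == "__main__":
    print(count_element(ATOMS, "C"))
    print(indices_of_element(ATOMS, "O"))
    print(satisfies_constraints(ATOMS, {"C": 2, "O": 1}))
    print(verify_indices([1, 0], ATOMS, "C"))
```

Because every answer is computed from the graph itself, a model response can be checked exactly, without relying on literature labels or multiple-choice options.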

Complexity axes. MolecularIQ varies difficulty along three orthogonal axes:

Molecular complexity: Tests models across molecules of varying structural complexity.
Multitask load: Evaluates performance as the number of reasoning requirements per question varies.
SMILES representation: Tests robustness across different SMILES representations.
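The SMILES axis matters because one molecule can be written as several different SMILES strings, each implying a different atom ordering. The sketch below, again with hand-built graphs and hypothetical helper names rather than MolecularIQ code, contrasts ethanol written as "CCO" and as "OCC": counts are unchanged, while index-based answers shift with the ordering.

```python
from collections import Counter

# Two hand-built heavy-atom graphs for ethanol, one per SMILES spelling.
# Atom indices follow the order of atoms in each string (an assumption of
# this sketch; no SMILES parsing is performed here).
ETHANOL_CCO = (["C", "C", "O"], [(0, 1), (1, 2)])
ETHANOL_OCC = (["O", "C", "C"], [(0, 1), (1, 2)])


def element_counts(atoms):
    """Order-independent feature: element -> number of atoms."""
    return Counter(atoms)


def degree_signature(atoms, bonds):
    """Sorted (element, degree) pairs; equal signatures are necessary,
    but not sufficient, for two graphs to describe the same molecule."""
    degree = [0] * len(atoms)
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    return sorted(zip(atoms, degree))


def element_indices(atoms, symbol):
    """Order-dependent feature: indices of atoms with the given element."""
    return [i for i, atom in enumerate(atoms) if atom == symbol]


if __name__ == "__main__":
    for atoms, bonds in (ETHANOL_CCO, ETHANOL_OCC):
        print(dict(element_counts(atoms)),
              degree_signature(atoms, bonds),
              element_indices(atoms, "O"))
```

A model that is robust on this axis should give the same count for both spellings and the index answer that matches the spelling it was shown.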

👨‍👧‍👦 MolecularIQ repository family

This repository serves as an entry point to the MolecularIQ ecosystem, covering benchmark dataset creation, the leaderboard, and the evaluation procedure with lm-eval-harness. The dynamic version, MolecularIQD, is part of the core package (moleculariq-core).

Repository	Purpose
📍 moleculariq	Current repo: overview of the different MolecularIQ code bases
moleculariq-leaderboard	Leaderboard: HuggingFace space, displays results, handles submissions
moleculariq-core	MolecularIQD and shared library providing core functionality, e.g. symbolic verifiers and question formatting
moleculariq-benchmark	Dataset creation: task definitions, symbolic verifier implementations, question generator
moleculariq-eval	Evaluation code: integration with lm-eval-harness, model configs, reward functions, extraction functions, and system prompts

Citation

If you use MolecularIQ in your research, please cite:

@inproceedings{
bartmann2026moleculariq,
title={Molecular{IQ}: Characterizing Chemical Reasoning Capabilities Through Symbolic Verification on Molecular Graphs},
author={Christoph Bartmann and Johannes Schimunek and Mykyta Ielanskyi and Philipp Seidl and G{\"u}nter Klambauer and Sohvi Luukkonen},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=RqwEzZqMFv}
}
