ml-jku/moleculariq

MolecularIQ

Leaderboard | Paper | Benchmark dataset creation | Evaluation code | MolecularIQD

Characterizing Chemical Reasoning Capabilities Through Symbolic Verification on Molecular Graphs

Overview | Repositories | Citation

🎯 Overview

Large Language Models are increasingly applied to chemistry, tackling tasks such as molecular name conversion, captioning, text-guided generation, and property or reaction prediction. A molecule's properties are fundamentally determined by its composition and structure, encoded in its molecular graph; thus, reasoning about molecular properties requires understanding and reasoning over the molecular structure.

Yet, most existing benchmarks emphasize general chemical knowledge, rely on literature or surrogate labels that risk leakage or bias, or reduce evaluation to multiple-choice questions.

MolecularIQ is a molecular structure reasoning benchmark focused exclusively on symbolically verifiable tasks. It enables fine-grained evaluation of reasoning over molecular graphs and produces capability fingerprints that localize model failures to specific tasks and molecular regimes.

[Figure: MolecularIQ overview]

🗝️ Key features

Reasoning tasks. MolecularIQ covers three categories of reasoning:

  • Counting: Feature and substructure counting on molecular graphs
  • Indexing: Index-based attribution of atoms, bonds, or substructures
  • Constrained generation: Generation of valid molecules under structural constraints
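To make "symbolically verifiable" concrete, the three task categories can be sketched on a toy molecular graph. This is only an illustrative sketch, not the benchmark's implementation: `ToyMolecule`, `make_molecule`, and every function below are hypothetical names introduced here, the graph holds heavy atoms only, and the constraint check ignores chemical validity, which a real verifier would also need to test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMolecule:
    """Hypothetical minimal molecular graph (not the MolecularIQ data model)."""
    atoms: tuple      # element symbol for each atom index
    bonds: frozenset  # undirected bonds, each a frozenset of two atom indices


def make_molecule(atoms, bonds):
    """Build a ToyMolecule from a list of symbols and a list of index pairs."""
    return ToyMolecule(tuple(atoms), frozenset(frozenset(b) for b in bonds))


def count_element(mol, symbol):
    """Counting: number of atoms with the given element symbol."""
    return sum(1 for a in mol.atoms if a == symbol)


def indices_of_element(mol, symbol):
    """Indexing: sorted atom indices that carry the given element symbol."""
    return [i for i, a in enumerate(mol.atoms) if a == symbol]


def satisfies_constraints(mol, required_counts):
    """Constrained generation: check a proposed molecule against exact element counts."""
    return all(count_element(mol, s) == n for s, n in required_counts.items())


def verify_count(answer, mol, symbol):
    """Compare a model's numeric answer with the symbolically computed count."""
    return answer == count_element(mol, symbol)


def verify_indices(answer, mol, symbol):
    """Compare a model's index list with the computed one, ignoring order."""
    return sorted(answer) == indices_of_element(mol, symbol)


# Ethanol heavy-atom graph: C0 - C1 - O2
ethanol = make_molecule(["C", "C", "O"], [(0, 1), (1, 2)])
print(verify_count(2, ethanol, "C"))                      # counting check
print(verify_indices([2], ethanol, "O"))                  # indexing check
print(satisfies_constraints(ethanol, {"C": 2, "O": 1}))   # constraint check
```

Because each ground truth is computed directly from the graph, an answer can be scored without literature labels or multiple-choice options.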

Complexity axes. MolecularIQ varies difficulty along three orthogonal axes:

  • Molecular complexity: Tests models across molecules of varying structural complexity.
  • Multitask load: Evaluates performance as the number of reasoning requirements combined in a single question varies.
  • SMILES representation: Tests robustness across different SMILES representations.
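A capability fingerprint of the kind mentioned in the overview can be thought of as accuracy broken down per task and per axis value. The sketch below shows one way such a breakdown could be computed. It is a hedged illustration only: the record fields (`task`, `complexity`, `load`, `correct`), the function `capability_fingerprint`, and the sample records are all made up for this example and are not the benchmark's schema or results.

```python
from collections import defaultdict


def capability_fingerprint(results, axes=("task", "complexity")):
    """Group per-question outcomes by the chosen axes and return accuracy per cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [num_correct, num_total]
    for record in results:
        cell = tuple(record[axis] for axis in axes)
        totals[cell][0] += int(record["correct"])
        totals[cell][1] += 1
    return {cell: correct / total for cell, (correct, total) in totals.items()}


# Invented toy records, purely to exercise the function.
results = [
    {"task": "counting", "complexity": "low", "load": 1, "correct": True},
    {"task": "counting", "complexity": "low", "load": 1, "correct": True},
    {"task": "counting", "complexity": "high", "load": 1, "correct": False},
    {"task": "indexing", "complexity": "high", "load": 2, "correct": True},
    {"task": "indexing", "complexity": "high", "load": 2, "correct": False},
]

for cell, accuracy in sorted(capability_fingerprint(results).items()):
    print(cell, accuracy)
```

Passing a different `axes` tuple, for example `("task", "load")`, regroups the same records along another axis, which is how a failure could be localized to a particular task and regime.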

👨‍👧‍👦 MolecularIQ repository family

This repository serves as the entry point to the MolecularIQ ecosystem, which covers benchmark dataset creation, the leaderboard, and evaluation with lm-eval-harness. The dynamic version, MolecularIQD, is part of the core package.

| Repository | Purpose |
| --- | --- |
| 📍 moleculariq | Current repo: overview of the different MolecularIQ code bases |
| moleculariq-leaderboard | Leaderboard: HuggingFace space that displays results and handles submissions |
| moleculariq-core | MolecularIQD and shared library providing core functionality, e.g. symbolic verifiers and question formatting |
| moleculariq-benchmark | Dataset creation: task definitions, symbolic verifier implementations, question generator |
| moleculariq-eval | Evaluation code: integration with lm-eval-harness, model configs, reward functions, extraction functions, and system prompts |

Citation

If you use MolecularIQ in your research, please cite:

@inproceedings{
bartmann2026moleculariq,
title={Molecular{IQ}: Characterizing Chemical Reasoning Capabilities Through Symbolic Verification on Molecular Graphs},
author={Christoph Bartmann and Johannes Schimunek and Mykyta Ielanskyi and Philipp Seidl and G{\"u}nter Klambauer and Sohvi Luukkonen},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=RqwEzZqMFv}
}
