dataset_config_dart

A Dart library that resolves eval dataset YAML into EvalSet JSON for the Python runner. It also contains the shared data models (e.g., EvalSet, Task, Sample, Variant, Job) used across the eval pipeline; Python equivalents of these models live in dash_evals_config. Located in packages/dataset_config_dart/.


Architecture

The package follows a layered pipeline design:

YAML / JSON files
    │
    ▼
┌──────────┐
│  Parser  │  YamlParser · JsonParser
└────┬─────┘
     │  => List<ParsedTask>, Job
     ▼
┌──────────┐
│ Resolver │  EvalSetResolver
└────┬─────┘
     │  => List<EvalSet>
     ▼
┌──────────┐
│  Writer  │  EvalSetWriter
└────┬─────┘
     │  => JSON file(s) on disk
     ▼
  Python dash_evals

The JSON files written to disk conform to the Inspect AI eval_set API, which is the entry point for launching eval runs.
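For illustration, a resolved EvalSet JSON file might look like the following sketch. The field names here are assumptions based on the models described below, not the package's exact schema:

```json
{
  "name": "my_job",
  "log_dir": "logs/my_job",
  "tasks": [
    {
      "name": "code_review",
      "dataset": {
        "samples_file": "datasets/code_review.jsonl",
        "field_spec": { "input": "prompt", "target": "expected" }
      },
      "sandbox": "docker"
    }
  ]
}
```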

| Layer | Class | Responsibility |
| --- | --- | --- |
| Parsers | YamlParser, JsonParser | Read task YAML and job files into ParsedTask and Job objects |
| Resolvers | EvalSetResolver | Combine parsed tasks with a job to produce fully resolved EvalSet objects (expanding models, variants, sandbox config, etc.) |
| Writers | EvalSetWriter | Serialize EvalSet objects to JSON files that the Python runner can consume |
| Facade | ConfigResolver | Single-call convenience that composes Parser → Resolver |

Quick Start

import 'package:dataset_config_dart/dataset_config_dart.dart';

// Single-call convenience
final resolver = ConfigResolver();
final configs = resolver.resolve(datasetPath, ['my_job']);

// Or use the layers individually
final parser = YamlParser();
final tasks = parser.parseTasks(datasetPath);
final job = parser.parseJob(jobPath, datasetPath);

final evalSetResolver = EvalSetResolver();
final evalSets = evalSetResolver.resolve(tasks, job, datasetPath);

final writer = EvalSetWriter();
writer.write(evalSets, outputDir);
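Error handling is not shown above. Assuming ConfigResolver surfaces parse and resolution failures via the package's RunnerConfigException (an assumption inferred from the source layout below, not confirmed behavior), a sketch:

```dart
import 'dart:io';

import 'package:dataset_config_dart/dataset_config_dart.dart';

void main(List<String> args) {
  final datasetPath = args[0];
  try {
    // Assumption: resolve() throws RunnerConfigException on malformed
    // YAML or an unknown job name.
    final evalSets = ConfigResolver().resolve(datasetPath, ['my_job']);
    EvalSetWriter().write(evalSets, 'out/');
  } on RunnerConfigException catch (e) {
    stderr.writeln('Invalid eval config: $e');
    exitCode = 1;
  }
}
```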

Data Models

This package also contains the shared Dart data models used across the eval pipeline. All models are built with Freezed for immutability, pattern matching, and JSON serialization via json_serializable.
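As a minimal sketch of the Freezed pattern these models follow, here is a hypothetical, simplified Variant definition (the real model in this package has more fields; this requires build_runner code generation to compile):

```dart
import 'package:freezed_annotation/freezed_annotation.dart';

// Generated by build_runner / freezed / json_serializable.
part 'variant.freezed.dart';
part 'variant.g.dart';

/// Sketch only: a named configuration variant applied to task runs.
@freezed
class Variant with _$Variant {
  const factory Variant({
    required String name,                    // e.g. 'baseline', 'with_docs'
    @Default({}) Map<String, Object?> overrides,
  }) = _Variant;

  factory Variant.fromJson(Map<String, Object?> json) =>
      _$VariantFromJson(json);
}
```

Freezed supplies immutability, copyWith, and pattern matching, while json_serializable generates the fromJson/toJson pair used when serializing to the JSON the Python runner reads.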

Note

Python equivalents of these models live in the dash_evals_config package.

Config Models

| Model | Description |
| --- | --- |
| Job | A job configuration — runtime settings, model/variant/task selection, and eval_set() overrides |
| JobTask | Per-task overrides within a job (sample filtering, custom system messages) |
| Variant | A named configuration variant (e.g. baseline, with_docs) applied to task runs |
| ContextFile | A file to inject into the sandbox as additional context for the model |
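As a sketch, a job file feeding these models might look like the following. The key names are illustrative assumptions, not the package's exact schema:

```yaml
# Hypothetical job file (key names are assumptions for illustration).
name: my_job
models:
  - openai/gpt-4o
variants:
  - baseline
  - with_docs
tasks:
  - name: code_review
    sample_ids: [1, 2, 3]                    # per-task sample filtering
    system_message: "You are a careful reviewer."
context_files:
  - path: docs/style_guide.md                # injected into the sandbox
```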

Inspect AI Models

These models mirror the Python Inspect AI types so that Dart can produce JSON the Python runner understands directly.

| Model | Description |
| --- | --- |
| EvalSet | Maps to inspect_ai.eval_set() parameters — the top-level run definition |
| Task | A single evaluation task with its solver, scorer, dataset, and sandbox config |
| TaskInfo | Lightweight task metadata (name and function reference) |
| Sample | An individual evaluation sample (input, target, metadata) |
| Dataset | A dataset definition (samples file path and field mappings) |
| FieldSpec | Maps dataset columns to sample fields |
| EvalLog | Comprehensive log structure for evaluation results |
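To illustrate how Dataset and FieldSpec fit together (column names here are hypothetical), a dataset definition could map JSONL columns to Sample fields like this:

```yaml
# Hypothetical dataset definition: FieldSpec maps columns to Sample fields.
dataset:
  samples_file: datasets/arithmetic.jsonl    # each line: {"prompt": ..., "expected": ...}
  field_spec:
    input: prompt        # Sample.input  <- "prompt" column
    target: expected     # Sample.target <- "expected" column
```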

Source Layout

lib/
├── dataset_config_dart.dart         # Library barrel file
└── src/
    ├── config_resolver.dart # Convenience facade
    ├── parsed_task.dart     # Intermediate parsed-task model
    ├── parsers/
    │   ├── parser.dart      # Abstract parser interface
    │   ├── yaml_parser.dart # YAML file parser
    │   └── json_parser.dart # JSON map parser
    ├── resolvers/
    │   └── eval_set_resolver.dart
    ├── writers/
    │   └── eval_set_writer.dart
    ├── runner_config_exception.dart
    └── utils/
        └── yaml_utils.dart

Testing

cd packages/dataset_config_dart
dart test