Problem
extropy validate currently only handles two file types:
- Population specs (.yaml) → _validate_population_spec()
- Scenario specs (detected by filename containing "scenario") → _validate_scenario_spec()
Two gaps:
1. No persona config validation
There is no validation path for persona.v*.yaml files. After extropy persona generates a config, there's no way to validate it short of manually reading the YAML.
Persona validation should check:
- All attributes from the population spec + scenario extended attributes have phrasing entries
- Boolean phrasings have both true_phrase and false_phrase
- Categorical phrasings cover all options defined in the population spec
- Relative attributes have all 5 z-score labels (much_below, below, average, above, much_above)
- Concrete attributes have a valid format_spec and a template with a {value} placeholder
- Group assignments are valid (every attribute in a group, no orphans)
- Intro template references only attributes that exist
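A minimal sketch of how the per-attribute checks above could look. The dictionary layout is an assumption made for illustration: the keys type, options, phrases, and labels are hypothetical, as is the function name validate_phrasings, and none of them are taken from the actual persona config schema. Group assignment, intro template, and format_spec checks are left out.

```python
from string import Formatter

# The five z-score labels listed in the issue.
Z_LABELS = {"much_below", "below", "average", "above", "much_above"}


def validate_phrasings(attributes, phrasings):
    """Return a list of error strings; an empty list means all checks passed.

    Hypothetical input shapes (not the real extropy schema):
      attributes: {name: {"type": "boolean" | "categorical" | "relative" | "concrete",
                          "options": [...]}}
      phrasings:  {name: {...}}
    """
    errors = []
    for name, attr in attributes.items():
        entry = phrasings.get(name)
        if entry is None:
            errors.append(f"{name}: missing phrasing entry")
            continue

        kind = attr["type"]
        if kind == "boolean":
            for key in ("true_phrase", "false_phrase"):
                if not entry.get(key):
                    errors.append(f"{name}: missing {key}")
        elif kind == "categorical":
            missing = set(attr.get("options", [])) - set(entry.get("phrases", {}))
            for option in sorted(missing):
                errors.append(f"{name}: no phrasing for option {option!r}")
        elif kind == "relative":
            missing = Z_LABELS - set(entry.get("labels", {}))
            for label in sorted(missing):
                errors.append(f"{name}: missing z-score label {label}")
        elif kind == "concrete":
            template = entry.get("template", "")
            try:
                # Formatter.parse yields (literal, field_name, spec, conversion).
                fields = {field for _, field, _, _ in Formatter().parse(template) if field}
            except ValueError:
                errors.append(f"{name}: template has unbalanced braces")
                continue
            if "value" not in fields:
                errors.append(f"{name}: template lacks {{value}} placeholder")

    # Phrasings that refer to attributes which do not exist.
    for name in sorted(set(phrasings) - set(attributes)):
        errors.append(f"{name}: phrasing for unknown attribute")
    return errors
```

Collecting every error in a list, rather than stopping at the first one, lets a single validate run report all problems in a generated config.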
2. Scenario file routing is fragile
The current routing (_is_scenario_file()) detects scenario files by filename pattern. This should be documented or made more robust. The validator also doesn't clearly distinguish between the new study-folder flow (base_population reference) and the legacy flow (population_spec + study_db paths).
The scenario validator itself is thorough (event, exposure, spread, outcomes, timeline, simulation config, file refs) but the CLI routing to get there could be clearer.
Proposed Solution
- Add _validate_persona_config() with the checks listed above
- Route persona.v*.yaml files to the new validator
- Consider a --type flag to explicitly specify file type instead of relying on filename detection
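One way the routing could work, sketched under assumptions: detect_file_type and its explicit_type parameter are hypothetical names, and the real _is_scenario_file() may use a different pattern than the plain substring test shown here. An explicit type (as a --type flag would supply) wins over filename detection.

```python
from fnmatch import fnmatchcase
from pathlib import Path

VALID_TYPES = ("population", "scenario", "persona")


def detect_file_type(path, explicit_type=None):
    """Pick a validator name for `path` (hypothetical helper, not extropy's API).

    explicit_type stands in for the value of a proposed --type flag.
    """
    if explicit_type is not None:
        if explicit_type not in VALID_TYPES:
            raise ValueError(f"unknown file type: {explicit_type!r}")
        return explicit_type

    # Look only at the filename so directory names cannot affect routing.
    name = Path(path).name.lower()
    if fnmatchcase(name, "persona.v*.yaml"):
        return "persona"
    if "scenario" in name:
        return "scenario"
    if name.endswith(".yaml"):
        return "population"
    raise ValueError(f"cannot infer file type for {path!r}")
```

The persona pattern is checked before the generic .yaml fallback; otherwise persona configs would be routed to the population validator.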