This guide shows how to enrich media recipes with the new PFAS-based schema extensions.
Use the import scripts to automatically add roles to ingredients:
# Preview what would change (dry-run)
just import-pfas-roles --dry-run
# Apply the changes
just import-pfas-roles
# Import cofactor reference data
just import-pfas-cofactors
# Or do both at once
just import-pfas-all

This automatically adds role fields to ingredients with matching CHEBI IDs.
Edit YAML files directly to add new fields. See examples below.
Add functional role annotations to ingredients:
ingredients:
  - preferred_term: Glucose
    concentration:
      value: '10.0'
      unit: G_PER_L
    term:
      id: CHEBI:17234
      label: glucose
    # NEW: Functional roles (multivalued)
    role:
      - CARBON_SOURCE
      - ENERGY_SOURCE

Available roles:
- CARBON_SOURCE - Primary carbon source
- NITROGEN_SOURCE - Nitrogen source
- MINERAL - Major mineral nutrient
- TRACE_ELEMENT - Micronutrient
- BUFFER - pH buffering agent
- VITAMIN_SOURCE - Provides vitamins
- SALT - Osmotic balance/ionic strength
- PROTEIN_SOURCE - Complex protein source
- AMINO_ACID_SOURCE - Amino acids
- SOLIDIFYING_AGENT - Gelling agent (agar)
- ENERGY_SOURCE - Energy source
- ELECTRON_ACCEPTOR - Terminal electron acceptor
- ELECTRON_DONOR - Electron donor
- COFACTOR_PROVIDER - Supplies cofactors
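
A typo in a role name is easy to miss when editing YAML by hand. The following is a minimal sketch of a pre-validation check, assuming the recipe has already been loaded into a plain dict (for example with `yaml.safe_load`). The role names are copied from the list above; the helper `find_unknown_roles` is hypothetical and not part of the project's tooling.

```python
# Role names copied from the "Available roles" list; the helper is illustrative only.
ALLOWED_ROLES = {
    "CARBON_SOURCE", "NITROGEN_SOURCE", "MINERAL", "TRACE_ELEMENT", "BUFFER",
    "VITAMIN_SOURCE", "SALT", "PROTEIN_SOURCE", "AMINO_ACID_SOURCE",
    "SOLIDIFYING_AGENT", "ENERGY_SOURCE", "ELECTRON_ACCEPTOR",
    "ELECTRON_DONOR", "COFACTOR_PROVIDER",
}


def find_unknown_roles(recipe: dict) -> list[tuple[str, str]]:
    """Return (ingredient name, role) pairs whose role is not in ALLOWED_ROLES."""
    problems = []
    for ingredient in recipe.get("ingredients", []):
        name = ingredient.get("preferred_term", "<unnamed>")
        for role in ingredient.get("role", []):
            if role not in ALLOWED_ROLES:
                problems.append((name, role))
    return problems


# A recipe as it would look after loading the YAML into Python objects.
sample_recipe = {
    "ingredients": [
        {"preferred_term": "Glucose", "role": ["CARBON_SOURCE", "ENERGY_SOURCE"]},
        {"preferred_term": "NH4Cl", "role": ["NITROGEN"]},  # deliberate typo
    ]
}
print(find_unknown_roles(sample_recipe))
```

The schema validation described later in this guide remains the authoritative check; a helper like this only gives faster feedback while editing.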
Annotate which cofactors an ingredient provides:
ingredients:
  - preferred_term: MgSO4·7H2O
    concentration:
      value: '1.0'
      unit: G_PER_L
    term:
      id: CHEBI:31795
      label: MgSO4·7H2O
    role:
      - MINERAL
      - COFACTOR_PROVIDER
    # NEW: Cofactors provided
    cofactors_provided:
      - preferred_term: Magnesium ion
        term:
          id: CHEBI:18420
          label: magnesium(2+)
        category: METALS
        bioavailability: Readily available as Mg2+ ion
        notes: Essential cofactor for hundreds of enzymes

Cofactor categories:
- VITAMINS - Vitamins and vitamin-derived cofactors
- METALS - Metal ions (Fe, Mg, Ca, Zn, etc.)
- NUCLEOTIDES - NAD, NADH, NADP, ATP, etc.
- ENERGY_TRANSFER - CoA, SAM, acetyl-CoA, etc.
- OTHER_SPECIALIZED - PQQ, F420, methanofuran, etc.

Cofactor attributes:
- preferred_term (required) - Human-readable name
- term - CHEBI ontology term
- category - One of the categories above
- precursor - Precursor molecule name
- precursor_term - CHEBI term for precursor
- ec_associations - List of EC numbers
- kegg_pathways - List of KEGG pathway IDs
- enzyme_examples - Example enzymes using this cofactor
- biosynthesis_genes - Genes for biosynthesis
- bioavailability - Uptake characteristics
- notes - Additional information
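
The two constraints stated above (preferred_term is required, and category must be one of the listed values) can be checked on a single cofactor entry before running full validation. This is a sketch under the assumption that the entry is a plain dict loaded from YAML; `check_cofactor` is a hypothetical helper, and the actual schema may enforce more than this.

```python
# Category names copied from the "Cofactor categories" list; the helper is illustrative.
COFACTOR_CATEGORIES = {
    "VITAMINS", "METALS", "NUCLEOTIDES", "ENERGY_TRANSFER", "OTHER_SPECIALIZED",
}


def check_cofactor(descriptor: dict) -> list[str]:
    """Return human-readable problems found in one cofactor entry."""
    errors = []
    if not descriptor.get("preferred_term"):
        errors.append("preferred_term is required")
    category = descriptor.get("category")
    # category is optional here; only a present but unlisted value is reported.
    if category is not None and category not in COFACTOR_CATEGORIES:
        errors.append(f"unknown category: {category}")
    return errors


magnesium = {
    "preferred_term": "Magnesium ion",
    "term": {"id": "CHEBI:18420", "label": "magnesium(2+)"},
    "category": "METALS",
}
print(check_cofactor(magnesium))             # no problems expected
print(check_cofactor({"category": "METAL"}))  # missing name and misspelled category
```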
Annotate organisms with their functional role in microbial communities:
target_organisms:
  - preferred_term: Methylobacterium extorquens
    term:
      id: NCBITaxon:408
      label: Methylobacterium extorquens
    strain: AM1
    # NEW: Community role annotation
    community_role:
      - PRIMARY_DEGRADER
    # NEW: Target abundance in community
    target_abundance: 0.6  # 60% of community
    # NEW: Functional contributions
    community_function:
      - C1 metabolism
      - Methanol oxidation

Available community roles:
- PRIMARY_DEGRADER - Direct substrate degradation (40-60% abundance)
- REDUCTIVE_DEGRADER - Reductive degradation pathways
- OXIDATIVE_DEGRADER - Oxidative degradation pathways
- BIOTRANSFORMER - Converts without complete degradation
- SYNERGIST - Complementary functions (15-30% abundance)
- BRIDGE_ORGANISM - Provides essential cofactors to community
- ELECTRON_SHUTTLE - Facilitates electron transfer
- DETOXIFIER - Handles toxic intermediates
- COMMENSAL - General commensal
- COMPETITOR - Competitive organism
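
Because target_abundance is written as a fraction of the community (0.6 for 60%), the values for all organisms in one recipe should plausibly not add up to more than 1.0. That constraint is an assumption made here for illustration, not something this guide or the schema states. A minimal sketch with hypothetical helper names:

```python
import math


def total_target_abundance(recipe: dict) -> float:
    """Sum target_abundance over all organisms; missing values count as 0."""
    return math.fsum(
        organism.get("target_abundance", 0.0)
        for organism in recipe.get("target_organisms", [])
    )


def exceeds_whole_community(recipe: dict, tolerance: float = 1e-9) -> bool:
    """True if the summed fractions exceed 1.0 (assumed upper bound)."""
    return total_target_abundance(recipe) > 1.0 + tolerance


community = {
    "target_organisms": [
        {"preferred_term": "Primary degrader", "target_abundance": 0.5},
        {"preferred_term": "Cofactor provider", "target_abundance": 0.3},
        {"preferred_term": "Commensal"},  # no target abundance given
    ]
}
print(total_target_abundance(community))
print(exceeds_whole_community(community))
```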
Specify which cofactors an organism requires:
target_organisms:
  - preferred_term: Methylobacterium extorquens
    term:
      id: NCBITaxon:408
      label: Methylobacterium extorquens
    # NEW: Cofactor requirements
    cofactor_requirements:
      # Example 1: Auxotroph (cannot synthesize)
      - cofactor:
          preferred_term: Cobalamin (Vitamin B12)
          term:
            id: CHEBI:16335
            label: cobalamin
          category: VITAMINS
          notes: Required for C1 metabolism
        can_biosynthesize: false
        confidence: 0.95
      # Example 2: Prototroph (can synthesize)
      - cofactor:
          preferred_term: Tetrahydrofolate
          term:
            id: CHEBI:20506
            label: tetrahydrofolate
          category: VITAMINS
        can_biosynthesize: true
        confidence: 0.90
        genes:
          - folA
          - folB
          - folC

CofactorRequirement attributes:
- cofactor (required) - CofactorDescriptor (see above)
- can_biosynthesize (required) - true/false
- confidence - Confidence score (0.0-1.0)
- evidence - List of EvidenceItems
- genes - Related gene names
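
Once ingredients carry cofactors_provided and organisms carry cofactor_requirements, the two can be cross-checked: any cofactor an organism cannot synthesize should be supplied by the medium. The sketch below assumes the recipe is a plain dict loaded from YAML and matches cofactors by CHEBI term ID, falling back to the lower-cased preferred_term. The helper names are hypothetical, and the matching rule is an assumption rather than project behavior.

```python
def cofactor_key(descriptor: dict) -> str:
    """Identify a cofactor by its term ID, or by its lower-cased name if no ID is given."""
    term = descriptor.get("term") or {}
    return term.get("id") or descriptor.get("preferred_term", "").lower()


def unmet_auxotrophies(recipe: dict) -> list[tuple[str, str]]:
    """Return (organism, cofactor) pairs that are required but not supplied by any ingredient."""
    provided = {
        cofactor_key(cofactor)
        for ingredient in recipe.get("ingredients", [])
        for cofactor in ingredient.get("cofactors_provided", [])
    }
    unmet = []
    for organism in recipe.get("target_organisms", []):
        for requirement in organism.get("cofactor_requirements", []):
            # Only auxotrophies (can_biosynthesize: false) need an external supply.
            if requirement.get("can_biosynthesize") is False:
                cofactor = requirement["cofactor"]
                if cofactor_key(cofactor) not in provided:
                    unmet.append(
                        (organism.get("preferred_term"), cofactor.get("preferred_term"))
                    )
    return unmet


recipe = {
    "ingredients": [
        {
            "preferred_term": "MgSO4·7H2O",
            "cofactors_provided": [
                {"preferred_term": "Magnesium ion", "term": {"id": "CHEBI:18420"}}
            ],
        }
    ],
    "target_organisms": [
        {
            "preferred_term": "Methylobacterium extorquens",
            "cofactor_requirements": [
                {
                    "cofactor": {"preferred_term": "Cobalamin (Vitamin B12)"},
                    "can_biosynthesize": False,
                },
                {
                    "cofactor": {"preferred_term": "Tetrahydrofolate"},
                    "can_biosynthesize": True,
                },
            ],
        }
    ],
}
print(unmet_auxotrophies(recipe))
```

This check ignores cofactors supplied by other community members (for example a BRIDGE_ORGANISM), so a reported gap is a prompt for review rather than an error.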
Annotate organism transport systems:
target_organisms:
  - preferred_term: Methylobacterium extorquens
    term:
      id: NCBITaxon:408
      label: Methylobacterium extorquens
    # NEW: Transporter systems
    transporters:
      - name: Methanol dehydrogenase (MDH)
        transporter_type: DEHALOGENASE
        substrates:
          - methanol
        substrate_terms:
          - id: CHEBI:17790
            label: methanol
        direction: import
        genes:
          - mxaF
          - mxaI
        ec_number: 1.1.2.7
        notes: PQQ-dependent methanol dehydrogenase
      - name: Nitrate transporter (NarK)
        transporter_type: MFS
        substrates:
          - nitrate
          - nitrite
        substrate_terms:
          - id: CHEBI:17632
            label: nitrate
        direction: import
        genes:
          - narK

Available transporter types:
- ABC - ATP-binding cassette
- MFS - Major facilitator superfamily
- PTS - Phosphotransferase system
- TONB - TonB-dependent receptor
- SYMPORTER - Co-transporter
- ANTIPORTER - Exchanger
- UNIPORTER - Channel
- PORIN - Outer membrane protein
- SIDEROPHORE_RECEPTOR - Iron uptake receptor
- DEHALOGENASE - Dehalogenase enzyme
- FLUORIDE_EXPORTER - Fluoride exporter

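
When reviewing transporter annotations, it can help to see which systems are recorded for each substrate. A small sketch, assuming one organism entry loaded as a plain dict with the fields shown above; `transporters_by_substrate` is a hypothetical helper.

```python
from collections import defaultdict


def transporters_by_substrate(organism: dict) -> dict[str, list[str]]:
    """Map each substrate name to the names of the systems annotated for it."""
    index = defaultdict(list)
    for transporter in organism.get("transporters", []):
        for substrate in transporter.get("substrates", []):
            index[substrate].append(transporter.get("name", "<unnamed>"))
    return dict(index)


organism = {
    "preferred_term": "Methylobacterium extorquens",
    "transporters": [
        {
            "name": "Methanol dehydrogenase (MDH)",
            "transporter_type": "DEHALOGENASE",
            "substrates": ["methanol"],
            "direction": "import",
        },
        {
            "name": "Nitrate transporter (NarK)",
            "transporter_type": "MFS",
            "substrates": ["nitrate", "nitrite"],
            "direction": "import",
        },
    ],
}
for substrate, names in transporters_by_substrate(organism).items():
    print(substrate, "->", ", ".join(names))
```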
See the fully enriched example at:
normalized_yaml/bacterial/Nitrate_Mineral_Salts_Medium_(NMS)_ENRICHED_EXAMPLE.yaml
This shows all new fields in action.
Always validate after enriching:
# Validate a single recipe
just validate normalized_yaml/bacterial/MyRecipe.yaml
# Validate all recipes
just validate-all

For batch enrichment, you can create Python scripts:
#!/usr/bin/env python
"""
Custom enrichment script example
"""
from pathlib import Path

import yaml


def add_roles_to_recipe(recipe_path: Path):
    """Add ingredient roles to a recipe."""
    with open(recipe_path, encoding='utf-8') as f:
        recipe = yaml.safe_load(f)

    # Example: Add roles based on ingredient names
    for ingredient in recipe.get('ingredients', []):
        name = ingredient['preferred_term'].lower()
        if 'glucose' in name:
            ingredient['role'] = ['CARBON_SOURCE', 'ENERGY_SOURCE']
        elif 'kno3' in name or 'nitrate' in name:
            ingredient['role'] = ['NITROGEN_SOURCE', 'ELECTRON_ACCEPTOR']
        elif 'buffer' in name or 'phosphate' in name:
            ingredient['role'] = ['BUFFER', 'MINERAL']
        # Add more mappings...

    # Write back as UTF-8 so names such as MgSO4·7H2O are preserved
    with open(recipe_path, 'w', encoding='utf-8') as f:
        yaml.dump(recipe, f, default_flow_style=False,
                  allow_unicode=True, sort_keys=False)


# Use it
recipe_dir = Path("normalized_yaml/bacterial")
for recipe_file in recipe_dir.glob("*.yaml"):
    add_roles_to_recipe(recipe_file)

- Start with automatic enrichment: Run just import-pfas-all first
- Use CHEBI terms: Always include ontology terms for validation
- Be specific with roles: Use multiple roles when appropriate
- Document changes: Add curation history entries
- Validate frequently: Run validation after each change
- Review imported data: The PFAS import is semi-automatic - review results

- preferred_term: Glucose
  role: [CARBON_SOURCE, ENERGY_SOURCE]
- preferred_term: NH4Cl
  role: [NITROGEN_SOURCE]

- preferred_term: KH2PO4
  role: [BUFFER, MINERAL]
- preferred_term: Na2HPO4
  role: [BUFFER, MINERAL]

- preferred_term: FeSO4
  role: [TRACE_ELEMENT, COFACTOR_PROVIDER]
  cofactors_provided:
    - preferred_term: Iron(II) ion
      term:
        id: CHEBI:29033
        label: iron(2+)
      category: METALS

target_organisms:
  - preferred_term: Primary degrader
    community_role: [PRIMARY_DEGRADER]
    target_abundance: 0.5
  - preferred_term: Cofactor provider
    community_role: [BRIDGE_ORGANISM, SYNERGIST]
    target_abundance: 0.3
    cofactor_requirements:
      - cofactor:
          preferred_term: Vitamin B12
          category: VITAMINS
        can_biosynthesize: true

The enrichment data comes from:
- PFAS Repository: /Users/marcin/Documents/VIMSS/ontology/PFAS/PFASCommunityAgents
  - Ingredient roles: data/sheets_pfas/PFAS_Data_for_AI_media_ingredients_extended.tsv
  - Cofactor hierarchy: data/reference/cofactor_hierarchy.yaml
  - Cofactor mappings: data/reference/ingredient_cofactor_mapping.csv

Generated reference data:
- Cofactors: data/reference/cofactors.yaml (generated by just import-pfas-cofactors)

- Check the schema: src/culturemech/schema/culturemech.yaml
- View the enriched example: normalized_yaml/bacterial/Nitrate_Mineral_Salts_Medium_(NMS)_ENRICHED_EXAMPLE.yaml
- Run import scripts with --help for options