```bash
# Install dependencies
just install

# Optional: Install Koza for KG export
just install-koza
```

CultureMech uses a 3-layer data architecture:
```text
📁 raw/                 ← Layer 1: Original source data (immutable)
└─ mediadive/
   ├─ mediadive_media.json        (3,327 recipes)
   ├─ mediadive_ingredients.json  (1,234 mappings)
   └─ README.md                   (provenance)

📁 data/processed/      ← Layer 2: Intermediate transformations (regenerable)
└─ (future enhancements)

📁 normalized_yaml/     ← Layer 3: Curated LinkML YAML (version controlled)
├─ bacterial/    (2,877 recipes)
├─ fungal/       (114 recipes)
├─ specialized/  (96 recipes)
├─ archaea/      (59 recipes)
└─ algae/        (0 recipes, ready for future)
```
See DATA_LAYERS.md for complete architecture documentation.
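The Layer 1 → Layer 3 flow can be sketched in Python. The field names below (`name`, `category`, `ingredients`, `concentration`) are illustrative stand-ins, not the actual LinkML schema, which lives in `src/culturemech/schema/`:

```python
# Sketch of the Layer 1 -> Layer 3 transformation.
# Field names are hypothetical; see the LinkML schema for the real model.

RAW_RECORD = {  # Layer 1: as fetched from the source (immutable)
    "id": "mediadive:1",
    "name": "LB Broth",
    "ingredients": [{"compound": "tryptone", "amount": 10.0, "unit": "g/L"}],
}

def normalize(raw: dict) -> dict:
    """Produce a Layer 3 record: curated, schema-shaped, version-controlled."""
    return {
        "id": raw["id"],
        "name": raw["name"],
        "category": "bacterial",  # curated by hand at Layer 3
        "ingredients": [
            {"name": i["compound"], "concentration": f'{i["amount"]} {i["unit"]}'}
            for i in raw["ingredients"]
        ],
    }

normalized = normalize(RAW_RECORD)
print(normalized["name"])                    # LB Broth
print(normalized["ingredients"][0])          # {'name': 'tryptone', 'concentration': '10.0 g/L'}
```

The key point is directionality: Layer 1 is never edited in place, so the normalized layer can always be regenerated and diffed against it.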
```bash
# Fetch all available raw data sources
just fetch-raw-data

# Or fetch specific sources
just fetch-mediadive-raw
just fetch-microbe-media-param-raw

# Check what you have
just show-raw-data-stats
```

Result:

```text
MediaDive:
  📁 mediadive_media.json: 3,327 records
  📦 Size: 1.1M
```
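The record count above can be reproduced directly with the standard library, assuming `mediadive_media.json` holds a top-level JSON array (an assumption; check `raw/mediadive/README.md` for the actual layout):

```python
import json

# Stand-in for the contents of raw/mediadive/mediadive_media.json
# (assumed to be a JSON array of recipe records).
sample = '[{"id": 1, "name": "LB Broth"}, {"id": 2, "name": "Nutrient Agar"}]'

records = json.loads(sample)
print(f"mediadive_media.json: {len(records):,} records")  # 2 here; 3,327 in the full file
```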
```bash
# Import all MediaDive recipes
just import-mediadive

# Or test with a limited number
just import-mediadive 10
```

Result: creates 3,146 validated YAML files in `normalized_yaml/`.
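The importer fans recipes out into one YAML file per recipe, sorted into category directories. A minimal sketch of that path logic (the real serialization is handled by the importer; the filename convention here is inferred from examples like `LB_Broth.yaml`):

```python
from pathlib import Path
import tempfile

def recipe_path(root: Path, category: str, name: str) -> Path:
    """One file per recipe, e.g. normalized_yaml/bacterial/LB_Broth.yaml."""
    safe = name.replace(" ", "_")  # assumed naming convention
    return root / category / f"{safe}.yaml"

root = Path(tempfile.mkdtemp())  # stand-in for normalized_yaml/
p = recipe_path(root, "bacterial", "LB Broth")
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text("name: LB Broth\ncategory: bacterial\n")
print(p.name)  # LB_Broth.yaml
```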
```bash
# Validate a single recipe
just validate normalized_yaml/bacterial/LB_Broth.yaml

# Validate all recipes
just validate-all

# Full QC pipeline
just qc
```
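`just validate` runs 3-layer validation. A minimal sketch of the idea (a required-key check followed by a term-prefix check), using hypothetical field names and prefixes, not the real LinkML validator:

```python
import re

REQUIRED = {"name", "category", "ingredients"}          # illustrative
CURIE = re.compile(r"^(CHEBI|NCBITaxon):\d+$")          # prefixes are examples

def validate(recipe: dict) -> list[str]:
    """Return a list of problems; an empty list means the recipe passes."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED - recipe.keys())]
    for ing in recipe.get("ingredients", []):
        term = ing.get("term")
        if term and not CURIE.match(term):
            errors.append(f"bad term: {term}")
    return errors

recipe = {
    "name": "LB Broth",
    "category": "bacterial",
    "ingredients": [{"name": "glucose", "term": "CHEBI:17234"}],
}
print(validate(recipe))  # []
print(validate({"name": "X", "ingredients": [{"term": "glucose"}]}))
# ['missing key: category', 'bad term: glucose']
```

The real pipeline also checks terms against downloaded ontologies (see the term-validation troubleshooting section below), which a regex prefix check cannot do.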
```bash
# Faceted browser
just gen-browser-data
just serve-browser
# Open http://localhost:8000/app/

# HTML pages
just gen-pages
# Pages generated in pages/

# Knowledge graph export
just kgx-export
# KGX edges in output/kgx/
```
```bash
just serve-browser
# Navigate to http://localhost:8000/app/
# Search for "LB Broth" or filter by organism
```

```bash
just gen-page normalized_yaml/bacterial/LB_Broth.yaml
open pages/LB_Broth.html
```
```bash
# Using grep
grep -r "Escherichia coli" normalized_yaml/

# Using the browser (best experience)
just serve-browser
```
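The grep search can also be scripted. This sketch scans recipe files for an organism string with a plain substring match (no schema awareness; `find_recipes` is a hypothetical helper, not part of the CultureMech API):

```python
from pathlib import Path
import tempfile

def find_recipes(root: Path, query: str) -> list[Path]:
    """Return recipe files mentioning the query string, like grep -r."""
    return sorted(
        p for p in root.rglob("*.yaml")
        if query in p.read_text(encoding="utf-8")
    )

root = Path(tempfile.mkdtemp())  # stand-in for normalized_yaml/
(root / "bacterial").mkdir()
(root / "bacterial" / "LB_Broth.yaml").write_text("organism: Escherichia coli\n")
(root / "bacterial" / "Other.yaml").write_text("organism: Bacillus subtilis\n")

hits = find_recipes(root, "Escherichia coli")
print([p.name for p in hits])  # ['LB_Broth.yaml']
```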
```bash
# Create new file (follow the template)
cp normalized_yaml/bacterial/LB_Broth.yaml normalized_yaml/bacterial/My_New_Medium.yaml

# Edit the file
# (see CONTRIBUTING.md for guidelines)

# Validate
just validate normalized_yaml/bacterial/My_New_Medium.yaml

# If valid, commit
git add normalized_yaml/bacterial/My_New_Medium.yaml
git commit -m "Add My New Medium recipe"
```
```bash
# Backup current data
cp -r normalized_yaml normalized_yaml.backup.$(date +%Y%m%d)

# Fetch latest raw data
just fetch-raw-data

# Re-import
just import-mediadive

# Validate
just validate-all

# Review changes
git diff normalized_yaml/

# Commit if satisfied
git add normalized_yaml/
git commit -m "Update MediaDive recipes"
```

Test ingredient name → CHEBI ID mappings:
```bash
# Test a specific ingredient
just test-chemical-mappings glucose

# Show coverage statistics
just chemical-mapping-stats
```

Output:

```text
Lookup: glucose
  ✓ Found in MediaDive
  CHEBI ID: CHEBI:17234
  Label: glucose
  Formula: C6H12O6

Statistics:
  Total ingredients: 1,234
  With CHEBI IDs: 686 (56%)
```
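The coverage percentage in the statistics follows directly from the two counts reported above:

```python
# Reproduce the coverage figure from the reported counts.
total = 1234        # total ingredients
with_chebi = 686    # ingredients mapped to a CHEBI ID

coverage = round(100 * with_chebi / total)
print(f"With CHEBI IDs: {with_chebi} ({coverage}%)")  # With CHEBI IDs: 686 (56%)
```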
```bash
# Fetch the data
just fetch-mediadive-raw

# Check if the path is correct
cat raw/mediadive/README.md

# Update the path in project.justfile if needed
# Look for: cmm_automation_dir := "..."
```
```bash
# Schema validation only (fastest)
just validate-schema normalized_yaml/bacterial/LB_Broth.yaml

# Term validation (requires ontologies)
just validate-terms normalized_yaml/bacterial/LB_Broth.yaml

# Check if ontologies are downloaded
ls ~/.data/oaklib/
```
```bash
# Regenerate browser data
just gen-browser-data

# Check if data.js exists
ls -lh app/data.js

# Serve and check the console
just serve-browser
# Open the browser console (F12) for errors
```
```bash
# Check import statistics
just import-mediadive-stats

# Test with a small batch
just import-mediadive 5

# Check logs for specific errors
# (errors are printed during import)
```
```text
CultureMech/
├─ data/
│  ├─ raw/                # Original source data
│  │  ├─ mediadive/
│  │  └─ microbe-media-param/
│  └─ processed/          # Intermediate (future)
│
├─ kb/
│  └─ media/              # Curated YAML recipes ★
│     ├─ bacterial/
│     ├─ fungal/
│     ├─ archaea/
│     ├─ specialized/
│     └─ algae/
│
├─ src/
│  └─ culturemech/
│     ├─ schema/          # LinkML schema
│     ├─ import/          # Importers
│     ├─ export/          # KGX, browser exports
│     └─ templates/       # Jinja2 HTML templates
│
├─ app/                   # Faceted browser UI
│  ├─ index.html
│  ├─ schema.js
│  └─ data.js             # Generated
│
├─ pages/                 # Generated HTML pages
├─ output/                # Generated KG exports
└─ tests/                 # Test suite
```
Run `just --list` to see all commands, organized by group:

```text
$ just --list
Available recipes:
    [Data]    fetch-raw-data            # Fetch all raw data
    [Data]    fetch-mediadive-raw       # Fetch MediaDive
    [Data]    show-raw-data-stats       # Show statistics
    [Import]  import-mediadive [limit]  # Import from MediaDive
    [Import]  chemical-mapping-stats    # Chemical coverage
    [QC]      validate file             # 3-layer validation
    [QC]      validate-all              # Validate all recipes
    [QC]      qc                        # Full QC pipeline
    [Export]  kgx-export                # Export to KG format
    [Browser] gen-browser-data          # Generate browser data
    [Browser] serve-browser             # Serve browser locally
    [Pages]   gen-pages                 # Generate HTML pages
    [Pages]   gen-page file             # Generate single page
    [Docs]    serve-docs                # Serve documentation
    [Utils]   clean                     # Clean generated files
    [Utils]   count-recipes             # Count by category
```
- Explore the browser: `just serve-browser`
- Read a recipe: `cat normalized_yaml/bacterial/LB_Broth.yaml`
- Understand the schema: `cat src/culturemech/schema/culturemech.yaml`
- Add a recipe: follow CONTRIBUTING.md
- Export to KG: `just kgx-export`
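The KG export writes KGX-style TSV files. A minimal sketch of what one edge row (medium → ingredient) might look like, using only the core KGX columns; the actual export in `output/kgx/` may carry additional columns, and the `biolink:has_part` predicate here is an illustrative choice, not confirmed from the source:

```python
import csv
import io

# One hypothetical edge: a medium linked to one of its ingredients.
edges = [
    {"subject": "mediadive:1", "predicate": "biolink:has_part", "object": "CHEBI:17234"},
]

buf = io.StringIO()
writer = csv.DictWriter(
    buf,
    fieldnames=["subject", "predicate", "object"],
    delimiter="\t",
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(edges)
print(buf.getvalue())
```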
- DATA_LAYERS.md: Complete data architecture
- CONTRIBUTING.md: Curation guidelines
- README.md: Full project documentation
- IMPORT_COMPLETE.md: Import statistics and future roadmap
- IMPLEMENTATION_SUMMARY.md: Technical details
- Issues: File on GitHub (add repo link when available)
- Questions: Check documentation first
- Contributing: See CONTRIBUTING.md
Status: Production ready with 3,146 validated recipes!
Last Updated: 2026-01-21