This document defines the standardized methodology for publishing research papers, creating complete dossiers, and managing the GitHub repository for the HAWRA project.
- Source Management: All papers must be authored in LaTeX using the `templates/paper_template` structure.
- Asset Linking: Figures must be sourced from `06_presentation` or `results` and linked relative to the project root or copied via build script.
- Compilation:
  - Primary: `pdflatex` or `lualatex` via CI/CD pipeline.
  - Fallback: HTML generation with "Print-to-PDF" CSS for environments lacking TeX.
- Output: PDF (PDF/A compliant) and HTML (Web-ready).
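
The primary/fallback compilation rule above could be sketched roughly as follows. This is only an illustration, not the project's build script: the function names `build_latex_command` and `choose_build_path` and the example paths are hypothetical, and the command-line flags are the standard non-interactive options accepted by both `pdflatex` and `lualatex`.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def build_latex_command(tex_file: Path, out_dir: Path, engine: str = "pdflatex") -> list[str]:
    """Return the argument list for one non-interactive LaTeX pass."""
    if engine not in ("pdflatex", "lualatex"):
        raise ValueError(f"unsupported engine: {engine}")
    return [
        engine,
        "-interaction=nonstopmode",      # never stop to prompt on errors
        "-halt-on-error",                # fail on the first error so CI notices
        f"-output-directory={out_dir}",  # keep build products out of the source tree
        str(tex_file),
    ]


def choose_build_path(engines: tuple[str, ...] = ("pdflatex", "lualatex")) -> str:
    """Return the first TeX engine found on PATH, or "html" for the fallback."""
    for engine in engines:
        if shutil.which(engine):
            return engine
    return "html"
```

A CI job might call `choose_build_path()` first and pass the resulting list to `subprocess.run(..., check=True)` only when a TeX engine is available; a full build usually needs several passes (plus a bibliography step) to resolve references.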
- Language: Scientific English (US), checked via `textblob` or a similar linter.
- References: No missing citations (`[?]` or `??`). All BibTeX entries must be valid.
- Figures: Minimum 300 DPI for raster images. Vector graphics (SVG/PDF) preferred for plots.
- Reproducibility: Every figure must link to the specific `simulation_run_id` and code version that generated it.
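
One way to enforce the "no missing citations" rule is to scan the LaTeX log for the warnings that produce `[?]` and `??` in the output. The sketch below assumes the standard LaTeX warning wording and a hypothetical helper name `find_unresolved`; it does not handle keys that the log happens to wrap across two lines.

```python
from __future__ import annotations

import re

# Standard LaTeX warnings for unresolved \cite and \ref commands.
UNDEFINED_CITATION = re.compile(r"LaTeX Warning: Citation `([^']+)'.*undefined")
UNDEFINED_REFERENCE = re.compile(r"LaTeX Warning: Reference `([^']+)'.*undefined")


def find_unresolved(log_text: str) -> dict[str, list[str]]:
    """Return the sorted, de-duplicated keys LaTeX could not resolve."""
    return {
        "citations": sorted(set(UNDEFINED_CITATION.findall(log_text))),
        "references": sorted(set(UNDEFINED_REFERENCE.findall(log_text))),
    }
```

A publication check could fail the build whenever either list is non-empty after the final compilation pass.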
- Drafts: `v0.x-alpha` (internal), `v0.x-beta` (collaborator review).
- Submission: `v1.0-submission` (immutable tag).
- Revisions: `v1.x-revision` (post-review).
- Final: `v1.0-published` (DOI linked).
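
The tag scheme above can be checked mechanically. The following is a minimal sketch that assumes `x` stands for a non-negative integer; the stage labels and the function name `classify_tag` are illustrative, not part of the project.

```python
import re
from typing import Optional

# Assumption: the "x" in v0.x / v1.x is a non-negative integer.
TAG_PATTERNS = {
    "draft-internal": re.compile(r"v0\.\d+-alpha"),
    "draft-review": re.compile(r"v0\.\d+-beta"),
    "submission": re.compile(r"v1\.0-submission"),
    "revision": re.compile(r"v1\.\d+-revision"),
    "published": re.compile(r"v1\.0-published"),
}


def classify_tag(tag: str) -> Optional[str]:
    """Return the publication stage for a tag, or None if it fits no pattern."""
    for stage, pattern in TAG_PATTERNS.items():
        if pattern.fullmatch(tag):
            return stage
    return None
```

For example, `classify_tag("v0.3-alpha")` yields `"draft-internal"`, while an unrecognized tag such as `"v2.0-final"` yields `None`.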
A "Complete Dossier" must follow this exact hierarchy:
HAWRA_Dossier_YYYY_MM_DD/
├── 00_Manifesto/ # Executive Summary & Vision
├── 01_Manuscript/ # The primary scientific paper (PDF + Source)
├── 02_Code/ # Snapshot of relevant source code (Arbol, BioOS)
├── 03_Data/ # Raw simulation data & GenBank files
├── 04_Multimedia/ # High-res figures & Videos/GIFs
└── 05_Supplementary/ # SOPs, Lab Protocols, Notebooks
- `README.md`: Explains the dossier contents.
- `manifest.json`: List of all files with MD5 checksums.
- `LICENSE`: Usage rights.
- `requirements.txt` or `environment.yml`: For code reproducibility.
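
As an illustration of how `manifest.json` might be generated, the sketch below walks a dossier directory and records an MD5 checksum per file. The JSON layout (a single `"files"` object mapping relative paths to hex digests) and the function names are assumptions; the source does not specify the manifest schema.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dossier_root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its MD5 hex digest."""
    manifest = {}
    for path in sorted(dossier_root.rglob("*")):
        # The manifest cannot contain its own checksum, so it is skipped.
        if path.is_file() and path.name != "manifest.json":
            manifest[path.relative_to(dossier_root).as_posix()] = md5_of(path)
    return manifest


def write_manifest(dossier_root: Path) -> Path:
    """Write manifest.json at the dossier root (assumed layout) and return its path."""
    target = dossier_root / "manifest.json"
    payload = {"files": build_manifest(dossier_root)}
    target.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")
    return target
```

MD5 is adequate for detecting accidental corruption during transfer, which is the use described here; it is not a defense against deliberate tampering.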
- Automated script `scripts/verify_dossier.py` must be run before distribution.
- Checks: File existence, Checksum validity, PDF readability, Code syntax (linting).
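
A rough sketch of what such a verification pass could look like is given below. It is not the actual `scripts/verify_dossier.py`: it assumes the required files sit at the dossier root, approximates "PDF readability" by checking the `%PDF-` file signature, approximates "code syntax" by parsing Python files, and leaves checksum validation out because the manifest schema is not specified.

```python
from __future__ import annotations

import ast
from pathlib import Path

REQUIRED_DIRS = (
    "00_Manifesto", "01_Manuscript", "02_Code",
    "03_Data", "04_Multimedia", "05_Supplementary",
)
REQUIRED_FILES = ("README.md", "manifest.json", "LICENSE")
ENVIRONMENT_FILES = ("requirements.txt", "environment.yml")  # at least one is needed


def verify_dossier(root: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    if not any((root / name).is_file() for name in ENVIRONMENT_FILES):
        problems.append("missing requirements.txt or environment.yml")
    # Minimal readability check: every PDF must start with the PDF signature.
    for pdf in sorted(root.rglob("*.pdf")):
        with pdf.open("rb") as handle:
            if handle.read(5) != b"%PDF-":
                problems.append(f"not a PDF: {pdf.relative_to(root).as_posix()}")
    # Minimal syntax check: every Python file must parse.
    for source in sorted(root.rglob("*.py")):
        try:
            ast.parse(source.read_text(encoding="utf-8"))
        except (SyntaxError, ValueError) as error:
            rel = source.relative_to(root).as_posix()
            problems.append(f"syntax error: {rel} ({error})")
    return problems
```

Returning a list of problems rather than raising on the first failure lets the script report everything that needs fixing in one run.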
Refactored structure for clarity and industry standards:
/
├── .github/ # CI/CD Workflows
├── src/ # Source Code
│ ├── arbol/
│ ├── bioos/
│ └── simulator/
├── docs/ # Documentation (Sphinx/MkDocs)
├── data/ # Static Data (GenBank, Configs)
├── experiments/ # Jupyter Notebooks & One-off scripts
├── publications/ # LaTeX sources for papers
├── results/ # Simulation outputs (GitIgnored mostly)
├── scripts/ # Build & Maintenance utilities
└── tests/ # Unit & Integration tests
- Naming: `snake_case` for files/folders. `PascalCase` for Classes.
- Commits: Conventional Commits (e.g., `feat: add new quantum gate`, `fix: correct plasmid sequence`).
- Branching: `main` (stable), `develop` (integration), `feature/*` (dev).
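
These conventions lend themselves to a simple automated check, for instance in a commit hook. The sketch below is illustrative only: the list of commit types follows common Conventional Commits usage (the source shows only `feat` and `fix`), and the allowed characters in `feature/*` branch names are an assumption.

```python
import re

# Assumed set of commit types; only `feat` and `fix` appear in the source.
COMMIT_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)
# type, optional (scope), optional "!", then ": " and a non-empty description.
COMMIT_HEADER = re.compile(
    r"^(?:" + "|".join(COMMIT_TYPES) + r")(?:\([\w\-]+\))?!?: \S.*$"
)
# Assumption: feature branch names use lowercase letters, digits, "_" and "-".
BRANCH_NAME = re.compile(r"^(?:main|develop|feature/[a-z0-9][a-z0-9_\-]*)$")


def is_valid_commit(message: str) -> bool:
    """Check only the first line (the header) of a commit message."""
    header = message.splitlines()[0] if message else ""
    return COMMIT_HEADER.match(header) is not None


def is_valid_branch(name: str) -> bool:
    """Accept main, develop, or a feature/* branch."""
    return BRANCH_NAME.match(name) is not None
```

With these definitions, `is_valid_commit("feat: add new quantum gate")` is accepted and `is_valid_commit("Added stuff")` is rejected.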
- Test Suite: Runs `pytest` on every push to `develop` and `main`.
- Linting: `flake8` or `black` for code quality.
- Doc Build: Generates static site from `docs/` to GitHub Pages.
- Paper Compile: Compiles LaTeX in `publications/` to PDF artifacts on release tags.
- Templates: Create reusable templates for LaTeX papers and Dossier structures.
- Scripts: Develop `verify_publication.py` and `build_dossier.py`.
- Migration: Move existing files to the new `src/`, `experiments/`, `publications/` structure (Iterative process).