This repository contains course materials, hands-on labs, and assessment frameworks for a graduate-level course in applied machine learning for bioinformatics. It demonstrates full-cycle curriculum design, instructional delivery, and learner evaluation using real-world datasets and reproducible workflows.
Instructor: Sarangan "Ravi" Ravichandran
The course is primarily based on:
Joel Grus, Data Science from Scratch, 2nd Edition, O’Reilly Media, Inc.
The original concepts, structure, and examples from the book have been expanded and adapted for instructional use. In particular, this repository includes:
- Detailed explanations and annotations to support learning
- Additional examples beyond those in the textbook
- Code adaptations designed for clarity and pedagogical use
- Conversion of Python examples into ready-to-run Google Colab notebooks, requiring no local setup
These materials are intended to complement the textbook, not replace it. Students are expected to read the corresponding book chapters alongside the notebooks in this repository.
All materials in this repository are provided for educational use only.
Any modifications, extensions, or instructional adaptations beyond the original textbook and code examples are the responsibility of the course instructor.
The notebooks in this repository are adapted from the examples presented in Data Science from Scratch (2nd Edition) and have been implemented in Google Colab for ease of use. In addition to making the code readily executable, the notebooks include:
- Detailed explanations of key concepts
- Step-by-step walkthroughs of how the code works
- Additional comments and annotations to clarify implementation details
- Minor adaptations to improve readability and instructional flow
Students are encouraged to read the corresponding textbook chapters first, and then use these notebooks to reinforce understanding through hands-on practice and experimentation.
The Notebooks/ directory contains hands-on instructional labs used during the course. Each notebook is designed to support a specific learning objective and includes:
- Concept explanations in Markdown
- Step-by-step code walkthroughs
- Embedded discussion prompts
- Examples intended for exploration and modification by learners
Notebooks are structured to be run top-to-bottom in Google Colab and are used in conjunction with the course project milestones defined in /course-docs/.
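One way to check that a saved notebook was actually run top-to-bottom is to inspect the execution counts stored in the `.ipynb` JSON. The sketch below is not part of the course materials. The helper names `code_execution_counts` and `ran_top_to_bottom` are hypothetical, and it assumes the notebook was saved with its outputs and execution counts intact.

```python
import json

def code_execution_counts(notebook: dict) -> list:
    """Return the execution counts of non-empty code cells, in document order."""
    counts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb files usually store source as a list of lines
            source = "".join(source)
        if not source.strip():
            continue  # ignore empty code cells, which are never executed
        counts.append(cell.get("execution_count"))
    return counts

def ran_top_to_bottom(notebook: dict) -> bool:
    """True if the code cells were executed exactly once, in order, from a fresh kernel."""
    counts = code_execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))

# A tiny in-memory notebook (hypothetical content) standing in for a real file.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Lab 1"]},
        {"cell_type": "code", "source": ["x = 1"], "execution_count": 1},
        {"cell_type": "code", "source": ["print(x)"], "execution_count": 2},
    ]
}
print(ran_top_to_bottom(example))

# For a file on disk (the path is a placeholder):
# with open("Notebooks/example.ipynb", encoding="utf-8") as f:
#     print(ran_top_to_bottom(json.load(f)))
```

A notebook whose cells were run out of order, or re-run several times, fails this check, which is usually a sign to restart the runtime and run all cells again before sharing.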
This course follows a structured instructional design approach aligned with the ADDIE model (Analysis, Design, Development, Implementation, Evaluation). Key design features:
- Progressive, project-based learning from exploratory data analysis to modeling, evaluation, and interpretation
- Emphasis on reproducibility, methodological rigor, and responsible use of machine learning
- Hands-on instruction using Python and Jupyter/Colab notebooks with real-world datasets
- Designed for learners with mixed technical backgrounds
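As an illustration of the reproducibility emphasis, the following is a minimal sketch of a seeded train/test split written in plain Python, in keeping with the from-scratch style of the course. It is not taken from the textbook or the course notebooks. The function name `split_data`, its parameters, and the toy data are assumptions made for this example.

```python
import random
from typing import List, Tuple, TypeVar

X = TypeVar("X")  # generic type for a single data point

def split_data(data: List[X], test_fraction: float, seed: int = 0) -> Tuple[List[X], List[X]]:
    """Shuffle a copy of the data with a fixed seed and split it into (train, test)."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    shuffled = data[:]          # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

# Toy data standing in for real samples.
samples = list(range(100))
train, test = split_data(samples, test_fraction=0.25, seed=42)
train_again, test_again = split_data(samples, test_fraction=0.25, seed=42)

print(len(train), len(test))
print(train == train_again and test == test_again)
```

Passing the seed explicitly, rather than relying on global random state, means that rerunning a notebook from the top reproduces the same split, so reported metrics can be compared across runs.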
Learner progress is evaluated through clearly defined milestones and performance-based assessments, including:
- Multi-stage project deliverables (proposal, EDA, modeling, final demo)
- Live notebook demonstrations and repository walkthroughs
- Evaluation criteria emphasizing:
  - Data quality and preprocessing
  - Appropriate method selection and validation
  - Interpretation, limitations, and responsible claims
  - Reproducibility and code hygiene
Detailed project requirements and grading rubrics are provided in the /course-docs/ directory.
bifx-546/
├── README.md
├── Notebooks/
│   ├── README.md
│   ├── *.ipynb
│
├── course-docs/
│   ├── README.md
│   ├── BIFX_546_ML_For_Bioinformatics.pdf
│   ├── BIFX_546_Proj_Deliverables_and_Grading_Req.pdf
│   └── BIFX_546_Project_Rubric.pdf