kangwonlee
diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
Lines changed: 3 additions & 2 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 43 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 18 additions & 13 deletions
@@ -100,10 +100,10 @@ jobs:
     strategy:
       matrix:
         model: [
-          gemini-2.5-flash,
-          grok-code-fast,
           claude-sonnet-4-20250514,
+          gemini-2.5-flash,
           google/gemma-2-9b-it,
+          grok-code-fast,
           sonar
         ]
       fail-fast: false
@@ -137,6 +137,7 @@ jobs:
       run: |
         . .venv/bin/activate
         uv pip list
+        sleep 1
         python3 entrypoint.py
       timeout-minutes: 5
     - name: Output and Verify
 
@@ -16,6 +16,49 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
+## [0.3.13] - 2025-09-07
+
+### Added
+- **Security Enhancements**:
+  - Implemented `sanitize_input` function in `prompt.py` to prevent prompt injection attacks by removing or escaping common injection patterns and sensitive keywords (e.g., "ignore previous instructions", "grading logic").
+  - Added input sanitization for student code, README content, and JSON reports to ensure safe inclusion in LLM prompts.
+  - Introduced random delimiters to wrap content and prevent malicious prompt manipulation.
+  - Added a `sleep 1` command in `build.yml` to stabilize pipeline execution by introducing a brief delay before running `entrypoint.py`.
+  - Added guardrail instructions in `prompt.py` to restrict LLM behavior to providing feedback based solely on provided test results, student code, and assignment instructions.
+
+### Changed
+- **Build Workflow** (`build.yml`):
+  - Reordered the model matrix alphabetically (`claude-sonnet-4-20250514`, `gemini-2.5-flash`, `google/gemma-2-9b-it`, `grok-code-fast`, `sonar`) for a consistent model testing order.
+  - Updated `actions/checkout` from `v4` to `v5` for improved performance and compatibility.
+  - Changed default model from `gemini` to `gemini-2.5-flash` for improved performance and accuracy.
+- **README** (`README.md`):
+  - Updated project description to highlight enhanced security against prompt injection attacks.
+  - Added details about new security features (input sanitization and random delimiters) in the "Key Features" and "Notes" sections.
+  - Changed input parameters (`report-files`, `student-files`, `readme-path`) to have no default values, requiring explicit configuration for clarity.
+  - Updated default model in input table and example workflow to `gemini-2.5-flash`.
+  - Added note about prompt injection mitigation, emphasizing use in controlled environments.
+  - Updated future enhancements to include advanced parsing for stronger prompt injection defenses.
+  - Added troubleshooting guidance for prompt injection anomalies.
+  - Updated acknowledgments to reflect contributions from `Gemini 2.5 Flash` instead of `Gemini 2.0 Flash`.
+- **Prompt Logic** (`prompt.py`):
+  - Applied input sanitization to `longrepr`, `stderr`, student code, README content, and locale files to prevent prompt injection.
+  - Added guardrail instruction in `get_initial_instruction` to enforce strict adherence to feedback tasks.
+  - Improved type hints and function signatures for better code clarity and maintainability.
+  - Optimized string handling by replacing newlines with spaces and limiting input length to 10,000 characters to prevent prompt structure disruption.
+- **Tests** (`test_prompt.py`):
+  - Improved test assertions with descriptive error messages for better debugging.
+  - Simplified test code by removing redundant tuple conversions and map operations.
+  - Updated test fixtures and assertions to align with sanitized input handling.
+  - Renamed test functions to follow a consistent naming convention (e.g., `test__exclude_common_contents__single` to `test_exclude_common_contents__single`).
+  - Updated `expected_default_gemini_model` fixture to reflect the new default model `gemini-2.5-flash`.
+  - Enhanced `test_collect_longrepr__compare_contents` to track found markers and report missing ones explicitly.
+  - Corrected minor typos and formatting in test comments and strings (e.g., standardized language terms like "Foutmelding" for Dutch).
+
+### Fixed
+- Ensured consistent handling of README content by sanitizing inputs in `assignment_instruction` and `exclude_common_contents` to prevent malicious patterns from affecting prompt generation.
+- Fixed potential issues in test suite by adding explicit type checking and non-empty string validation in `test_collect_longrepr__has_list_items_len`.
+- Addressed missing marker detection in `test_collect_longrepr__compare_contents` by tracking found markers instead of modifying the marker list.
+
 ## [v0.3.12] - 2025-09-07
 
 ### Added
 
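
Note on the changelog entry above: the diff does not include `prompt.py`, so the following is only a minimal sketch of what the described sanitization and delimiter wrapping might look like. The pattern list, the `[removed]` placeholder, the `wrap_with_delimiter` helper, and the delimiter format are all assumptions for illustration; only the `sanitize_input` name, the two example phrases, the newline replacement, and the 10,000-character limit come from the changelog text.

```python
import re
import secrets

# Hypothetical pattern list; the real list in prompt.py is not shown in this diff.
INJECTION_PATTERNS = (
    r"ignore\s+previous\s+instructions",
    r"grading\s+logic",
)
MAX_INPUT_LENGTH = 10_000  # length cap stated in the changelog


def sanitize_input(text: str) -> str:
    """Neutralize known injection phrases, flatten newlines, and cap length."""
    for pattern in INJECTION_PATTERNS:
        # "[removed]" is an assumed placeholder, not the project's actual one.
        text = re.sub(pattern, "[removed]", text, flags=re.IGNORECASE)
    # The changelog says newlines are replaced with spaces.
    text = text.replace("\r", " ").replace("\n", " ")
    return text[:MAX_INPUT_LENGTH]


def wrap_with_delimiter(label: str, content: str) -> str:
    """Wrap sanitized content in a per-call random delimiter (hypothetical helper)."""
    marker = f"{label}_{secrets.token_hex(8)}"
    return f"<<<{marker}>>>\n{sanitize_input(content)}\n<<<END_{marker}>>>"
```

Because the delimiter is generated per call, text embedded in a submission cannot predict the closing marker. Whether the actual implementation flattens newlines in student code, where line structure matters, cannot be determined from this diff.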
@@ -3,9 +3,9 @@
 
 # AI Code Tutor
 
-This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback.
+This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback with enhanced security against prompt injection attacks.
 
-The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation.
+The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation. New security features sanitize inputs and use random delimiters to prevent malicious prompt manipulation.
 
 ## Key Features
 - AI-powered feedback for C/C++ and Python assignments.
@@ -14,6 +14,7 @@ The AI tutor processes JSON test reports from `pytest-json-report`, generated by
 - Flexible LLM selection (Claude, Gemini, Grok, Nvidia NIM, Perplexity) with Gemini fallback.
 - Customizable feedback language (e.g., English, Korean).
 - Excludes common README content to optimize API usage.
+- **Security Enhancements**: Sanitizes student code and READMEs to remove malicious patterns and wraps content with random delimiters to prevent prompt injection attacks.
 
 ## Prerequisites
 - **Python Dependencies**:
@@ -47,7 +48,7 @@ jobs:
       WORKSPACE_OUTPUT: ${{ runner.temp }}/output
       CONTAINER_OUTPUT: /output
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
       - name: Set up environment
         run: pip install pytest==8.3.5 pytest-json-report==1.5.0 pytest-xdist==3.6.1 requests==2.32.4
       - name: Create output folder
@@ -61,14 +62,14 @@ jobs:
             ghcr.io/kangwonlee/edu-base-cpp:4e0d6d8 \
             /bin/sh -c "cmake . -DCMAKE_BUILD_TYPE=Debug -DSTUDENT_DIR=${{ env.CONTAINER_WORKSPACE }} && make && python3 -m pytest --json-report --json-report-indent=4 --json-report-file=${{ env.CONTAINER_OUTPUT }}/report.json test_dynamic.py"
       - name: AI Code Tutor
-        uses: kangwonlee/gemini-python-tutor@v0.3.7
+        uses: kangwonlee/gemini-python-tutor@v0.3.12
         if: always()
         with:
           report-files: ${{ env.WORKSPACE_OUTPUT }}/report.json
           student-files: ${{ env.CONTAINER_SRC }}/${{ env.C_FILENAME }}
           readme-path: ${{ env.CONTAINER_WORKSPACE }}/README.md
           explanation-in: English
-          model: gemini
+          model: gemini-2.5-flash
           INPUT_CLAUDE_API_KEY: ${{ secrets.INPUT_CLAUDE_API_KEY }}
           INPUT_GOOGLE_API_KEY: ${{ secrets.INPUT_GOOGLE_API_KEY }}
           INPUT_GROK_API_KEY: ${{ secrets.INPUT_GROK_API_KEY }}
@@ -79,12 +80,13 @@ jobs:
 
 ### Notes
 - **C/C++ Testing**: Tests can run in a Docker container with `pytest` wrapping C/C++ code (e.g., via `ctypes` for shared libraries, as in `test_dynamic.py`). Ensure JSON reports are generated.
-- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any one of available key.
+- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini-2.5-flash`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any available key.
 - **Secrets**: Store API keys as repository secrets with `INPUT_` prefix (e.g., `INPUT_GOOGLE_API_KEY`) in Settings > Secrets and variables > Actions.
 - **README Optimization**: Exclude common README content with:
   - Start: ``From here is common to all assignments.``
   - End: ``Until here is common to all assignments.``
   - Use double backticks (``).
+- **Security**: Student code and READMEs are sanitized to remove malicious patterns (e.g., "ignore previous instructions") and wrapped with random delimiters to prevent prompt injection.
 
 ### Optimizing pytest for AI Feedback
 - Use descriptive test names (e.g., `test_sum_range_for__valid_input`).
@@ -94,11 +96,11 @@ jobs:
 ## Inputs
 | Input                   | Description                                      | Required | Default         |
 |-------------------------|--------------------------------------------------|----------|-----------------|
-| `report-files`          | Comma-separated JSON report files                | Yes      | `report.json`   |
-| `student-files`         | Comma-separated student code files (`.c`, `.cpp`, `.py`) | Yes | `exercise.py`   |
-| `readme-path`           | Path to assignment instructions (README.md)      | No       | `README.md`     |
+| `report-files`          | Comma-separated JSON report files                | Yes      | None            |
+| `student-files`         | Comma-separated student code files (`.c`, `.cpp`, `.py`) | Yes | None            |
+| `readme-path`           | Path to assignment instructions (README.md)      | Yes      | None            |
 | `explanation-in`        | Feedback language (e.g., English, Korean)        | No       | `English`       |
-| `model`                 | Preferred LLM (e.g., `gemini`, `claude`)        | No       | None            |
+| `model`                 | Preferred LLM (e.g., `gemini-2.5-flash`, `claude-sonnet-4-20250514`) | No | `gemini-2.5-flash` |
 | `INPUT_CLAUDE_API_KEY`  | Claude API key                                  | No*      | None            |
 | `INPUT_GOOGLE_API_KEY`  | Google Gemini API key                           | No*      | None            |
 | `INPUT_GROK_API_KEY`    | Grok API key                                    | No*      | None            |
@@ -114,7 +116,7 @@ with:
   student-files: 'src/main.c,src/utils.c'
   readme-path: README.md
   explanation-in: English
-  model: gemini
+  model: gemini-2.5-flash
   INPUT_GOOGLE_API_KEY: ${{ secrets.INPUT_GOOGLE_API_KEY }}
   INPUT_CLAUDE_API_KEY: ${{ secrets.INPUT_CLAUDE_API_KEY }}
 ```
@@ -126,11 +128,13 @@ with:
 - Primarily supports C/C++ and Python assignments via `pytest-json-report`.
 - Requires at least one valid API key.
 - C/C++ feedback relies on `pytest` tests wrapping compiled code.
+- Prompt injection mitigated but not eliminated; use in controlled environments.
 
 ## Future Enhancements
 - Auto-detect feedback language.
 - Support additional programming languages.
 - Add verbose mode for detailed feedback.
+- Enhance prompt injection defenses with advanced parsing.
 
 ## Troubleshooting
 Check GitHub Actions logs for details.
@@ -139,6 +143,7 @@ Check GitHub Actions logs for details.
 - **API Key Issues**: "No API keys provided" – Ensure at least one API key is set in secrets.
 - **Report File Issues**: "Report file not found" – Verify JSON report exists.
 - **Student File Issues**: "Student file not found" – Check file paths and extensions.
+- **Prompt Injection**: Malicious inputs are sanitized and wrapped with random delimiters, but monitor outputs for anomalies.
 
 ### Debugging Tips
 - View logs in the "AI Code Tutor" job.
@@ -149,10 +154,10 @@ Check GitHub Actions logs for details.
 Questions? Contact [https://github.com/kangwonlee](https://github.com/kangwonlee).
 
 ## License
-BSD 3-Clause License + Do Not Harm.  
+BSD 3-Clause License + Do Not Harm.
 Copyright (c) 2024 Kangwon Lee
 
 ## Acknowledgements
 - Built using [python-github-action-template](https://github.com/cicirello/python-github-action-template) by Vincent A. Cicirello (MIT License).
-- Gemini 2.0 Flash and Grok 3 assisted with code and documentation.
+- Gemini 2.5 Flash and Grok assisted with code and documentation.
 - Registered as #C-2024-034203, #C-2024-035473, #C-2025-016393, and #C-2025-027967 with the Korea Copyright Commission.
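
Note on the "Model Selection" entry in the README hunk above: the fallback order it describes (preferred model's key, then Gemini if `INPUT_GOOGLE_API_KEY` is set, then any available key) could be sketched roughly as follows. This is an illustration under assumptions, not the action's actual code: the `pick_api_key_name` function and the prefix-to-secret mapping are hypothetical, and only the three key names listed here and the "No API keys provided" message appear in the README.

```python
from typing import Mapping, Optional

# Hypothetical mapping from a model-name prefix to the secret that unlocks it.
KEY_FOR_PREFIX = {
    "claude": "INPUT_CLAUDE_API_KEY",
    "gemini": "INPUT_GOOGLE_API_KEY",
    "grok": "INPUT_GROK_API_KEY",
}


def pick_api_key_name(model: Optional[str], env: Mapping[str, str]) -> str:
    """Return the name of the API key to use, following the README's fallback order."""
    # 1. Prefer the key that matches the requested model, if it is set.
    if model:
        for prefix, key_name in KEY_FOR_PREFIX.items():
            if model.startswith(prefix) and env.get(key_name):
                return key_name
    # 2. Otherwise fall back to Gemini when its key is available.
    if env.get("INPUT_GOOGLE_API_KEY"):
        return "INPUT_GOOGLE_API_KEY"
    # 3. Otherwise use any key that happens to be set.
    for key_name in KEY_FOR_PREFIX.values():
        if env.get(key_name):
            return key_name
    raise ValueError("No API keys provided")
```

For example, requesting `claude-sonnet-4-20250514` with only Google and Grok keys configured would select `INPUT_GOOGLE_API_KEY` in this sketch.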