Skip to content

Commit 9cfb48f

Browse files
committed
Merge branch 'feature/input-sanitization'
2 parents 8bb5cf7 + dbf5812 commit 9cfb48f

5 files changed

Lines changed: 333 additions & 329 deletions

File tree

.github/workflows/build.yml

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -100,10 +100,10 @@ jobs:
100100
strategy:
101101
matrix:
102102
model: [
103-
gemini-2.5-flash,
104-
grok-code-fast,
105103
claude-sonnet-4-20250514,
104+
gemini-2.5-flash,
106105
google/gemma-2-9b-it,
106+
grok-code-fast,
107107
sonar
108108
]
109109
fail-fast: false
@@ -137,6 +137,7 @@ jobs:
137137
run: |
138138
. .venv/bin/activate
139139
uv pip list
140+
sleep 1
140141
python3 entrypoint.py
141142
timeout-minutes: 5
142143
- name: Output and Verify

CHANGELOG.md

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,49 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1616

1717
### Fixed
1818

19+
## [0.3.13] - 2025-09-07
20+
21+
### Added
22+
- **Security Enhancements**:
23+
- Implemented `sanitize_input` function in `prompt.py` to prevent prompt injection attacks by removing or escaping common injection patterns and sensitive keywords (e.g., "ignore previous instructions", "grading logic").
24+
- Added input sanitization for student code, README content, and JSON reports to ensure safe inclusion in LLM prompts.
25+
- Introduced random delimiters to wrap content and prevent malicious prompt manipulation.
26+
- Added a `sleep 1` command in `build.yml` to stabilize pipeline execution by introducing a brief delay before running `entrypoint.py`.
27+
- Added guardrail instructions in `prompt.py` to restrict LLM behavior to providing feedback based solely on provided test results, student code, and assignment instructions.
28+
29+
### Changed
30+
- **Build Workflow** (`build.yml`):
31+
- Reordered the model matrix to prioritize `claude-sonnet-4-20250514` and `google/gemma-2-9b-it`, ensuring consistency in model testing order.
32+
- Updated `actions/checkout` from `v4` to `v5` for improved performance and compatibility.
33+
- Changed default model from `gemini` to `gemini-2.5-flash` for improved performance and accuracy.
34+
- **README** (`README.md`):
35+
- Updated project description to highlight enhanced security against prompt injection attacks.
36+
- Added details about new security features (input sanitization and random delimiters) in the "Key Features" and "Notes" sections.
37+
- Changed input parameters (`report-files`, `student-files`, `readme-path`) to have no default values, requiring explicit configuration for clarity.
38+
- Updated default model in input table and example workflow to `gemini-2.5-flash`.
39+
- Added note about prompt injection mitigation, emphasizing use in controlled environments.
40+
- Updated future enhancements to include advanced parsing for stronger prompt injection defenses.
41+
- Added troubleshooting guidance for prompt injection anomalies.
42+
- Updated acknowledgments to reflect contributions from `Gemini 2.5 Flash` instead of `Gemini 2.0 Flash`.
43+
- **Prompt Logic** (`prompt.py`):
44+
- Applied input sanitization to `longrepr`, `stderr`, student code, README content, and locale files to prevent prompt injection.
45+
- Added guardrail instruction in `get_initial_instruction` to enforce strict adherence to feedback tasks.
46+
- Improved type hints and function signatures for better code clarity and maintainability.
47+
- Optimized string handling by replacing newlines with spaces and limiting input length to 10,000 characters to prevent prompt structure disruption.
48+
- **Tests** (`test_prompt.py`):
49+
- Improved test assertions with descriptive error messages for better debugging.
50+
- Simplified test code by removing redundant tuple conversions and map operations.
51+
- Updated test fixtures and assertions to align with sanitized input handling.
52+
- Renamed test functions to follow a consistent naming convention (e.g., `test__exclude_common_contents__single` to `test_exclude_common_contents__single`).
53+
- Updated `expected_default_gemini_model` fixture to reflect the new default model `gemini-2.5-flash`.
54+
- Enhanced `test_collect_longrepr__compare_contents` to track found markers and report missing ones explicitly.
55+
- Corrected minor typos and formatting in test comments and strings (e.g., standardized language terms like "Foutmelding" for Dutch).
56+
57+
### Fixed
58+
- Ensured consistent handling of README content by sanitizing inputs in `assignment_instruction` and `exclude_common_contents` to prevent malicious patterns from affecting prompt generation.
59+
- Fixed potential issues in test suite by adding explicit type checking and non-empty string validation in `test_collect_longrepr__has_list_items_len`.
60+
- Addressed missing marker detection in `test_collect_longrepr__compare_contents` by tracking found markers instead of modifying the marker list.
61+
1962
## [v0.3.12] - 2025-09-07
2063

2164
### Added

README.md

Lines changed: 18 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,9 @@
33

44
# AI Code Tutor
55

6-
This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback.
6+
This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback with enhanced security against prompt injection attacks.
77

8-
The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation.
8+
The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation. New security features sanitize inputs and use random delimiters to prevent malicious prompt manipulation.
99

1010
## Key Features
1111
- AI-powered feedback for C/C++ and Python assignments.
@@ -14,6 +14,7 @@ The AI tutor processes JSON test reports from `pytest-json-report`, generated by
1414
- Flexible LLM selection (Claude, Gemini, Grok, Nvidia NIM, Perplexity) with Gemini fallback.
1515
- Customizable feedback language (e.g., English, Korean).
1616
- Excludes common README content to optimize API usage.
17+
- **Security Enhancements**: Sanitizes student code and READMEs to remove malicious patterns and wraps content with random delimiters to prevent prompt injection attacks.
1718

1819
## Prerequisites
1920
- **Python Dependencies**:
@@ -47,7 +48,7 @@ jobs:
4748
WORKSPACE_OUTPUT: ${{ runner.temp }}/output
4849
CONTAINER_OUTPUT: /output
4950
steps:
50-
- uses: actions/checkout@v4
51+
- uses: actions/checkout@v5
5152
- name: Set up environment
5253
run: pip install pytest==8.3.5 pytest-json-report==1.5.0 pytest-xdist==3.6.1 requests==2.32.4
5354
- name: Create output folder
@@ -61,14 +62,14 @@ jobs:
6162
ghcr.io/kangwonlee/edu-base-cpp:4e0d6d8 \
6263
/bin/sh -c "cmake . -DCMAKE_BUILD_TYPE=Debug -DSTUDENT_DIR=${{ env.CONTAINER_WORKSPACE }} && make && python3 -m pytest --json-report --json-report-indent=4 --json-report-file=${{ env.CONTAINER_OUTPUT }}/report.json test_dynamic.py"
6364
- name: AI Code Tutor
64-
uses: kangwonlee/gemini-python-tutor@v0.3.7
65+
uses: kangwonlee/gemini-python-tutor@v0.3.12
6566
if: always()
6667
with:
6768
report-files: ${{ env.WORKSPACE_OUTPUT }}/report.json
6869
student-files: ${{ env.CONTAINER_SRC }}/${{ env.C_FILENAME }}
6970
readme-path: ${{ env.CONTAINER_WORKSPACE }}/README.md
7071
explanation-in: English
71-
model: gemini
72+
model: gemini-2.5-flash
7273
INPUT_CLAUDE_API_KEY: ${{ secrets.INPUT_CLAUDE_API_KEY }}
7374
INPUT_GOOGLE_API_KEY: ${{ secrets.INPUT_GOOGLE_API_KEY }}
7475
INPUT_GROK_API_KEY: ${{ secrets.INPUT_GROK_API_KEY }}
@@ -79,12 +80,13 @@ jobs:
7980

8081
### Notes
8182
- **C/C++ Testing**: Tests can run in a Docker container with `pytest` wrapping C/C++ code (e.g., via `ctypes` for shared libraries, as in `test_dynamic.py`). Ensure JSON reports are generated.
82-
- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any one of available key.
83+
- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini-2.5-flash`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any available key.
8384
- **Secrets**: Store API keys as repository secrets with `INPUT_` prefix (e.g., `INPUT_GOOGLE_API_KEY`) in Settings > Secrets and variables > Actions.
8485
- **README Optimization**: Exclude common README content with:
8586
- Start: ``From here is common to all assignments.``
8687
- End: ``Until here is common to all assignments.``
8788
- Use double backticks (``).
89+
- **Security**: Student code and READMEs are sanitized to remove malicious patterns (e.g., "ignore previous instructions") and wrapped with random delimiters to prevent prompt injection.
8890

8991
### Optimizing pytest for AI Feedback
9092
- Use descriptive test names (e.g., `test_sum_range_for__valid_input`).
@@ -94,11 +96,11 @@ jobs:
9496
## Inputs
9597
| Input | Description | Required | Default |
9698
|-------------------------|--------------------------------------------------|----------|-----------------|
97-
| `report-files` | Comma-separated JSON report files | Yes | `report.json` |
98-
| `student-files` | Comma-separated student code files (`.c`, `.cpp`, `.py`) | Yes | `exercise.py` |
99-
| `readme-path` | Path to assignment instructions (README.md) | No | `README.md` |
99+
| `report-files` | Comma-separated JSON report files | Yes | None |
100+
| `student-files` | Comma-separated student code files (`.c`, `.cpp`, `.py`) | Yes | None |
101+
| `readme-path` | Path to assignment instructions (README.md) | Yes | None |
100102
| `explanation-in` | Feedback language (e.g., English, Korean) | No | `English` |
101-
| `model` | Preferred LLM (e.g., `gemini`, `claude`) | No | None |
103+
| `model` | Preferred LLM (e.g., `gemini-2.5-flash`, `claude-sonnet-4-20250514`) | No | `gemini-2.5-flash` |
102104
| `INPUT_CLAUDE_API_KEY` | Claude API key | No* | None |
103105
| `INPUT_GOOGLE_API_KEY` | Google Gemini API key | No* | None |
104106
| `INPUT_GROK_API_KEY` | Grok API key | No* | None |
@@ -114,7 +116,7 @@ with:
114116
student-files: 'src/main.c,src/utils.c'
115117
readme-path: README.md
116118
explanation-in: English
117-
model: gemini
119+
model: gemini-2.5-flash
118120
INPUT_GOOGLE_API_KEY: ${{ secrets.INPUT_GOOGLE_API_KEY }}
119121
INPUT_CLAUDE_API_KEY: ${{ secrets.INPUT_CLAUDE_API_KEY }}
120122
```
@@ -126,11 +128,13 @@ with:
126128
- Primarily supports C/C++ and Python assignments via `pytest-json-report`.
127129
- Requires at least one valid API key.
128130
- C/C++ feedback relies on `pytest` tests wrapping compiled code.
131+
- Prompt injection mitigated but not eliminated; use in controlled environments.
129132

130133
## Future Enhancements
131134
- Auto-detect feedback language.
132135
- Support additional programming languages.
133136
- Add verbose mode for detailed feedback.
137+
- Enhance prompt injection defenses with advanced parsing.
134138

135139
## Troubleshooting
136140
Check GitHub Actions logs for details.
@@ -139,6 +143,7 @@ Check GitHub Actions logs for details.
139143
- **API Key Issues**: "No API keys provided" – Ensure at least one API key is set in secrets.
140144
- **Report File Issues**: "Report file not found" – Verify JSON report exists.
141145
- **Student File Issues**: "Student file not found" – Check file paths and extensions.
146+
- **Prompt Injection**: Malicious inputs are sanitized and wrapped with random delimiters, but monitor outputs for anomalies.
142147

143148
### Debugging Tips
144149
- View logs in the "AI Code Tutor" job.
@@ -149,10 +154,10 @@ Check GitHub Actions logs for details.
149154
Questions? Contact [https://github.com/kangwonlee](https://github.com/kangwonlee).
150155

151156
## License
152-
BSD 3-Clause License + Do Not Harm.
157+
BSD 3-Clause License + Do Not Harm.
153158
Copyright (c) 2024 Kangwon Lee
154159

155160
## Acknowledgements
156161
- Built using [python-github-action-template](https://github.com/cicirello/python-github-action-template) by Vincent A. Cicirello (MIT License).
157-
- Gemini 2.0 Flash and Grok 3 assisted with code and documentation.
162+
- Gemini 2.5 Flash and Grok assisted with code and documentation.
158163
- Registered as #C-2024-034203, #C-2024-035473, #C-2025-016393, and #C-2025-027967 with the Korea Copyright Commission.

0 commit comments

Comments
 (0)