CHANGELOG.md (43 additions & 0 deletions)
@@ -16,6 +16,49 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
+## [0.3.13] - 2025-09-07
+
+### Added
+- **Security Enhancements**:
+  - Implemented `sanitize_input` function in `prompt.py` to prevent prompt injection attacks by removing or escaping common injection patterns and sensitive keywords (e.g., "ignore previous instructions", "grading logic").
+  - Added input sanitization for student code, README content, and JSON reports to ensure safe inclusion in LLM prompts.
+  - Introduced random delimiters to wrap content and prevent malicious prompt manipulation.
+  - Added a `sleep 1` command in `build.yml` to stabilize pipeline execution by introducing a brief delay before running `entrypoint.py`.
+  - Added guardrail instructions in `prompt.py` to restrict LLM behavior to providing feedback based solely on provided test results, student code, and assignment instructions.
+
+### Changed
+- **Build Workflow** (`build.yml`):
+  - Reordered the model matrix to prioritize `claude-sonnet-4-20250514` and `google/gemma-2-9b-it`, ensuring consistency in model testing order.
+  - Updated `actions/checkout` from `v4` to `v5` for improved performance and compatibility.
+  - Changed default model from `gemini` to `gemini-2.5-flash` for improved performance and accuracy.
+- **README** (`README.md`):
+  - Updated project description to highlight enhanced security against prompt injection attacks.
+  - Added details about new security features (input sanitization and random delimiters) in the "Key Features" and "Notes" sections.
+  - Changed input parameters (`report-files`, `student-files`, `readme-path`) to have no default values, requiring explicit configuration for clarity.
+  - Updated default model in input table and example workflow to `gemini-2.5-flash`.
+  - Added note about prompt injection mitigation, emphasizing use in controlled environments.
+  - Updated future enhancements to include advanced parsing for stronger prompt injection defenses.
+  - Added troubleshooting guidance for prompt injection anomalies.
+  - Updated acknowledgments to reflect contributions from `Gemini 2.5 Flash` instead of `Gemini 2.0 Flash`.
+- **Prompt Logic** (`prompt.py`):
+  - Applied input sanitization to `longrepr`, `stderr`, student code, README content, and locale files to prevent prompt injection.
+  - Added guardrail instruction in `get_initial_instruction` to enforce strict adherence to feedback tasks.
+  - Improved type hints and function signatures for better code clarity and maintainability.
+  - Optimized string handling by replacing newlines with spaces and limiting input length to 10,000 characters to prevent prompt structure disruption.
+- **Tests** (`test_prompt.py`):
+  - Improved test assertions with descriptive error messages for better debugging.
+  - Simplified test code by removing redundant tuple conversions and map operations.
+  - Updated test fixtures and assertions to align with sanitized input handling.
+  - Renamed test functions to follow a consistent naming convention (e.g., `test__exclude_common_contents__single` to `test_exclude_common_contents__single`).
+  - Updated `expected_default_gemini_model` fixture to reflect the new default model `gemini-2.5-flash`.
+  - Enhanced `test_collect_longrepr__compare_contents` to track found markers and report missing ones explicitly.
+  - Corrected minor typos and formatting in test comments and strings (e.g., standardized language terms like "Foutmelding" for Dutch).
+
+### Fixed
+- Ensured consistent handling of README content by sanitizing inputs in `assignment_instruction` and `exclude_common_contents` to prevent malicious patterns from affecting prompt generation.
+- Fixed potential issues in test suite by adding explicit type checking and non-empty string validation in `test_collect_longrepr__has_list_items_len`.
+- Addressed missing marker detection in `test_collect_longrepr__compare_contents` by tracking found markers instead of modifying the marker list.
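
The changelog entries above describe the sanitization approach only in prose, and the diff of `prompt.py` itself is not shown here. The following is a minimal sketch of how such a step might look, not the project's actual implementation. The pattern list, the `[REDACTED]` replacement, the delimiter format, and the helper name `wrap_with_random_delimiter` are all assumptions made for illustration; only the name `sanitize_input`, the two example phrases, the newline replacement, and the 10,000-character limit come from the changelog.

```python
import re
import secrets

# Hypothetical pattern list: the changelog names only these two example phrases.
INJECTION_PATTERNS = (
    r"ignore\s+previous\s+instructions",
    r"grading\s+logic",
)
MAX_INPUT_LENGTH = 10_000  # length cap stated in the changelog


def sanitize_input(text: str) -> str:
    """Flatten newlines, neutralize known injection phrases, and cap the length."""
    # Replace newlines with spaces so the input cannot forge prompt structure.
    flattened = text.replace("\r", " ").replace("\n", " ")
    for pattern in INJECTION_PATTERNS:
        # Assumption: matches are replaced with a marker rather than escaped.
        flattened = re.sub(pattern, "[REDACTED]", flattened, flags=re.IGNORECASE)
    return flattened[:MAX_INPUT_LENGTH]


def wrap_with_random_delimiter(label: str, content: str) -> str:
    """Wrap sanitized content in delimiters that the content cannot predict."""
    token = secrets.token_hex(8)  # fresh random token for every call
    start = f"<<{label}_START_{token}>>"
    end = f"<<{label}_END_{token}>>"
    return f"{start}\n{sanitize_input(content)}\n{end}"
```

Because the token is generated per call, text inside the student's code cannot close the delimiter early, which is presumably the purpose of the random delimiters the changelog mentions. Pattern filtering alone is easy to evade, which fits the changelog's note about using the action in controlled environments.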
README.md (18 additions & 13 deletions)
@@ -3,9 +3,9 @@
 
 # AI Code Tutor
 
-This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback.
+This GitHub Action uses AI to provide personalized feedback for student assignments in C/C++ and Python. It analyzes test results and code, identifying errors, suggesting optimizations, and explaining concepts clearly. Ideal for GitHub Classroom, it saves instructors time and ensures consistent, on-demand feedback with enhanced security against prompt injection attacks.
 
-The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation.
+The AI tutor processes JSON test reports from `pytest-json-report`, generated by `pytest` tests wrapping C/C++ or Python code. It detects logic errors, recommends efficient algorithms, and links to relevant documentation. New security features sanitize inputs and use random delimiters to prevent malicious prompt manipulation.
 
 ## Key Features
 - AI-powered feedback for C/C++ and Python assignments.
@@ -14,6 +14,7 @@ The AI tutor processes JSON test reports from `pytest-json-report`, generated by
 - Customizable feedback language (e.g., English, Korean).
 - Excludes common README content to optimize API usage.
+- **Security Enhancements**: Sanitizes student code and READMEs to remove malicious patterns and wraps content with random delimiters to prevent prompt injection attacks.
 - **C/C++ Testing**: Tests can run in a Docker container with `pytest` wrapping C/C++ code (e.g., via `ctypes` for shared libraries, as in `test_dynamic.py`). Ensure JSON reports are generated.
-- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any one of available key.
+- **Model Selection**: Set `model` to prefer an LLM (e.g., `gemini-2.5-flash`). If its key is unavailable, the action falls back to Gemini if `INPUT_GOOGLE_API_KEY` is set, or uses any available key.
 - **Secrets**: Store API keys as repository secrets with `INPUT_` prefix (e.g., `INPUT_GOOGLE_API_KEY`) in Settings > Secrets and variables > Actions.
 - **README Optimization**: Exclude common README content with:
   - Start: ``From here is common to all assignments.``
   - End: ``Until here is common to all assignments.``
   - Use double backticks (``).
+- **Security**: Student code and READMEs are sanitized to remove malicious patterns (e.g., "ignore previous instructions") and wrapped with random delimiters to prevent prompt injection.
 
 ### Optimizing pytest for AI Feedback
 - Use descriptive test names (e.g., `test_sum_range_for__valid_input`).
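
The **Model Selection** note in the README diff above describes a three-step fallback: use the preferred model if its key is present, otherwise Gemini if `INPUT_GOOGLE_API_KEY` is set, otherwise any available key. A minimal sketch of that order follows. It is an illustration under assumptions, not the action's code: the function name `select_provider`, the prefix-based matching, and every secret name other than `INPUT_GOOGLE_API_KEY` are hypothetical.

```python
from typing import Mapping, Optional

# Hypothetical mapping from model-name prefix to the secret holding its key.
# Only INPUT_GOOGLE_API_KEY appears in the README; the other names are assumed.
PROVIDER_KEY_ENV = {
    "gemini": "INPUT_GOOGLE_API_KEY",
    "claude": "INPUT_CLAUDE_API_KEY",    # assumed name
    "example": "INPUT_EXAMPLE_API_KEY",  # placeholder for any other provider
}


def select_provider(preferred_model: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the provider whose key will be used, or None if no key is set."""

    def has_key(provider: str) -> bool:
        return bool(env.get(PROVIDER_KEY_ENV[provider]))

    # 1. Use the preferred model's provider when its key is available.
    preferred = next(
        (p for p in PROVIDER_KEY_ENV if preferred_model.startswith(p)), None
    )
    if preferred is not None and has_key(preferred):
        return preferred
    # 2. Otherwise fall back to Gemini if its key is set.
    if has_key("gemini"):
        return "gemini"
    # 3. Otherwise use any provider that has a key.
    return next((p for p in PROVIDER_KEY_ENV if has_key(p)), None)
```

For example, under these assumptions `select_provider("claude-sonnet-4-20250514", {"INPUT_GOOGLE_API_KEY": "k"})` falls through to step 2 and selects Gemini.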