Skip to content

Commit dbf5812

Browse files
committed
doc : add changelog for v0.3.13
1 parent 873088c commit dbf5812

1 file changed

Lines changed: 43 additions & 0 deletions

File tree

CHANGELOG.md

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,49 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1616

1717
### Fixed
1818

19+
## [0.3.13] - 2025-09-07
20+
21+
### Added
22+
- **Security Enhancements**:
23+
- Implemented `sanitize_input` function in `prompt.py` to prevent prompt injection attacks by removing or escaping common injection patterns and sensitive keywords (e.g., "ignore previous instructions", "grading logic").
24+
- Added input sanitization for student code, README content, and JSON reports to ensure safe inclusion in LLM prompts.
25+
- Introduced random delimiters to wrap content and prevent malicious prompt manipulation.
26+
- Added a `sleep 1` command in `build.yml` to stabilize pipeline execution by introducing a brief delay before running `entrypoint.py`.
27+
- Added guardrail instructions in `prompt.py` to restrict LLM behavior to providing feedback based solely on provided test results, student code, and assignment instructions.
28+
29+
### Changed
30+
- **Build Workflow** (`build.yml`):
31+
- Reordered the model matrix to prioritize `claude-sonnet-4-20250514` and `google/gemma-2-9b-it`, ensuring consistency in model testing order.
32+
- Updated `actions/checkout` from `v4` to `v5` for improved performance and compatibility.
33+
- Changed default model from `gemini` to `gemini-2.5-flash` for improved performance and accuracy.
34+
- **README** (`README.md`):
35+
- Updated project description to highlight enhanced security against prompt injection attacks.
36+
- Added details about new security features (input sanitization and random delimiters) in the "Key Features" and "Notes" sections.
37+
- Changed input parameters (`report-files`, `student-files`, `readme-path`) to have no default values, requiring explicit configuration for clarity.
38+
- Updated default model in input table and example workflow to `gemini-2.5-flash`.
39+
- Added note about prompt injection mitigation, emphasizing use in controlled environments.
40+
- Updated future enhancements to include advanced parsing for stronger prompt injection defenses.
41+
- Added troubleshooting guidance for prompt injection anomalies.
42+
- Updated acknowledgments to reflect contributions from `Gemini 2.5 Flash` instead of `Gemini 2.0 Flash`.
43+
- **Prompt Logic** (`prompt.py`):
44+
- Applied input sanitization to `longrepr`, `stderr`, student code, README content, and locale files to prevent prompt injection.
45+
- Added guardrail instruction in `get_initial_instruction` to enforce strict adherence to feedback tasks.
46+
- Improved type hints and function signatures for better code clarity and maintainability.
47+
- Optimized string handling by replacing newlines with spaces and limiting input length to 10,000 characters to prevent prompt structure disruption.
48+
- **Tests** (`test_prompt.py`):
49+
- Improved test assertions with descriptive error messages for better debugging.
50+
- Simplified test code by removing redundant tuple conversions and map operations.
51+
- Updated test fixtures and assertions to align with sanitized input handling.
52+
- Renamed test functions to follow a consistent naming convention (e.g., `test__exclude_common_contents__single` to `test_exclude_common_contents__single`).
53+
- Updated `expected_default_gemini_model` fixture to reflect the new default model `gemini-2.5-flash`.
54+
- Enhanced `test_collect_longrepr__compare_contents` to track found markers and report missing ones explicitly.
55+
- Corrected minor typos and formatting in test comments and strings (e.g., standardized language terms like "Foutmelding" for Dutch).
56+
57+
### Fixed
58+
- Ensured consistent handling of README content by sanitizing inputs in `assignment_instruction` and `exclude_common_contents` to prevent malicious patterns from affecting prompt generation.
59+
- Fixed potential issues in test suite by adding explicit type checking and non-empty string validation in `test_collect_longrepr__has_list_items_len`.
60+
- Addressed missing marker detection in `test_collect_longrepr__compare_contents` by tracking found markers instead of modifying the marker list.
61+
1962
## [v0.3.12] - 2025-09-07
2063

2164
### Added

0 commit comments

Comments
 (0)