This repository was archived by the owner on Mar 27, 2026. It is now read-only.
src/utils/diff_parser.rs
Lines changed: 72 additions & 67 deletions
@@ -128,6 +128,77 @@ impl DiffParser {
128
128
Ok(files)
129
129
}
130
130
131
+
/// Get the instructions for interpreting git diff output
132
+
fn get_diff_instructions() -> Vec<String> {
133
+
let instructions = r#"**Instructions for Interpreting Git Diff Output**
134
+
135
+
This document provides a guide to understanding the diff output generated by RepoDiff.
136
+
137
+
**Important Note:** The diff output in `repodiff_output.txt` has been sanitized to focus what's relevant for understanding the diffs.
138
+
Real-world Git diff output may contain more details.
139
+
140
+
**1. Basic Structure:**
141
+
142
+
A Git diff file describes the *differences* between two versions of a file. It's structured into *hunks*, which represent contiguous regions of change.
143
+
144
+
* `diff --git a/<path> b/<path>`: Indicates the file being compared. `a/` refers to the "old" version, and `b/` refers to the "new" version. (Note that paths always use forward slashes in Git diff output, even on Windows systems.)
145
+
* `--- a/<path>`: Marks the beginning of the original file content.
146
+
* `+++ b/<path>`: Marks the beginning of the modified file content.
147
+
* `@@ -<start_line_old>,<num_lines_old> +<start_line_new>,<num_lines_new> @@ <section_header>`: This is the *hunk header*. (Optional in simplified output, but common in real diffs).
148
+
* `-<start_line_old>,<num_lines_old>`: Indicates the starting line number and number of lines in the *old* version of the file that this hunk represents. If only one line is affected, `,<num_lines_old>` will be omitted.
149
+
* `+<start_line_new>,<num_lines_new>`: Indicates the starting line number and number of lines in the *new* version of the file that this hunk represents. If only one line is affected, `,<num_lines_new>` will be omitted.
150
+
* `<section_header>`: (Optional) This is often a function or method name, providing context for the change.
151
+
* Hunk Content: Lines within a hunk are marked with a prefix:
152
+
* ` ` (space): Unchanged line (context).
153
+
* `-`: Line removed from the old version.
154
+
* `+`: Line added to the new version.
155
+
156
+
**2. Simplified Example:**
157
+
158
+
```
159
+
diff --git a/MyFile.cs b/MyFile.cs
160
+
--- a/MyFile.cs
161
+
+++ b/MyFile.cs
162
+
// Some code
163
+
string oldValue = "old";
164
+
-// Removed line
165
+
+string newValue = "new";
166
+
// More code
167
+
```
168
+
169
+
**Explanation of the Example:**
170
+
171
+
* The file being changed is `MyFile.cs`.
172
+
* `" string oldValue = "old";"`: This line is present in both versions.
173
+
* `-// Removed line`: This line was removed from the old version.
174
+
* `+string newValue = "new";`: This line was added to the new version.
175
+
* `" // More code"`: This line is present in both versions.
176
+
177
+
**3. Key LLM Considerations:**
178
+
179
+
* **Focus on Content Lines:** The most important part for understanding changes is the content prefixed with ` `, `-`, or `+`.
180
+
* **Context is Crucial:** Use the surrounding unchanged lines to understand the *purpose* of the change.
181
+
* **File Paths:** Pay attention to the file paths (`a/<path>`, `b/<path>`) to understand which files are being modified.
182
+
183
+
**4. Application to your File:**
184
+
185
+
* **".cs" Files:** Changes to C# source code. Focus on the addition (`+`) and removal (`-`) of code lines to understand logic changes.
186
+
* **"Test*.cs" Files:** Changes to unit test files. These are often important for understanding how the functionality is being tested and whether the changes are robust.
187
+
* **".xml" Files:** Changes to configuration or data files. Look for added, removed, or modified XML elements and attributes. Focus is usually on changes to properties.
188
+
189
+
**5. Special Instructions for File Types based on the given filters:**
190
+
191
+
* `.cs` code is assumed to not contain test code
192
+
* `*Test*.cs` contain test code, which should be helpful for understanding functionality.
193
+
* `*.xml` contains configuration.
194
+
195
+
By focusing on these key elements, you can effectively extract meaningful information from Git diff output and summarize the changes made in a software project.
/// Reconstruct a unified diff from the processed patch dictionary
132
203
///
133
204
/// # Arguments
@@ -138,73 +209,7 @@ impl DiffParser {
138
209
139
210
// Only add instructions if the patch dictionary is not empty
140
211
if !patch_dict.is_empty() {
141
-
// Add instructions at the beginning of the output
142
-
output.push("**Instructions for Interpreting Git Diff Output**".to_string());
143
-
output.push("".to_string());
144
-
output.push("This document provides a guide to understanding the diff output generated by RepoDiff.".to_string());
145
-
output.push("".to_string());
146
-
output.push("**Important Note:** The diff output in `repodiff_output.txt` has been sanitized to focus what's relevant for understanding the diffs.".to_string());
147
-
output.push("Real-world Git diff output may contain more details.".to_string());
output.push("A Git diff file describes the *differences* between two versions of a file. It's structured into *hunks*, which represent contiguous regions of change.".to_string());
152
-
output.push("".to_string());
153
-
output.push("* `diff --git a/<path> b/<path>`: Indicates the file being compared. `a/` refers to the \"old\" version, and `b/` refers to the \"new\" version. (Note that paths always use forward slashes in Git diff output, even on Windows systems.)".to_string());
154
-
output.push("* `--- a/<path>`: Marks the beginning of the original file content.".to_string());
155
-
output.push("* `+++ b/<path>`: Marks the beginning of the modified file content.".to_string());
156
-
output.push("* `@@ -<start_line_old>,<num_lines_old> +<start_line_new>,<num_lines_new> @@ <section_header>`: This is the *hunk header*. (Optional in simplified output, but common in real diffs).".to_string());
157
-
output.push(" * `-<start_line_old>,<num_lines_old>`: Indicates the starting line number and number of lines in the *old* version of the file that this hunk represents. If only one line is affected, `,<num_lines_old>` will be omitted.".to_string());
158
-
output.push(" * `+<start_line_new>,<num_lines_new>`: Indicates the starting line number and number of lines in the *new* version of the file that this hunk represents. If only one line is affected, `,<num_lines_new>` will be omitted.".to_string());
159
-
output.push(" * `<section_header>`: (Optional) This is often a function or method name, providing context for the change.".to_string());
160
-
output.push("* Hunk Content: Lines within a hunk are marked with a prefix:".to_string());
161
-
output.push(" * ` ` (space): Unchanged line (context).".to_string());
162
-
output.push(" * `-`: Line removed from the old version.".to_string());
163
-
output.push(" * `+`: Line added to the new version.".to_string());
output.push("* **Focus on Content Lines:** The most important part for understanding changes is the content prefixed with ` `, `-`, or `+`.".to_string());
189
-
output.push("* **Context is Crucial:** Use the surrounding unchanged lines to understand the *purpose* of the change.".to_string());
190
-
output.push("* **File Paths:** Pay attention to the file paths (`a/<path>`, `b/<path>`) to understand which files are being modified.".to_string());
191
-
output.push("".to_string());
192
-
output.push("**4. Application to your File:**".to_string());
193
-
output.push("".to_string());
194
-
output.push("* **\".cs\" Files:** Changes to C# source code. Focus on the addition (`+`) and removal (`-`) of code lines to understand logic changes.".to_string());
195
-
output.push("* **\"Test*.cs\" Files:** Changes to unit test files. These are often important for understanding how the functionality is being tested and whether the changes are robust.".to_string());
196
-
output.push("* **\".xml\" Files:** Changes to configuration or data files. Look for added, removed, or modified XML elements and attributes. Focus is usually on changes to properties.".to_string());
197
-
output.push("".to_string());
198
-
output.push("**5. Special Instructions for File Types based on the given filters:**".to_string());
199
-
output.push("".to_string());
200
-
output.push("* `.cs` code is assumed to not contain test code".to_string());
201
-
output.push("* `*Test*.cs` contain test code, which should be helpful for understanding functionality.".to_string());
output.push("By focusing on these key elements, you can effectively extract meaningful information from Git diff output and summarize the changes made in a software project.".to_string());
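The change above replaces a long run of `output.push("...".to_string())` calls with one raw string literal returned from `get_diff_instructions() -> Vec<String>`. The rest of that function body is not visible in this diff, so the following is only a minimal sketch of how a raw string can be turned into the same `Vec<String>` shape. The constant `INSTRUCTIONS` and the helper `instructions_as_lines` are hypothetical names chosen for illustration, and the text is shortened to the first lines of the real literal.

```rust
/// Hypothetical, shortened stand-in for the instruction text in the diff.
/// A raw string (r#"..."#) needs no escaping for embedded double quotes.
const INSTRUCTIONS: &str = r#"**Instructions for Interpreting Git Diff Output**

This document provides a guide to understanding the diff output generated by RepoDiff.
"#;

/// Split a multi-line literal into owned lines.
/// `str::lines` keeps interior blank lines as empty strings, which matches
/// the removed `output.push("".to_string())` calls.
fn instructions_as_lines(text: &str) -> Vec<String> {
    text.lines().map(String::from).collect()
}

fn main() {
    // Assumed usage: prepend the instructions to an output buffer.
    let mut output: Vec<String> = Vec::new();
    output.extend(instructions_as_lines(INSTRUCTIONS));
    for line in &output {
        println!("{line}");
    }
}
```

Keeping the text in one literal means the wording can be edited without touching escaping or the number of `push` calls, at the cost of one split at call time.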
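The instruction text in this diff describes the hunk header format `@@ -<start_line_old>,<num_lines_old> +<start_line_new>,<num_lines_new> @@ <section_header>` and the ` `, `-`, `+` line prefixes. As a rough illustration of those rules, the sketch below parses a header and classifies a hunk line. It is not taken from `DiffParser`; the names `HunkHeader`, `LineKind`, `parse_range`, `parse_hunk_header`, and `classify` are hypothetical, and an omitted line count is treated as 1.

```rust
/// Line ranges read from a unified-diff hunk header (hypothetical type).
#[derive(Debug, PartialEq)]
struct HunkHeader {
    old_start: u32,
    old_count: u32,
    new_start: u32,
    new_count: u32,
    section: Option<String>,
}

/// Kind of a line inside a hunk, decided by its first character.
#[derive(Debug, PartialEq)]
enum LineKind {
    Context,
    Removed,
    Added,
}

/// Parse "<start>[,<count>]"; an omitted count is treated as 1.
fn parse_range(range: &str) -> Option<(u32, u32)> {
    match range.split_once(',') {
        Some((start, count)) => Some((start.parse().ok()?, count.parse().ok()?)),
        None => Some((range.parse().ok()?, 1)),
    }
}

/// Parse a line such as "@@ -128,6 +128,77 @@ impl DiffParser {".
fn parse_hunk_header(line: &str) -> Option<HunkHeader> {
    let rest = line.strip_prefix("@@ -")?;
    let (ranges, section) = rest.split_once(" @@")?;
    let (old, new) = ranges.split_once(" +")?;
    let (old_start, old_count) = parse_range(old)?;
    let (new_start, new_count) = parse_range(new)?;
    let section = section.trim();
    Some(HunkHeader {
        old_start,
        old_count,
        new_start,
        new_count,
        // The section header is optional, so an empty remainder becomes None.
        section: (!section.is_empty()).then(|| section.to_string()),
    })
}

/// Classify a line that is already known to be inside a hunk.
/// File header lines ("--- a/...", "+++ b/...") must be filtered out first,
/// because they also start with '-' or '+'.
fn classify(line: &str) -> Option<LineKind> {
    match line.chars().next()? {
        ' ' => Some(LineKind::Context),
        '-' => Some(LineKind::Removed),
        '+' => Some(LineKind::Added),
        _ => None,
    }
}

fn main() {
    let header = parse_hunk_header("@@ -128,6 +128,77 @@ impl DiffParser {");
    println!("{header:?}");
    println!("{:?}", classify("+string newValue = \"new\";"));
}
```

The header used in `main` is the first hunk header of this diff; splitting on `" @@"` before splitting on `" +"` keeps a `+` inside the section header from being mistaken for the new-file range.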