
feat: support reading external files as Code Review rules via use_file_path #87

Open
zephyrq-z wants to merge 4 commits into alibaba:main from zephyrq-z:feature/rule


@zephyrq-z zephyrq-z commented Jun 9, 2026


Description

This PR adds support for loading Code Review rules from external .md / .txt files via a top-level use_file_path toggle in rule.json.

In practice, inlining complex review rules directly into rule.json makes the configuration bloated and hard to maintain. This change allows users to extract rule content into separate documents and reference them by path, keeping rule.json concise and readable.

Core Changes

  • New top-level field use_file_path (boolean) on ProjectRule:
    • When true, all rule fields within the same rule.json are treated as relative paths to external .md / .txt files. The file content is read and overwrites the rule field at load time.
    • When false or omitted, rule fields continue to work as inline string rules — fully backward compatible.
  • New resolveRuleFiles() function implements the file loading logic, called from loadGlobalRule(), loadRuleFile(), and loadProjectRule() after JSON unmarshalling.
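The toggle semantics above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the struct layout is inferred from the field names mentioned in this thread, and `loadRule`, `fileReader`, and `mapReader` are hypothetical helpers that replace real file access with an in-memory map so the behaviour is easy to follow.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// ProjectRuleEntry and ProjectRule mirror the names used in this PR;
// the exact field layout is an assumption made for illustration.
type ProjectRuleEntry struct {
	Path string `json:"path"`
	Rule string `json:"rule"`
}

type ProjectRule struct {
	UseFilePath bool               `json:"use_file_path"`
	Rules       []ProjectRuleEntry `json:"rules"`
}

// fileReader is a hypothetical abstraction over file access.
type fileReader func(path string) (string, error)

// loadRule unmarshals rule.json and, only when use_file_path is true,
// replaces each rule string with the content returned by read.
// A failed read logs a warning and keeps the original string.
func loadRule(data []byte, read fileReader) (ProjectRule, error) {
	var pr ProjectRule
	if err := json.Unmarshal(data, &pr); err != nil {
		return pr, fmt.Errorf("parse rule.json: %w", err)
	}
	if !pr.UseFilePath {
		return pr, nil // false or omitted: rules stay inline
	}
	for i := range pr.Rules {
		content, err := read(pr.Rules[i].Rule)
		if err != nil {
			fmt.Fprintf(os.Stderr, "[WARN] rule file %q not loaded: %v\n", pr.Rules[i].Rule, err)
			continue
		}
		pr.Rules[i].Rule = content
	}
	return pr, nil
}

// mapReader builds a fileReader backed by an in-memory map.
func mapReader(files map[string]string) fileReader {
	return func(path string) (string, error) {
		content, ok := files[path]
		if !ok {
			return "", fmt.Errorf("no such file: %s", path)
		}
		return content, nil
	}
}
```

Because an omitted JSON boolean unmarshals to `false`, existing rule.json files without the field take the early return and behave exactly as before.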

Security & Robustness

The resolveRuleFiles function enforces strict safety guards:

  1. Directory traversal protection — resolves symlinks on both base dir and file path, then performs an absolute-path prefix check to block ../ escapes.
  2. Extension whitelist — only .md and .txt files are accepted; other extensions are rejected with a warning.
  3. File size cap — individual rule files are limited to 100 KB to prevent LLM context-window bloat.
  4. Graceful degradation — missing files, permission errors, unsupported extensions, or empty paths all produce a [WARN] log and leave the original rule string intact (no panics).
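The four guards above might be combined roughly as in the sketch below. It is written under stated assumptions rather than copied from the PR: `readRuleFile`, `resolveOrKeep`, `allowedExt`, and `withinBase` are hypothetical names, "100 KB" is read as 100 * 1024 bytes, and the containment check uses `filepath.Rel` instead of the string-prefix comparison described in the PR, since a relative-path check cannot be fooled by a sibling directory such as `/base-evil`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// maxRuleFileSize assumes "100 KB" means 100 * 1024 bytes.
const maxRuleFileSize = 100 * 1024

// allowedExt reports whether path ends in a whitelisted extension.
func allowedExt(path string) bool {
	ext := filepath.Ext(path)
	return ext == ".md" || ext == ".txt"
}

// withinBase reports whether target is base itself or lies below it.
// Both arguments should already be absolute and symlink-resolved.
func withinBase(base, target string) bool {
	rel, err := filepath.Rel(base, target)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// readRuleFile applies the guards in order and returns the file content.
func readRuleFile(baseDir, relPath string) (string, error) {
	if relPath == "" {
		return "", errors.New("empty rule path")
	}
	if !allowedExt(relPath) {
		return "", fmt.Errorf("unsupported extension %q", filepath.Ext(relPath))
	}
	absBase, err := filepath.Abs(baseDir)
	if err != nil {
		return "", err
	}
	realBase, err := filepath.EvalSymlinks(absBase)
	if err != nil {
		return "", fmt.Errorf("resolve base dir: %w", err)
	}
	realFile, err := filepath.EvalSymlinks(filepath.Join(realBase, relPath))
	if err != nil {
		return "", fmt.Errorf("resolve rule file: %w", err)
	}
	// Re-check the extension on the resolved target, not just the link name.
	if !withinBase(realBase, realFile) || !allowedExt(realFile) {
		return "", fmt.Errorf("%q resolves to a disallowed location or type", relPath)
	}
	info, err := os.Stat(realFile)
	if err != nil {
		return "", err
	}
	if !info.Mode().IsRegular() {
		return "", fmt.Errorf("%q is not a regular file", relPath)
	}
	if info.Size() > maxRuleFileSize {
		return "", fmt.Errorf("%q is %d bytes, over the %d byte cap", relPath, info.Size(), maxRuleFileSize)
	}
	data, err := os.ReadFile(realFile)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// resolveOrKeep returns the file content, or logs a warning and
// returns the original rule string when any guard fails.
func resolveOrKeep(baseDir, rule string) string {
	content, err := readRuleFile(baseDir, rule)
	if err != nil {
		fmt.Fprintf(os.Stderr, "[WARN] rule file %q not loaded: %v\n", rule, err)
		return rule
	}
	return content
}
```

Returning an error from the inner function and converting it to a warning in one wrapper keeps every failure path on the same "warn and keep the original string" behaviour.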

Configuration Example

{
  "use_file_path": true,
  "rules": [
    {
      "path": "**/*mapper*.xml",
      "rule": "docs/sql-rules.md"
    },
    {
      "path": "web/**/*.ts",
      "rule": "docs/frontend-rules.md"
    }
  ]
}

Paths are relative to the directory containing the rule.json file.
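As a small illustration of that resolution rule, the sketch below joins a rule value onto the directory of rule.json. The helper name `ruleFilePath` and the `.opencodereview/rule.json` location are assumptions made for the example.

```go
package main

import "path/filepath"

// ruleFilePath (hypothetical helper) resolves a rule value against the
// directory that contains rule.json. filepath.Join also cleans the
// result, so "../" segments are collapsed rather than rejected here.
func ruleFilePath(ruleJSONPath, rule string) string {
	return filepath.Join(filepath.Dir(ruleJSONPath), rule)
}
```

On a Unix-like system, `ruleFilePath(".opencodereview/rule.json", "docs/sql-rules.md")` yields `.opencodereview/docs/sql-rules.md`, while a value of `../outside.md` yields `outside.md`. The join alone therefore does not confine paths to the base directory, which is why a separate containment check is still needed.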

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Documentation update
  • CI / Build / Tooling

How Has This Been Tested?

Unit tests added in internal/config/rules/system_rules_test.go:

  • TestResolveRuleFiles_Basic — verifies that a valid .md file's content correctly replaces the rule field.

  • TestResolveRuleFiles_Security — confirms ../outside.md path traversal is blocked and the original rule string is preserved.

  • TestResolveRuleFiles_UnsupportedExtension — confirms .json extension is rejected.

  • TestResolveRuleFiles_TooLarge — confirms files exceeding 100 KB are rejected.

  • TestResolveRuleFiles_MissingFile — confirms missing files produce a warning without panicking.

  • make test passes locally

  • Manual testing (describe below)

    Tested with use_file_path: true pointing to .md files in .opencodereview/docs/, confirmed rules are loaded correctly. Tested with ../ traversal paths and verified warnings are emitted and rules are not loaded.

Checklist

  • My code follows the project's coding style (go fmt, go vet)
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly (if applicable)
  • I have signed the CLA

Related Issues

Closes #67

zephyrq-z added 2 commits June 8, 2026 15:45
- Added 'rule_file' field to ProjectRuleEntry in rule.json
- Implemented resolveRuleFiles to read, validate, and merge external rule files
- Security: prevents path traversal (../) outside the base directory
- Limits supported extensions to .md and .txt, max size 100KB
- Merges contents if both 'rule' and 'rule_file' are provided
- Added unit tests for security, missing files, large files, etc.
- Updated README.md and README.zh-CN.md

Resolves alibaba#67

CLAassistant commented Jun 9, 2026


CLA assistant check
All committers have signed the CLA.

Comment thread README.zh-CN.md Outdated
"path": "**/*mapper*.xml",
"rule": "检查 SQL 注入风险、参数错误和缺少闭合标签"
"rule": "docs/sql-rules.md",
"use_file_path": true

Collaborator


I think this should go at the top level; otherwise it has to be configured again for every entry.

@github-actions github-actions Bot left a comment


🔍 OpenCodeReview found 2 issue(s) in this PR.

  • ✅ 2 posted as inline comment(s)
  • 📝 0 posted as summary (missing line info)

Comment thread internal/config/rules/system_rules.go Outdated
Comment on lines +335 to +336
// Security check: prevent directory traversal
if !strings.HasPrefix(absFile+string(filepath.Separator), absBase) && absFile != strings.TrimSuffix(absBase, string(filepath.Separator)) {


Security: Symlink-based directory traversal bypass. filepath.Abs only performs lexical path cleaning — it does NOT resolve symlinks. An attacker (or a malicious repo) could place a symlink inside .opencodereview/ that points to an arbitrary file outside the base directory. The prefix check on the string path would pass because the symlink itself lives within the base dir, but os.ReadFile follows the symlink and reads the target file.

To fix this, resolve symlinks before performing the prefix check:

resolvedFile, err := filepath.EvalSymlinks(absFile)
if err != nil {
    fmt.Fprintf(os.Stderr, "[WARN] Failed to resolve symlinks for rule file %s: %v\n", filePath, err)
    continue
}

Then use resolvedFile for both the security check and subsequent reads. You should also resolve absBase with EvalSymlinks to handle the case where the base directory itself contains symlinks.
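The bypass described in this comment can be reproduced with a short, self-contained sketch. Everything here is illustrative: `symlinkEscapeDemo` and `inside` are hypothetical names, the directory layout is created in a temporary folder, and the containment test uses `filepath.Rel` rather than the PR's prefix comparison.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// inside reports whether target is base itself or lies below it.
func inside(base, target string) bool {
	rel, err := filepath.Rel(base, target)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// symlinkEscapeDemo builds base/rules.md as a symlink to outside/secret.md
// and returns the containment verdict for the lexical path (filepath.Abs)
// and for the symlink-resolved path (filepath.EvalSymlinks).
func symlinkEscapeDemo() (bool, bool, error) {
	tmp, err := os.MkdirTemp("", "rule-symlink-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(tmp)

	// Resolve the temp root first so both checks share the same real
	// prefix (on macOS, /var is itself a symlink to /private/var).
	root, err := filepath.EvalSymlinks(tmp)
	if err != nil {
		return false, false, err
	}
	base := filepath.Join(root, "base")
	outside := filepath.Join(root, "outside")
	for _, dir := range []string{base, outside} {
		if err := os.Mkdir(dir, 0o755); err != nil {
			return false, false, err
		}
	}
	secret := filepath.Join(outside, "secret.md")
	if err := os.WriteFile(secret, []byte("not a rule"), 0o600); err != nil {
		return false, false, err
	}
	link := filepath.Join(base, "rules.md")
	if err := os.Symlink(secret, link); err != nil {
		return false, false, err
	}

	absLink, err := filepath.Abs(link) // lexical only, the link is not followed
	if err != nil {
		return false, false, err
	}
	realLink, err := filepath.EvalSymlinks(absLink) // follows the link
	if err != nil {
		return false, false, err
	}
	return inside(base, absLink), inside(base, realLink), nil
}
```

The first result is `true` because the link itself sits inside the base directory, and the second is `false` because the resolved target does not. Running the check on the resolved path is what closes the gap.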

Comment thread internal/config/rules/system_rules.go Outdated
Comment on lines +324 to +326
if !entry.UseFilePath || entry.Rule == "" {
    continue
}


Silent skip without warning when UseFilePath is true but Rule is empty. A user who sets "use_file_path": true but forgets to specify the "rule" field will get no feedback at all, making configuration debugging difficult. Consider adding a warning here similar to the other error paths.

Suggestion:

Suggested change
- if !entry.UseFilePath || entry.Rule == "" {
-     continue
- }
+ if !entry.UseFilePath {
+     continue
+ }
+ if entry.Rule == "" {
+     fmt.Fprintf(os.Stderr, "[WARN] Rule entry has use_file_path=true but empty rule path, skipping\n")
+     continue
+ }

…erability

- Move use_file_path from per-entry to top-level ProjectRule field
  (simplifies config: no need to repeat flag for every rule entry)
- Add filepath.EvalSymlinks to prevent symlink-based directory traversal
  (resolves symlinks on both file path and base directory for consistent
  prefix matching across platforms like macOS /var -> /private/var)
- Add warning when use_file_path=true but rule path is empty
- Update tests to use new top-level UseFilePath structure
- Update README.md and README.zh-CN.md with new config format
Comment thread README.zh-CN.md Outdated

```json
{
"use_file_path": true,

Collaborator


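I think this description could mislead users. Having both a path and an inline string present should raise an error, shouldn't it? The use_file_path switch should make the user choose one or the other.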

Collaborator


Also, the READMEs in the other languages need to be updated as well.


3 participants