chore(skills): import DP-GEN simplify agent skill#1879
Conversation
📝 WalkthroughWalkthroughThis pull request introduces a new ChangesDP-GEN Simplify Skill Module
🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (4)
.github/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json (1)
2-8: ⚡ Quick winConsider documenting or eliminating the
type_mapduplication.The
type_maparray appears at both the root level (lines 2-8) and withindefault_training_param.model.type_map(lines 26-31). While the values currently match, maintaining identical arrays in two locations creates a maintenance hazard—if one is updated without updating the other, silent inconsistencies could occur that violate the "keeptype_mapconsistent" guidance from the field reference documentation.Consider one of these approaches:
- Add a comment warning that both arrays must be kept synchronized
- Investigate whether dpgen can accept a reference to avoid duplication
- Add validation in the workflow to ensure both arrays match
Also applies to: 26-31
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json around lines 2 - 8, The root-level "type_map" and "default_training_param.model.type_map" are duplicated; add a config validation step when loading/parsing this JSON that explicitly compares these two arrays (root "type_map" vs "default_training_param.model.type_map") and fails fast with a clear error/log message showing both values if they differ, or collapse the duplication by removing the root "type_map" and only keeping "default_training_param.model.type_map" (or vice versa) if dpgen supports a single location—implement the equality check in your config-loader/validation routine so mismatches cannot silently occur..github/skills/dpgen-simplify/SKILL.md (2)
341-341: ⚡ Quick winAdd a trailing newline at end of file.
The file should end with a newline character for POSIX compliance and to avoid warnings from some text processing tools.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/skills/dpgen-simplify/SKILL.md at line 341, Add a trailing newline to the end of SKILL.md so the file terminates with a newline character (POSIX-compliant); open the file (.github/skills/dpgen-simplify/SKILL.md), go to the end of the file (after the last line containing the simplify machine definitions link) and insert a single newline character so the file ends with a newline.
84-122: ⚡ Quick winClarify the section numbering hierarchy.
The current numbering structure is ambiguous:
- Section "4. Do not invent environment activation commands" (line 84) has standalone content
- Then subsections "4.1 Outer launcher policy" (line 100) and "4.2 Outer vs inner runtime boundaries" (line 110) appear
- It's unclear whether 4.1 and 4.2 are nested under section 4 or if they're related follow-ups
Consider either:
- Renumbering 4.1 and 4.2 as standalone sections (5 and 6), and shifting "Prefer reproducible output layout" to section 7
- Restructuring section 4 to clearly indicate it contains subsections, perhaps with a different heading like "### 4. Environment activation policies" that encompasses all three pieces
This will improve documentation clarity and help readers understand the logical grouping.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/skills/dpgen-simplify/SKILL.md around lines 84 - 122, The section numbering is inconsistent: "4. Do not invent environment activation commands" is presented as a main section while "4.1 Outer launcher policy" and "4.2 Outer vs inner runtime boundaries (critical)" appear to be ambiguous subsections; update the headings so the hierarchy is clear by either (A) renumbering "4.1" and "4.2" to standalone sections "5" and "6" and bumping "Prefer reproducible output layout" to "7", or (B) renaming the main heading to "4. Environment activation policies" (or similar) and ensure "4.1 Outer launcher policy" and "4.2 Outer vs inner runtime boundaries" are nested under it, keeping the exact text for the policies unchanged; adjust the heading markers for "4. Do not invent environment activation commands", "4.1 Outer launcher policy", "4.2 Outer vs inner runtime boundaries (critical)", and "Prefer reproducible output layout" accordingly so the document structure is unambiguous..github/skills/dpgen-simplify/references/machine-fields.md (1)
97-97: ⚡ Quick winAdd a trailing newline at end of file.
The file should end with a newline character for POSIX compliance and to avoid warnings from some text processing tools.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/skills/dpgen-simplify/references/machine-fields.md at line 97, Add a POSIX-compliant trailing newline to the end of the file by ensuring the last line ("if the user already has a working template, patch it instead of rewriting everything") is terminated with a newline character; update the file so its final byte is a newline to avoid tooling warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
@.github/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json:
- Around line 2-8: The root-level "type_map" and
"default_training_param.model.type_map" are duplicated; add a config validation
step when loading/parsing this JSON that explicitly compares these two arrays
(root "type_map" vs "default_training_param.model.type_map") and fails fast with
a clear error/log message showing both values if they differ, or collapse the
duplication by removing the root "type_map" and only keeping
"default_training_param.model.type_map" (or vice versa) if dpgen supports a
single location—implement the equality check in your config-loader/validation
routine so mismatches cannot silently occur.
In @.github/skills/dpgen-simplify/references/machine-fields.md:
- Line 97: Add a POSIX-compliant trailing newline to the end of the file by
ensuring the last line ("if the user already has a working template, patch it
instead of rewriting everything") is terminated with a newline character; update
the file so its final byte is a newline to avoid tooling warnings.
In @.github/skills/dpgen-simplify/SKILL.md:
- Line 341: Add a trailing newline to the end of SKILL.md so the file terminates
with a newline character (POSIX-compliant); open the file
(.github/skills/dpgen-simplify/SKILL.md), go to the end of the file (after the
last line containing the simplify machine definitions link) and insert a single
newline character so the file ends with a newline.
- Around line 84-122: The section numbering is inconsistent: "4. Do not invent
environment activation commands" is presented as a main section while "4.1 Outer
launcher policy" and "4.2 Outer vs inner runtime boundaries (critical)" appear
to be ambiguous subsections; update the headings so the hierarchy is clear by
either (A) renumbering "4.1" and "4.2" to standalone sections "5" and "6" and
bumping "Prefer reproducible output layout" to "7", or (B) renaming the main
heading to "4. Environment activation policies" (or similar) and ensure "4.1
Outer launcher policy" and "4.2 Outer vs inner runtime boundaries" are nested
under it, keeping the exact text for the policies unchanged; adjust the heading
markers for "4. Do not invent environment activation commands", "4.1 Outer
launcher policy", "4.2 Outer vs inner runtime boundaries (critical)", and
"Prefer reproducible output layout" accordingly so the document structure is
unambiguous.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 867352cf-1a0b-4a82-ba4a-7dae01aa2874
📒 Files selected for processing (10)
.github/skills/dpgen-simplify/SKILL.md.github/skills/dpgen-simplify/assets/machine.template.json.github/skills/dpgen-simplify/assets/machine.template.local-shell.json.github/skills/dpgen-simplify/assets/machine.template.server-local-slurm.json.github/skills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.json.github/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json.github/skills/dpgen-simplify/assets/param.template.json.github/skills/dpgen-simplify/references/machine-fields.md.github/skills/dpgen-simplify/references/param-fields.md.github/skills/dpgen-simplify/references/workflow-notes.md
4630b3c to
18b6360
Compare
Imported from jinzhezenggroup/computational-chemistry-agent-skills. Upstream-Commit: jinzhezenggroup/computational-chemistry-agent-skills@885861f Upstream-Paths: - simplify/dpgen-simplify - machine-learning-potentials/dpgen-simplify
…ng-potentials (deepmodeling#65) Imported from jinzhezenggroup/computational-chemistry-agent-skills. Upstream-Commit: jinzhezenggroup/computational-chemistry-agent-skills@a019524 Upstream-Paths: - simplify/dpgen-simplify - machine-learning-potentials/dpgen-simplify
…eepmodeling#67) Imported from jinzhezenggroup/computational-chemistry-agent-skills. Upstream-Commit: jinzhezenggroup/computational-chemistry-agent-skills@0475acb Upstream-Paths: - simplify/dpgen-simplify - machine-learning-potentials/dpgen-simplify
Apply the same JSON indentation as pre-commit.ci after rebuilding the history-preserving migration branch. Authored by OpenClaw (model: gpt-5.5)
18b6360 to
a8d60ba
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@skills/dpgen-simplify/assets/machine.template.json`:
- Around line 12-17: The stage templates' "resources" objects (e.g., the block
showing "number_node", "cpu_per_node", "gpu_per_node", "group_size") are missing
the required "source_list" field; update every stage template resources object
in machine.template.json to include "source_list" (an array or null as used
convention in this file) so generated scheduler configs include inner-stage
environment activation per SKILL.md—ensure you add "source_list" alongside the
existing keys in every resources block referenced (also at the other occurrences
around lines 27-32 and 42-47).
In `@skills/dpgen-simplify/references/param-fields.md`:
- Line 31: The note for init_data_sys is a sentence fragment ("Can be empty when
starting fully from `pick_data`"); rewrite it as a complete sentence mentioning
the subject and context — e.g. "The init_data_sys field can be empty when
starting fully from `pick_data`." — by editing the line in
skills/dpgen-simplify/references/param-fields.md where `init_data_sys` is
described so the note reads as a full sentence and remains clear and concise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f196057c-5ad8-4011-906b-59a1f7dd9bc7
📒 Files selected for processing (10)
skills/dpgen-simplify/SKILL.mdskills/dpgen-simplify/assets/machine.template.jsonskills/dpgen-simplify/assets/machine.template.local-shell.jsonskills/dpgen-simplify/assets/machine.template.server-local-slurm.jsonskills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.jsonskills/dpgen-simplify/assets/param.example.qm7.from-official-docs.jsonskills/dpgen-simplify/assets/param.template.jsonskills/dpgen-simplify/references/machine-fields.mdskills/dpgen-simplify/references/param-fields.mdskills/dpgen-simplify/references/workflow-notes.md
✅ Files skipped from review due to trivial changes (7)
- skills/dpgen-simplify/assets/param.template.json
- skills/dpgen-simplify/assets/machine.template.local-shell.json
- skills/dpgen-simplify/assets/machine.template.server-local-slurm.json
- skills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.json
- skills/dpgen-simplify/references/workflow-notes.md
- skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json
- skills/dpgen-simplify/references/machine-fields.md
| "resources": { | ||
| "number_node": null, | ||
| "cpu_per_node": null, | ||
| "gpu_per_node": null, | ||
| "group_size": null | ||
| } |
There was a problem hiding this comment.
Add resources.source_list to all stage templates for scheduler compatibility.
SKILL.md requires explicit inner-stage environment activation via resources.source_list, but this base template omits the field in every stage. That mismatch can lead to invalid scheduler configs being generated from this template.
💡 Suggested patch
"train": {
"command": "dp",
"machine": {
"batch_type": null,
"context_type": null,
"local_root": "./",
"remote_root": null
},
"resources": {
"number_node": null,
"cpu_per_node": null,
"gpu_per_node": null,
- "group_size": null
+ "group_size": null,
+ "source_list": null
}
},
@@
"resources": {
"number_node": null,
"cpu_per_node": null,
"gpu_per_node": null,
- "group_size": null
+ "group_size": null,
+ "source_list": null
}
},
@@
"resources": {
"number_node": null,
"cpu_per_node": null,
"gpu_per_node": null,
- "group_size": null
+ "group_size": null,
+ "source_list": null
}
}Also applies to: 27-32, 42-47
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@skills/dpgen-simplify/assets/machine.template.json` around lines 12 - 17, The
stage templates' "resources" objects (e.g., the block showing "number_node",
"cpu_per_node", "gpu_per_node", "group_size") are missing the required
"source_list" field; update every stage template resources object in
machine.template.json to include "source_list" (an array or null as used
convention in this file) so generated scheduler configs include inner-stage
environment activation per SKILL.md—ensure you add "source_list" alongside the
existing keys in every resources block referenced (also at the other occurrences
around lines 27-32 and 42-47).
|
|
||
| List of initial system indices for training. | ||
|
|
||
| Can be empty when starting fully from `pick_data`. |
There was a problem hiding this comment.
Fix sentence fragment for init_data_sys note.
This line is missing a subject and reads as a fragment. Make it a complete sentence for clarity.
✏️ Suggested patch
-Can be empty when starting fully from `pick_data`.
+It can be empty when starting fully from `pick_data`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Can be empty when starting fully from `pick_data`. | |
| It can be empty when starting fully from `pick_data`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@skills/dpgen-simplify/references/param-fields.md` at line 31, The note for
init_data_sys is a sentence fragment ("Can be empty when starting fully from
`pick_data`"); rewrite it as a complete sentence mentioning the subject and
context — e.g. "The init_data_sys field can be empty when starting fully from
`pick_data`." — by editing the line in
skills/dpgen-simplify/references/param-fields.md where `init_data_sys` is
described so the note reads as a full sentence and remains clear and concise.
Problem
Change
dpgen-simplifyskill intoskills/.simplify/dpgen-simplifypath or the renamedmachine-learning-potentials/dpgen-simplifypath as separate commits.Notes
simplify/dpgen-simplify,machine-learning-potentials/dpgen-simplify.skills/dpgen-simplify.git diff --check origin/master...HEAD.Authored by OpenClaw (model: gpt-5.5)
Summary by CodeRabbit
Documentation
New Features