Skip to content

Add validate command for custom training#7407

Open
saanikaguptamicrosoft wants to merge 11 commits intoAzure:foundry-training-devfrom
saanikaguptamicrosoft:saanika/validate
Open

Add validate command for custom training#7407
saanikaguptamicrosoft wants to merge 11 commits intoAzure:foundry-training-devfrom
saanikaguptamicrosoft:saanika/validate

Conversation

@saanikaguptamicrosoft
Copy link
Copy Markdown
Collaborator

@saanikaguptamicrosoft saanikaguptamicrosoft commented Mar 31, 2026

Notes

Added validation for-

  1. command field is required
  2. environment field is required
  3. compute field is required
  4. code if present, it should not start with git:// or git+
  5. The local path mentioned in code and inputs.<key>.path are existent
  6. Placeholder mapping is correct. Different cases:
    • The command field contains placeholder in the form of ${{inputs.xxx}}, the xxx should be present as key in inputs.
      • For inputs -> if key is present but its value is {} or None, throw error.
    • The command field contains placeholder in the form ${{outputs.xxx}}, the xxx may or may not be present as key in inputs. If it's not present, means user wants to go with default output values. If it's present means user wants to override output configs.
      • For outputs -> if key is present but its value is {} or None, show warning. Because user might have forgotten to override the value.
    • If ${{inputs.xxx}} is wrapped inside [ ], it's optional, so we don't validate its presence in inputs key.
    • Single bracket i.e. {inputs.xxx} or {outputs.x} should throw error

Added unit tests.

Testing

  • Build ~ Successful
azd x build
  • UTs ~ Successful
go test ./internal/utils/ -v -count=1

@saanikaguptamicrosoft saanikaguptamicrosoft changed the title Saanika/validate Add validate command for custom training Mar 31, 2026
Copy link
Copy Markdown
Member

@jongio jongio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review - #7407

Add validate command for custom training by @saanikaguptamicrosoft

Summary

What: Adds an offline job validate command that checks a YAML job definition for required fields, invalid git paths, local path existence, placeholder correctness, and empty input/output definitions - collecting all findings before reporting instead of failing on the first error.

Why: Lets users catch YAML mistakes locally before submitting to the backend, saving a round-trip.

Assessment: The validation logic is solid and well-structured, but the command can't actually run offline due to the parent command's PersistentPreRunE hook. This should be addressed before merge. A few edge cases in the validator produce confusing output (duplicate errors on git paths, silent pass on ${inputs.xxx} typos).

Findings

Category High Medium Low
Design 1 0 0
Logic 0 2 0
Error Handling 0 1 0
Consistency 0 1 1
Tests 0 0 1
Total 1 4 2

Key Findings

  1. [HIGH] Design: Parent job command's PersistentPreRunE requires Azure env setup, preventing offline use
  2. [MEDIUM] Logic: git:// code paths produce duplicate errors (git-not-supported + local-path-not-found)
  3. [MEDIUM] Logic: ${inputs.xxx} typos silently bypass single-brace detection
  4. [MEDIUM] Error Handling: os.Stat permission/IO errors silently ignored in path checks
  5. [MEDIUM] Consistency: Missing -f shorthand for --file flag (inconsistent with job submit)
  6. [LOW] Consistency: Missing Long description on cobra command (all sibling commands have one)
  7. [LOW] Tests: No test for the git:// double-error scenario (assert only 1 finding, not 2)

Test Coverage Estimate

  • Well covered: required fields, git path detection, local path checks, placeholder mapping, single-brace detection, empty definitions, multiline commands
  • Missing: command-layer test (cobra handler), ${inputs.xxx} edge case, git:// double-error assertion, os.Stat permission errors, empty/zero-byte YAML input

What's Done Well

  • Collects all findings instead of failing on the first error - much better UX than the existing ValidateJobDefinition
  • Good separation between the command handler and validation logic
  • Optional [...] block handling for input placeholders is a nice touch
  • Test file has clear YAML comments showing what each test case maps to

5 inline comments below.

cmd.AddCommand(newJobShowCommand())
cmd.AddCommand(newJobDeleteCommand())
cmd.AddCommand(newJobCancelCommand())
cmd.AddCommand(newJobValidateCommand())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Design: validate isn't actually offline - the parent job command has PersistentPreRunE calling validateOrInitEnvironment, which requires an azd environment or --subscription/--project-endpoint flags. Running azd ai training job validate --file job.yaml without Azure setup will error before reaching validation logic.

Fix: in job_validate.go, add PersistentPreRunE: func(_ *cobra.Command, _ []string) error { return nil } to the cobra.Command definition. This overrides the parent's pre-run for the validate subcommand only.

}

// 5. Local path existence checks
validateLocalPath(result, "code", job.Code, yamlDir)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Logic: git:// code paths produce duplicate errors. Step 4 (line 102) correctly flags the git path, but step 5 here still calls validateLocalPath. Since IsRemoteURI doesn't recognize git:// or git+ schemes, os.Stat("git://...") fails and adds a second, misleading "local path does not exist" error.

Fix: either extend IsRemoteURI in yaml_parser.go to cover git:// and git+ prefixes, or guard this call to skip when a git-path error was already recorded.

start := match[0]
// Skip if this is already part of a ${{...}} (preceded by "${")
if start >= 2 && command[start-2:start] == "${" {
continue
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Logic: The bare-$ guard here creates a false negative. For ${inputs.model} (wrong syntax - should be ${{inputs.model}}), the regex matches {inputs.model} at position N, then command[N-1:N] == "$" causes the skip. But ${inputs.model} isn't valid either.

The first guard (line 221, checking for "${" two chars back) already handles valid ${{...}} patterns correctly. This second single-char $ guard is redundant and causes ${inputs.xxx} typos to go undetected.

Fix: remove lines 224-226 entirely.

resolved = filepath.Join(yamlDir, path)
}

if _, err := os.Stat(resolved); os.IsNotExist(err) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Error Handling: Only os.IsNotExist is checked. If os.Stat fails with a permission error or other I/O error, the function silently returns success - the user won't know the path couldn't be verified.

Fix: add an else if err != nil branch to report a warning like "could not verify path exists: ".


cmd.Flags().StringVar(&filePath, "file", "", "Path to YAML job definition file (required)")

return cmd
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Consistency: The sibling job submit command uses StringVarP(&filePath, "file", "f", ...) with a -f shorthand. This command uses StringVar without it. Users who learn -f from submit will expect it here too.

Fix: change to cmd.Flags().StringVarP(&filePath, "file", "f", "", "Path to YAML job definition file (required)").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants