19 changes: 19 additions & 0 deletions cmd/ai/cmd.go
@@ -0,0 +1,19 @@
package ai

import (
	sreagent "github.com/openshift/osdctl/cmd/ai/sre_agent"
	"github.com/spf13/cobra"
)

// NewCmdAI implements the base AI command
func NewCmdAI() *cobra.Command {
	aiCmd := &cobra.Command{
		Use:   "ai",
		Short: "AI-powered tools for SRE automation",
		Args:  cobra.NoArgs,
	}

	aiCmd.AddCommand(sreagent.NewCmdSreAgent())

	return aiCmd
}
33 changes: 33 additions & 0 deletions cmd/ai/sre_agent/helper.go
@@ -0,0 +1,33 @@
package sreagent

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// copyRepository copies a directory recursively
func copyRepository(sourcePath, destPath string) error {
	fmt.Fprintf(os.Stderr, "Copying repository to %s...\n", destPath)
	cmd := exec.Command("cp", "-r", sourcePath, destPath)
	cmd.Stdout = os.Stderr
	cmd.Stderr = os.Stderr

	if err := cmd.Run(); err != nil {
		return err
	}

	return nil
}

// promptUserInput reads a line of user input from stdin
func promptUserInput() (string, error) {
	reader := bufio.NewReader(os.Stdin)
	input, err := reader.ReadString('\n')
	if err != nil {
		return "", fmt.Errorf("failed to read input: %w", err)
	}
	return strings.ToLower(strings.TrimSpace(input)), nil
}
121 changes: 121 additions & 0 deletions cmd/ai/sre_agent/sre_agent.go
@@ -0,0 +1,121 @@
package sreagent

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"

	"github.com/adrg/xdg"
	"github.com/spf13/cobra"
	cmdutil "k8s.io/kubectl/pkg/cmd/util"
)

var (
	pdURL       string
	autoExecute bool
	outputDir   string
)

const (
	sreAgentDescription = `
SRE Agent is an AI-powered tool that helps SREs triage alerts and diagnose issues.
It automatically fetches incident details from PagerDuty, finds relevant SOPs,
and executes diagnostic commands on clusters.
`

	sreAgentExample = `
# Interactive mode (asks for confirmation at each step)
osdctl ai sre-agent --pd-url "${PD_URL}"

# Fully automated mode (no confirmations)
osdctl ai sre-agent --pd-url "${PD_URL}" --auto-execute

# Specify output directory for sre-agent files
osdctl ai sre-agent --pd-url "${PD_URL}" --output /tmp/sre-agent-output
`
)

func NewCmdSreAgent() *cobra.Command {
	sreAgentCmd := &cobra.Command{
		Use:           "sre-agent",
		Short:         "Run SRE Agent for automated incident investigation",
		Long:          sreAgentDescription,
		Example:       sreAgentExample,
		Args:          cobra.ArbitraryArgs,
		SilenceUsage:  true,
		SilenceErrors: true,
		Run: func(cmd *cobra.Command, args []string) {
			// Step 1: Validate sre-agent installation
			if !validateSreAgent() {
				return
			}

			// Step 2: Check/Setup config (includes ops-sop setup)
			if !checkSreAgentConfig() {
				return
			}
Comment on lines +48 to +57
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

cat -n cmd/ai/sre_agent/sre_agent.go | head -100

Repository: openshift/osdctl

Length of output: 3409


🏁 Script executed:

grep -n "func validateSreAgent\|func checkSreAgentConfig" cmd/ai/sre_agent/sre_agent.go

Repository: openshift/osdctl

Length of output: 42


🏁 Script executed:

find cmd/ai/sre_agent -type f -name "*.go" | head -20

Repository: openshift/osdctl

Length of output: 202


🏁 Script executed:

grep -n "func validateSreAgent\|func checkSreAgentConfig" cmd/ai/sre_agent/*.go

Repository: openshift/osdctl

Length of output: 216


🏁 Script executed:

cat -n cmd/ai/sre_agent/validate_sre_agent.go

Repository: openshift/osdctl

Length of output: 1896


🏁 Script executed:

cat -n cmd/ai/sre_agent/validate_sre_agent_config.go

Repository: openshift/osdctl

Length of output: 3601


Use RunE to propagate validation failures as command errors.

When validateSreAgent() or checkSreAgentConfig() return false, the Run function returns early without signaling an error. This causes the command to exit with success (code 0) despite validation failures, breaking automation and scripting workflows. Migrate to RunE and return explicit errors.

Proposed fix
-		Run: func(cmd *cobra.Command, args []string) {
+		RunE: func(cmd *cobra.Command, args []string) error {
 			// Step 1: Validate sre-agent installation
 			if !validateSreAgent() {
-				return
+				return fmt.Errorf("sre-agent installation validation failed")
 			}
 
 			// Step 2: Check/Setup config (includes ops-sop setup)
 			if !checkSreAgentConfig() {
-				return
+				return fmt.Errorf("sre-agent configuration validation failed")
 			}
 
 			// Step 3: Execute sre-agent
 			sreAgentPath := filepath.Join(xdg.DataHome, "sre-agent/venv/bin/sre-agent")
 			sreAgentArgs := buildSreAgentArgs(args)
 
-			err := executeSreAgent(sreAgentPath, sreAgentArgs, outputDir)
-			if err != nil {
-				cmdutil.CheckErr(err)
-			}
+			return executeSreAgent(sreAgentPath, sreAgentArgs, outputDir)
 		},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/sre_agent.go` around lines 48 - 57, Replace the Cobra
command's Run handler with RunE so validation failures become command errors:
change the anonymous func assigned to Run into RunE(func(cmd *cobra.Command,
args []string) error { ... }), call validateSreAgent() and
checkSreAgentConfig(), and if either returns false return a descriptive error
(e.g., errors.New("sre-agent validation failed") or include context like
"sre-agent config check failed") instead of returning early; ensure the function
ultimately returns nil on success so Cobra propagates failures correctly.
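
The effect of the proposed RunE change can be sketched without Cobra. This is a simplified illustration, not the PR's code: `runChecks` and `exitCode` are hypothetical names, and the two function parameters stand in for validateSreAgent and checkSreAgentConfig. It shows how a failed check turns into an error, and therefore a non-zero exit status, instead of a silent early return.

```go
package main

import (
	"errors"
	"fmt"
)

// runChecks mirrors the control flow of the proposed RunE body: each failed
// check becomes an error instead of a silent early return.
// validate and checkConfig are stand-ins for validateSreAgent and
// checkSreAgentConfig.
func runChecks(validate, checkConfig func() bool) error {
	if !validate() {
		return errors.New("sre-agent installation validation failed")
	}
	if !checkConfig() {
		return errors.New("sre-agent configuration validation failed")
	}
	return nil
}

// exitCode maps the result to a process exit status, the way a typical main
// function would after the command returns an error.
func exitCode(err error) int {
	if err != nil {
		return 1
	}
	return 0
}

func main() {
	ok := func() bool { return true }
	fail := func() bool { return false }

	fmt.Println(exitCode(runChecks(ok, ok)))   // prints 0
	fmt.Println(exitCode(runChecks(ok, fail))) // prints 1
}
```

Since the command sets `SilenceErrors: true`, Cobra itself will not print the returned error, so whatever calls `Execute` would presumably need to print it before exiting.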


			// Step 3: Execute sre-agent
			sreAgentPath := filepath.Join(xdg.DataHome, "sre-agent/venv/bin/sre-agent")
			sreAgentArgs := buildSreAgentArgs(args)

			err := executeSreAgent(sreAgentPath, sreAgentArgs, outputDir)
			if err != nil {
				cmdutil.CheckErr(err)
			}
		},
	}

	sreAgentCmd.Flags().StringVar(&pdURL, "pd-url", "", "PagerDuty incident URL (required)")
	sreAgentCmd.Flags().BoolVar(&autoExecute, "auto-execute", false, "Fully automated mode without confirmations")
	sreAgentCmd.Flags().StringVar(&outputDir, "output", "", "Output directory for sre-agent files (default: current directory)")

	// Mark pd-url as required
	if err := sreAgentCmd.MarkFlagRequired("pd-url"); err != nil {
		fmt.Fprintf(os.Stderr, "Failed to mark pd-url as required: %v\n", err)
	}

	return sreAgentCmd
}

// buildSreAgentArgs constructs the argument list for sre-agent command
func buildSreAgentArgs(additionalArgs []string) []string {
	args := []string{}

	if pdURL != "" {
		args = append(args, "--pd-url", pdURL)
	}

	if autoExecute {
		args = append(args, "--auto-execute")
	}

	// Add any additional arguments passed
	args = append(args, additionalArgs...)

	return args
}

// executeSreAgent runs the sre-agent command with provided arguments
func executeSreAgent(sreAgentPath string, args []string, outputDir string) error {
	cmd := exec.Command(sreAgentPath, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	cmd.Stdin = os.Stdin

	// Set working directory if output directory is specified
	if outputDir != "" {
		// Create directory if it doesn't exist
		if err := os.MkdirAll(outputDir, 0755); err != nil {
			return fmt.Errorf("failed to create output directory: %w", err)
		}
		cmd.Dir = outputDir
	}

	if err := cmd.Run(); err != nil {
		return fmt.Errorf("sre-agent execution failed: %w", err)
	}

	return nil
}
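
The argument construction in buildSreAgentArgs can be illustrated with a small standalone sketch. It assumes a hypothetical `agentOptions` struct in place of the package-level `pdURL` and `autoExecute` variables, and the incident URL and the extra `--verbose` argument are made up for the example. Note that `--output` is not forwarded to sre-agent; in the PR it only sets the child process's working directory.

```go
package main

import "fmt"

// agentOptions is a hypothetical struct standing in for the package-level
// pdURL and autoExecute variables used in the PR.
type agentOptions struct {
	PDURL       string
	AutoExecute bool
}

// buildArgs follows the same ordering as buildSreAgentArgs: known flags
// first, then any extra arguments passed through unchanged.
func buildArgs(opts agentOptions, extra []string) []string {
	args := []string{}
	if opts.PDURL != "" {
		args = append(args, "--pd-url", opts.PDURL)
	}
	if opts.AutoExecute {
		args = append(args, "--auto-execute")
	}
	return append(args, extra...)
}

func main() {
	opts := agentOptions{
		PDURL:       "https://example.pagerduty.com/incidents/ABC123", // illustrative URL
		AutoExecute: true,
	}
	// "--verbose" is a placeholder for any pass-through argument.
	fmt.Println(buildArgs(opts, []string{"--verbose"}))
}
```

Passing the options explicitly, rather than reading globals, would make the function straightforward to unit test.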
54 changes: 54 additions & 0 deletions cmd/ai/sre_agent/validate_sre_agent.go
@@ -0,0 +1,54 @@
package sreagent

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/adrg/xdg"
	"github.com/openshift/osdctl/internal/utils"
)

// validateSreAgent checks if sre-agent is installed
func validateSreAgent() bool {
	baseDir := filepath.Join(xdg.DataHome, "sre-agent")
	venvBinary := filepath.Join(baseDir, "venv/bin/sre-agent")

	// Check if sre-agent binary exists
	if utils.FileExists(venvBinary) {
		return true // Already installed
	}

	fmt.Fprintf(os.Stderr, "sre-agent is not found in %s\n\n", venvBinary)

	// Ask for path to sre-agent venv
	fmt.Fprint(os.Stderr, "Enter the absolute path to sre-agent venv directory: ")
	userVenvPath, err := promptUserInput()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
		return false
	}

	// Validate venv binary exists in provided path
	userVenvBinary := filepath.Join(userVenvPath, "bin/sre-agent")
	if !utils.FileExists(userVenvBinary) {
		fmt.Fprintln(os.Stderr, "\nsre-agent isn't installed")
		return false
	}

	// Create base directory
	if err := os.MkdirAll(baseDir, 0755); err != nil {
		fmt.Fprintf(os.Stderr, "Failed to create base directory: %v\n", err)
		return false
	}

	// Copy venv to XDG data directory
	venvPath := filepath.Join(baseDir, "venv")
	if err := copyRepository(userVenvPath, venvPath); err != nil {
		fmt.Fprintf(os.Stderr, "\nCopy failed: %v\n", err)
		return false
	}

	fmt.Fprintln(os.Stderr, "\n✓ sre-agent venv copied successfully")
	return true
}
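
The installation check above rests on whether `bin/sre-agent` exists inside a venv directory. The implementation of `utils.FileExists` is not shown in this diff, so the sketch below uses a hypothetical stdlib stand-in, `fileExists`, and a throwaway directory to show the layout being tested. The helper names and the `demo` function are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileExists is a hypothetical stand-in for utils.FileExists: it reports
// whether path exists and is not a directory.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false
	}
	return !info.IsDir()
}

// agentInstalled reports whether <venvDir>/bin/sre-agent is present,
// the same layout validateSreAgent expects.
func agentInstalled(venvDir string) bool {
	return fileExists(filepath.Join(venvDir, "bin", "sre-agent"))
}

// demo builds a temporary venv layout and checks it before and after the
// binary is created.
func demo() (before, after bool, err error) {
	venv, err := os.MkdirTemp("", "venv-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(venv)

	before = agentInstalled(venv)
	if err = os.MkdirAll(filepath.Join(venv, "bin"), 0o755); err != nil {
		return before, false, err
	}
	script := []byte("#!/bin/sh\n")
	if err = os.WriteFile(filepath.Join(venv, "bin", "sre-agent"), script, 0o755); err != nil {
		return before, false, err
	}
	return before, agentInstalled(venv), nil
}

func main() {
	before, after, err := demo()
	fmt.Println(before, after, err)
}
```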
103 changes: 103 additions & 0 deletions cmd/ai/sre_agent/validate_sre_agent_config.go
@@ -0,0 +1,103 @@
package sreagent

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/adrg/xdg"
	"github.com/openshift/osdctl/internal/utils"
	"gopkg.in/yaml.v3"
)

// checkSreAgentConfig validates config.yaml and updates ops-sop path if needed
func checkSreAgentConfig() bool {
	baseDir := filepath.Join(xdg.DataHome, "sre-agent")
	configPath := filepath.Join(xdg.ConfigHome, "sre-agent/config.yaml")

	// Check if config exists
	if !utils.FileExists(configPath) {
		fmt.Fprintln(os.Stderr, "\nsre-agent not configured")
		fmt.Fprintln(os.Stderr, "Config file not found at:", configPath)
		return false
	}

	// Read existing config
	data, err := os.ReadFile(configPath)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to read config: %v\n", err)
		return false
	}

	// Parse YAML
	var config map[string]interface{}
	if err := yaml.Unmarshal(data, &config); err != nil {
		fmt.Fprintf(os.Stderr, "Failed to parse config: %v\n", err)
		return false
	}

	// Get current sop directory from config
	sop, ok := config["sop"].(map[string]interface{})
	if !ok {
		fmt.Fprintln(os.Stderr, "Invalid config: sop section not found")
		return false
	}

	currentSopDir, ok := sop["directory"].(string)
	if !ok {
		fmt.Fprintln(os.Stderr, "Invalid config: sop directory is not a string")
		return false
	}

	// Ask user for ops-sop repository path
	fmt.Fprintln(os.Stderr, "\nChecking ops-sop repository...")
	fmt.Fprint(os.Stderr, "Enter the absolute path to ops-sop repository: ")
	userOpsSopPath, err := promptUserInput()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
		return false
	}

	// Validate path exists
	if !utils.FolderExists(userOpsSopPath) {
		fmt.Fprintln(os.Stderr, "\nThe provided ops-sop path does not exist.")
		return false
	}

	opsSopPath := filepath.Join(baseDir, "ops-sop")

	// Copy ops-sop if not present
	if !utils.FolderExists(opsSopPath) {
		if err := copyRepository(userOpsSopPath, opsSopPath); err != nil {
			fmt.Fprintf(os.Stderr, "\nCopy failed: %v\n", err)
			return false
		}
		fmt.Fprintln(os.Stderr, "✓ ops-sop copied successfully")
	} else {
		fmt.Fprintln(os.Stderr, "✓ ops-sop repository found")
	}
Comment on lines +52 to +78
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid unconditional prompting; only ask for source path when copy is actually needed.

The command currently prompts every run, even when ops-sop is already present under XDG data. That makes routine execution unnecessarily interactive.

💡 Proposed fix
-	// Ask user for ops-sop repository path
-	fmt.Fprintln(os.Stderr, "\nChecking ops-sop repository...")
-	fmt.Fprint(os.Stderr, "Enter the absolute path to ops-sop repository: ")
-	userOpsSopPath, err := promptUserInput()
-	if err != nil {
-		fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
-		return false
-	}
-
-	// Validate path exists
-	if !utils.FolderExists(userOpsSopPath) {
-		fmt.Fprintln(os.Stderr, "\nThe provided ops-sop path does not exist.")
-		return false
-	}
-
 	opsSopPath := filepath.Join(baseDir, "ops-sop")
 
 	// Copy ops-sop if not present
 	if !utils.FolderExists(opsSopPath) {
+		fmt.Fprintln(os.Stderr, "\nChecking ops-sop repository...")
+		fmt.Fprint(os.Stderr, "Enter the absolute path to ops-sop repository: ")
+		userOpsSopPath, err := promptUserInput()
+		if err != nil {
+			fmt.Fprintf(os.Stderr, "Failed to read input: %v\n", err)
+			return false
+		}
+		if !utils.FolderExists(userOpsSopPath) {
+			fmt.Fprintln(os.Stderr, "\nThe provided ops-sop path does not exist.")
+			return false
+		}
 		if err := copyRepository(userOpsSopPath, opsSopPath); err != nil {
 			fmt.Fprintf(os.Stderr, "\nCopy failed: %v\n", err)
 			return false
 		}
 		fmt.Fprintln(os.Stderr, "✓ ops-sop copied successfully")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ai/sre_agent/validate_sre_agent_config.go` around lines 52 - 78,
Currently the code always calls promptUserInput() to ask for userOpsSopPath
before checking if ops-sop exists; move the prompt and subsequent validation
(promptUserInput(), utils.FolderExists(userOpsSopPath) checks and the
copyRepository call) inside the branch that runs only when opsSopPath :=
filepath.Join(baseDir, "ops-sop") does not exist (i.e., inside the if
!utils.FolderExists(opsSopPath) block). Keep the existing error handling
(fmt.Fprintf on read/copy failures and the success/found messages) but remove
the unconditional prompt so the code only asks for the source path when a copy
is actually needed; reference promptUserInput, utils.FolderExists and
copyRepository to locate the changes.
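
The reordering suggested above can be sketched in isolation. This is an illustration under stated assumptions rather than the PR's code: `ensureCopied` is a hypothetical helper, and `exists`, `prompt`, and `copyDir` are injected stand-ins for utils.FolderExists, promptUserInput, and copyRepository so the flow can run without a terminal or a filesystem. The paths are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// ensureCopied sketches the reordered flow: the source path is requested only
// when the destination is missing. exists, prompt, and copyDir are injected
// stand-ins for utils.FolderExists, promptUserInput, and copyRepository.
func ensureCopied(
	dest string,
	exists func(string) bool,
	prompt func() (string, error),
	copyDir func(src, dst string) error,
) error {
	if exists(dest) {
		return nil // already present: no prompt, no copy
	}
	src, err := prompt()
	if err != nil {
		return fmt.Errorf("failed to read input: %w", err)
	}
	if !exists(src) {
		return errors.New("the provided ops-sop path does not exist")
	}
	return copyDir(src, dest)
}

func main() {
	// An in-memory set of directories replaces the real filesystem.
	present := map[string]bool{"/data/sre-agent/ops-sop": true}
	exists := func(p string) bool { return present[p] }

	prompted := false
	prompt := func() (string, error) { prompted = true; return "/src/ops-sop", nil }
	copyDir := func(src, dst string) error { present[dst] = true; return nil }

	err := ensureCopied("/data/sre-agent/ops-sop", exists, prompt, copyDir)
	fmt.Println(err, prompted) // destination exists, so prompt is never called
}
```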


	// Check if sop directory in config is different from expected
	if currentSopDir != opsSopPath {
		// Update config with new path
		sop["directory"] = opsSopPath

		// Write updated config
		updatedData, err := yaml.Marshal(config)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Failed to marshal config: %v\n", err)
			return false
		}

		if err := os.WriteFile(configPath, updatedData, 0600); err != nil {
			fmt.Fprintf(os.Stderr, "Failed to write config: %v\n", err)
			return false
		}

		fmt.Fprintf(os.Stderr, "✓ ops-sop path updated in config: %s\n\n", opsSopPath)
	} else {
		fmt.Fprintf(os.Stderr, "✓ ops-sop path is correct: %s\n\n", opsSopPath)
	}

	return true
}
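
The config update step works on a generic map, and the type assertions are the part most likely to fail on an unexpected file. A minimal sketch, assuming a hypothetical `setSopDirectory` helper that operates on an already-parsed map (YAML parsing is left out so the example needs only the standard library), shows the same checks and the in-place update. The paths are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// setSopDirectory applies the same checks as checkSreAgentConfig to an
// already-parsed config map and reports whether the value changed.
func setSopDirectory(config map[string]interface{}, want string) (bool, error) {
	sop, ok := config["sop"].(map[string]interface{})
	if !ok {
		return false, errors.New("invalid config: sop section not found")
	}
	current, ok := sop["directory"].(string)
	if !ok {
		return false, errors.New("invalid config: sop directory is not a string")
	}
	if current == want {
		return false, nil
	}
	// sop refers to the nested map itself, so config sees the change.
	sop["directory"] = want
	return true, nil
}

func main() {
	config := map[string]interface{}{
		"sop": map[string]interface{}{"directory": "/old/ops-sop"},
	}
	changed, err := setSopDirectory(config, "/data/sre-agent/ops-sop")
	fmt.Println(changed, err)
}
```

One design consequence worth keeping in mind: marshalling a generic map back to YAML does not preserve comments or the original key order, so a user's hand-edited config.yaml may look different after the rewrite.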
2 changes: 2 additions & 0 deletions cmd/cmd.go
@@ -19,6 +19,7 @@ import (

	"github.com/openshift/osdctl/cmd/aao"
	"github.com/openshift/osdctl/cmd/account"
	"github.com/openshift/osdctl/cmd/ai"
	"github.com/openshift/osdctl/cmd/alerts"
	"github.com/openshift/osdctl/cmd/cloudtrail"
	"github.com/openshift/osdctl/cmd/cluster"
@@ -99,6 +100,7 @@ func NewCmdRoot(streams genericclioptions.IOStreams) *cobra.Command {
	// add sub commands
	addToRootCmdWithOtherGlobalOpts(aao.NewCmdAao(kubeClient))
	addToRootCmdWithOtherGlobalOpts(account.NewCmdAccount(streams, kubeClient, globalOpts))
	rootCmd.AddCommand(ai.NewCmdAI())
	addToRootCmdWithOtherGlobalOpts(alerts.NewCmdAlerts())
	addToRootCmdWithOtherGlobalOpts(cloudtrail.NewCloudtrailCmd())
	addToRootCmdWithOtherGlobalOpts(cluster.NewCmdCluster(streams, kubeClient, globalOpts))