
feat: generate RAG service YAML config resource #313

Open
tsivaprasad wants to merge 2 commits into PLAT-490-rag-service-api-key-file-management from PLAT-491-rag-service-yaml-config-generation-swarm-config

Conversation

Contributor

@tsivaprasad tsivaprasad commented Mar 24, 2026

Summary

This PR adds RAGConfigResource, which generates pgedge-rag-server.yaml from pipeline configuration and writes it to the host data directory, completing the file-based config layer for the RAG service.

Changes

  • rag_config.go — YAML struct definitions mirroring the RAG server's
    Config struct and GenerateRAGConfig() generator; api_keys paths
    reference bind-mounted key files at /app/keys/{pipeline}_{embedding|rag}.key
  • rag_config_resource.go — RAGConfigResource lifecycle (Create/Update/
    Refresh/Delete)
  • orchestrator.go — adds DirResource (host-side data directory) and
    RAGConfigResource to generateRAGInstanceResources; resource chain is now:
    DirResource → RAGServiceUserRole + RAGServiceKeysResource → RAGConfigResource
  • resources.go — registers ResourceTypeRAGConfig

Testing

Verification:

  1. Created a cluster

  2. Configured a database with RAG service (single-host and multi-host)
    rag_create_db.json
    rag_create_multi_host_db.json

  3. Confirmed successful database creation and generation of pgedge-rag-server.yaml with the expected data
    pgedge-rag-server.yaml

Checklist

  • Tests added

Notes for Reviewers

Container deployment is out of scope for this PR: the config file is written correctly, but no container is started yet.


coderabbitai bot commented Mar 24, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a06604c4-4d6f-4cdc-afbd-83859f94c8da

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough


This change introduces a new RAGConfigResource that manages YAML configuration generation for RAG service instances. It creates a host-side data directory, generates pgedge-rag-server.yaml from RAG pipeline definitions with database connection details, and integrates the new resource into the orchestrator's resource chain alongside existing user-role and keys resources.

Changes

  • RAG Config YAML Generation — rag_config.go: New module implementing the GenerateRAGConfig function that converts RAG pipeline definitions into YAML configuration, mapping database connection parameters, LLM providers/models, API key paths, and optional pipeline settings.
  • RAG Config Resource Implementation — rag_config_resource.go: New resource type RAGConfigResource managing the host-side pgedge-rag-server.yaml file within the service data directory, with dependencies on directory, user-role, and keys resources; includes credential population and file ownership management.
  • RAG Config Testing — rag_config_test.go, rag_config_resource_test.go: Comprehensive test suites validating YAML generation (database mapping, API key paths, LLM configuration, optional fields) and resource behavior (versioning, identifier generation, dependencies, diff ignore paths).
  • Orchestrator & Registry Integration — orchestrator.go, service_spec.go, resources.go, rag_service_user_role_test.go: Updated orchestrator to include RAGConfigResource in the resource chain; added the ragContainerUID constant; registered the new resource type in the registry; updated existing tests to reflect the four-resource output sequence.

Poem

🐰 New config resources hop into view,
YAML pipelines crafted just right and true,
RAG service data directories bloom,
Keys and credentials fill every room,
The orchestrator's burrow grows anew!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 13.64%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: The title accurately summarizes the main change: introducing a RAGConfigResource that generates the YAML config file for the RAG service.
  • Description check — ✅ Passed: The PR description covers all required sections: summary, changes, testing, and checklist. However, the changelog entry requirement is not addressed.



@tsivaprasad
Contributor Author

@coderabbitai review


coderabbitai bot commented Mar 24, 2026

@tsivaprasad: I'll review the changes in this PR right away!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/internal/orchestrator/swarm/orchestrator.go (1)

404-409: ⚠️ Potential issue | 🟠 Major

RAG instances are now provisionable but never deployed.

GenerateServiceInstanceResources routes "rag" into generateRAGInstanceResources, but that path only returns filesystem/user/key/config resources. There is no ServiceInstanceSpecResource or ServiceInstanceResource, so nothing ever creates a Swarm service for the persisted ServiceInstance. In practice this leaves RAG instances stuck in creating with no container that can ever become ready. Either keep the "rag" path disabled until deployment lands, or add the deployment resources in this PR.

Also applies to: 531-594

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 404 - 409,
The "rag" branch in GenerateServiceInstanceResources currently calls
generateRAGInstanceResources which only returns filesystem/user/key/config
resources and omits ServiceInstanceSpecResource and ServiceInstanceResource, so
RAG instances are never deployed; either remove/guard the "rag" case until
deployment support lands, or modify generateRAGInstanceResources to create and
return the same deployment resources as other service types (add
ServiceInstanceSpecResource and ServiceInstanceResource entries, including
container/service spec needed for Swarm creation) so the persisted
ServiceInstance can be turned into a running Swarm service; update the switch in
GenerateServiceInstanceResources and the body of generateRAGInstanceResources
(and mirrored logic around lines 531–594) accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2c6dcd4a-44fe-4d12-9fe1-c4b54b1b6cc8

📥 Commits

Reviewing files that changed from the base of the PR and between c384b7d and f67ddc3.

📒 Files selected for processing (8)
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/rag_config.go
  • server/internal/orchestrator/swarm/rag_config_resource.go
  • server/internal/orchestrator/swarm/rag_config_resource_test.go
  • server/internal/orchestrator/swarm/rag_config_test.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go
  • server/internal/orchestrator/swarm/resources.go
  • server/internal/orchestrator/swarm/service_spec.go

Comment on lines +24 to +35
type ragPipelineYAML struct {
	Name         string          `yaml:"name"`
	Description  string          `yaml:"description,omitempty"`
	Database     ragDatabaseYAML `yaml:"database"`
	Tables       []ragTableYAML  `yaml:"tables"`
	EmbeddingLLM ragLLMYAML      `yaml:"embedding_llm"`
	RAGLLM       ragLLMYAML      `yaml:"rag_llm"`
	APIKeys      *ragAPIKeysYAML `yaml:"api_keys,omitempty"`
	TokenBudget  int             `yaml:"token_budget,omitempty"`
	TopN         int             `yaml:"top_n,omitempty"`
	SystemPrompt string          `yaml:"system_prompt,omitempty"`
	Search       *ragSearchYAML  `yaml:"search,omitempty"`

⚠️ Potential issue | 🟡 Minor

Preserve explicit zero values in the YAML model.

database.RAGPipeline.TokenBudget, TopN, and database.RAGDefaults are pointer fields, but the YAML structs flatten them to int with omitempty. That collapses an explicit 0 into “unset”, so the generated YAML no longer faithfully represents the parsed config. Keep the optional numeric fields as *int and pass the pointers through.

🔧 Suggested fix
 type ragPipelineYAML struct {
 	Name         string          `yaml:"name"`
 	Description  string          `yaml:"description,omitempty"`
 	Database     ragDatabaseYAML `yaml:"database"`
 	Tables       []ragTableYAML  `yaml:"tables"`
 	EmbeddingLLM ragLLMYAML      `yaml:"embedding_llm"`
 	RAGLLM       ragLLMYAML      `yaml:"rag_llm"`
 	APIKeys      *ragAPIKeysYAML `yaml:"api_keys,omitempty"`
-	TokenBudget  int             `yaml:"token_budget,omitempty"`
-	TopN         int             `yaml:"top_n,omitempty"`
+	TokenBudget  *int            `yaml:"token_budget,omitempty"`
+	TopN         *int            `yaml:"top_n,omitempty"`
 	SystemPrompt string          `yaml:"system_prompt,omitempty"`
 	Search       *ragSearchYAML  `yaml:"search,omitempty"`
 }
@@
 type ragDefaultsYAML struct {
-	TokenBudget int `yaml:"token_budget,omitempty"`
-	TopN        int `yaml:"top_n,omitempty"`
+	TokenBudget *int `yaml:"token_budget,omitempty"`
+	TopN        *int `yaml:"top_n,omitempty"`
 }
@@
 	var defaults *ragDefaultsYAML
 	if params.Config.Defaults != nil {
-		d := &ragDefaultsYAML{}
-		if params.Config.Defaults.TokenBudget != nil {
-			d.TokenBudget = *params.Config.Defaults.TokenBudget
-		}
-		if params.Config.Defaults.TopN != nil {
-			d.TopN = *params.Config.Defaults.TopN
-		}
-		if d.TokenBudget != 0 || d.TopN != 0 {
+		d := &ragDefaultsYAML{
+			TokenBudget: params.Config.Defaults.TokenBudget,
+			TopN:        params.Config.Defaults.TopN,
+		}
+		if d.TokenBudget != nil || d.TopN != nil {
 			defaults = d
 		}
 	}
@@
 	if p.TokenBudget != nil {
-		pipeline.TokenBudget = *p.TokenBudget
+		pipeline.TokenBudget = p.TokenBudget
 	}
 	if p.TopN != nil {
-		pipeline.TopN = *p.TopN
+		pipeline.TopN = p.TopN
 	}

Also applies to: 72-75, 100-111, 181-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_config.go` around lines 24 - 35, The
YAML model is collapsing explicit zeros because TokenBudget and TopN are plain
ints with `omitempty`; update ragPipelineYAML to use `*int` for TokenBudget and
TopN (and change the corresponding numeric fields in any related YAML structs
such as ragDatabaseYAML and the RAGDefaults YAML struct to `*int` as well), and
preserve/propagate the pointers when mapping between internal structs and these
YAML structs so nil vs. explicit 0 is preserved (look for all mappings that
build/return ragPipelineYAML, ragDatabaseYAML, and the RAGDefaults YAML structs
and assign pointer values instead of converting to plain ints).

Comment on lines +94 to +98
func GenerateRAGConfig(params *RAGConfigParams) ([]byte, error) {
	pipelines := make([]ragPipelineYAML, 0, len(params.Config.Pipelines))
	for _, p := range params.Config.Pipelines {
		pipelines = append(pipelines, buildRAGPipelineYAML(p, params))
	}

⚠️ Potential issue | 🟠 Major

Reject mismatched same-provider API keys instead of silently preferring one.

The control-plane config model allows separate EmbeddingLLM.APIKey and RAGLLM.APIKey, but this generator overwrites the embedding path whenever both stages use the same provider. If a pipeline supplies different OpenAI/Anthropic keys for embedding vs. generation, one stage will authenticate with the wrong credential. If the server can only accept one key per provider, fail fast here (or in ParseRAGServiceConfig) when the two keys differ.

🔧 Suggested guard
 import (
+	"fmt"
 	"path"
@@
 func GenerateRAGConfig(params *RAGConfigParams) ([]byte, error) {
 	pipelines := make([]ragPipelineYAML, 0, len(params.Config.Pipelines))
 	for _, p := range params.Config.Pipelines {
-		pipelines = append(pipelines, buildRAGPipelineYAML(p, params))
+		pipeline, err := buildRAGPipelineYAML(p, params)
+		if err != nil {
+			return nil, err
+		}
+		pipelines = append(pipelines, pipeline)
 	}
@@
-func buildRAGPipelineYAML(p database.RAGPipeline, params *RAGConfigParams) ragPipelineYAML {
+func buildRAGPipelineYAML(p database.RAGPipeline, params *RAGConfigParams) (ragPipelineYAML, error) {
@@
-	apiKeys := buildRAGAPIKeysYAML(p, params.KeysDir)
+	apiKeys, err := buildRAGAPIKeysYAML(p, params.KeysDir)
+	if err != nil {
+		return ragPipelineYAML{}, err
+	}
@@
-	return pipeline
+	return pipeline, nil
 }
@@
-func buildRAGAPIKeysYAML(p database.RAGPipeline, keysDir string) *ragAPIKeysYAML {
+func buildRAGAPIKeysYAML(p database.RAGPipeline, keysDir string) (*ragAPIKeysYAML, error) {
+	if p.EmbeddingLLM.Provider == p.RAGLLM.Provider &&
+		p.EmbeddingLLM.APIKey != nil && p.RAGLLM.APIKey != nil &&
+		*p.EmbeddingLLM.APIKey != "" && *p.RAGLLM.APIKey != "" &&
+		*p.EmbeddingLLM.APIKey != *p.RAGLLM.APIKey {
+		return nil, fmt.Errorf(
+			"pipeline %q uses different %s API keys for embedding and rag",
+			p.Name,
+			p.RAGLLM.Provider,
+		)
+	}
+
 	keys := &ragAPIKeysYAML{}
@@
 	if keys.Anthropic == "" && keys.OpenAI == "" && keys.Voyage == "" {
-		return nil
+		return nil, nil
 	}
-	return keys
+	return keys, nil
 }

Also applies to: 130-160, 206-236

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_config.go` around lines 94 - 98,
GenerateRAGConfig currently builds pipelines by calling buildRAGPipelineYAML
without validating that per-stage credentials match; this lets a pipeline with
EmbeddingLLM.APIKey and RAGLLM.APIKey set to different values silently prefer
one key and misauthenticate. Add a guard where pipelines are iterated (in
GenerateRAGConfig, and similarly in ParseRAGServiceConfig if present) to detect
when both EmbeddingLLM.APIKey and RAGLLM.APIKey are non-empty for the same
provider (e.g., "openai", "anthropic") and differ, then return an error (fail
fast) describing the mismatched keys and include the pipeline identifier; update
buildRAGPipelineYAML callers to rely on this validation so no silent overwrite
occurs.

