1 change: 1 addition & 0 deletions .golangci.yml
@@ -89,6 +89,7 @@ linters:
- io/ioutil.ReadFile
- io.Copy(*bytes.Buffer)
- io.Copy(os.Stdout)
- (github.com/go-kit/log.Logger).Log
# Display function signature instead of selector.
# Default: false
verbose: true
2 changes: 1 addition & 1 deletion Makefile
@@ -82,7 +82,7 @@ clean: ## Clean build time resources, primarily, unused docker images.

.PHONY: conform
conform:
- docker run --rm -v ${PWD}:/src -w /src ghcr.io/siderolabs/conform:v0.1.0-alpha.27 enforce
+ docker run --rm -v ${PWD}:/src -w /src ghcr.io/siderolabs/conform:v0.1.0-alpha.31 enforce

.PHONY: lint
lint:
21 changes: 20 additions & 1 deletion charts/operator/templates/role.yaml
@@ -68,7 +68,7 @@ rules:
- resources:
- configmaps
apiGroups: [""]
- verbs: ["list", "watch", "create"]
+ verbs: ["list", "watch", "create", "get", "patch", "update", "delete"]
- resources:
- configmaps
apiGroups: [""]
@@ -185,3 +185,22 @@ rules:
- gmp-operator
verbs: ["delete"]
{{- end -}}
{{- if .Values.collector.rbac.create -}}
{{- if .Values.operator.rbac.create }}
---
{{- end }}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: collector
namespace: {{.Values.namespace.system}}
{{- if .Values.commonLabels }}
labels:
{{- include "prometheus-engine.labels" . | nindent 4 }}
{{- end }}
rules:
- resources:
- configmaps
apiGroups: [""]
verbs: ["get", "list", "watch"]
{{- end -}}
17 changes: 17 additions & 0 deletions charts/operator/templates/rolebinding.yaml
@@ -103,4 +103,21 @@ subjects:
- name: collector
namespace: {{.Values.namespace.system}}
kind: ServiceAccount
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: collector
namespace: {{.Values.namespace.system}}
{{- if .Values.commonLabels }}
labels:
{{- include "prometheus-engine.labels" . | nindent 4 }}
{{- end }}
roleRef:
name: collector
kind: Role
apiGroup: rbac.authorization.k8s.io
subjects:
- name: collector
kind: ServiceAccount
{{- end -}}
12 changes: 7 additions & 5 deletions charts/operator/templates/rule-evaluator.yaml
@@ -44,10 +44,13 @@ spec:
initContainers:
- name: config-init
image: {{.Values.images.bash.image}}:{{.Values.images.bash.tag}}
- command: ['/bin/bash', '-c', 'touch /prometheus/config_out/config.yaml']
+ # placeholder.yaml keeps rule_files glob non-empty until the syncer populates rules-out.
+ command: ['/bin/bash', '-c', 'touch /prometheus/config_out/config.yaml /prometheus/rules_out/placeholder.yaml']
volumeMounts:
- name: config-out
mountPath: /prometheus/config_out
- name: rules-out
mountPath: /prometheus/rules_out
securityContext:
allowPrivilegeEscalation: false
capabilities:
@@ -99,6 +102,8 @@ spec:
- --config-file=/prometheus/config/config.yaml
- --config-file-output=/prometheus/config_out/config.yaml
- --config-dir=/etc/rules
- --config-dir-from-configmap-selector=monitoring.googleapis.com/rules-shard=true
- --config-dir-from-configmap-namespace={{.Values.namespace.system}}
- --config-dir-output=/prometheus/rules_out
- --watched-dir=/etc/secrets
- --reload-url=http://127.0.0.1:19092/-/reload
@@ -115,7 +120,6 @@ spec:
- name: config-out
mountPath: /prometheus/config_out
- name: rules
readOnly: true
mountPath: /etc/rules
- name: rules-out
mountPath: /prometheus/rules_out
@@ -137,9 +141,7 @@ spec:
- name: config-out
emptyDir: {}
- name: rules
- configMap:
-   name: rules-generated
-   defaultMode: 420
+ emptyDir: {}
- name: rules-out
emptyDir: {}
- name: rules-secret
4 changes: 4 additions & 0 deletions cmd/config-reloader/README.md
@@ -9,6 +9,10 @@ Meant to be run as a sidecar.
Usage of config-reloader:
-config-dir string
config directory to watch for changes
-config-dir-from-configmap-namespace string
namespace to list ConfigMaps from (required when --config-dir-from-configmap-selector is set)
-config-dir-from-configmap-selector string
label selector to discover ConfigMaps via K8s API (e.g. monitoring.googleapis.com/rules-shard=true). When set, materialized ConfigMap entries are written into --config-dir-output alongside any files from --config-dir.
-config-dir-output string
config directory to write with interpolated environment variables
-config-file string
180 changes: 180 additions & 0 deletions cmd/config-reloader/internal/syncer.go
@@ -0,0 +1,180 @@
// Copyright 2026 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package internal

import (
"context"
"crypto/sha256"
"fmt"
"os"
"path/filepath"
"sort"
"strings"
"time"

"github.com/go-kit/log"
"github.com/go-kit/log/level"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
)

// ConfigMapSyncer materializes ConfigMaps matched by selector into files
// under outputDir. ConfigMaps whose name does not start with namePrefix are
// skipped.
type ConfigMapSyncer struct {
client kubernetes.Interface
namespace string
selector string
namePrefix string
outputDir string
logger log.Logger
interval time.Duration

lastHash string
}

// NewConfigMapSyncer constructs a syncer using in-cluster credentials.
// Empty namePrefix disables the name check.
func NewConfigMapSyncer(namespace, selector, namePrefix, outputDir string, interval time.Duration, logger log.Logger) (*ConfigMapSyncer, error) {
cfg, err := rest.InClusterConfig()
if err != nil {
return nil, err
}
client, err := kubernetes.NewForConfig(cfg)
if err != nil {
return nil, err
}
return newConfigMapSyncerWithClient(client, namespace, selector, namePrefix, outputDir, interval, logger), nil
}

func newConfigMapSyncerWithClient(client kubernetes.Interface, namespace, selector, namePrefix, outputDir string, interval time.Duration, logger log.Logger) *ConfigMapSyncer {
return &ConfigMapSyncer{
client: client,
namespace: namespace,
selector: selector,
namePrefix: namePrefix,
outputDir: outputDir,
interval: interval,
logger: logger,
}
}

// Sync runs one list-and-write cycle. It returns whether any file changed.
func (s *ConfigMapSyncer) Sync(ctx context.Context) (bool, error) {
cmList, err := s.client.CoreV1().ConfigMaps(s.namespace).List(ctx, metav1.ListOptions{
LabelSelector: s.selector,
})
if err != nil {
return false, fmt.Errorf("list configmaps: %w", err)
}

files := make(map[string][]byte)
for i := range cmList.Items {
cm := &cmList.Items[i]
if s.namePrefix != "" && !strings.HasPrefix(cm.Name, s.namePrefix) {
level.Warn(s.logger).Log("msg", "ignoring configmap with unexpected name", "name", cm.Name, "want_prefix", s.namePrefix)
continue
}
for k, v := range cm.Data {
files[cm.Name+"__"+k] = []byte(v)
}
for k, v := range cm.BinaryData {
files[cm.Name+"__"+k] = v
}
}

hash := hashFiles(files)
if hash == s.lastHash {
return false, nil
}

if err := s.writeFiles(files); err != nil {
return false, err
}

s.lastHash = hash
level.Info(s.logger).Log("msg", "synced configmap rules", "configmaps", len(cmList.Items), "files", len(files))
return true, nil
}

// Run does an initial Sync and then re-syncs on every interval until ctx is cancelled.
func (s *ConfigMapSyncer) Run(ctx context.Context) error {
// Best-effort initial sync; the reloader will pick up files on its next poll cycle.
if _, err := s.Sync(ctx); err != nil {
level.Warn(s.logger).Log("msg", "initial configmap sync failed", "err", err)
}
ticker := time.NewTicker(s.interval)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return nil
case <-ticker.C:
if _, err := s.Sync(ctx); err != nil {
level.Warn(s.logger).Log("msg", "configmap sync failed", "err", err)
}
}
}
}

func (s *ConfigMapSyncer) writeFiles(files map[string][]byte) error {
medium

The writeFiles method updates files non-atomically. If the rule evaluator or the reloader polls the directory while files are being written or removed, it may encounter a partial or inconsistent state. Consider using an atomic update pattern, such as writing to a temporary directory within the same volume and performing an atomic swap (e.g., using a symlink or renaming the directory) to ensure the downstream consumer always sees a consistent set of rules.

Contributor Author

This is fine in practice: the config-reloader already hashes the directory on each poll and only triggers a reload when the hash changes. If it catches a mid-write state, it picks up the completed state on the next cycle. Not worth the complexity of an atomic swap imho.

Collaborator

Hm, so it worked with one file, but now we have many. Are we sure rule-eval will not load an inconsistent state of rule groups, especially during 1->3 and 3->1 transitions, e.g. causing duplicate groups or a key recording rule going missing? What am I missing?

One alternative is writing to one big file locally :) Another is integrating the K8s API into the Thanos config-reloader (we could have our own copy of it, it's not a lot of code). A symlink is odd but might work.

Contributor Author

True, fair point. There's a real window when going between one and multiple shards, where we can briefly see duplicates or gaps, but it self-heals on the next sync within about 10 seconds.

Having one merged file would fix the atomicity problem, but it adds a global group-name uniqueness constraint that we don't have today, and I'm not sure we want to deal with that.

If a few seconds of being out of sync isn't acceptable, I would propose implementing the same ..data symlink swap that K8s does for configmap mounts. No rule-eval changes would be needed then. It seems like the least invasive option to me. What do you think? Can we live with the self-healing?
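
For reference, the `..data` symlink swap discussed in this thread could look roughly like the sketch below. It is not code from this PR: `swapFiles`, `demo`, the `..gen-` directory prefix and the `..data_tmp` name are hypothetical, and it assumes a POSIX filesystem where renaming over an existing symlink is atomic. Unlike the kubelet, it does not create per-file symlinks in the root directory, so a consumer would have to read through `..data/`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dataLink mirrors the "..data" name used by ConfigMap volume mounts.
const dataLink = "..data"

// swapFiles writes files into a fresh generation directory under root and
// then repoints root/..data at it with a single rename.
// Hypothetical helper for illustration only.
func swapFiles(root string, files map[string][]byte) error {
	if err := os.MkdirAll(root, 0o755); err != nil {
		return err
	}
	// The generation directory lives on the same volume as the link.
	// Note: MkdirTemp creates it with mode 0700.
	gen, err := os.MkdirTemp(root, "..gen-")
	if err != nil {
		return err
	}
	for name, data := range files {
		if filepath.Base(name) != name {
			continue // same guard as writeFiles: skip names with path elements
		}
		if err := os.WriteFile(filepath.Join(gen, name), data, 0o644); err != nil {
			return err
		}
	}

	link := filepath.Join(root, dataLink)
	prev, _ := os.Readlink(link) // empty on the first run

	// Build the new link under a temporary name, then rename it over the old one.
	tmp := link + "_tmp"
	_ = os.Remove(tmp)
	if err := os.Symlink(filepath.Base(gen), tmp); err != nil {
		return err
	}
	if err := os.Rename(tmp, link); err != nil {
		return err
	}
	// Drop the previous generation once nothing points at it.
	if prev != "" {
		return os.RemoveAll(filepath.Join(root, prev))
	}
	return nil
}

// demo swaps twice in a temp dir and reports what a reader sees via ..data.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "rules-out-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	first := map[string][]byte{"a__rules.yaml": []byte("v1"), "b__rules.yaml": []byte("v1")}
	if err := swapFiles(root, first); err != nil {
		return "", err
	}
	if err := swapFiles(root, map[string][]byte{"a__rules.yaml": []byte("v2")}); err != nil {
		return "", err
	}

	entries, err := os.ReadDir(filepath.Join(root, dataLink))
	if err != nil {
		return "", err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	data, err := os.ReadFile(filepath.Join(root, dataLink, "a__rules.yaml"))
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%v %s", names, data), nil
}

func main() {
	view, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(view)
}
```

A reader going through `..data` sees either the old generation or the new one, never a mix, which is the property the 1->3 and 3->1 transitions need. The cost is an extra directory level (or per-file symlinks) that the rule_files glob would have to account for.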

if err := os.MkdirAll(s.outputDir, 0o755); err != nil {
return err
}

for name, data := range files {
if filepath.Base(name) != name {
continue
}
if err := os.WriteFile(filepath.Join(s.outputDir, name), data, 0o644); err != nil {
return err
}
}

entries, err := os.ReadDir(s.outputDir)
if err != nil {
return err
}
for _, e := range entries {
if e.IsDir() {
continue
}
if _, ok := files[e.Name()]; !ok {
if err := os.Remove(filepath.Join(s.outputDir, e.Name())); err != nil {
return err
}
}
}

return nil
}

func hashFiles(files map[string][]byte) string {
h := sha256.New()

keys := make([]string, 0, len(files))
for k := range files {
keys = append(keys, k)
}
sort.Strings(keys)

for _, k := range keys {
fmt.Fprintf(h, "%s\x00", k)
_, _ = h.Write(files[k])
_, _ = h.Write([]byte{0})
}
return fmt.Sprintf("%x", h.Sum(nil))
}
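
As a side note on `hashFiles`, the small program below is a sketch of the two properties the change detection in `Sync` relies on. The function body is copied from the diff above; the `main` function and its sample file names are made up for illustration. It checks that the digest does not depend on map insertion order (keys are sorted first) and that the NUL terminators keep a byte from sliding between a file name and its content unnoticed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// hashFiles is copied from syncer.go in this PR.
func hashFiles(files map[string][]byte) string {
	h := sha256.New()

	keys := make([]string, 0, len(files))
	for k := range files {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Fprintf(h, "%s\x00", k)
		_, _ = h.Write(files[k])
		_, _ = h.Write([]byte{0})
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// Same entries, different insertion order: the digests match.
	a := map[string][]byte{"cm__x.yaml": []byte("1"), "cm__y.yaml": []byte("2")}
	b := map[string][]byte{"cm__y.yaml": []byte("2"), "cm__x.yaml": []byte("1")}
	fmt.Println(hashFiles(a) == hashFiles(b)) // true

	// Same concatenated bytes, different name/content split: the digests differ.
	c := map[string][]byte{"ab": []byte("c")}
	d := map[string][]byte{"a": []byte("bc")}
	fmt.Println(hashFiles(c) == hashFiles(d)) // false
}
```

The framing is only unambiguous for content without NUL bytes, which holds for YAML rule files but not necessarily for arbitrary `BinaryData`.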