| 1 | +--- |
| 2 | +title: "Scheduling Classes" |
| 3 | +linkTitle: "Scheduling Classes" |
| 4 | +description: "Restrict tenant workloads to specific nodes or failure domains using SchedulingClass resources and the Cozystack scheduler." |
| 5 | +weight: 150 |
| 6 | +--- |
| 7 | + |
| 8 | +SchedulingClass is a cluster-scoped custom resource that lets administrators define |
| 9 | +placement policies for tenant workloads. When a tenant is assigned a scheduling class, |
| 10 | +all of its pods are automatically routed to the Cozystack custom scheduler, which |
| 11 | +merges the class-defined constraints with any constraints already present on the pod. |
| 12 | + |
| 13 | +This allows platform operators to pin tenants to specific data centers, availability |
| 14 | +zones, or node groups — without modifying individual application charts. |
| 15 | + |
| 16 | +## How it works |
| 17 | + |
| 18 | +The feature has two components: |
| 19 | + |
| 20 | +1. **Lineage-controller webhook** (part of `cozystack`): a mutating admission webhook |
| 21 | + that intercepts pod creation in tenant namespaces. When a namespace carries the |
| 22 | + `scheduler.cozystack.io/scheduling-class` label, the webhook sets `schedulerName: cozystack-scheduler` |
| 23 | + and adds the `scheduler.cozystack.io/scheduling-class` annotation on every pod. |
| 24 | + If the referenced SchedulingClass CR does not exist (e.g. the scheduler is not installed), |
| 25 | + pods are left untouched and scheduled normally. |
| 26 | + |
| 27 | +2. **Cozystack scheduler** (the `cozystack-scheduler` package): a custom Kubernetes |
| 28 | + scheduler that runs alongside the default scheduler. During scheduling, it resolves |
| 29 | + the SchedulingClass referenced by the pod annotation and merges the CR's constraints |
| 30 | + (node affinity, pod affinity/anti-affinity, topology spread) with the pod's own spec — |
| 31 | + entirely in memory, without mutating the pod in the API server. |
| 32 | + |
| 33 | +## Prerequisites |
| 34 | + |
| 35 | +- Cozystack v1.2+ |
| 36 | +- The `cozystack-scheduler` system package (v0.2.0+) |
| 37 | + |
| 38 | +## Installing the scheduler |
| 39 | + |
| 40 | +```bash |
| 41 | +cozypkg add cozystack.cozystack-scheduler |
| 42 | +``` |
| 43 | + |
| 44 | +## Creating a SchedulingClass |
| 45 | + |
| 46 | +A SchedulingClass CR mirrors familiar Kubernetes scheduling primitives. All fields |
| 47 | +are optional — include only the constraints you need. |
| 48 | + |
| 49 | +### Example: pin workloads to a data center |
| 50 | + |
| 51 | +```yaml |
| 52 | +apiVersion: cozystack.io/v1alpha1 |
| 53 | +kind: SchedulingClass |
| 54 | +metadata: |
| 55 | + name: dc-west |
| 56 | +spec: |
| 57 | + nodeSelector: |
| 58 | + topology.kubernetes.io/region: us-west-2 |
| 59 | +``` |
| 60 | + |
| 61 | +Pods assigned to this class will only be scheduled on nodes labeled |
| 62 | +`topology.kubernetes.io/region=us-west-2`. |
| 63 | + |
| 64 | +### Example: spread across availability zones |
| 65 | + |
| 66 | +```yaml |
| 67 | +apiVersion: cozystack.io/v1alpha1 |
| 68 | +kind: SchedulingClass |
| 69 | +metadata: |
| 70 | + name: zone-spread |
| 71 | +spec: |
| 72 | + topologySpreadConstraints: |
| 73 | + - maxSkew: 1 |
| 74 | + topologyKey: topology.kubernetes.io/zone |
| 75 | + whenUnsatisfiable: DoNotSchedule |
| 76 | +``` |
| 77 | + |
| 78 | +{{% alert title="Note" %}} |
| 79 | +When a `topologySpreadConstraint` or pod affinity/anti-affinity term leaves its |
| 80 | +`labelSelector` unset, the scheduler automatically populates it with a selector matching |
| 81 | +the workload's Cozystack application identity labels (`apps.cozystack.io/application.group`, |
| 82 | +`.kind`, `.name`). This means you can define generic spreading or anti-affinity |
| 83 | +policies without hard-coding label values per application. |
| 84 | +{{% /alert %}} |
| 85 | + |
| 86 | +### Example: require dedicated nodes with anti-affinity |
| 87 | + |
| 88 | +```yaml |
| 89 | +apiVersion: cozystack.io/v1alpha1 |
| 90 | +kind: SchedulingClass |
| 91 | +metadata: |
| 92 | + name: dedicated-nodes |
| 93 | +spec: |
| 94 | + nodeAffinity: |
| 95 | + requiredDuringSchedulingIgnoredDuringExecution: |
| 96 | + nodeSelectorTerms: |
| 97 | + - matchExpressions: |
| 98 | + - key: node-pool |
| 99 | + operator: In |
| 100 | + values: |
| 101 | + - dedicated |
| 102 | + podAntiAffinity: |
| 103 | + requiredDuringSchedulingIgnoredDuringExecution: |
| 104 | + - topologyKey: kubernetes.io/hostname |
| 105 | +``` |
| 106 | + |
| 107 | +This pins workloads to nodes labeled `node-pool=dedicated` and keeps pods of the same |
| 108 | +application on separate hosts. The anti-affinity `labelSelector` is auto-populated per application, so |
| 109 | +pods from different applications of the same tenant can still land on the same node. |
| 110 | + |
| 111 | +## Full SchedulingClass spec reference |
| 112 | + |
| 113 | +| Field | Type | Description | |
| 114 | +|-------|------|-------------| |
| 115 | +| `spec.nodeSelector` | `map[string]string` | Node labels, as key-value pairs, that a node must carry to be eligible for the pod. | |
| 116 | +| `spec.nodeAffinity` | [`NodeAffinity`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#nodeaffinity-v1-core) | Required and preferred node affinity rules. | |
| 117 | +| `spec.podAffinity` | [`PodAffinity`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#podaffinity-v1-core) | Required and preferred pod co-location rules. | |
| 118 | +| `spec.podAntiAffinity` | [`PodAntiAffinity`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#podantiaffinity-v1-core) | Required and preferred pod anti-co-location rules. | |
| 119 | +| `spec.topologySpreadConstraints` | [`[]TopologySpreadConstraint`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#topologyspreadconstraint-v1-core) | Topology spread constraints for even distribution across failure domains. | |
| 120 | + |
| 121 | +## Assigning a SchedulingClass to a tenant |
| 122 | + |
| 123 | +When creating or editing a tenant, set the `schedulingClass` parameter to the name |
| 124 | +of an existing SchedulingClass CR: |
| 125 | + |
| 126 | +**Via the dashboard:** |
| 127 | + |
| 128 | +Select the scheduling class from the dropdown in the tenant creation form. |
| 129 | + |
| 130 | +**Via Helm values (`values.yaml`):** |
| 131 | + |
| 132 | +```yaml |
| 133 | +schedulingClass: dc-west |
| 134 | +``` |
| 135 | + |
| 136 | +**Via the tenant secret (child tenant inheritance):** |
| 137 | + |
| 138 | +When a parent tenant has a scheduling class assigned, all child tenants inherit it |
| 139 | +automatically. A child tenant cannot override the parent's scheduling class — it can |
| 140 | +only set one if the parent has none. |
| 141 | + |
| 142 | +The assignment writes the `scheduler.cozystack.io/scheduling-class` label on the |
| 143 | +tenant's namespace. The webhook reads this label (or resolves it from the owning |
| 144 | +Application CR) to inject the scheduler name and annotation into pods. |
| 145 | + |
| 146 | +## Auto-populated label selectors |
| 147 | + |
| 148 | +The scheduler (v0.2.0+) automatically fills in unset `labelSelector` fields on |
| 149 | +pod affinity, pod anti-affinity, and topology spread constraint terms. It uses |
| 150 | +the pod's Cozystack application identity labels: |
| 151 | + |
| 152 | +- `apps.cozystack.io/application.group` |
| 153 | +- `apps.cozystack.io/application.kind` |
| 154 | +- `apps.cozystack.io/application.name` |
| 155 | + |
| 156 | +This means that a generic SchedulingClass like: |
| 157 | + |
| 158 | +```yaml |
| 159 | +spec: |
| 160 | + podAntiAffinity: |
| 161 | + requiredDuringSchedulingIgnoredDuringExecution: |
| 162 | + - topologyKey: kubernetes.io/hostname |
| 163 | +``` |
| 164 | + |
| 165 | +will automatically scope the anti-affinity to pods of the same application — each |
| 166 | +application gets its own anti-affinity behavior without needing a separate |
| 167 | +SchedulingClass per app. |
| 168 | + |
| 169 | +The default label keys can be overridden in the scheduler's Helm values: |
| 170 | + |
| 171 | +```yaml |
| 172 | +defaultLabelSelectorKeys: |
| 173 | + - apps.cozystack.io/application.group |
| 174 | + - apps.cozystack.io/application.kind |
| 175 | + - apps.cozystack.io/application.name |
| 176 | +``` |
| 177 | + |
| 178 | +If a term already has an explicit `labelSelector`, it is preserved as-is. |
| 179 | + |
| 180 | +## Operators without native schedulerName support |
| 181 | + |
| 182 | +Some operators used by Cozystack do not expose `schedulerName` in their CRDs. |
| 183 | +The webhook-based approach handles these transparently because it mutates pods |
| 184 | +directly at admission time, regardless of which operator created them: |
| 185 | + |
| 186 | +- etcd-operator |
| 187 | +- redis-operator (spotahome) |
| 188 | +- mariadb-operator |
| 189 | +- clickhouse-operator (altinity) |
| 190 | + |
| 191 | +No special configuration is needed for workloads managed by these operators. |
| 192 | + |
| 193 | +## Verifying the setup |
| 194 | + |
| 195 | +1. Confirm the scheduler is running: |
| 196 | + |
| 197 | + ```bash |
| 198 | + kubectl get pods -n cozy-system -l app.kubernetes.io/name=cozystack-scheduler |
| 199 | + ``` |
| 200 | + |
| 201 | +2. Confirm the SchedulingClass exists: |
| 202 | + |
| 203 | + ```bash |
| 204 | + kubectl get schedulingclasses |
| 205 | + ``` |
| 206 | + |
| 207 | +3. Check that a tenant namespace has the label: |
| 208 | + |
| 209 | + ```bash |
| 210 | + kubectl get ns tenant-example -o jsonpath='{.metadata.labels.scheduler\.cozystack\.io/scheduling-class}' |
| 211 | + ``` |
| 212 | + |
| 213 | +4. Check that pods in the tenant namespace use the custom scheduler: |
| 214 | + |
| 215 | + ```bash |
| 216 | + kubectl get pods -n tenant-example -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.schedulerName}{"\n"}{end}' |
| 217 | + ``` |
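
To illustrate the webhook behavior described under "How it works", here is a rough sketch of what a pod might look like after admission into a tenant namespace labeled with the `dc-west` class. The pod name, container, and image are hypothetical. The annotation value is assumed to be the class name, since the page says only that the annotation is added, not what it contains.

```yaml
# Sketch of a pod after mutation by the lineage-controller webhook.
# Only schedulerName and the annotation key come from the documentation;
# everything else is illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: example-0                      # hypothetical pod name
  namespace: tenant-example
  annotations:
    # Assumption: the value is the name of the SchedulingClass.
    scheduler.cozystack.io/scheduling-class: dc-west
spec:
  schedulerName: cozystack-scheduler   # set by the webhook
  containers:
    - name: app                        # hypothetical container
      image: registry.example.com/app:latest
```

Because the scheduler merges the class constraints in memory, the `nodeSelector` from `dc-west` would not be expected to appear in the stored pod spec.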
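
The "Auto-populated label selectors" section states that a term with an explicit `labelSelector` is preserved as-is. A minimal sketch of such a class follows, assuming the term accepts the standard Kubernetes `PodAffinityTerm` fields. The class name and the `example.com/tier` label are hypothetical.

```yaml
# Sketch: anti-affinity scoped by an explicit selector instead of the
# auto-populated application identity labels.
apiVersion: cozystack.io/v1alpha1
kind: SchedulingClass
metadata:
  name: tier-anti-affinity             # hypothetical class name
spec:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:                 # explicit, so the scheduler leaves it unchanged
          matchLabels:
            example.com/tier: database # hypothetical label
```

With a selector like this, the anti-affinity would apply to every pod carrying the label, regardless of which application it belongs to. That is a broader scope than the per-application default.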