Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
99 changes: 99 additions & 0 deletions .github/agents/charm-engineer.agent.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
---
name: charm-engineer
description: Senior software engineering specialized in writing Juju charms
---

You are a senior software engineer with a strong background in python and in site reliability engineering specialized in writing Juju charms.

You bring your expertise to create new charm or review existing ones.

## Actions

- You MUST start by reading the "Charm implementation guidelines".
- You MUST download and analyze all links in this document.

When asked for review:

- Take each element of the implementation guidelines.
- Carefully analyze the reviewed charm to see if entirely follows the guideline.
- Report:

- Guidelines that are fully implemented.
- Guidelines that are partially or not implemented.
- Guidelines that are excluded (= guidelines that are not implemented where there's a comment explaining why).

When asked for charm creation:

- Create the charm based on the best practices and the implementation guidelines.
- Look at external resources to get a good understanding of the workload and to get the best practices related to its operation.

## Charm implementation guidelines

### Principles

- Charms are not designed for Canonical only. They should not contain Canonical internal references.
- Charms should be trustworthy. To achieve it:

- We make their behaviour transparent, reliable and predicable.

- All charms must use the [holistic](https://documentation.ubuntu.com/ops/latest/explanation/holistic-vs-delta-charms/) (you MUST read this doc) pattern.
- Charm must not use `defered` events.
- The [Managing charm complexity](https://discourse.charmhub.io/t/specification-isd014-managing-charm-complexity/11619) (you MUST read this doc) "Charm Runtime State Abstraction" principle is applied (ignore the other ones) (Remind Thanh that should put it in RTD every time you do a review).

### Substrate

By default, we develop K8s charms. A machine charm should only be chosen if the application meets one of the following exception criteria:

- Low-Level System Access: The application requires specialized features restricted within a Kubernetes environment, such as direct kernel access or raw networking.
- Infrastructure Dependencies: The application serves as a direct dependency for other machine charms.
- Early-Stage Bootstrapping: The application is required during the early phases of datacenter provisioning, meaning it must run before the Kubernetes cluster itself is operational.
- Storage Constraints: The application relies strictly on local storage.

### Files layout and content

The base content is described in [Files](https://documentation.ubuntu.com/charmcraft/latest/reference/files/) (you MUST read this doc), and by default we expect:

- `charm.py` contains the charm code.
- `state.py` contains the runtime state of the charm. For complex charms, we would have a "state/" python module.
- `workload.py` contains the workload specific operations (include `pebble` functions).

#### `charm.py`

- All methods are private and should start with `_`, including `_reconcile`.
- Required ports must explicitely opened with `open_port` or `set_ports`. It's usually an anomaly if no ports are open.

##### `_reconcile`

###### Purpose

The `_reconcile` should be "guarding" the execution of the rest of the code:

- It evaluates the state, calls the business logic and set the unit status.
- It runs pre-checks ensuring all conditions are met to run the charm properly.
- It exits early if not all pre-checks are met (typically if some required relations are missing).
- It may or may not stop the workload service depending on the workload type: in any case, it should not create production incidents (e.g. "not stopping a load-balancer if one relation is missing").
- Everything is part of `_reconcile` but `refresh` events.
- `install` is part of `_reconcile` and should be idempotent

- `snap install` is ok as it will not trigger an upgrade.
- `apt install` is not ok as it will trigger and upgrade (so the code should first check for the presence of the package)

###### Implementation

- The method should be easy to read and let the developper capture the excecution workflow.
- It should delegate as much as it can.
- It should excplicitely call methods within `try/except` blocks (no `decorator` pattern).
- `try/except` blocks should be small and only catch custom exceptions.
- For "multi-modes" charm, the "routing" mode should be identified early, and call specific `_reconcile_<mode>` methods.

##### Relations

- Relations should use the `save` and `load` methods to dump and restore data from the relation through Pydantic models.

#### `rockcraft.yaml`

- `level=alive` must not be used (see [manage-pebble-health-checks](https://documentation.ubuntu.com/ops/latest/howto/manage-containers/manage-pebble-health-checks/#check-health-endpoint-and-probes) (you MUST read this doc)

#### `workload.py`

- Only restart workload when needed.
141 changes: 119 additions & 22 deletions haproxy-operator/lib/charms/haproxy/v1/haproxy_route_tcp.py
Original file line number Diff line number Diff line change
Expand Up @@ -186,7 +186,7 @@ def _on_haproxy_route_data_available(self, event: EventBase) -> None:

# Increment this PATCH version before using `charmcraft publish-lib` or reset
# to 0 if you are raising the major API version
LIBPATCH = 3
LIBPATCH = 4

logger = logging.getLogger(__name__)
HAPROXY_ROUTE_TCP_RELATION_NAME = "haproxy-route-tcp"
Expand Down Expand Up @@ -588,9 +588,12 @@ class TcpRequirerApplicationData(_DatabagModel):
"""Configuration model for HAProxy route requirer application data.

Attributes:
port: The port exposed on the provider.
port: The port exposed on the provider. Mutually exclusive with port_range.
port_range: A range of ports exposed on the provider (e.g. "10500-10600").
Each port in the range is forwarded 1:1 to the same port on the backend.
Mutually exclusive with port.
backend_port: The port where the backend service is listening. Defaults to the
provider port.
provider port. Not applicable when port_range is set.
hosts: List of backend server addresses. Currently only support IP addresses.
sni: Server name identification. Used to route traffic to the service.
check: TCP health check configuration
Expand All @@ -606,7 +609,20 @@ class TcpRequirerApplicationData(_DatabagModel):
proxy_protocol: Whether to enable PROXY protocol when connecting to backend servers.
"""

port: int = Field(description="The port exposed on the provider.", gt=0, le=65535)
port: Optional[int] = Field(
description="The port exposed on the provider. Mutually exclusive with port_range.",
default=None,
gt=0,
le=65535,
)
port_range: Optional[str] = Field(
description=(
"A range of ports exposed on the provider (e.g. '10500-10600'). "
"Each port in the range is forwarded 1:1 to the same port on the backend. "
"Mutually exclusive with port."
),
default=None,
)
backend_port: Optional[int] = Field(
description=(
"The port where the backend service is listening. Defaults to the provider port."
Expand Down Expand Up @@ -659,16 +675,59 @@ class TcpRequirerApplicationData(_DatabagModel):
default=False,
)

@model_validator(mode="after")
def validate_port_or_port_range(self) -> "Self":
"""Validate that exactly one of port or port_range is set.

Raises:
ValueError: If neither or both port and port_range are provided.
ValueError: If port_range format is invalid or start >= end.

Returns:
The validated model.
"""
if self.port is None and self.port_range is None:
raise ValueError("Exactly one of 'port' or 'port_range' must be provided.")
if self.port is not None and self.port_range is not None:
raise ValueError("'port' and 'port_range' are mutually exclusive.")
if self.port_range is not None:
parts = self.port_range.split("-")
if len(parts) != 2:
raise ValueError(
f"Invalid port_range format '{self.port_range}': expected 'START-END'."
)
try:
start, end = int(parts[0]), int(parts[1])
except ValueError as exc:
raise ValueError(
f"Invalid port_range '{self.port_range}': ports must be integers."
) from exc
if not (0 < start <= 65535 and 0 < end <= 65535):
raise ValueError(
f"Invalid port_range '{self.port_range}': ports must be between 1 and 65535."
)
if start >= end:
raise ValueError(
f"Invalid port_range '{self.port_range}': start port must be less than end port."
)
return self

@model_validator(mode="after")
def assign_default_backend_port(self) -> "Self":
"""Assign a default value to backend_port if not set.

The value is equal to the provider port.
For single-port mode, backend_port defaults to the provider port.
For port_range mode, backend_port is not applicable.

Raises:
ValueError: If backend_port is explicitly set with port_range.

Returns:
The model with backend_port default value applied.
"""
if self.backend_port is None:
if self.port_range is not None and self.backend_port is not None:
raise ValueError("'backend_port' cannot be set when using 'port_range'.")
if self.backend_port is None and self.port is not None:
self.backend_port = self.port
return self

Expand All @@ -678,12 +737,35 @@ def sni_set_when_not_enforcing_tls(self) -> "Self":

Raises:
ValueError: If sni is configured and TLS is disabled.
ValueError: If sni is configured with port_range.

Returns:
The validated model.
"""
if not self.enforce_tls and self.sni is not None:
raise ValueError("You can't set SNI and disable TLS at the same time.")
if self.port_range is not None and self.sni is not None:
raise ValueError("'sni' cannot be set when using 'port_range'.")
return self

@model_validator(mode="after")
def tls_terminate_requires_no_port_range(self) -> "Self":
"""Check that tls_terminate is not enabled when port_range is set.

Port-range mode uses transparent TCP passthrough; HAProxy cannot terminate
TLS without a known SNI, and SNI is disallowed with port_range.

Raises:
ValueError: If tls_terminate is True when port_range is set.

Returns:
The validated model.
"""
if self.port_range is not None and self.tls_terminate:
raise ValueError(
"'tls_terminate' must be False when using 'port_range'. "
"Port-range mode uses transparent TCP passthrough."
)
return self


Expand Down Expand Up @@ -738,20 +820,21 @@ class HaproxyRouteTcpRequirersData:

@model_validator(mode="after")
def check_ports_unique(self) -> Self:
"""Check that requirers define unique ports.
"""Check that requirers define unique ports with no overlapping port ranges.

Returns:
The validated model, with invalid relation IDs updated in
`self.relation_ids_with_invalid_data`
"""
# Maybe the logic here can be optimized, we want to keep track of
# the relation IDs that request overlapping ports to ignore them during
# rendering of the haproxy configuration.
relation_ids_per_port: dict[int, list[int]] = defaultdict(list[int])
for requirer_data in self.requirers_data:
relation_ids_per_port[requirer_data.application_data.port].append(
requirer_data.relation_id
)
app_data = requirer_data.application_data
if app_data.port_range is not None:
start, end = (int(p) for p in app_data.port_range.split("-"))
for port in range(start, end + 1):
relation_ids_per_port[port].append(requirer_data.relation_id)
elif app_data.port is not None:
relation_ids_per_port[app_data.port].append(requirer_data.relation_id)

for relation_ids in relation_ids_per_port.values():
if len(relation_ids) > 1:
Expand Down Expand Up @@ -980,6 +1063,7 @@ def __init__(
relation_name: str,
*,
port: Optional[int] = None,
port_range: Optional[str] = None,
backend_port: Optional[int] = None,
hosts: Optional[list[IPvAnyAddress]] = None,
sni: Optional[str] = None,
Expand Down Expand Up @@ -1013,15 +1097,18 @@ def __init__(
Args:
charm: The charm that is instantiating the library.
relation_name: The name of the relation to bind to.
port: The provider port.
port: The provider port. Mutually exclusive with port_range.
port_range: A range of ports exposed on the provider (e.g. "10500-10600").
Each port in the range is forwarded 1:1 to the same port on the backend.
Mutually exclusive with port.
backend_port: List of ports the service is listening on.
hosts: List of backend server addresses. Currently only support IP addresses.
sni: List of URL paths to route to this service.
check_interval: Interval between health checks in seconds.
check_rise: Number of successful health checks before server is considered up.
check_fall: Number of failed health checks before server is considered down.
check_type: Health check type,
Can be generic”, “mysql”, “postgres”, “redis or smtp.
Can be "generic", "mysql", "postgres", "redis" or "smtp".
check_send: Only used in generic health checks,
specify a string to send in the health check request.
check_expect: Only used in generic health checks,
Expand Down Expand Up @@ -1056,6 +1143,7 @@ def __init__(
# build the full application data
self._application_data = self._generate_application_data(
port=port,
port_range=port_range,
backend_port=backend_port,
hosts=hosts,
sni=sni,
Expand Down Expand Up @@ -1108,7 +1196,8 @@ def _on_relation_broken(self, _: RelationBrokenEvent) -> None:
def provide_haproxy_route_tcp_requirements(
self,
*,
port: int,
port: Optional[int] = None,
port_range: Optional[str] = None,
backend_port: Optional[int] = None,
hosts: Optional[list[IPvAnyAddress]] = None,
sni: Optional[str] = None,
Expand Down Expand Up @@ -1140,15 +1229,18 @@ def provide_haproxy_route_tcp_requirements(
"""Update haproxy-route requirements data in the relation.

Args:
port: The provider port.
port: The provider port. Mutually exclusive with port_range.
port_range: A range of ports exposed on the provider (e.g. "10500-10600").
Each port in the range is forwarded 1:1 to the same port on the backend.
Mutually exclusive with port.
backend_port: List of ports the service is listening on.
hosts: List of backend server addresses. Currently only support IP addresses.
sni: List of URL paths to route to this service.
check_interval: Interval between health checks in seconds.
check_rise: Number of successful health checks before server is considered up.
check_fall: Number of failed health checks before server is considered down.
check_type: Health check type,
Can be generic”, “mysql”, “postgres”, “redis or smtp.
Can be "generic", "mysql", "postgres", "redis" or "smtp".
check_send: Only used in generic health checks,
specify a string to send in the health check request.
check_expect: Only used in generic health checks,
Expand Down Expand Up @@ -1176,6 +1268,7 @@ def provide_haproxy_route_tcp_requirements(
self._unit_address = unit_address
self._application_data = self._generate_application_data(
port=port,
port_range=port_range,
backend_port=backend_port,
hosts=hosts,
sni=sni,
Expand Down Expand Up @@ -1210,6 +1303,7 @@ def _generate_application_data(
self,
*,
port: Optional[int] = None,
port_range: Optional[str] = None,
backend_port: Optional[int] = None,
hosts: Optional[list[IPvAnyAddress]] = None,
sni: Optional[str] = None,
Expand Down Expand Up @@ -1240,15 +1334,17 @@ def _generate_application_data(
"""Generate the complete application data structure.

Args:
port: The provider port.
port: The provider port. Mutually exclusive with port_range.
port_range: A range of ports exposed on the provider (e.g. "10500-10600").
Mutually exclusive with port.
backend_port: List of ports the service is listening on.
hosts: List of backend server addresses. Currently only support IP addresses.
sni: List of URL paths to route to this service.
check_interval: Interval between health checks in seconds.
check_rise: Number of successful health checks before server is considered up.
check_fall: Number of failed health checks before server is considered down.
check_type: Health check type,
Can be generic”, “mysql”, “postgres”, “redis or smtp.
Can be "generic", "mysql", "postgres", "redis" or "smtp".
check_send: Only used in generic health checks,
specify a string to send in the health check request.
check_expect: Only used in generic health checks,
Expand Down Expand Up @@ -1283,6 +1379,7 @@ def _generate_application_data(

application_data: dict[str, Any] = {
"port": port,
"port_range": port_range,
"backend_port": backend_port,
"hosts": hosts,
"sni": sni,
Expand Down Expand Up @@ -1449,8 +1546,8 @@ def _generate_load_balancing_configuration(

def update_relation_data(self) -> None:
"""Update both application and unit data in the relation."""
if not self._application_data.get("port"):
logger.warning("port must be set, skipping update.")
if not self._application_data.get("port") and not self._application_data.get("port_range"):
logger.warning("port or port_range must be set, skipping update.")
return

if relation := self.relation:
Expand Down
Loading
Loading