Commit fa78a4b

silug and Copilot authored
Add AGENTS.md (#90)
Also update README.md with additional documentation around business logic

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent b6f2a88 commit fa78a4b

3 files changed

Lines changed: 200 additions & 0 deletions

.gitignore

Lines changed: 18 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -17,3 +17,21 @@
1717
/Gemfile.lock
1818
/vendor/
1919
/spec/fixtures/modules/
20+
21+
## AI coding assistant-specific configuration
22+
## The standard is AGENTS.md
23+
# Claude Code
24+
/CLAUDE.md
25+
/.claude/
26+
# GitHub Copilot
27+
/.github/copilot-instructions.md
28+
# Cursor
29+
/.cursor/
30+
/.cursorrules
31+
# Windsurf
32+
/.windsurf/
33+
/.windsurfrules
34+
# Gemini CLI
35+
/GEMINI.md
36+
# Cline
37+
/.clinerules/

AGENTS.md

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
1+
# AGENTS.md
2+
3+
This file provides guidance to AI agents when working with code in this repository.
4+
5+
## Overview
6+
7+
This is a Ruby gem (`compliance_engine`) that parses and works with [Sicura/SIMP Compliance Engine (SCE)](https://simp-project.com/docs/sce/) data. It also ships as a Puppet module providing a Hiera backend (`compliance_engine::enforcement`) for enforcing compliance profiles in Puppet environments.
8+
9+
## Commands
10+
11+
### Testing
12+
```bash
13+
# Run all tests and rubocop (default task)
14+
bundle exec rake
15+
16+
# Run just spec tests (with fixture prep/cleanup)
17+
bundle exec rake spec
18+
19+
# Run spec tests standalone (no fixture prep)
20+
bundle exec rake spec:standalone
21+
22+
# Run rubocop linting
23+
bundle exec rake rubocop
24+
25+
# Run a single spec file
26+
bundle exec rspec spec/classes/compliance_engine/data_spec.rb
27+
28+
# Run tests in parallel (used in CI for Ruby < 4.0)
29+
bundle exec rake parallel_spec
30+
```
31+
32+
### Development
33+
```bash
34+
# Install dependencies
35+
bundle install
36+
37+
# Open interactive shell with compliance data loaded
38+
bundle exec compliance_engine inspect --module /path/to/module
39+
40+
# CLI usage examples
41+
bundle exec compliance_engine profiles --modulepath /path/to/modules
42+
bundle exec compliance_engine hiera --profile my_profile --modulepath /path/to/modules
43+
bundle exec compliance_engine lookup some::class::param --profile my_profile --module /path/to/module
44+
```
45+
46+
## Architecture
47+
48+
### Data Model
49+
50+
Compliance data lives in YAML/JSON files at `<module>/SIMP/compliance_profiles/*.yaml` or `<module>/simp/compliance_profiles/*.yaml`. Files are structured with four top-level keys: `profiles`, `ce` (Compliance Elements), `checks`, and `controls`.
51+
52+
The library models this data with a two-layer class hierarchy:
53+
54+
**Collections** (`ComplianceEngine::Collection` subclass) hold named groups of components:
55+
- `ComplianceEngine::Profiles` — keyed by `'profiles'` in source data
56+
- `ComplianceEngine::Ces` — keyed by `'ce'` in source data
57+
- `ComplianceEngine::Checks` — keyed by `'checks'` in source data
58+
- `ComplianceEngine::Controls` — keyed by `'controls'` in source data
59+
60+
**Components** (`ComplianceEngine::Component` subclass) represent individual named entries within those collections:
61+
- `ComplianceEngine::Profile` — a named compliance profile
62+
- `ComplianceEngine::Ce` — a Compliance Element (CE)
63+
- `ComplianceEngine::Check` — a single compliance check; only `type: puppet-class-parameter` checks produce Hiera data via `Check#hiera`
64+
- `ComplianceEngine::Control` — a compliance control
65+
66+
A component can have multiple **fragments** (one per source file), which are deep-merged together via `deep_merge`. Confinement logic in `Component` filters fragments based on Puppet facts, module presence/version, and remediation risk level.
67+
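The fragment merge described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `merge_fragments`, the file names, and the sample data are hypothetical, and the helper simply lets later values win, which may differ from how the library's `deep_merge` call treats arrays or conflicting scalars.

```ruby
# Hypothetical helper (not the library's deep_merge): recursively merge two
# hashes, letting the overlay win when both sides hold a non-hash value.
def merge_fragments(base, overlay)
  base.merge(overlay) do |_key, old, new|
    if old.is_a?(Hash) && new.is_a?(Hash)
      merge_fragments(old, new)
    else
      new
    end
  end
end

# One fragment per (hypothetical) source file, all describing the same check.
fragments = {
  'a.yaml' => { 'controls' => { 'nist_800_53:rev4:AU-2' => true } },
  'b.yaml' => {
    'controls' => { 'nist_800_53:rev4:AU-3' => true },
    'ces' => { 'enable_audit_logging' => true },
  },
}

merged = fragments.values.reduce({}) { |acc, fragment| merge_fragments(acc, fragment) }
```

After the merge, `merged['controls']` holds both control keys and `merged['ces']` holds the single CE reference.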
68+
### Central Data Object
69+
70+
`ComplianceEngine::Data` is the primary entry point. It:
71+
1. Loads files via `open(*paths)` which delegates to `ModuleLoader` → `DataLoader::Yaml/Json`
72+
2. Uses Ruby's `Observable` pattern — `DataLoader` objects notify `Data` of changes
73+
3. Lazily constructs and caches the four collection objects; invalidates all caches when facts, enforcement_tolerance, modulepath, or environment_data change
74+
4. Exposes `Data#hiera(profiles)` which walks the check_mapping of requested profiles to produce a flat Hiera-compatible hash
75+
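The observer relationship and cache invalidation in items 2 and 3 can be illustrated with Ruby's `observer` library (a bundled gem on recent Ruby versions). The classes `SketchLoader` and `SketchData`, and the methods `attach` and `profile_names`, are hypothetical stand-ins and not the gem's real classes; this is a minimal sketch of the pattern, not the actual implementation.

```ruby
require 'observer'

# Hypothetical stand-in for a DataLoader: publishes itself to observers
# whenever its parsed data changes.
class SketchLoader
  include Observable

  attr_reader :key, :data

  def initialize(key)
    @key = key
    @data = {}
  end

  def data=(value)
    @data = value
    changed
    notify_observers(self)
  end
end

# Hypothetical stand-in for Data: records loader contents and drops its
# cached result on every update.
class SketchData
  def initialize
    @files = {}
    @cache = nil
  end

  def attach(loader)
    loader.add_observer(self) # Observable calls #update by default
    update(loader)
  end

  def update(loader)
    @files[loader.key] = loader.data
    @cache = nil
  end

  # Lazily built and cached until the next update.
  def profile_names
    @cache ||= @files.values.flat_map { |d| d.fetch('profiles', {}).keys }.sort
  end
end

loader = SketchLoader.new('a.yaml')
data = SketchData.new
data.attach(loader)
before = data.profile_names
loader.data = { 'profiles' => { 'my_profile' => {} } }
after = data.profile_names
```

Here `before` is computed from an empty loader, and `after` reflects the new profile because the notification cleared the cached list.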
76+
### Business Logic: From Profiles to Hiera
77+
78+
**`Data#hiera(profile_names)`** is the primary output method. It:
79+
1. Resolves each name to a `Profile` object (logs and skips unknown names).
80+
2. Calls `Data#check_mapping(profile)` for each profile to find all associated checks.
81+
3. Filters to checks with `type: 'puppet-class-parameter'`.
82+
4. Calls `Check#hiera` on each, which returns `{ settings['parameter'] => settings['value'] }`.
83+
5. Deep-merges all results into a single flat hash and caches it.
84+
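Steps 3 to 5 above can be approximated as follows. The sketch skips profile resolution and `check_mapping` (steps 1 and 2) and starts from an already selected set of checks; `sketch_hiera`, the check names, and the `manual` type are hypothetical, and a plain `Hash#merge` stands in for the deep merge.

```ruby
# Hypothetical check records shaped like the entries described above.
checks = {
  'audit_logging' => {
    'type' => 'puppet-class-parameter',
    'settings' => { 'parameter' => 'widget_spinner::audit_logging', 'value' => true },
  },
  'manual_review' => { 'type' => 'manual', 'settings' => {} },
  'log_level' => {
    'type' => 'puppet-class-parameter',
    'settings' => { 'parameter' => 'widget_spinner::log_level', 'value' => 'info' },
  },
}

# Keep puppet-class-parameter checks, turn each into a one-entry hash
# ({ parameter => value }), then merge the entries into one flat hash.
def sketch_hiera(checks)
  checks
    .values
    .select { |check| check['type'] == 'puppet-class-parameter' }
    .map { |check| { check['settings']['parameter'] => check['settings']['value'] } }
    .reduce({}) { |acc, entry| acc.merge(entry) }
end

hiera = sketch_hiera(checks)
```

The resulting `hiera` hash contains the two class parameters and omits the check whose type is not `puppet-class-parameter`.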
85+
**`Data#check_mapping(profile_or_ce)`** is the correlation engine that links profiles (or CEs) to checks. A check is included if **any** of the following hold (evaluated via `Data#mapping?`):
86+
87+
| Condition | What it checks |
88+
|-----------|---------------|
89+
| Shared **control** | `check.controls` and `profile.controls` share a key set to `true` |
90+
| Shared **CE** | `check.ces` and `profile.ces` share a key set to `true` |
91+
| CE→Control overlap | Any CE referenced in `check.ces` has a control that also appears in `profile.controls` |
92+
| Direct reference | `profile.checks[check_key]` is truthy |
93+
94+
`check_mapping` can also be called with CE objects (in addition to profiles). Results are cached by `"#{object.class}:#{object.key}"`.
95+
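The four conditions in the table can be sketched as a single predicate. Everything named here is an assumption for illustration: `SketchProfile`, `SketchCheck`, `shared_true_key?`, and `sketch_mapping?` are hypothetical, the CE lookup table `ce_controls` replaces the real CE collection, and the CE→Control condition is assumed to require entries set to `true`.

```ruby
SketchProfile = Struct.new(:controls, :ces, :checks, keyword_init: true)
SketchCheck = Struct.new(:key, :controls, :ces, keyword_init: true)

# True when both hashes have at least one common key set to `true`.
def shared_true_key?(left, right)
  left.any? { |key, value| value == true && right[key] == true }
end

# ce_controls maps a CE key to that CE's controls hash (hypothetical lookup).
def sketch_mapping?(check, profile, ce_controls)
  return true if shared_true_key?(check.controls, profile.controls) # shared control
  return true if shared_true_key?(check.ces, profile.ces)           # shared CE

  via_ce = check.ces.any? { |ce_key, enabled|
    enabled == true && shared_true_key?(ce_controls.fetch(ce_key, {}), profile.controls)
  }
  return true if via_ce                                             # CE -> control overlap

  profile.checks[check.key] ? true : false                          # direct reference
end

profile = SketchProfile.new(
  controls: { 'nist_800_53:rev4:AU-2' => true },
  ces: {},
  checks: { 'explicit_check' => true },
)
ce_controls = { 'enable_audit_logging' => { 'nist_800_53:rev4:AU-2' => true } }

via_ce = SketchCheck.new(key: 'via_ce', controls: {}, ces: { 'enable_audit_logging' => true })
explicit = SketchCheck.new(key: 'explicit_check', controls: {}, ces: {})
unrelated = SketchCheck.new(key: 'unrelated', controls: { 'nist_800_53:rev4:AC-1' => true }, ces: {})

selected = [via_ce, explicit, unrelated]
  .select { |check| sketch_mapping?(check, profile, ce_controls) }
  .map(&:key)
```

In this example `via_ce` is selected through its CE's control, `explicit_check` through the profile's `checks` map, and `unrelated` is left out.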
96+
### Loading Pipeline
97+
98+
```
99+
paths → EnvironmentLoader → ModuleLoader (one per module dir)
100+
→ DataLoader::Yaml / DataLoader::Json
101+
↓ (Observable notify)
102+
ComplianceEngine::Data#update
103+
```
104+
105+
- `EnvironmentLoader` scans a Puppet modulepath for module directories
106+
- `EnvironmentLoader::Zip` handles zip-archived environments
107+
- `ModuleLoader` reads a module's `metadata.json` and discovers compliance data files
108+
- `DataLoader` (and its subclasses) read and parse individual files; they use the Observable pattern to push updates to `Data`
109+
110+
### Puppet Hiera Backend
111+
112+
`lib/puppet/functions/compliance_engine/enforcement.rb` implements the Hiera `lookup_key` function. It:
113+
- Resolves profiles from `compliance_engine::enforcement` and optionally `compliance_markup::enforcement` Hiera keys
114+
- Creates and caches a `ComplianceEngine::Data` object on the Puppet lookup context
115+
- Calls `data.hiera(profiles)` and bulk-caches results for subsequent lookups
116+
- Supports `compliance_markup` backwards compatibility via `compliance_markup_compatibility` option
117+
118+
### Confinement and Enforcement Tolerance
119+
120+
`Component#fragments` filters source fragments based on:
121+
- **Fact confinement** (`confine` key): dot-notation Puppet facts (e.g. `os.release.major`). Values may be a string (exact match), a string prefixed with `!` (negation), or an array (any match). Implemented in `Component#fact_match?`. Fact confinement is skipped when `facts` is `nil`.
122+
- **Module confinement** (`confine.module_name` + `confine.module_version`): checks against `environment_data` (a `{module_name => version}` hash) using semantic versioning. Module confinement only runs when `environment_data` is set (e.g. by `ComplianceEngine::Data#open`).
123+
- **Remediation risk** (`remediation.risk`): when `enforcement_tolerance` is a positive `Integer`, drops fragments where risk level ≥ `enforcement_tolerance` and drops disabled remediations. Only applies to `Check` components.
124+
125+
In practice, only fact confinement is bypassed when `facts` is `nil`; module confinement still applies whenever `environment_data` is available. All confinement and risk/disabled-remediation filtering are effectively bypassed only when both `facts` and `environment_data` are unset and `enforcement_tolerance` is not a positive `Integer` (every fragment is then included). This is useful for offline analysis where system context and enforcement settings are unavailable.
126+
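The three value forms accepted by fact confinement can be sketched as below. This is not the body of `Component#fact_match?`; `dig_fact`, `sketch_fact_match?`, and `sketch_confined?` are hypothetical helpers, fact values are compared as strings, and requiring every `confine` entry to match is an assumption.

```ruby
# Hypothetical helper: resolve a dot-notation fact name in a nested hash.
def dig_fact(facts, name)
  facts.dig(*name.split('.'))
end

# Hypothetical matcher: array = any element matches, "!value" = negation,
# plain string = exact match.
def sketch_fact_match?(facts, name, expected)
  actual = dig_fact(facts, name)
  case expected
  when Array
    expected.any? { |value| sketch_fact_match?(facts, name, value) }
  when String
    if expected.start_with?('!')
      actual.to_s != expected.delete_prefix('!')
    else
      actual.to_s == expected
    end
  else
    actual == expected
  end
end

# Fact confinement is skipped entirely when no facts are available.
# Assumption: every confine entry must match for the fragment to be kept.
def sketch_confined?(facts, confine)
  return true if facts.nil?

  confine.all? { |name, expected| sketch_fact_match?(facts, name, expected) }
end

facts = { 'os' => { 'name' => 'RedHat', 'release' => { 'major' => '9' } } }

exact = sketch_fact_match?(facts, 'os.release.major', '9')
negated = sketch_fact_match?(facts, 'os.name', '!CentOS')
any_of = sketch_fact_match?(facts, 'os.release.major', %w[8 9])
no_facts = sketch_confined?(nil, { 'os.release.major' => '7' })
```

The last call shows the offline case: with `facts` set to `nil`, a fragment confined to a different release is still kept.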
127+
### Code Style
128+
129+
Rubocop is configured via `.rubocop.yml` inheriting from `voxpupuli-test`. Key style choices:
130+
- `compact` class/module nesting style (e.g. `class ComplianceEngine::Data` not nested modules)
131+
- Trailing commas on multiline args/arrays
132+
- Leading dot position for method chaining
133+
- `braces_for_chaining` block delimiters
134+
- Max line length: 200
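
A short snippet in roughly this style might look like the following. The `StyleSketch` namespace and its contents are hypothetical, and the snippet has not been checked against the repository's actual `.rubocop.yml`; it only illustrates the conventions listed above.

```ruby
# The parent namespace must already exist before the compact form is used.
module StyleSketch
end

# `compact` nesting: the class is declared with its full name on one line.
class StyleSketch::Report
  def initialize(entries)
    @entries = entries
  end

  # Leading dots for chained calls; braces on a multiline block whose
  # return value is chained.
  def enabled_keys
    @entries
      .select { |_key, enabled|
        enabled
      }
      .keys
      .sort
  end
end

report = StyleSketch::Report.new(
  {
    'b_check' => true,
    'a_check' => true,
    'c_check' => false, # trailing comma after the last multiline element
  },
)
keys = report.enabled_keys
```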

README.md

Lines changed: 48 additions & 0 deletions
@@ -42,6 +42,54 @@ Options:
4242

4343
See the [`ComplianceEngine::Data`](https://rubydoc.info/gems/compliance_engine/ComplianceEngine/Data) class for details.
4444

45+
## Concepts
46+
47+
### Data Model
48+
49+
Compliance data is expressed across four entity types that live in YAML/JSON files inside Puppet modules (`<module>/SIMP/compliance_profiles/*.yaml`):
50+
51+
| Entity | Key | Purpose |
52+
|--------|-----|---------|
53+
| **Profile** | `profiles` | A named compliance standard (e.g. `nist_800_53_rev4`). References CEs, checks, and/or controls that together constitute that standard. |
54+
| **CE** (Compliance Element) | `ce` | A single, named compliance capability (e.g. "enable audit logging"). Bridges profiles to checks via a shared vocabulary. |
55+
| **Check** | `checks` | A verifiable assertion about a system setting. Checks of `type: puppet-class-parameter` carry a `parameter` and `value` that become Hiera data. |
56+
| **Control** | `controls` | A cross-reference label from an external framework (e.g. `nist_800_53:rev4:AU-2`). Profiles and checks both annotate themselves with controls to express alignment. |
57+
58+
### From Profiles to Hiera Data
59+
60+
The central operation of the library is `Data#hiera(profiles)`, which converts a list of profile names into a flat hash of Puppet class parameters and their enforced values:
61+
62+
```
63+
profile names
64+
↓ check_mapping: find all checks that belong to each profile
65+
checks (type: puppet-class-parameter only)
66+
↓ Check#hiera: extract { 'class::param' => value }
67+
deep-merged hash → { 'widget_spinner::audit_logging' => true, ... }
68+
```
69+
70+
**How check_mapping works** — a check is considered part of a profile if any of the following are true:
71+
72+
1. The check and profile share a **control** label (`nist_800_53:rev4:AU-2`).
73+
2. The check and profile share a **CE** reference.
74+
3. The check's CE and the profile share a **control** label.
75+
4. The profile explicitly lists the check by key under its `checks:` map.
76+
77+
This layered matching lets compliance authors express mappings at different levels of abstraction and have the engine resolve them automatically.
78+
79+
### Confinement
80+
81+
A component (profile, CE, check, or control) may be defined across multiple source files. Each file contributes a **fragment**. Before fragments are merged, they are filtered by:
82+
83+
- **Facts** (`confine:` key): dot-notation Puppet facts, optionally negated with a `!` prefix. A fragment is dropped if its confinement does not match the current system's facts.
84+
- **Module presence/version** (`confine.module_name` / `confine.module_version`): fragment is dropped if the required module is absent or the wrong version.
85+
- **Remediation risk/status** (`remediation.risk` / `remediation.disabled`): when `enforcement_tolerance` is set to a positive Integer, a fragment is dropped if remediation is explicitly `disabled` or if its risk level is ≥ `enforcement_tolerance`.
86+
87+
If `facts` is `nil`, fact confinement is skipped; module confinement still applies whenever module version data (`environment_data`) is available, and fragments remain subject to remediation-based filtering when `enforcement_tolerance` is set.
88+
89+
### Enforcement Tolerance
90+
91+
`enforcement_tolerance` is an optional integer threshold that controls how cautiously the engine applies remediations. When it is set to a positive Integer, fragments whose `remediation.risk.level` meets or exceeds the threshold, or whose remediation is explicitly `disabled`, are silently excluded from the merged result, allowing operators to tune aggressiveness (e.g. apply only low-risk remediations in production, all remediations in a test environment). When `enforcement_tolerance` is `nil` or not a positive Integer, no remediation-based filtering occurs and `remediation.risk` / `remediation.disabled` do not affect fragment inclusion.
92+
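The threshold behaviour can be sketched as a small predicate. The fragment shape used here (`remediation` with a nested `risk.level` and a `disabled` flag) is an assumption based on the key names above, `keep_fragment?` is a hypothetical helper, and keeping fragments that carry no risk level is likewise assumed.

```ruby
# Hypothetical filter: decide whether a fragment survives for a given tolerance.
def keep_fragment?(fragment, enforcement_tolerance)
  # No filtering unless the tolerance is a positive Integer.
  return true unless enforcement_tolerance.is_a?(Integer) && enforcement_tolerance.positive?

  remediation = fragment.fetch('remediation', {})
  return false if remediation['disabled']

  level = remediation.dig('risk', 'level')
  level.nil? || level < enforcement_tolerance
end

fragments = {
  'low' => { 'remediation' => { 'risk' => { 'level' => 20 } } },
  'high' => { 'remediation' => { 'risk' => { 'level' => 80 } } },
  'off' => { 'remediation' => { 'disabled' => true } },
  'plain' => {},
}

cautious = fragments.select { |_name, fragment| keep_fragment?(fragment, 40) }.keys
unfiltered = fragments.select { |_name, fragment| keep_fragment?(fragment, nil) }.keys
```

With a tolerance of 40 only the low-risk and unannotated fragments remain, while a `nil` tolerance keeps all four.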
4593
## Using as a Puppet Module
4694

4795
The Compliance Engine can be used as a Puppet module to provide a Hiera backend for compliance data. This allows you to enforce compliance profiles through Hiera lookups within your Puppet manifests.
