| 1 | ++++ |
| 2 | +author = "Jason Smith" |
| 3 | +title = "The SBOM Storage Tax: Optimization at Scale" |
| 4 | +date = "2026-03-02" |
| 5 | +linkedin = "https://www.linkedin.com/posts/j28smith_sbom-supplychainsecurity-finops-activity-7434359722511601665-DPTT" |
| 6 | +image = "img/thirdparty/2026-03-02-sbom-storage-tax.png" |
| 7 | ++++ |
| 8 | + |
| 9 | +Following my last post on the "Storage Tax" of binary blob signing, I received some insightful feedback from the |
| 10 | +community. The common critique was: |
| 11 | + |
| 12 | +> JSON minification doesn't really matter if you compress the JSON. |
| 13 | + |
| 14 | +It's a valid point. Technically, compression will save far more than minification. But does the choice to sign the |
| 15 | +"data" (allowing minification) still matter if we are just going to run it through Zstandard (zstd) anyway? |
| 16 | + |
| 17 | +To find out, I ran a series of tests on the sample set of CycloneDX and SPDX SBOMs from my |
| 18 | +[sbom-signing-best-practices](https://github.com/shiftleftcyber/sbom-signing-best-practices) GitHub repo. I compared |
| 19 | +the storage footprints of pretty-printed files, minified files, and their compressed counterparts. |
| 20 | + |
| 21 | +The data confirms the initial intuition, but only if you look at the surface-level percentages. Under the hood, there |
| 22 | +is a "hidden" efficiency gain that still justifies a data-aware signing strategy. |
| 23 | + |
| 24 | +## The Technical Reality Check |
| 25 | + |
| 26 | +Based on my analysis of the test data, here is the breakdown of average storage savings: |
| 27 | + |
| 28 | +- **Minification:** Reduced file size by ~32%. |
| 29 | +- **Zstd Compression:** Reduced file size by ~79%. |
| 30 | +- **The "Combo" (Minify + Zstd):** Reduced file size by ~81%. |
| 31 | + |
| 32 | +At first glance, the "compression-only" camp seems to have won. Adding minification to the compression pipeline |
| 33 | +appears to save only an additional 2 percentage points relative to the original file size. |
| 34 | + |
| 35 | +**The 2% Trap:** While 2% of the total original file seems negligible, we don't pay for storage based on the original |
| 36 | +file size. We pay based on the final compressed artifact. When we look at the results through a "FinOps" lens, that |
| 37 | +small delta becomes more significant. |
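To make the change of denominator concrete, here is the arithmetic using the rounded averages listed above. It is only a sketch: the variable names are illustrative, and because the inputs are rounded averages rather than per-file results, the output approximates the measured figure rather than reproducing it.

```python
# Rounded average reductions reported above (fractions of the original size).
zstd_only_reduction = 0.79
minify_zstd_reduction = 0.81

# What remains on disk, as a fraction of the original pretty-printed file.
zstd_only_remaining = 1 - zstd_only_reduction        # about 0.21
minify_zstd_remaining = 1 - minify_zstd_reduction    # about 0.19

# Saving measured against the original file: about 2 percentage points.
points_saved = (minify_zstd_reduction - zstd_only_reduction) * 100

# Saving measured against the artifact actually stored (the compressed file).
relative_saving = (
    (zstd_only_remaining - minify_zstd_remaining) / zstd_only_remaining * 100
)

print(f"{points_saved:.1f} percentage points of the original file")
print(f"{relative_saving:.1f}% of the compressed artifact")
```

The same two percentage points look roughly five times larger once they are measured against the roughly 21% of the file that is actually stored.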
| 38 | + |
| 39 | +## The 11% Efficiency Gain |
| 40 | + |
| 41 | +When we look at the final artifact size (the compressed file we actually pay to store), the story changes. |
| 42 | + |
| 43 | +In my tests, applying minification before Zstandard compression resulted in a final file that was 11% smaller on |
| 44 | +average than compressing the pretty-printed version. With the redundant whitespace and newlines stripped first, the |
| 45 | +compression algorithm has less input to encode and can focus purely on data patterns, leading to a tighter result. |
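A comparison like this can be reproduced with a few lines of Python. The sketch below is not the script used for these tests. It assumes the third-party `zstandard` package and falls back to the standard library's `zlib` (DEFLATE, a stand-in rather than Zstandard) when that package is missing. The `footprint` helper and the synthetic `sample` document are hypothetical.

```python
import json
import zlib

try:
    import zstandard  # third-party; assumed installed via `pip install zstandard`

    def compress(data: bytes) -> bytes:
        return zstandard.ZstdCompressor(level=3).compress(data)
except ImportError:
    # Fallback so the sketch still runs: zlib is a stand-in, not Zstandard.
    def compress(data: bytes) -> bytes:
        return zlib.compress(data, 6)


def footprint(sbom: dict) -> dict:
    """Return byte counts for the four storage variants of one SBOM."""
    pretty = json.dumps(sbom, indent=2).encode("utf-8")
    minified = json.dumps(sbom, separators=(",", ":")).encode("utf-8")
    return {
        "pretty": len(pretty),
        "minified": len(minified),
        "pretty+compressed": len(compress(pretty)),
        "minified+compressed": len(compress(minified)),
    }


# Hypothetical stand-in document; swap in json.load() on a real SBOM file.
sample = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "library",
            "name": f"pkg-{i}",
            "version": f"1.{i}.0",
            "purl": f"pkg:npm/pkg-{i}@1.{i}.0",
        }
        for i in range(200)
    ],
}

sizes = footprint(sample)
for variant, size in sizes.items():
    print(f"{variant:>20}: {size} bytes")
```

A synthetic document will not reproduce the averages reported here. The point is the method: compare the two compressed sizes against each other, not against the original.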
| 46 | + |
| 47 | +Let's scale that 11% efficiency gain to the requirements of the **EU Cyber Resilience Act (CRA)**: |
| 48 | + |
| 49 | +1. **10-Year Retention:** You aren't storing one SBOM. You are storing a decade of build history. An 11% saving on |
| 50 | +every artifact, accumulated over time, is a massive reduction in "whitespace debt" by year ten. |
| 51 | +2. **Enterprise Scale:** For thousands of software products and services, each with daily or even per-commit builds, |
| 52 | +this is the difference between manageable cold storage and a budget-breaking line item. |
| 53 | +3. **Global Traffic:** 11% less data means lower egress costs for every audit and transfer. |
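To get a feel for the scale argument, the retained volume can be estimated with simple multiplication. Every input below is a hypothetical placeholder, not a measurement from the tests. Only the 11% ratio comes from the results above, and `retained_bytes` is an illustrative helper that assumes nothing is deleted during the retention window.

```python
def retained_bytes(
    products: int, builds_per_day: float, avg_artifact_bytes: float, years: int
) -> float:
    """Total bytes held at the end of the window, assuming nothing is deleted."""
    return products * builds_per_day * 365 * years * avg_artifact_bytes


# Hypothetical inputs for illustration only.
products = 1_000
builds_per_day = 5
pretty_compressed = 100_000  # bytes per SBOM: pretty-printed, then compressed
minified_compressed = pretty_compressed * (1 - 0.11)  # 11% smaller, per the tests

baseline = retained_bytes(products, builds_per_day, pretty_compressed, years=10)
optimized = retained_bytes(products, builds_per_day, minified_compressed, years=10)
saved = baseline - optimized

print(f"saved {saved / 1e9:.1f} GB out of {baseline / 1e9:.1f} GB retained")
```

The percentage stays constant, but the absolute number of bytes saved grows with every build added to the archive.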
| 54 | + |
| 55 | +In this context, an 11% reduction in your long-term storage footprint isn't just "technical elegance". It's a |
| 56 | +significant reduction in operational overhead and "whitespace debt". |
| 57 | + |
| 58 | +## The Recommended Storage Strategy |
| 59 | + |
| 60 | +If you are locked into "Binary Blob Signing", you are effectively forbidden from fully optimizing your data. To keep |
| 61 | +that signature valid for a decade, you must store the compressed "Pretty-Printed" version. As a result, you are paying |
| 62 | +an 11% storage tax indefinitely. |
| 63 | + |
| 64 | +To avoid the storage tax, our recommended strategy for long-term SBOM retention is: |
| 65 | + |
| 66 | +1. **Implement Data-Aware Signing:** Stop signing the container (the file) and start signing the "Facts" (the |
| 67 | +canonicalized data). |
| 68 | +2. **Minify + Compress for Storage:** Use JSON minification to strip out the formatting overhead, then apply a modern |
| 69 | +algorithm like Zstandard to the minified version of the SBOM to reach peak efficiency. |
| 70 | + |
| 71 | +This approach ensures that your signatures are resilient to formatting changes while your storage is optimized for the |
| 72 | +next 10 years of compliance. |
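The idea of signing the data rather than the file can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme from the repository. Canonicalization is approximated with sorted keys and compact separators, which is not a full RFC 8785 (JCS) implementation. HMAC-SHA256 stands in for a real asymmetric signature. The key, the `canonical_bytes` and `sign` helpers, and the sample document are all hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(sbom_text: str) -> bytes:
    """Parse JSON and re-serialize it with sorted keys and no extra whitespace."""
    data = json.loads(sbom_text)
    return json.dumps(
        data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sign(sbom_text: str, key: bytes) -> str:
    """HMAC-SHA256 over the canonical form (stand-in for a real signature)."""
    return hmac.new(key, canonical_bytes(sbom_text), hashlib.sha256).hexdigest()


key = b"demo-key-not-for-production"  # hypothetical key, illustration only
doc = {
    "bomFormat": "CycloneDX",
    "components": [{"name": "left-pad", "version": "1.3.0"}],
}

pretty = json.dumps(doc, indent=2)
minified = json.dumps(doc, separators=(",", ":"))

# Same facts, different formatting: the data-aware signatures should match.
same_signature = hmac.compare_digest(sign(pretty, key), sign(minified, key))

# Hashing the raw bytes instead ties the result to the formatting.
blob_digests_differ = (
    hashlib.sha256(pretty.encode("utf-8")).digest()
    != hashlib.sha256(minified.encode("utf-8")).digest()
)

print(same_signature, blob_digests_differ)
```

Because the signature covers the canonical form, the stored copy can be minified and compressed freely. A verifier re-canonicalizes whatever it receives before checking.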
| 73 | + |
| 74 | +## Final Remarks |
| 75 | + |
| 76 | +I initially brought up the storage topic during my last presentation at the [OpenSSF](https://openssf.org/) SBOM |
| 77 | +Everywhere SIG meeting when presenting my findings on SBOM signing best practices, and it sparked an immediate shift |
| 78 | +in the conversation. A special thank you to [Kate Stewart](https://www.linkedin.com/in/katestewartaustin/) for |
| 79 | +inviting me to the SPDX tech call to give the same presentation there. And thanks to the OpenSSF SBOM Everywhere SIG |
| 80 | +members and the SPDX Tech call members for all the feedback that prompted this deeper dive. |
| 81 | + |
| 82 | +We are moving toward a world where signing the "Facts" isn't just more secure, it's cheaper. |
| 83 | + |
| 84 | +**The benchmark is being set.** Are you signing the container, or are you signing the data? |