Comprehensive analysis toolkit for understanding LEV (Likely Exploited Vulnerabilities) performance relative to EPSS and KEV methodologies.
This suite provides sophisticated statistical analysis and visualization capabilities for vulnerability prioritization research, implementing and extending the methodologies described in NIST CSWP 41.
The LEV Analysis Suite consists of two complementary modules:
- Core LEV Analyzer - Essential comparisons and operational insights (8 analysis types)
- Advanced LEV Analyzer - Statistical validation and research-grade analysis (6 additional analysis types)
Together, they provide 14 different visualization types with comprehensive statistical validation for understanding vulnerability scoring methodologies.
View Generated Core Analysis Report
| Plot | Purpose | Key Insights |
|---|---|---|
| 1. EPSS vs LEV Scatter | Relationship visualization | Method complementarity, KEV distribution patterns |
| 2. LEV Recall Curve | NIST CSWP 41 validation | KEV coverage at different thresholds |
| 3. Probability Distributions | Score characteristics | Distribution shapes, outlier patterns |
| 4. Method Agreement Matrix | Overlap quantification | Agreement rates, unique identifications |
| 5. Temporal Evolution | Time-based patterns | CVE age vs risk score relationships |
| 6. Composite Effectiveness | Coverage comparison | Improvement over individual methods |
| 7. Risk Quadrant Analysis | Actionable categorization | Strategic prioritization guidance |
| 8. Summary Statistics | Comprehensive metrics | Key performance indicators |
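The recall curve in plot 2 can be approximated by hand as a sanity check. The sketch below assumes a single table that holds both a `lev_probability` column and a boolean `is_in_kev` column (joining the two input files on `cve` is an assumption), and the helper name `kev_recall_at_thresholds` is hypothetical rather than part of the suite.

```python
import pandas as pd


def kev_recall_at_thresholds(df, thresholds, score_col="lev_probability", kev_col="is_in_kev"):
    """Fraction of KEV-listed CVEs whose score is at or above each threshold."""
    kev_scores = df.loc[df[kev_col].astype(bool), score_col]
    if kev_scores.empty:
        raise ValueError("no KEV entries in the input; recall is undefined")
    return {t: float((kev_scores >= t).mean()) for t in thresholds}


# Tiny illustrative table; the identifiers and values are made up, not real CVE data.
toy = pd.DataFrame({
    "cve": ["CVE-A", "CVE-B", "CVE-C", "CVE-D", "CVE-E"],
    "lev_probability": [0.90, 0.40, 0.05, 0.60, 0.02],
    "is_in_kev": [True, True, True, False, False],
})
recall = kev_recall_at_thresholds(toy, thresholds=[0.1, 0.5])
print(recall)
```

Only rows with `is_in_kev` set contribute to the denominator, so non-KEV CVEs with high LEV scores do not affect recall.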
View Generated Advanced Analysis Report
| Plot | Purpose | Key Insights |
|---|---|---|
| 9. ROC Analysis | Predictive performance | KEV prediction accuracy comparison |
| 10. EPSS Evolution Impact | Temporal score analysis | How EPSS changes affect LEV calculations |
| 11. Sensitivity Analysis | Threshold optimization | Operational parameter selection |
| 12. Vulnerability Aging | Time-based risk patterns | CVE age effects on scoring |
| 13. Composite Value Analysis | Method combination benefits | Where composite scoring adds most value |
| 14. Statistical Validation | Correlation & significance | Statistical rigor and validation |
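Plot 13 depends on how the composite score is formed. As a rough sketch, assuming the composite follows the NIST CSWP 41 idea of taking the maximum of the EPSS, KEV, and LEV values for each CVE (with `kev_score` being 1.0 for listed CVEs and 0.0 otherwise), the CVEs that the composite surfaces beyond EPSS alone could be listed like this. Both function names are hypothetical, and the suite's own `composite_probability` column may be computed differently.

```python
import pandas as pd


def composite_probability(df):
    """Row-wise maximum of EPSS, KEV, and LEV (assumed composite definition)."""
    return df[["epss_score", "kev_score", "lev_probability"]].max(axis=1)


def newly_flagged(df, threshold):
    """CVEs at or above the threshold on the composite but below it on EPSS alone."""
    composite = composite_probability(df)
    mask = (composite >= threshold) & (df["epss_score"] < threshold)
    return df.loc[mask, "cve"].tolist()


# Made-up rows for illustration only.
toy = pd.DataFrame({
    "cve": ["CVE-A", "CVE-B", "CVE-C"],
    "epss_score": [0.02, 0.70, 0.01],
    "kev_score": [0.0, 0.0, 1.0],
    "lev_probability": [0.35, 0.10, 0.05],
})
toy["composite"] = composite_probability(toy)
added = newly_flagged(toy, threshold=0.2)
print(added)
```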
```python
# Required packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import roc_curve, auc, precision_recall_curve
from scipy import stats
```

```python
# Initialize core analyzer
analyzer = LEVAnalyzer(
    lev_file="data_out/lev_probabilities_nist_detailed.csv.gz",
    composite_file="data_out/composite_probabilities_nist.csv.gz"
)

# Generate comprehensive report with all core plots
# (named core_stats so it does not shadow the scipy `stats` import)
core_stats = analyzer.create_comprehensive_report()
```

```python
# Load data for advanced analysis
lev_df = pd.read_csv("data_out/lev_probabilities_nist_detailed.csv.gz")
composite_df = pd.read_csv("data_out/composite_probabilities_nist.csv.gz")

# Initialize advanced analyzer
advanced_analyzer = AdvancedLEVAnalyzer(lev_df, composite_df)

# Generate advanced analysis report
insights = advanced_analyzer.create_advanced_comprehensive_report()
```

```python
# Generate publication-quality plots
pub_insights = create_publication_ready_plots(
    lev_file="data_out/lev_probabilities_nist_detailed.csv.gz",
    composite_file="data_out/composite_probabilities_nist.csv.gz",
    output_dir="analysis/publication_plots"
)
```

The LEV Analysis Suite generates Markdown reports with embedded visualizations and detailed insights:
analysis/lev_analysis_plots/analysis_report.md
Contains:
- Executive summary with key dataset statistics
- 8 embedded visualizations with specific insights
- High-risk CVE analysis by threshold
- Method agreement and correlation analysis
- Strategic recommendations for vulnerability management
- Actionable quadrant analysis for prioritization
analysis/advanced_lev_analysis_plots/advanced_analysis_report.md
Contains:
- Statistical validation and correlation analysis
- ROC curve analysis for KEV prediction performance
- Sensitivity analysis for threshold optimization
- Temporal evolution and vulnerability aging analysis
- Composite method value-add quantification
- Research-grade statistical findings and implications
After running the analysis, you can directly access the generated reports:
```bash
# View core analysis results
open analysis/lev_analysis_plots/analysis_report.md

# View advanced analysis results
open analysis/advanced_lev_analysis_plots/advanced_analysis_report.md
```

Both reports include:
- Embedded visualizations for offline viewing
- Statistical summaries with key metrics
- Actionable insights for operational teams
- Research findings validating NIST methodologies
- Strategic recommendations for vulnerability prioritization
```
analysis/lev_analysis_plots/
├── epss_vs_lev_scatter.png          # Method relationship visualization
├── lev_recall_curve.png             # KEV coverage analysis
├── probability_distributions.png    # Score distribution comparison
├── method_agreement_matrix.png      # Overlap quantification
├── temporal_evolution.png           # Time-based patterns
├── composite_effectiveness.png      # Coverage improvement analysis
├── risk_quadrants.png               # Actionable categorization
└── analysis_report.md               # Comprehensive report with embedded plots
```

View Core Analysis Report
```
analysis/advanced_lev_analysis_plots/
├── roc_analysis.png                    # Predictive performance curves
├── epss_evolution_impact.png           # Temporal score relationships
├── method_sensitivity_analysis.png     # Threshold optimization
├── vulnerability_aging_analysis.png    # Age-based risk patterns
├── composite_value_analysis.png        # Value-add quantification
├── statistical_validation.png          # Correlation and significance
└── advanced_analysis_report.md         # Advanced statistical report
```

View Advanced Analysis Report
```
analysis/publication_plots/
├── roc_analysis_hires.png                # Research-quality ROC curves
├── sensitivity_analysis_hires.png        # Publication-ready sensitivity analysis
├── epss_evolution_impact_hires.png       # High-resolution temporal analysis
├── vulnerability_aging_hires.png         # Publication-ready aging analysis
├── composite_value_analysis_hires.png    # High-resolution value analysis
├── statistical_validation_hires.png      # Publication-quality statistics
└── publication_summary.md                # Research findings summary
```
- Method Complementarity: Quantifies how EPSS and LEV identify different high-risk CVE sets
- Optimal Thresholds: Data-driven recommendations for operational cutoff points
- ROI Analysis: Identifies scenarios where composite method provides maximum value
- Temporal Patterns: Reveals how vulnerability risk evolves over time
- Statistical Significance: Correlation coefficients and p-values for method relationships
- Predictive Performance: AUC scores for KEV prediction capabilities
- Method Agreement: Quantifies alignment between EPSS and LEV at different thresholds
- Distribution Analysis: Characterizes risk score distributions and outlier patterns
- High-Risk Identification: Comparative analysis of critical CVE detection
- Update Frequency: Guidance on score refresh requirements
- Prioritization Logic: Evidence-based triage recommendations
- Edge Cases: Analysis of scenarios where methods disagree
Based on NIST CSWP 41 research and empirical analysis:
| Metric | Expected Range | Interpretation |
|---|---|---|
| EPSS-LEV Correlation | 0.2 - 0.4 | Methods are complementary, not redundant |
| LEV KEV Recall @ 10% | 40% - 60% | Validates NIST findings |
| Composite Improvement | 20% - 30% | Additional high-risk CVEs identified |
| Method Disagreement | 15% - 25% | Unique value from each approach |
| KEV Prediction AUC | 0.7 - 0.8 | Good predictive performance |
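To check a dataset against the correlation and disagreement rows of this table, both quantities can be computed directly. The sketch assumes EPSS and LEV scores have already been aligned per CVE. "Disagreement" is taken here to mean the share of CVEs flagged by exactly one method at a given threshold, which is an assumption and may differ from the definition the suite uses; the function names are hypothetical.

```python
import pandas as pd


def correlation_summary(epss, lev):
    """Pearson and Spearman correlation between two aligned score Series."""
    pearson = epss.corr(lev)  # Series.corr defaults to Pearson
    # Spearman is Pearson computed on average ranks; this form needs only pandas.
    spearman = epss.rank().corr(lev.rank())
    return {"pearson": float(pearson), "spearman": float(spearman)}


def disagreement_rate(epss, lev, threshold):
    """Share of CVEs flagged by exactly one of the two methods."""
    flagged_epss = epss >= threshold
    flagged_lev = lev >= threshold
    return float((flagged_epss ^ flagged_lev).mean())


# Made-up scores for five CVEs.
epss = pd.Series([0.01, 0.05, 0.20, 0.60, 0.90])
lev = pd.Series([0.02, 0.30, 0.10, 0.70, 0.95])
summary = correlation_summary(epss, lev)
rate = disagreement_rate(epss, lev, threshold=0.25)
print(summary, rate)
```

Reporting Spearman alongside Pearson is useful because both scores are heavily skewed toward zero, so a rank-based measure is less dominated by a few extreme values.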
- Basic Relationships (Plots 1-3): Understand core EPSS-LEV dynamics
- Method Performance (Plots 4, 9): Validate against KEV ground truth
- Distribution Analysis (Plots 3, 14): Characterize score behaviors
- Threshold Selection (Plots 6, 11): Optimize for organizational needs
- Coverage Analysis (Plots 2, 6): Understand method coverage
- Value Quantification (Plot 13): Identify composite method benefits
- Temporal Analysis (Plots 5, 10, 12): Understand time-based patterns
- Risk Categorization (Plot 7): Develop actionable frameworks
- Statistical Validation (Plot 14): Ensure methodological rigor
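The risk categorization behind plot 7 can be expressed as a small rule. The quadrant labels and the 0.1 default thresholds below are illustrative assumptions, not the suite's actual settings, and `risk_quadrant` is a hypothetical helper.

```python
def risk_quadrant(epss_score, lev_score, epss_threshold=0.1, lev_threshold=0.1):
    """Classify one CVE by whether each score meets its threshold."""
    high_epss = epss_score >= epss_threshold
    high_lev = lev_score >= lev_threshold
    if high_epss and high_lev:
        return "both high"   # flagged by both methods
    if high_lev:
        return "LEV only"    # flagged by LEV but not by current EPSS
    if high_epss:
        return "EPSS only"   # flagged by current EPSS but not by LEV
    return "both low"


# Made-up (EPSS, LEV) pairs for illustration.
scores = {
    "CVE-A": (0.80, 0.90),
    "CVE-B": (0.02, 0.45),
    "CVE-C": (0.30, 0.01),
    "CVE-D": (0.01, 0.01),
}
quadrants = {cve: risk_quadrant(epss, lev) for cve, (epss, lev) in scores.items()}
print(quadrants)
```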
- Pearson correlation coefficients with significance testing
- Non-parametric correlation for non-normal distributions
- Cross-validation of relationships across different CVE subsets
- ROC curve analysis with AUC calculation
- Precision-recall curves for imbalanced datasets
- Bootstrap confidence intervals for performance metrics
- Mann-Whitney U tests for group comparisons
- Kolmogorov-Smirnov tests for distribution similarity
- Q-Q plots for normality assessment
- Cohen's kappa for categorical agreement
- Intraclass correlation coefficients
- Bland-Altman analysis for method comparison
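As one illustration of the performance items in this list, the sketch below estimates ROC AUC for KEV prediction together with a percentile bootstrap confidence interval. It computes AUC through its pairwise-ranking interpretation using NumPy alone rather than through `sklearn.metrics`, so it is an equivalent formulation and not necessarily how the suite does it. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def roc_auc(labels, scores):
    """AUC as P(positive score > negative score), with ties counted as 0.5."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]  # every positive/negative pair
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling CVEs with replacement."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, labels.size, size=labels.size)
        if labels[idx].all() or not labels[idx].any():
            continue  # skip resamples that contain only one class
        estimates.append(roc_auc(labels[idx], scores[idx]))
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Synthetic example: KEV members tend to receive higher scores.
rng = np.random.default_rng(42)
labels = np.array([True] * 30 + [False] * 170)
scores = np.where(labels, rng.beta(4, 2, 200), rng.beta(2, 4, 200))
auc_value = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores, n_boot=500)
print(f"AUC={auc_value:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

The pairwise matrix uses memory proportional to the number of positives times the number of negatives, so for a full CVE dataset a rank-based or library implementation would be the more practical choice.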
```python
# Analyze custom threshold ranges
# NOTE: `thresholds` is defined here but is not passed to the call below.
thresholds = np.arange(0.05, 0.95, 0.05)
sensitivity_results = analyzer.plot_method_sensitivity_analysis()
```

```python
# Focus on specific time periods
recent_analysis = analyzer.plot_temporal_evolution(
    sample_cves=500,
    date_range=('2023-01-01', '2024-12-31')
)
```

```python
# Generate specific plots for presentations
fig1 = analyzer.plot_epss_vs_lev_scatter(sample_size=5000, figsize=(14, 10))
fig2 = analyzer.plot_risk_quadrants(figsize=(12, 9))
```

Required columns:
- `cve`: CVE identifier
- `lev_probability`: LEV probability score
- `peak_epss_30day`: Peak EPSS score over 30-day window
- `first_epss_date`: First date CVE appeared in EPSS
- `num_relevant_epss_dates`: Number of EPSS data points
Required columns:
- `cve`: CVE identifier
- `epss_score`: Current EPSS score
- `kev_score`: KEV membership score
- `is_in_kev`: Boolean KEV membership
- `composite_probability`: Combined score
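A quick pre-flight check can catch schema problems before any plots are generated. The sketch assumes the first column list above applies to the LEV file and the second to the composite file, which the column names suggest but the text does not state outright. `missing_columns` is a hypothetical helper, not part of the suite.

```python
import pandas as pd

LEV_COLUMNS = {
    "cve", "lev_probability", "peak_epss_30day",
    "first_epss_date", "num_relevant_epss_dates",
}
COMPOSITE_COLUMNS = {
    "cve", "epss_score", "kev_score", "is_in_kev", "composite_probability",
}


def missing_columns(df, required):
    """Return the required column names absent from df, sorted for stable output."""
    return sorted(set(required) - set(df.columns))


# In practice the frame would come from pd.read_csv(...) on an input file;
# an empty frame with a partial header stands in for it here.
lev_df = pd.DataFrame(columns=["cve", "lev_probability", "peak_epss_30day"])
problems = missing_columns(lev_df, LEV_COLUMNS)
if problems:
    print(f"LEV input is missing columns: {problems}")
```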
- High Resolution: 300 DPI PNG output for publications
- Professional Styling: Serif fonts, proper sizing, clean layouts
- Color Accessibility: Colorblind-friendly palettes
- Annotation Rich: Detailed labels, legends, and insights
- Hover Information: Detailed CVE information on hover
- Zoom Capabilities: Interactive exploration of data regions
- Filter Options: Dynamic filtering by CVE characteristics
- Export Options: Multiple format support for different use cases
- Create new method in appropriate analyzer class
- Follow naming convention: `plot_analysis_name()`
- Return matplotlib figure object
- Include comprehensive docstring with purpose and insights
- Add to comprehensive report generation
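A plot following these steps might look like the sketch below. It is written as a standalone function because the analyzer classes' internals are not shown here; `plot_score_histogram` and its parameters are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the example runs headless
import matplotlib.pyplot as plt
import pandas as pd


def plot_score_histogram(df, score_col="lev_probability", bins=20, figsize=(10, 6)):
    """Histogram of one score column.

    Purpose: show how scores are spread across the 0-1 range.
    Insight: a tall bar near zero means most CVEs carry a low score.
    """
    fig, ax = plt.subplots(figsize=figsize)
    ax.hist(df[score_col].dropna(), bins=bins, range=(0.0, 1.0))
    ax.set_xlabel(score_col)
    ax.set_ylabel("Number of CVEs")
    ax.set_title(f"Distribution of {score_col}")
    return fig  # return the Figure so a report generator can save or embed it


fig = plot_score_histogram(pd.DataFrame({"lev_probability": [0.01, 0.02, 0.4, 0.9]}))
```

Returning the figure instead of calling `plt.show()` leaves saving and embedding to the caller, which matches the convention listed above.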
- Import required statistical libraries
- Implement validation methods in `generate_*_insights_report()`
- Add results to insights dictionary
- Include interpretation in markdown report
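For instance, Cohen's kappa for EPSS/LEV agreement at a threshold could be computed and stored as follows. The `insights` dictionary layout and the key name are assumptions made for illustration; the suite's actual structure may differ.

```python
def cohens_kappa(flags_a, flags_b):
    """Cohen's kappa for two equal-length lists of boolean flags."""
    if len(flags_a) != len(flags_b) or not flags_a:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(flags_a)
    observed = sum(a == b for a, b in zip(flags_a, flags_b)) / n
    p_a = sum(flags_a) / n  # share flagged by method A
    p_b = sum(flags_b) / n  # share flagged by method B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # both methods give the same constant answer
    return (observed - expected) / (1 - expected)


# Made-up scores for six CVEs.
threshold = 0.1
epss = [0.50, 0.02, 0.30, 0.01, 0.05, 0.90]
lev = [0.60, 0.20, 0.40, 0.02, 0.01, 0.03]
kappa = cohens_kappa([s >= threshold for s in epss], [s >= threshold for s in lev])

insights = {}
insights["epss_lev_kappa_at_0.1"] = round(kappa, 3)
print(insights)
```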
- Follow existing styling conventions
- Ensure accessibility compliance
- Add meaningful annotations and labels
- Test with various data sizes and distributions
- NIST CSWP 41: "Likely Exploited Vulnerabilities: A Proposed Metric for Vulnerability Exploitation Probability"
- EPSS Framework: Exploit Prediction Scoring System methodology
- KEV Catalog: CISA Known Exploited Vulnerabilities catalog
- Statistical Methods: Standard practices for predictive model evaluation
For questions about methodology, implementation, or results interpretation:
- Check Documentation: Review method docstrings and report insights
- Examine Output: Comprehensive reports include interpretation guidance
- Validate Results: Compare against NIST CSWP 41 expected findings
- Statistical Questions: Refer to implemented statistical method documentation
This analysis suite provides a toolkit for understanding LEV methodology performance and its complementary relationship with EPSS and KEV approaches for vulnerability prioritization.