xRiskLab
diff --git a/.cursor/rules/python/RULE.md b/.cursor/rules/python/RULE.md
Lines changed: 68 additions & 0 deletions
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 10 additions & 4 deletions
diff --git a/.github/workflows/compatibility.yml b/.github/workflows/compatibility.yml
Lines changed: 8 additions & 0 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 8 additions & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 32 additions & 0 deletions
diff --git a/Makefile b/Makefile
Lines changed: 14 additions & 2 deletions
diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/examples/fastwoe_shapley.py b/examples/fastwoe_shapley.py
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,68 @@
+---
+title: Python Code Style and Formatting Rules
+description: Rules for writing Python code with proper formatting and output standards
+globs: **/*.py
+alwaysApply: true
+---
+
+## Code Quality and Formatting Tools
+
+### Code Review with Sourcery
+  - Use Sourcery for automated code review and fixes:
+    ```bash
+    # Review and fix a specific file
+    sourcery review --fix path/to/file.py
+
+    # Review and fix all Python files in a directory
+    sourcery review --fix chapters/feature_flags/local/scripts/
+
+    # Review without applying fixes (dry run)
+    sourcery review path/to/file.py
+    ```
+  - Sourcery provides suggestions for:
+    - Code simplification and refactoring
+    - Removing unnecessary else clauses
+    - Simplifying conditionals
+    - Improving variable naming
+    - Reducing complexity
+  - Run Sourcery before committing code to catch common issues early.
+  - Sourcery suggestions should be reviewed and applied when they improve code readability and maintainability.
+
+## Code Output and Formatting
+
+### Prohibited Patterns
+
+- **DO NOT USE decorative separator lines:**
+    - `print("=" * 80)`
+    - `print("-" * 50)`
+    - Any decorative print statements using repeated characters
+- **DO NOT USE empty print statements for spacing:**
+    - `print()` used only for adding blank lines in output
+- **DO NOT USE bullet point summary statements:**
+    - `print("  • AppConfig configuration was updated...")`
+    - `print("  - Key finding: ...")`
+    - `print("  * Summary: ...")`
+    - Any print statements with bullet points or indented summary text
+- **DO NOT USE emojis in output (Python scripts only):**
+    - `print("✅ Success")`
+    - `print("❌ Error")`
+    - `print("📊 Step 1: ...")`
+    - Any emoji characters in print statements
+    - **Exception**: Emojis and colors are OK in Makefiles
+- **DO NOT USE leading spaces in print statements:**
+    - `print("   Text with leading spaces")`
+    - `print(f"    Testing etc")`
+    - Use plain text without leading spaces for indentation
+
+### Recommended Patterns
+
+- **DO USE:**
+    - Direct, informative print statements without decorative elements
+    - Concise output that focuses on actionable information
+    - No extra spacing or formatting beyond what's necessary
+    - Plain text status indicators (e.g., "Success:", "Error:", "Step 1:")
+    - **Makefiles**: Emojis and colors are acceptable, but avoid extra spacing
+
+### Rationale
+
+Plain output is cleaner and more professional, with less visual clutter, and is easier to parse programmatically.
@@ -43,6 +43,14 @@ jobs:
       run: |
         uv run python -m pytest tests/ -v --tb=short -m "not compatibility"
 
+    - name: Cleanup test artifacts
+      if: always()
+      run: |
+        rm -rf .pytest_cache/
+        rm -rf .venv/
+        rm -rf __pycache__/
+        find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true
+
     # Coverage reporting disabled until pytest-cov is added to dependencies
     # - name: Upload coverage to Codecov
     #   if: matrix.python-version == '3.11'
@@ -98,8 +106,6 @@ jobs:
       run: |
         uv sync --dev
 
-    - name: Run mypy (if configured)
+    - name: Run mypy
       run: |
-        # Add mypy configuration and uncomment when ready
-        # uv run mypy fastwoe/
-        echo "Type checking skipped - configure mypy when ready"
+        uv run mypy fastwoe/ --check-untyped-defs
@@ -59,6 +59,14 @@ jobs:
         uv run python -c "import fastwoe; print(f'fastwoe version: {fastwoe.__version__}')"
         uv run python -m pytest tests/test_compatibility.py -v --tb=short
 
+    - name: Cleanup test artifacts
+      if: always()
+      run: |
+        rm -rf .pytest_cache/
+        rm -rf .venv/
+        rm -rf __pycache__/
+        find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true
+
     - name: Upload test results
       uses: actions/upload-artifact@v4
       if: always()
 
@@ -20,3 +20,11 @@ repos:
       - id: ruff-check
         args: [--fix]
       - id: ruff-format
+
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.11.2
+    hooks:
+      - id: mypy
+        additional_dependencies: [pandas-stubs]
+        args: [--check-untyped-defs]
+        files: ^fastwoe/
@@ -1,5 +1,37 @@
 # Changelog
 
+## Version 0.1.6 (2026-01-07)
+
+**Stable Release: Type Safety, Robustness & Code Quality** 🎯
+
+### ✨ Improvements
+- **Complete Type Safety**: All type checking passes with both `ty` and `mypy` type checkers
+  - Full type annotations throughout the codebase
+  - Proper type narrowing for `Optional` and `Union` types
+  - Handled complex pandas/numpy/faiss type scenarios
+- **Robust Numba Import**: Added fallback mechanism for numba/llvmlite compatibility issues
+  - Graceful degradation when numba fails to import (common in some Python 3.12 environments)
+  - Code continues to work without JIT compilation (with performance trade-off)
+  - Clear warning messages when fallback is used
+- **Enhanced Compatibility Testing**: Improved test robustness for cross-version compatibility
+  - Better error handling for known environment-specific issues
+  - More reliable test execution across Python and scikit-learn versions
+
+### 🔧 Technical Improvements
+- Added comprehensive type hints for better IDE support and static analysis
+- Improved error handling for edge cases (numba/llvmlite, matplotlib optional dependencies)
+- Enhanced code quality with strict type checking (no shortcuts or lenient mode)
+- Fixed all type-related errors across the codebase
+
+### 📦 Dependencies
+- No changes to core dependencies
+- Enhanced type checking support with `pandas-stubs`
+
+### 🐛 Bug Fixes
+- Fixed `UnboundLocalError` in `interpret_fastwoe.py` for `sample_series` variable
+- Fixed `AttributeError` for matplotlib `Axes` type hints when matplotlib is optional
+- Fixed compatibility test failures related to numba/llvmlite memory issues
+
 ## Version 0.1.6a3 (2025-12-14)
 
 **Alpha Release: CAP Curves, Styled Display, MSD Feature Selection & Enhanced Metrics** 📊
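
The numba fallback described in the 0.1.6 changelog entry is not shown in this diff. A minimal sketch of one common way to get that behavior follows, assuming a module-level import guard and a no-op `njit` stand-in. The names `NUMBA_AVAILABLE` and `sum_below` are hypothetical and not taken from fastwoe.

```python
import warnings

try:
    from numba import njit  # real JIT decorator when numba/llvmlite load cleanly

    NUMBA_AVAILABLE = True
except Exception as exc:  # broad on purpose: llvmlite problems are not always ImportError
    NUMBA_AVAILABLE = False
    warnings.warn(
        f"numba could not be imported ({exc!r}); running without JIT compilation.",
        RuntimeWarning,
        stacklevel=2,
    )

    def njit(*args, **kwargs):
        """No-op stand-in that supports both @njit and @njit(...)."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as a bare decorator: @njit

        def decorator(func):
            return func  # used with options: @njit(cache=True)

        return decorator


@njit
def sum_below(n):
    # Hypothetical hot loop; runs compiled with numba, as plain Python without it.
    total = 0
    for i in range(n):
        total += i
    return total


print(NUMBA_AVAILABLE, sum_below(5))
```

Decorated functions keep the same call signature either way, so callers do not need to know which path was taken. A guard like this only covers failures that surface as Python exceptions at import time.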
 
@@ -32,6 +32,11 @@ typecheck:  ## Run type checking (lenient mode)
 	}
 	@echo "✅ Type checking passed"
 
+mypy:  ## Run mypy type checking
+	@echo "Running mypy type checking..."
+	@uv run mypy fastwoe/ --check-untyped-defs
+	@echo "✅ Mypy type checking passed"
+
 typecheck-strict:  ## Run type checking (strict mode)
 	@echo "Running ty type checking (strict mode)..."
 	@echo "🔒 Strict mode enabled"
@@ -40,14 +45,18 @@ typecheck-strict:  ## Run type checking (strict mode)
 		exit 1; \
 	}
 
-clean:  ## Clean build artifacts
+clean:  ## Clean build artifacts and virtual environments
 	rm -rf build/
 	rm -rf dist/
 	rm -rf *.egg-info/
+	rm -rf .venv/
+	rm -rf .venv.*/
 	find . -type d -name __pycache__ -exec rm -rf {} +
 	find . -type f -name "*.pyc" -delete
+	find . -type d -name ".pytest_cache" -exec rm -rf {} +
+	find . -type d -name ".ruff_cache" -exec rm -rf {} +
 
-check-all: format-check lint typecheck  ## Run all checks (format, lint, typecheck)
+check-all: format-check lint typecheck mypy  ## Run all checks (format, lint, typecheck, mypy)
 
 # CI-friendly target
 ci-check: format-check lint  ## Run CI checks (without strict type checking)
@@ -60,3 +69,6 @@ ci-check: format-check lint  ## Run CI checks (without strict type checking)
 		exit 0; \
 	}
 	@echo "✅ CI type checking passed"
+	@echo "🔍 Running mypy type checking..."
+	@uv run mypy fastwoe/ --check-untyped-defs
+	@echo "✅ Mypy type checking passed"
@@ -1,4 +1,4 @@
-# FastWoe: Fast Weight of Evidence (WOE) encoding and inference
+# FastWoe: Fast Weight of Evidence (WOE) Encoding and Inference
 
 [![CI](https://github.com/xRiskLab/fastwoe/workflows/CI/badge.svg)](https://github.com/xRiskLab/fastwoe/actions)
 [![Compatibility](https://github.com/xRiskLab/fastwoe/workflows/Python%20Version%20Compatibility/badge.svg)](https://github.com/xRiskLab/fastwoe/actions)
 
@@ -0,0 +1,40 @@
+"""
+## Shapley Value Decomposition for Gini/Somers' D Contributions
+
+Shapley values provide a fair attribution of Gini/Somers' D contributions using **exact Shapley value computation** (enumerating all subsets). When a base score is specified, the function:
+
+- Fixes the population to where the base score is available
+- Always includes the base score in the averaged score
+- Computes Shapley values exactly by enumerating all possible subsets (2^n)
+- Shows base-only Gini/Somers' D separately and incremental effects for extras
+
+The Shapley values are computed exactly (not approximated) by evaluating all subsets, ensuring:
+
+- Order-invariant attribution
+- Values sum to the total combined Gini/Somers' D
+- All interactions are accounted for
+
+Note: For binary classification, Gini coefficient equals Somers' D (2 × AUC - 1).
+"""
+
+import numpy as np
+
+from fastwoe.screening import somersd_shapley
+
+# You have 3 credit scoring models
+np.random.seed(42)
+n = 10000
+
+# Simulate target (default = 1, no default = 0)
+y = np.random.binomial(1, 0.15, n)
+
+# Simulate scores from different models (higher score = higher risk)
+bureau_score = np.random.randn(n) + 0.5 * y  # Traditional bureau score
+alt_data_score = np.random.randn(n) + 0.3 * y  # Alternative data (phone, utility)
+behavioral_score = np.random.randn(n) + 0.4 * y  # Behavioral patterns
+
+scores = {"bureau": bureau_score, "alt_data": alt_data_score, "behavioral": behavioral_score}
+
+# Compute Shapley attribution
+result = somersd_shapley(scores, y)
+print(result)
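
The docstring above describes exact Shapley attribution by enumerating all 2^n subsets. The sketch below illustrates that idea in plain Python; it is not the `somersd_shapley` implementation. It assumes a coalition's value is the Somers' D of the simple average of its scores and that the empty coalition has value 0. `somers_d` and `exact_shapley` are hypothetical helper names, and the toy data is made up for illustration.

```python
from itertools import combinations
from math import factorial


def somers_d(y, score):
    """Somers' D (= 2 * AUC - 1) by comparing every positive/negative pair."""
    pos = [s for s, t in zip(score, y) if t == 1]
    neg = [s for s, t in zip(score, y) if t == 0]
    # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
    net = sum((p > q) - (p < q) for p in pos for q in neg)
    return net / (len(pos) * len(neg))


def exact_shapley(scores, y):
    """Exact Shapley values over all coalitions of the given scores."""
    names = list(scores)
    n = len(names)
    cache = {}

    def value(coalition):
        key = frozenset(coalition)
        if key not in cache:
            members = [k for k in names if k in key]  # canonical order
            if not members:
                cache[key] = 0.0  # assumption: empty coalition is worth 0
            else:
                avg = [
                    sum(scores[k][i] for k in members) / len(members)
                    for i in range(len(y))
                ]
                cache[key] = somers_d(y, avg)
        return cache[key]

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                contribution += weight * (value(subset + (name,)) - value(subset))
        phi[name] = contribution
    return phi


# Toy data (illustrative only): 1 = default, 0 = no default.
y = [0, 0, 1, 1, 0, 1, 0, 1]
scores = {
    "bureau": [0.5, 1.0, 2.0, 1.5, 0.0, 2.5, 1.0, 0.5],
    "alt_data": [1.0, 0.5, 1.5, 0.5, 1.0, 2.0, 0.0, 1.5],
    "behavioral": [0.0, 1.5, 1.0, 2.0, 0.5, 1.5, 1.0, 2.5],
}

phi = exact_shapley(scores, y)
combined = [sum(col) / len(scores) for col in zip(*scores.values())]
total = somers_d(y, combined)
print(phi)
print(sum(phi.values()), total)
```

The two numbers on the last line should agree up to floating-point rounding, which is the "values sum to the total" property listed in the docstring. Enumeration needs 2^n coalition evaluations, so this approach is only practical for a handful of scores.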