
Commit 2ca13d1

Release v0.1.6
1 parent 4db9169 commit 2ca13d1

22 files changed

Lines changed: 1690 additions & 1725 deletions

.cursor/rules/python/RULE.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+---
+title: Python Code Style and Formatting Rules
+description: Rules for writing Python code with proper formatting and output standards
+globs: **/*.py
+alwaysApply: true
+---
+
+## Code Quality and Formatting Tools
+
+### Code Review with Sourcery
+- Use Sourcery for automated code review and fixes:
+  ```bash
+  # Review and fix a specific file
+  sourcery review --fix path/to/file.py
+
+  # Review and fix all Python files in a directory
+  sourcery review --fix chapters/feature_flags/local/scripts/
+
+  # Review without applying fixes (dry run)
+  sourcery review path/to/file.py
+  ```
+- Sourcery provides suggestions for:
+  - Code simplification and refactoring
+  - Removing unnecessary else clauses
+  - Simplifying conditionals
+  - Improving variable naming
+  - Reducing complexity
+- Run Sourcery before committing code to catch common issues early.
+- Sourcery suggestions should be reviewed and applied when they improve code readability and maintainability.
+
+## Code Output and Formatting
+
+### Prohibited Patterns
+
+- **DO NOT USE decorative separator lines:**
+  - `print("=" * 80)`
+  - `print("-" * 50)`
+  - Any decorative print statements using repeated characters
+- **DO NOT USE empty print statements for spacing:**
+  - `print()` used only for adding blank lines in output
+- **DO NOT USE bullet point summary statements:**
+  - `print(" • AppConfig configuration was updated...")`
+  - `print(" - Key finding: ...")`
+  - `print(" * Summary: ...")`
+  - Any print statements with bullet points or indented summary text
+- **DO NOT USE emojis in output (Python scripts only):**
+  - `print("✅ Success")`
+  - `print("❌ Error")`
+  - `print("📊 Step 1: ...")`
+  - Any emoji characters in print statements
+  - **Exception**: Emojis and colors are OK in Makefiles
+- **DO NOT USE leading spaces in print statements:**
+  - `print(" Text with leading spaces")`
+  - `print(f" Testing etc")`
+  - Use plain text without leading spaces for indentation
+
+### Recommended Patterns
+
+- **DO USE:**
+  - Direct, informative print statements without decorative elements
+  - Concise output that focuses on actionable information
+  - No extra spacing or formatting beyond what's necessary
+  - Plain text status indicators (e.g., "Success:", "Error:", "Step 1:")
+- **Makefiles**: Emojis and colors are acceptable, but avoid extra spacing
+
+### Rationale
+
+Cleaner, more professional output that's easier to parse programmatically with less visual clutter.
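
Reviewer note: the output rules in this file are easier to follow with a concrete case. The sketch below shows one way a script could emit status lines that satisfy them (plain-text labels, no separators, no emojis, no leading spaces). The helper name `format_status` is hypothetical and does not appear anywhere in this commit.

```python
def format_status(label: str, message: str) -> str:
    """Return a plain-text status line such as 'Success: 3 files processed'."""
    # Strip surrounding whitespace so the line never starts with spaces.
    return f"{label.strip()}: {message.strip()}"


def main() -> None:
    processed = 3  # illustrative value only
    # One informative line per event; no print() spacers or "=" * 80 rules.
    print(format_status("Step 1", "loading configuration"))
    print(format_status("Success", f"{processed} files processed"))


if __name__ == "__main__":
    main()
```

Keeping the label and message in a fixed `label: message` shape is also what makes the output easy to parse programmatically, which is the rationale the rule file gives.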

.github/workflows/ci.yml

Lines changed: 10 additions & 4 deletions
@@ -43,6 +43,14 @@ jobs:
         run: |
           uv run python -m pytest tests/ -v --tb=short -m "not compatibility"

+      - name: Cleanup test artifacts
+        if: always()
+        run: |
+          rm -rf .pytest_cache/
+          rm -rf .venv/
+          rm -rf __pycache__/
+          find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true
+
       # Coverage reporting disabled until pytest-cov is added to dependencies
       # - name: Upload coverage to Codecov
       # if: matrix.python-version == '3.11'
@@ -98,8 +106,6 @@ jobs:
         run: |
           uv sync --dev

-      - name: Run mypy (if configured)
+      - name: Run mypy
         run: |
-          # Add mypy configuration and uncomment when ready
-          # uv run mypy fastwoe/
-          echo "Type checking skipped - configure mypy when ready"
+          uv run mypy fastwoe/ --check-untyped-defs

.github/workflows/compatibility.yml

Lines changed: 8 additions & 0 deletions
@@ -59,6 +59,14 @@ jobs:
           uv run python -c "import fastwoe; print(f'fastwoe version: {fastwoe.__version__}')"
           uv run python -m pytest tests/test_compatibility.py -v --tb=short

+      - name: Cleanup test artifacts
+        if: always()
+        run: |
+          rm -rf .pytest_cache/
+          rm -rf .venv/
+          rm -rf __pycache__/
+          find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true
+
       - name: Upload test results
         uses: actions/upload-artifact@v4
         if: always()

.pre-commit-config.yaml

Lines changed: 8 additions & 0 deletions
@@ -20,3 +20,11 @@ repos:
       - id: ruff-check
         args: [--fix]
       - id: ruff-format
+
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.11.2
+    hooks:
+      - id: mypy
+        additional_dependencies: [pandas-stubs]
+        args: [--check-untyped-defs]
+        files: ^fastwoe/

CHANGELOG.md

Lines changed: 32 additions & 0 deletions
@@ -1,5 +1,37 @@
 # Changelog

+## Version 0.1.6 (2026-01-07)
+
+**Stable Release: Type Safety, Robustness & Code Quality** 🎯
+
+### ✨ Improvements
+- **Complete Type Safety**: All type checking passes with both `ty` and `mypy` type checkers
+  - Full type annotations throughout the codebase
+  - Proper type narrowing for `Optional` and `Union` types
+  - Handled complex pandas/numpy/faiss type scenarios
+- **Robust Numba Import**: Added fallback mechanism for numba/llvmlite compatibility issues
+  - Graceful degradation when numba fails to import (common in some Python 3.12 environments)
+  - Code continues to work without JIT compilation (with performance trade-off)
+  - Clear warning messages when fallback is used
+- **Enhanced Compatibility Testing**: Improved test robustness for cross-version compatibility
+  - Better error handling for known environment-specific issues
+  - More reliable test execution across Python and scikit-learn versions
+
+### 🔧 Technical Improvements
+- Added comprehensive type hints for better IDE support and static analysis
+- Improved error handling for edge cases (numba/llvmlite, matplotlib optional dependencies)
+- Enhanced code quality with strict type checking (no shortcuts or lenient mode)
+- Fixed all type-related errors across the codebase
+
+### 📦 Dependencies
+- No changes to core dependencies
+- Enhanced type checking support with `pandas-stubs`
+
+### 🐛 Bug Fixes
+- Fixed `UnboundLocalError` in `interpret_fastwoe.py` for `sample_series` variable
+- Fixed `AttributeError` for matplotlib `Axes` type hints when matplotlib is optional
+- Fixed compatibility test failures related to numba/llvmlite memory issues
+
 ## Version 0.1.6a3 (2025-12-14)

 **Alpha Release: CAP Curves, Styled Display, MSD Feature Selection & Enhanced Metrics** 📊
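
Reviewer note: the "Robust Numba Import" entry describes a fallback but this page does not show the code behind it. A minimal sketch of the general pattern follows, assuming the fallback is a no-op stand-in for `numba.njit`. The names `NUMBA_AVAILABLE` and `count_positive` are hypothetical, and this is not taken from the fastwoe source.

```python
import warnings

try:
    from numba import njit  # real JIT decorator when numba imports cleanly
    NUMBA_AVAILABLE = True
except Exception as exc:  # ImportError, or llvmlite failures raised at import time
    NUMBA_AVAILABLE = False
    warnings.warn(
        f"numba could not be imported ({exc!r}); running without JIT compilation",
        RuntimeWarning,
    )

    def njit(*args, **kwargs):
        """No-op replacement that supports both @njit and @njit(...)."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as a bare decorator: @njit

        def decorator(func):
            return func  # used with options: @njit(cache=True)

        return decorator


@njit
def count_positive(values):
    """Count strictly positive entries; compiled only if numba is available."""
    total = 0
    for v in values:
        if v > 0:
            total += 1
    return total


print(count_positive((1.0, -2.0, 3.0)))
```

Catching `Exception` rather than only `ImportError` is a deliberate choice in this sketch, since the changelog mentions llvmlite compatibility problems that may surface as other error types.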

Makefile

Lines changed: 14 additions & 2 deletions
@@ -32,6 +32,11 @@ typecheck: ## Run type checking (lenient mode)
 	}
 	@echo "✅ Type checking passed"

+mypy: ## Run mypy type checking
+	@echo "Running mypy type checking..."
+	@uv run mypy fastwoe/ --check-untyped-defs
+	@echo "✅ Mypy type checking passed"
+
 typecheck-strict: ## Run type checking (strict mode)
 	@echo "Running ty type checking (strict mode)..."
 	@echo "🔒 Strict mode enabled"
@@ -40,14 +45,18 @@ typecheck-strict: ## Run type checking (strict mode)
 		exit 1; \
 	}

-clean: ## Clean build artifacts
+clean: ## Clean build artifacts and virtual environments
 	rm -rf build/
 	rm -rf dist/
 	rm -rf *.egg-info/
+	rm -rf .venv/
+	rm -rf .venv.*/
 	find . -type d -name __pycache__ -exec rm -rf {} +
 	find . -type f -name "*.pyc" -delete
+	find . -type d -name ".pytest_cache" -exec rm -rf {} +
+	find . -type d -name ".ruff_cache" -exec rm -rf {} +

-check-all: format-check lint typecheck ## Run all checks (format, lint, typecheck)
+check-all: format-check lint typecheck mypy ## Run all checks (format, lint, typecheck, mypy)

 # CI-friendly target
 ci-check: format-check lint ## Run CI checks (without strict type checking)
@@ -60,3 +69,6 @@ ci-check: format-check lint ## Run CI checks (without strict type checking)
 		exit 0; \
 	}
 	@echo "✅ CI type checking passed"
+	@echo "🔍 Running mypy type checking..."
+	@uv run mypy fastwoe/ --check-untyped-defs
+	@echo "✅ Mypy type checking passed"

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# FastWoe: Fast Weight of Evidence (WOE) encoding and inference
+# FastWoe: Fast Weight of Evidence (WOE) Encoding and Inference

 [![CI](https://github.com/xRiskLab/fastwoe/workflows/CI/badge.svg)](https://github.com/xRiskLab/fastwoe/actions)
 [![Compatibility](https://github.com/xRiskLab/fastwoe/workflows/Python%20Version%20Compatibility/badge.svg)](https://github.com/xRiskLab/fastwoe/actions)

examples/fastwoe_shapley.py

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+"""
+## Shapley Value Decomposition for Gini/Somers' D Contributions
+
+Shapley values provide a fair attribution of Gini/Somers' D contributions using **exact Shapley value computation** (enumerating all subsets). When a base score is specified, the function:
+
+- Fixes the population to where the base score is available
+- Always includes the base score in the averaged score
+- Computes Shapley values exactly by enumerating all possible subsets (2^n)
+- Shows base-only Gini/Somers' D separately and incremental effects for extras
+
+The Shapley values are computed exactly (not approximated) by evaluating all subsets, ensuring:
+
+- Order-invariant attribution
+- Values sum to the total combined Gini/Somers' D
+- All interactions are accounted for
+
+Note: For binary classification, Gini coefficient equals Somers' D (2 × AUC - 1).
+"""
+
+import numpy as np
+
+from fastwoe.screening import somersd_shapley
+
+# You have 3 credit scoring models
+np.random.seed(42)
+n = 10000
+
+# Simulate target (default = 1, no default = 0)
+y = np.random.binomial(1, 0.15, n)
+
+# Simulate scores from different models (higher score = higher risk)
+bureau_score = np.random.randn(n) + 0.5 * y  # Traditional bureau score
+alt_data_score = np.random.randn(n) + 0.3 * y  # Alternative data (phone, utility)
+behavioral_score = np.random.randn(n) + 0.4 * y  # Behavioral patterns
+
+scores = {"bureau": bureau_score, "alt_data": alt_data_score, "behavioral": behavioral_score}
+
+# Compute Shapley attribution
+result = somersd_shapley(scores, y)
+print(result)
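
Reviewer note: the docstring says the attribution enumerates all 2^n subsets and that the values sum to the combined Somers' D. The following dependency-free sketch illustrates that idea under simplifying assumptions: the coalition value is the Somers' D of the plain average of the selected scores, the empty coalition is worth 0, and no base score is handled. It is not the implementation of `somersd_shapley`, and the function names `exact_shapley`, `somers_d`, and `coalition_value` are hypothetical.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, value):
    """Exact Shapley values for `value`, a function from frozenset to float."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def somers_d(score, y):
    """Somers' D (equal to 2 * AUC - 1) by direct pair counting."""
    pos = [s for s, t in zip(score, y) if t == 1]
    neg = [s for s, t in zip(score, y) if t == 0]
    concordant = sum(1 for p in pos for q in neg if p > q)
    discordant = sum(1 for p in pos for q in neg if p < q)
    return (concordant - discordant) / (len(pos) * len(neg))


def coalition_value(scores, y):
    """Build v(S): Somers' D of the averaged scores in S (assumed 0 for empty S)."""
    def value(subset):
        if not subset:
            return 0.0
        names = sorted(subset)  # fixed order keeps the averaging deterministic
        avg = [sum(scores[k][i] for k in names) / len(names) for i in range(len(y))]
        return somers_d(avg, y)
    return value


# Tiny made-up data set, for illustration only.
y = [0, 0, 1, 1, 0, 1]
scores = {
    "a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.7],
    "b": [0.3, 0.2, 0.6, 0.5, 0.4, 0.9],
    "c": [0.5, 0.1, 0.2, 0.9, 0.3, 0.4],
}
value = coalition_value(scores, y)
phi = exact_shapley(list(scores), value)
total = value(frozenset(scores))
print(phi, total)
```

Because every subset is evaluated, the attributions add up to the Somers' D of the full combination, which is the "values sum to the total" property listed in the docstring. The cost grows as 2^n, so this only suits a small number of scores.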
