A Python application that analyzes personal cash flow from bank CSV exports. It achieves over 90% transaction categorization accuracy using 200+ regex patterns, and it integrates mortgage interest data so that tracked expenses reflect true costs.
- Parses CSV exports from Chase, Wells Fargo, Bank of America, and generic bank formats with automatic encoding detection
- Classifies every transaction into one of four flow types: INCOME, EXPENSE, INTERNAL_TRANSFER, or EXCLUDED (debt payments)
- Categorizes transactions into 50+ categories using layered pattern matching (regex, fuzzy matching, merchant aliases) with confidence scoring
- Calculates true net cash flow by excluding internal transfers and debt principal payments
- Integrates mortgage payment data, separating principal (wealth transfer, excluded) from interest (true expense, included)
- All processing is local — no external APIs, no cloud dependencies
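The layered categorization described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual categorizer: the alias table, regex rules, merchant list, threshold, and the `categorize` function are all hypothetical, and the standard library's `difflib` stands in for fuzzywuzzy so the example runs without extra dependencies.

```python
import re
from difflib import SequenceMatcher

# Hypothetical rules for illustration; the real ones are configured in config/config.yaml.
MERCHANT_ALIASES = {"amzn mktp": "amazon", "wm supercenter": "walmart"}
REGEX_RULES = [
    (re.compile(r"\b(amazon|walmart|target)\b", re.I), "Shopping"),
    (re.compile(r"\b(shell|chevron|exxon)\b", re.I), "Fuel"),
    (re.compile(r"\bpayroll\b|\bdirect dep\b", re.I), "Salary"),
]
KNOWN_MERCHANTS = {"starbucks": "Dining", "netflix": "Subscriptions"}
FUZZY_THRESHOLD = 0.8  # assumed cutoff below which a fuzzy match is rejected


def categorize(description: str) -> tuple[str, float]:
    """Return (category, confidence) for a raw bank description."""
    text = description.lower().strip()

    # Layer 1: rewrite known merchant aliases to their canonical names.
    for alias, canonical in MERCHANT_ALIASES.items():
        text = text.replace(alias, canonical)

    # Layer 2: an exact regex hit is treated as full confidence.
    for pattern, category in REGEX_RULES:
        if pattern.search(text):
            return category, 1.0

    # Layer 3: fuzzy-match each token against known merchant names.
    best_category, best_score = "Uncategorized", 0.0
    for token in text.split():
        for merchant, category in KNOWN_MERCHANTS.items():
            score = SequenceMatcher(None, token, merchant).ratio()
            if score > best_score:
                best_category, best_score = category, score

    if best_score >= FUZZY_THRESHOLD:
        return best_category, round(best_score, 2)
    return "Uncategorized", 0.0
```

Under these assumptions, `categorize("AMZN MKTP US*2K4")` resolves through the alias and regex layers, while a misspelled description such as `"STARBUKS #1234"` falls through to the fuzzy layer and comes back with a confidence below 1.0.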
cd cashflow_analysis
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Generate sample data and run analysis
python cashflow_analyzer.py --generate-sample
# Analyze your own bank export
python3 -m src.main data/your_bank_export.csv
# Enhanced analysis with mortgage data
python3 enhanced_analysis.py
# Interactive dashboard
python run_dashboard.py
Net Cash Flow = Income - True Expenses
Excluded from expenses:
- Internal transfers (savings, investments — money stays in your system)
- Credit card payments (already counted when originally spent)
- Mortgage principal (wealth transfer, not an operating cost)
Included as expenses:
- Mortgage interest (true operating cost)
- All other outflows that leave your financial system
This distinction matters. Without it, savings contributions look like expenses and mortgage principal distorts your expense ratio.
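A small worked example may make the rule concrete. The sketch below assumes that the whole mortgage payment is classified as EXCLUDED and that the interest portion is supplied separately from the mortgage data. The `Transaction` class, `FlowType` enum, and `net_cash_flow` function are illustrative names, not the project's actual API.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class FlowType(Enum):
    INCOME = "INCOME"
    EXPENSE = "EXPENSE"
    INTERNAL_TRANSFER = "INTERNAL_TRANSFER"
    EXCLUDED = "EXCLUDED"


@dataclass(frozen=True)
class Transaction:
    description: str
    amount: Decimal  # stored as a positive value; direction comes from flow_type
    flow_type: FlowType


def net_cash_flow(transactions, mortgage_interest=Decimal("0")):
    """Income minus true expenses; transfers and excluded items are ignored."""
    income = sum(
        (t.amount for t in transactions if t.flow_type is FlowType.INCOME),
        Decimal("0"),
    )
    expenses = sum(
        (t.amount for t in transactions if t.flow_type is FlowType.EXPENSE),
        Decimal("0"),
    )
    # Mortgage interest is added back as a true expense; principal stays excluded.
    return income - (expenses + mortgage_interest)


transactions = [
    Transaction("Payroll", Decimal("5000.00"), FlowType.INCOME),
    Transaction("Groceries", Decimal("600.00"), FlowType.EXPENSE),
    Transaction("Transfer to savings", Decimal("1000.00"), FlowType.INTERNAL_TRANSFER),
    Transaction("Credit card payment", Decimal("800.00"), FlowType.EXCLUDED),
    Transaction("Mortgage payment", Decimal("2000.00"), FlowType.EXCLUDED),
]

# Assume 1200.00 of the 2000.00 mortgage payment is interest.
result = net_cash_flow(transactions, mortgage_interest=Decimal("1200.00"))
print(result)  # 5000.00 - (600.00 + 1200.00) = 3200.00
```

Treating every outflow as an expense would instead give 5000.00 - 4400.00 = 600.00, which understates the cash actually retained by 2600.00 (the savings transfer, the card payment, and the 800.00 of mortgage principal).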
src/
├── core/ # Transaction models, categorization constants, exceptions
├── data/ # CSV loader (multi-bank), mortgage loader, balance validator
├── categorization/ # Flow classifier (4-tier priority) and regex categorizer
├── analysis/ # Core cash flow metrics and enhanced mortgage integration
├── visualization/ # Dash dashboard
└── utils/ # Sample data generator
config/config.yaml # Categorization rules, confidence thresholds, merchant aliases
tests/ # Unit tests for flow classification and cash flow calculations
python -m pytest tests/ -v
Tests cover flow classification logic (ensuring transactions are assigned the correct INCOME/EXPENSE/INTERNAL_TRANSFER/EXCLUDED type) and net cash flow calculations (verifying that transfers and debt payments are properly excluded).
Python 3.13, pandas, NumPy, Plotly/Dash, PyYAML, fuzzywuzzy. Uses Decimal arithmetic throughout for financial precision.
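To illustrate why Decimal matters for money, the short sketch below compares float and Decimal addition and rounds a monthly interest figure to cents. The `to_cents` helper, the half-up rounding mode, and the loan figures are assumptions made for this example, not details taken from the project.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.10 + 0.20
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True


def to_cents(value: Decimal) -> Decimal:
    """Round to two decimal places (rounding mode is an assumed choice)."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# One month of interest on a hypothetical 250,000 balance at 6.5% APR.
monthly_interest = to_cents(Decimal("250000") * Decimal("0.065") / Decimal("12"))
print(monthly_interest)  # 1354.17
```

Constructing Decimals from strings rather than floats avoids carrying the float's representation error into the Decimal value.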