| 1 | +# Copilot Instructions for mssql-python |
| 2 | + |
| 3 | +## Repository Overview |
| 4 | + |
| 5 | +**mssql-python** is a Python driver for Microsoft SQL Server and Azure SQL databases that leverages Direct Database Connectivity (DDBC). It's built using **pybind11** and **CMake** to create native extensions, providing DB API 2.0 compliant database access with enhanced Pythonic features. |
| 6 | + |
| 7 | +- **Size**: Medium-scale project (~750KB total) |
| 8 | +- **Languages**: Python (main), C++ (native bindings), CMake (build system) |
| 9 | +- **Target Platforms**: Windows (x64, ARM64), macOS (Universal2), Linux (x86_64, ARM64) |
| 10 | +- **Python Versions**: 3.10+ |
| 11 | +- **Key Dependencies**: pybind11, azure-identity, Microsoft ODBC Driver 18 |
| 12 | + |
| 13 | +## Development Workflows |
| 14 | + |
| 15 | +This repository includes detailed prompt files for common tasks. Reference each one by typing `#` followed by its name:
| 16 | + |
| 17 | +| Task | Prompt | When to Use | |
| 18 | +|------|--------|-------------| |
| 19 | +| First-time setup | `#setup-dev-env` | New machine, fresh clone | |
| 20 | +| Build C++ extension | `#build-ddbc` | After modifying .cpp/.h files | |
| 21 | +| Run tests | `#run-tests` | Validating changes | |
| 22 | +| Create PR | `#create-pr` | Ready to submit changes | |
| 23 | + |
| 24 | +**Workflow order for new contributors:** |
| 25 | +1. `#setup-dev-env` — Set up venv and dependencies |
| 26 | +2. `#build-ddbc` — Build native extension |
| 27 | +3. Make your changes |
| 28 | +4. `#run-tests` — Validate |
| 29 | +5. `#create-pr` — Submit |
| 30 | + |
| 31 | +## Build System and Validation |
| 32 | + |
| 33 | +### Prerequisites |
| 34 | +**Always install these before building:** |
| 35 | +```bash |
| 36 | +# All platforms |
| 37 | +pip install -r requirements.txt |
| 38 | + |
| 39 | +# Windows: Requires Visual Studio Build Tools with "Desktop development with C++" workload |
| 40 | +# macOS: brew install cmake && brew install msodbcsql18 |
| 41 | +# Linux: Install cmake, python3-dev, and ODBC driver per distribution |
| 42 | +``` |
| 43 | + |
| 44 | +### Building the Project |
| 45 | + |
| 46 | +**CRITICAL**: The project requires building native extensions before testing. Extensions are platform-specific (`.pyd` on Windows, `.so` on macOS/Linux). |
| 47 | + |
| 48 | +#### Windows Build: |
| 49 | +```bat
| 50 | +cd mssql_python/pybind |
| 51 | +build.bat [x64|x86|arm64] # Defaults to x64 if not specified |
| 52 | +``` |
| 53 | + |
| 54 | +#### macOS Build: |
| 55 | +```bash |
| 56 | +cd mssql_python/pybind |
| 57 | +./build.sh # Creates universal2 binary (ARM64 + x86_64) |
| 58 | +``` |
| 59 | + |
| 60 | +#### Linux Build: |
| 61 | +```bash |
| 62 | +cd mssql_python/pybind |
| 63 | +./build.sh # Detects architecture automatically |
| 64 | +``` |
| 65 | + |
| 66 | +**Build Output**: Creates `ddbc_bindings.cp{python_version}-{architecture}.{so|pyd}` in the `mssql_python/` directory. |
| 67 | + |
| 68 | +### Testing |
| 69 | + |
| 70 | +**IMPORTANT**: Tests require a SQL Server connection via `DB_CONNECTION_STRING` environment variable. |
| 71 | + |
| 72 | +```bash |
| 73 | +# Run all tests with coverage |
| 74 | +python -m pytest -v --cov=. --cov-report=xml --capture=tee-sys --cache-clear |
| 75 | + |
| 76 | +# Run specific test files |
| 77 | +python -m pytest tests/test_000_dependencies.py -v # Dependency checks |
| 78 | +python -m pytest tests/test_001_globals.py -v # Basic functionality |
| 79 | +``` |
| 80 | + |
| 81 | +**Test Dependencies**: Tests require building the native extension first. The dependency test (`test_000_dependencies.py`) validates that all platform-specific libraries exist. |
| 82 | + |
| 83 | +### Linting and Code Quality |
| 84 | + |
| 85 | +```bash |
| 86 | +# Python formatting check (drop --check to reformat files in place)
| 87 | +black --check --line-length=100 mssql_python/ tests/ |
| 88 | + |
| 89 | +# C++ formatting (-i rewrites the files in place)
| 90 | +clang-format -style=file -i mssql_python/pybind/*.cpp mssql_python/pybind/*.h |
| 91 | + |
| 92 | +# Coverage reporting (configured in .coveragerc) |
| 93 | +python -m pytest --cov=. --cov-report=html |
| 94 | +``` |
| 95 | + |
| 96 | +## Project Architecture |
| 97 | + |
| 98 | +### Core Components |
| 99 | + |
| 100 | +``` |
| 101 | +mssql_python/ |
| 102 | +├── __init__.py # Package initialization, connection registry, cleanup |
| 103 | +├── connection.py # DB API 2.0 connection object |
| 104 | +├── cursor.py # DB API 2.0 cursor object |
| 105 | +├── db_connection.py # connect() function implementation |
| 106 | +├── auth.py # Microsoft Entra ID authentication |
| 107 | +├── pooling.py # Connection pooling implementation |
| 108 | +├── logging.py # Logging configuration |
| 109 | +├── exceptions.py # Exception hierarchy |
| 110 | +├── connection_string_builder.py # Connection string construction |
| 111 | +├── connection_string_parser.py # Connection string parsing |
| 112 | +├── parameter_helper.py # Query parameter handling |
| 113 | +├── row.py # Row object implementation |
| 114 | +├── type.py # DB API 2.0 type objects |
| 115 | +├── constants.py # ODBC constants |
| 116 | +├── helpers.py # Utility functions and settings |
| 117 | +├── ddbc_bindings.py # Platform-specific extension loader with architecture detection |
| 118 | +├── mssql_python.pyi # Type stubs for IDE support |
| 119 | +├── py.typed # PEP 561 type marker |
| 120 | +└── pybind/ # Native extension source |
| 121 | + ├── ddbc_bindings.cpp # Main C++ binding code |
| 122 | + ├── ddbc_bindings.h # Header for bindings |
| 123 | + ├── CMakeLists.txt # Cross-platform build configuration |
| 124 | + ├── build.sh/.bat # Platform-specific build scripts |
| 125 | + ├── configure_dylibs.sh # macOS dylib configuration |
| 126 | + ├── logger_bridge.cpp/.hpp # Python logging bridge |
| 127 | + ├── unix_utils.cpp/.h # Unix platform utilities |
| 128 | + └── connection/ # Connection management |
| 129 | + ├── connection.cpp/.h # Connection implementation |
| 130 | + └── connection_pool.cpp/.h # Connection pooling |
| 131 | +``` |
| 132 | + |
| 133 | +### Platform-Specific Libraries |
| 134 | + |
| 135 | +``` |
| 136 | +mssql_python/libs/ |
| 137 | +├── windows/{x64,x86,arm64}/ # Windows ODBC drivers and dependencies |
| 138 | +├── macos/{arm64,x86_64}/lib/ # macOS dylibs |
| 139 | +└── linux/{debian_ubuntu,rhel,suse,alpine}/{x86_64,arm64}/lib/ # Linux distributions |
| 140 | +``` |
| 141 | + |
| 142 | +### Configuration Files |
| 143 | + |
| 144 | +- **`.clang-format`**: C++ formatting (LLVM style, 100 column limit) |
| 145 | +- **`.coveragerc`**: Coverage configuration |
| 146 | +- **`requirements.txt`**: Development dependencies |
| 147 | +- **`setup.py`**: Package configuration with platform detection |
| 148 | +- **`pyproject.toml`**: Modern Python packaging configuration |
| 149 | +- **`.gitignore`**: Excludes build artifacts (`*.so`, `*.pyd`, `build/`, `__pycache__`)
| 150 | + |
| 151 | +## CI/CD Pipeline Details |
| 152 | + |
| 153 | +### GitHub Workflows |
| 154 | +- **`devskim.yml`**: Security scanning (runs on PRs and main) |
| 155 | +- **`pr-format-check.yml`**: PR validation (title format, GitHub issue/ADO work item links) |
| 156 | +- **`lint-check.yml`**: Python (Black) and C++ (clang-format) linting |
| 157 | +- **`pr-code-coverage.yml`**: Code coverage reporting |
| 158 | +- **`forked-pr-coverage.yml`**: Coverage for forked PRs |
| 159 | + |
| 160 | +### Azure DevOps Pipelines (`eng/pipelines/`) |
| 161 | +- **`pr-validation-pipeline.yml`**: Comprehensive testing across all platforms |
| 162 | +- **`build-whl-pipeline.yml`**: Wheel building for distribution |
| 163 | +- **Platform Coverage**: Windows (LocalDB), macOS (Docker SQL Server), Linux (Ubuntu, Debian, RHEL, Alpine) with both x86_64 and ARM64 |
| 164 | + |
| 165 | +### Build Matrix |
| 166 | +The CI system tests: |
| 167 | +- **Python versions**: 3.10, 3.11, 3.12, 3.13 |
| 168 | +- **Windows**: x64, ARM64 architectures |
| 169 | +- **macOS**: Universal2 (ARM64 + x86_64) |
| 170 | +- **Linux**: Multiple distributions (Debian, Ubuntu, RHEL, Alpine) on x86_64 and ARM64 |
| 171 | + |
| 172 | +## Common Build Issues and Workarounds |
| 173 | + |
| 174 | +### macOS-Specific Issues |
| 175 | +- **dylib path configuration**: Run `configure_dylibs.sh` after building to fix library paths |
| 176 | +- **codesigning**: Script automatically codesigns libraries for compatibility |
| 177 | + |
| 178 | +### Linux Distribution Differences |
| 179 | +- **Debian/Ubuntu**: Use `apt-get install python3-dev cmake pybind11-dev` |
| 180 | +- **RHEL**: Requires enabling CodeReady Builder repository for development tools |
| 181 | +- **Alpine**: Uses musl libc, requires special handling in build scripts |
| 182 | + |
| 183 | +### Windows Build Dependencies |
| 184 | +- **Visual Studio Build Tools**: Must include "Desktop development with C++" workload |
| 185 | +- **Architecture Detection**: Build scripts auto-detect target architecture from environment |
| 186 | + |
| 187 | +### Known Limitations (from TODOs) |
| 188 | +- Linux RPATH configuration pending for driver .so files |
| 189 | +- Some Unicode support gaps in executemany operations |
| 190 | +- Platform-specific test dependencies in exception handling |
| 191 | + |
| 192 | +## Architecture Detection and Loading |
| 193 | + |
| 194 | +The `ddbc_bindings.py` module handles architecture detection: |
| 195 | +- **Windows**: Normalizes `win64/amd64/x64` → `x64`, `win32/x86` → `x86`, `arm64` → `arm64` |
| 196 | +- **macOS**: Runtime architecture detection, always loads from universal2 binary |
| 197 | +- **Linux**: Maps `x64/amd64` → `x86_64`, `arm64/aarch64` → `arm64` |
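The alias mappings listed above can be sketched as a small lookup function. This is only an illustration: `normalize_arch` and the two alias tables are hypothetical names, the pass-through for unlisted values is an assumption, and the actual logic in `ddbc_bindings.py` may be organized differently.

```python
# Hypothetical helper for illustration; the real logic lives in
# mssql_python/ddbc_bindings.py and may differ in names and edge cases.
_WINDOWS_ALIASES = {
    "win64": "x64", "amd64": "x64", "x64": "x64",
    "win32": "x86", "x86": "x86",
    "arm64": "arm64",
}
_LINUX_ALIASES = {
    "x64": "x86_64", "amd64": "x86_64",
    "arm64": "arm64", "aarch64": "arm64",
}


def normalize_arch(system: str, machine: str) -> str:
    """Map a platform name and raw machine string to a canonical architecture."""
    system, machine = system.lower(), machine.lower()
    if system == "darwin":
        # macOS always loads from the universal2 binary.
        return "universal2"
    if system == "windows":
        # Assumption: names not in the table pass through unchanged.
        return _WINDOWS_ALIASES.get(machine, machine)
    if system == "linux":
        # Assumption: names not in the table (e.g. "x86_64") pass through.
        return _LINUX_ALIASES.get(machine, machine)
    raise ValueError(f"unsupported platform: {system}")
```

A caller would typically pass `platform.system()` and `platform.machine()` from the standard library as the two arguments.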
| 198 | + |
| 199 | +## Exception Hierarchy |
| 200 | + |
| 201 | +This hierarchy is critical for error handling guidance; it mirrors the standard DB API 2.0 (PEP 249) exception tree:
| 202 | + |
| 203 | +``` |
| 204 | +Exception (base) |
| 205 | +├── Warning |
| 206 | +└── Error |
| 207 | + ├── InterfaceError # Driver/interface issues |
| 208 | + └── DatabaseError |
| 209 | + ├── DataError # Invalid data processing |
| 210 | + ├── OperationalError # Connection/timeout issues |
| 211 | + ├── IntegrityError # Constraint violations |
| 212 | + ├── InternalError # Internal driver/database errors |
| 213 | + ├── ProgrammingError # SQL syntax errors |
| 214 | + └── NotSupportedError # Unsupported features/operations |
| 215 | +``` |
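To show why the tree matters when ordering `except` clauses, here is a minimal sketch that redefines the same class names locally. These stand-in classes are assumptions made for a self-contained example; they are not the contents of `exceptions.py`, and `classify` is a hypothetical helper.

```python
# Stand-in classes mirroring the tree above; illustration only.
class Warning(Exception):  # shadows the builtin, as PEP 249 names it
    pass

class Error(Exception):
    pass

class InterfaceError(Error):
    pass

class DatabaseError(Error):
    pass

class DataError(DatabaseError):
    pass

class OperationalError(DatabaseError):
    pass

class IntegrityError(DatabaseError):
    pass

class InternalError(DatabaseError):
    pass

class ProgrammingError(DatabaseError):
    pass

class NotSupportedError(DatabaseError):
    pass


def classify(exc: Exception) -> str:
    """Order handlers from most specific to most general."""
    try:
        raise exc
    except IntegrityError:
        return "constraint violation"
    except OperationalError:
        return "connection or timeout problem"
    except DatabaseError:
        # Catches the remaining DatabaseError subclasses.
        return "other database error"
    except Error:
        return "driver or interface error"
```

Placing `DatabaseError` before `IntegrityError` would make the more specific handler unreachable, which is why the general classes go last.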
| 216 | + |
| 217 | +## Critical Anti-Patterns (DO NOT) |
| 218 | + |
| 219 | +- **NEVER** hardcode connection strings - always use `DB_CONNECTION_STRING` env var for tests |
| 220 | +- **NEVER** use `pyodbc` imports - this driver doesn't require external ODBC |
| 221 | +- **NEVER** modify files in `mssql_python/libs/` - these are pre-built binaries |
| 222 | +- **NEVER** skip `conn.commit()` after INSERT/UPDATE/DELETE operations |
| 223 | +- **NEVER** use bare `except:` blocks - always catch specific exceptions |
| 224 | +- **NEVER** leave connections open - use context managers or explicit `close()` |
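Several of these rules can be combined in one short pattern. The sketch below is hedged: `get_connection_string` and `insert_name` are hypothetical helpers, the `people` table is made up, and the `?` parameter marker is an assumption about the driver's parameter style. The `connect` argument is any DB API 2.0 `connect` callable; in this project it would presumably be the driver's own `connect()` function.

```python
import os
from contextlib import closing


def get_connection_string() -> str:
    """Read the connection string from the environment instead of hardcoding it."""
    conn_str = os.environ.get("DB_CONNECTION_STRING")
    if not conn_str:
        raise RuntimeError("DB_CONNECTION_STRING is not set")
    return conn_str


def insert_name(connect, conn_str: str, name: str) -> None:
    """Insert one row, commit, and always close the cursor and connection."""
    # closing() only needs a close() method, which DB API 2.0 guarantees
    # for both connections and cursors.
    with closing(connect(conn_str)) as conn:
        with closing(conn.cursor()) as cursor:
            # Hypothetical table; "?" markers are an assumption about paramstyle.
            cursor.execute("INSERT INTO people (name) VALUES (?)", (name,))
        # Without this commit the INSERT may be rolled back on close.
        conn.commit()
```

Passing `connect` in as a parameter keeps the sketch independent of any particular driver import, so the same function can be exercised against a stand-in database in tests.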
| 225 | + |
| 226 | +## When Modifying Code |
| 227 | + |
| 228 | +### Python Changes |
| 229 | +- Preserve existing error handling patterns from `exceptions.py` |
| 230 | +- Use context managers (`with`) for all connection/cursor operations |
| 231 | +- Update `__all__` exports if adding public API |
| 232 | +- Add corresponding test in `tests/test_*.py` |
| 233 | +- Follow Black formatting (line length 100) |
| 234 | + |
| 235 | +### C++ Changes |
| 236 | +- Follow RAII patterns for resource management |
| 237 | +- Use `py::gil_scoped_release` for blocking ODBC operations |
| 238 | +- Update `mssql_python.pyi` type stubs if changing Python API |
| 239 | +- Follow `.clang-format` style (LLVM style, 100 column limit) |
| 240 | + |
| 241 | +## Debugging Quick Reference |
| 242 | + |
| 243 | +| Error | Cause | Solution | |
| 244 | +|-------|-------|----------| |
| 245 | +| `ImportError: ddbc_bindings` | Extension not built | Run `#build-ddbc` | |
| 246 | +| Connection timeout | Missing or incorrect env var | Set `DB_CONNECTION_STRING` |
| 247 | +| `dylib not found` (macOS) | Library paths | Run `configure_dylibs.sh` | |
| 248 | +| `ODBC Driver not found` | Missing driver | Install Microsoft ODBC Driver 18 | |
| 249 | +| `ModuleNotFoundError` | Not in venv | Run `#setup-dev-env` | |
| 250 | + |
| 251 | +## Contributing Guidelines |
| 252 | + |
| 253 | +### PR Requirements |
| 254 | +- **Title Format**: Must start with `FEAT:`, `CHORE:`, `FIX:`, `DOC:`, `STYLE:`, `REFACTOR:`, or `RELEASE:` |
| 255 | +- **Issue Linking**: Must link to either GitHub issue or ADO work item |
| 256 | +- **Summary**: Minimum 10 characters of meaningful content under "### Summary" |
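The title and summary rules above can be checked locally before opening a PR. This is a rough sketch, not the logic of `pr-format-check.yml`: the function names are hypothetical, and case-sensitive prefix matching and the way the summary section is delimited are assumptions.

```python
import re

# Prefixes taken from the PR requirements above.
PR_TITLE_PREFIXES = ("FEAT", "CHORE", "FIX", "DOC", "STYLE", "REFACTOR", "RELEASE")
_TITLE_RE = re.compile(r"^(?:%s):" % "|".join(PR_TITLE_PREFIXES))

# Text after "### Summary" up to the next Markdown heading or end of body.
_SUMMARY_RE = re.compile(
    r"^### Summary[ \t]*\n(.*?)(?=^#{1,6} |\Z)",
    flags=re.MULTILINE | re.DOTALL,
)


def has_valid_prefix(title: str) -> bool:
    """Return True if the title starts with an allowed prefix and a colon."""
    # Assumption: matching is case-sensitive.
    return _TITLE_RE.match(title) is not None


def summary_is_long_enough(body: str, minimum: int = 10) -> bool:
    """Return True if the Summary section has at least `minimum` characters."""
    match = _SUMMARY_RE.search(body)
    if match is None:
        return False
    return len(match.group(1).strip()) >= minimum
```

For example, a title such as `FIX: close cursor on error` passes the prefix check, while `fix: close cursor on error` does not under the case-sensitivity assumption.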
| 257 | + |
| 258 | +### Development Workflow |
| 259 | +1. **Always build native extensions first** before running tests |
| 260 | +2. **Use virtual environments** for dependency isolation |
| 261 | +3. **Test on target platform** before submitting PRs |
| 262 | +4. **Check CI pipeline results** for cross-platform compatibility |
| 263 | + |