Contributing to DataLineagePy

Thank you for your interest in contributing to DataLineagePy! This project aims to make data lineage tracking accessible to everyone.

🚀 Quick Start

Development Setup

Fork the repository

```
# Fork on GitHub, then clone your fork
git clone https://github.com/yourusername/DataLineagePy.git
cd DataLineagePy
```

Set up development environment

```
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"
```

Run tests to verify setup
```
pytest tests/ -v
```

🛠️ Making Changes

Before You Start

- Check existing issues and pull requests
- For new features, please open an issue first to discuss the approach

Development Workflow

Create a feature branch

```
git checkout -b feature/your-feature-name
```

Make your changes
- Write clean, readable code
- Follow existing code style and patterns
- Add tests for new functionality
- Update documentation as needed

Test your changes

```
# Run tests
pytest tests/

# Check code formatting (if black is installed)
black --check .

# Run a specific test file
pytest tests/test_basic_functionality.py -v
```

Commit your changes

```
git add .
git commit -m "feat: add your feature description"
```

Push and create pull request

```
git push origin feature/your-feature-name
# Then create PR on GitHub
```

📝 Code Guidelines

Code Style

- Follow the PEP 8 Python style guide
- Use descriptive variable and function names
- Add type hints where appropriate
- Write docstrings for public functions and classes
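
As a rough illustration of the points above, here is a small sketch of an annotated, documented function. The name `count_rows_by_source` and the record shape are hypothetical and not part of DataLineagePy's API, and the docstring layout is just one common convention, so match whatever the surrounding module already uses.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def count_rows_by_source(records: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count how many records came from each source.

    Hypothetical helper for illustration only; not part of DataLineagePy.

    Args:
        records: Mappings that each contain a "source" key.

    Returns:
        A dict mapping each source name to its record count.

    Raises:
        KeyError: If a record has no "source" key.
    """
    # Counter tallies the source names; convert back to a plain dict
    # so callers get the return type the signature promises.
    return dict(Counter(record["source"] for record in records))
```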

Testing

- Add tests for all new functionality
- Ensure existing tests continue to pass
- Aim for good test coverage of new code
- Use descriptive test names that explain what is being tested
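
To make the naming advice concrete, the sketch below shows two pytest-style tests whose names state the behavior being checked. `normalize_column_name` is a hypothetical function invented for this example, not something DataLineagePy provides; in a real contribution the tests would import the code under test from the package.

```python
def normalize_column_name(name: str) -> str:
    """Hypothetical function under test: trim and lowercase a column name."""
    return name.strip().lower()


# pytest collects functions whose names start with "test_"; each name
# below describes the expected behavior rather than just the function.
def test_normalize_column_name_strips_whitespace_and_lowercases():
    assert normalize_column_name("  Customer_ID ") == "customer_id"


def test_normalize_column_name_leaves_clean_names_unchanged():
    assert normalize_column_name("order_total") == "order_total"
```

Saved in a file under `tests/`, these would be picked up by `pytest tests/`, and `pytest -k "normalize_column_name"` would select just these two.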

Documentation

- Update relevant documentation for new features
- Add examples to demonstrate usage
- Keep README.md and other docs up to date

🔧 Development Tips

Running Specific Tests

```
# Run all tests
pytest

# Run specific test file
pytest tests/test_basic_functionality.py

# Run tests with verbose output
pytest -v

# Run tests matching a pattern
pytest -k "test_dataframe"
```

Building Documentation Locally

```
# If using Sphinx or similar (future enhancement)
cd docs/
make html
```

Performance Testing

```
# Run performance benchmarks
python tests/test_benchmark_suite.py
```

🎯 Areas for Contribution

High Priority

- New Connectors: Database connectors (MongoDB, Snowflake, etc.)
- Performance Optimizations: Memory usage, speed improvements
- Documentation: More examples and tutorials
- Testing: Additional test coverage and edge cases

Medium Priority

- Visualization Enhancements: New chart types, better layouts
- Integration Examples: Jupyter notebooks, real-world scenarios
- Error Handling: Better error messages and recovery
- Configuration Options: More customization capabilities

Future Enhancements

- Apache Spark Integration: Native Spark DataFrame support
- Real-time Streaming: Enhanced Kafka and streaming support
- Cloud Integrations: Better cloud storage connectors
- ML Pipeline Support: Integration with ML frameworks

🐛 Reporting Issues

Bug Reports

When reporting bugs, please include:

- Python version and operating system
- DataLineagePy version
- Minimal code example that reproduces the issue
- Full error message and stack trace
- Expected vs actual behavior
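
One possible way to gather the version details listed above is a short script like the following sketch. It assumes the package is installed under the distribution name `datalineagepy`, which is a guess rather than something this guide confirms; adjust `PACKAGE_NAME` if your install uses a different name.

```python
import platform
from importlib import metadata

# Assumption: the distribution is installed under this name.
PACKAGE_NAME = "datalineagepy"


def describe_environment(package: str = PACKAGE_NAME) -> str:
    """Return Python, OS, and package versions as a pasteable report."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Report the miss instead of failing, so the rest is still usable.
        package_version = "not installed"
    return "\n".join(
        [
            f"Python: {platform.python_version()}",
            f"OS: {platform.platform()}",
            f"{package}: {package_version}",
        ]
    )


if __name__ == "__main__":
    print(describe_environment())
```

Paste the printed lines into the issue alongside the minimal example and the full stack trace.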

Feature Requests

For new features, please provide:

- Clear description of the feature
- Use case and motivation
- Proposed API or interface (if applicable)
- Any implementation ideas

📋 Pull Request Process

- Ensure tests pass: All existing tests must continue to pass
- Add tests: New functionality should include appropriate tests
- Update documentation: Add or update docs for new features
- Describe changes: Write a clear PR description explaining what changed and why
- Link issues: Reference any related GitHub issues

PR Review Process

- A maintainer will review your PR within a few days
- Address any feedback or requested changes
- Once the PR is approved, a maintainer will merge it

🌟 Recognition

Contributors will be:

- Listed in the project's contributors section
- Credited in release notes for significant contributions
- Invited to join the project as a maintainer (for ongoing contributors)

💬 Getting Help

- GitHub Issues: For bugs and feature requests
- Discussions: For questions and general discussion
- Email: Contact arbaznazir4@gmail.com for complex topics

📃 License

By contributing to DataLineagePy, you agree that your contributions will be licensed under the same MIT License that covers the project.

🙏 Thank You

Every contribution, no matter how small, helps make DataLineagePy better for the entire data engineering community. Thank you for being part of this project!

Happy coding! 🚀
