Thank you for your interest in contributing to Vectorizer! This document provides guidelines and instructions for contributing to the project.
- Code of Conduct
- Getting Started
- Development Setup
- Development Workflow
- Testing
- Code Quality
- Documentation
- Submitting Changes
- Release Process
This project adheres to the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to team@hivellm.org.
- Fork the repository on GitHub
- Clone your fork locally:
git clone https://github.com/your-username/vectorizer.git cd vectorizer - Add upstream remote:
git remote add upstream https://github.com/hivellm/vectorizer.git
- Rust: Nightly toolchain 1.85+ with Edition 2024
- Git: For version control
- WSL/Linux/macOS: For development (Windows users should use WSL)
rustup toolchain install nightly
rustup default nightly
rustup update nightly# Build in debug mode
cargo build
# Build in release mode
cargo build --release
# Build with GPU support (if available)
cargo build --features gpu# Development mode
cargo run
# Production mode
cargo run --releasegit checkout -b feature/your-feature-name- Follow the Rust style guide
- Write clear, concise commit messages
- Add tests for new functionality
- Update documentation as needed
Use Conventional Commits format:
# Feature
git commit -m "feat: Add new vector search algorithm"
# Bug fix
git commit -m "fix: Resolve memory leak in batch processing"
# Documentation
git commit -m "docs: Update API documentation"
# Performance improvement
git commit -m "perf: Optimize embedding generation"
# Refactoring
git commit -m "refactor: Simplify storage interface"
# Tests
git commit -m "test: Add integration tests for replication"# Run all tests
cargo test
# Run tests with output
cargo test -- --nocapture
# Run specific test
cargo test test_name
# Run integration tests only
cargo test --test '*'
# Run specific integration tests
cargo test --test api_workflow_test
cargo test --test concurrent_test
cargo test --test multi_collection_test
# Run integration tests with timeout
cargo nextest run --test api_workflow_test --test concurrent_test --test multi_collection_test --timeout 600scargo test --workspace --lib --all-features fails at link time on
Windows MSVC with exit code 1319 ("path too long"). The combined
feature set (real-models + onnx-models + arrow + parquet + transmutation + fastembed + hive-gpu + simd-*) produces 160+ rlib +
symbol paths that overflow the ~32 kB Windows command-line limit.
Local Windows workflow: run cargo test --workspace --lib
(no --all-features). The authoritative all-features signal lives
on CI under
.github/workflows/rust-all-features.yml
where ubuntu-latest absorbs the full link line.
The project includes comprehensive integration tests covering:
- API Workflow Tests (
api_workflow_test.rs): Full CRUD operations, batch operations, multi-collection workflows, and error handling - Concurrent Tests (
concurrent_test.rs): Concurrent searches, inserts, read-while-write scenarios, and race condition verification - Multi-Collection Tests (
multi_collection_test.rs): Tests with 100+ collections, cross-collection searches, and memory scaling
Integration tests use helper functions from tests/helpers/mod.rs for:
- Creating test stores and collections
- Generating test vectors
- Server startup utilities
- Assertion macros
To run all integration tests:
cargo nextest run --test '*' --timeout 600sNote: Integration tests may take longer to run and require proper timeout configuration.
# Generate coverage report
cargo llvm-cov --all --ignore-filename-regex 'examples'
# Generate HTML coverage report
cargo llvm-cov --html --all --ignore-filename-regex 'examples'Minimum coverage requirement: 95%
# Format all code
cargo +nightly fmt --all
# Check formatting
cargo +nightly fmt --all -- --check# Run clippy (must pass with no warnings)
cargo clippy --workspace -- -D warnings
# Run clippy on all targets
cargo clippy --workspace --all-targets --all-features -- -D warnings# Install codespell
pip install 'codespell[toml]'
# Run spell check
codespell \
--skip="*.lock,*.json,target,node_modules,.git" \
--ignore-words-list="crate,ser,deser"Before committing, ensure:
- ✅ Code is formatted:
cargo +nightly fmt --all - ✅ No clippy warnings:
cargo clippy --workspace -- -D warnings - ✅ All tests pass:
cargo test - ✅ Coverage ≥ 95%:
cargo llvm-cov - ✅ No typos:
codespell - ✅ Documentation updated
- ✅ CHANGELOG.md updated (for significant changes)
- Add doc comments (
///) to all public APIs - Include examples in doc comments
- Document error conditions
- Run doc tests:
cargo test --doc
Example:
/// Searches for vectors similar to the query.
///
/// # Arguments
///
/// * `query` - The search query text
/// * `limit` - Maximum number of results to return
///
/// # Examples
///
/// ```
/// use vectorizer::search;
///
/// let results = search("machine learning", 10)?;
/// ```
///
/// # Errors
///
/// Returns an error if the query is empty or invalid.
pub fn search(query: &str, limit: usize) -> Result<Vec<SearchResult>> {
// Implementation
}Update relevant documentation in /docs:
/docs/specs/- Feature specifications/docs/ARCHITECTURE.md- Architecture changes/docs/ROADMAP.md- Implementation progressREADME.md- User-facing changesCHANGELOG.md- Version history
-
Sync with upstream:
git fetch upstream git rebase upstream/main
-
Run quality checks:
cargo +nightly fmt --all cargo clippy --workspace -- -D warnings cargo test -
Update documentation
-
Update CHANGELOG.md (for significant changes)
-
Push to your fork:
git push origin feature/your-feature-name
-
Open a Pull Request on GitHub
-
Fill out the PR template:
- Description of changes
- Related issues
- Testing performed
- Breaking changes (if any)
- All PRs require at least one approval
- CI/CD checks must pass
- Coverage must be ≥ 95%
- No merge conflicts with main branch
Follow Semantic Versioning:
- MAJOR (1.0.0 → 2.0.0): Breaking changes
- MINOR (1.0.0 → 1.1.0): New features (backwards compatible)
- PATCH (1.0.0 → 1.0.1): Bug fixes (backwards compatible)
-
Update version in
Cargo.toml -
Update CHANGELOG.md:
## [1.2.0] - 2024-01-15 ### Added - New feature X ### Fixed - Bug in component Y ### Changed - Refactored module Z
-
Run quality checks:
cargo +nightly fmt --all cargo clippy --workspace --all-targets -- -D warnings cargo test --all-features cargo doc --no-deps -
Commit version bump:
git add Cargo.toml CHANGELOG.md git commit -m "chore: Release version 1.2.0" -
Create annotated tag:
git tag -a v1.2.0 -m "Release version 1.2.0 Major changes: - Feature X - Bug fix Y All tests passing ✅ Coverage: 95%+ ✅ Linting: Clean ✅ Build: Success ✅"
-
Push changes (manual, as per project rules):
git push origin main git push origin v1.2.0
- Documentation: Check
/docsdirectory - Issues: Search existing issues on GitHub
- Discussions: Use GitHub Discussions for questions
- Email: team@hivellm.org
By contributing to Vectorizer, you agree that your contributions will be licensed under the project's license (MIT or Apache 2.0).
Thank you for contributing to Vectorizer! 🚀