Contributing to Vectorizer

Thank you for your interest in contributing to Vectorizer! This document provides guidelines and instructions for contributing to the project.

Table of Contents

Code of Conduct
Getting Started
Development Setup
Development Workflow
Testing
Code Quality
Documentation
Submitting Changes
Release Process

Code of Conduct

This project adheres to the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to team@hivellm.org.

Getting Started

Fork the repository on GitHub

Clone your fork locally:

git clone https://github.com/your-username/vectorizer.git
cd vectorizer

Add upstream remote:

git remote add upstream https://github.com/hivellm/vectorizer.git

Development Setup

Prerequisites

Rust: Nightly toolchain 1.85+ with Edition 2024
Git: For version control
WSL/Linux/macOS: For development (Windows users should use WSL)

Install Rust Nightly

rustup toolchain install nightly
rustup default nightly
rustup update nightly

Build the Project

# Build in debug mode
cargo build

# Build in release mode
cargo build --release

# Build with GPU support (if available)
cargo build --features gpu

Run the Server

# Development mode
cargo run

# Production mode
cargo run --release

Development Workflow

1. Create a Feature Branch

git checkout -b feature/your-feature-name

2. Make Your Changes

Follow the Rust style guide
Write clear, concise commit messages
Add tests for new functionality
Update documentation as needed

3. Commit Your Changes

Use Conventional Commits format:

# Feature
git commit -m "feat: Add new vector search algorithm"

# Bug fix
git commit -m "fix: Resolve memory leak in batch processing"

# Documentation
git commit -m "docs: Update API documentation"

# Performance improvement
git commit -m "perf: Optimize embedding generation"

# Refactoring
git commit -m "refactor: Simplify storage interface"

# Tests
git commit -m "test: Add integration tests for replication"
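The examples above all follow the `type: description` shape. As a minimal sketch (not part of the Vectorizer codebase), a check along these lines could validate a commit subject before pushing. The function name `is_conventional_subject` and the `ALLOWED_TYPES` list are assumptions made for illustration; the list is drawn from the types shown in this guide, not from an official project policy.

```rust
/// Commit types that appear in this guide's examples (illustrative list only).
const ALLOWED_TYPES: [&str; 7] = ["feat", "fix", "docs", "perf", "refactor", "test", "chore"];

/// Returns true if `subject` looks like `type: description` or
/// `type(scope): description`, with an allowed type and a non-empty description.
pub fn is_conventional_subject(subject: &str) -> bool {
    // Split at the first ": " into the header and the description.
    let Some((header, description)) = subject.split_once(": ") else {
        return false;
    };
    if description.trim().is_empty() {
        return false;
    }
    // Drop an optional "!" breaking-change marker, then an optional "(scope)".
    let header = header.strip_suffix('!').unwrap_or(header);
    let commit_type = match header.split_once('(') {
        // A scope must be closed by ")" and must not be empty.
        Some((ty, rest)) => match rest.strip_suffix(')') {
            Some(scope) if !scope.is_empty() => ty,
            _ => return false,
        },
        None => header,
    };
    ALLOWED_TYPES.contains(&commit_type)
}
```

Under these assumptions, `is_conventional_subject("feat: Add new vector search algorithm")` is accepted, while a subject with no type prefix is rejected.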

Testing

Run All Tests

# Run all tests
cargo test

# Run tests with output
cargo test -- --nocapture

# Run specific test
cargo test test_name

# Run integration tests only
cargo test --test '*'

# Run specific integration tests
cargo test --test api_workflow_test
cargo test --test concurrent_test
cargo test --test multi_collection_test

# Run integration tests with timeout
cargo nextest run --test api_workflow_test --test concurrent_test --test multi_collection_test --timeout 600s

Windows Contributors and `--all-features`

`cargo test --workspace --lib --all-features` fails at link time on Windows MSVC with exit code 1319 ("path too long"). The combined feature set (`real-models` + `onnx-models` + `arrow` + `parquet` + `transmutation` + `fastembed` + `hive-gpu` + `simd-*`) produces 160+ rlib and symbol paths that overflow the ~32 kB Windows command-line limit.

Local Windows workflow: run `cargo test --workspace --lib` (without `--all-features`). The authoritative all-features signal lives on CI under `.github/workflows/rust-all-features.yml`, where `ubuntu-latest` can handle the full link line.

Integration Tests

The project includes comprehensive integration tests covering:

API Workflow Tests (api_workflow_test.rs): Full CRUD operations, batch operations, multi-collection workflows, and error handling
Concurrent Tests (concurrent_test.rs): Concurrent searches, inserts, read-while-write scenarios, and race condition verification
Multi-Collection Tests (multi_collection_test.rs): Tests with 100+ collections, cross-collection searches, and memory scaling

Integration tests use helper functions from tests/helpers/mod.rs for:

Creating test stores and collections
Generating test vectors
Server startup utilities
Assertion macros
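To illustrate the kind of helper listed above, here is a hypothetical sketch of a deterministic test-vector generator. It is not taken from `tests/helpers/mod.rs`; the name `make_test_vector`, its signature, and the choice of a small linear congruential generator are all assumptions, and the real helpers may look quite different.

```rust
/// Hypothetical helper: builds a deterministic, unit-length test vector.
/// The actual helpers in `tests/helpers/mod.rs` may differ.
pub fn make_test_vector(dimension: usize, seed: u64) -> Vec<f32> {
    // A small linear congruential generator keeps tests reproducible
    // without pulling in an external RNG crate.
    let mut state = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    let mut values: Vec<f32> = (0..dimension)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Use the top 24 bits to get a value in [-1.0, 1.0).
            let unit = (state >> 40) as f32 / (1u64 << 24) as f32;
            unit * 2.0 - 1.0
        })
        .collect();

    // Normalize to unit length so that the same seed always yields a
    // vector with a predictable magnitude.
    let norm = values.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in &mut values {
            *v /= norm;
        }
    }
    values
}
```

Seeding by index (for example `make_test_vector(128, i as u64)`) gives each test a distinct but repeatable vector, which makes failures easier to reproduce.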

To run all integration tests:

cargo nextest run --test '*' --timeout 600s

Note: Integration tests take longer to run than unit tests, so give them an adequate timeout (600s in the command above).

Coverage

# Generate coverage report
cargo llvm-cov --all --ignore-filename-regex 'examples'

# Generate HTML coverage report
cargo llvm-cov --html --all --ignore-filename-regex 'examples'

Minimum coverage requirement: 95%

Code Quality

Format Code

# Format all code
cargo +nightly fmt --all

# Check formatting
cargo +nightly fmt --all -- --check

Lint Code

# Run clippy (must pass with no warnings)
cargo clippy --workspace -- -D warnings

# Run clippy on all targets
cargo clippy --workspace --all-targets --all-features -- -D warnings

Spell Check

# Install codespell
pip install 'codespell[toml]'

# Run spell check
codespell \
  --skip="*.lock,*.json,target,node_modules,.git" \
  --ignore-words-list="crate,ser,deser"

Quality Checklist

Before committing, ensure:

✅ Code is formatted: cargo +nightly fmt --all
✅ No clippy warnings: cargo clippy --workspace -- -D warnings
✅ All tests pass: cargo test
✅ Coverage ≥ 95%: cargo llvm-cov
✅ No typos: codespell
✅ Documentation updated
✅ CHANGELOG.md updated (for significant changes)

Documentation

Code Documentation

Add doc comments (///) to all public APIs
Include examples in doc comments
Document error conditions
Run doc tests: cargo test --doc

Example:

/// Searches for vectors similar to the query.
///
/// # Arguments
///
/// * `query` - The search query text
/// * `limit` - Maximum number of results to return
///
/// # Examples
///
/// ```
/// use vectorizer::search;
///
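/// let results = search("machine learning", 10)?;
/// // The hidden `#` line below gives the doctest a `Result` return type so `?` compiles
/// // (assumes the error type converts into `Box<dyn std::error::Error>`).
/// # Ok::<(), Box<dyn std::error::Error>>(())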
/// ```
///
/// # Errors
///
/// Returns an error if the query is empty or invalid.
pub fn search(query: &str, limit: usize) -> Result<Vec<SearchResult>> {
    // Implementation
}

Project Documentation

Update the relevant project documentation:

/docs/specs/ - Feature specifications
/docs/ARCHITECTURE.md - Architecture changes
/docs/ROADMAP.md - Implementation progress
README.md - User-facing changes
CHANGELOG.md - Version history

Submitting Changes

Before Submitting

Sync with upstream:

git fetch upstream
git rebase upstream/main

Run quality checks:

cargo +nightly fmt --all
cargo clippy --workspace -- -D warnings
cargo test

Update documentation
Update CHANGELOG.md (for significant changes)

Create Pull Request

Push to your fork:

git push origin feature/your-feature-name

Open a Pull Request on GitHub
Fill out the PR template:
- Description of changes
- Related issues
- Testing performed
- Breaking changes (if any)

PR Review Process

All PRs require at least one approval
CI/CD checks must pass
Coverage must be ≥ 95%
No merge conflicts with main branch

Release Process

Version Numbering

Follow Semantic Versioning:

MAJOR (1.0.0 → 2.0.0): Breaking changes
MINOR (1.0.0 → 1.1.0): New features (backwards compatible)
PATCH (1.0.0 → 1.0.1): Bug fixes (backwards compatible)
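The three bump rules can be sketched in a few lines of Rust. This is an illustrative example only, not a tool the project ships; the `Bump` enum and `next_version` function are hypothetical names, and the parser handles only plain `MAJOR.MINOR.PATCH` strings without pre-release or build metadata.

```rust
/// Kind of change being released, following Semantic Versioning.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bump {
    Major,
    Minor,
    Patch,
}

/// Parses a plain `MAJOR.MINOR.PATCH` string and returns the next version
/// for the given bump, or `None` if the input is not three numeric parts.
pub fn next_version(current: &str, bump: Bump) -> Option<String> {
    let mut parts = current.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    // Lower-order components reset to zero when a higher one is bumped.
    let (major, minor, patch) = match bump {
        Bump::Major => (major + 1, 0, 0),
        Bump::Minor => (major, minor + 1, 0),
        Bump::Patch => (major, minor, patch + 1),
    };
    Some(format!("{major}.{minor}.{patch}"))
}
```

For instance, `next_version("1.2.3", Bump::Minor)` yields `Some("1.3.0")`: the minor component increases and the patch component resets.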

Creating a Release

Update version in Cargo.toml

Update CHANGELOG.md:

## [1.2.0] - 2024-01-15

### Added
- New feature X

### Fixed
- Bug in component Y

### Changed
- Refactored module Z

Run quality checks:

cargo +nightly fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo test --all-features
cargo doc --no-deps

Commit version bump:

git add Cargo.toml CHANGELOG.md
git commit -m "chore: Release version 1.2.0"

Create annotated tag:

git tag -a v1.2.0 -m "Release version 1.2.0

Major changes:
- Feature X
- Bug fix Y

All tests passing ✅
Coverage: 95%+ ✅
Linting: Clean ✅
Build: Success ✅"

Push changes (manual, as per project rules):

git push origin main
git push origin v1.2.0

Getting Help

Documentation: Check /docs directory
Issues: Search existing issues on GitHub
Discussions: Use GitHub Discussions for questions
Email: team@hivellm.org

License

By contributing to Vectorizer, you agree that your contributions will be licensed under the project's license (MIT or Apache 2.0).

Thank you for contributing to Vectorizer! 🚀
