Zig Faker - Implementation TODO

This document tracks the implementation progress and future enhancements for the Zig Faker library, mimicking the functionality of ts-mocker.

✅ Completed (Phase 1 - Core Implementation)

Core Infrastructure

  • Project structure and build system
  • Random number generator with seeding support
  • Locale definition system and types
  • English locale data (comprehensive)
  • Main Faker struct with module organization

Core Modules

  • Person module (names, prefixes, suffixes, job titles, gender)
  • Address module (streets, cities, states, countries, postal codes)
  • Company module (names, industries, buzzwords, catch phrases)
  • Internet module (emails, domains, URLs, usernames, passwords)
  • Phone module (formatted phone numbers)
  • String utilities (UUID, nanoid, alphanumeric, hex)

Documentation & Examples

  • Comprehensive README with API documentation
  • Basic usage examples
  • Benchmark suite
  • Unit tests for all modules

✅ Phase 2 - Additional Data Categories (COMPLETE)

Food Module

  • Implement Food module structure
  • Food dishes (100+ dishes) - Expanded to 120+ dishes
  • Ingredients (100+ ingredients) - Expanded to 150+ ingredients
  • Fruits (30+ fruits) - Expanded to 45+ fruits
  • Vegetables (30+ vegetables) - Expanded to 50+ vegetables
  • Meats (15+ types) - Expanded to 60+ types
  • Spices and herbs (30+ spices) - Expanded to 45+ spices
  • Cuisines (international cuisine types)
  • Desserts

Animal Module

  • Implement Animal module structure
  • Dogs (30+ breeds) - Expanded to 50+ breeds
  • Cats (20+ breeds) - Expanded to 30+ breeds
  • Birds (20+ species) - Expanded to 40+ species
  • Fish (20+ species) - Expanded to 30+ species
  • Horses (10+ breeds) - Added 25+ breeds
  • Farm animals - Added 25 farm animals
  • Insects - Added 30 insect types
  • Wild animals - Added 80+ wild animals

Vehicle Module

  • Implement Vehicle module structure
  • Car manufacturers (50+ brands) - 42+ brands
  • Car models (200+ models) - 50+ models
  • Vehicle types (sedan, SUV, truck, etc.) - 20+ types
  • Fuel types (gas, diesel, electric, hybrid) - 10 types
  • Bicycle types - 15 types
  • Motorcycle brands - Included in manufacturers

Sport Module

  • Implement Sport module structure
  • Sport names (50+ sports) - 50+ sports
  • Team names - 60+ team names
  • Athletes (famous athletes) - 30+ athletes
  • Equipment - Covered
  • Positions/roles - 30+ positions

Music Module

  • Implement Music module structure
  • Genres (50+ genres) - 50+ genres
  • Artists (100+ artists) - 40+ artists
  • Songs/albums - 30+ songs, 20+ albums
  • Instruments (30+ instruments) - 35+ instruments
  • Music terms - Covered

Book Module

  • Implement Book module structure
  • Book titles (100+ titles) - 40+ titles
  • Authors (100+ authors) - 36+ authors
  • Publishers (50+ publishers) - 24+ publishers
  • Genres (30+ genres) - 35+ genres
  • Series names - 20+ series
  • Book reviews/quotes - Dynamic generation

Commerce Module

  • Implement Commerce module structure
  • Product names (200+ products) - 48+ base products
  • Product adjectives - 30+ adjectives
  • Materials (wood, metal, plastic, etc.) - 24+ materials
  • Departments (electronics, clothing, etc.) - 25+ departments
  • Colors (50+ colors) - 21+ colors
  • Sizes - Can be added easily

Word Module

  • Implement Word module structure
  • Adjectives (500+ adjectives) - Expanded to 500+ adjectives
  • Adverbs (200+ adverbs) - Expanded to 200+ adverbs
  • Conjunctions - Already implemented
  • Interjections - Already implemented
  • Nouns (1000+ nouns) - Expanded to 1000+ nouns
  • Prepositions - Already implemented
  • Verbs (500+ verbs) - Expanded to 500+ verbs

Hacker/Tech Module

  • Implement Hacker module structure
  • Tech abbreviations (API, HTTP, JSON, etc.) - 48+ abbreviations
  • Tech adjectives - 20+ adjectives
  • Tech nouns - 25+ nouns
  • Tech verbs - 20+ verbs
  • Programming languages - Covered
  • Tech phrases/jargon - Dynamic generation

System Module

  • Implement System module structure
  • File names - 20+ templates
  • File extensions (100+ types) - 70+ extensions
  • MIME types - 25+ MIME types
  • Directory paths - 18+ paths
  • Semantic versions - Dynamic generation

Science Module

  • Implement Science module structure
  • Chemical elements (118 elements) - 60+ elements
  • Units (metric, imperial) - 40+ units
  • Scientific constants - 15+ constants
  • Scientific fields - 25+ fields
  • Lab equipment - Can be expanded

✅ Phase 3 - Utility Modules (COMPLETE)

Date Module

  • Random dates in range - Timestamp generation
  • Past dates - Can be generated
  • Future dates - Can be generated
  • Recent dates - Timestamp based
  • Soon dates - Timestamp based
  • Weekday names - Implemented
  • Month names - Implemented
  • Date formatting - String formatting implemented

Number Module

  • Random integers in range - Fully implemented
  • Random floats with precision - Implemented
  • Random percentages - Implemented
  • Binary numbers - Implemented
  • Octal numbers - Can be added
  • Hexadecimal numbers - Implemented via hex()

Color Module

  • RGB colors - Implemented
  • Hex colors - Implemented
  • HSL colors - Can be added
  • Color names (100+ colors) - Implemented
  • CSS colors - Implemented

Finance Module

  • Credit card numbers (with validation) - Luhn algorithm
  • Credit card types (Visa, Mastercard, etc.) - Covered
  • IBAN numbers - Implemented
  • BIC codes - Implemented
  • Currency codes - 16+ codes
  • Currency names - 16+ names
  • Account numbers - Implemented
  • Transaction types - 10+ types
  • Bitcoin addresses - Implemented
  • Ethereum addresses - Implemented
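
For reference, the Luhn check mentioned under credit card numbers can be sketched as follows. This is a minimal illustration in Python, not the library's Zig code; the function names, the "4" prefix, and the 16-digit length are assumptions chosen for the example.

```python
import random


def luhn_checksum(digits: str) -> int:
    """Return the Luhn sum modulo 10 for a string of decimal digits."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10


def is_luhn_valid(number: str) -> bool:
    """A number passes when its Luhn sum is a multiple of 10."""
    return number.isascii() and number.isdigit() and luhn_checksum(number) == 0


def fake_card_number(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Build a random number with the given prefix and a valid check digit.

    The prefix and length are hypothetical defaults for this sketch.
    """
    body_len = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randrange(10)) for _ in range(body_len))
    # Appending "0" puts the body digits in their final positions,
    # so the missing check digit is whatever brings the sum to a multiple of 10.
    check = (10 - luhn_checksum(body + "0")) % 10
    return body + str(check)
```

Seeding the `random.Random` instance makes the generated number reproducible, which mirrors the seeded generation goal listed for the core RNG.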

Database Module

  • Column names - 30+ column names
  • Table names - 24+ table names
  • Database types (MySQL, PostgreSQL, etc.) - 15+ engines
  • SQL data types - 20+ SQL types
  • MongoDB ObjectId - Implemented
  • Collation types - 9+ collations

Git Module

  • Commit messages - 20+ messages
  • Commit SHAs - 40 char hex
  • Branch names - 10+ branch types
  • Commit authors - Via person module
  • Commit timestamps - Via date module

Image Module

  • Image URLs (placeholder services) - Implemented
  • Image dimensions - Dynamic generation
  • Image categories - Implemented
  • Data URIs - Implemented

Lorem Module

  • Lorem ipsum words - Implemented
  • Lorem ipsum sentences - Implemented
  • Lorem ipsum paragraphs - Implemented
  • Text with line count - Can be generated
  • Slug generation - Can be added

Helpers Module

  • Array element picker (single) - faker.helpers.arrayElement()
  • Array elements picker (multiple) - faker.helpers.arrayElements()
  • Array unique elements picker - faker.helpers.arrayElementsUnique()
  • Array shuffle - faker.helpers.shuffle()
  • Random boolean - faker.helpers.boolean()
  • Replace symbols (# → digit, ? → letter) - faker.helpers.replaceSymbols()
  • Maybe (null with probability) - faker.helpers.maybe() / maybeNull()
  • Unique ID generator - faker.helpers.uniqueId()
  • NEW: Weighted selection - faker.helpers.weightedArrayElement()
  • NEW: Batch generation - faker.helpers.batchGenerate()
  • NEW: Subset selection - faker.helpers.subset()
  • NEW: Normal distribution - faker.helpers.normalDistribution()
  • NEW: Slugify - faker.helpers.slugify()
  • NEW: Sequence generator - faker.helpers.sequence()
  • NEW: Hex color generator - faker.helpers.hexColor()
  • NEW: String repeat - faker.helpers.repeatString()
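
The symbol replacement rule listed above (# becomes a digit, ? becomes a letter) can be sketched like this. It is an illustrative Python version, not the signature of faker.helpers.replaceSymbols(); using uppercase ASCII letters for ? is an assumption, since the list does not say which letter case the library emits.

```python
import random
import string


def replace_symbols(pattern: str, rng: random.Random) -> str:
    """Replace each '#' with a random digit and each '?' with a random letter.

    All other characters are copied through unchanged.
    """
    out = []
    for ch in pattern:
        if ch == "#":
            out.append(rng.choice(string.digits))
        elif ch == "?":
            # Assumption: uppercase ASCII letters.
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)
    return "".join(out)
```

A pattern such as `"??-####"` would yield two letters, a hyphen, and four digits, and the same seed always produces the same string.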

✅ Phase 4 - Locale Support (COMPLETE)

Locale System Enhancements (COMPLETE)

  • Locale fallback chain - getLocaleFallbackChain()
  • Locale merging utilities - mergeLocaleDefinitions()
  • Regional variants - en-US, en-GB implemented as examples
  • Locale detection - detectSystemLocale() from env vars
  • Custom locale loading - LocaleLoader with caching
  • Locale parsing - parseLocale() supports both _ and - separators
  • Available locales list - 27 base locales, 28 regional variants
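
As a rough illustration of how locale parsing and the fallback chain fit together, here is a small Python sketch. It does not reproduce parseLocale() or getLocaleFallbackChain(); the names, the `Locale` tuple, and the choice of "en" as the final fallback are assumptions for the example.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str
    region: Optional[str]


def parse_locale(tag: str) -> Locale:
    """Split a tag such as 'en_US' or 'en-US' into language and region."""
    normalized = tag.replace("-", "_")  # accept both separators
    language, _, region = normalized.partition("_")
    return Locale(language.lower(), region.upper() if region else None)


def fallback_chain(tag: str, default: str = "en") -> list[str]:
    """Return locales to try in order: regional variant, base language, default.

    The default of 'en' is an assumption made for this sketch.
    """
    loc = parse_locale(tag)
    chain = []
    if loc.region:
        chain.append(f"{loc.language}_{loc.region}")
    chain.append(loc.language)
    if default not in chain:
        chain.append(default)
    return chain
```

With these definitions, `fallback_chain("pt-BR")` gives `["pt_BR", "pt", "en"]`, so a lookup can walk the list and use the first locale that defines the requested data.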

Base Locales (27 total) - COMPLETE

  • af (Afrikaans) - Stub
  • ar (Arabic) - Stub
  • az (Azerbaijani) - Stub
  • cs (Czech) - Stub
  • da (Danish) - Stub
  • de (German) - Full implementation
  • en (English) - Full implementation
  • eo (Esperanto) - Stub
  • es (Spanish) - Full implementation
  • fa (Persian) - Stub
  • fi (Finnish) - Stub
  • fr (French) - Full implementation
  • he (Hebrew) - Stub
  • hi (Hindi) - Stub
  • it (Italian) - Full implementation
  • ja (Japanese) - Basic implementation
  • ko (Korean) - Basic implementation
  • nl (Dutch) - Basic implementation
  • no (Norwegian) - Stub
  • pl (Polish) - Stub
  • pt (Portuguese) - Full implementation
  • sv (Swedish) - Stub
  • tl (Tagalog) - Stub
  • tr (Turkish) - Stub
  • uk (Ukrainian) - Full implementation
  • zh (Chinese) - Basic implementation
  • zu (Zulu) - Stub

Regional Variants (28 total) - COMPLETE

  • af_ZA (Afrikaans - South Africa)
  • de_AT (German - Austria)
  • de_CH (German - Switzerland)
  • de_DE (German - Germany)
  • en_AU (English - Australia)
  • en_CA (English - Canada)
  • en_GB (English - United Kingdom) - Full implementation
  • en_GH (English - Ghana)
  • en_HK (English - Hong Kong)
  • en_IE (English - Ireland)
  • en_IN (English - India)
  • en_NG (English - Nigeria)
  • en_US (English - United States) - Full implementation
  • en_ZA (English - South Africa)
  • es_ES (Spanish - Spain) - Full implementation
  • es_MX (Spanish - Mexico) - Full implementation
  • fr_BE (French - Belgium)
  • fr_CA (French - Canada)
  • fr_CH (French - Switzerland)
  • fr_FR (French - France)
  • fr_LU (French - Luxembourg)
  • fr_SN (French - Senegal)
  • pt_BR (Portuguese - Brazil)
  • pt_MZ (Portuguese - Mozambique)
  • pt_PT (Portuguese - Portugal)
  • zh_CN (Chinese - China)
  • zh_TW (Chinese - Taiwan)
  • zu_ZA (Zulu - South Africa)

Summary

  • 55 total locale files: 27 base + 28 regional variants
  • Matches ts-mocker locale list exactly (excluding Russian per requirements)
  • All locales are loadable via LocaleLoader
  • All locales exported from faker.zig via faker.locales.*
  • Automatic fallback chains working (e.g., en_US → en → fallback)
  • Thread-safe caching with dynamic loading
  • All 44 tests passing

✅ Phase 5 - Advanced Features (COMPLETE)

Validation System

  • Built-in validators (email, phone, URL, UUID, credit card, IPv4/IPv6, hex color, etc.) - 12+ validators
  • Custom validation support with ValidatorFn
  • Validation rules engine with ValidationRule
  • Strict vs non-strict modes
  • Integrated into main Faker struct

Weighted Selection

  • Weighted array element selection - Implemented in Helpers module
  • Probability distributions - Normal distribution in Helpers
  • Common name weights - Implemented with realistic frequency data
  • Age distribution weights - realisticAge(), realisticAdultAge(), etc.
  • Country distribution weights - Implemented with population-based weights
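
One common way to implement weighted selection is a cumulative-weight lookup, sketched below in Python. This is not necessarily how faker.helpers.weightedArrayElement() works internally; the function name and argument order are assumptions for the illustration.

```python
import bisect
import itertools
import random


def weighted_array_element(items, weights, rng: random.Random):
    """Pick one item with probability proportional to its weight."""
    if not items or len(items) != len(weights):
        raise ValueError("items and weights must be non-empty and the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Draw a point in [0, total) and find the first bucket that extends past it.
    point = rng.random() * total
    index = bisect.bisect_right(cumulative, point)
    # Guard against floating-point rounding pushing the point up to total.
    return items[min(index, len(items) - 1)]
```

Building the cumulative list costs O(n) per call; for repeated draws from the same weights, it could be computed once and reused so that each pick is only the O(log n) binary search.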

Data Relationships

  • Related data generation
  • Family generation (consistent last names)
  • Neighborhood generation (same city/zip)
  • Organization generation (employees in company)
  • Phone format by country
  • Postal code format by country

Constraints System

  • Gender constraints
  • Country/region constraints
  • Age range constraints
  • Locale constraints
  • Custom constraint functions

Templates & Schemas

  • JSON schema support
  • Custom data templates
  • Batch generation from templates
  • Nested object generation

🚧 Phase 6 - CLI Tool

Command Line Interface

  • CLI structure and argument parsing
  • Generate single item: zig-faker generate person firstName
  • Generate multiple items: zig-faker generate person firstName --count 10
  • Seeded generation: zig-faker generate --seed 12345
  • Locale selection: zig-faker generate --locale es
  • List categories: zig-faker categories
  • List methods: zig-faker methods person
  • List locales: zig-faker locales
  • Batch generation: zig-faker batch 100 --template user.json
  • JSON output: zig-faker generate --json
  • CSV output: zig-faker generate --csv
  • Template file support

CLI Installation

  • Build as standalone executable
  • Installation script
  • Homebrew formula (macOS)
  • Distribution packages (Linux)

🚧 Phase 7 - Performance Optimizations

Performance Enhancements

  • Optimize hot paths
  • Reduce allocations where possible
  • Cache frequently used values
  • Lazy initialization of modules
  • SIMD optimizations for string generation
  • Benchmark against ts-mocker
  • Memory profiling
  • Performance regression tests

Advanced RNG

  • Alternative RNG algorithms (PCG, xoshiro, etc.)
  • Cryptographically secure option
  • RNG performance benchmarks

🚧 Phase 8 - Testing & Quality

Test Coverage

  • Unit tests for all modules (100% coverage)
  • Integration tests
  • Property-based tests
  • Fuzzing tests
  • Memory leak detection
  • Thread safety tests (if async is added)

Documentation

  • API documentation (doc comments)
  • Usage guides
  • Migration guides
  • Cookbook/recipes
  • Performance tips
  • Locale creation guide

🚧 Phase 9 - Advanced Use Cases

Testing Utilities

  • Test fixture generation
  • Mock data factories
  • Snapshot testing support
  • Integration with testing frameworks

API Mocking

  • REST API response generation
  • GraphQL response generation
  • JSON API format support

🎯 Future Considerations

Experimental Features

  • AI-generated realistic data
  • Context-aware generation (e.g., realistic names for specific countries)
  • Data anonymization (replace real data with fake but realistic values)
  • Plugin system for custom generators
  • Async/streaming generation for large datasets
  • Incremental generation (resume from seed+offset)

Ecosystem

  • Package manager support (launchpad; deferred, do not start this yet)
  • Integration examples (web frameworks, ORMs)
  • Community locale contributions
  • VS Code extension
  • Online playground/demo

📊 Metrics & Goals

Performance Targets (to match or exceed ts-mocker)

  • UUID generation: 20M+ ops/sec
  • Email generation: 10M+ ops/sec
  • Full name generation: 20M+ ops/sec
  • Memory usage: < 1MB for core library
  • Binary size: < 500KB (static)

Quality Targets

  • Test coverage: > 95%
  • Documentation coverage: 100%
  • Zero known memory leaks
  • Zero unsafe code (except where necessary)

🤝 Contributing

Contributions are welcome! Pick any task from this TODO list and open a PR. Please:

  1. Add tests for new functionality
  2. Update documentation
  3. Follow existing code style
  4. Run benchmarks for performance-sensitive changes

📝 Notes

  • This implementation mirrors ts-mocker's architecture and feature set
  • Focus on performance and memory efficiency
  • Maintain simple, idiomatic Zig code
  • Keep dependencies minimal (ideally zero)
  • Prioritize usefulness for testing and prototyping