Releases: roALAB1/data-normalization-platform

v4.24.0 - Workshop 4: Data Quality & Intent Filtering

23 Feb 21:02

Added Workshop 4 with a Wistia embed (5dx0ehg1oe), a timestamped transcript, and a comprehensive KB article. Covers derivative data, match rate deconstruction, geoframing, the Distance + Deviation intent model, the closed feedback loop, stateless workers, and DeepVerify. 211 tests passing.

v4.23.0: RetargetIQ Dedicated Battle Card

19 Feb 17:36

Added a dedicated RetargetIQ battle card, cloned from DataShopper, with reseller-specific messaging. Includes 5 cards, 13 comparison dimensions, 4 fatal flaws, a rebuttal script, a talk track, 6 FAQ entries, and a 3-month roadmap. 190 tests passing.

v4.21.0 - Premium Tier Rebuttal Cards

19 Feb 03:01

Added dedicated Premium Tier Rebuttal battle cards and FAQ entries for ZoomInfo (card #6, FAQ #7) and Apollo.io (card #5, FAQ #6). Addresses the common objection 'We already use the premium/enterprise tier' with detailed rebuttals explaining why paying more doesn't fix structural data architecture limitations. 172 tests passing.

v4.20.0 - Battle Card Seeding: ZoomInfo, Apollo.io, RB2B

18 Feb 22:24

Synthesized comprehensive battle cards from ChatGPT, 2x Perplexity, and Google Deep Research. ZoomInfo (5 cards, 8 comparisons, 4 fatal flaws), Apollo.io (4 cards, 8 comparisons, 3 fatal flaws), RB2B (4 cards, 8 comparisons, 3 fatal flaws). Each includes rebuttal scripts, talk tracks, sales FAQ, and 90-day deployment roadmaps. 172 tests passing.

v4.19.0 — Data Objection Handler

17 Feb 07:06

AI-powered Data Objection Handler page. Reps paste a prospect's data quality objection and receive a tailored response backed by the 'Cheap Data Costs More' workshop intelligence. Features 4 tone options, 15 common objection presets, copy-to-clipboard, response history, retry, and Cmd+Enter shortcut. 143 tests passing.

v4.18.0 — Data Economics Quiz & Vendor Checklist

17 Feb 06:40

Added 12-question Data Economics Quiz (NCOA, Distance Scoring, Starbucks Problem, derivative data, match rates, vendor vetting) with 4 score tiers and category breakdown. Generated one-page Vendor Comparison Checklist PDF on S3 CDN. Both linked from Data Workshop page and Workshop Hub. 121 tests passing across 9 test files.

v4.17.0 — Data Deep Dive Workshop

17 Feb 06:30

Added 'Cheap Data Costs More' workshop page with embedded Loom video, full timestamped transcript, and comprehensive Knowledge Base article covering the three pillars of identity data (Audience, Pixel, Intent). Clickable timestamps jump to video position. Workshop 3 card added to Workshop Hub. 108 tests passing across 8 test files.

v3.50.0: Smart Column Mapping

18 Dec 21:01

Smart Column Mapping 🤖

Intelligent pre-normalization feature that automatically detects and suggests combining fragmented columns (address components, name components, phone components) with confidence scoring and preview generation. Eliminates 5-10 minutes of manual Excel work with one-click acceptance.

Key Features

  • 🏠 Address Components: House + Street + Apt → Address (e.g., "65" + "MILL ST" + "306" → "65 MILL ST Apt 306")
  • 👤 Name Components: First + Middle + Last + Prefix + Suffix → Full Name (supports 15+ column name variations)
  • 📞 Phone Components: Area Code + Number + Extension → Phone (e.g., "555" + "123-4567" → "(555) 123-4567")
  • 🎯 Pattern Matching: Case-insensitive detection with space/underscore support
  • 📊 Confidence Scoring: High (≥80%), Medium (60-79%), Low (<60%) confidence indicators
  • 👁️ Preview Generation: Shows 3 sample combinations before acceptance
  • Fast Detection: <50ms for typical CSV (10-20 columns)
  • 🎨 SmartSuggestions UI: User-friendly interface with Accept/Customize/Ignore actions
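
The combination rules and confidence tiers listed above can be illustrated with a minimal sketch. This is not the code in ColumnCombinationDetector.ts: the function names, the "Apt" and "x" separators, and the header normalization rule are assumptions chosen to reproduce the examples in this release note.

```typescript
// Hypothetical sketch; names and rules are assumptions, not the project's API.
type ConfidenceTier = "high" | "medium" | "low";

// Map a 0-100 confidence score to the tiers named in the release notes.
function confidenceTier(score: number): ConfidenceTier {
  if (score >= 80) return "high";
  if (score >= 60) return "medium";
  return "low";
}

// Normalize a header so "House Number", "house_number", and "HOUSE NUMBER"
// compare equal (case-insensitive, spaces and underscores ignored).
function normalizeHeader(header: string): string {
  return header.trim().toLowerCase().replace(/[\s_]+/g, "");
}

// Combine address parts; the unit gets an "Apt" prefix when present.
function combineAddress(house: string, street: string, apt?: string): string {
  const parts = [house.trim(), street.trim()].filter((p) => p.length > 0);
  const unit = apt ? apt.trim() : "";
  if (unit.length > 0) parts.push("Apt " + unit);
  return parts.join(" ");
}

// Combine phone parts into "(AAA) NNN-NNNN"; the " x" extension
// separator is an assumption.
function combinePhone(areaCode: string, num: string, ext?: string): string {
  const digits = num.replace(/\D/g, "");
  const local =
    digits.length === 7 ? digits.slice(0, 3) + "-" + digits.slice(3) : num.trim();
  const base = "(" + areaCode.trim() + ") " + local;
  const extension = ext ? ext.trim() : "";
  return extension.length > 0 ? base + " x" + extension : base;
}

// combineAddress("65", "MILL ST", "306") returns "65 MILL ST Apt 306"
// combinePhone("555", "123-4567")        returns "(555) 123-4567"
```

A real detector would also need to decide which columns belong together before combining them; the sketch covers only the combination and tiering steps.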

UI Enhancements

User Experience

  • Before: 5-10 minutes of manual column combination in Excel
  • After: One-click "Accept" on smart suggestion
  • Eliminates manual Excel formula work and reduces errors

Test Coverage

  • 22/22 comprehensive unit tests (100% pass rate)
  • Detection time: <50ms for typical CSV
  • Minimal memory overhead (only 5 sample rows per column)

Technical Details

Files Added:

  • shared/utils/ColumnCombinationDetector.ts - Core detection logic
  • client/src/components/SmartSuggestions.tsx - UI component
  • tests/v3.50.0.test.ts - Comprehensive test suite
  • docs/VERSION_HISTORY_v3.50.0.md - Detailed documentation

Test Categories:

  • Address component detection (5 tests)
  • Name component detection (3 tests)
  • Phone component detection (3 tests)
  • Column combination application (4 tests)
  • Multiple suggestions (2 tests)
  • Edge cases (5 tests)

See CHANGELOG.md for complete details.

Release v3.49.0

18 Dec 01:23

What's New in v3.49.0 🚀

Changes

  • Checkpoint: v3.49.0: Fix critical memory issues with 400k+ row files (5f26173)
  • Release v3.49.0: Large File Processing Fix (661db03)

Full Changelog

See CHANGELOG.md for complete version history.

Installation

git clone https://github.com/roALAB1/data-normalization-platform.git
cd data-normalization-platform
pnpm install
pnpm run dev

v3.48.0: URL Normalization Feature 🌐

18 Dec 00:12

URL Normalization Feature 🌐

Comprehensive URL normalization that extracts clean domain names from URLs by removing protocols, www prefixes, paths, query parameters, and fragments. Auto-detects URL columns in CSV files with 95%+ accuracy and supports international domains (.co.uk, .com.au, etc.). Includes confidence scoring for URL validity and handles 18+ multi-part TLDs. All 40 tests passing with full integration into the intelligent normalization engine.

Key Features

  • 🌐 Protocol Removal: Strips http://, https://, ftp://, and other protocols
  • 🔗 WWW Prefix Removal: Removes www. from domain names (case-insensitive)
  • 🎯 Root Domain Extraction: Extracts only domain + extension (google.com)
  • 🗑️ Path/Query/Fragment Removal: Removes /paths, ?query=params, and #fragments
  • 🌍 International Domain Support: Handles .co.uk, .com.au, and 18+ multi-part TLDs
  • 🤖 Auto-Detection: Automatically identifies URL columns (Website, URL, Link, Homepage)
  • 📊 Confidence Scoring: 0-1 confidence scores based on domain validity
  • 40 Tests Passing: Comprehensive coverage including real-world examples
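
As a rough illustration of header-based auto-detection, the sketch below flags columns whose header contains one of the keywords listed above as a whole word. The keyword list comes from this release note; the function name and the token-matching rule are assumptions, and the actual detector may also inspect cell values.

```typescript
// Hypothetical helper; the matching rule is an assumption, not the project's logic.
const URL_HEADER_KEYWORDS = ["website", "url", "link", "homepage"];

// Return the headers that look like URL columns.
function detectUrlColumns(headers: string[]): string[] {
  return headers.filter((header) => {
    // Split on anything that is not a letter or digit: "linkedin_url" -> ["linkedin", "url"].
    const tokens = header
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((t) => t.length > 0);
    // Match whole tokens so "linkedin" does not count as "link".
    return tokens.some((t) => URL_HEADER_KEYWORDS.indexOf(t) !== -1);
  });
}

// detectUrlColumns(["Name", "Company Website", "linkedin_url", "Email"])
// returns ["Company Website", "linkedin_url"]
```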

Examples

http://www.google.com → google.com
https://www.example.com/page?query=1 → example.com
www.facebook.com/profile#section → facebook.com
subdomain.site.co.uk/path → site.co.uk
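
The examples above can be reproduced with a short sketch. This is not the project's URLNormalizer: the function name is hypothetical, the multi-part TLD list is a small illustrative sample rather than the 18+ entries the release mentions, and inputs with embedded credentials (user@host) are not handled.

```typescript
// Hypothetical sketch; not the project's URLNormalizer implementation.
// Illustrative sample of multi-part TLDs only.
const MULTI_PART_TLDS = ["co.uk", "com.au", "co.nz", "co.jp"];

function extractRootDomain(input: string): string {
  let host = input.trim().toLowerCase();
  host = host.replace(/^[a-z][a-z0-9+.-]*:\/\//, ""); // strip protocol (http://, ftp://, ...)
  host = host.split(/[\/?#]/)[0] || "";               // drop path, query, and fragment
  host = host.replace(/:\d+$/, "");                   // drop a trailing port
  host = host.replace(/^www\./, "");                  // strip www. prefix
  const labels = host.split(".");
  if (labels.length <= 2) return host;
  // Keep three labels when the last two form a known multi-part TLD.
  const lastTwo = labels.slice(-2).join(".");
  const keep = MULTI_PART_TLDS.indexOf(lastTwo) !== -1 ? 3 : 2;
  return labels.slice(-keep).join(".");
}

// extractRootDomain("http://www.google.com")                returns "google.com"
// extractRootDomain("https://www.example.com/page?query=1") returns "example.com"
// extractRootDomain("www.facebook.com/profile#section")     returns "facebook.com"
// extractRootDomain("subdomain.site.co.uk/path")            returns "site.co.uk"
```

A production version would more likely rely on the full Public Suffix List than on a hand-maintained array.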

Technical Details

  • URLNormalizer Utility Class: Three main methods
    • normalize(url): Returns detailed result with metadata
    • normalizeString(url): Simplified version for CSV processing
    • normalizeBatch(urls): Batch processing for multiple URLs
  • Integration with Intelligent Engine: Added 'url' DataType to UnifiedNormalizationEngine
    • Seamless integration with existing normalization pipeline
    • Lazy import for optimal performance
    • Metadata includes: domain, subdomain, tld, isValid, confidence

Test Coverage

40 comprehensive tests (100% pass rate):

  • Basic URL normalization (4 tests)
  • Protocol removal (4 tests)
  • WWW prefix removal (3 tests)
  • Path/query/fragment removal (6 tests)
  • Subdomain handling (3 tests)
  • International domains (4 tests)
  • Edge cases (6 tests)
  • Confidence scoring (3 tests)
  • Batch normalization (1 test)
  • String normalization (1 test)
  • Real-world examples (5 tests)

What's Changed

  • Updated version to 3.48.0 in package.json and versionManager.ts
  • Added comprehensive URL normalization feature
  • Updated README.md with v3.48.0 overview
  • Updated CHANGELOG.md with detailed v3.48.0 entry
  • All existing features remain fully functional

Full Changelog: v3.45.0...v3.48.0