Debugging Guide

Purpose: Document known issues, failed approaches, optimal solutions, and debugging patterns for future developers and AI agents.

⚠️ MANDATORY: Read this BEFORE attempting any fixes. Update AFTER every debugging session.

🔴 BEFORE YOU START: Read FIX_PROCESS.md

ALL code changes MUST follow the 6-step process in FIX_PROCESS.md:

✅ Check docs (VERSION_HISTORY, DEBUGGING_GUIDE, ARCHITECTURE_DECISIONS)
✅ Create test FIRST
✅ Apply fix
✅ Run test
✅ Update docs
✅ Ask user to verify

NO SHORTCUTS. See FIX_PROCESS.md for full details.

Known Issues
Failed Approaches (What NOT to Do)
Optimal Solutions (What TO Do)
Debugging Patterns
Testing Requirements
Emergency Procedures

Known Issues

1. v3.8.1 Remaining Issues (IN PROGRESS 🚧)

Problem:

Row 81: Nancy Kurts - → Last Name = "-" (trailing hyphen not cleaned)
Row 170: -Ling Erik Kuo → First Name = "-Ling" (leading hyphen issue)

Root Cause:

Row 81: Trailing hyphen cleanup exists (line 1149) but runs before final name parsing
- Code: textNoNicknames.replace(/\s*[-\u2013\u2014]\s*$/, '').trim()
- Hyphen gets into lastName during parsing after cleanup runs
Row 170: Leading hyphen in first name
- Original: -Ling Erik Kuo (should be Meng-Ling Erik Kuo)
- Excel formula prevention was removed from hyphen check
- Need to handle leading hyphens in name parts

Solution (v3.8.2):

TBD - See todo.md for planned approach
Will require test-first development per FIX_PROCESS.md
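
Until the planned approach is settled, one possible direction is to clean each name part after final parsing rather than before it. The sketch below is only an illustration: stripEdgeHyphens and cleanNameParts are hypothetical helpers, not functions from NameEnhanced.ts, and they cannot recover the missing "Meng" in Row 170.

```typescript
// Hypothetical helper: trims hyphens, en/em dashes, and whitespace from the
// edges of one name part while leaving internal hyphens untouched.
function stripEdgeHyphens(part: string): string {
  return part.replace(/^[-\u2013\u2014\s]+|[-\u2013\u2014\s]+$/g, '');
}

// Hypothetical helper: cleans every part and drops parts that end up empty,
// so a lone "-" never becomes a last name.
function cleanNameParts(parts: string[]): string[] {
  return parts.map(stripEdgeHyphens).filter(p => p.length > 0);
}

console.log(cleanNameParts(['Nancy', 'Kurts', '-']));
console.log(cleanNameParts(['-Ling', 'Erik', 'Kuo']));
```

Running the cleanup on the parsed parts, not on the raw text, is what would address the ordering problem described for Row 81.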

Status:

v3.8.1 marked STABLE despite these 2 issues (93% clean overall)
26 other "issues" are incomplete source data (acceptable)
Production ready for real-world use

2. Credentials Without Commas Not Removed (FIXED ✅)

Problem:

Last Name column still has credentials like "Simon MD", "Kopman DDS"
Middle initials like "S. Perrin" not being removed
v3.7.1 only removed credentials AFTER commas

Root Cause:

normalizeValue.ts only had comma-removal logic: cleaned.replace(/,.*$/, '')
No pattern matching for credentials as standalone words

Solution (v3.7.2):

Exported ALL_CREDENTIALS from NameEnhanced.ts

Added credential regex pattern to normalizeValue.ts:

const credentialPattern = new RegExp(
  `(?<![-])\\b(${ALL_CREDENTIALS.map(c => escapeRegex(c)).join('|')})(?=\\s|$|[^\\w])`,
  'gi'
);
cleaned = cleaned.replace(credentialPattern, '').trim();

Added middle initial removal: cleaned.replace(/^[A-Z]\.\s+/, '')
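
To see the three steps working together, here is a minimal self-contained sketch. SAMPLE_CREDENTIALS is a hypothetical four-item stand-in for ALL_CREDENTIALS, and escapeRegex and stripCredentials are assumed helper names, so treat this as an illustration of the technique and not as the contents of normalizeValue.ts.

```typescript
// Hypothetical subset; the real list is exported from NameEnhanced.ts.
const SAMPLE_CREDENTIALS = ['MD', 'DDS', 'CFP', 'PhD'];

// Escapes characters that have special meaning inside a regular expression.
function escapeRegex(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function stripCredentials(lastName: string): string {
  const credentialPattern = new RegExp(
    `(?<![-])\\b(${SAMPLE_CREDENTIALS.map(escapeRegex).join('|')})(?=\\s|$|[^\\w])`,
    'gi'
  );
  return lastName
    .replace(/,.*$/, '')             // credentials after a comma (v3.7.1 behavior)
    .replace(credentialPattern, '')  // standalone credentials without a comma
    .replace(/^[A-Z]\.\s+/, '')      // leading middle initial such as "S. "
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(stripCredentials('Simon MD'));
console.log(stripCredentials('S. Perrin'));
```

The negative lookbehind keeps hyphenated surnames intact when the part after the hyphen happens to spell a credential.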

Test Coverage:

✅ 18/18 tests passing in csv-column-cleaning.test.ts
✅ Credentials without commas removed
✅ Middle initials removed
✅ Credentials after commas still working

Files:

client/src/lib/NameEnhanced.ts - Exported ALL_CREDENTIALS
client/src/lib/normalizeValue.ts - Added credential pattern
tests/csv-column-cleaning.test.ts - Added 3 new tests

3. Worker Import Errors (BLOCKER 🔴)

Problem:

Worker fails to initialize with "Failed to process chunk 0" error
Vite shows: "Failed to resolve import ... from ... Does the file exist?"

Root Cause:

Worker trying to import modules that don't exist
Broken import statements left in code

Solution:

Check error message for the missing module path
Search worker file for that import
Remove the import statement
Remove any usage of that module in the code
Add TODO comment if feature is needed later
Create test to prevent regression

Example Fix:

// ❌ Before (broken)
import { LocationNormalizer } from '../../../shared/normalization/locations';
case 'location': {
  return LocationNormalizer.normalize(value);
}

// ✅ After (fixed)
// No import
case 'location': {
  // TODO: Implement location normalization
  return value;
}

Test to Add:

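import { promises as fs } from 'node:fs';
import { it, expect } from 'vitest';

const workerPath = 'client/src/workers/normalization.worker.ts'; // relative to repo root

it('should not have broken imports', async () => {
  const workerContent = await fs.readFile(workerPath, 'utf-8');
  expect(workerContent).not.toContain('import { NonExistentModule }');
});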
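
The string check above only catches one named import. A somewhat more general guard, sketched here on the assumption that the worker uses plain static import statements, is to extract every relative import path so a test can verify each one. findRelativeImports is a hypothetical helper name.

```typescript
// Returns every relative import specifier ("./x", "../y") found in a source string.
function findRelativeImports(source: string): string[] {
  const importPattern = /import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = importPattern.exec(source)) !== null) {
    // Package imports such as 'vitest' are skipped; only file paths are kept.
    if (match[1].charAt(0) === '.') {
      found.push(match[1]);
    }
  }
  return found;
}

const sampleSource = [
  "import { LocationNormalizer } from '../../../shared/normalization/locations';",
  "import { describe } from 'vitest';",
].join('\n');

console.log(findRelativeImports(sampleSource));
```

A regression test could resolve each returned path against the worker's directory and fail if the file is missing.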

4. CSV Column Cleaning (FIXED ✅)

Problem:

Input CSV already has "First Name" and "Last Name" columns with credentials/titles
Worker was only processing "Name" column, not cleaning existing columns
Credentials like "MD", "CFP" still appearing in Last Name column
Titles like "Dr." still appearing in First Name column
Pronouns like "(she/her)" not being removed

Root Cause:

Worker only handled "name" type, not "first-name" and "last-name" types
No logic to clean individual column values

Solution (VALIDATED):

Create separate normalizeValue.ts utility file
Add first-name type handler:
- Remove titles: Dr., Prof., Mr., Mrs., Ms., Miss.
- Remove middle initials: Jennifer R. → Jennifer
Add last-name type handler:
- Remove credentials after commas: Berman, MD → Berman
- Remove pronouns: Bouch (she/her) → Bouch
- Remove trailing periods
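
A rough sketch of the two handlers described above follows. The function name cleanNameColumn and the exact regular expressions are assumptions made for illustration; the real normalizeValue.ts may be structured differently.

```typescript
type NameColumnType = 'first-name' | 'last-name';

function cleanNameColumn(type: NameColumnType, value: string): string {
  let cleaned = value.trim();
  if (type === 'first-name') {
    // Remove a leading title: Dr., Prof., Mr., Mrs., Ms., Miss
    cleaned = cleaned.replace(/^(Mrs|Miss|Mr|Ms|Dr|Prof)\.?\s+/i, '');
    // Remove a trailing middle initial: "Jennifer R." -> "Jennifer"
    cleaned = cleaned.replace(/\s+[A-Z]\.?$/, '');
  } else {
    // Remove pronouns in parentheses: "Bouch (she/her)" -> "Bouch"
    cleaned = cleaned.replace(/\s*\([^)]*\/[^)]*\)/g, '');
    // Remove credentials after a comma: "Berman, MD" -> "Berman"
    cleaned = cleaned.replace(/,.*$/, '');
    // Remove trailing periods
    cleaned = cleaned.replace(/\.+$/, '');
  }
  return cleaned.trim();
}

console.log(cleanNameColumn('first-name', 'Dr. Jennifer R.'));
console.log(cleanNameColumn('last-name', 'Bouch (she/her)'));
```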

Test Coverage:

15/15 tests passing in csv-column-cleaning.test.ts
Covers titles, credentials, pronouns, complex cases

Files:

client/src/lib/normalizeValue.ts - Utility function
client/src/workers/normalization.worker.ts - Uses normalizeValue
tests/csv-column-cleaning.test.ts - Test suite

5. Module Loading in Workers (FIXED ✅)

Problem:

ALL_CREDENTIALS array imported from @shared/normalization/names returns empty [] when loaded in Web Workers
Vite bundling breaks ES module imports for worker contexts
Console shows: CREDENTIALS_SET size: 0

Root Cause:

Vite's worker bundling doesn't properly include as const arrays from shared modules
Circular dependencies or initialization order issues

Symptoms:

Credentials not being stripped from names
Empty credential arrays in console logs
isCredential() always returns false

Solution (RESEARCHED & VALIDATED):

Enterprise Pattern from theiconic/name-parser (131 stars, production-proven):

DON'T import credentials from external modules
DO hardcode credentials as constants directly in the class file
Pattern: Define data where it's consumed

Implementation:

// In NameEnhanced.ts - at the top of the file
const ALL_CREDENTIALS = [
  'MD', 'PhD', 'MBA', 'CFP', 'CPA', 'RN', 'DDS',
  // ... all 723 credentials hardcoded here
];

// Then use it directly
const CREDENTIALS_SET = new Set(ALL_CREDENTIALS);

Why This Works:

No module imports = no bundling issues
Works in all contexts (main thread, workers, tests)
Zero dependencies on external modules
Proven pattern from production libraries processing "hundreds of thousands" of names

Status: ✅ FIXED in v3.7.0 - All tests passing

6. Format Code Leaking (FIXED ✅)

Problem:

Random letters (p, m, s, q, d) appearing at beginning/end of names
Example: "p Michael m March s" instead of "Michael March"

Root Cause:

// ❌ BAD - Uses || operator
.map(c => formatMap[c] || c)

// When formatMap['p'] is empty string '', it returns 'p' (the letter)

Solution:

// ✅ GOOD - Checks undefined explicitly
.map(c => formatMap[c] !== undefined ? formatMap[c] : c)
.filter(s => s && s.trim())  // Also filter empty strings

Fixed In: v3.7.0 (staging)

Location: client/src/lib/NameEnhanced.ts line ~480
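
The difference between the two versions is easy to reproduce in isolation. In this sketch, sampleFormatMap and both function names are hypothetical; the only assumption is that some format codes map to an empty string, as described above.

```typescript
// Hypothetical format map: 'p', 'm', and 's' resolve to empty strings.
const sampleFormatMap: Record<string, string> = {
  F: 'Michael',
  L: 'March',
  p: '',
  m: '',
  s: '',
};

// Buggy version: '' is falsy, so || falls back to the code letter itself.
function formatWithOr(codes: string[]): string {
  return codes.map(c => sampleFormatMap[c] || c).join(' ');
}

// Fixed version: only unknown codes fall through, and empty results are dropped.
function formatWithUndefinedCheck(codes: string[]): string {
  return codes
    .map(c => (sampleFormatMap[c] !== undefined ? sampleFormatMap[c] : c))
    .filter(s => s && s.trim())
    .join(' ');
}

const codes = ['p', 'F', 'm', 'L', 's'];
console.log(formatWithOr(codes));             // "p Michael m March s"
console.log(formatWithUndefinedCheck(codes)); // "Michael March"
```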

7. Regex Escaping in Credential Patterns

Problem:

Regex patterns not matching credentials correctly
Word boundaries not working: \\b vs \b

Root Cause:

// ❌ BAD - Double escaping
`\\\\b(${credentials.join('|')})\\\\b`  // Results in literal "\\b"

// ✅ GOOD - Single escaping in template literal
`\\b(${credentials.join('|')})\\b`  // Results in word boundary

Solution:

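Use one escaped backslash (\\b) in template literals so the regex engine receives \b; an unescaped \b in a string literal is a backspace character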
Escape special regex chars: c.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
Make periods optional: .replace(/\\\./g, '\\.?')
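
Put together, the three rules look roughly like this. buildCredentialPattern is a hypothetical name, and the sample inputs are illustrative only.

```typescript
function buildCredentialPattern(credentials: string[]): RegExp {
  const alternatives = credentials.map(c =>
    c
      .replace(/[.*+?^${}()|[\]\\]/g, '\\$&') // escape special regex characters
      .replace(/\\\./g, '\\.?')               // make each escaped period optional
  );
  // \\b in the template literal becomes \b (word boundary) in the regex source.
  return new RegExp(`\\b(${alternatives.join('|')})\\b`, 'i');
}

const goodPattern = buildCredentialPattern(['MD', 'Ph.D.']);
// Double escaping produces a pattern that looks for a literal backslash followed by "b".
const doubleEscaped = new RegExp(`\\\\b(MD)\\\\b`, 'i');

console.log(goodPattern.test('Jennifer Berman MD'));   // true
console.log(goodPattern.test('Jane Doe PhD'));         // true
console.log(doubleEscaped.test('Jennifer Berman MD')); // false
```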

Status: ATTEMPTED but didn't solve module loading issue

8. Nested Anchor Tags

Problem:

React error: "<a> cannot contain a nested <a>"
Occurs when wrapping <a> inside <Link> component

Root Cause:

// ❌ BAD - Link already renders <a>
<Link href="/changelog">
  <a className="...">Changelog</a>
</Link>

Solution:

// ✅ GOOD - Pass className directly to Link
<Link href="/changelog" className="...">
  Changelog
</Link>

Fixed In: v3.6.2

Failed Approaches (What NOT to Do)

❌ Approach 1: Debugging Regex Escaping for Hours

What We Tried:

Spent 10+ iterations trying different regex escape patterns
Tested \\b, \\\\b, \b variations
Added console.log debugging for pattern matching

Why It Failed:

The regex wasn't the problem - the credentials array was EMPTY
Debugging symptoms instead of root cause
Wasted time on wrong problem

Lesson:

Always check data exists BEFORE debugging patterns
If array is empty, no regex will work
Use console.log to verify data FIRST

❌ Approach 2: Hardcoding Credentials (Partial Failure)

What We Tried:

Created HARDCODED_CREDENTIALS array with 150+ credentials
Bypassed module import entirely
Used local array in NameEnhanced.ts

Why It Failed:

Regex still didn't match (escaping issues)
Didn't solve root cause (module loading)
Band-aid solution, not systematic fix

Lesson:

Hardcoding works as TEMPORARY fix
Must still fix underlying module loading issue
Need enterprise-grade solution

❌ Approach 3: Multiple Changes Without Testing

What We Tried:

Fixed format() method
Changed credential regex
Updated imports
All in one commit

Why It Failed:

Couldn't isolate which change broke what
Introduced regressions
Hard to rollback specific changes

Lesson:

One change at a time
Test after EACH change
Commit working changes before next fix

❌ Approach 4: No Test Suite Before Changes

What We Tried:

Made fixes without automated tests
Relied on manual CSV uploads to verify
No regression detection

Why It Failed:

Every fix broke something else
No way to catch regressions automatically
Debugging loop never ended

Lesson:

Create tests FIRST
Tests validate fixes work
Tests catch regressions immediately

Optimal Solutions (What TO Do)

✅ Solution 1: Rollback to Last Working Version

When to Use:

Stuck in debugging loop (3+ failed attempts)
Don't know root cause
Breaking more things than fixing

How to Do It:

Check VERSION_HISTORY.md for last stable version
Use webdev_rollback_checkpoint(version_id="...")
Test that version works
Apply ONE fix at a time from there

Example:

# Rollback to v3.6.0
webdev_rollback_checkpoint(version_id="c1420db")

# Or fallback to v3.4.1
webdev_rollback_checkpoint(version_id="8c1056a")

✅ Solution 2: Create Tests Before Fixes

When to Use:

Before ANY code changes
After rolling back to stable version
When implementing new features

How to Do It:

Create test file: tests/name-normalization.test.ts
Write tests for expected behavior
Run tests - they should PASS on stable version
Make fix
Run tests - they should STILL pass

Example:

import { describe, it, expect } from 'vitest';
import { NameEnhanced } from '../client/src/lib/NameEnhanced';

describe('Credential Stripping', () => {
  it('should strip MD from last name', () => {
    const name = new NameEnhanced('Jennifer Berman MD');
    expect(name.lastName).toBe('Berman');
    expect(name.full).toBe('Jennifer Berman');
  });
  
  it('should strip CFP® from last name', () => {
    const name = new NameEnhanced('John Bell CFP®');
    expect(name.lastName).toBe('Bell');
  });
});

✅ Solution 3: Use Staging Environment

When to Use:

Testing any fixes
Before publishing to production
Experimenting with new approaches

How to Do It:

Keep production on stable version (v3.6.0)
Use dev server (port 3000) as staging
Test fixes in staging
Only publish after validation

✅ Solution 4: Research Enterprise Solutions

When to Use:

Stuck on same problem multiple times
Need proven, production-ready approach
Building critical features

How to Do It:

Search for enterprise libraries solving same problem
Study their source code on GitHub
Adopt their patterns and approaches
Don't reinvent the wheel

Examples:

libphonenumber-js for phone normalization
validator.js for email validation
Check how they handle module loading in workers

Debugging Patterns

Pattern 1: Data First, Logic Second

Always check:

Does the data exist? (console.log(array.length))
Is it the right format? (console.log(array[0]))
Is it being loaded? (console.log('Module loaded'))

Then debug:

Pattern matching (regex, etc.)
Logic flow
Edge cases
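
One way to make the data check automatic is a small guard that fails loudly when a list loads empty. assertLoaded is a hypothetical helper, shown here as a sketch and not as something that exists in the codebase.

```typescript
// Throws immediately if a data array is empty, so an empty import is caught
// before any time is spent debugging the patterns built from it.
function assertLoaded<T>(label: string, data: readonly T[]): readonly T[] {
  if (data.length === 0) {
    throw new Error(`[${label}] loaded 0 entries; check the import before debugging patterns`);
  }
  return data;
}

const sampleCredentials = assertLoaded('ALL_CREDENTIALS', ['MD', 'PhD', 'CFP']);
console.log(`[ALL_CREDENTIALS] loaded ${sampleCredentials.length} entries`);
```

Calling such a guard at module load time would have surfaced the CREDENTIALS_SET size: 0 problem on the first run.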

Pattern 2: Binary Search for Bugs

When multiple changes broke something:

Rollback all changes
Apply half the changes
Test - works or broken?
If broken, remove half again
If works, add half back
Repeat until you find the breaking change
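
The same procedure can be written as a bisection over an ordered list of changes. This sketch assumes exactly one change is responsible and that the build stays broken once that change is applied; findBreakingChange and the isBroken callback are hypothetical names.

```typescript
// Finds the first change whose inclusion makes isBroken return true.
// Invariant: the first `lo` changes are known good, the first `hi` are known broken.
function findBreakingChange<T>(
  changes: T[],
  isBroken: (applied: T[]) => boolean
): T | undefined {
  if (!isBroken(changes)) {
    return undefined; // nothing is broken with all changes applied
  }
  let lo = 0;
  let hi = changes.length;
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (isBroken(changes.slice(0, mid))) {
      hi = mid;
    } else {
      lo = mid;
    }
  }
  return changes[hi - 1];
}

const sampleChanges = ['format() fix', 'credential regex', 'import update'];
const culprit = findBreakingChange(
  sampleChanges,
  applied => applied.indexOf('credential regex') !== -1
);
console.log(culprit);
```

In practice isBroken would apply the given changes to a clean checkout and run the test suite.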

Pattern 3: Console Log Checkpoints

Add logs at key points:

console.log('[NameEnhanced] Starting parse:', originalText);
console.log('[NameEnhanced] After credential strip:', cleanedText);
console.log('[NameEnhanced] Final parts:', { firstName, lastName });

Remove logs after fix is working

Testing Requirements

Before ANY Code Changes:

✅ Read VERSION_HISTORY.md
✅ Read this DEBUGGING_GUIDE.md
✅ Read ARCHITECTURE_DECISIONS.md
✅ Create test file for the feature
✅ Run tests on current stable version (should pass)

After Making Changes:

✅ Run automated tests
✅ Test with user's CSV file manually
✅ Check console for errors
✅ Verify no regressions in other features
✅ Update documentation

Before Publishing:

✅ All tests pass
✅ Manual CSV test passes
✅ No console errors
✅ Documentation updated
✅ Checkpoint saved

Emergency Procedures

If Production is Broken:

Immediate Rollback:

webdev_rollback_checkpoint(version_id="c1420db")  # v3.6.0

Notify user of rollback
Fix in staging before re-publishing

If Stuck in Debugging Loop:

Stop - Don't make more changes
Rollback to last working version
Document what failed in this guide
Research enterprise solutions
Create tests before trying again

If Module Loading Breaks:

Don't debug - it's a known issue
Hardcode critical data as temporary fix
Research how enterprise libraries handle it
Implement proper solution from research

Quick Reference Commands

Rollback to Stable:

webdev_rollback_checkpoint(version_id="c1420db")  # v3.6.0
webdev_rollback_checkpoint(version_id="8c1056a")  # v3.4.1

Run Tests:

pnpm test

Check Logs:

# Browser console (F12)
# Look for [NameEnhanced] logs

Apply Database Migration:

pnpm db:push

Update Log

Date	Who	What Changed
2025-11-02	AI Agent	Initial creation with v3.7.0 lessons

Remember: This guide is only useful if we UPDATE it after every debugging session!

v3.13.4 - Middle Initial Removal + Location Splitting (2025-01-XX)

Problem 1: Middle Initials in Last Name

Symptom:

"James A. Simon" → Last Name: "A Simon" (should be "Simon")
"Jennifer R. Berman" → First Name: "Jennifer R." (should be "Jennifer")

Root Cause:

Single-letter "A" was in LAST_NAME_PREFIXES array (line 750) for Portuguese/Spanish names like "João a Silva"
When parsing "James A Simon", the logic treated "A" as a last name prefix and added it to lastNameParts
No filtering for single-letter middle initials

Failed Approaches:

❌ Filtering middleParts after the while loop - too late, "A" already in lastNameParts
❌ Removing "a" from LAST_NAME_PREFIXES - breaks Portuguese/Spanish name parsing

Optimal Solution:

✅ Check parts[i].length === 1 BEFORE treating as last name prefix (line 1366)
✅ Filter single-letter initials from middleParts after parsing (lines 1383-1388)

Code:

// v3.13.4: Skip single-letter initials (A, B, etc.) - they're middle initials, not last name prefixes
const isSingleLetterInitial = parts[i].length === 1;

if (!isSingleLetterInitial && LAST_NAME_PREFIXES.includes(candidate as any)) {
  lastNameParts = [parts[i], ...lastNameParts];
  middleParts = parts.slice(1, i);
}

// v3.13.4: Filter out single-letter middle initials (A., B., etc.)
middleParts = middleParts.filter(part => {
  const cleaned = part.replace(/\./g, '');
  return cleaned.length > 1;
});
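
Because the excerpt above depends on surrounding variables, here is a simplified standalone sketch of the same idea. SAMPLE_LAST_NAME_PREFIXES and splitNameSketch are hypothetical, and the loop structure is an assumption about how the real parser walks backwards from the last name.

```typescript
// Hypothetical subset of LAST_NAME_PREFIXES.
const SAMPLE_LAST_NAME_PREFIXES = ['a', 'de', 'van', 'von'];

function splitNameSketch(fullName: string): { firstName: string; lastName: string } {
  const parts = fullName.trim().split(/\s+/);
  const firstName = parts[0];
  let lastNameParts = parts.length > 1 ? [parts[parts.length - 1]] : [];
  let i = parts.length - 2;
  // Walk backwards, pulling prefixes such as "van" into the last name.
  while (i > 0) {
    const candidate = parts[i].replace(/\./g, '').toLowerCase();
    const isSingleLetterInitial = candidate.length === 1;
    // A single letter is treated as a middle initial, never as a prefix.
    if (isSingleLetterInitial || SAMPLE_LAST_NAME_PREFIXES.indexOf(candidate) === -1) {
      break;
    }
    lastNameParts = [parts[i], ...lastNameParts];
    i--;
  }
  return { firstName, lastName: lastNameParts.join(' ') };
}

console.log(splitNameSketch('James A. Simon'));
console.log(splitNameSketch('Ludwig van Beethoven'));
```

The length check runs before the prefix lookup, which is the ordering the fix relies on.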

Files Modified:

client/src/lib/NameEnhanced.ts (lines 1366, 1383-1388)

Tests:

tests/v3134-critical-fixes.test.ts - 4 tests for middle initial removal
Updated 2 old tests that expected middle initials to be kept

Problem 2: Location Splitting Not Implemented

Symptom:

Location column passed through unchanged: "Durham, North Carolina, United States"
No Personal City or Personal State columns in output
Enrichment tool requires separate city and state fields

Root Cause:

normalizeValue.ts had TODO comment for location normalization (lines 76-78)
Schema analyzer detects location columns as type 'address', not 'location'
No location parsing logic existed

Failed Approaches:

❌ Checking for colSchema.type === 'location' - schema uses 'address' type
❌ Trying to return object from normalizeValue - it only returns strings
❌ State name matching before abbreviation matching - caused "Washington" to match WA instead of DC

Optimal Solution:

✅ Created locationParser.ts with comprehensive US location parsing
✅ Added location splitting logic to contextAwareExecutor.ts (lines 94-113)
✅ Check for colSchema.type === 'address' && /location/i.test(colName)
✅ Prioritize state abbreviations over state names in parsing

Code:

// v3.13.4: Handle location splitting
const isLocationColumn = colSchema.type === 'address' && /location/i.test(colName);

if (isLocationColumn) {
  const parsed = parseLocation(value);
  
  // Remove original Location column
  delete normalized[colName];
  
  // Add Personal City and Personal State columns
  if (parsed.city) {
    normalized['Personal City'] = parsed.city;
  }
  if (parsed.state) {
    normalized['Personal State'] = parsed.state;
  }
  
  return;
}

Location Parser Features:

Handles "City, State, Country" format
Handles "City State" format
Handles "City Area" format (San Francisco Bay Area)
Converts state names to 2-letter abbreviations
Prioritizes state abbreviations over state names
Infers state from well-known city names
Removes area suffixes (Bay Area, Metropolitan Area, etc.)
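
The ordering of these checks matters, so a compact sketch may help. Everything below is illustrative: parseLocationSketch and its tiny lookup tables are hypothetical and are not the contents of locationParser.ts.

```typescript
interface ParsedLocationSketch {
  city?: string;
  state?: string;
}

// Deliberately tiny, hypothetical lookup tables.
const SAMPLE_STATE_ABBREVIATIONS = ['DC', 'CA', 'NC', 'WA'];
const SAMPLE_STATE_NAMES: Record<string, string> = {
  'north carolina': 'NC',
  california: 'CA',
  washington: 'WA',
};
const SAMPLE_KNOWN_CITIES: Record<string, string> = { 'san francisco': 'CA' };

function parseLocationSketch(raw: string): ParsedLocationSketch {
  // Remove area suffixes such as "Bay Area" or "Metropolitan Area".
  const text = raw.replace(/\s+(?:(?:Bay|Metropolitan|Metro)\s+)?Area$/i, '').trim();

  // "City, State, Country" format.
  const segments = text.split(',').map(s => s.trim()).filter(s => s.length > 0);
  if (segments.length >= 2) {
    const upper = segments[1].toUpperCase();
    if (SAMPLE_STATE_ABBREVIATIONS.indexOf(upper) !== -1) {
      return { city: segments[0], state: upper };
    }
    const byName = SAMPLE_STATE_NAMES[segments[1].toLowerCase()];
    return byName ? { city: segments[0], state: byName } : { city: segments[0] };
  }

  // Abbreviations first, so "Washington DC-..." resolves to DC and not WA.
  const abbr = text.match(/^(.+?)[\s,]+([A-Z]{2})\b/);
  if (abbr && SAMPLE_STATE_ABBREVIATIONS.indexOf(abbr[2]) !== -1) {
    return { city: abbr[1].trim(), state: abbr[2] };
  }

  // Then full state names at the end of a "City State" string.
  const lower = text.toLowerCase();
  for (const name of Object.keys(SAMPLE_STATE_NAMES)) {
    if (lower.slice(-(name.length + 1)) === ' ' + name) {
      return {
        city: text.slice(0, text.length - name.length).trim(),
        state: SAMPLE_STATE_NAMES[name],
      };
    }
  }

  // Finally, infer the state from a well-known city name.
  const inferred = SAMPLE_KNOWN_CITIES[lower];
  return inferred ? { city: text, state: inferred } : { city: text };
}

console.log(parseLocationSketch('Durham, North Carolina, United States'));
console.log(parseLocationSketch('San Francisco Bay Area'));
console.log(parseLocationSketch('Washington DC-Baltimore Area'));
```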

Files Modified:

client/src/lib/locationParser.ts (NEW FILE)
client/src/lib/contextAwareExecutor.ts (lines 15, 94-113)

Tests:

tests/v3134-critical-fixes.test.ts - 2 tests for location splitting
Covers edge cases: "Washington DC-Baltimore Area", "San Francisco Bay Area"

Problem 3: Full Name Column Appearing in Output

Symptom:

Full Name column sometimes appearing in output despite v3.10.0 deletion logic
User reported seeing "Name" column in normalized results

Root Cause:

FALSE ALARM - v3.10.0 logic was working correctly
User's input CSV had all three columns: Name, First Name, Last Name
Context-aware processor correctly removes Name column and keeps First/Last
Issue was confusion about what columns were in the input vs output

Optimal Solution:

✅ No code changes needed - v3.10.0 logic is correct
✅ Added tests to verify Full Name column removal works in all scenarios

Tests:

tests/v3134-critical-fixes.test.ts - 3 tests for Full Name column removal
Covers: single name column, multiple name columns, column mapping scenarios

Debugging Patterns Learned

Always check schema type assignment:
- Use analyzeSchema(headers) to see what type is assigned
- Schema analyzer may use different type names than expected
- Example: "Location" columns get type 'address', not 'location'
Prioritize specific patterns over general patterns:
- State abbreviations (DC, CA) should be checked before state names (District of Columbia, California)
- Prevents ambiguous matches like "Washington" matching WA instead of DC
Single-letter handling requires special cases:
- Single letters can be initials OR last name prefixes (Portuguese "a", "e")
- Check length before applying general rules
- Filter after parsing to remove unwanted single letters
Test with real user data:
- User-provided CSV revealed edge cases not covered by unit tests
- "San Francisco Bay Area" format wasn't in original test suite
- "Washington DC-Baltimore Area" required special handling

Testing Requirements for v3.13.4

Required Test Coverage:

Middle initial removal:
- "James A. Simon" → First: "James", Last: "Simon"
- "Jennifer R. Berman, MD" → First: "Jennifer", Last: "Berman"
- Single-letter last names still work: "James A" → First: "James", Last: "A"
Location splitting:
- "Durham, North Carolina, United States" → City: "Durham", State: "NC"
- "San Francisco Bay Area" → City: "San Francisco", State: "CA"
- "Washington DC-Baltimore Area" → City: "Washington", State: "DC"
Full Name column removal:
- Name column NOT in output when processing name data
- First Name and Last Name columns ARE in output
- Works with multiple name columns in input

Test Files:

tests/v3134-critical-fixes.test.ts - 11 comprehensive tests
All 139 tests must pass before checkpoint
