fix(dataextractor): RFC 4180 quote fields containing delimiter, quote, or newline#3371
fix(dataextractor): RFC 4180 quote fields containing delimiter, quote, or newline#3371ryanmelt-agent wants to merge 4 commits into
Conversation
…, or newline DataExtractor previously joined values with the active delimiter without quoting, so a string telemetry value that contained a comma (or quote, CR, LF) corrupted CSV output. Add a quoteField helper applied at the header and row join points; drop the manual "[..]" pre-quoting on arrays so the same helper handles both CSV and tab-delimited modes. Fixes OpenC3#858 Co-Authored-By: Paperclip <noreply@paperclip.ing>
Numbers, booleans, etc. can never contain the delimiter, quote, CR, or LF, so they don't need escaping. Return them untouched and let Array.prototype.join stringify them; this avoids a String conversion and four .includes() scans per numeric cell. Arrays are pre-stringified at the call site, so they still take the normal string path. Co-Authored-By: Paperclip <noreply@paperclip.ing>
|
Added a perf follow-up in Re-ran the same Node verification expanded to 22 cases (including mixed numeric/boolean/NaN/Infinity rows, sparse rows, TSV passthrough). All pass; lint clean. |
Split quoteField into a primitive fast-path (number/boolean/bigint return untouched) and a defensive object path (any other non-string value is coerced via String() before the RFC 4180 scan). Strings continue to skip the redundant String() call. Makes the helper safe against future regressions where a non-pre-stringified object reaches the join, while keeping numeric cells free of scans. Co-Authored-By: Paperclip <noreply@paperclip.ing>
|
Tiered the helper in
23 Node assertions pass, including |
…781) Addresses Sonar javascript:S7781. The literal-string overload is clearer than the regex form and avoids constructing a RegExp. Co-Authored-By: Paperclip <noreply@paperclip.ing>
|
Addressed Sonar -return '"' + s.replace(/"/g, '""') + '"'
+return '"' + s.replaceAll('"', '""') + '"'Same semantics (escape every |
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3371 +/- ##
==========================================
+ Coverage 78.09% 78.81% +0.71%
==========================================
Files 480 687 +207
Lines 35532 57265 +21733
Branches 728 728
==========================================
+ Hits 27750 45133 +17383
- Misses 7704 12054 +4350
Partials 78 78
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|



Summary
Fixes #858.
DataExtractorpreviously built each output row asrow.join(this.delimiter)with no quoting. A string telemetry value that contained the active delimiter (typically,) — or",CR,LF— would corrupt the CSV output. Array values were pre-wrapped as"[…]"by hand, which only happened to be correct in CSV mode and produced a stray quote in tab-delimited mode.This change introduces an RFC 4180
quoteFieldhelper and applies it consistently at the header and row join points, so both CSV and TSV output stay parseable when values contain the delimiter, embedded quotes, or newlines.Changes
quoteField(value)method that returns the value as-is unless it contains the currentdelimiter,",\r, or\n, in which case it wraps in double quotes and doubles any embedded double quotes (per RFC 4180).null/undefinedbecome empty strings.quoteFieldto each header beforejoin.quoteFieldto each row cell beforejoin.'"[' + arr + ']"'wrapping for array values. With the helper in place, arrays produce[a,b,c]raw, and the join-time helper quotes correctly for whatever delimiter is active.Verification
pnpm lint --max-warnings 0clean inopenc3-cosmos-tool-dataextractor.", embedded\n, array brackets, plain int/string passthrough,null/undefined→"",NaN/Infinitypassthrough, and TSV mode (no quoting on,, quoting on\t, array brackets pass through). All pass.playwright/tests/data-extractor.p.spec.ts:TARGET,PACKET,TEMP1,TEMP2,% TIME,TARGET,PACKET,Q1,Q2,INST HEALTH_STATUS TEMP1, andINST,LATEST,HTMLall produce identical bytes (no commas/quotes in those header/value strings → passthrough).Test plan
pnpm lintinopenc3-cosmos-init/plugins/packages/openc3-cosmos-tool-dataextractor.playwright/tests/data-extractor.p.spec.tssuite against this branch and confirms no regression.csv.DictReader).