fs: dispatch ASCII and Latin1 via simdutf in ReadFileUtf8 #63370

Open
mertcanaltin wants to merge 1 commit into nodejs:main from mertcanaltin:mert/readfileutf8-zero-copy-fast-path
@mertcanaltin
Member

In fs.readFileSync(path, 'utf8'), I dispatch V8 string creation through simdutf based on the content. ASCII input, and UTF-8 input whose code points all fit in Latin1, is created as a one-byte V8 string; other multibyte UTF-8 goes through simdutf to UTF-16; invalid UTF-8 falls back to V8.
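
For readers who want to see the four-way dispatch spelled out, here is a minimal scalar sketch of the classification step. It is not the code in this PR and does not use simdutf: `Utf8Class`, `ClassifyUtf8`, and `ClassifyUtf8Examples` are hypothetical names introduced for illustration, and a real implementation would presumably rely on simdutf's vectorized routines instead of a byte-by-byte loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical names: nothing here is taken from the PR or from simdutf.
enum class Utf8Class { kAscii, kLatin1, kMultibyte, kInvalid };

// Scalar stand-in for the SIMD checks: walks the input once, validates the
// UTF-8 encoding, and tracks the largest code point seen.
inline Utf8Class ClassifyUtf8(std::string_view in) {
  const auto* p = reinterpret_cast<const unsigned char*>(in.data());
  const size_t n = in.size();
  uint32_t max_cp = 0;
  size_t i = 0;
  while (i < n) {
    const unsigned char b = p[i];
    uint32_t cp = 0;      // decoded code point
    uint32_t min_cp = 0;  // smallest code point allowed for this length
    size_t len = 0;       // sequence length in bytes
    if (b < 0x80) {
      cp = b; len = 1; min_cp = 0;
    } else if ((b & 0xE0) == 0xC0) {
      cp = b & 0x1F; len = 2; min_cp = 0x80;
    } else if ((b & 0xF0) == 0xE0) {
      cp = b & 0x0F; len = 3; min_cp = 0x800;
    } else if ((b & 0xF8) == 0xF0) {
      cp = b & 0x07; len = 4; min_cp = 0x10000;
    } else {
      return Utf8Class::kInvalid;  // stray continuation or invalid lead byte
    }
    if (len > n - i) return Utf8Class::kInvalid;  // truncated sequence
    for (size_t k = 1; k < len; ++k) {
      const unsigned char c = p[i + k];
      if ((c & 0xC0) != 0x80) return Utf8Class::kInvalid;
      cp = (cp << 6) | (c & 0x3F);
    }
    // Reject overlong encodings, surrogates, and values beyond U+10FFFF.
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return Utf8Class::kInvalid;
    }
    if (cp > max_cp) max_cp = cp;
    i += len;
  }
  if (max_cp < 0x80) return Utf8Class::kAscii;     // one-byte string, no transcoding
  if (max_cp <= 0xFF) return Utf8Class::kLatin1;   // one-byte string after narrowing
  return Utf8Class::kMultibyte;                    // needs a two-byte (UTF-16) string
}

// Small usage example.
inline void ClassifyUtf8Examples() {
  assert(ClassifyUtf8("hello") == Utf8Class::kAscii);
  assert(ClassifyUtf8("caf\xC3\xA9") == Utf8Class::kLatin1);      // U+00E9
  assert(ClassifyUtf8("\xE2\x82\xAC") == Utf8Class::kMultibyte);  // U+20AC
}
```

The point of the split is that V8 one-byte strings can hold any code point up to U+00FF, so the first two classes avoid a UTF-16 buffer entirely; only the third class needs two bytes per code unit.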

@nodejs/fs @anonrig @addaleax @lemire @mcollina

Bench results (gist): https://gist.github.com/mertcanaltin/a0096c3fad387d0bace821938754af44

fs/readfile-utf8-fastpath.js
    latin1     path 4MB     +803%
    latin1     path 256KB   +725%
    utf8_mixed path 4MB     +453%
    utf8_mixed path 256KB   +358%
    latin1     path 16KB    +309%
    latin1     fd   4MB     +238%

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/performance

@nodejs-github-bot added the c++, fs, and needs-ci labels May 16, 2026
@codecov

codecov Bot commented May 17, 2026

Codecov Report

❌ Patch coverage is 57.14286% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.06%. Comparing base (9de9b9f) to head (deaa5e1).
⚠️ Report is 28 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/util.cc | 57.14% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63370      +/-   ##
==========================================
- Coverage   90.07%   90.06%   -0.02%     
==========================================
  Files         714      714              
  Lines      225564   225740     +176     
  Branches    42656    42718      +62     
==========================================
+ Hits       203177   203306     +129     
- Misses      14189    14221      +32     
- Partials     8198     8213      +15     

| Files with missing lines | Coverage Δ |
|---|---|
| src/util-inl.h | 83.38% <ø> (+0.44%) ⬆️ |
| src/util.h | 90.98% <ø> (ø) |
| src/util.cc | 86.95% <57.14%> (-0.49%) ⬇️ |

... and 65 files with indirect coverage changes

Member

@addaleax addaleax left a comment

Why would we make this change specifically for fs and not as part of the general ToV8Value() conversion or StringBytes?

Signed-off-by: Mert Can Altin <mertgold60@gmail.com>
@mertcanaltin force-pushed the mert/readfileutf8-zero-copy-fast-path branch from cdeaa76 to deaa5e1 on May 18, 2026 18:56
@mertcanaltin
Member Author

> Why would we make this change specifically for fs and not as part of the general ToV8Value() conversion or StringBytes?

Sure, I've applied that now.

There is one regression, which I'm working on today:
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=16384 *** -9.65 % ±2.96% ±3.94% ±5.13%

New benchmark results:

➜  node git:(mert/readfileutf8-zero-copy-fast-path) ✗ node-benchmark-compare ./result.csv
                                                                                    confidence improvement accuracy (*)    (**)   (***)
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='ascii' size=1024                           0.11 %       ±4.46%  ±5.99%  ±7.89%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='ascii' size=16384                          2.08 %       ±5.10%  ±6.80%  ±8.88%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='ascii' size=262144                        -3.75 %       ±3.75%  ±5.00%  ±6.50%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='ascii' size=4194304               ***      4.98 %       ±2.36%  ±3.14%  ±4.08%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='ascii' size=64                             0.11 %       ±3.45%  ±4.60%  ±6.00%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='latin1' size=1024                          1.61 %       ±4.25%  ±5.68%  ±7.45%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='latin1' size=16384                 **      6.43 %       ±3.72%  ±4.96%  ±6.46%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='latin1' size=262144               ***     27.97 %       ±4.38%  ±5.84%  ±7.63%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='latin1' size=4194304              ***    221.38 %       ±4.45%  ±5.97%  ±7.87%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='latin1' size=64                            1.45 %       ±2.80%  ±3.73%  ±4.85%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=1024                      3.00 %       ±5.51%  ±7.37%  ±9.67%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=16384            ***     -9.65 %       ±2.96%  ±3.94%  ±5.13%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=262144           ***     17.65 %       ±3.20%  ±4.26%  ±5.55%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=4194304          ***    126.65 %       ±3.25%  ±4.34%  ±5.68%
fs/readfile-utf8-fastpath.js n=3000 source='fd' content='utf8_mixed' size=64                        0.33 %       ±4.22%  ±5.64%  ±7.38%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='ascii' size=1024                         2.40 %       ±2.93%  ±3.95%  ±5.23%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='ascii' size=16384                        0.24 %       ±4.38%  ±5.83%  ±7.60%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='ascii' size=262144              ***     10.85 %       ±1.06%  ±1.42%  ±1.86%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='ascii' size=4194304             ***     27.92 %       ±3.49%  ±4.68%  ±6.18%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='ascii' size=64                           0.79 %       ±2.69%  ±3.62%  ±4.79%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='latin1' size=1024               ***     29.48 %       ±3.18%  ±4.27%  ±5.65%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='latin1' size=16384              ***    264.74 %       ±5.63%  ±7.57% ±10.00%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='latin1' size=262144             ***    575.46 %       ±7.93% ±10.65% ±14.06%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='latin1' size=4194304            ***    823.55 %       ±4.83%  ±6.46%  ±8.47%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='latin1' size=64                          3.02 %       ±3.61%  ±4.86%  ±6.44%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='utf8_mixed' size=1024           ***     18.36 %       ±3.05%  ±4.11%  ±5.43%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='utf8_mixed' size=16384          ***    136.45 %       ±2.53%  ±3.37%  ±4.38%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='utf8_mixed' size=262144         ***    260.71 %       ±4.50%  ±5.99%  ±7.80%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='utf8_mixed' size=4194304        ***    330.51 %       ±2.51%  ±3.38%  ±4.48%
fs/readfile-utf8-fastpath.js n=3000 source='path' content='utf8_mixed' size=64                      3.01 %       ±3.61%  ±4.86%  ±6.43%

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 30 comparisons, you can thus expect the following amount of false-positive results:
  1.50 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.30 false positives, when considering a   1% risk acceptance (**, ***),
  0.03 false positives, when considering a 0.1% risk acceptance (***)
➜  node git:(mert/readfileutf8-zero-copy-fast-path) ✗
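
The false-positive figures at the end of the report are simple arithmetic: the number of comparisons multiplied by the accepted risk level. A minimal sketch that reproduces the three numbers above, assuming the comparisons are treated as independent (`ExpectedFalsePositives` and `CheckAgainstReport` are hypothetical helpers, not part of node-benchmark-compare):

```cpp
#include <cassert>
#include <cmath>

// Expected number of false positives across `comparisons` tests, each run at
// risk acceptance `alpha` (0.05 for 5%, 0.01 for 1%, 0.001 for 0.1%).
inline double ExpectedFalsePositives(int comparisons, double alpha) {
  return comparisons * alpha;
}

// Reproduces the figures printed for the 30 comparisons in the report.
inline void CheckAgainstReport() {
  assert(std::fabs(ExpectedFalsePositives(30, 0.05) - 1.50) < 1e-9);
  assert(std::fabs(ExpectedFalsePositives(30, 0.01) - 0.30) < 1e-9);
  assert(std::fabs(ExpectedFalsePositives(30, 0.001) - 0.03) < 1e-9);
}
```

In practice this means that one or two of the single-star rows could be noise, while the three-star rows are unlikely to be.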

@mertcanaltin
Member Author

mertcanaltin commented May 18, 2026

The -9.65 % regression on fd utf8_mixed 16 KiB is addressed in #63385, which collapses the StringBytes UTF-16 path to a single simdutf pass for buffers ≤ 1 MiB. Together, the two PRs remove the regression. @addaleax FYI.

@addaleax added the author ready and request-ci labels May 19, 2026
@github-actions removed the request-ci label May 19, 2026