
Commit 960da8b

Add public StringUtilities.getChars(String) for fast char[] access
New public API returns a ThreadLocal char[] populated via String.getChars() (SIMD bulk copy). Callers replace str.charAt(i) with buf[i] to avoid per-character method call and JDK 9+ coder check overhead.

hashCodeIgnoreCase now uses this public method. Any hot loop across java-util, json-io, or downstream projects can benefit with:

    char[] buf = StringUtilities.getChars(str);
    for (int i = 0; i < str.length(); i++) { buf[i] ... }

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 94c27c2 commit 960da8b

2 files changed

Lines changed: 28 additions & 7 deletions


changelog.md

Lines changed: 2 additions & 1 deletion
@@ -1,7 +1,8 @@
 ### Revision History

 #### 4.99.0 (Unreleased)
-* **PERFORMANCE**: `StringUtilities.hashCodeIgnoreCase(String)` — uses `String.getChars()` (SIMD-optimized bulk copy) into a ThreadLocal `char[]` buffer, then hashes from the array directly. Avoids `charAt()`'s per-character method call and JDK 9+ compact-string coder check overhead. No reflection, no VarHandle, no `--add-opens` — works on all JDK versions (8-25+). Benchmark shows CaseInsensitiveMap GET improved **70-75%** (230 → 58-69 ns/op), PUT improved **48-52%** (135 → 65-71 ns/op), and MIXED-CASE GET improved **68%** (302 → 93-97 ns/op) on 100K entries.
+* **FEATURE**: New `StringUtilities.getChars(String s)` — public API that returns a ThreadLocal `char[]` buffer populated via `String.getChars()` (SIMD-optimized bulk copy). Callers can replace `str.charAt(i)` loops with direct `buf[i]` array access, avoiding per-character method call and JDK 9+ coder check overhead. Use `s.length()` for the valid range. Buffer is shared per-thread — valid until the next `getChars()` call on the same thread.
+* **PERFORMANCE**: `StringUtilities.hashCodeIgnoreCase(String)` — uses `StringUtilities.getChars()` (SIMD-optimized bulk copy) into a ThreadLocal `char[]` buffer, then hashes from the array directly. Avoids `charAt()`'s per-character method call and JDK 9+ compact-string coder check overhead. No reflection, no VarHandle, no `--add-opens` — works on all JDK versions (8-25+). Benchmark shows CaseInsensitiveMap GET improved **70-75%** (230 → 58-69 ns/op), PUT improved **48-52%** (135 → 65-71 ns/op), and MIXED-CASE GET improved **68%** (302 → 93-97 ns/op) on 100K entries.
 * **PERFORMANCE**: New `FastReader.readLine(char[] dest, int off, int maxLen)` — dedicated line-reading method optimized for TOON's line-oriented parsing. Combines scanning, copying, and line-ending consumption (`\n`, `\r`, `\r\n`) into a single call. Uses a `c <= '\r'` range guard so printable characters (the vast majority) require only one comparison per character instead of two. Eliminates the per-line overhead of separate `readUntil()` + `read()` + pushback round-trip. JFR shows TOON line-reading samples dropped from 173 to 125 (28% reduction), and `FastReader.read()` calls halved (53 → 25 samples).
 * **PERFORMANCE**: `FastReader.readUntil()` pushback drain loop now uses a local variable for `pushbackPosition` instead of repeated member field access, avoiding load/store through `this` on each iteration. JFR shows 14.8% reduction in aggregate FastReader CPU share.
 * **PERFORMANCE**: `FastReader.readUntil()` replaced `Math.min()` call with inline ternary in the tight buffer-scan loop, eliminating method call overhead. JFR confirmed 3.5% wall-clock improvement.
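
The calling pattern described in the new FEATURE entry can be sketched as follows. This is a minimal, self-contained stand-in rather than the library's source: the class name `GetCharsSketch`, the helper `countDigits`, the initial buffer size of 256, and the doubling growth policy are all assumptions made for illustration. Only the `getChars(String)` shape and the "use `s.length()` as the valid range" rule come from the changelog.

```java
/** Illustrative sketch only; not the java-util implementation. */
final class GetCharsSketch {
    // Assumption: initial capacity and growth policy are invented for this sketch.
    private static final ThreadLocal<char[]> TL_CHAR_BUF =
            ThreadLocal.withInitial(() -> new char[256]);

    /** Copies the string into a per-thread buffer and returns that buffer. */
    static char[] getChars(String s) {
        int n = s.length();
        char[] buf = TL_CHAR_BUF.get();
        if (n > buf.length) {
            // Grow the per-thread buffer so the whole string fits.
            buf = new char[Math.max(n, buf.length * 2)];
            TL_CHAR_BUF.set(buf);
        }
        s.getChars(0, n, buf, 0); // one bulk copy instead of n charAt() calls
        return buf;
    }

    /** Hypothetical hot loop: counts ASCII digits via buf[i] rather than s.charAt(i). */
    static int countDigits(String s) {
        final int n = s.length();       // valid range is [0, n); the buffer may be longer
        final char[] buf = getChars(s);
        int count = 0;
        for (int i = 0; i < n; i++) {
            char c = buf[i];
            if (c >= '0' && c <= '9') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDigits("abc123def45")); // prints 5
    }
}
```

The loop bound is taken from the string, not from `buf.length`, because characters past `s.length()` are leftovers from earlier calls.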

src/main/java/com/cedarsoftware/util/StringUtilities.java

Lines changed: 26 additions & 6 deletions
@@ -974,11 +974,7 @@ public static int hashCodeIgnoreCase(String s) {
         if (s == null) return 0;

         final int n = s.length();
-        // Bulk-copy chars via getChars (SIMD-optimized), then hash from the array.
-        // Avoids charAt()'s per-character method call and JDK 9+ coder check overhead.
-        // Uses no reflection, no VarHandle — works on all JDK versions.
-        char[] buf = getCharBuf(n);
-        s.getChars(0, n, buf, 0);
+        char[] buf = getChars(s);

         int h = 0;
         for (int i = 0; i < n; i++) {
@@ -996,7 +992,31 @@ public static int hashCodeIgnoreCase(String s) {
         return h;
     }

-    /** Get a reusable char buffer from ThreadLocal, growing if needed. */
+    /**
+     * Returns a ThreadLocal {@code char[]} buffer populated with the string's characters
+     * via {@link String#getChars(int, int, char[], int)} (SIMD-optimized bulk copy).
+     * <p>
+     * Use {@code s.length()} for the valid range — the returned buffer may be larger.
+     * <p>
+     * This avoids {@code charAt()}'s per-character method call and JDK 9+ compact-string
+     * coder check overhead. Callers can loop over the returned array with direct array access
+     * instead of {@code str.charAt(i)}.
+     * <p>
+     * <b>Important:</b> The returned array is a shared ThreadLocal buffer. It is valid only
+     * until the next call to {@code getChars()} on the same thread. Do not store the reference
+     * beyond the immediate scope.
+     *
+     * @param s the string whose characters to extract (must not be null)
+     * @return a char[] containing the string's characters starting at index 0
+     */
+    public static char[] getChars(String s) {
+        int n = s.length();
+        char[] buf = getCharBuf(n);
+        s.getChars(0, n, buf, 0);
+        return buf;
+    }
+
+    /** Internal: get a reusable char buffer from ThreadLocal, growing if needed. */
     private static char[] getCharBuf(int minSize) {
         char[] buf = TL_CHAR_BUF.get();
         if (minSize > buf.length) {
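
The Javadoc's "valid only until the next call" warning is easy to overlook, so here is a small sketch of the hazard and one way around it. It is written under stated assumptions and is not taken from the commit: `SharedBufferSketch`, `sameBuffer`, and `snapshot` are hypothetical names, and the 64-element initial buffer and doubling growth are invented because the diff truncates before the body of `getCharBuf`.

```java
import java.util.Arrays;

/** Illustrative sketch only; buffer sizing is an assumption, not the library's policy. */
final class SharedBufferSketch {
    private static final ThreadLocal<char[]> TL_CHAR_BUF =
            ThreadLocal.withInitial(() -> new char[64]);

    /** Same shape as the public method in the diff: bulk copy into a per-thread buffer. */
    static char[] getChars(String s) {
        int n = s.length();
        char[] buf = TL_CHAR_BUF.get();
        if (n > buf.length) {
            buf = new char[Math.max(n, buf.length * 2)];
            TL_CHAR_BUF.set(buf);
        }
        s.getChars(0, n, buf, 0);
        return buf;
    }

    /** Returns true when two back-to-back calls hand out the very same array. */
    static boolean sameBuffer(String a, String b) {
        char[] first = getChars(a);
        char[] second = getChars(b); // overwrites what 'first' points at
        return first == second;
    }

    /** Hypothetical helper: takes a private copy when the chars must outlive the next call. */
    static char[] snapshot(String s) {
        return Arrays.copyOf(getChars(s), s.length());
    }

    public static void main(String[] args) {
        char[] first = getChars("abc");
        getChars("xyz");
        System.out.println(first[0]);   // prints x: the earlier contents were overwritten

        char[] kept = snapshot("abc");
        getChars("xyz");
        System.out.println(kept[0]);    // prints a: the copy is unaffected
    }
}
```

The same hazard applies to nested use: if the body of a `buf[i]` loop calls another method that itself calls `getChars()` on the same thread, the outer loop ends up reading the inner string's characters. Copying costs an allocation, so it only makes sense when the characters really must be retained.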
