@@ -61,23 +61,30 @@ it may have been freed if you were its last user.
  API
 *****

-The string API is defined in ``Zend/zend_string.h``. It provides a number of functions for creating
-new strings.
+The string API is defined in ``Zend/zend_string.h``. It contains creation, resizing, comparison,
+hashing, and interning helpers.

-.. list-table:: ``zend_string`` creation
+.. list-table:: Creation and allocation APIs
    :header-rows: 1

    - - Function/Macro [#persistent]_
      - Description

-   - - ``ZSTR_INIT(s, p)``
-     - Creates a new string from a string literal.
+   - - ``ZSTR_INIT_LITERAL(s, p)``
+     - Creates a new string from a C string literal. This is a convenience wrapper around
+       ``zend_string_init`` that computes the literal length at compile time.

    - - ``zend_string_init(s, l, p)``
      - Creates a new string from a character buffer.

+   - - ``zend_string_init_fast(s, l)``
+     - Fast-path initializer that may return shared immutable strings for lengths 0 and 1.
+
    - - ``zend_string_alloc(l, p)``
-     - Creates a new string of a given length without initializing its content.
+     - Allocates a new string of length ``l`` without initializing its contents.
+
+   - - ``zend_string_safe_alloc(n, m, l, p)``
+     - Allocates ``n * m + l`` bytes of payload with overflow checks.

    - - ``zend_string_concat2(s1, l1, s2, l2)``
      - Creates a non-persistent string by concatenating two character buffers.
@@ -86,20 +93,53 @@ new strings.
      - Same as ``zend_string_concat2``, but for three character buffers.

    - - ``ZSTR_EMPTY_ALLOC()``
-     - Gets an immutable, empty string. This does not allocate memory.
+     - Returns an immutable, empty string. This does not allocate memory.

-   - - ``ZSTR_CHAR(char)``
-     - Gets an immutable, single-character string. This does not allocate memory.
+   - - ``ZSTR_CHAR(c)``
+     - Returns an immutable, single-character string. This does not allocate memory.

    - - ``ZSTR_KNOWN(ZEND_STR_const)``

      - Gets an immutable, predefined string. Used for strings common within PHP itself, e.g.
-       ``"class"``. See ``ZEND_KNOWN_STRINGS`` in ``Zend/zend_string.h``. This does not allocate
-       memory.
+       ``"class"``. See ``ZEND_KNOWN_STRINGS`` in ``Zend/zend_string.h``. This does not allocate memory.
+
+   - - ``ZSTR_MAX_OVERHEAD``
+     - Maximum allocator/header overhead used by ``zend_string``.
+
+   - - ``ZSTR_MAX_LEN``
+     - Maximum representable payload length for a ``zend_string``.
+
+.. list-table:: Resizing and copy-on-write APIs
+   :header-rows: 1
+
+   - - Function/Macro [#persistent]_
+     - Description
+
+   - - ``zend_string_realloc(s, l, p)``
+     - Resizes a string to length ``l``. May return a different pointer. If shared/interned, a new
+       allocation is created.
+
+   - - ``zend_string_safe_realloc(s, n, m, l, p)``
+     - Resizes a string to ``n * m + l`` bytes with overflow checks.
+
+   - - ``zend_string_extend(s, l, p)``
+     - Extends a string to a larger length (``l >= ZSTR_LEN(s)``).
+
+   - - ``zend_string_truncate(s, l, p)``
+     - Shrinks a string to a smaller length (``l <= ZSTR_LEN(s)``).
+
+   - - ``zend_string_dup(s, p)``
+     - Creates a real copy in a new allocation, except for interned strings (which are returned
+       unchanged).
+
+   - - ``zend_string_separate(s, p)``
+     - Ensures unique ownership (copy-on-write): duplicates if shared or interned, otherwise
+       reuses ``s`` after resetting cached metadata.

 .. [#persistent]

-   ``s`` = ``zend_string``, ``l`` = ``length``, ``p`` = ``persistent``.
+   ``s`` = ``zend_string``, ``l`` = ``length``, ``p`` = ``persistent``,
+   ``n * m + l`` = checked size expression used for safe allocation/reallocation.

 As per php-src fashion, you are not supposed to access the ``zend_string`` fields directly. Instead,
 use the following macros. There are macros for both ``zend_string`` and ``zvals`` known to contain
@@ -122,43 +162,195 @@ strings.

    - - ``ZSTR_HASH``
      - ``Z_STRHASH[_P]``
-     - Computes the string has if it hasn't already been, and returns it.
+     - Computes the string hash if it hasn't already been computed, and returns it.

    - - ``ZSTR_H``
      - \-
      - Returns the string hash. This macro assumes that the hash has already been computed.

-.. list-table:: Reference counting macros
+.. list-table:: String property macros
    :header-rows: 1

    - - Macro
      - Description

+   - - ``ZSTR_IS_INTERNED(s)``
+     - Checks whether a string is interned.
+
+   - - ``ZSTR_IS_VALID_UTF8(s)``
+     - Checks whether a string has the ``IS_STR_VALID_UTF8`` flag set.
+
+.. list-table:: Reference counting and lifetime APIs
+   :header-rows: 1
+
+   - - Function/Macro [#persistent]_
+     - Description
+
    - - ``zend_string_copy(s)``
      - Increases the reference count and returns the same string. The reference count is not
        increased if the string is interned.

+   - - ``zend_string_refcount(s)``
+     - Returns the reference count. Interned strings always report ``1``.
+
+   - - ``zend_string_addref(s)``
+     - Increments the reference count of a non-interned string.
+
+   - - ``zend_string_delref(s)``
+     - Decrements the reference count of a non-interned string.
+
    - - ``zend_string_release(s)``
      - Decreases the reference count and frees the string if it goes to 0.

-   - - ``zend_string_dup(s, p)``
-     - Creates a true copy of the string in a new allocation, except if the string is interned.
+   - - ``zend_string_release_ex(s, p)``
+     - Like ``zend_string_release()``, but chooses the deallocator from ``p``. Use only if the
+       persistence mode is known by the caller.

-   - - ``zend_string_separate(s)``
-     - Duplicates the string if the reference count is greater than 1. See
-       :doc:`./reference-counting` for details.
+   - - ``zend_string_free(s)``
+     - Frees a non-interned string directly. The caller must ensure it is no longer shared.

-   - - ``zend_string_realloc(s, l, p)``
+   - - ``zend_string_efree(s)``
+     - Frees a non-persistent, non-interned string with ``efree``.
+
+There are various functions to compare strings.
+
+.. list-table:: Comparison APIs
+   :header-rows: 1
+
+   - - Function/Macro
+     - Description
+
+   - - ``zend_string_equals(s1, s2)``
+     - Full equality check; returns early when both pointers are identical.
+
+   - - ``zend_string_equal_content(s1, s2)``
+     - Compares length and payload bytes, without the pointer-identity shortcut.
+
+   - - ``zend_string_equal_val(s1, s2)``
+     - Compares only the string payload bytes (caller must ensure equal lengths).
+
+   - - ``zend_string_equals_cstr(s1, s2, l2)``
+     - Compares a ``zend_string`` with a ``char*`` buffer and explicit length.
+
+   - - ``zend_string_equals_ci(s1, s2)``
+     - Case-insensitive full equality check.
+
+   - - ``zend_string_equals_literal(str, literal)``
+     - Equality check against a string literal with compile-time literal length.
+
+   - - ``zend_string_equals_literal_ci(str, literal)``
+     - Case-insensitive literal equality check.
+
+   - - ``zend_string_starts_with(str, prefix)``
+     - Checks whether ``str`` begins with ``prefix``.
+
+   - - ``zend_string_starts_with_cstr(str, prefix, prefix_length)``
+     - Prefix check against a ``char*`` buffer and explicit length.
+
+   - - ``zend_string_starts_with_ci(str, prefix)``
+     - Case-insensitive prefix check for two ``zend_string`` values.
+
+   - - ``zend_string_starts_with_cstr_ci(str, prefix, prefix_length)``
+     - Case-insensitive prefix check against a ``char*`` buffer.
+
+   - - ``zend_string_starts_with_literal(str, prefix)``
+     - Prefix check against a string literal.
+
+   - - ``zend_string_starts_with_literal_ci(str, prefix)``
+     - Case-insensitive prefix check against a string literal.
+
+.. list-table:: Hashing APIs
+   :header-rows: 1
+
+   - - Function/Macro
+     - Description
+
+   - - ``zend_string_hash_func(s)``
+     - Computes and stores the hash for ``s``.
+
+   - - ``zend_string_hash_val(s)``
+     - Returns the cached hash if available, otherwise computes it.
+
+   - - ``zend_hash_func(str, len)``
+     - Computes a hash for a raw ``char*`` buffer.
+
+   - - ``zend_inline_hash_func(str, len)``
+     - Inline implementation of PHP's string hashing routine for ``char*`` buffers.
+
+   - - ``zend_string_forget_hash_val(s)``
+     - Clears cached hash/derived flags after string contents change.
+
+.. list-table:: Concatenation property macros
+   :header-rows: 1
+
+   - - Macro
+     - Description
+
+   - - ``ZSTR_COPYABLE_CONCAT_PROPERTIES``
+     - Bitmask of string flags that can be preserved when concatenating strings.
+
+   - - ``ZSTR_GET_COPYABLE_CONCAT_PROPERTIES(s)``
+     - Extracts copyable concatenation properties from one string.
+
+   - - ``ZSTR_GET_COPYABLE_CONCAT_PROPERTIES_BOTH(s1, s2)``
+     - Extracts copyable properties shared by both input strings.
+
+   - - ``ZSTR_COPY_CONCAT_PROPERTIES(out, in)``
+     - Copies concatenation properties from one input to the output string.
+
+   - - ``ZSTR_COPY_CONCAT_PROPERTIES_BOTH(out, in1, in2)``
+     - Copies only properties that are set on both inputs.
+
+.. list-table:: Stack allocation helper macros
+   :header-rows: 1
+
+   - - Macro
+     - Description
+
+   - - ``ZSTR_ALLOCA_ALLOC(str, l, use_heap)``
+     - Allocates a temporary string buffer using ``do_alloca``.
+
+   - - ``ZSTR_ALLOCA_INIT(str, s, l, use_heap)``
+     - Same as ``ZSTR_ALLOCA_ALLOC``, then copies data from ``s`` and appends ``'\0'``.
+
+   - - ``ZSTR_ALLOCA_FREE(str, use_heap)``
+     - Frees memory previously allocated with ``ZSTR_ALLOCA_ALLOC``/``ZSTR_ALLOCA_INIT``.
+
+.. list-table:: Interned string APIs
+   :header-rows: 1
+
+   - - Function
+     - Description
+
+   - - ``zend_new_interned_string(s)``
+     - Interns ``s`` if possible and returns the interned instance.
+
+   - - ``zend_string_init_interned(str, len, p)``
+     - Creates or fetches an interned string from raw bytes.
+
+   - - ``zend_string_init_existing_interned(str, len, p)``
+     - Returns an interned string only if it already exists; does not create a new one.
+
+   - - ``zend_interned_string_find_permanent(s)``
+     - Looks up ``s`` in the permanent interned string storage.
+
+   - - ``zend_interned_strings_init()``
+     - Initializes interned string storage during engine startup.
+
+   - - ``zend_interned_strings_dtor()``
+     - Destroys interned string storage during engine shutdown.
+
+   - - ``zend_interned_strings_activate()``
+     - Activates request-local interned string state.
+
+   - - ``zend_interned_strings_deactivate()``
+     - Deactivates request-local interned string state.

-     - Changes the size of the string. If the string has a reference count greater than 1 or if
-       the string is interned, a new string is created. You must always use the return value of
-       this function, as the original array may have been moved to a new location in memory.
+   - - ``zend_interned_strings_set_request_storage_handlers(...)``
+     - Installs callbacks that customize request interned string storage behavior.

-There are various functions to compare strings. The ``zend_string_equals`` function compares two
-strings in full, while ``zend_string_starts_with`` checks whether the first argument starts with the
-second. There are variations for ``_ci`` and ``_literal``, i.e. case-insensitive comparison and
-literal strings, respectively. We won't go over all variations here, as they are straightforward to
-use.
+   - - ``zend_interned_strings_switch_storage(request)``
+     - Switches between request and persistent interned string storage backends.

 ******************
  Interned strings