Commit 38ba41e

Update zend_string.rst
1 parent 1a9a2c7 commit 38ba41e

1 file changed

Lines changed: 220 additions & 28 deletions

docs/source/core/data-structures/zend_string.rst

@@ -61,23 +61,30 @@ it may have been freed if you were its last user.
6161
API
6262
*****
6363

64-
The string API is defined in ``Zend/zend_string.h``. It provides a number of functions for creating
65-
new strings.
64+
The string API is defined in ``Zend/zend_string.h``. It contains creation, resizing, comparison,
65+
hashing, and interning helpers.
6666

67-
.. list-table:: ``zend_string`` creation
67+
.. list-table:: Creation and allocation APIs
6868
:header-rows: 1
6969

7070
- - Function/Macro [#persistent]_
7171
- Description
7272

73-
- - ``ZSTR_INIT(s, p)``
74-
- Creates a new string from a string literal.
73+
- - ``ZSTR_INIT_LITERAL(s, p)``
74+
- Creates a new string from a C string literal. This is a convenience wrapper around
75+
``zend_string_init`` that computes the literal length at compile time.
7576

7677
- - ``zend_string_init(s, l, p)``
7778
- Creates a new string from a character buffer.
7879

80+
- - ``zend_string_init_fast(s, l)``
81+
- Fast-path initializer that may return shared immutable strings for lengths 0 and 1.
82+
7983
- - ``zend_string_alloc(l, p)``
80-
- Creates a new string of a given length without initializing its content.
84+
- Allocates a new string of length ``l`` without initializing its contents.
85+
86+
- - ``zend_string_safe_alloc(n, m, l, p)``
87+
- Allocates ``n * m + l`` bytes of payload with overflow checks.
8188

8289
- - ``zend_string_concat2(s1, l1, s2, l2)``
8390
- Creates a non-persistent string by concatenating two character buffers.
@@ -86,20 +93,53 @@ new strings.
8693
- Same as ``zend_string_concat2``, but for three character buffers.
8794

8895
- - ``ZSTR_EMPTY_ALLOC()``
89-
- Gets an immutable, empty string. This does not allocate memory.
96+
- Returns an immutable, empty string. This does not allocate memory.
9097

91-
- - ``ZSTR_CHAR(char)``
92-
- Gets an immutable, single-character string. This does not allocate memory.
98+
- - ``ZSTR_CHAR(c)``
99+
- Returns an immutable, single-character string. This does not allocate memory.
93100

94101
- - ``ZSTR_KNOWN(ZEND_STR_const)``
95102

96103
- Gets an immutable, predefined string. Used for strings that are common within PHP itself, e.g.
97-
``"class"``. See ``ZEND_KNOWN_STRINGS`` in ``Zend/zend_string.h``. This does not allocate
98-
memory.
104+
``"class"``. See ``ZEND_KNOWN_STRINGS`` in ``Zend/zend_string.h``. This does not allocate memory.
105+
106+
- - ``ZSTR_MAX_OVERHEAD``
107+
- Maximum allocator/header overhead used by ``zend_string``.
108+
109+
- - ``ZSTR_MAX_LEN``
110+
- Maximum representable payload length for a ``zend_string``.
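
The ``n * m + l`` size expression taken by ``zend_string_safe_alloc`` can be illustrated with a small standalone sketch. The helper name ``checked_size`` is hypothetical and not part of php-src. It returns a status flag for simplicity, whereas the engine is expected to raise an error when the computation overflows.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of php-src: computes n * m + l and
 * reports overflow instead of silently wrapping around. */
static bool checked_size(size_t n, size_t m, size_t l, size_t *out)
{
    /* n * m overflows exactly when m != 0 and n > SIZE_MAX / m. */
    if (m != 0 && n > SIZE_MAX / m) {
        return false;
    }
    size_t product = n * m;

    /* product + l overflows exactly when l > SIZE_MAX - product. */
    if (l > SIZE_MAX - product) {
        return false;
    }
    *out = product + l;
    return true;
}
```

A caller would compute the payload size with ``checked_size(n, m, l, &size)`` and allocate only if it returns ``true``.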
111+
112+
.. list-table:: Resizing and copy-on-write APIs
113+
:header-rows: 1
114+
115+
- - Function/Macro [#persistent]_
116+
- Description
117+
118+
- - ``zend_string_realloc(s, l, p)``
119+
- Resizes a string to length ``l``. May return a different pointer. If shared/interned, a new
120+
allocation is created.
121+
122+
- - ``zend_string_safe_realloc(s, n, m, l, p)``
123+
- Resizes a string to ``n * m + l`` bytes with overflow checks.
124+
125+
- - ``zend_string_extend(s, l, p)``
126+
- Extends a string to a larger length (``l >= ZSTR_LEN(s)``).
127+
128+
- - ``zend_string_truncate(s, l, p)``
129+
- Shrinks a string to a smaller length (``l <= ZSTR_LEN(s)``).
130+
131+
- - ``zend_string_dup(s, p)``
132+
- Creates a real copy in a new allocation, except for interned strings (which are returned
133+
unchanged).
134+
135+
- - ``zend_string_separate(s, p)``
136+
- Ensures unique ownership (copy-on-write): duplicates if shared or interned, otherwise
137+
reuses ``s`` after resetting cached metadata.
99138

100139
.. [#persistent]
101140
102-
``s`` = ``zend_string``, ``l`` = ``length``, ``p`` = ``persistent``.
141+
``s`` = ``zend_string``, ``l`` = ``length``, ``p`` = ``persistent``,
142+
``n * m + l`` = checked size expression used for safe allocation/reallocation.
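
The copy-on-write behavior described for ``zend_string_separate`` can be sketched with a toy reference-counted string. ``cow_string`` and its functions are hypothetical stand-ins. The real ``zend_string`` has a different layout and also handles interned strings, persistence, and cached hash values, all of which this sketch leaves out.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for zend_string, for illustration only. */
typedef struct cow_string {
    unsigned refcount;
    size_t len;
    char val[];          /* payload plus trailing '\0' */
} cow_string;

static cow_string *cow_string_init(const char *buf, size_t len)
{
    cow_string *s = malloc(sizeof(cow_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, buf, len);
    s->val[len] = '\0';
    return s;
}

/* Returns a string that the caller owns exclusively and may modify. */
static cow_string *cow_string_separate(cow_string *s)
{
    if (s->refcount == 1) {
        return s;        /* sole owner: reuse the allocation */
    }
    cow_string *copy = cow_string_init(s->val, s->len);
    if (copy != NULL) {
        s->refcount--;   /* the caller gives up its share of the original */
    }
    return copy;
}
```

As with the real API, the caller must continue with the returned pointer, since it may differ from the one passed in.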
103143
104144
As per php-src fashion, you are not supposed to access the ``zend_string`` fields directly. Instead,
105145
use the following macros. There are macros for both ``zend_string`` and ``zvals`` known to contain
@@ -122,43 +162,195 @@ strings.
122162

123163
- - ``ZSTR_HASH``
124164
- ``Z_STRHASH[_P]``
125-
- Computes the string has if it hasn't already been, and returns it.
165+
- Computes the string hash if it has not been computed yet, and returns it.
126166

127167
- - ``ZSTR_H``
128168
- \-
129169
- Returns the string hash. This macro assumes that the hash has already been computed.
130170

131-
.. list-table:: Reference counting macros
171+
.. list-table:: String property macros
132172
:header-rows: 1
133173

134174
- - Macro
135175
- Description
136176

177+
- - ``ZSTR_IS_INTERNED(s)``
178+
- Checks whether a string is interned.
179+
180+
- - ``ZSTR_IS_VALID_UTF8(s)``
181+
- Checks whether a string has the ``IS_STR_VALID_UTF8`` flag set.
182+
183+
.. list-table:: Reference counting and lifetime APIs
184+
:header-rows: 1
185+
186+
- - Function/Macro [#persistent]_
187+
- Description
188+
137189
- - ``zend_string_copy(s)``
138190
- Increases the reference count and returns the same string. The reference count is not
139191
increased if the string is interned.
140192

193+
- - ``zend_string_refcount(s)``
194+
- Returns the reference count. Interned strings always report ``1``.
195+
196+
- - ``zend_string_addref(s)``
197+
- Increments the reference count of a non-interned string.
198+
199+
- - ``zend_string_delref(s)``
200+
- Decrements the reference count of a non-interned string.
201+
141202
- - ``zend_string_release(s)``
142203
- Decreases the reference count and frees the string if it goes to 0.
143204

144-
- - ``zend_string_dup(s, p)``
145-
- Creates a true copy of the string in a new allocation, except if the string is interned.
205+
- - ``zend_string_release_ex(s, p)``
206+
- Like ``zend_string_release()``, but chooses the deallocator from ``p``. Use only if the
207+
persistence mode is known by the caller.
146208

147-
- - ``zend_string_separate(s)``
148-
- Duplicates the string if the reference count is greater than 1. See
149-
:doc:`./reference-counting` for details.
209+
- - ``zend_string_free(s)``
210+
- Frees a non-interned string directly. The caller must ensure it is no longer shared.
150211

151-
- - ``zend_string_realloc(s, l, p)``
212+
- - ``zend_string_efree(s)``
213+
- Frees a non-persistent, non-interned string with ``efree``.
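
The interplay of copying, releasing, and interned strings can be sketched with a toy type. ``toy_string`` and its functions are hypothetical and only mirror the semantics described in the table above. They are not the php-src implementation.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for zend_string; the real struct has a different layout. */
typedef struct toy_string {
    unsigned refcount;
    bool interned;       /* interned strings are never counted or freed here */
    size_t len;
    char val[];          /* payload plus trailing '\0' */
} toy_string;

static toy_string *toy_string_init(const char *buf, size_t len)
{
    toy_string *s = malloc(sizeof(toy_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->interned = false;
    s->len = len;
    memcpy(s->val, buf, len);
    s->val[len] = '\0';
    return s;
}

/* Like zend_string_copy: share the string by bumping the count,
 * unless it is interned. Always returns the same pointer. */
static toy_string *toy_string_copy(toy_string *s)
{
    if (!s->interned) {
        s->refcount++;
    }
    return s;
}

/* Like zend_string_release: drop one reference and free at zero.
 * Returns true if the string was freed by this call. */
static bool toy_string_release(toy_string *s)
{
    if (s->interned) {
        return false;
    }
    if (--s->refcount == 0) {
        free(s);
        return true;
    }
    return false;
}
```

Each ``toy_string_copy`` must be paired with exactly one ``toy_string_release``. After the final release the pointer must not be used again.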
214+
215+
There are various functions to compare strings.
216+
217+
.. list-table:: Comparison APIs
218+
:header-rows: 1
219+
220+
- - Function/Macro
221+
- Description
222+
223+
- - ``zend_string_equals(s1, s2)``
224+
- Full equality check for two ``zend_string`` values.
225+
226+
- - ``zend_string_equal_content(s1, s2)``
227+
- Compares lengths and payload bytes of two ``zend_string`` pointers, without first checking
  whether both pointers refer to the same string.
228+
229+
- - ``zend_string_equal_val(s1, s2)``
230+
- Compares only the string payload bytes (caller must ensure equal lengths).
231+
232+
- - ``zend_string_equals_cstr(s1, s2, l2)``
233+
- Compares a ``zend_string`` with a ``char*`` buffer and explicit length.
234+
235+
- - ``zend_string_equals_ci(s1, s2)``
236+
- Case-insensitive full equality check.
237+
238+
- - ``zend_string_equals_literal(str, literal)``
239+
- Equality check against a string literal with compile-time literal length.
240+
241+
- - ``zend_string_equals_literal_ci(str, literal)``
242+
- Case-insensitive literal equality check.
243+
244+
- - ``zend_string_starts_with(str, prefix)``
245+
- Checks whether ``str`` begins with ``prefix``.
246+
247+
- - ``zend_string_starts_with_cstr(str, prefix, prefix_length)``
248+
- Prefix check against a ``char*`` buffer and explicit length.
249+
250+
- - ``zend_string_starts_with_ci(str, prefix)``
251+
- Case-insensitive prefix check for two ``zend_string`` values.
252+
253+
- - ``zend_string_starts_with_cstr_ci(str, prefix, prefix_length)``
254+
- Case-insensitive prefix check against a ``char*`` buffer.
255+
256+
- - ``zend_string_starts_with_literal(str, prefix)``
257+
- Prefix check against a string literal.
258+
259+
- - ``zend_string_starts_with_literal_ci(str, prefix)``
260+
- Case-insensitive prefix check against a string literal.
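
The ``_literal`` variants rely on the literal length being known at compile time. A minimal sketch of that idea follows. It operates on raw buffers instead of ``zend_string``, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers on raw buffers, not part of php-src. */
static bool buf_equals(const char *s, size_t len,
                       const char *other, size_t other_len)
{
    return len == other_len && memcmp(s, other, len) == 0;
}

static bool buf_starts_with(const char *s, size_t len,
                            const char *prefix, size_t prefix_len)
{
    return len >= prefix_len && memcmp(s, prefix, prefix_len) == 0;
}

/* sizeof(lit) counts the trailing '\0', hence the "- 1". Writing "" lit
 * makes the macro fail to compile if lit is not a string literal. */
#define BUF_EQUALS_LITERAL(s, len, lit) \
    buf_equals((s), (len), "" lit, sizeof(lit) - 1)
#define BUF_STARTS_WITH_LITERAL(s, len, lit) \
    buf_starts_with((s), (len), "" lit, sizeof(lit) - 1)
```

Because the length is a compile-time constant, no ``strlen`` call is needed at run time.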
261+
262+
.. list-table:: Hashing APIs
263+
:header-rows: 1
264+
265+
- - Function/Macro
266+
- Description
267+
268+
- - ``zend_string_hash_func(s)``
269+
- Computes and stores the hash for ``s``.
270+
271+
- - ``zend_string_hash_val(s)``
272+
- Returns the cached hash if available, otherwise computes it.
273+
274+
- - ``zend_hash_func(str, len)``
275+
- Computes a hash for a raw ``char*`` buffer.
276+
277+
- - ``zend_inline_hash_func(str, len)``
278+
- Inline implementation of PHP's string hashing routine for ``char*`` buffers.
279+
280+
- - ``zend_string_forget_hash_val(s)``
281+
- Clears cached hash/derived flags after string contents change.
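
The lazy hash caching described above can be sketched as follows. The hash routine is a plain, non-unrolled rendering of the DJBX33A scheme (seed 5381, multiply by 33, add each byte) with the top bit forced on so that ``0`` can mean "not computed yet". The type and function names are hypothetical, a 64-bit hash is assumed, and the real implementation differs in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy string with a cached hash; h == 0 means "not computed yet". */
typedef struct hashed_buf {
    uint64_t h;
    size_t len;
    const char *val;
} hashed_buf;

/* DJBX33A-style hash over a raw buffer (assumed 64-bit variant). */
static uint64_t sketch_hash_func(const char *str, size_t len)
{
    uint64_t hash = 5381;
    for (size_t i = 0; i < len; i++) {
        hash = hash * 33 + (unsigned char)str[i];
    }
    /* Force the top bit so a valid hash is never 0. */
    return hash | (UINT64_C(1) << 63);
}

/* Returns the cached hash, computing and storing it on first use. */
static uint64_t sketch_hash_val(hashed_buf *s)
{
    if (s->h == 0) {
        s->h = sketch_hash_func(s->val, s->len);
    }
    return s->h;
}

/* Must be called after the contents change, or the cache goes stale. */
static void sketch_forget_hash_val(hashed_buf *s)
{
    s->h = 0;
}
```

This is why a string must not be modified in place without resetting its cached hash: lookups keyed on the stale value would miss.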
282+
283+
.. list-table:: Concatenation property macros
284+
:header-rows: 1
285+
286+
- - Macro
287+
- Description
288+
289+
- - ``ZSTR_COPYABLE_CONCAT_PROPERTIES``
290+
- Bitmask of string flags that can be preserved when concatenating strings.
291+
292+
- - ``ZSTR_GET_COPYABLE_CONCAT_PROPERTIES(s)``
293+
- Extracts copyable concatenation properties from one string.
294+
295+
- - ``ZSTR_GET_COPYABLE_CONCAT_PROPERTIES_BOTH(s1, s2)``
296+
- Extracts copyable properties shared by both input strings.
297+
298+
- - ``ZSTR_COPY_CONCAT_PROPERTIES(out, in)``
299+
- Copies concatenation properties from one input to the output string.
300+
301+
- - ``ZSTR_COPY_CONCAT_PROPERTIES_BOTH(out, in1, in2)``
302+
- Copies only properties that are set on both inputs.
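
The ``_BOTH`` variants boil down to a bitwise intersection: a property such as "valid UTF-8" holds for a concatenation only if it holds for both inputs. The sketch below uses hypothetical flag values. The real bit positions, and the exact set of copyable flags, are defined in php-src.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define SKETCH_FLAG_VALID_UTF8 (1u << 0)
#define SKETCH_FLAG_OTHER      (1u << 1)  /* a flag that must not be copied */

/* Assumed here to contain only the UTF-8 validity flag. */
#define SKETCH_COPYABLE_CONCAT_PROPERTIES SKETCH_FLAG_VALID_UTF8

/* Keeps only copyable flags that are set on both inputs. */
static uint32_t sketch_copyable_both(uint32_t flags1, uint32_t flags2)
{
    return flags1 & flags2 & SKETCH_COPYABLE_CONCAT_PROPERTIES;
}
```

The result would then be OR-ed into the flags of the freshly concatenated output string.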
303+
304+
.. list-table:: Stack allocation helper macros
305+
:header-rows: 1
306+
307+
- - Macro
308+
- Description
309+
310+
- - ``ZSTR_ALLOCA_ALLOC(str, l, use_heap)``
311+
- Allocates a temporary string buffer using ``do_alloca``.
312+
313+
- - ``ZSTR_ALLOCA_INIT(str, s, l, use_heap)``
314+
- Same as ``ZSTR_ALLOCA_ALLOC``, then copies data from ``s`` and appends ``'\0'``.
315+
316+
- - ``ZSTR_ALLOCA_FREE(str, use_heap)``
317+
- Frees memory previously allocated with ``ZSTR_ALLOCA_ALLOC``/``ZSTR_ALLOCA_INIT``.
318+
319+
.. list-table:: Interned string APIs
320+
:header-rows: 1
321+
322+
- - Function
323+
- Description
324+
325+
- - ``zend_new_interned_string(s)``
326+
- Interns ``s`` if possible and returns the interned instance.
327+
328+
- - ``zend_string_init_interned(str, len, p)``
329+
- Creates or fetches an interned string from raw bytes.
330+
331+
- - ``zend_string_init_existing_interned(str, len, p)``
332+
- Returns an interned string only if it already exists; does not create a new one.
333+
334+
- - ``zend_interned_string_find_permanent(s)``
335+
- Looks up ``s`` in the permanent interned string storage.
336+
337+
- - ``zend_interned_strings_init()``
338+
- Initializes interned string storage during engine startup.
339+
340+
- - ``zend_interned_strings_dtor()``
341+
- Destroys interned string storage during engine shutdown.
342+
343+
- - ``zend_interned_strings_activate()``
344+
- Activates request-local interned string state.
345+
346+
- - ``zend_interned_strings_deactivate()``
347+
- Deactivates request-local interned string state.
152348

153-
- Changes the size of the string. If the string has a reference count greater than 1 or if
154-
the string is interned, a new string is created. You must always use the return value of
155-
this function, as the original array may have been moved to a new location in memory.
349+
- - ``zend_interned_strings_set_request_storage_handlers(...)``
350+
- Installs callbacks that customize request interned string storage behavior.
156351

157-
There are various functions to compare strings. The ``zend_string_equals`` function compares two
158-
strings in full, while ``zend_string_starts_with`` checks whether the first argument starts with the
159-
second. There are variations for ``_ci`` and ``_literal``, i.e. case-insensitive comparison and
160-
literal strings, respectively. We won't go over all variations here, as they are straightforward to
161-
use.
352+
- - ``zend_interned_strings_switch_storage(request)``
353+
- Switches between request and persistent interned string storage backends.
162354

163355
******************
164356
Interned strings
