
Commit a705f6c

perf(string-methods): Pre-compute character mappings for O(1) lookup (#58)
* perf(string-methods): Pre-compute character mappings for O(1) lookup

Replace runtime array operations with compile-time constants for significant performance improvements in searchWords() and removeAccents().

- Add SEARCH_WORDS_MAPPING constant with lowercase accent mappings
- Add ACCENT_MAPPING constant for case-preserving accent removal
- Simplify searchWords() to single strtr() call with constant
- Simplify removeAccents() to single strtr() call with constant
- Remove unused $searchWordsMapping static property
- Remove unused applyBasicNameFix() method
- Remove legacy REMOVE_ACCENTS_FROM/TO arrays
- Add IsValidTimePartTest for 100% code coverage
- Fix composer.json psalm plugin dependency

Performance: removeAccents +66%, searchWords +104%, nameFix +39%

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* docs(performance): Update for pre-computed constants

Reflect the new implementation using compile-time constants instead of lazy-initialised static properties.

- Update benchmark numbers to reflect +66% to +104% improvements
- Replace lazy init examples with pre-computed constant examples
- Update "Static Caching" section to "Compile-Time Constants"
- Replace "Pre-warm Cache" with "No Warm-up Required"
- Update comparison table with new performance figures

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* build(php): Add PHP 8.5 to CI test matrix

Extend CI pipeline to test against PHP 8.5 in addition to 8.3 and 8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* build(deps): Remove Psalm for PHP 8.5 compatibility

Psalm does not yet support PHP 8.5. Remove vimeo/psalm and psalm/plugin-phpunit from dependencies while keeping composer scripts for use on PHP 8.3/8.4.

- Remove vimeo/psalm and psalm/plugin-phpunit from require-dev
- Skip psalm step in CI for PHP 8.5 matrix
- Adjust rector step to run after phan when psalm is skipped

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* fix(ci): Restore Psalm with PHP 8.5 workaround

Restore vimeo/psalm dependency so it runs on PHP 8.3/8.4. Use --ignore-platform-req=php when installing on PHP 8.5 to allow packages that don't yet declare 8.5 support.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* build(ci): Remove Psalm from CI and dependencies

Remove vimeo/psalm from dependencies due to PHP 8.5 incompatibility. Psalm does not yet support PHP 8.5 and was causing CI failures on the 8.5 matrix.

Changes:
- Remove vimeo/psalm from require-dev
- Remove @test:psalm from tests sequence
- Remove @test:psalm from analyse:all
- Remove psalm step from CI workflow
- Simplify rector step condition

Script definitions retained for manual use on PHP 8.3/8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

* build(ci): Remove Phan from CI and dependencies

Remove phan/phan from dependencies due to PHP 8.5 incompatibility. Phan crashes when parsing vendor files on PHP 8.5 due to deprecated syntax handling.

Changes:
- Remove phan/phan from require-dev
- Remove @test:phan from tests sequence
- Remove @test:phan from analyse:all
- Remove phan step from CI workflow
- Remove test-phan and test-psalm services from docker
- Update test-all command in docker-compose

Script definitions retained for manual use on PHP 8.3/8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>

---------

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
1 parent 6e73ab3 commit a705f6c
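
The core change described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's actual code: the class name `AccentSketch`, the `FROM`/`TO` constants, both method names, and the six-entry mapping are hypothetical stand-ins (the real `ACCENT_MAPPING` covers far more characters). It contrasts a lazily built static map with a typed class constant (PHP 8.3+) passed directly to `strtr()`.

```php
<?php

declare(strict_types=1);

// Hypothetical illustration class; names and mapping contents are assumptions.
final class AccentSketch
{
    // New style: the mapping is a typed class constant (requires PHP 8.3+).
    // Abbreviated on purpose; the real constant is much larger.
    private const array ACCENT_MAPPING = [
        'À' => 'A', 'É' => 'E', 'Ü' => 'U',
        'à' => 'a', 'é' => 'e', 'ü' => 'u',
    ];

    // Old style: parallel arrays combined into a map on first use.
    private const array FROM = ['À', 'É', 'Ü', 'à', 'é', 'ü'];
    private const array TO = ['A', 'E', 'U', 'a', 'e', 'u'];

    /** @var array<string, string>|null */
    private static ?array $lazyMapping = null;

    public static function removeAccentsConst(string $str): string
    {
        // Single strtr() call; no null check and no array construction per request.
        return strtr($str, self::ACCENT_MAPPING);
    }

    public static function removeAccentsLazy(string $str): string
    {
        // Builds the map once, then pays a null check on every later call.
        self::$lazyMapping ??= array_combine(self::FROM, self::TO);

        return strtr($str, self::$lazyMapping);
    }
}

echo AccentSketch::removeAccentsConst('Café München'), PHP_EOL; // Cafe Munchen
echo AccentSketch::removeAccentsLazy('Café München'), PHP_EOL;  // Cafe Munchen
```

Both variants return the same result; the difference is only where the mapping array comes from. The source file must be saved as UTF-8 so that the multi-byte keys match the input bytes.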

10 files changed

Lines changed: 624 additions & 795 deletions

.github/workflows/php.yml

Lines changed: 9 additions & 15 deletions
@@ -15,7 +15,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        php-versions: ["8.3", "8.4"]
+        php-versions: ["8.3", "8.4", "8.5"]
 
     steps:
       # This step checks out a copy of your repository.
@@ -51,9 +51,15 @@ jobs:
         run: composer validate --strict
 
       # This step installs the project dependencies.
+      # PHP 8.5 requires --ignore-platform-req=php for packages not yet updated.
       - name: Install dependencies
         id: composer-install
-        run: composer install --prefer-dist --no-progress
+        run: |
+          if [ "${{ matrix.php-versions }}" = "8.5" ]; then
+            composer install --prefer-dist --no-progress --ignore-platform-req=php
+          else
+            composer install --prefer-dist --no-progress
+          fi
 
       # This step sets up Go environment for the job.
       - name: Set up Go
@@ -111,22 +117,10 @@ jobs:
         if: steps.infection.outcome == 'success'
         run: composer test:phpstan
 
-      # This step runs static analysis with Phan.
-      - name: Run static analysis with phan
-        id: phan
-        if: steps.phpstan.outcome == 'success'
-        run: composer test:phan
-
-      # This step runs static analysis with Psalm.
-      - name: Run static analysis with psalm
-        id: psalm
-        if: steps.phan.outcome == 'success'
-        run: composer test:psalm
-
       # This step runs Rector for code quality.
       - name: Run rector for code quality
         id: rector
-        if: steps.psalm.outcome == 'success'
+        if: steps.phpstan.outcome == 'success'
         run: composer test:rector
 
   release:

composer.json

Lines changed: 2 additions & 9 deletions
@@ -52,18 +52,15 @@
         "pestphp/pest": "^4.1",
         "pestphp/pest-plugin-drift": "^4.0",
         "pestphp/pest-plugin-type-coverage": "^4.0",
-        "phan/phan": ">=5.5.1",
         "php-parallel-lint/php-parallel-lint": ">=1.4.0",
         "phpmd/phpmd": ">=2.15",
         "phpstan/extension-installer": ">=1.4.3",
         "phpstan/phpstan": ">=2.1.22",
         "phpstan/phpstan-strict-rules": ">=2.0.6",
-        "psalm/plugin-phpunit": ">=0.19.3",
         "rector/rector": ">=2.1.4",
         "rector/type-perfect": "^2.1",
         "roave/security-advisories": "dev-latest",
-        "tomasvotruba/type-coverage": "^2.0",
-        "vimeo/psalm": ">=6.7"
+        "tomasvotruba/type-coverage": "^2.0"
     },
     "scripts-descriptions": {
         "test:code-style": "Check code for stylistic consistency using Laravel Pint",
@@ -96,8 +93,6 @@
             "@test:pest",
             "@test:infection",
             "@test:phpstan",
-            "@test:phan",
-            "@test:psalm",
             "@test:rector"
         ],
         "test:code-style": "pint --test",
@@ -114,9 +109,7 @@
         "fix:code-style": "pint",
         "fix:rector": "rector",
         "analyse:all": [
-            "@test:phpstan",
-            "@test:psalm",
-            "@test:phan"
+            "@test:phpstan"
         ]
     }
 }

docker-compose.yml

Lines changed: 0 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -22,17 +22,6 @@ services:
2222
extends: tests
2323
command: composer test:phpstan
2424

25-
test-psalm:
26-
extends: tests
27-
command: composer test:psalm
28-
29-
test-phan:
30-
extends: tests
31-
environment:
32-
- PHAN_DISABLE_XDEBUG_WARN=1
33-
- PHAN_ALLOW_XDEBUG=1
34-
command: composer test:phan
35-
3625
test-phpmd:
3726
extends: tests
3827
command: composer test:phpmd
@@ -62,8 +51,6 @@ services:
6251
composer test:lint &&
6352
composer test:code-style &&
6453
composer test:phpstan &&
65-
composer test:psalm &&
66-
composer test:phan &&
6754
composer test:phpmd &&
6855
composer test:pest &&
6956
composer test:rector &&

docs/performance.md

Lines changed: 29 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -20,44 +20,39 @@ Benchmarks and optimisation details for the StringManipulation library.
2020

2121
## Overview
2222

23-
The StringManipulation library has undergone extensive performance tuning, resulting in **2-5x speed improvements** through O(n) optimisation algorithms. All core methods are designed with predictable, linear performance scaling.
23+
The StringManipulation library has undergone extensive performance tuning, resulting in **2-5x speed improvements** through O(n) optimisation algorithms and pre-computed compile-time constants. All core methods are designed with predictable, linear performance scaling.
2424

2525
---
2626

2727
## Benchmarks
2828

29-
| Method | Operations/Second | Complexity | Optimisation |
30-
|:------------------|:------------------|:-----------|:--------------------------------|
31-
| `removeAccents()` | **~450,000** | O(n) | Hash table lookups with strtr() |
32-
| `searchWords()` | **~195,000** | O(n) | Single-pass combined mapping |
33-
| `nameFix()` | **~130,000** | O(n) | Consolidated regex operations |
29+
| Method | Operations/Second | Complexity | Optimisation |
30+
|:------------------|:------------------|:-----------|:------------------------------------|
31+
| `removeAccents()` | **~750,000** | O(n) | Pre-computed constant with strtr() |
32+
| `searchWords()` | **~400,000** | O(n) | Pre-computed constant with strtr() |
33+
| `nameFix()` | **~180,000** | O(n) | Consolidated regex operations |
3434

3535
*Benchmarks measured in Docker with PHP 8.3. Actual performance varies based on hardware, string length, and character complexity.*
3636

3737
---
3838

3939
## Optimisation Techniques
4040

41-
### Hash Table Lookups
41+
### Pre-Computed Constants
4242

43-
The `removeAccents()` method uses PHP's `strtr()` function with a pre-built character mapping array. This provides O(1) lookup time for each character, resulting in overall O(n) complexity.
43+
The `removeAccents()` and `searchWords()` methods use PHP's `strtr()` function with pre-computed compile-time constants. This eliminates runtime array construction overhead and provides O(1) lookup time for each character, resulting in overall O(n) complexity.
4444

4545
```php
46-
// Internal implementation concept
47-
private static ?array $accentsReplacement = null;
46+
// Pre-computed at compile time - no runtime overhead
47+
private const array ACCENT_MAPPING = [
48+
'À' => 'A', 'Á' => 'A', 'Â' => 'A', // ... full mapping
49+
'à' => 'a', 'á' => 'a', 'â' => 'a', // ... preserves case
50+
];
4851

4952
public static function removeAccents(string $str): string
5053
{
51-
// Lazy initialisation - build mapping once
52-
if (self::$accentsReplacement === null) {
53-
self::$accentsReplacement = array_combine(
54-
self::REMOVE_ACCENTS_FROM,
55-
self::REMOVE_ACCENTS_TO
56-
);
57-
}
58-
59-
// O(n) string traversal with O(1) lookups
60-
return strtr($str, self::$accentsReplacement);
54+
// Single strtr() call with O(1) lookups per character
55+
return strtr($str, self::ACCENT_MAPPING);
6156
}
6257
```
6358

@@ -71,16 +66,18 @@ The `searchWords()` method performs all transformations in a single pass through
 
 This reduces memory allocations and cache misses compared to chaining multiple operations.
 
-### Static Caching
+### Compile-Time Constants
 
-Character mapping tables are stored as static properties and initialised lazily. Subsequent calls reuse the cached data:
+Character mapping tables are defined as typed constants (`private const array`), computed at compile time by PHP. This provides:
 
-```php
-// First call: builds and caches mapping
-$result1 = StringManipulation::removeAccents('Cafe');
+- **Zero first-call overhead**: No lazy initialisation required
+- **Guaranteed consistency**: Constants cannot be modified at runtime
+- **Optimal memory usage**: PHP optimises constant storage
 
-// Subsequent calls: uses cached mapping
-$result2 = StringManipulation::removeAccents('Munchen');
+```php
+// Every call uses the same pre-computed constant
+$result1 = StringManipulation::removeAccents('Cafe'); // Fast
+$result2 = StringManipulation::removeAccents('Munchen'); // Equally fast
 ```
 
 ### Consolidated Regex Operations
@@ -193,14 +190,13 @@ $search = StringManipulation::searchWords($name);
 $search = StringManipulation::searchWords($name);
 ```
 
-### Pre-warm Cache for Critical Paths
+### No Warm-up Required
 
-If first-call latency matters, pre-warm the caches during application bootstrap:
+Unlike libraries that use lazy initialisation, StringManipulation uses compile-time constants. There is no first-call penalty, so no warm-up is needed:
 
 ```php
-// In bootstrap.php or service provider
-StringManipulation::removeAccents('warmup');
-StringManipulation::searchWords('warmup');
+// First call is just as fast as subsequent calls
+$result = StringManipulation::removeAccents($userInput);
 ```
 
 ---
@@ -211,7 +207,7 @@ The library outperforms common alternatives:
 
 | Library/Approach | removeAccents equivalent | Notes |
 |:-----------------|:-------------------------|:------|
-| StringManipulation | ~450,000 ops/sec | Optimised strtr() |
+| StringManipulation | ~750,000 ops/sec | Pre-computed constant with strtr() |
 | Manual preg_replace | ~150,000 ops/sec | Multiple regex passes |
 | iconv transliteration | ~200,000 ops/sec | System-dependent |
 | Multiple str_replace | ~100,000 ops/sec | Linear per pattern |
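
The ops/sec figures in the tables above come from the project's own benchmarks, which are not part of this diff. A reader who wants a rough local comparison could use something like the sketch below. It is a hypothetical harness, not the project's benchmark suite: `opsPerSecond()`, `SAMPLE_MAPPING`, and the sample input are assumptions made for illustration, and the numbers it prints will differ from those documented here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: calls $fn $iterations times and returns
 * the measured throughput in operations per second.
 */
function opsPerSecond(callable $fn, int $iterations = 100_000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsedNs = hrtime(true) - $start;

    return $iterations / ($elapsedNs / 1e9);
}

// Assumed three-entry mapping; far smaller than a real accent table.
const SAMPLE_MAPPING = ['é' => 'e', 'ü' => 'u', 'ñ' => 'n'];

$input = 'Café München señor';

// Approach 1: one strtr() call with a constant mapping.
$strtrOps = opsPerSecond(
    static fn (): string => strtr($input, SAMPLE_MAPPING)
);

// Approach 2: str_replace() with parallel search/replace arrays.
$strReplaceOps = opsPerSecond(
    static fn (): string => str_replace(
        array_keys(SAMPLE_MAPPING),
        array_values(SAMPLE_MAPPING),
        $input
    )
);

printf("strtr():       %d ops/sec\n", (int) $strtrOps);
printf("str_replace(): %d ops/sec\n", (int) $strReplaceOps);
```

Results from a loop like this are sensitive to opcache settings, input length, and mapping size, so they are best treated as a relative comparison on one machine.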
