Add new code generator from AVRO to PHP#3708
Conversation
…o add-schema-to-php-generator
There was a problem hiding this comment.
Pull request overview
This PR adds a PHP code-generation pipeline to the Avro PHP implementation, including a new avro CLI command, a schema-to-PHP code generator, and a “specific” datum writer for serializing generated classes.
Changes:
- Add
AvroCodeGenerator(schema → PHP records/enums) andGenerateCommand+lang/php/bin/avroCLI entrypoint. - Add
AvroSpecificDatumWriterto serialize generated record objects/enums to Avro binary. - Add PHPUnit coverage and fixtures; update PHP tooling config (PHPStan/CS Fixer) and CI paths; add new Composer dependencies.
Reviewed changes
Copilot reviewed 18 out of 30 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| lang/php/test/Generator/AvroCodeGeneratorTest.php | Adds extensive unit tests asserting generated PHP output for many schema shapes |
| lang/php/test/Fixtures/Schemas/user.avsc | Fixture schema for CLI generation tests (record) |
| lang/php/test/Fixtures/Schemas/status.avsc | Fixture schema for CLI generation tests (enum) |
| lang/php/test/Fixtures/Generated/User.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Team.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Task.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Profile.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Priority.php | Generated-enum fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Order.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Metadata.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Member.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/FuelType.php | Generated-enum fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Car.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Fixtures/Generated/Address.php | Generated-class fixture used by specific writer tests |
| lang/php/test/Datum/AvroSpecificDatumWriterTest.php | Adds round-trip and byte-equivalence tests for the new specific writer |
| lang/php/test/Console/GenerateCommandTest.php | Adds Symfony Console command tests for the generator CLI |
| lang/php/README.md | Documents the new code generation CLI usage and output |
| lang/php/phpstan.neon | Includes bin/ in PHPStan scan paths |
| lang/php/lib/Schema/AvroSchema.php | Fixes JSON decoding usage and wraps JSON parse errors in AvroSchemaParseException |
| lang/php/lib/Schema/AvroNamedSchema.php | Exposes short name() accessor for named schemas |
| lang/php/lib/Schema/AvroName.php | Exposes short name() accessor for schema names |
| lang/php/lib/Schema/AvroEnumSchema.php | Adds return types / minor signature tightening for enum helpers |
| lang/php/lib/Generator/AvroCodeGeneratorException.php | Adds generator-specific exception type |
| lang/php/lib/Generator/AvroCodeGenerator.php | Implements schema traversal + PHP AST generation via nikic/php-parser |
| lang/php/lib/Datum/AvroSpecificDatumWriter.php | Implements object-based datum writing according to schema |
| lang/php/lib/Console/GenerateCommand.php | Adds generate console command for file/directory schema inputs |
| lang/php/bin/avro | Adds a composer-bin CLI entrypoint for the generator |
| lang/php/.php-cs-fixer.dist.php | Includes bin/ in code style fixer scope |
| composer.json | Adds nikic/php-parser + symfony/console, registers the avro binary, updates archive include list |
| .github/workflows/test-lang-php.yml | Narrows path filters to PHP-relevant paths and bumps Composer tool version |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
There was a problem hiding this comment.
Please add some documentation about the purpose of this file.
| }; | ||
|
|
||
| if (null !== $node && $registeredSchema instanceof AvroNamedSchema) { | ||
| $filename = $path.'/'.ucwords($registeredSchema->name()).'.php'; |
There was a problem hiding this comment.
This should probably use the schema fullname, i.e. namespace + name. Currently it will overwrite the file if there are two schemas with the same name in different namespaces.
There was a problem hiding this comment.
What do you think about providing only a root namespace for the generated files (e.g. \MyExternalService), and the remaining part of the namespace is the one provided by the Avro schema (e.g. Something.Domain.BlaBla) and the namespace will be something like \MyExternalService\Something\Domain\BlaBla\?
The change will be reflected on the path on disk to be autoload-compliant
| string $phpNamespace, | ||
| array $values | ||
| ): Node { | ||
| $className = ucwords($avroEnum->name()); |
There was a problem hiding this comment.
Same issue here - should the fullname be used ?
| $enum = $this->factory->enum($className)->setScalarType('string'); | ||
|
|
||
| foreach ($values as $value) { | ||
| $caseName = strtoupper($value); |
There was a problem hiding this comment.
Names in Avro are case-sensitive, i.e. schemas with names "name" and "Name" are OK to exist at the same time.
strtoupper() will lead to a collision in the PHP class name.
| } | ||
| } | ||
|
|
||
| if ([] !== $written) { |
There was a problem hiding this comment.
An empty directory (i.e. files == []) will be reported as success. Is this intentional ?
I am not sure how the other SDK generators behave.
| use Composer\InstalledVersions; | ||
| use Symfony\Component\Console\Application; | ||
|
|
||
| $version = InstalledVersions::getPrettyVersion('apache/avro'); |
There was a problem hiding this comment.
| $version = InstalledVersions::getPrettyVersion('apache/avro'); | |
| try { | |
| $version = InstalledVersions::getPrettyVersion('apache/avro') ?? 'unknown'; | |
| } catch (\Throwable) { | |
| $version = 'unknown'; | |
| } |
|
|
||
| private function collectSchemas(AvroSchema $schema): void | ||
| { | ||
| switch ($schema::class) { |
There was a problem hiding this comment.
Elsewhere you use match (true) { $schema instanceof Foo => ...}.
Why this is different ?
| ->addOption('file', 'f', InputOption::VALUE_OPTIONAL, 'One Avro schema file (.avsc)') | ||
| ->addOption('directory', 'd', InputOption::VALUE_OPTIONAL, 'A directory containing multiple Avro schema files (.avsc)') | ||
| ->addOption('output', 'o', InputOption::VALUE_REQUIRED, 'Output directory for generated PHP files') | ||
| ->addOption('namespace', 'ns', InputOption::VALUE_REQUIRED, 'PHP namespace for generated classes'); |
There was a problem hiding this comment.
| ->addOption('namespace', 'ns', InputOption::VALUE_REQUIRED, 'PHP namespace for generated classes'); | |
| ->addOption('namespace', 'n', InputOption::VALUE_REQUIRED, 'PHP namespace for generated classes'); |
Usually the short options are single character
There was a problem hiding this comment.
The -n is already taken by the Symfony Console Component
-n, --no-interaction Do not ask any interactive question
We could use a capital N, wdyt?
| public function hasSymbol($symbol) | ||
| public function hasSymbol($symbol): bool | ||
| { | ||
| return in_array($symbol, $this->symbols); |
There was a problem hiding this comment.
| return in_array($symbol, $this->symbols); | |
| return is_string($symbol) && in_array($symbol, $this->symbols, true); |
a bit more strict check
| $tester->execute([ | ||
| '--file' => $this->schemaPath('user.avsc'), | ||
| '--output' => $this->outputDir, | ||
| '--namespace' => 'My\\App\\Avro', | ||
| ]); | ||
|
|
There was a problem hiding this comment.
| $tester->execute([ | |
| '--file' => $this->schemaPath('user.avsc'), | |
| '--output' => $this->outputDir, | |
| '--namespace' => 'My\\App\\Avro', | |
| ]); | |
| $exitCode = $tester->execute([ | |
| '--file' => $this->schemaPath('user.avsc'), | |
| '--output' => $this->outputDir, | |
| '--namespace' => 'My\\App\\Avro', | |
| ]); | |
| self::assertSame(Command::SUCCESS, $exitCode); | |
| self::assertFileExists($this->outputDir.'/User.php'); | |
What is the purpose of the change
Adds an Avro to PHP code generator and
AvroSpecificDatumWriterto serialise autogenerated classes in AvroVerifying this change
New feature with new tests
Documentation