Support non-sequence meta-features in PyGrain packing transformations. #1334

Open: copybara-service[bot] wants to merge 1 commit into main from test_926448116
copybara-service[bot] commented Jun 4, 2026

Support non-sequence meta-features in PyGrain packing transformations.

Previously, only sequence meta-features (defined in `length_struct`) were supported. Non-sequence meta-features (in `meta_features` but not in `length_struct`) were stripped or caused errors during packing.
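The distinction between the two kinds of meta-feature can be sketched as a simple key partition. This is only an illustration: `split_meta_features` is a hypothetical helper name, and it assumes `meta_features` is a flat collection of feature names and `length_struct` is a flat dict. The shared helpers this change adds may look different.

```python
def split_meta_features(meta_features, length_struct):
    """Partition meta-feature names by whether they have a packed length.

    Hypothetical helper for illustration only, not the PR's actual code.
    Assumes flat (non-nested) structures.
    """
    # Sequence meta-features have an entry in length_struct.
    sequence = [k for k in meta_features if k in length_struct]
    # Non-sequence meta-features are listed in meta_features only.
    non_sequence = [k for k in meta_features if k not in length_struct]
    return sequence, non_sequence


seq_keys, non_seq_keys = split_meta_features(
    meta_features=["weights", "doc_id"],
    length_struct={"tokens": 8, "weights": 8},
)
```

Here `weights` would be treated as a sequence meta-feature and `doc_id` as a non-sequence one.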

This change adds support for non-sequence meta-features in both FirstFit and BestFit packing methods:

  • Python implementation (`PackedBatch`): Non-sequence meta-features are identified, accumulated as lists during packing, and yielded as 1D numpy object arrays of lists.
  • Python Iterator (`PackingDatasetIterator`): The `_combined_struct` is updated to include non-sequence meta-features so they are not stripped from input elements.
  • Refactored common key-extraction logic into shared helper functions in `packing_packed_batch.py`.
  • Added unit tests in `testing_util.py` to verify FirstFit and BestFit with non-sequence meta-features (both fixed and variable shapes).
  • Updated docstrings in `packing.py` to document the behavior of sequence vs non-sequence meta-features.
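The accumulate-then-yield behavior described for `PackedBatch` can be sketched with a toy first-fit packer. This is a minimal sketch under stated assumptions, not the implementation in this change: `first_fit_pack`, the single `tokens` feature, and the `doc_id` key are all hypothetical, and the toy raises an error where a real packer would emit the current batch and start a new one. It shows why a 1D object array is filled element by element: each row can then hold a list of a different length.

```python
import numpy as np


def first_fit_pack(examples, seq_len, num_rows, non_seq_keys):
    """Toy first-fit packer for one sequence feature named "tokens".

    Hypothetical sketch for illustration; not the PR's PackedBatch code.
    """
    tokens = np.zeros((num_rows, seq_len), dtype=np.int64)
    used = [0] * num_rows  # number of filled positions per row
    # One Python list per row for every non-sequence meta-feature.
    meta = {k: [[] for _ in range(num_rows)] for k in non_seq_keys}

    for ex in examples:
        n = len(ex["tokens"])
        for row in range(num_rows):
            if used[row] + n <= seq_len:  # first row with enough space
                tokens[row, used[row]:used[row] + n] = ex["tokens"]
                used[row] += n
                for k in non_seq_keys:
                    meta[k][row].append(ex[k])
                break
        else:
            # Simplification: a real packer would yield the batch instead.
            raise ValueError("example does not fit in any row")

    packed = {"tokens": tokens}
    for k, rows in meta.items():
        # Fill element by element so the result stays 1D even when all
        # per-row lists happen to have the same length.
        arr = np.empty(num_rows, dtype=object)
        for i, row_values in enumerate(rows):
            arr[i] = row_values
        packed[k] = arr
    return packed


packed = first_fit_pack(
    examples=[
        {"tokens": [1, 2, 3], "doc_id": "a"},
        {"tokens": [4, 5], "doc_id": "b"},
        {"tokens": [6, 7, 8, 9], "doc_id": "c"},
    ],
    seq_len=5,
    num_rows=2,
    non_seq_keys=["doc_id"],
)
```

In this example the first two inputs share row 0 and the third lands in row 1, so `packed["doc_id"]` has shape `(2,)` and holds the lists `["a", "b"]` and `["c"]`.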

PiperOrigin-RevId: 926448116