
Fix imports for fake op wrappers used in export#366

Closed
geoffreyQiu wants to merge 1 commit into NVIDIA:main from geoffreyQiu:junyiq/fix_fake_ops_import

Conversation

@geoffreyQiu
Collaborator

Description

  • Add imports for the fake op wrappers used for torch.export
  • Add directives to keep the formatter from removing or reordering these imports

@greptile-apps
Contributor

greptile-apps bot commented Apr 15, 2026

Greptile Summary

This PR fixes missing imports for fake op wrapper modules (dynamicemb meta modules, hstu_cuda_ops, fake_hstu_cuda_ops, and hstu.hstu_ops_gpu) required for torch.export to work correctly. It uses # isort: off/on guards in exportable_embedding.py to preserve the load-order dependency between the real op registration and the fake impl registration.
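The load-order invariant that the `# isort: off/on` guard protects can be modeled with a small stand-in: a fake implementation can only be attached after the real op has been registered, so the two imports must not be reshuffled. The names below (`OP_REGISTRY`, `register_real_op`, `register_fake_impl`) are hypothetical and only illustrate the dependency; the real code registers ops through PyTorch's op registry.

```python
# Hypothetical model of the ordering dependency guarded by `# isort: off/on`.
# None of these names exist in the actual codebase.

OP_REGISTRY = {}

def register_real_op(name, fn):
    """Models what loading hstu_cuda_ops does: define the real op."""
    OP_REGISTRY[name] = {"real": fn, "fake": None}

def register_fake_impl(name, fn):
    """Models what importing fake_hstu_cuda_ops does: attach a fake impl.

    Fails if the real op has not been loaded yet, which is why the
    import order inside the isort guard must be preserved.
    """
    if name not in OP_REGISTRY:
        raise KeyError(f"real op {name!r} must be registered first")
    OP_REGISTRY[name]["fake"] = fn

# Correct order, as the guarded import block enforces:
register_real_op("hstu_cuda_ops.attn", lambda x: x)
register_fake_impl("hstu_cuda_ops.attn", lambda x: x)
print(OP_REGISTRY["hstu_cuda_ops.attn"]["fake"] is not None)  # True
```

If a formatter alphabetized the imports, the fake-impl module could load first and fail, which is exactly what the isort directives prevent.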

Confidence Score: 5/5

Safe to merge — changes are additive import fixes with no logic modifications.

Both files receive only import additions. The load-order invariant (real ops before fake impls) is correctly preserved via isort guards in exportable_embedding.py. In fused_hstu_op.py, the new hstu.hstu_ops_gpu import is a submodule of hstu, so Python's package-initialization semantics guarantee the parent's __init__.py runs first regardless of line order, making an isort guard unnecessary there. No logic changes, no regressions.
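The submodule guarantee cited above can be verified directly: `import pkg.sub` always executes the parent package's `__init__.py` before the submodule. The throwaway package name `pkg` below is an example, not the real hstu package.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk whose __init__.py records that it ran,
# then import only the submodule and observe that the parent ran first.
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, "pkg"))
with open(os.path.join(tmp, "pkg", "__init__.py"), "w") as f:
    f.write("EVENTS = ['parent __init__ ran']\n")
with open(os.path.join(tmp, "pkg", "sub.py"), "w") as f:
    f.write("from pkg import EVENTS\nEVENTS.append('submodule ran')\n")

sys.path.insert(0, tmp)
sub = importlib.import_module("pkg.sub")  # parent initializes first
print(sub.EVENTS)  # ['parent __init__ ran', 'submodule ran']
```

This is why placing `import hstu.hstu_ops_gpu` anywhere after `import hstu` is order-safe without an isort guard.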

No files require special attention.

Important Files Changed

Filename Overview
examples/hstu/modules/exportable_embedding.py Adds isort-protected import block registering fake op implementations (dynamicemb meta modules, hstu_cuda_ops, fake_hstu_cuda_ops) needed for torch.export; also repositions the section divider comment.
examples/hstu/ops/fused_hstu_op.py Adds import of hstu.hstu_ops_gpu immediately after import hstu to register fake op implementations for torch.export; no isort guard added but import order is naturally safe due to Python submodule initialization semantics.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Module import starts] --> B[_load_inference_emb_ops\nloads inference_emb_ops.so\nregisters torch.ops.INFERENCE_EMB.*]
    B --> C[import dynamicemb.index_range_meta\nregisters fake impls for torch.export]
    C --> D[import dynamicemb.lookup_meta\nregisters fake impls for torch.export]
    D --> E[import hstu_cuda_ops\nregisters torch.ops.hstu_cuda_ops.*]
    E --> F[import commons.ops.cuda_ops.fake_hstu_cuda_ops\nregisters fake impls for torch.export]
    F --> G[isort: on - Normal imports resume]
    G --> H[ExportableEmbedding class available]

    subgraph fused_hstu_op.py
        I[import hstu\nregisters torch.ops.fbgemm.*] --> J[import hstu.hstu_ops_gpu\nregisters fake impls for torch.export]
    end

Reviews (1): Last reviewed commit: "Fix imports for fake ops wrapper used in..."

@JacoCheung
Collaborator

/build

1 similar comment
@JacoCheung
Collaborator

/build

@JacoCheung JacoCheung self-requested a review April 15, 2026 15:51
@JacoCheung
Collaborator

JacoCheung commented Apr 15, 2026

Pipeline #48606612 -- failed

Job Status
pre_check ✅ success
train_build ✅ success
inference_build ✅ success
tritonserver_build ✅ success
build_whl ✅ success
dynamicemb_test_fwd_bwd_8gpus ✅ success
dynamicemb_test_load_dump_8gpus ✅ success
unit_test_1gpu_a100 ❌ failed
unit_test_1gpu_h100 ❌ failed
unit_test_4gpu ❌ failed
unit_test_tp_4gpu ❌ failed
L20_unit_test_1gpu ✅ success
inference_unit_test_1gpu ❌ failed
inference_test_1gpu ❌ failed

Result: 8/14 jobs passed


@JacoCheung
Collaborator

Closed as merged into #363.

@JacoCheung JacoCheung closed this Apr 16, 2026

2 participants