flash-attn2: prefer using add_op_namespace_prefix #877

Merged
drbh merged 6 commits into main from fix-op-prefixes
Jun 2, 2026

Collaborator

@drbh drbh commented May 19, 2026

This PR fixes flash-attn2 to correctly register fake ops
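For context, here is a minimal sketch of why a namespace helper matters when registering fake ops. It assumes the build exposes a helper shaped like `add_op_namespace_prefix`. The namespace string, the dict-based registry, and the `register_fake` decorator below are hypothetical stand-ins, not the kernel's actual code. In a real kernel the qualified name would be handed to a registration API such as `torch.library.register_fake`.

```python
# Hypothetical build-specific namespace (assumption: real builds use their own unique name).
OP_NAMESPACE = "_flash_attn2_abc123"


def add_op_namespace_prefix(op_name: str) -> str:
    """Return the fully qualified op name for this build."""
    return f"{OP_NAMESPACE}::{op_name}"


# Stand-in registry keyed by qualified op name (illustration only, not a torch API).
FAKE_IMPLS = {}


def register_fake(qualified_name):
    """Toy decorator that records a fake implementation under its qualified name."""
    def decorator(fn):
        FAKE_IMPLS[qualified_name] = fn
        return fn
    return decorator


@register_fake(add_op_namespace_prefix("fwd"))
def fwd_fake(shape):
    # A fake implementation only describes output metadata;
    # in this toy version the output shape equals the input shape.
    return tuple(shape)


qualified = add_op_namespace_prefix("fwd")
print(qualified)
```

Because the name is derived from the helper, a lookup under a hard-coded namespace (for example `"flash_attn::fwd"`) would miss the registration, which is the kind of mismatch the helper avoids.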

sayakpaul previously approved these changes May 19, 2026
Member

@sayakpaul sayakpaul left a comment

Thanks!

Collaborator Author

drbh commented May 20, 2026

Note: this PR has been updated to remove the unused ops folder, which contained non-flash-attn code. It also cleans up the exposed functions by removing the unused low-level functions from `__init__`.

The kernel now exposes only the core top-level functions (listed in `__all__`) and passes the `python kernels/nix-builder/pkgs/torch-ops-check/torch-ops-check-hook.py kernels-community/flash-attn2/torch-ext` check added in huggingface/kernels#569.

Member

@sayakpaul sayakpaul left a comment

Any reason why the windows build would fail?

Collaborator Author

drbh commented May 20, 2026

> Any reason why the windows build would fail?

Not 100% sure at the moment, but it seems to be related to the XPU path on Windows. In general, the Windows build workflow may need some tweaks, since it has custom logic that diverges from the standard kernel-builder nix path.

I'm going to take a look and see if a small change resolves it; otherwise, fixing the workflow may be best tackled in another PR.

@sayakpaul
Member

Works for me.

Collaborator Author

drbh commented May 20, 2026

Added a small PR to skip the Windows XPU backend for flash-attn2, since there seems to be a bug related to the cutlass fork. It's possible that merging that PR and rebasing this one on top will avoid the XPU Windows build and let the Windows CUDA build succeed: #885

danieldk previously approved these changes May 20, 2026
@drbh drbh force-pushed the fix-op-prefixes branch from 70249ba to 780299f Compare May 22, 2026 06:57
)


def fwd(
Member

This looks like an API break, we need to bump up the version if we remove these. Is removal necessary?

Collaborator Author

Good point, I've added the low-level functions back and exposed them in `__all__`.
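As a rough illustration of the API-compatibility concern in this thread, the sketch below checks that every name listed in a module's `__all__` still resolves. The module is built in memory purely for demonstration. Apart from `fwd`, which appears in the diff context, the function names are hypothetical, and the check is an assumption about how one might guard against accidental removals, not the project's actual tooling.

```python
import types

# Hypothetical in-memory module standing in for the kernel package.
mod = types.ModuleType("flash_attn2_sketch")


def top_level_func(*args, **kwargs):
    """Placeholder for a high-level entry point (hypothetical name)."""
    raise NotImplementedError


def fwd(*args, **kwargs):
    """Placeholder for a low-level function kept for backward compatibility."""
    raise NotImplementedError


mod.top_level_func = top_level_func
mod.fwd = fwd
# Both high-level and low-level names stay in the public API.
mod.__all__ = ["top_level_func", "fwd"]


def missing_exports(module):
    """Return names listed in __all__ that the module does not define."""
    return [name for name in module.__all__ if not hasattr(module, name)]


missing = missing_exports(mod)
print(missing)
```

Running a check like this before and after a cleanup makes it easier to see whether a public name was dropped, which would call for a version bump.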

Collaborator

vasqu commented May 27, 2026

Based on #894, we likely also need to sync with the latest stable version 2.8.3 instead of main - the issue(s) are described there

@drbh drbh merged commit 7ac8ea1 into main Jun 2, 2026
3 of 5 checks passed
4 participants