Add: paged_attention_taskring CI case for small ring buffer coverage #148

Merged
ChaoWao merged 1 commit into main from test/paged-attention-taskring-ci on Mar 1, 2026

@ChaoWao ChaoWao commented Mar 1, 2026

Summary

  • Add a new device test paged_attention_taskring that reuses the existing paged attention kernels and orchestration but configures small ring buffer sizes via RUNTIME_ENV
  • Exercises CAS-based watermark advancement (task_window=16), heap ring wrapping (1MB), and dependency pool wrapping (256 entries)
  • Uses batch=64 to generate enough tasks to stress slot reuse while keeping CI runtime reasonable
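
As a rough illustration of why a 16-slot task window forces slot reuse once more than 16 tasks are in flight, here is a toy model of a task ring whose watermark is advanced with a compare-and-swap. This is only a sketch: `TaskWindow`, `try_submit`, `complete`, and `run` are hypothetical names, the lock stands in for a hardware atomic, and the real runtime's data structures are not shown in this PR.

```python
import threading


class TaskWindow:
    """Toy model of a fixed-size task ring (hypothetical, not the runtime's code)."""

    def __init__(self, window=16):
        self.window = window
        self.done = [False] * window   # completion flag per slot
        self.next_id = 0               # next task id to hand out
        self.watermark = 0             # every id below this is retired
        self.reuses = 0                # submissions that landed in a reused slot
        self._lock = threading.Lock()  # stands in for an atomic CAS instruction

    def _cas_watermark(self, expected, new):
        """Set watermark to `new` only if it still equals `expected`."""
        with self._lock:
            if self.watermark != expected:
                return False
            self.watermark = new
            return True

    def try_submit(self):
        """Return a new task id, or None when all slots hold live tasks."""
        if self.next_id - self.watermark >= self.window:
            return None
        task_id = self.next_id
        self.next_id += 1
        if task_id >= self.window:
            self.reuses += 1
        self.done[task_id % self.window] = False
        return task_id

    def complete(self, task_id):
        """Mark a task done, then advance the watermark over finished tasks."""
        self.done[task_id % self.window] = True
        while True:
            wm = self.watermark
            if wm >= self.next_id or not self.done[wm % self.window]:
                break
            # On failure another caller already advanced; the loop re-reads.
            self._cas_watermark(wm, wm + 1)


def run(num_tasks=64, window=16):
    """Submit num_tasks, completing the newest pending task whenever blocked."""
    ring = TaskWindow(window)
    pending, submitted = [], 0
    while submitted < num_tasks or pending:
        tid = ring.try_submit() if submitted < num_tasks else None
        if tid is not None:
            pending.append(tid)
            submitted += 1
        else:
            ring.complete(pending.pop())  # newest first: out-of-order completion
    return ring


ring = run()
print(ring.watermark, ring.reuses)  # 64 48
```

In this model the 64 tasks pass through the 16 slots four times, so 48 submissions reuse a slot, and the watermark only moves once the oldest live task completes.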

Testing

  • Simulation tests pass
  • Hardware tests pass

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a continuous integration (CI) test case that stresses ring buffer management within the paged attention implementation. By configuring intentionally small ring buffer sizes, the test exercises memory wrapping, task slot reuse, and dependency pool wrapping under load.

Highlights

  • New Device Test: Added a new device test, paged_attention_taskring, specifically designed to stress ring buffer mechanisms within the paged attention implementation.
  • Ring Buffer Stress Configuration: Configured intentionally small ring buffer sizes for the task window (16 slots), heap (1MB), and dependency pool (256 entries) via RUNTIME_ENV to induce wrapping and heavy slot reuse.
  • Kernel and Orchestration Reuse: The new test reuses existing paged attention kernels and orchestration, focusing on configuration changes to stress the ring buffer system.
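
To make "heap ring wrapping" concrete, the following sketch models a 1MB ring heap as a bump allocator that wraps to offset 0 when a request does not fit before the end of the buffer. It is an assumption-laden toy: `HeapRing`, the FIFO release order, and the 48 KiB request size are hypothetical and are not taken from the runtime.

```python
from collections import deque


class HeapRing:
    """Toy ring-heap bump allocator with FIFO release (hypothetical model)."""

    def __init__(self, capacity=1 << 20):  # 1MB, assuming the size is in bytes
        self.capacity = capacity
        self.head = 0        # next free offset
        self.used = 0        # live bytes, including skipped tail padding
        self.live = deque()  # FIFO of (offset, bytes charged to the block)
        self.wraps = 0       # successful allocations that wrapped to offset 0

    def alloc(self, size):
        """Return an offset for `size` bytes, or None if the ring is too full."""
        if size > self.capacity:
            raise ValueError("request larger than the heap")
        offset, pad = self.head, 0
        if offset + size > self.capacity:
            pad = self.capacity - offset  # skip the unusable tail remainder
            offset = 0
        if self.used + pad + size > self.capacity:
            return None
        if pad:
            self.wraps += 1
        self.live.append((offset, pad + size))
        self.used += pad + size
        self.head = (offset + size) % self.capacity
        return offset

    def free_oldest(self):
        """Release the oldest live block together with its padding."""
        _, charged = self.live.popleft()
        self.used -= charged
        if not self.live:
            self.head = 0  # an empty ring can restart at the base


def run(num_allocs=64, size=48 * 1024):
    """Allocate repeatedly, freeing the oldest block whenever the ring is full."""
    ring, offsets = HeapRing(), []
    for _ in range(num_allocs):
        off = ring.alloc(size)
        while off is None:
            ring.free_oldest()
            off = ring.alloc(size)
        offsets.append(off)
    return ring, offsets


ring, offsets = run()
print(ring.wraps)  # 3
```

With these assumed sizes, 21 blocks fit per pass, so 64 allocations wrap three times. A larger heap would never reach the wrap path in a short CI run, which is the reason for shrinking it.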

Changelog
  • tests/device_tests/tensormap_and_ringbuffer/paged_attention_taskring/golden.py
    • Added a new golden reference implementation for the paged attention task ring stress test.
    • Included generate_inputs and compute_golden functions for test data generation and expected output calculation.
  • tests/device_tests/tensormap_and_ringbuffer/paged_attention_taskring/kernels/kernel_config.py
    • Added a new kernel configuration file for the paged attention task ring stress test.
    • Configured RUNTIME_ENV with small PTO2_RING_TASK_WINDOW, PTO2_RING_HEAP, and PTO2_RING_DEP_POOL values to stress ring buffer behavior.
    • Reused existing paged attention orchestration and kernel sources.
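
For readers who have not opened the file, a configuration of the kind described above might look roughly like the sketch below. The three variable names and the sizes (16, 1MB, 256) come from this PR. Everything else is an assumption: that the values are passed as strings, that the heap size is expressed in bytes, and the `build_env` helper, which is hypothetical and only shows how such overrides could be merged into a process environment.

```python
import os

# Small ring sizes intended to force wrapping and slot reuse.
# Assumption: values are strings and the heap size is in bytes.
RUNTIME_ENV = {
    "PTO2_RING_TASK_WINDOW": "16",    # task window slots
    "PTO2_RING_HEAP": str(1 << 20),   # 1MB heap ring
    "PTO2_RING_DEP_POOL": "256",      # dependency pool entries
}


def build_env(base=None):
    """Hypothetical helper: return a copy of `base` with the overrides applied."""
    env = dict(os.environ if base is None else base)
    env.update(RUNTIME_ENV)
    return env


env = build_env({"PATH": "/usr/bin"})
print(env["PTO2_RING_HEAP"])  # 1048576
```

Because only the environment changes, the kernels and orchestration sources stay identical to the existing paged attention test.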

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new CI test case, paged_attention_taskring, designed to stress-test the ring buffer functionality by configuring small buffer sizes. The approach of reusing existing paged attention kernels and orchestration while overriding runtime environment variables is sound. The changes effectively target the goal of exercising ring buffer wrapping and slot reuse. I have a few minor suggestions to improve documentation clarity and code style.

Comment thread tests/device_tests/tensormap_and_ringbuffer/paged_attention_taskring/golden.py Outdated
@ChaoWao ChaoWao force-pushed the test/paged-attention-taskring-ci branch from a3501d0 to 960a6a0 March 1, 2026 09:02
Reuse existing paged_attention kernels/orchestration with small ring
buffer sizes (task_window=16, heap=1MB, dep_pool=256) to exercise
CAS-based watermark advancement, heap wrapping, and dependency pool
wrapping under stress.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ChaoWao ChaoWao force-pushed the test/paged-attention-taskring-ci branch from 960a6a0 to 232ac53 March 1, 2026 09:08
@ChaoWao ChaoWao merged commit 2fde2a2 into main Mar 1, 2026
3 checks passed
@ChaoWao ChaoWao deleted the test/paged-attention-taskring-ci branch March 1, 2026 09:10
PKUZHOU pushed a commit to PKUZHOU/simpler that referenced this pull request Mar 31, 2026
…w-native-sys#148)

Reuse existing paged_attention kernels/orchestration with small ring
buffer sizes (task_window=16, heap=1MB, dep_pool=256) to exercise
CAS-based watermark advancement, heap wrapping, and dependency pool
wrapping under stress.