perf[next-dace]: Update thread block size to (256, 1) by iomaganaris · Pull Request #2598 · GridTools/gt4py

iomaganaris · 2026-05-11T14:32:40Z

Update thread block size to (256, 1) as a default. Thanks to @dganellari for finding this out 👍
Initially this seemed like it improves MI300A performance but I've also tested GH200 and A100 CPUs with blueline in a single node (icon_ch1-medium-stencils experiment) and I saw improvements in some stencils.
Here are the results per GT4Py program in GH200:

And here are the results from icon_ch1-medium that reports the timer of the whole dycore (taking the min timer for nh_solve from each output) from A100:

| case            | t_min (s)  | acceleration vs ACC |
|:----------------|:----------:|:-------------------:|
| ACC             | 0.00615907 |        1.000        |
| icon4py default | 0.00517106 |        1.191        |
| icon4py 256x1   | 0.00480199 |        1.282        |

I thought that since we see improvements in all cases we can use it as the default in GT4Py

Co-authored-by: Daniel Ganellari <gadaniel@ethz.ch>

philip-paul-mueller

LGTM.

However, I am not sure if it is also set somewhere in ICON4Py.
But from the numbers you posted I do not think that.

edopao

Very good results, and nice color gradients in the plot!
However, I am not sure whether these default settings should be set in gt4py or in icon4py, in model_backends.py. To me, it feels more an application default, and it should belong to icon4py.

iomaganaris · 2026-05-12T07:37:17Z

However, I am not sure whether these default settings should be set in gt4py or in icon4py, in model_backends.py. To me, it feels more an application default, and it should belong to icon4py.

I had the same trouble figuring out where this should be set. Since we have to set a default value in GT4Py for that though I thought it's better to set there the best values we know until now. If you think these should still go to icon4py let me know. I don't have a strong opinion tbh

philip-paul-mueller · 2026-05-12T07:41:21Z

I also kind of think that it should be in ICON4Py, the question is just what value do we use in GT4Py?

We could use the old one for compatibility.
We could make it mandatory and force the user to make a choice.
We could use the new value.

edopao · 2026-05-12T07:50:38Z

I also kind of think that it should be in ICON4Py, the question is just what value do we use in GT4Py?
1. We could use the old one for compatibility.

2. We could make it mandatory and force the user to make a choice.

3. We could use the new value.

Adding a fourth option:
4. We could set default None, so use the dace default value.

I vote for either 1 or 4, but I don't have a strong opinion.

iomaganaris · 2026-05-12T09:58:00Z

I think that 1. might add confusion so I would propose 4.
Thanks a lot for the suggestions 👍

Update thread block size to (256, 1)

dcadae6

Co-authored-by: Daniel Ganellari <gadaniel@ethz.ch>

iomaganaris requested review from dganellari, edopao, havogt and philip-paul-mueller May 11, 2026 14:32

philip-paul-mueller approved these changes May 12, 2026

View reviewed changes

edopao reviewed May 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf[next-dace]: Update thread block size to (256, 1)#2598

perf[next-dace]: Update thread block size to (256, 1)#2598
iomaganaris wants to merge 1 commit into
mainfrom
update_threadblocksize

iomaganaris commented May 11, 2026 •

edited

Loading

Uh oh!

philip-paul-mueller left a comment

Uh oh!

edopao left a comment •

edited

Loading

Uh oh!

iomaganaris commented May 12, 2026

Uh oh!

philip-paul-mueller commented May 12, 2026

Uh oh!

edopao commented May 12, 2026

Uh oh!

iomaganaris commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

iomaganaris commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

philip-paul-mueller left a comment

Choose a reason for hiding this comment

Uh oh!

edopao left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

iomaganaris commented May 12, 2026

Uh oh!

philip-paul-mueller commented May 12, 2026

Uh oh!

edopao commented May 12, 2026

Uh oh!

iomaganaris commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

iomaganaris commented May 11, 2026 •

edited

Loading

edopao left a comment •

edited

Loading