Commit 9d3aa0c
Update documentation for 120+ CUDA intrinsics expansion
- Add CUDA Codegen Intrinsics Expansion section to CHANGELOG
- Update README with 120+ intrinsics count and 3D stencil patterns
- Update docs/13-cuda-codegen.md with complete intrinsics reference
- Fix clippy excessive_precision warnings in dsl.rs erf() function
- Format code with cargo fmt
Changes from merged PR #8:
- Expanded GPU intrinsics from ~45 to 120+ operations
- Added 11 atomic operations (and, or, xor, inc, dec, etc.)
- Added 3D stencil intrinsics (up, down, at with dz)
- Added warp match (Volta+, SM 7.0+) and warp reduce (SM 8.0+) operations
- Added bit manipulation, memory, special, and timing ops
- Updated tests from 143 to 171
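To make the 3D stencil item above concrete, here is a minimal host-side sketch in plain Rust of what `up`, `down`, and `at` with a `dz` offset could resolve to on a flat, row-major buffer. The `Grid3` and `Pos3` types, the method signatures, the "up means z + 1" convention, and the `None`-on-out-of-bounds behaviour are all assumptions made for illustration; they are not taken from the crate and the generated CUDA may handle boundaries differently.

```rust
/// Dimensions of a dense 3D grid stored in a flat, row-major buffer.
/// Hypothetical type, used only for this sketch.
#[derive(Clone, Copy)]
struct Grid3 {
    width: usize,
    height: usize,
    depth: usize,
}

/// A cell position inside the grid (hypothetical stand-in for a stencil position).
#[derive(Clone, Copy)]
struct Pos3 {
    x: usize,
    y: usize,
    z: usize,
}

impl Grid3 {
    /// Flat index of (x, y, z): x varies fastest, then y, then z.
    fn index(&self, x: usize, y: usize, z: usize) -> usize {
        (z * self.height + y) * self.width + x
    }
}

impl Pos3 {
    /// Read the neighbour at relative offset (dx, dy, dz).
    /// Returns None when the neighbour falls outside the grid.
    fn at(&self, buf: &[f32], grid: Grid3, dx: isize, dy: isize, dz: isize) -> Option<f32> {
        // checked_add_signed yields None on underflow (e.g. x = 0, dx = -1).
        let x = self.x.checked_add_signed(dx)?;
        let y = self.y.checked_add_signed(dy)?;
        let z = self.z.checked_add_signed(dz)?;
        if x >= grid.width || y >= grid.height || z >= grid.depth {
            return None;
        }
        buf.get(grid.index(x, y, z)).copied()
    }

    /// Neighbour one plane above (assumed convention: z + 1).
    fn up(&self, buf: &[f32], grid: Grid3) -> Option<f32> {
        self.at(buf, grid, 0, 0, 1)
    }

    /// Neighbour one plane below (assumed convention: z - 1).
    fn down(&self, buf: &[f32], grid: Grid3) -> Option<f32> {
        self.at(buf, grid, 0, 0, -1)
    }
}
```

On a 3x3x3 grid whose cells hold their own flat index, the centre cell `(1, 1, 1)` would see 22.0 from `up` and 4.0 from `down` under these conventions, and an offset that leaves the grid yields `None`.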
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 9081dbc commit 9d3aa0c
6 files changed
Lines changed: 360 additions & 98 deletions
File tree
- crates/ringkernel-cuda-codegen/src
- docs