use fast path in blendSegment to bump up FPS #5464

Open
DedeHai wants to merge 1 commit into wled:main from DedeHai:blenSegment_fastpath

Conversation

Collaborator

DedeHai commented Mar 30, 2026

Currently blendSegment is "the bottleneck" in the rendering pipeline. I ran a timing breakdown (in µs):

  • total=8958
  • effect rendering: 6156
  • show: 2677 = blend 1359 + pixelstobus 1318

This is for a single segment. Blending is slow for what amounts to "just copy the segment buffer".

With this PR it becomes:

  • total=8091
  • effect rendering: 6536 (fluctuates)
  • show: 1425 = blend 105 + pixelstobus 1320

Improvements in FPS:

  • 2 overlapping segments: 76.6FPS -> 84.5FPS
  • single segment: 103FPS -> 110FPS

i.e. up to 10% faster.
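
As a quick sanity check of the figures above, the relative gains can be recomputed from the posted numbers. This is a minimal sketch; the helper names `fpsGainPercent` and `speedupFactor` are made up for illustration and are not part of WLED.

```cpp
#include <cassert>

// Relative FPS gain in percent: (after / before - 1) * 100.
// Hypothetical helper, used only to re-check the numbers in this PR.
double fpsGainPercent(double before, double after) {
  assert(before > 0.0);
  return (after / before - 1.0) * 100.0;
}

// Speed-up factor of a timed stage: time before divided by time after.
// Hypothetical helper, used only to re-check the numbers in this PR.
double speedupFactor(double beforeMicros, double afterMicros) {
  assert(afterMicros > 0.0);
  return beforeMicros / afterMicros;
}
```

With the measurements above, `fpsGainPercent(76.6, 84.5)` comes out at about 10.3 and `fpsGainPercent(103, 110)` at about 6.8, while `speedupFactor(1359, 105)` shows the blend step itself running roughly 12.9 times faster.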

Test run on C3, 2 outputs with 256 pixels each, 32x16 matrix, using PS-Fire.

Summary by CodeRabbit

  • Refactor
    • Optimized LED animation rendering performance through enhanced pixel blending operations and memory management, particularly improving fade transition efficiency when grouping and mirroring features are not in use.
    • Streamlined pixel clearing and processing operations to reduce computational overhead and improve overall animation rendering speed while maintaining backward compatibility with all existing effects.

Contributor

coderabbitai bot commented Mar 30, 2026

Walkthrough

This PR optimizes the segment blending pipeline by introducing a fast-path in blendSegment() for fade transitions without grouping, mirroring, or CCT pixels, and replaces the per-element pixel-clearing loop with memset() in show().
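
The fast-path idea can be sketched roughly as follows. This is not the PR's code: `SegmentSketch`, `canUseFastPath` and `blendSketch` are hypothetical stand-ins, and the sketch ignores opacity, Reverse/ReverseY and 2D transpose, which the real change handles. It only illustrates the pattern of checking cheap preconditions once and then copying the segment buffer in bulk instead of mapping pixel by pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a segment; NOT WLED's Segment class.
struct SegmentSketch {
  std::vector<uint32_t> pixels;   // effect output, one packed colour per pixel
  std::size_t start = 0;          // first index in the strip buffer
  std::size_t grouping = 1;       // 1 means no grouping
  bool mirror = false;
  bool hasCct = false;
  bool hasOldSegment = false;     // an old segment exists during a transition
  bool fadeTransition = true;     // blending style is a plain fade
};

// Assumed conditions, following the walkthrough: fade transition,
// no old segment, no grouping, no mirroring, no CCT pixels.
bool canUseFastPath(const SegmentSketch& s) {
  return s.fadeTransition && !s.hasOldSegment &&
         s.grouping == 1 && !s.mirror && !s.hasCct;
}

// Writes the segment into the strip buffer; returns true if the fast path ran.
bool blendSketch(const SegmentSketch& s, std::vector<uint32_t>& strip) {
  if (canUseFastPath(s)) {
    assert(s.start + s.pixels.size() <= strip.size());
    // Fast path: one bulk copy, no per-pixel index mapping.
    std::copy(s.pixels.begin(), s.pixels.end(), strip.begin() + s.start);
    return true;
  }
  // Generic path (only grouping is modelled here): map every pixel.
  for (std::size_t i = 0; i < s.pixels.size(); ++i) {
    for (std::size_t g = 0; g < s.grouping; ++g) {
      const std::size_t idx = s.start + i * s.grouping + g;
      if (idx < strip.size()) strip[idx] = s.pixels[i];
    }
  }
  return false;
}
```

The design point is that the precondition check runs once per segment, so the common case (a single plain segment) pays for one contiguous copy rather than a mapping calculation per pixel.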

Changes

| Cohort / File(s) | Summary |
|---|---|
| Segment Blending Optimization (`wled00/FX_fcn.cpp`) | Introduced fast-path optimization in `blendSegment()` bypassing slow blending logic when conditions permit (no old segment, fade transition, no grouping/mirroring/CCT); includes special handling for Reverse/ReverseY and 2D transpose; refactored progress/opacity computation; removed redundant variable declaration in slow path. |
| Pixel Clearing Efficiency (`wled00/FX_fcn.cpp`) | Replaced per-element loop clearing `_pixels[i] = BLACK` with single `memset()` call in `show()`. |
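
The pixel-clearing change can be illustrated with a small stand-alone sketch. It assumes pixels are stored as packed 32-bit values and that black is the all-zero value; `BLACK_SKETCH`, `clearLoop` and `clearMemset` are hypothetical names, not WLED identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption for this sketch: black is the all-zero packed colour.
constexpr uint32_t BLACK_SKETCH = 0x00000000;

// Per-element clearing, as in the loop that was replaced.
void clearLoop(uint32_t* pixels, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) pixels[i] = BLACK_SKETCH;
}

// Bulk clearing. Byte-wise memset only equals the loop because the
// fill value is all zero bytes; it would not work for other colours.
void clearMemset(uint32_t* pixels, std::size_t count) {
  static_assert(BLACK_SKETCH == 0, "memset clearing requires an all-zero fill value");
  if (count == 0) return;
  assert(pixels != nullptr);
  std::memset(pixels, 0, count * sizeof(uint32_t));
}
```

Both functions leave the buffer in the same state; the `memset()` variant simply lets the library use its optimized bulk fill.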

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • willmmiles
  • softhack007
  • blazoncek

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: introducing a fast path in blendSegment to improve FPS performance, which is the primary objective of the pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

Collaborator Author

DedeHai commented Mar 30, 2026

Just FYI: I ported this improvement over to my WLEDbus driver branch. There I get ~100FPS for the 2 overlapping segments, which come in at 84.5FPS using NPB.

DedeHai requested a review from softhack007 March 31, 2026 05:58