Skip to content

traffic_top: modernize UI with responsive layouts and performance monitoring#12758

Draft
bryancall wants to merge 22 commits into
apache:masterfrom
bryancall:bcall/traffic_top_improvements
Draft

traffic_top: modernize UI with responsive layouts and performance monitoring#12758
bryancall wants to merge 22 commits into
apache:masterfrom
bryancall:bcall/traffic_top_improvements

Conversation

@bryancall

Copy link
Copy Markdown
Contributor

Summary

This PR modernizes traffic_top with a complete UI redesign featuring:

  • Responsive multi-column layouts that adapt to terminal size (80x24, 120x40, 160x40)
  • 9 dedicated pages: Overview, Responses, Connections, Cache, SSL/TLS, Errors, Performance, Graphs, Help
  • Real-time ASCII graphs showing request rates, bandwidth, cache hit rates, and connections
  • HTTP milestone timing on the Performance page showing request lifecycle timing
  • Color-coded statistics with btop-style visual design
  • Unicode box drawing with ASCII fallback for compatibility
  • Batch mode (-b/-B) for scripted monitoring with JSON output option

Code Quality Improvements

  • Replaced goto statements with clean loop control
  • Added explicit SIGWINCH handler for responsive terminal resize
  • Replaced ink_assert with graceful error handling
  • Defined magic numbers as named constants
  • Added stat lookup table validation at startup (debug builds)
  • Moved <chrono> include from header to implementation
  • Added comprehensive code documentation
  • Fixed duplicate stats appearing on the same page in larger layouts
  • Updated help page with all keyboard shortcuts

Test plan

  • Build with ASAN: ninja traffic_top
  • Run autest: autest -f traffic_top_batch passes
  • Manual testing of all pages at different terminal sizes (80x24, 120x40, 160x40)
  • Verify batch mode output: traffic_top -b -c 1
  • Test ASCII mode toggle with 'a' key
  • Test window resize handling

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR comprehensively modernizes the traffic_top tool with a complete redesign featuring responsive multi-column layouts, real-time ASCII graphs, multiple dedicated pages for different metrics, and batch mode with JSON output. The changes represent a major architectural improvement from the original simple stats display to a feature-rich monitoring tool.

Key Changes

  • Replaced goto statements with structured control flow and signal handlers
  • Implemented responsive layouts adapting to terminal sizes (80x24, 120x40, 160x40+)
  • Added 9 dedicated pages for different stat categories with btop-style visualization
  • Introduced batch mode with JSON/text output for scripted monitoring

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/gold_tests/traffic_top/traffic_top_batch.test.py New test file for batch mode with JSON/text output validation
src/traffic_top/traffic_top.cc Refactored main with signal handlers, interactive/batch mode split
src/traffic_top/Stats.h New header with Stats class declaration and StatType enum
src/traffic_top/Stats.cc Stats implementation with RPC communication and metric calculations
src/traffic_top/StatType.h Enum defining stat types for calculation/display
src/traffic_top/Display.h Display class for curses-based TUI rendering
src/traffic_top/Display.cc Display implementation with responsive layouts and graphs
src/traffic_top/Output.h Output formatters for batch mode
src/traffic_top/Output.cc Text and JSON output implementations
src/traffic_top/README Updated documentation with usage examples
src/traffic_top/LAYOUT.md Detailed layout specifications for different terminal sizes
src/traffic_top/format_layout.py Python helper for generating layout documentation
src/traffic_top/format_graphs.py Python helper for graph formatting examples
src/traffic_top/CMakeLists.txt Updated build configuration for new source files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/gold_tests/traffic_top/traffic_top_batch.test.py
Comment thread src/traffic_top/Stats.cc
Comment thread src/traffic_top/Display.cc
Comment thread src/traffic_top/Output.cc
@bryancall bryancall self-assigned this Dec 15, 2025
@bryancall bryancall added this to the 10.2.0 milestone Dec 15, 2025
@bryancall bryancall requested a review from cmcfarlen February 21, 2026 21:05
@bryancall bryancall force-pushed the bcall/traffic_top_improvements branch from 2c5db71 to 1c61274 Compare March 16, 2026 16:39
Major refactoring of traffic_top to provide a modern, responsive terminal UI:

- Add responsive layouts for 80x24, 120x40, and 160x40 terminals
- Use Unicode box-drawing characters with rounded corners (ASCII fallback)
- Add btop-style graph page with Unicode block characters (▁▂▃▄▅▆▇█)
- Color gradient for graphs: blue→cyan→green→yellow→red
- Add history tracking for key metrics (last 120 samples)
- Reorganize stats into logical groups: Cache, Client, Origin, etc.
- Add batch mode (-b) with JSON output support (-j)
- Improve color coding for values by magnitude (K/M/G/T suffixes)

New files:
- Display.cc/h: ANSI terminal rendering with boxes, stats, graphs
- Stats.cc/h: Metric collection with history tracking
- Output.cc/h: Batch mode text/JSON output
- StatType.h: Stat type definitions
- LAYOUT.md: Documentation of exact terminal layouts
- format_layout.py, format_graphs.py: Layout generation tools

Press 'g' or '7' for the new graphs page.
Tests for the new traffic_top batch mode functionality:
- JSON output format validation
- Required JSON fields (timestamp, host)
- Text output format with headers
- Multiple iterations output
- Help and version flags
…ching

- Add Performance page (7/p) showing HTTP milestones in chronological order
- Add RateNsToMs stat type to convert nanosecond milestones to milliseconds
- Fix transaction time calculations (values already in ms, don't multiply)
- Add consistent color scheme across all layouts (CLIENT=cyan, ORIGIN=blue, etc.)
- Pass border colors to drawStatPairRow for consistent vertical borders
- Add bright border colors (Border4-7: bright blue, yellow, red, green)
- Improve status bar with blue background and colored mode indicator
- Auto-switch from absolute to rate mode once rates can be calculated
- Show [ABS]/[RATE] indicator in status bar with colored badges
- Display values immediately on startup using absolute mode
- Use shorter initial timeout (100ms) for responsive first display
- Update help page with all 8 pages including Performance
- Update key hints to show 1-8 page range
The ssl_handshake_fail stat was incorrectly mapped to
total_handshake_time (a timing metric) instead of an actual
error metric. Changed to use ssl_error_ssl which counts SSL errors.

Also fixed the traffic_top_batch test to properly discover the
traffic_top binary from build directories instead of falling
back to the installed /opt/ats version.
Instead of crashing with ink_assert when an unknown stat key is
requested, return default values and log an error message. This
makes traffic_top more robust and prevents crashes when stat
definitions don't match.

Removed the now-unused ink_assert.h include.
Replace the goto statement with a boolean running flag for cleaner
control flow. This is a more idiomatic C++ pattern and eliminates
the goto/label construct.
Add a SIGWINCH signal handler to properly handle terminal window
resize events. When the terminal is resized, the handler sets a
flag that triggers ncurses to refresh its internal state, ensuring
the display updates correctly to the new dimensions.

The display already polls terminal size on each render, but this
explicit handler ensures ncurses is properly notified of the resize.
Define named constants for timeout values and retry counts to
improve code readability and maintainability:

- FIRST_DISPLAY_TIMEOUT_MS: Initial display timeout (1000ms)
- CONNECT_RETRY_TIMEOUT_MS: Connection retry interval (500ms)
- MAX_CONNECTION_RETRIES: Maximum retry attempts (10)
- MS_PER_SECOND: Milliseconds per second conversion factor
- RPC_TIMEOUT_MS: RPC call timeout (1000ms)
- RPC_RETRY_COUNT: RPC retry count (10)
Add validateLookupTable() method to check that derived stats
(Ratio, Percentage, Sum, SumBits, SumAbsolute, TimeRatio) have
valid references to their numerator and denominator keys.

The validation runs in debug builds and prints warnings for any
misconfigured stats, making it easier to catch configuration
errors during development.
Add missing keyboard shortcuts to the help page:
- Left/m for previous page
- Right/r for next page
- b/ESC to go back from help

Also clarified existing shortcuts and improved formatting.
The <chrono> header was included in Stats.h but only used in
Stats.cc. Move the include to where it's actually needed to
reduce unnecessary header dependencies.
Add detailed comments throughout the traffic_top codebase:

- File header: Added overview of features and capabilities
- Signal handlers: Documented purpose and usage of each signal
- run_interactive(): Detailed explanation of the main event loop,
  timeout strategy, display modes, and state variables
- run_batch(): Documented output formats and batch mode behavior
- Keyboard handling: Added reference comment block listing all keys
- main(): Added usage examples and argument format documentation
- Stats.h: Documented private members and their purposes
- Display.h: Documented page renderers, layout functions, and
  the responsive layout strategy

This improves code maintainability and helps new contributors
understand the codebase structure.
Fixed several issues with duplicate statistics appearing on the same page:

120x40 layout:
- HIT RATES: Replaced 'abort' (error stat) with 'ram_hit', 'ram_miss'
- BANDWIDTH: Replaced 'ka_total' with 'total_time'
- Row 4: Completely redesigned to show HTTP METHODS, RESPONSE TIMES,
  and HTTP CODES instead of duplicating stats from rows 1-3

160x40 layout:
- REQUESTS: Fixed row 6 to show 5xx and client_req_conn instead of
  duplicating client_req/server_req
- HIT RATES: Fixed to show ram_hit/ram_miss instead of entries
- RESPONSES: Changed to show specific HTTP codes (200-502) instead
  of duplicating aggregate stats (2xx-5xx)
- BANDWIDTH: Replaced fresh_time with total_time
- TOTALS: Completely redesigned to show request/response summary
- CACHE DETAIL: Fixed cache_writes appearing 5 times (copy-paste error)
- ORIGIN DETAIL: Changed to show SSL origin stats
- MISC STATS: Changed to show transaction errors and timing stats
Remove references to btop/btop++ from code comments and documentation.
Replace ncurses with standard POSIX terminal I/O:

- Use termios for raw terminal mode (no line buffering, no echo)
- Use select() with timeout for non-blocking keyboard input
- Parse ANSI escape sequences for arrow keys
- Remove ncurses include files and CURSES_LIBRARIES link

This eliminates the ncurses build dependency while maintaining
full functionality. The display output already used ANSI escape
codes, so ncurses was only needed for keyboard input handling.

Changes:
- Display::getInput() replaces ncurses getch()/timeout()
- Display stores/restores termios settings on init/shutdown
- CMakeLists.txt no longer links CURSES_LIBRARIES
- README updated to reflect POSIX terminal requirement
Add missing Apache 2.0 license headers to format_graphs.py and
format_layout.py to fix RAT (Release Audit Tool) check failure.
The setlocale() function and LC_ALL macro require <clocale> to be
explicitly included on FreeBSD. Other platforms may provide these
symbols through other headers, but FreeBSD requires the explicit
include.
Remove WIDTH_SMALL and HEIGHT_SMALL constants that were defined but
never used in the code. The layout breakpoint documentation is
preserved in the comments. This fixes the -Werror build failure on
Ubuntu where -Wunused-const-variable is enabled.
Use std::lround() instead of static_cast<int>() when formatting
stat values with SI suffixes. This ensures values like 1.9K display
as "2K" instead of "1K", providing more accurate information to users.
@bryancall bryancall force-pushed the bcall/traffic_top_improvements branch from 1c61274 to 8e2e8af Compare March 16, 2026 16:43
@bryancall bryancall removed this from ATS v10.2.x Apr 20, 2026
bryancall added 2 commits June 9, 2026 09:33
…time constant)

- Output::printError now JSON-escapes the message and timestamp via a new
  jsonEscape() helper (handles ", \, \b, \f, \n, \r, \t and other control
  characters as \u00XX) so the JSON error object is always well-formed.
- Stats: replace the duplicated magic divisor 10000000 for the total_time
  stat with a named, documented constant TOTAL_TIME_SCALE (value unchanged;
  behavior preserved).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants