Commit 8e12583

Fix benchmark ranking issues and update comprehensive documentation
Co-authored-by: christiannagel <1908285+christiannagel@users.noreply.github.com>
1 parent 6c2138b commit 8e12583

3 files changed

Lines changed: 394 additions & 96 deletions

File tree

src/services/bot/CodeBreaker.Bot.Benchmarks/ComparisonBenchmarks.cs

Lines changed: 0 additions & 1 deletion
@@ -9,7 +9,6 @@ namespace CodeBreaker.Bot.Benchmarks;
99
/// </summary>
1010
[MemoryDiagnoser]
1111
[SimpleJob]
12-
[RankColumn]
1312
[GroupBenchmarksBy(BenchmarkDotNet.Configs.BenchmarkLogicalGroupRule.ByCategory)]
1413
public class ComparisonBenchmarks
1514
{
Lines changed: 181 additions & 95 deletions
@@ -1,28 +1,31 @@
11
# CodeBreaker.Bot Benchmarks
22

3-
This project provides comprehensive performance benchmarks for the CodeBreaker.Bot algorithms. It measures execution time and memory consumption of the core algorithms used for playing Codebreaker games.
3+
This project provides comprehensive performance benchmarks for the CodeBreaker algorithms, comparing both the binary-based (`CodeBreaker.Bot`) and string-based (`CodeBreaker.BotWithString`) implementations. It measures execution time and memory consumption of the core algorithms used for playing Codebreaker games.
44

55
## Overview
66

7-
The benchmarks evaluate the performance of:
7+
The benchmark suite includes four main categories of benchmarks:
88

9-
### Core Algorithm Methods
10-
- **HandleBlackMatches**: Filters possible values based on exact position matches (black pegs)
11-
- **HandleWhiteMatches**: Filters based on correct color/wrong position matches (white pegs)
12-
- **HandleBlueMatches**: Filters based on partial matches (specific to Game5x5x4)
13-
- **HandleNoMatches**: Filters when no colors match the selection
14-
- **SelectPeg**: Extracts individual peg values from integer representation
15-
- **IntToColors**: Converts integer representation to color names
9+
1. **AlgorithmBenchmarks** - Original binary algorithm performance tests
10+
2. **GameScenarioBenchmarks** - Realistic gameplay simulation tests
11+
3. **InitializationBenchmarks** - One-time setup operation tests
12+
4. **ComparisonBenchmarks** - Direct comparisons between binary and string implementations
1613

17-
### Initialization Methods
18-
- **InitializePossibleValues**: Creates initial possible values lists for different game types
19-
- **Memory-intensive list operations**: Sorting, reducing, and managing large collections
14+
## Algorithm Implementations Compared
2015

21-
### Game Scenarios
22-
- **Early game**: Initial moves with large possibility spaces
23-
- **Mid-game**: Progressive filtering with mixed match types
24-
- **Late game**: High-precision filtering with small possibility spaces
25-
- **Complete game simulation**: Full game progression scenarios
16+
### Binary Implementation
17+
- **Data representation**: `int` with bit manipulation
18+
- **Color handling**: Bit masks and shifts
19+
- **Algorithm complexity**: Bit operations
20+
- **Memory efficiency**: Compact representation
21+
- **API compatibility**: Requires conversion to/from strings
22+
23+
### String Implementation
24+
- **Data representation**: `string[]` arrays
25+
- **Color handling**: Direct string comparison
26+
- **Algorithm complexity**: Simple array operations
27+
- **Readability**: Higher (string operations)
28+
- **API compatibility**: Direct compatibility with Games API
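The difference between the two representations can be sketched as follows. This is a minimal illustration only: the 3-bits-per-peg layout, the color list, and every name in the block (`RepresentationSketch`, `Pack`, `SelectPegIndex`, `ToColorNames`, `SamePeg`) are assumptions made for the example, not the encoding or API used by `CodeBreaker.Bot` or `CodeBreaker.BotWithString`.

```csharp
using System;
using System.Linq;

// Illustrative only: the 3-bits-per-peg layout and the color list are
// assumptions for this sketch, not the encoding used by CodeBreaker.Bot.
public static class RepresentationSketch
{
    private static readonly string[] ColorNames =
        { "Red", "Green", "Blue", "Yellow", "Purple", "Orange" };

    private const int BitsPerPeg = 3;                   // room for a color index 0..7
    private const int PegMask = (1 << BitsPerPeg) - 1;  // 0b111

    // Binary form: pack one color index per peg into a single int.
    public static int Pack(params int[] colorIndexes)
    {
        int packed = 0;
        for (int peg = 0; peg < colorIndexes.Length; peg++)
        {
            packed |= colorIndexes[peg] << (peg * BitsPerPeg);
        }
        return packed;
    }

    // Binary form: read one peg back with a shift and a mask.
    public static int SelectPegIndex(int packed, int peg) =>
        (packed >> (peg * BitsPerPeg)) & PegMask;

    // Conversion needed at an API boundary that expects color names.
    public static string[] ToColorNames(int packed, int pegCount) =>
        Enumerable.Range(0, pegCount)
            .Select(peg => ColorNames[SelectPegIndex(packed, peg)])
            .ToArray();

    // String form: comparing a position is a plain string comparison.
    public static bool SamePeg(string[] a, string[] b, int peg) =>
        a[peg] == b[peg];
}
```

With this layout, `RepresentationSketch.ToColorNames(RepresentationSketch.Pack(0, 1, 2, 3), 4)` yields `Red`, `Green`, `Blue`, `Yellow`. The packed form stores a whole code in one `int`, while the string form needs no conversion step before it can be compared or sent as color names.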
2629

2730
## Game Types Tested
2831

@@ -33,7 +36,7 @@ The benchmarks evaluate the performance of:
3336
## Benchmark Categories
3437

3538
### 1. AlgorithmBenchmarks
36-
Core algorithm performance with different list sizes:
39+
Core binary algorithm performance with different list sizes:
3740
- Full lists (1,000+ values)
3841
- Reduced lists (20-200 values)
3942
- Various game types and match scenarios
@@ -50,6 +53,14 @@ Realistic gameplay simulations:
5053
- Combined operation sequences
5154
- Best/worst-case filtering scenarios
5255

56+
### 4. ComparisonBenchmarks (NEW)
57+
Direct performance comparisons between implementations:
58+
- **Black/White/No matches filtering** - Core game logic performance
59+
- **Peg selection operations** - Individual element access
60+
- **Initialization performance** - Setup time comparison
61+
- **Memory usage patterns** - Memory allocation analysis
62+
- **Color conversion operations** - Data transformation costs
63+
5364
## Running the Benchmarks
5465

5566
### Prerequisites
@@ -72,108 +83,143 @@ Realistic gameplay simulations:
7283

7384
3. **Run specific categories**:
7485
```bash
86+
# Run only comparison benchmarks
87+
dotnet run -c Release -- --filter "*Comparison*"
88+
7589
# Run only algorithm benchmarks
7690
dotnet run -c Release -- --filter "*AlgorithmBenchmarks*"
7791

7892
# Run only Game6x4 benchmarks
7993
dotnet run -c Release -- --filter "*Game6x4*"
8094

81-
# Run only memory-intensive benchmarks
95+
# Run only binary vs string comparisons for black matches
96+
dotnet run -c Release -- --filter "*BlackMatches*"
97+
98+
# Run memory-intensive benchmarks
8299
dotnet run -c Release -- --filter "*Memory*"
83100
```
84101

85-
### Advanced Options
86-
87-
1. **Export results to different formats**:
102+
4. **Quick dry run for testing**:
88103
```bash
89-
# Export to CSV
90-
dotnet run -c Release -- --exporters csv
91-
92-
# Export to JSON
93-
dotnet run -c Release -- --exporters json
94-
95-
# Export to HTML
96-
dotnet run -c Release -- --exporters html
104+
dotnet run -c Release -- --filter "*Comparison*" -j Dry
97105
```
98106

99-
2. **Run specific benchmark methods**:
100-
```bash
101-
# Run only black matches benchmarks
102-
dotnet run -c Release -- --filter "*BlackMatches*"
103-
104-
# Run only initialization benchmarks
105-
dotnet run -c Release -- --filter "*Initialization*"
106-
```
107+
### Specific Comparison Examples
107108

108-
3. **Memory profiling**:
109-
```bash
110-
# Run with detailed memory analysis
111-
dotnet run -c Release -- --memory
112-
```
109+
```bash
110+
# Compare initialization performance
111+
dotnet run -c Release -- --filter "*Initialization*"
112+
113+
# Compare black matches filtering
114+
dotnet run -c Release -- --filter "*BlackMatches*"
115+
116+
# Compare memory usage
117+
dotnet run -c Release -- --filter "*Memory*"
118+
119+
# Compare peg selection operations
120+
dotnet run -c Release -- --filter "*PegSelection*"
121+
122+
# Compare white matches filtering
123+
dotnet run -c Release -- --filter "*WhiteMatches*"
124+
125+
# Compare no matches filtering
126+
dotnet run -c Release -- --filter "*NoMatches*"
127+
```
128+
129+
### Advanced Options
130+
131+
```bash
132+
# Generate detailed reports
133+
dotnet run -c Release -- --exporters html json
134+
135+
# Run with memory profiling
136+
dotnet run -c Release -- --memory
137+
138+
# Compare different .NET versions (if available)
139+
dotnet run -c Release -- --runtimes net8.0 net9.0
140+
141+
# Run only the benchmarks for one implementation type
142+
dotnet run -c Release -- --filter "*Binary*"
143+
dotnet run -c Release -- --filter "*String*"
144+
```
113145

114146
## Understanding the Results
115147

116-
### Key Metrics
148+
### Key Metrics to Watch
117149

118-
- **Mean**: Average execution time
119-
- **Error**: Half of the 99.9% confidence interval
120-
- **StdDev**: Standard deviation of measurements
121-
- **Median**: Middle value of all measurements
122-
- **Allocated**: Memory allocated during execution
123-
- **Gen 0/1/2**: Garbage collection counts
150+
1. **Mean Execution Time**: Average time per operation
151+
2. **Memory Allocation**: Bytes allocated during execution
152+
3. **Gen 0/1/2 Collections**: Garbage collection pressure
153+
4. **Ratio**: Relative performance between implementations
154+
5. **Rank**: Performance ranking within the benchmark group
124155

125-
### Typical Performance Expectations
156+
### Expected Performance Characteristics
126157

127-
| Operation | List Size | Expected Range |
128-
|-----------|-----------|----------------|
129-
| HandleBlackMatches | 1,000+ values | 10-100 μs |
130-
| HandleWhiteMatches | 1,000+ values | 50-500 μs |
131-
| HandleNoMatches | 1,000+ values | 5-50 μs |
132-
| SelectPeg | Single value | < 1 μs |
133-
| IntToColors | Single value | 1-5 μs |
134-
| InitializePossibleValues | N/A | 1-10 ms |
158+
#### Binary Implementation Advantages:
159+
- **Memory efficiency**: Compact integer representation
160+
- **Cache performance**: Better locality for large datasets
161+
- **Arithmetic operations**: Fast bit manipulation
162+
- **Less GC pressure**: Fewer object allocations
135163

136-
### Memory Usage Patterns
164+
#### String Implementation Advantages:
165+
- **API compatibility**: No conversion overhead with Games API
166+
- **Code readability**: Easier to understand and maintain
167+
- **Debugging**: More straightforward to inspect values
168+
- **Type safety**: Less bit manipulation complexity
137169

138-
- **Game6x4 initialization**: ~50-100 KB
139-
- **Game8x5 initialization**: ~200-500 KB
140-
- **Large list filtering**: Proportional to input size
141-
- **String conversions**: Additional overhead for color names
170+
### Sample Comparison Output
142171

143-
## Interpreting Results for Optimization
172+
```
173+
| Method | Mean | Error | StdDev | Ratio | Gen0 | Allocated |
174+
|------------------------------------------ |----------:|---------:|---------:|------:|-------:|----------:|
175+
| Binary_HandleBlackMatches_Game6x4_FullList | 15.23 ms | 0.25 ms | 0.22 ms | 1.00 | 125.0 | 2.1 MB |
176+
| String_HandleBlackMatches_Game6x4_FullList | 28.45 ms | 0.52 ms | 0.48 ms | 1.87 | 285.0 | 4.8 MB |
177+
```
144178

145-
### Performance Baselines
179+
This shows:
180+
- Binary implementation is ~1.87x faster (15.23 ms vs 28.45 ms mean)
181+
- String implementation allocates ~2.3x as much memory (4.8 MB vs 2.1 MB)
182+
- Both show low run-to-run variance (StdDev under 2% of the mean)
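As a check on these figures, the two ratios follow directly from the Mean and Allocated columns of the sample table. The values are the illustrative ones shown there, not measurements:

$$
\frac{28.45\ \text{ms}}{15.23\ \text{ms}} \approx 1.87,
\qquad
\frac{4.8\ \text{MB}}{2.1\ \text{MB}} \approx 2.29
$$

The first value matches the Ratio column, which BenchmarkDotNet reports relative to the benchmark marked as the baseline (the row with Ratio 1.00).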
146183

147-
Use these benchmarks to:
184+
### Performance Comparison Categories
148185

149-
1. **Establish baselines** before implementing algorithm changes
150-
2. **Compare alternative implementations** of the same functionality
151-
3. **Identify bottlenecks** in real game scenarios
152-
4. **Monitor regression** when making code changes
186+
The comparison benchmarks organize results by:
187+
- **Operation type** (BlackMatches, WhiteMatches, NoMatches, etc.)
188+
- **Game type** (Game6x4, Game8x5, Game5x5x4)
189+
- **Implementation** (Binary vs String)
190+
- **Data size** (FullList vs ReducedList)
153191

154-
### Common Optimization Targets
192+
## Interpreting Results for Optimization
155193

156-
Based on the benchmarks, focus optimization efforts on:
194+
### When to use Binary implementation:
195+
- Large datasets (1000+ combinations)
196+
- Memory-constrained environments
197+
- Performance-critical paths
198+
- Batch processing scenarios
199+
- High-frequency operations
157200

158-
1. **HandleWhiteMatches**: Often the most expensive operation
159-
2. **Large list operations**: When possibility space is still large
160-
3. **Memory allocations**: Frequent list creation and destruction
161-
4. **Game8x5 scenarios**: Larger search spaces require more processing
201+
### When to use String implementation:
202+
- API compatibility requirements
203+
- Development/debugging scenarios
204+
- Small to medium datasets
205+
- Code maintainability priorities
206+
- Direct integration with Games API
162207

163-
### Red Flags
208+
### Algorithm Performance Ranking (typical):
164209

165-
Watch for:
166-
- **Execution times > 1ms** for individual filtering operations
167-
- **Memory allocations > 1MB** for single operations
168-
- **High GC pressure** (frequent Gen 1/2 collections)
169-
- **Inconsistent timing** (high standard deviation)
210+
1. **SelectPeg** operations: Fastest (direct access)
211+
2. **HandleNoMatches**: Fast (simple filtering)
212+
3. **HandleBlackMatches**: Moderate (exact matching)
213+
4. **HandleWhiteMatches**: Slower (complex matching logic)
214+
5. **Initialization**: Slowest (generates all combinations)
170215

171216
## Benchmark Configuration
172217

173-
The benchmarks use BenchmarkDotNet's default configuration with:
218+
The benchmark classes are configured with these BenchmarkDotNet attributes:
174219
- **SimpleJob**: Reasonable number of iterations for accurate results
175220
- **MemoryDiagnoser**: Tracks memory allocations and GC behavior
176221
- **RankColumn**: Shows relative performance ranking
222+
- **GroupBenchmarksBy**: Organizes results by logical categories
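A minimal sketch of how these attributes might be combined on a comparison class is shown below. It mirrors the three attributes visible on `ComparisonBenchmarks` in this commit; the class name, method names, category name, and test data are hypothetical placeholders, and it assumes the BenchmarkDotNet package is referenced.

```csharp
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;

// Hypothetical class: names and data are placeholders, not code from this repository.
[MemoryDiagnoser]                                          // adds allocation and GC columns
[SimpleJob]                                                // a job with default settings
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory)]  // one result group per category
public class ExampleComparisonBenchmarks
{
    private int[] _binaryValues = Array.Empty<int>();
    private string[][] _stringValues = Array.Empty<string[]>();

    [GlobalSetup]
    public void Setup()
    {
        // Placeholder data: 1296 values, each also expressed as two digit strings.
        _binaryValues = Enumerable.Range(0, 1296).ToArray();
        _stringValues = _binaryValues
            .Select(v => new[] { (v % 6).ToString(), (v / 6 % 6).ToString() })
            .ToArray();
    }

    // Baseline = true makes this method the reference (Ratio 1.00) within its category.
    [BenchmarkCategory("FirstPegFilter"), Benchmark(Baseline = true)]
    public int Binary_FirstPegFilter() => _binaryValues.Count(v => v % 6 == 0);

    [BenchmarkCategory("FirstPegFilter"), Benchmark]
    public int String_FirstPegFilter() => _stringValues.Count(v => v[0] == "0");
}

public static class Program
{
    // BenchmarkSwitcher parses command-line options such as --filter.
    public static void Main(string[] args) =>
        BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
}
```

Grouping by category with one baseline per category keeps each binary/string pair in its own result group, so the Ratio column compares like with like.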
177223

178224
## Troubleshooting
179225

@@ -182,32 +228,72 @@ The benchmarks use BenchmarkDotNet's default configuration with:
182228
1. **"No benchmarks found"**: Ensure you're running in Release configuration
183229
2. **Inconsistent results**: Run on a dedicated machine without other heavy processes
184230
3. **Out of memory**: Reduce the size of test data if running on constrained environments
231+
4. **Long execution times**: Use `-j Dry` for quick validation runs
185232

186233
### Performance Tips
187234

188235
1. **Close unnecessary applications** before running benchmarks
189236
2. **Use Release configuration** for accurate performance measurements
190237
3. **Run multiple times** to ensure consistency
191238
4. **Consider thermal throttling** on laptops during long benchmark runs
239+
5. **Use filters** to focus on specific comparisons
240+
241+
## Key Features
242+
243+
### Self-Contained Design
244+
- No external package dependencies (except BenchmarkDotNet)
245+
- Local copies of both binary and string algorithms
246+
- Local GameType definitions
247+
- Comprehensive test data generators
248+
- Works without private Azure DevOps feeds
249+
250+
### Comprehensive Coverage
251+
- 60+ comparison benchmarks available
252+
- Tests different data sizes and scenarios
253+
- Covers all major algorithm operations
254+
- Includes initialization and memory stress tests
255+
256+
### Easy Comparison
257+
- Side-by-side binary vs string results
258+
- Clear performance ratios and rankings
259+
- Memory allocation analysis
260+
- Grouped by operation and game type
192261

193262
## Contributing
194263

195264
When adding new benchmarks:
196265

197-
1. Follow the existing naming conventions
198-
2. Use appropriate benchmark categories
199-
3. Include memory diagnostics for operations that allocate
200-
4. Add realistic test scenarios that represent actual usage
266+
1. Follow the existing naming convention: `{Implementation}_{Operation}_{GameType}_{Scenario}`
267+
2. Use appropriate benchmark categories for organization
268+
3. Include both binary and string variants for comparison
269+
4. Test with different data sizes (full, reduced, small lists)
201270
5. Document expected performance characteristics
271+
6. Consider both time and memory implications
202272

203-
## Example Output
273+
## Example Usage Scenarios
204274

275+
### Performance Analysis
276+
```bash
277+
# Quick performance comparison
278+
dotnet run -c Release -- --filter "*BlackMatches*Game6x4*" -j Dry
279+
280+
# Detailed memory analysis
281+
dotnet run -c Release -- --filter "*Memory*" --memory
282+
283+
# Full initialization comparison
284+
dotnet run -c Release -- --filter "*Initialization*"
205285
```
206-
| Method | Mean | Error | StdDev | Median | Allocated |
207-
|-------------------------------------- |----------:|---------:|---------:|----------:|----------:|
208-
| HandleBlackMatches_Game6x4_FullList | 45.23 μs | 0.891 μs | 1.024 μs | 45.12 μs | 1.95 KB |
209-
| HandleNoMatches_Game6x4_FullList | 12.67 μs | 0.234 μs | 0.219 μs | 12.71 μs | 1.23 KB |
210-
| InitializePossibleValues_Game6x4 | 3.45 ms | 0.068 ms | 0.064 ms | 3.43 ms | 52.3 KB |
286+
287+
### Algorithm Selection
288+
```bash
289+
# Test specific game type performance
290+
dotnet run -c Release -- --filter "*Game8x5*"
291+
292+
# Compare filtering operations
293+
dotnet run -c Release -- --filter "*Matches*"
294+
295+
# Analyze peg operations
296+
dotnet run -c Release -- --filter "*Peg*"
211297
```
212298

213-
This output shows that black match handling takes about 45 microseconds on average for a full Game6x4 list, while initializing the possible values takes about 3.5 milliseconds but only happens once per game.
299+
This comprehensive benchmark suite helps you make informed decisions about which algorithm implementation to use based on your specific performance requirements, memory constraints, and API compatibility needs.
