Commit 2de7e67

Merge pull request #247 from yanyinglin/master
update genai dataset
2 parents cbbcade + 95d4fc1 commit 2de7e67

2 files changed

Lines changed: 14 additions & 12 deletions

cluster-trace-v2026-GenAI/README.md

Lines changed: 14 additions & 12 deletions
@@ -159,21 +159,23 @@ All data files can be correlated through the `container_ip` field, which is an M
 | `value` | float | End-to-end latency (milliseconds) |
 | `container_ip` | string | MD5 hash of container IP |

-### 6. Request-Level Performance Data
+# ### 6. Request-Level Performance Data
 - **File**: `lora_request_trace.csv`
-- **Description**: Detailed inference request performance characteristics including execution time, prompt complexity, etc.
+- **Description**: Detailed inference request performance characteristics including execution time, prompt complexity, and model configuration.

 | Field Name | Type | Description | Example |
 |------------|------|-------------|---------|
-| `predict_type` | string | MD5 hash identifier of request type | `e867a9754d73155e90d62f88dbedfc62` |
-| `exec_time_seconds` | float | Inference execution time (seconds) | `22.0` |
-| `style_type` | string | MD5 hash identifier of style type, can be empty | `88585aaa7d5402c0928a0c8639c83bab` |
-| `prompt_length` | float | Positive prompt length (character count) | `54.0` |
-| `negative_prompt_length` | float | Negative prompt length (character count), can be empty | `26.0` |
-| `num_images_per_prompt` | float | Number of images generated per prompt | `1.0` |
-| `num_inference_steps` | float | Number of inference steps | `30.0` |
-| `checkpoint_model_version_id` | string | MD5 hash identifier of base model version | `0a80ffe64c68aa574e80e6c9b9f5308a` |
-| `lora_args` | string | LoRA adapter parameters list (JSON format) | `[{'modelVersionId': '799974951f19a0c730acda2389cc852b', 'scale': 1.0}]` |
+| `gmt_create` | datetime | Request creation timestamp (anonymized with time offset) | `2024-11-15 16:57:50` |
+| `predict_type` | string | Type of generation task: `TXT_2_IMG`, `IMG_2_IMG`, or `INPAINTING` | `TXT_2_IMG` |
+| `predict_status` | string | Request completion status: `SUCCEED`, `PROCESSING`, etc. | `SUCCEED` |
+| `exec_time_seconds` | float | Total execution time in seconds for the request | `32.0` |
+| `groupId` | string | Anonymized user/group identifier | `G0000` |
+| `prompt_length` | float | Length of the input text prompt (character count) | `63.0` |
+| `negative_prompt_length` | float | Length of the negative prompt (character count), can be empty | `26.0` |
+| `num_images_per_prompt` | float | Number of images generated per request | `1.0` |
+| `num_inference_steps` | float | Number of diffusion inference steps | `30.0` |
+| `checkpoint_model_version_id` | string | Anonymized base model identifier | `M0000` |
+| `num_lora` | int | Number of LoRA adapters used in the request (0 = base model only) | `0` |

 ## Anonymization Description

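The updated field table for `lora_request_trace.csv` can be exercised with a small reader. The following is a minimal sketch using only the Python standard library. It assumes the file has a header row whose column names match the table and that empty cells are stored as empty strings; neither is confirmed by the diff. The sample rows are hypothetical: the first reuses the example values from the table, the second is invented for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample. Row 1 reuses the example values from the field table;
# row 2 is invented to show an empty negative_prompt_length and num_lora > 0.
SAMPLE = """gmt_create,predict_type,predict_status,exec_time_seconds,groupId,prompt_length,negative_prompt_length,num_images_per_prompt,num_inference_steps,checkpoint_model_version_id,num_lora
2024-11-15 16:57:50,TXT_2_IMG,SUCCEED,32.0,G0000,63.0,26.0,1.0,30.0,M0000,0
2024-11-15 16:58:10,IMG_2_IMG,SUCCEED,12.5,G0001,40.0,,1.0,20.0,M0001,2
"""

# Columns the table documents as float.
FLOAT_FIELDS = (
    "exec_time_seconds",
    "prompt_length",
    "negative_prompt_length",
    "num_images_per_prompt",
    "num_inference_steps",
)


def parse_row(row):
    """Convert one CSV row (dict of strings) to typed values."""
    parsed = dict(row)
    # Timestamp format is assumed from the table's example value.
    parsed["gmt_create"] = datetime.strptime(row["gmt_create"], "%Y-%m-%d %H:%M:%S")
    for name in FLOAT_FIELDS:
        # Empty cells (e.g. a missing negative prompt) become None.
        parsed[name] = float(row[name]) if row[name] else None
    # Go through float() first so a value written as "2.0" is also accepted.
    parsed["num_lora"] = int(float(row["num_lora"]))
    return parsed


def load_requests(text):
    """Parse CSV text into a list of typed request records."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


requests = load_requests(SAMPLE)
base_model_only = [r for r in requests if r["num_lora"] == 0]
```

To read the real trace, the same `parse_row` could be applied to `csv.DictReader(open("lora_request_trace.csv", newline=""))`, provided the header assumption holds.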
@@ -183,6 +185,7 @@ All data has undergone strict anonymization processing:
 - **Method**: MD5 hash algorithm
 - **Applied Fields**: Container IP, model version ID, request type, etc.
 - **Feature**: Same original value always maps to same hash value, ensuring correlation
+- **Time Offset**: All timestamps are anonymized with a time offset to preserve relative order but hide absolute values

 ## Usage Recommendations

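The two anonymization properties named in this hunk (a deterministic MD5 mapping and a time offset that preserves relative order) can be illustrated with a short sketch. This is not the dataset's actual pipeline: the input encoding, any salting, and the offset value are not documented in the diff, so `md5_anonymize`, `shift_timestamp`, and `HYPOTHETICAL_OFFSET` are illustrative names and values only.

```python
import hashlib
from datetime import datetime, timedelta


def md5_anonymize(value: str) -> str:
    """Map a string to its MD5 hex digest (UTF-8 encoding is an assumption)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


# Hypothetical constant offset; the real offset is not published.
HYPOTHETICAL_OFFSET = timedelta(days=-37, hours=5)


def shift_timestamp(ts: datetime) -> datetime:
    """Apply the same offset to every timestamp."""
    return ts + HYPOTHETICAL_OFFSET


# The same input always yields the same digest, so records can still be joined.
hash_a = md5_anonymize("10.0.0.1")
hash_b = md5_anonymize("10.0.0.1")
hash_c = md5_anonymize("10.0.0.2")

# A constant offset hides absolute time but keeps ordering and intervals.
t1 = datetime(2024, 1, 1, 12, 0, 0)
t2 = datetime(2024, 1, 1, 12, 0, 30)
interval_before = t2 - t1
interval_after = shift_timestamp(t2) - shift_timestamp(t1)
```

Because the offset is constant, inter-arrival times computed from `gmt_create` remain meaningful even though the absolute dates are not.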
@@ -225,7 +228,6 @@ All data has undergone strict anonymization processing:

 ## Citation

-
 If you use this dataset for analyzing request characteristics of diffusion models, GPU utilization patterns, or queue behavior, please cite our SoCC paper:

 ```bibtex
173 Bytes
Binary file not shown.
