Commit 95928c1

Merge pull request #323 from nSircombe/github/april_updates

Update examples and CHANGELOG for r25.04

2 parents e9a7cfe + d0ff4c1

5 files changed: 29 additions & 18 deletions
ML-Frameworks/pytorch-aarch64/CHANGELOG.md

13 additions & 0 deletions

```diff
@@ -7,6 +7,17 @@ where `YY` is the year, and `MM` the month of the increment.
 
 ## [unreleased]
 
+### Added
+
+### Changed
+
+### Removed
+
+### Fixed
+
+## [r25.04] 2025-04-16
+https://github.com/ARM-software/Tool-Solutions/tree/r25.04
+
 ### Added
 - Work in progress oneDNN patch, [Enable jit conv for 128](https://github.com/uxlfoundation/oneDNN/pull/3022) with ~30% speed up for backward convolutions
 - Add `--wheel-only` flag for only building the torch wheel
@@ -25,6 +36,8 @@ where `YY` is the year, and `MM` the month of the increment.
 ### Removed
 - Removes WIP patches which have now landed in the upstream nightly PyTorch builds.
 - Removes `--tags --force` from git clone command, and adds `--depth=1` to speedup the checkout.
+- Temporarily removes `--compile` option from some examples due to an issue with
+  https://github.com/pytorch/pytorch/pull/147151; the compile path does not work as expected in these cases.
 
 ### Fixed
```
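The shallow-checkout change noted in the Removed section above can be sketched as follows. This is a minimal illustration, not code from the repository; `clone_args` is a hypothetical helper name:

```python
def clone_args(repo_url, dest, shallow=True):
    """Build a git clone command line.

    With shallow=True only the latest commit is fetched (--depth=1),
    which is what speeds up the checkout; the previous --tags --force
    arguments are no longer passed.
    """
    args = ["git", "clone"]
    if shallow:
        args.append("--depth=1")
    args += [repo_url, dest]
    return args

# clone_args("https://github.com/pytorch/pytorch.git", "pytorch")
# -> ['git', 'clone', '--depth=1', 'https://github.com/pytorch/pytorch.git', 'pytorch']
```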

ML-Frameworks/pytorch-aarch64/examples/README.md

2 additions & 8 deletions

````diff
@@ -201,7 +201,7 @@ The script [torchchat_llm_text_gen.py](torchchat_llm_text_gen.py) demonstrates h
 To run inference using torchchat call:
 
 ```
-LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPER=1 TORCHINDUCTOR_FREEZING=1 OMP_NUM_THREADS=16 python torchchat_llm_text_gen.py --compile
+LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPER=1 TORCHINDUCTOR_FREEZING=1 OMP_NUM_THREADS=16 python torchchat_llm_text_gen.py
 ```
 
 #### Command-Line Options
@@ -212,9 +212,6 @@ LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPE
 `--max-new-tokens`
 Description: Max new tokens to generate.
 
-`--compile`
-Description: Whether to compile the model (default: `False`).
-
 `--model`
 Description: Model alias. (Default: `"llama2"` )
 
@@ -227,7 +224,7 @@ The script [transformers_llm_text_gen.py](transformers_llm_text_gen.py) demonstr
 To run inference using transformers call:
 
 ```
-LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPER=1 TORCHINDUCTOR_FREEZING=1 OMP_NUM_THREADS=16 python transformers_llm_text_gen.py --compile
+LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPER=1 TORCHINDUCTOR_FREEZING=1 OMP_NUM_THREADS=16 python transformers_llm_text_gen.py
 ```
 
 #### Command-Line Options
@@ -238,9 +235,6 @@ LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libtcmalloc.so.4 TORCHINDUCTOR_CPP_WRAPPE
 `--max-new-tokens`
 Description: Max new tokens to generate.
 
-`--compile`
-Description: Whether to compile the model (default: `False`).
-
 `--model`
 Description: Local Path to model repo or huggingface model id. (Default: `"meta-llama/Llama-2-7b-hf"` )
 
````
ML-Frameworks/pytorch-aarch64/examples/torchchat_llm_text_gen.py

0 additions & 4 deletions

```diff
@@ -31,8 +31,6 @@ def main(args):
         "python3", torchchat_path, "generate", args.model,
         "--quantize", str(args.quant_config),
         "--prompt", prompt,
-        "--compile" if args.compile else "",
-        "--compile-prefill" if args.compile else "",
         "--max-autotune", "--max-new-tokens", str(args.max_new_tokens)
     ]
     command = [arg for arg in command if arg]
@@ -47,8 +45,6 @@ def main(args):
                         help='Path to json file for quantization config')
     parser.add_argument('--max-new-tokens', type=int,
                         default=64, help='New tokens to generate at decode.')
-    parser.add_argument('--compile', action='store_true',
-                        help='Whether to compile the model.')
     parser.add_argument('--model', type=str, default="llama2",
                         help='Torchchat supported model alias')
     parser.add_argument('--prompt', type=str, default="In a distant world where magic and technology coexist, "
```
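The script above builds its subprocess command by inserting empty strings for disabled flags and filtering them out afterwards (the `command = [arg for arg in command if arg]` line kept by this diff). A stand-alone sketch of that pattern, using hypothetical script and flag names rather than the real torchchat invocation:

```python
def build_command(model, prompt, max_new_tokens, compile_model=False):
    # Optional flags become "" when disabled; the final list comprehension
    # drops every falsy entry, mirroring the example script's approach.
    command = [
        "python3", "generate.py", model,
        "--prompt", prompt,
        "--compile" if compile_model else "",
        "--max-new-tokens", str(max_new_tokens),
    ]
    return [arg for arg in command if arg]

# build_command("llama2", "hi", 64)
# -> ['python3', 'generate.py', 'llama2', '--prompt', 'hi', '--max-new-tokens', '64']
```

With `compile_model=False` (the default after this commit), no empty string reaches the subprocess argument list.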

ML-Frameworks/pytorch-aarch64/examples/transformers_llm_text_gen.py

0 additions & 6 deletions

```diff
@@ -130,10 +130,6 @@ def get_quantized_model(args):
         print("Quantizing model to 4 bit ..")
         quantize_model(model, "cpu", args.quant_config)
     model = model.eval()
-    if args.compile:
-        model.generation_config.cache_implementation = "static"
-        model.forward = torch.compile(
-            model.forward, backend='inductor', dynamic=True, fullgraph=True)
     return model, tokenizer, config
 
 
@@ -197,8 +193,6 @@ def main(args):
                         "gen_ai_utils/quant_configs/aarch64_cpu_channelwise.json", help='Path to json file for quantization config')
     parser.add_argument('--max-new-tokens', type=int,
                         default=64, help='New tokens to generate at decode.')
-    parser.add_argument('--compile', action='store_true',
-                        help='Whether to compile the model.')
     parser.add_argument('--model', type=Path, default=Path("meta-llama/Llama-2-7b-hf"),
                         help='Hugging Face model ID or Cloned model repository with model files')
     parser.add_argument('--prompt', type=str, default="In a distant world where magic and technology coexist, "
```
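Dropping `parser.add_argument('--compile', ...)` means the parsed namespace no longer carries a `compile` attribute, which is why the `if args.compile:` block in `get_quantized_model` must go in the same commit: any leftover reference to `args.compile` would raise `AttributeError`. A minimal sketch of that behaviour, assuming a stripped-down parser rather than the full example script:

```python
import argparse

# Parser as it looks after this commit: the --compile flag is not registered.
parser = argparse.ArgumentParser()
parser.add_argument('--max-new-tokens', type=int, default=64)
args = parser.parse_args([])

# The namespace has no 'compile' attribute, so code guarded by
# `if args.compile:` cannot be left behind when the flag is removed.
assert not hasattr(args, 'compile')
assert args.max_new_tokens == 64
```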

ML-Frameworks/tensorflow-aarch64/CHANGELOG.md

14 additions & 0 deletions

```diff
@@ -15,6 +15,20 @@ where `YY` is the year, and `MM` the month of the increment.
 
 ### Fixed
 
+## [r25.04] 2025-04-16
+https://github.com/ARM-software/Tool-Solutions/tree/r25.04
+
+### Added
+- Enables patching of build outside of Bazel build.
+- default num_threads to max for acl_threadpool, see www.github.com/tensorflow/uxlfoundation/oneDNN/2958
+
+### Changed
+- Updates TensorFlow build to use oneDNN 3.7 + ACL 24.12, see www.github.com/tensorflow/tensorflow/pull/84975
+
+### Removed
+
+### Fixed
+
 ## [r25.03.1] 2025-03-26
 https://github.com/ARM-software/Tool-Solutions/tree/r25.03.1
 
```