A neural network trainer (using candle) for the ShallowGuess chess engine, featuring quantization-aware training with simulated int8 quantization.
- Rust (stable): install via rustup.rs
- CUDA Toolkit (optional): for GPU acceleration
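The "simulated int8 quantization" mentioned above can be illustrated with a minimal sketch in plain Rust (not the candle API): the forward pass sees weights that have been rounded to the int8 grid and dequantized back, while the stored parameters remain f32. The function name and scale value below are hypothetical, not taken from this crate's source.

```rust
// Illustrative sketch of simulated ("fake") int8 quantization, the core
// idea behind quantization-aware training. `fake_quantize_int8` is a
// hypothetical helper, not part of this repository.

/// Quantize a weight to the int8 grid and immediately dequantize it, so the
/// forward pass sees the rounding error while the stored weight stays f32.
fn fake_quantize_int8(w: f32, scale: f32) -> f32 {
    // Map to the int8 range, round, and clamp to [-127, 127].
    let q = (w / scale).round().clamp(-127.0, 127.0);
    // Dequantize back to f32.
    q * scale
}

fn main() {
    let scale = 0.05_f32; // example scale, chosen arbitrarily
    let weights = [0.123_f32, -0.987, 3.0, -8.0];
    let fq: Vec<f32> = weights.iter().map(|&w| fake_quantize_int8(w, scale)).collect();
    // -8.0 saturates at -127 * scale; the others snap to the nearest grid point.
    println!("{:?}", fq);
}
```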
```
./target/release/train <hidden_layer_size> <data_dir> <validation_file> <model_export_path> <max_epochs> <sample_size> <total_steps>
```

| Argument | Description |
|---|---|
| `hidden_layer_size` | Number of neurons in the hidden layer (e.g., 512) |
| `data_dir` | Directory containing training data files |
| `validation_file` | Path to the validation data file |
| `model_export_path` | Directory to save trained models |
| `max_epochs` | Maximum number of training epochs |
| `sample_size` | Number of files to sample per epoch |
| `total_steps` | Total training steps for the scheduler |
| Option | Default | Description |
|---|---|---|
| `--batch-size <N>` | 32768 | Batch size for training |
| `--learning-rate <LR>` | 0.01 | Initial learning rate |
| `--warmup-steps <N>` | 16 | Warmup steps for the scheduler |
| `--existing-model <PATH>` | - | Path to an existing model to continue training from |
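As a rough sketch of how `--learning-rate`, `--warmup-steps`, and `<total_steps>` could fit together: the exact schedule used by this trainer is not documented here, so the example below assumes a common shape, linear warmup to the initial learning rate followed by cosine decay to zero. The function `lr_at_step` and the schedule itself are illustrative assumptions.

```rust
// Hypothetical learning-rate schedule: linear warmup, then cosine decay.
// This is an assumed shape, not necessarily the one this trainer implements.

fn lr_at_step(step: usize, base_lr: f64, warmup_steps: usize, total_steps: usize) -> f64 {
    if step < warmup_steps {
        // Linear warmup from near 0 up to the initial learning rate.
        base_lr * (step + 1) as f64 / warmup_steps as f64
    } else {
        // Cosine decay from base_lr down to 0 over the remaining steps.
        let progress = (step - warmup_steps) as f64 / (total_steps - warmup_steps) as f64;
        base_lr * 0.5 * (1.0 + (std::f64::consts::PI * progress).cos())
    }
}

fn main() {
    // Defaults from the table above: lr = 0.01, 16 warmup steps;
    // 10_000 total steps is an arbitrary example value.
    for step in [0, 15, 16, 5_000, 9_999] {
        println!("step {:>5}: lr = {:.6}", step, lr_at_step(step, 0.01, 16, 10_000));
    }
}
```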
Export trained weights to CSV format for embedding into the ShallowGuess engine. The fc1 (input-to-hidden layer) weights are exported in int8 quantized form.

```
cargo run --release --bin export <hidden_layer_size> <model_file> <export_file>
```
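The int8 export step can be sketched as follows. The scale computation (per-tensor, largest absolute weight maps to 127) and the CSV layout (one row per hidden neuron) are assumptions for illustration, not this repository's actual format.

```rust
// Illustrative sketch of exporting fc1 weights as int8 CSV. Both the scale
// choice and the CSV layout are assumed, not taken from this repo.

/// Quantize one row of f32 weights to int8 with a shared scale.
fn quantize_row_int8(row: &[f32], scale: f32) -> Vec<i8> {
    row.iter()
        .map(|&w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Join quantized rows into CSV text: one line per row, comma-separated.
fn to_csv(rows: &[Vec<i8>]) -> String {
    rows.iter()
        .map(|r| r.iter().map(|v| v.to_string()).collect::<Vec<_>>().join(","))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    // Toy 2x3 weight matrix standing in for the fc1 weights.
    let weights = vec![vec![0.5_f32, -1.0, 0.25], vec![1.27, 0.0, -0.6]];
    // Per-tensor scale so the largest |weight| maps to 127.
    let max_abs = weights.iter().flatten().fold(0.0_f32, |m, &w| m.max(w.abs()));
    let scale = max_abs / 127.0;
    let rows: Vec<Vec<i8>> = weights.iter().map(|r| quantize_row_int8(r, scale)).collect();
    println!("{}", to_csv(&rows));
}
```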