buildingwheels/ShallowGuessTrainer

# Shallow Guess Trainer

A neural network trainer for the ShallowGuess chess engine, built with the candle machine-learning framework. It features quantization-aware training with simulated int8 quantization.
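To make "simulated int8 quantization" concrete, here is a minimal, dependency-free Rust sketch of the fake-quantization step at the heart of quantization-aware training: weights stay in f32, but the forward pass snaps them to the int8 grid so training sees the quantization error. The symmetric per-tensor scale used here is an assumption for illustration, not necessarily the trainer's exact scheme.

```rust
/// Quantize to int8 and immediately dequantize back to f32 ("fake" quantization),
/// using a symmetric per-tensor scale that maps the largest |w| to 127.
fn fake_quantize(w: &[f32]) -> Vec<f32> {
    let max_abs = w.iter().fold(0.0_f32, |m, &x| m.max(x.abs()));
    if max_abs == 0.0 {
        return w.to_vec();
    }
    let scale = max_abs / 127.0;
    w.iter()
        .map(|&x| {
            let q = (x / scale).round().clamp(-127.0, 127.0) as i8;
            q as f32 * scale
        })
        .collect()
}

fn main() {
    let w = [0.5_f32, -1.0, 0.013, 0.0];
    let fq = fake_quantize(&w);
    // The extreme value round-trips (up to float error); every other value
    // snaps to within half a grid step of itself.
    assert!((fq[1] + 1.0).abs() < 1e-6);
    assert!((fq[0] - 0.5).abs() < 1.0 / 127.0);
    println!("{fq:?}");
}
```

Because the rounding happens inside the forward pass, gradients (via a straight-through estimator in frameworks like candle) push the f32 weights toward values that survive int8 rounding well.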

## Prerequisites

- **Rust (stable)** - install from [rustup.rs](https://rustup.rs)
- **CUDA Toolkit** (optional) - for GPU acceleration

## Training

### Basic Usage

```shell
./target/release/train <hidden_layer_size> <data_dir> <validation_file> <model_export_path> <max_epochs> <sample_size> <total_steps>
```

### Arguments

| Argument | Description |
| --- | --- |
| `hidden_layer_size` | Number of neurons in the hidden layer (e.g., 512) |
| `data_dir` | Directory containing training data files |
| `validation_file` | Path to the validation data file |
| `model_export_path` | Directory in which to save trained models |
| `max_epochs` | Maximum number of training epochs |
| `sample_size` | Number of files to sample per epoch |
| `total_steps` | Total number of training steps, used by the learning-rate scheduler |

### Options

| Option | Default | Description |
| --- | --- | --- |
| `--batch-size <N>` | `32768` | Batch size for training |
| `--learning-rate <LR>` | `0.01` | Initial learning rate |
| `--warmup-steps <N>` | `16` | Warmup steps for the learning-rate scheduler |
| `--existing-model <PATH>` | (none) | Path to an existing model to continue training from |
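As an illustration of what `--warmup-steps` and `total_steps` control, here is a hypothetical schedule in Rust: linear warmup from zero to the base learning rate, then cosine decay to zero over the remaining steps. The trainer's actual scheduler may differ; this only shows how the two knobs interact.

```rust
/// Learning rate at `step`: linear warmup for `warmup_steps`, then cosine
/// decay toward zero by `total_steps`. (Illustrative, not the trainer's
/// guaranteed schedule.)
fn learning_rate(step: u32, base_lr: f64, warmup_steps: u32, total_steps: u32) -> f64 {
    if step < warmup_steps {
        // Ramp linearly so early, noisy gradients don't blow up training.
        base_lr * (step + 1) as f64 / warmup_steps as f64
    } else {
        // Cosine anneal from base_lr down to 0 over the remaining steps.
        let progress = (step - warmup_steps) as f64 / (total_steps - warmup_steps) as f64;
        base_lr * 0.5 * (1.0 + (std::f64::consts::PI * progress).cos())
    }
}

fn main() {
    let (base_lr, warmup, total) = (0.01, 16, 1000);
    assert!(learning_rate(0, base_lr, warmup, total) < base_lr); // still warming up
    assert!((learning_rate(15, base_lr, warmup, total) - base_lr).abs() < 1e-12); // warmup complete
    assert!(learning_rate(999, base_lr, warmup, total) < 1e-6); // decayed near zero
}
```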

## Exporting Weights

Export trained weights to CSV format for embedding into the ShallowGuess engine. The fc1 (input-to-hidden) layer weights are exported in int8 quantized form.

```shell
cargo run --release --bin export <hidden_layer_size> <model_file> <export_file>
```
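For intuition, here is a minimal Rust sketch of int8 quantization plus CSV serialization for a weight matrix. The symmetric per-tensor scale and the one-row-per-output-neuron layout are assumptions for illustration; the engine's actual CSV format may differ.

```rust
/// Quantize one row of f32 weights to int8 with a shared scale.
fn quantize_row(row: &[f32], scale: f32) -> Vec<i8> {
    row.iter()
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Serialize a weight matrix as int8 CSV, one row per output neuron.
/// A single per-tensor scale lets the engine dequantize with one constant.
fn to_csv(weights: &[Vec<f32>]) -> String {
    let max_abs = weights
        .iter()
        .flatten()
        .fold(0.0_f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    weights
        .iter()
        .map(|row| {
            quantize_row(row, scale)
                .iter()
                .map(|q| q.to_string())
                .collect::<Vec<_>>()
                .join(",")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let w = vec![vec![0.5_f32, -1.0], vec![0.25, 0.0]];
    let csv = to_csv(&w);
    assert_eq!(csv, "64,-127\n32,0");
    println!("{csv}");
}
```

Exporting fc1 in int8 (rather than f32) is what lets the engine run the input-to-hidden layer with fast integer arithmetic at search time.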
