 ██████╗ ██████╗ ███╗   ██╗███╗   ██╗███████╗ ██████╗████████╗    ██╗  ██╗     █████╗ ██╗
██╔════╝██╔═══██╗████╗  ██║████╗  ██║██╔════╝██╔════╝╚══██╔══╝    ██║  ██║    ██╔══██╗██║
██║     ██║   ██║██╔██╗ ██║██╔██╗ ██║█████╗  ██║        ██║       ███████║    ███████║██║
██║     ██║   ██║██║╚██╗██║██║╚██╗██║██╔══╝  ██║        ██║       ╚════██║    ██╔══██║██║
╚██████╗╚██████╔╝██║ ╚████║██║ ╚████║███████╗╚██████╗   ██║            ██║    ██║  ██║██║
 ╚═════╝ ╚═════╝ ╚═╝  ╚═══╝╚═╝  ╚═══╝╚══════╝ ╚═════╝   ╚═╝            ╚═╝    ╚═╝  ╚═╝╚═╝
Two neural networks trained on 400,000+ board positions learn to think like grandmasters — one scans local patterns, one reads the whole board. Both deployed live on AWS, serving real-time predictions in under 50ms.
🎮 Play the Bot · 📓 Training Notebook · 🔧 Backend · 📊 Results
This project implements an MCTS self-play → supervised distillation → neural network inference pipeline, inspired by the paradigm behind DeepMind's AlphaZero, applied to Connect 4 end-to-end.
Instead of searching the game tree at inference time (slow), we train a neural network to instantly replicate the decisions of a strong Monte Carlo Tree Search player. The result: a deep learning agent that plays a strong game with a single forward pass (< 50ms).
Two architectures are built, trained, and rigorously compared:
| | CNN | Vision Transformer |
|---|---|---|
| Approach | Scans local 3×3 regions for patterns | Reads all 42 cells simultaneously via attention |
| Analogy | Detects threats the way a player spots lines | Sees the board holistically like a strategist |
| Strength | Fast convergence, precise local tactics | Superior multi-step planning against tactical play |
Both are live — pick your opponent at attentive-klutzy-jacket.anvil.app.
 col:    0   1   2   3   4   5   6
       ┌───┬───┬───┬───┬───┬───┬───┐
row 0  │   │   │   │   │   │   │   │
       ├───┼───┼───┼───┼───┼───┼───┤
row 1  │   │   │   │   │   │   │   │
       ├───┼───┼───┼───┼───┼───┼───┤
row 2  │   │   │   │ 🔴│   │   │   │  ← Transformer: "I see a diagonal threat
       ├───┼───┼───┼───┼───┼───┼───┤     building via global attention"
row 3  │   │   │ 🔴│ 🔴│   │   │   │
       ├───┼───┼───┼───┼───┼───┼───┤
row 4  │   │ 🔴│   │ 🔴│   │   │   │  ← CNN: "I see three 3×3 threat patterns
       ├───┼───┼───┼───┼───┼───┼───┤     and recommend blocking col 1"
row 5  │   │ 🔴│ 🔴│ 🔴│   │   │   │
       └───┴───┴───┴───┴───┴───┴───┘
Encoded as float32 tensor of shape (6, 7, 2):
Channel 0 → player (+1) positions · Channel 1 → opponent (-1) positions
| Metric | 🔵 CNN | 🟣 Transformer |
|---|---|---|
| Validation Accuracy | 63.0% | 60.3% |
| Parameters | 553,353 | 553,479 |
| Win Rate vs Random Bot | 97.0% | 97.8% |
| Win Rate vs Tactical Bot | 53.2% | 62.0% |
| Training Epochs | 13 (early stopped) | 60 |
| Model Size | 4.4 MB | 824 KB |
All win rates from 500-game evaluations per opponent type, alternating starting player.
Win Rate vs Tactical Bot:

          0%        25%       50%       75%       100%
          ├─────────┼─────────┼─────────┼─────────┤
🔵 CNN    53.2%  ██████████████████████░░░░░░░░░░░░░░░░░░░
🟣 Transf 62.0%  █████████████████████████░░░░░░░░░░░░░░░░
                                          ↑
                               Transformer wins here
                          (+8.8pp better at strategic play)

Win Rate vs Random Bot:

          0%        25%       50%       75%       100%
          ├─────────┼─────────┼─────────┼─────────┤
🔵 CNN    97.0%  ████████████████████████████████████████░
🟣 Transf 97.8%  ████████████████████████████████████████░
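
The harness behind these numbers is straightforward; below is a minimal self-contained sketch of an N-game evaluation with alternating starting player. The helper names (legal_moves, drop, wins) are illustrative, not the notebook's actual code, and a model-backed agent would replace random_move by encoding the board, running a forward pass, and taking the argmax over legal columns.

import numpy as np

ROWS, COLS = 6, 7

def legal_moves(board):
    # a column is playable while its top cell is still empty
    return [c for c in range(COLS) if board[0, c] == 0]

def drop(board, col, player):
    # pieces fall to the lowest empty row of the chosen column
    for r in range(ROWS - 1, -1, -1):
        if board[r, col] == 0:
            board[r, col] = player
            return

def wins(board, player):
    # scan every cell in 4 directions for a four-in-a-row window
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (-1, 1)):
                if (0 <= r + 3 * dr < ROWS and 0 <= c + 3 * dc < COLS
                        and all(board[r + i * dr, c + i * dc] == player
                                for i in range(4))):
                    return True
    return False

def random_move(board, player):
    return int(np.random.choice(legal_moves(board)))

def evaluate(agent, opponent, games=500):
    # agent always plays +1; who moves first alternates every game
    agent_wins = 0
    for g in range(games):
        board = np.zeros((ROWS, COLS), dtype=np.int8)
        turn = 1 if g % 2 == 0 else -1
        while legal_moves(board):
            mover = agent if turn == 1 else opponent
            drop(board, mover(board, turn), turn)
            if wins(board, turn):
                agent_wins += (turn == 1)
                break
            turn = -turn
        # draws fall through the loop and count against the agent
    return agent_wins / games

# sanity check: random vs random should land near 50%
print(f"win rate: {evaluate(random_move, random_move, games=100):.1%}")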
Each model has pros and cons. The CNN came out ahead on supervised accuracy, largely because the board is small: our dataset was compact compared to what a Transformer typically needs, and the CNN's local filters had an easier time picking out the small regional patterns that decide most Connect 4 positions. The Transformer has to treat the board as a whole, which occasionally leads to sub-optimal moves in purely local situations. In short, CNNs naturally capture local adjacency, pattern continuity, and geometric structure thanks to their small-window receptive fields.
That said, the CNN is not without flaws, and the Transformer has real advantages. Because of its locality, the CNN struggles more with complex positions, fork detection, and multi-step tactical reasoning. The Transformer, seeing the whole board at once, captures these strategic patterns more naturally; it does not always win, but it sees the bigger picture.
This plays out clearly in the gameplay numbers: the CNN has higher validation accuracy (63% vs 60.3%), yet the Transformer wins more against the tactical opponent (62% vs 53.2%).
Despite lower supervised accuracy (60.3% vs 63%), the Transformer beats the CNN against tactical play by +8.8 percentage points (62% vs 53.2%).
This reveals a fundamental limitation of validation accuracy as a proxy for gameplay strength. The CNN's inductive spatial bias helps it converge faster and score higher on the test set — but the Transformer's global self-attention learns to see multi-step threats and fork patterns that local 3×3 convolutions structurally cannot model.
┌──────────────────────────────────────────────────────────────────────────┐
│ FULL SYSTEM │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 🌐 Anvil Web App (Python full-stack, browser-based) │ │
│ │ │ │
│ │ ┌────────────┐ ┌──────────────────┐ ┌─────────────────────┐ │ │
│ │ │ 🔐 Login │ │ 🎮 Game Board │ │ ⚙️ Settings │ │ │
│ │ │ Auth Gate │ │ 6×7 Grid UI │ │ CNN / Transformer │ │ │
│ │ │ │ │ 🔴 🟡 pieces │ │ Easy/Medium/Hard │ │ │
│ │ └────────────┘ └──────────────────┘ └─────────────────────┘ │ │
│ │ │ │
│ │ User clicks column → board encoded as (6,7,2) float32 tensor │ │
│ │ anvil.server.call('get_move', board_tensor, model_key) ──────────┼─┼──┐
│ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ Encrypted
│ Anvil Uplink│ │ Tunnel
│ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ ☁️ AWS Lightsail VM │ │ │
│ │ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │
│ │ │ 🐳 Docker Container │ │ │ │
│ │ │ │◄─┼─┼──┘
│ │ │ backend.py │ │ │
│ │ │ ├── anvil.server.connect(uplink_key) │ │ │
│ │ │ ├── Load CNN SavedModel ──► cnn_infer() │ │ │
│ │ │ ├── Load Transformer SavedModel ► tr_infer() │ │ │
│ │ │ ├── _ensure_1_6_7_2(board) ← shape normalization │ │ │
│ │ │ ├── forward pass → argmax(probs[0]) │ │ │
│ │ │ └── anvil.server.wait_forever() │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────────┐ ┌─────────────────────────────┐ │ │ │
│ │ │ │ cnn_savedmodel │ │ transformer_savedmodel │ │ │ │
│ │ │ │ 4.4 MB │ │ 824 KB │ │ │ │
│ │ │ │ serving_default│ │ serving_default sig. │ │ │ │
│ │ │ └─────────────────┘ └─────────────────────────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
│
Returns integer move (0–6) to UI
We built our training data by having MCTS play against itself over thousands of games, saving each board position along with the move MCTS recommended. To keep the dataset from getting repetitive, we randomized the first few opening moves, occasionally injected a random move mid-game, and varied how hard MCTS was "thinking" between 800 and 1500 iterations per move. We ran the whole thing across 21 CPU cores in parallel to speed things up, with checkpoints saving progress along the way in case anything went wrong. Since the neural network only needs to learn from one player's perspective, we flipped the board whenever it was the other player's turn, so every position looks the same to the model. When the same board showed up more than once with different move recommendations, we kept whichever move came up most often. Finally, we mirrored every board left-to-right to nearly double our data for free. All of that gave us around 400,000 unique positions to train on.
┌─────────────────────────────────────────────────────────────────────┐
│ DATA GENERATION PIPELINE │
│ │
│ 21 CPU cores running in parallel │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ... ┌──────┐ │
│ │MCTS │ │MCTS │ │MCTS │ │MCTS │ │MCTS │ │
│ │800– │ │1200 │ │1500 │ │900 │ │1100 │ ← varied │
│ │1500 │ │iters │ │iters │ │iters │ │iters │ strength │
│ │iters │ │ │ │ │ │ │ │ │ │
│ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ │
│ └────────┴────────┴────────┴───────────────┘ │
│ │ │
│ ▼ │
│ Raw game records (board, MCTS recommended move) │
│ │ │
│ ┌───────────────────┼───────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Perspective │ │ Duplicate boards │ │ Left-right board │ │
│ │ flip: -1 │ │ → keep majority │ │ mirroring (free │ │
│ │ player boards│ │ vote move │ │ 2× augmentation)│ │
│ └─────────────┘ └──────────────────┘ └──────────────────┘ │
│ │ │
│ ▼ │
│ ~400,000 unique (board, move) pairs │
│ encoded as float32 tensor (6, 7, 2) │
│ │ │
│ ┌─────────┴────────┐ │
│ ▼ ▼ │
│ 80% Training 20% Validation │
│ (39,483 samples) (9,871 samples) │
└─────────────────────────────────────────────────────────────────────┘
Example board state → encoded as shape (6, 7, 2):
Raw board (6×7):            Channel 0 — 🔴 (Player +1):   Channel 1 — 🟡 (Player -1):
┌──┬──┬──┬──┬──┬──┬──┐      ┌──┬──┬──┬──┬──┬──┬──┐        ┌──┬──┬──┬──┬──┬──┬──┐
│  │  │  │  │  │  │  │      │0 │0 │0 │0 │0 │0 │0 │        │0 │0 │0 │0 │0 │0 │0 │
│  │  │  │  │  │  │  │      │0 │0 │0 │0 │0 │0 │0 │        │0 │0 │0 │0 │0 │0 │0 │
│  │  │  │🔴│  │  │  │      │0 │0 │0 │1 │0 │0 │0 │        │0 │0 │0 │0 │0 │0 │0 │
│  │  │🟡│🔴│  │  │  │      │0 │0 │0 │1 │0 │0 │0 │        │0 │0 │1 │0 │0 │0 │0 │
│  │🟡│🔴│🔴│🟡│  │  │      │0 │0 │1 │1 │0 │0 │0 │        │0 │1 │0 │0 │1 │0 │0 │
│🔴│🔴│🟡│🔴│🔴│  │  │      │1 │1 │0 │1 │1 │0 │0 │        │0 │0 │1 │0 │0 │0 │0 │
└──┴──┴──┴──┴──┴──┴──┘      └──┴──┴──┴──┴──┴──┴──┘        └──┴──┴──┴──┴──┴──┴──┘
                                "Where am I?"              "Where is the opponent?"
We built two models: a Convolutional Neural Network (CNN) and a Transformer. Both were trained to predict the best column for the current player given a board encoded as a 6×7×2 tensor. The CNN scans small regions of the board and learns to recognize useful patterns, while the Transformer takes a different approach: it scans the whole board and learns relationships between all positions using attention (the mechanism introduced in the paper "Attention Is All You Need"). Each approach has distinct pros and cons that we explore in depth below.
The CNN was built from stacked convolutional layers followed by a dense classification head. Our structure consisted of Conv2D layers with 64, 128, 128, and 256 filters, batch normalization, ReLU activations, GlobalAveragePooling (which helps reduce overfitting compared to Flatten), a Dense layer of 128 units with ReLU, a Dropout rate of 30%, and finally a Dense layer of 7 units with softmax activation. With this setup, we reached a validation accuracy of 63%.
In plain terms: in the first layer the CNN looks at small 3×3 patterns and starts detecting simple relationships like two adjacent pieces, vertical alignment, and empty spaces. In the second layer it combines those patterns to detect more complex shapes: three-in-a-row, diagonal structures, near-winning setups. The third layer goes further, detecting double threats, fork setups (double attacks), and blocking patterns. That is what we mean by stacked convolutional layers. The dense classification head is the final step: after all the stacked layers, we compress the detected features into a summarized vector, feed it into a fully connected Dense layer, and output probabilities for the 7 possible columns. The model then plays the highest-probability column.
Input (6, 7, 2)
│
▼ ┌──────────────────────────────────────────────────────────────┐
│ Block 1 — Pattern Detection (early features) │
│ Conv2D(64 filters, 3×3, padding='same') │
│ ↳ detects: 2-in-a-row, edge pieces, isolated cells │
│ BatchNormalization → Activation('relu') │
└──────────────────────────────────────────────────────────────┘
│
▼ ┌──────────────────────────────────────────────────────────────┐
│ Block 2 — Threat Recognition │
│ Conv2D(128 filters, 3×3, padding='same') │
│ ↳ detects: 3-in-a-row, diagonal lines, near-wins │
│ BatchNormalization → Activation('relu') │
└──────────────────────────────────────────────────────────────┘
│
▼ ┌──────────────────────────────────────────────────────────────┐
│ Block 3 — Tactical Pattern Assembly │
│ Conv2D(128 filters, 3×3, padding='same') │
│ ↳ detects: double threats, blocked lines, open-fours │
│ BatchNormalization → Activation('relu') │
└──────────────────────────────────────────────────────────────┘
│
▼ ┌──────────────────────────────────────────────────────────────┐
│ Block 4 — High-Level Strategy │
│ Conv2D(256 filters, 3×3, padding='same') │
│ ↳ combines all lower features into strategic signals │
│ BatchNormalization → Activation('relu') │
└──────────────────────────────────────────────────────────────┘
│
▼
GlobalAveragePooling2D (replaces Flatten → reduces overfitting)
│
▼
Dense(128) → ReLU → Dropout(0.30)
│
▼
Dense(7) → Softmax
│
▼
P(col_0), P(col_1), ..., P(col_6) ← probability over 7 columns
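
The diagram above maps almost one-to-one onto Keras layers. A minimal sketch (not copied from the notebook, so treat it as an approximation of the published layer list rather than the exact training code):

import tensorflow as tf
from tensorflow.keras import layers

def build_cnn():
    # four conv blocks as above: 64 → 128 → 128 → 256 filters,
    # each Conv2D(3×3, same padding) → BatchNorm → ReLU
    inputs = layers.Input(shape=(6, 7, 2))
    x = inputs
    for filters in (64, 128, 128, 256):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    x = layers.GlobalAveragePooling2D()(x)       # instead of Flatten
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.30)(x)
    outputs = layers.Dense(7, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_cnn()
model.summary()   # parameter count lands near the ~553K reported below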
Hyperparameters (all verified from training logs)
| Parameter | Value |
|---|---|
| Optimizer | Adam |
| Initial learning rate | 3e-4 |
| LR schedule | ReduceLROnPlateau (factor=0.5, patience=2, min=1e-6) |
| Batch size | 64 |
| Max epochs | 50 |
| Early stopping | patience=5 on val_loss, restore best weights |
| Conv filters | 64 → 128 → 128 → 256 |
| Kernel size | 3×3 throughout |
| Regularization | BatchNorm + Dropout(0.30) + GlobalAvgPool |
| Total parameters | 553,353 |
| Trainable | 552,199 |
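
The optimizer and callback settings in the table translate directly into standard Keras callbacks. A sketch, assuming x_train/y_train and x_val/y_val hold the encoded boards and integer column labels:

import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4),
    loss="sparse_categorical_crossentropy",   # assumes integer column labels
    metrics=["accuracy"],
)

callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                         patience=2, min_lr=1e-6),
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
]

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          batch_size=64, epochs=50, callbacks=callbacks)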
Training Log (extracted from notebook outputs)
Epoch   Train Acc   Val Acc   Val Loss   LR
─────   ─────────   ───────   ────────   ──────────
  1       28.6%      44.5%     1.4578    3.0e-04
  2       45.8%      46.1%     1.3656    3.0e-04
  3       49.3%      48.4%     1.3219    3.0e-04
  5       55.8%      48.2%     1.4208    ↓ 1.5e-04    ← LR reduced (plateau)
  6       60.1%      51.7%     1.2707    1.5e-04
  8       65.1%      52.8% ★   1.2563    1.5e-04      ← best val loss
 10       69.4%      49.8%     1.3846    ↓ 7.5e-05    ← LR reduced again
 12       75.3%      53.3%     1.3517    ↓ 3.75e-05
 13       78.2%      53.3%     1.3714    3.75e-05
─────────────────────────────────────────────────────
Early stopped at epoch 13. Best weights restored from epoch 8.
Val Accuracy (best weights): 52.76% | Full dataset run: 63%
Our Transformer is a Vision Transformer (ViT)-style architecture adapted for Connect 4. The board is reshaped into 42 tokens, each projected to a 128-dimensional embedding; a CLS token and trainable positional embeddings are added; the sequence passes through 4 Transformer encoder blocks (multi-head attention, residual connections, and MLP blocks); finally the CLS token is extracted and fed to a dense head with 7-class softmax. With this setup, we reached a validation accuracy of 60.32%.
Originally, Transformers were built for text. Researchers then adapted them for images, producing the Vision Transformer (ViT), and we adapted that same idea for Connect 4. As mentioned, the Transformer sees the whole board at once: it breaks it into 42 small tokens (one per cell), converts each cell into a vector (the embedding step), and adds positional information so the model knows where each cell is. It then uses attention to let every cell interact with every other cell, so the model can decide which other cells matter when analyzing a specific position; for example, a piece in column 3 might "pay attention" to pieces in columns 2 and 4 to identify a possible diagonal threat. The CLS token acts as a summary notebook that attends to all cells during this process. Once extracted, it contains a global summary of the entire board state, and from that the model decides which column to play.
Input (6, 7, 2)
│
▼
Reshape → 42 tokens of shape (2,)
"Each of the 42 cells becomes one token. The network has no pre-baked idea
of which cells are adjacent — it must learn spatial relationships from data."
│
▼
Dense(128) → 42-token sequence, each 128-dim (token projection)
│
▼
Prepend [CLS] token → sequence length = 43
"This learnable token acts as a 'global summary notebook',
collecting information from every cell via attention."
│
▼
Add trainable positional embeddings (43 × 128)
│
▼
┌────────────────────────────────────────────────┐
│ Transformer Encoder Block ×4 │
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ Multi-Head Self-Attention │ │
│ │ │ │
│ │ Every token queries every other token: │ │
│ │ "Does col 3 matter when I'm at col 4?" │ │
│ │ → learns diagonal threats, fork setups │ │
│ └──────────────────────────────────────────┘ │
│ ↓ Residual + LayerNorm │
│ ┌──────────────────────────────────────────┐ │
│ │ Feed-Forward MLP (expand → contract) │ │
│ └──────────────────────────────────────────┘ │
│ ↓ Residual + LayerNorm │
└────────────────────────────────────────────────┘ × 4 blocks
│
▼
Extract [CLS] token (shape: 128-dim vector)
│
▼
Dense(7) → Softmax
│
▼
P(col_0), ..., P(col_6)
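
A minimal Keras sketch of the same token pipeline. The CLS token and positional embeddings are trainable weights inside a small custom layer; the head count and MLP width here are assumptions, since only the embedding dim (128) and block count (4) are documented above:

import tensorflow as tf
from tensorflow.keras import layers

EMBED, BLOCKS, HEADS = 128, 4, 4   # head count / MLP width are assumptions

class ClsAndPosEmbed(layers.Layer):
    # prepends a learnable [CLS] token, then adds trainable positional
    # embeddings for all 43 positions (1 CLS + 42 cells)
    def build(self, input_shape):
        self.cls = self.add_weight(name="cls", shape=(1, 1, EMBED),
                                   initializer="zeros")
        self.pos = self.add_weight(name="pos", shape=(1, 43, EMBED),
                                   initializer="random_normal")
    def call(self, x):
        cls = tf.repeat(self.cls, tf.shape(x)[0], axis=0)
        return tf.concat([cls, x], axis=1) + self.pos

def build_transformer():
    inputs = layers.Input(shape=(6, 7, 2))
    x = layers.Reshape((42, 2))(inputs)        # one token per cell
    x = layers.Dense(EMBED)(x)                 # token projection to 128-dim
    x = ClsAndPosEmbed()(x)                    # sequence length is now 43
    for _ in range(BLOCKS):
        # multi-head self-attention with residual connection + LayerNorm
        attn = layers.MultiHeadAttention(num_heads=HEADS,
                                         key_dim=EMBED // HEADS)(x, x)
        x = layers.LayerNormalization()(x + attn)
        # feed-forward MLP (expand → contract) with residual + LayerNorm
        mlp = layers.Dense(2 * EMBED, activation="relu")(x)
        mlp = layers.Dense(EMBED)(mlp)
        x = layers.LayerNormalization()(x + mlp)
    cls = x[:, 0]                              # extract the [CLS] summary
    outputs = layers.Dense(7, activation="softmax")(cls)
    return tf.keras.Model(inputs, outputs)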
Hyperparameters (all verified from training logs)
| Parameter | Value |
|---|---|
| Optimizer | Adam (lr=3e-4) |
| Sequence length | 42 tokens (6×7 cells) |
| Token embedding dim | 128 |
| [CLS] token | Trainable, prepended to sequence |
| Positional embeddings | Trainable (43 × 128) |
| Encoder blocks | 4 |
| Max epochs | 60 |
| Data augmentation | Horizontal board flip (2× dataset) |
| Total parameters | 553,479 (all trainable) |
Training Log (60 full epochs — verified from notebook)
Epoch   Train Acc   Val Acc   Val Loss
─────   ─────────   ───────   ────────
  1       28.3%      35.8%     1.5812
 10       44.7%      47.0%     1.3927
 20       49.7%      50.8%     1.3033
 30       52.5%      54.2%     1.2393
 40       55.5%      57.1%     1.1848
 50       57.9%      58.6%     1.1493
 59       59.8%      60.1%     1.1241
 60       59.9%      60.3% ✓   1.1237   ← final
─────────────────────────────────────
Steady convergence — no early stop needed.
Training accuracy and validation accuracy stay close → no overfitting.
┌───────────────────────────────────────────────┐
│ AWS Lightsail │
│ │
│ Instance type : Linux/Unix, 1 GB RAM │
│ Purpose : Inference only (no training)│
│ Cost : Free tier / minimal │
│ │
│ ┌───────────────────────────────────────┐ │
│ │ Docker Container │ │
│ │ ├─ Python 3.10 │ │
│ │ ├─ tensorflow==2.12.1 │ │
│ │ ├─ numpy │ │
│ │ ├─ anvil-uplink==0.4.2 │ │
│ │ ├─ cnn_savedmodel/ (4.4 MB) │ │
│ │ ├─ transformer_savedmodel/ (824 KB) │ │
│ │ └─ backend.py │ │
│ │ │ │
│ │ Startup sequence: │ │
│ │ 1. Connect to Anvil via Uplink key │ │
│ │ 2. Load both models into memory once │ │
│ │ 3. wait_forever() — serves requests │ │
│ └───────────────────────────────────────┘ │
└───────────────────────────────────────────────┘
User clicks column 3
│
▼
Anvil encodes board as (6,7,2) float32 tensor
│
▼
anvil.server.call('get_move', board, 'cnn')
│
▼ [encrypted Uplink tunnel]
│
▼
backend.py receives board
_ensure_1_6_7_2(board) → shape (1, 6, 7, 2)
│
├─── CNN selected ──► cnn_infer(board_tensor)
│ ↳ output: (1, 7) probability vector
│ ↳ argmax → column 4 (0–6)
│
return 4
│
▼ [< 50ms round trip]
│
Anvil drops 🟡 in column 4, updates board
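
Put together, the serving loop in backend.py reduces to a few calls. Here is a hedged sketch built only from the pieces named above; the real file adds its own logging and error handling, and the Uplink key is a placeholder:

import anvil.server
import numpy as np
import tensorflow as tf

anvil.server.connect("UPLINK_KEY")   # 1. connect via Anvil Uplink

# 2. load both SavedModels into memory once, at startup
cnn = tf.saved_model.load("models/cnn_savedmodel").signatures["serving_default"]
tr  = tf.saved_model.load("models/transformer_savedmodel").signatures["serving_default"]

def _ensure_1_6_7_2(board):
    # normalize whatever the frontend sends to a (1, 6, 7, 2) float32 batch
    return np.asarray(board, dtype=np.float32).reshape(1, 6, 7, 2)

@anvil.server.callable
def get_move(board, model_key):
    infer = cnn if model_key == "cnn" else tr
    x = tf.constant(_ensure_1_6_7_2(board))
    input_key = list(infer.structured_input_signature[1].keys())[0]
    probs = list(infer(**{input_key: x}).values())[0].numpy()[0]
    return int(np.argmax(probs))     # integer column 0–6 back to the UI

anvil.server.wait_forever()          # 3. block and serve requests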
connect4-ai/
│
├── 📄 README.md ← You are here
├── 📄 LICENSE ← MIT
├── 📄 .gitignore
│
├── 📂 data/
│ └── 📂 generator/
│ └── 🐍 mcts_self_play.py ← MCTS self-play data generation
│ Parallelized across 21 CPU cores
│ Output: ~400K (board, move) pairs
│
├── 📂 training/
│ └── 📓 Connect4_AI_Training.ipynb ← End-to-end training notebook
│ CNN + Transformer + full eval
│ Plots: accuracy, loss, win rates
│
├── 📂 backend/
│ ├── 🐍 backend.py ← Anvil Uplink inference server
│ ├── 🐳 Dockerfile ← Container definition
│ ├── 🐳 docker-compose.yml ← Compose config
│ └── 📄 requirements.txt ← Pinned: tensorflow, numpy, anvil-uplink
│
├── 📂 models/
│ ├── 📂 cnn_savedmodel/ ← CNN in TF SavedModel format (4.4 MB)
│ │ ├── saved_model.pb
│ │ └── variables/
│ ├── 📂 transformer_savedmodel/ ← Transformer in TF SavedModel (824 KB)
│ │ ├── saved_model.pb
│ │ └── variables/
│ └── 📦 connect4_transformer_v2_portable.h5 ← Transformer in Keras .h5
│
└── 📂 app/
└── 📄 Connect4AIGrp26.yaml ← Anvil frontend export (clone-able)
Note: The training dataset (connect4_400k_2channel.pkl, ~265 MB) is excluded via .gitignore. Regenerate it using data/generator/mcts_self_play.py or request access.
git clone https://github.com/YOUR_USERNAME/connect4-ai.git
cd connect4-ai

import numpy as np
import tensorflow as tf
# Load the CNN
model = tf.saved_model.load("models/cnn_savedmodel")
infer = model.signatures["serving_default"]
input_key = list(infer.structured_input_signature[1].keys())[0]
# Build any board: shape (6, 7, 2)
# Channel 0 = your pieces (+1), Channel 1 = opponent pieces (-1)
board = np.zeros((6, 7, 2), dtype=np.float32)
board[5, 3, 0] = 1.0 # your piece at bottom-center
board[5, 4, 1] = 1.0 # opponent piece next to it
x = board[np.newaxis, ...] # add batch dim → (1, 6, 7, 2)
output = infer(**{input_key: tf.constant(x)})
probs = list(output.values())[0].numpy()[0]
print(f"Recommended column : {np.argmax(probs)}")
print(f"Column probabilities: {np.round(probs, 3)}")pip install -r backend/requirements.txt
# Set your Anvil Uplink key (replace the placeholder in backend.py)
python backend/backend.py

Expected output:
START: Backend Starting, about to connect to Anvil
Anvil Uplink connection established successfully
Loading models...
CNN model loaded.
Transformer model loaded.
Backend fully operational
docker-compose -f backend/docker-compose.yml up --build
docker logs -f <container_id>

# 1. Generate data (CPU-intensive, uses multiprocessing)
python data/generator/mcts_self_play.py
# 2. Open and run training notebook
jupyter notebook training/Connect4_AI_Training.ipynb
# All models saved automatically to models/

┌────────────────┬──────────────────────────────────────────────────────┐
│ Layer          │ Technology                                           │
├────────────────┼──────────────────────────────────────────────────────┤
│ ML Framework   │ TensorFlow 2.12 / Keras                              │
│ Data Engine    │ MCTS (Monte Carlo Tree Search) · NumPy · multiprocess│
│ Architectures  │ CNN (Conv2D) · Vision Transformer (ViT-style)        │
│ Model Format   │ TF SavedModel · Keras .h5                            │
│ Backend        │ Python 3.10 · Anvil Uplink                           │
│ Container      │ Docker · Docker Compose                              │
│ Cloud          │ AWS Lightsail                                        │
│ Frontend       │ Anvil (Python full-stack web framework)              │
│ Training Env   │ Google Colab (GPU) · Local CPU (data generation)     │
└────────────────┴──────────────────────────────────────────────────────┘
MIT — see LICENSE.
Built with 🔴🟡 and a lot of MCTS iterations
If you liked this project, consider leaving a ⭐