This project is a full-stack, end-to-end machine learning application that recommends an optimal playstyle ("Aggressive Push," "Defensive Counter," or "Balanced Cycle") for a player's Clash Royale deck against an opponent's deck.
It is built with a FastAPI backend that serves two ML models, a vector search index, and a connection to the Gemini AI API. The frontend is an interactive Streamlit dashboard that includes a recommender, a deck optimizer, semantic search, and a full model evaluation suite.
- 🧠 ML Strategy Recommendation: Uses a two-model system (`Win-Seeker` and `Loss-Avoider` XGBoost models) to provide a nuanced, risk-adjusted playstyle recommendation.
- 🔄 Intelligent Deck Optimizer: A "what-if" engine that simulates 800+ card swaps to find the single best card to add to your deck, ranked by its impact on win probability and synergy.
- 🔍 Semantic Card Search: Uses a FAISS vector index to find cards based on natural language queries (e.g., "fast troop that targets buildings" -> "Hog Rider").
- 🤖 AI-Powered Explanations: Connects to the Google Gemini API to provide natural language explanations for why a strategy was recommended and to generate specific, actionable tips for any card.
- 📊 Interactive EDA & Model Evaluation: A complete "Data Analysis" tab in the dashboard that visualizes the original data (pick rates, elixir costs) and the model's performance (Feature Importance, ROC Curves, Probability Distributions).
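
The deck optimizer's "what-if" search can be pictured as a brute-force loop over single-card swaps. The block below is a minimal sketch of that idea, not the project's implementation: `score_deck` is a hypothetical stand-in for the models' win-probability estimate, and the small `CARD_POOL` with its elixir costs is illustrative only.

```python
from itertools import product

# Illustrative card pool with elixir costs (a tiny subset, not the real dataset).
CARD_POOL = {
    "Knight": 3, "Archers": 3, "Giant": 5, "Musketeer": 4,
    "Fireball": 4, "Zap": 2, "Minions": 3, "Arrows": 3,
    "Ice Spirit": 1, "Skeletons": 1, "Hog Rider": 4,
}


def score_deck(deck):
    """Hypothetical scorer: prefers an average elixir cost near 3.0.

    In the real app this role would be played by the trained models.
    """
    avg_elixir = sum(CARD_POOL[card] for card in deck) / len(deck)
    return -abs(avg_elixir - 3.0)


def best_single_swap(deck, score=score_deck):
    """Try every (slot, candidate) swap and return the largest score gain."""
    baseline = score(deck)
    candidates = [card for card in CARD_POOL if card not in deck]
    best = None  # (gain, removed_card, added_card)
    for slot, new_card in product(range(len(deck)), candidates):
        trial = deck[:slot] + [new_card] + deck[slot + 1:]
        gain = score(trial) - baseline
        if best is None or gain > best[0]:
            best = (gain, deck[slot], new_card)
    return best


deck = ["Knight", "Archers", "Giant", "Musketeer",
        "Fireball", "Zap", "Minions", "Arrows"]
gain, removed, added = best_single_swap(deck)
print(f"Swap {removed} -> {added} (score gain {gain:+.3f})")
```

With 8 deck slots and roughly 100 candidate cards, a loop like this evaluates on the order of 800 trial decks, which is consistent with the "800+ card swaps" figure above.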
- Backend: FastAPI, Uvicorn
- Frontend: Streamlit
- ML Models: Scikit-learn, XGBoost, imbalanced-learn (for SMOTE)
- Semantic Search: Sentence-Transformers, FAISS (`faiss-cpu`)
- AI Explanations: Google Generative AI (`google-generativeai`)
- Data Handling: Pandas, NumPy
- Plotting: Matplotlib, Seaborn
```
📁 Match-Strategy-Recommender/
│
├── 📄 __init__.py  # <-- Empty file to mark as package
├── 📄 .env  # <-- STORES YOUR GEMINI_API_KEY
├── 📄 .gitignore  # <-- IMPORTANT: Includes .env, venv/, __pycache__/
├── 📄 README.md  # <-- Your project's GitHub homepage
├── 📄 requirements.txt  # <-- All Python libraries to install
│
├── 📁 analysis/
│   ├── 📄 generate_elixir_boxplot.py
│   ├── 📄 generate_feature_importance.py
│   ├── 📄 generate_probability_distribution.py
│   ├── 📄 generate_recommended_troops.py
│   ├── 📄 generate_roc_curve.py
│   ├── 📄 generate_synergy_heatmap.py
│   │
│   ├── 🖼️ elixir_usage_by_archetype.png
│   ├── 🖼️ loss_avoider_feature_importance.png
│   ├── 🖼️ probability_distribution.png
│   ├── 🖼️ recommended_troop_lists.png
│   ├── 🖼️ roc_curve_comparison.png
│   ├── 🖼️ synergy_heatmap.png
│   └── 🖼️ win_seeker_feature_importance.png
│
├── 📁 app/
│   ├── 📄 __init__.py  # <-- Empty file to mark as package
│   ├── 📄 main.py  # <-- The complete FastAPI backend (API logic)
│   ├── 📄 streamlit_app.py  # <-- The complete Streamlit frontend (UI)
│   └── 📄 cache.py  # <-- Our in-memory cache class for the AI
│
├── 📁 data/
│   ├── 📄 cards_data.csv  # (Raw) Card attributes (HP, DPS, etc.)
│   ├── 📄 clash_data.csv  # (Raw) Player info
│   ├── 📄 deck_features.csv  # (Raw) Player decks
│   ├── 📄 match_history.csv  # (Raw) Win/loss data
│   ├── 📄 strategies_clean.csv  # (Raw) Archetypes, game modes
│   │
│   ├── 📄 preprocessed_data.csv  # (Processed) Merged/cleaned data
│   ├── 📄 attack_mode_features.csv  # (Processed) Final features for model
│   └── 📄 defense_mode_features.csv  # (Processed) Final features for model
│
├── 📁 models/
│   ├── 📄 strategy_model_loss_avoider.pkl  # <-- Our fixed, balanced model
│   ├── 📄 strategy_model_win_seeker.pkl  # <-- Our fixed, optimistic model
│   ├── 📄 faiss_card_index.idx  # <-- The semantic search index
│   ├── 📄 faiss_card_mapping.json  # <-- The semantic search card list
│   ├── 📄 demo_semantic_index.index  # (Optional) Generated by semantic_recommender.py
│   └── 📄 demo_semantic_index.meta.joblib  # (Optional) Generated by semantic_recommender.py
│
├── 📁 notebooks/
│   └── 📄 EDA.ipynb  # Your original notebook for exploration
│
└── 📁 src/
    ├── 📄 __init__.py  # <-- Empty file to mark as package
    │
    ├── 📄 data_preprocessing.py  # Pipeline Step 1
    ├── 📄 feature_engineering.py  # Pipeline Step 2
    ├── 📄 train_model.py  # Pipeline Step 3 (Trains ML models)
    ├── 📄 train_fnn.py  # (Experimental) Trains the FNN
    ├── 📄 semantic_recommender.py  # Class for semantic search
    ├── 📄 build_semantic_index.py  # One-time script to build the index
    ├── 📄 utils.py  # Core helper functions (archetype, etc.)
    │
    ├── 📄 generate_datasets_api.py  # (Original scripts)
    ├── 📄 preprocess_battle_data.py  # (Original scripts)
    └── 📄 recommend_strategy.py  # (Original scripts)
```
Follow these steps to set up and run the entire application on your local machine.
a. Clone the Repository

```bash
git clone https://github.com/Priyanshu-Ku/Clash-Royale-Match-Strategy-Recommendation-System.git
cd Clash-Royale-Match-Strategy-Recommendation-System
```

b. Create and Activate a Virtual Environment

```bash
# Windows
python -m venv venv
.\venv\Scripts\activate

# macOS / Linux
python3 -m venv venv
source venv/bin/activate
```

c. Install Dependencies

```bash
pip install -r requirements.txt
```

d. Create Your Environment File

You must have an API key from Google AI Studio for the AI features to work.

- Create a file named `.env` in the root of the project.
- Add your API keys to it:

```
SUPERCELL_TOKEN=YOUR_CLASH_ROYALE_API_TOKEN_HERE
GEMINI_API_KEY="YOUR_API_KEY_HERE"
```
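
For readers curious how such a file ends up in the process environment, here is a standard-library sketch. It is an assumption that the project reads `.env` this way; it may well rely on a helper package instead. `parse_env_text` and `load_env_file` are hypothetical names introduced for illustration.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks and comments, stripping quotes."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Copy values from a .env file into os.environ without overwriting."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_text(handle.read()).items():
            os.environ.setdefault(key, value)


# Demonstration on the placeholder content shown above (no file access needed).
sample = (
    "SUPERCELL_TOKEN=YOUR_CLASH_ROYALE_API_TOKEN_HERE\n"
    'GEMINI_API_KEY="YOUR_API_KEY_HERE"\n'
)
parsed = parse_env_text(sample)
print(sorted(parsed))
```

Both the quoted and the unquoted form parse to the bare value, so either style works in the `.env` file under this sketch.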

e. Build the Semantic Index (One-Time Step)

Run the following command to create the vector index for the semantic search:

```bash
python src/build_semantic_index.py
```

This will create `faiss_card_index.idx` and `faiss_card_mapping.json` in the `models/` folder.

### 2. Run the Application

This project requires two terminals running at the same time.

Terminal 1: Start the FastAPI Backend

```bash
uvicorn app.main:app --reload
```

Wait until you see the log message `Application startup complete.`

Terminal 2: Start the Streamlit Frontend

```bash
streamlit run app/streamlit_app.py
```

This will automatically open the application in your web browser.
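
If you would rather script the "wait for the backend" step than watch the log, a small polling helper can do it. This is a sketch under assumptions: it uses the `GET /health` endpoint and the `http://localhost:8000` address that appear elsewhere in this README, and `backend_is_up` and `wait_for_backend` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def backend_is_up(url="http://localhost:8000/health", timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_backend(probe=backend_is_up, attempts=10, delay=1.0):
    """Poll `probe` until it succeeds; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        time.sleep(delay)
    return None


# Typical use before launching Streamlit:
#     if wait_for_backend() is None:
#         raise SystemExit("Backend did not become ready in time")
```

The probe is passed in as a parameter so the retry logic can be exercised without a running server.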

## 🔄 How to Retrain the Models

If you add new data to the `data/` folder, you can retrain all models and regenerate the analysis plots by running the scripts in order:

1. Run the Data Pipeline:

   ```bash
   python src/generate_datasets_api.py
   python src/preprocess_battle_data.py
   python src/data_preprocessing.py
   python src/feature_engineering.py
   ```

2. Train the ML Models:

   ```bash
   python src/train_model.py
   ```

3. (Optional) Re-build the Semantic Index:

   ```bash
   python src/build_semantic_index.py
   ```

4. (Optional) Re-generate all Analysis Plots:

   ```bash
   python analysis/generate_feature_importance.py
   python analysis/generate_roc_curve.py
   python analysis/generate_probability_distribution.py
   python analysis/generate_synergy_heatmap.py
   python analysis/generate_recommended_troops.py
   python analysis/generate_elixir_boxplot.py
   ```
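
The required steps can also be chained from one script so that a failure stops the run early. The following is a sketch rather than a file shipped with the repository: `run_pipeline` is a hypothetical helper, and the script order is copied from the data pipeline and training steps listed in this section.

```python
import subprocess
import sys

# Script order taken from the data pipeline and training steps.
PIPELINE = [
    "src/generate_datasets_api.py",
    "src/preprocess_battle_data.py",
    "src/data_preprocessing.py",
    "src/feature_engineering.py",
    "src/train_model.py",
]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order; stop at the first non-zero exit code."""
    completed = []
    for script in scripts:
        result = runner([sys.executable, script])
        if result.returncode != 0:
            print(f"Stopped: {script} exited with code {result.returncode}")
            break
        completed.append(script)
    return completed


# From the project root, with the virtual environment active:
#     run_pipeline()
```

Using `sys.executable` keeps every step on the same interpreter (and virtual environment) that launched the runner.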
## 📖 API Documentation
### Endpoints Overview
The FastAPI backend provides the following REST endpoints:
#### 1. Strategy Recommendation
```http
POST /recommend
```

Request Body:

```json
{
  "cards": [
    "Knight", "Archers", "Mortar", "Fireball",
    "Tesla", "Log", "Ice Spirit", "Skeletons"
  ],
  "opponent_type": "beatdown"
}
```

Response:

```json
{
  "strategy": "Defensive Counter",
  "confidence": 0.85,
  "features": {
    "avg_elixir": 3.1,
    "damage_potential": 320,
    "defense_strength": 450,
    "synergy_score": 7.2
  }
}
```

#### 2. Semantic Search

```http
POST /semantic
```

Request Body:

```json
{
  "query": "aggressive giant push with support troops",
  "top_k": 5
}
```

Response:

```json
{
  "results": [
    {
      "strategy": "Giant + Musketeer + Mega Minion push with spell support",
      "similarity": 0.92
    },
    {
      "strategy": "Giant beatdown with elixir pump and heavy support",
      "similarity": 0.88
    }
  ],
  "query_time_ms": 12.5,
  "total_strategies": 15000
}
```

#### 3. Card List

```http
GET /cards
```

Response:

```json
{
  "cards": [
    {
      "name": "Knight",
      "elixir": 3,
      "type": "troop",
      "role": "tank",
      "attributes": {
        "hitpoints": 1568,
        "damage": 167,
        "dps": 139
      }
    }
  ],
  "total_cards": 107
}
```

#### 4. Health Check

```http
GET /health
```

Response:

```json
{
  "status": "healthy",
  "card_data": true,
  "model": true,
  "total_cards": 107,
  "details": {
    "semantic_index": "loaded",
    "cache_status": "active"
  }
}
```

Python:

```python
import requests

# Get strategy recommendation
response = requests.post(
    "http://localhost:8000/recommend",
    json={
        "cards": ["Hog Rider", "Valkyrie", "Musketeer", "Fireball",
                  "Zap", "Ice Spirit", "Cannon", "Skeletons"],
        "opponent_type": "cycle"
    }
)
recommendation = response.json()
print(f"Recommended Strategy: {recommendation['strategy']}")
print(f"Confidence: {recommendation['confidence']:.2%}")

# Semantic search
response = requests.post(
    "http://localhost:8000/semantic",
    json={
        "query": "fast cycle deck with spell bait",
        "top_k": 3
    }
)
results = response.json()
for result in results['results']:
    print(f"{result['strategy']} (similarity: {result['similarity']:.2f})")
```

cURL:

```bash
# Strategy recommendation
curl -X POST "http://localhost:8000/recommend" \
  -H "Content-Type: application/json" \
  -d '{
    "cards": ["Giant", "Witch", "Musketeer", "Fireball",
              "Zap", "Mega Minion", "Tombstone", "Ice Spirit"],
    "opponent_type": "control"
  }'

# Semantic search
curl -X POST "http://localhost:8000/semantic" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "defensive counter push strategy",
    "top_k": 5
  }'
```

JavaScript (Fetch):

```javascript
// Strategy recommendation
const getRecommendation = async (deck, opponentType) => {
  const response = await fetch("http://localhost:8000/recommend", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      cards: deck,
      opponent_type: opponentType,
    }),
  });
  const data = await response.json();
  console.log(`Strategy: ${data.strategy}`);
  console.log(`Confidence: ${(data.confidence * 100).toFixed(1)}%`);
  return data;
};

// Usage
getRecommendation(
  [
    "X-Bow",
    "Tesla",
    "Ice Golem",
    "Archers",
    "Fireball",
    "Log",
    "Ice Spirit",
    "Skeletons",
  ],
  "beatdown"
);
```

The system uses five primary datasets containing Clash Royale match data, player profiles, and card attributes.

#### 1. Player Profiles (`clash_data.csv`)

- Size: ~50,000 records
- Columns:
  - `tag`: Unique player identifier
  - `name`: Player name
  - `trophies`: Current trophy count
  - `expLevel`: Experience level (1-14)
  - `arena`: Current arena
  - `clan`: Clan affiliation
- Purpose: Player demographic information and skill level indicators

#### 2. Deck Features (`deck_features.csv`)

- Size: ~100,000 deck configurations
- Columns:
  - `playerTag`: Player identifier
  - `cards`: List of 8 cards in deck
  - `averageElixir`: Average elixir cost
- Purpose: Deck composition analysis and feature extraction

#### 3. Match History (`match_history.csv`)

- Size: ~500,000 matches
- Columns:
  - `playerTag`: Player identifier
  - `opponent`: Opponent identifier
  - `winner`: Boolean win/loss
  - `battleTime`: Timestamp of match
  - `gameMode`: Game mode played
- Purpose: Win rate calculation, counter analysis, and performance tracking

#### 4. Strategies (`strategies_clean.csv`)

- Size: ~75,000 records
- Columns:
  - `playerTag`: Player identifier
  - `avg_elixir`: Average elixir cost
  - `est_elixir_per_min`: Estimated elixir efficiency
  - `synergy`: Card synergy score
  - `cohesion`: Deck cohesion metric
  - `mode_difficulty`: Game mode difficulty
  - `winner_flag`: Match outcome (0/1)
  - `player_archetype_*`: One-hot encoded archetypes (Beatdown, Hog Cycle, etc.)
  - `opponent_archetype_*`: Opponent deck types
  - `game_mode_*`: One-hot encoded game modes
- Purpose: Training labels for strategy classification and archetype analysis
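
One-hot columns such as `player_archetype_*` are easy to turn back into a single readable label. The sketch below shows one way to do that with the standard library. It is an illustration only: the exact column names (`player_archetype_Beatdown`, `player_archetype_Hog Cycle`) and the sample rows are assumptions, and `decode_one_hot` is a hypothetical helper.

```python
import csv
import io


def decode_one_hot(row, prefix):
    """Return the label whose `<prefix><label>` column is set, else None."""
    for column, value in row.items():
        if column.startswith(prefix) and str(value).strip() in {"1", "1.0", "True"}:
            return column[len(prefix):]
    return None


# Illustrative rows; the real file has more columns and archetypes.
sample_csv = io.StringIO(
    "winner_flag,player_archetype_Beatdown,player_archetype_Hog Cycle\n"
    "1,0,1\n"
    "0,1,0\n"
)
rows = list(csv.DictReader(sample_csv))
labels = [decode_one_hot(row, "player_archetype_") for row in rows]
print(labels)  # ['Hog Cycle', 'Beatdown']
```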

#### 5. Card Attributes (`cards_data.csv`)

- Size: 107 unique cards
- Columns:
  - `card_name`: Card identifier
  - `elixir_cost`: Elixir cost (1-9)
  - `card_type`: Type (troop/spell/building)
  - `rarity`: Rarity tier (common/rare/epic/legendary)
  - `arena_unlock`: Arena unlock level
  - `targets`: Target type (ground/air/both)
  - `damage_type`: Damage type (single/area)
  - `range`: Attack range
  - `hit_speed`: Attack speed
  - `hitpoints`: Health points
  - `damage`: Damage per hit
  - `dps`: Damage per second
  - `speed`: Movement speed
  - `count`: Unit count (if spawner)
  - `spawn_effect`: Special spawn abilities
  - `special_ability`: Unique card abilities
  - `description`: Card description text
- Purpose: Card statistics, feature engineering, and semantic text generation

```
Raw Data   →  Preprocessing  →  Feature Engineering  →  Model Training
    ↓               ↓                    ↓                     ↓
CSV Files  →   Clean Data    →   Feature Matrices    →  Trained Models
```

1. Preprocessing (`data_preprocessing.py`)
   - Missing value imputation (median for numeric, 'Unknown' for categorical)
   - Column name standardization (lowercase, stripped)
   - Categorical encoding (LabelEncoder/OneHotEncoder)
   - Date parsing and type conversion
   - Dataset merging on player tags
2. Feature Engineering (`feature_engineering.py`)
   - Deck statistics computation (elixir, DPS, defense)
   - Synergy scores (role, type, historical win rates)
   - Counter metrics (card vs card, archetype matchups)
   - Deck embeddings (32-dimensional vectors)
   - Mode-specific features (Attack/Defense)
3. Model Training (`train_model.py`)
   - Strategy classification (3 classes)
   - Win rate prediction
   - Model evaluation and validation
   - Hyperparameter tuning
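
To make the first two preprocessing bullets concrete, here is a dependency-free sketch of column-name standardization followed by median and 'Unknown' imputation. It is not taken from `data_preprocessing.py`, which likely does this with Pandas; the helper names and the sample records are assumptions for illustration.

```python
from statistics import median


def standardize_columns(record):
    """Lowercase and strip column names."""
    return {key.strip().lower(): value for key, value in record.items()}


def impute(records, numeric_cols, categorical_cols):
    """Median-fill numeric gaps and use 'Unknown' for categorical gaps."""
    rows = [standardize_columns(record) for record in records]
    medians = {
        col: median(row[col] for row in rows if row.get(col) is not None)
        for col in numeric_cols
    }
    for row in rows:
        for col in numeric_cols:
            if row.get(col) is None:
                row[col] = medians[col]
        for col in categorical_cols:
            if not row.get(col):  # None or empty string
                row[col] = "Unknown"
    return rows


raw = [
    {" Trophies": 5200, "Clan ": "Alpha"},
    {" Trophies": None, "Clan ": None},
    {" Trophies": 4800, "Clan ": ""},
]
clean = impute(raw, numeric_cols=["trophies"], categorical_cols=["clan"])
print(clean[1])  # {'trophies': 5000.0, 'clan': 'Unknown'}
```

The median is used instead of the mean so that a few extreme values (for example, very high trophy counts) do not skew the filled-in entries.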

Data sources:

- Clash Royale Official API
- Community match replays
- Historical tournament data
- Player-submitted deck compositions

Update frequency:

- Match history: Real-time via API
- Card attributes: After game balance updates
- Player profiles: Daily batch updates
- Strategy classifications: Weekly retraining

Scenario: You want to know the best strategy for your Hog Cycle deck.

Using Streamlit Dashboard:

- Open the dashboard at `http://localhost:8501`
- Navigate to the "Strategy Recommender" tab
- Select your deck cards:
  - Hog Rider, Ice Golem, Musketeer, Cannon
  - Fireball, Log, Ice Spirit, Skeletons
- Choose opponent type: "Beatdown"
- Click "Get Recommendation"
- View results with confidence score and deck statistics

Using Python API:

```python
import requests

deck = [
    "Hog Rider", "Ice Golem", "Musketeer", "Cannon",
    "Fireball", "Log", "Ice Spirit", "Skeletons"
]
response = requests.post(
    "http://localhost:8000/recommend",
    json={"cards": deck, "opponent_type": "beatdown"}
)
result = response.json()
print(f"🎯 Strategy: {result['strategy']}")
print(f"📊 Confidence: {result['confidence']:.1%}")
print(f"⚡ Avg Elixir: {result['features']['avg_elixir']}")
print(f"🛡️ Defense: {result['features']['defense_strength']}")
```

Expected Output:

```
🎯 Strategy: Defensive Counter
📊 Confidence: 87.5%
⚡ Avg Elixir: 2.9
🛡️ Defense: 425
```

Scenario: You want to find decks similar to "Giant Beatdown with support".
Using Streamlit Dashboard:
- Go to "Semantic Search" tab
- Enter query: "Giant beatdown with musketeer and mega minion"
- Set top-k: 5
- Click "Search Strategies"
- Review similar strategies with similarity scores

Using Python API:

```python
import requests

response = requests.post(
    "http://localhost:8000/semantic",
    json={
        "query": "Giant beatdown with musketeer and mega minion",
        "top_k": 5
    }
)
results = response.json()
print(f"⏱️ Query time: {results['query_time_ms']:.1f}ms\n")
for i, result in enumerate(results['results'], 1):
    print(f"{i}. {result['strategy']}")
    print(f"   Similarity: {result['similarity']:.2%}\n")
```

Expected Output:

```
⏱️ Query time: 15.3ms

1. Giant + Musketeer + Mega Minion push with Zap and Fireball
   Similarity: 94.5%

2. Giant beatdown with wizard and mega minion support
   Similarity: 89.2%

3. Heavy Giant push with multiple support troops
   Similarity: 85.7%
```

Scenario: Analyze multiple decks at once for comparison.

```python
import requests

decks = [
    {
        "name": "Hog Cycle",
        "cards": ["Hog Rider", "Valkyrie", "Musketeer", "Fireball",
                  "Zap", "Cannon", "Ice Spirit", "Skeletons"]
    },
    {
        "name": "Giant Beatdown",
        "cards": ["Giant", "Witch", "Musketeer", "Fireball",
                  "Zap", "Mega Minion", "Tombstone", "Ice Spirit"]
    },
    {
        "name": "X-Bow Siege",
        "cards": ["X-Bow", "Tesla", "Ice Golem", "Archers",
                  "Fireball", "Log", "Ice Spirit", "Skeletons"]
    }
]

print("Deck Comparison:")
print("-" * 70)
print(f"{'Deck Name':<20} {'Strategy':<20} {'Confidence':<15}")
print("-" * 70)
for deck in decks:
    # One request per deck; opponent_type is omitted here
    response = requests.post(
        "http://localhost:8000/recommend",
        json={"cards": deck["cards"]}
    )
    result = response.json()
    print(f"{deck['name']:<20} {result['strategy']:<20} {result['confidence']:.1%}")
```

Expected Output:

```
Deck Comparison:
----------------------------------------------------------------------
Deck Name            Strategy             Confidence
----------------------------------------------------------------------
Hog Cycle            Balanced Cycle       82.3%
Giant Beatdown       Aggressive Push      89.1%
X-Bow Siege          Defensive Counter    91.5%
```

Scenario: Analyze your deck performance against different opponent archetypes.

```python
import requests

my_deck = [
    "Miner", "Poison", "Valkyrie", "Musketeer",
    "Ice Spirit", "Skeletons", "Cannon", "Log"
]
opponent_types = ["beatdown", "cycle", "control", "bait", "siege"]

print("Match-up Analysis:")
print("-" * 60)
# The last column is the model's confidence in the recommendation
print(f"{'Opponent Type':<15} {'Recommended Strategy':<25} {'Confidence'}")
print("-" * 60)
for opp_type in opponent_types:
    response = requests.post(
        "http://localhost:8000/recommend",
        json={"cards": my_deck, "opponent_type": opp_type}
    )
    result = response.json()
    print(f"{opp_type.capitalize():<15} {result['strategy']:<25} {result['confidence']:.0%}")
```

Expected Output:

```
Match-up Analysis:
------------------------------------------------------------
Opponent Type   Recommended Strategy      Confidence
------------------------------------------------------------
Beatdown        Defensive Counter         85%
Cycle           Balanced Cycle            78%
Control         Aggressive Push           72%
Bait            Defensive Counter         81%
Siege           Aggressive Push           76%
```

Complete workflow for optimizing a deck based on recommendations:

```python
import requests

BASE_URL = "http://localhost:8000"

# Step 1: Get initial recommendation
current_deck = [
    "Knight", "Archers", "Arrows", "Fireball",
    "Giant", "Musketeer", "Minions", "Zap"
]
print("=" * 70)
print("DECK OPTIMIZATION WORKFLOW")
print("=" * 70)
response = requests.post(
    f"{BASE_URL}/recommend",
    json={"cards": current_deck}
)
initial = response.json()
print("\n📊 Initial Deck Analysis:")
print(f"   Strategy: {initial['strategy']}")
print(f"   Confidence: {initial['confidence']:.1%}")
print(f"   Avg Elixir: {initial['features']['avg_elixir']:.1f}")
print(f"   Synergy: {initial['features'].get('synergy_score', 'N/A')}")

# Step 2: Find similar successful strategies
print("\n🔍 Finding Similar High-Performing Strategies...")
response = requests.post(
    f"{BASE_URL}/semantic",
    json={
        "query": f"{initial['strategy']} with high win rate",
        "top_k": 3
    }
)
similar = response.json()
print("\n   Top 3 Similar Strategies:")
for i, result in enumerate(similar['results'], 1):
    print(f"   {i}. {result['strategy']}")
    print(f"      Similarity: {result['similarity']:.0%}")

# Step 3: Test optimized deck
print("\n🔧 Testing Optimized Deck...")
optimized_deck = [
    "Giant", "Musketeer", "Mega Minion", "Fireball",
    "Zap", "Ice Spirit", "Cannon", "Skeletons"
]
response = requests.post(
    f"{BASE_URL}/recommend",
    json={"cards": optimized_deck}
)
optimized = response.json()
print(f"\n   Strategy: {optimized['strategy']}")
print(f"   Confidence: {optimized['confidence']:.1%}")
print(f"   Avg Elixir: {optimized['features']['avg_elixir']:.1f}")

# Step 4: Compare results (confidence difference in percentage points)
improvement = (optimized['confidence'] - initial['confidence']) * 100
elixir_change = optimized['features']['avg_elixir'] - initial['features']['avg_elixir']
print("\n📈 Optimization Results:")
print(f"   Confidence Improvement: {improvement:+.1f}%")
print(f"   Avg Elixir Change: {elixir_change:+.1f}")
if improvement > 0:
    print("   ✅ Optimization Successful!")
else:
    print("   ℹ️ Consider alternative modifications")
print("=" * 70)
```

Scenario: Auto-analyze deck after each match.

```python
import requests


class ClashRoyaleAssistant:
    def __init__(self, api_url="http://localhost:8000"):
        self.api_url = api_url
        self.match_history = []

    def analyze_deck(self, deck, opponent_type=None):
        """Get strategy recommendation for a deck."""
        response = requests.post(
            f"{self.api_url}/recommend",
            json={"cards": deck, "opponent_type": opponent_type}
        )
        return response.json()

    def find_counters(self, opponent_deck):
        """Find effective counter strategies."""
        query = f"counter to {', '.join(opponent_deck[:3])}"
        response = requests.post(
            f"{self.api_url}/semantic",
            json={"query": query, "top_k": 3}
        )
        return response.json()

    def record_match(self, deck, opponent_type, strategy_used, won):
        """Track match performance."""
        self.match_history.append({
            "deck": deck,
            "opponent": opponent_type,
            "strategy": strategy_used,
            "result": "win" if won else "loss"
        })

    def get_stats(self):
        """Get performance statistics."""
        if not self.match_history:
            return "No matches recorded"
        total = len(self.match_history)
        wins = sum(1 for m in self.match_history if m["result"] == "win")
        win_rate = (wins / total) * 100
        return f"Matches: {total}, Wins: {wins}, Win Rate: {win_rate:.1f}%"


# Usage
assistant = ClashRoyaleAssistant()

# Before match
my_deck = ["Hog Rider", "Ice Golem", "Musketeer", "Cannon",
           "Fireball", "Log", "Ice Spirit", "Skeletons"]
recommendation = assistant.analyze_deck(my_deck, "beatdown")
print(f"Use strategy: {recommendation['strategy']}")

# After match
assistant.record_match(my_deck, "beatdown", recommendation['strategy'], won=True)
print(assistant.get_stats())
```

## 📜 License

This project is licensed under the MIT License. See the LICENSE file for details.
