A comprehensive data science project for analyzing professional League of Legends esports match data from the 2025 season. This project includes data loading, exploratory data analysis (EDA), statistical analysis, and visualizations to uncover insights about player performance, champion meta, team strategies, and competitive trends.
This project provides tools and analyses for understanding professional League of Legends competitive play through data. Using match data from Oracle's Elixir, we examine:
- Player Performance: Individual statistics, KDA ratios, and performance metrics
- Champion Meta: Pick/ban rates, win rates, and meta trends
- Team Analysis: Team performance, win rates, and strategic patterns
- Position Comparison: Role-specific statistics and trends
- Temporal Analysis: Game duration impact and seasonal trends
- Statistical Correlations: Relationships between performance metrics
lol-Esports-Project/
│
├── config/                 # Configuration files
│   ├── __init__.py
│   └── config.py           # Project settings and constants
│
├── data/                   # Data directory (gitignored)
│   ├── raw/                # Raw CSV files from Oracle's Elixir
│   └── processed/          # Cleaned and processed data
│
├── notebooks/              # Jupyter notebooks
│   └── 01_exploratory_data_analysis.ipynb
│
├── src/                    # Source code
│   ├── __init__.py
│   ├── data_loader.py      # Data downloading and loading utilities
│   └── eda_analysis.py     # Exploratory data analysis scripts
│
├── output/                 # Analysis outputs (gitignored)
│   ├── figures/            # Visualization outputs
│   └── reports/            # Generated reports
│
├── .gitignore              # Git ignore file
├── requirements.txt        # Python dependencies
└── README.md               # This file
- Python 3.9 or higher
- pip package manager
- (Optional) Jupyter Lab/Notebook for interactive analysis
- Clone the repository
git clone https://github.com/mattspooner1/lol-Esports-Project.git
cd lol-Esports-Project
- Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt

The project uses data from Oracle's Elixir, which is hosted on Google Drive.
Option 1: Automatic Download (uses the data_loader script)
from src.data_loader import DataLoader
loader = DataLoader()
df = loader.load_year_data(2025, download_if_missing=True)

Option 2: Manual Download
- Visit the Oracle's Elixir Google Drive: https://drive.google.com/drive/folders/1gLSw0RLjBbtaNy0dgnGQDAZOHIgCe-HH
- Download the 2025 CSV file
- Save it to data/raw/lol_esports_2025.csv
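Either option ends with the same CSV on disk. As a minimal sketch of reading it directly with pandas, assuming the path above: the helper name `load_raw_csv` is hypothetical and is not part of `src.data_loader`.

```python
from pathlib import Path

import pandas as pd


def load_raw_csv(path="data/raw/lol_esports_2025.csv"):
    """Load the season CSV, failing early with a clear message if it is absent.

    Hypothetical helper for illustration; the project's own loader may differ.
    """
    csv_path = Path(path)
    if not csv_path.exists():
        raise FileNotFoundError(
            f"{csv_path} not found; download the 2025 CSV and save it there first."
        )
    # low_memory=False parses the file in one pass, so columns with mixed
    # values get a single consistent dtype instead of chunk-by-chunk guesses.
    return pd.read_csv(csv_path, low_memory=False)
```

Calling `load_raw_csv()` with no arguments reads the default location; pass another path to load a different season file.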
Execute the full exploratory data analysis:
python src/eda_analysis.py

This will:
- Load and clean the data
- Generate summary statistics
- Create all visualizations
- Save processed data and figures to output directories
For interactive analysis:
jupyter lab
# Navigate to notebooks/01_exploratory_data_analysis.ipynb

from src.data_loader import DataLoader
from src.eda_analysis import EsportsEDA
# Load data
loader = DataLoader()
df = loader.load_year_data(2025)
# Run EDA
eda = EsportsEDA(df)
eda.run_complete_eda()
# Or run specific analyses
df_clean = eda.clean_data()
player_stats = eda.analyze_player_performance()
champion_stats = eda.analyze_champion_meta()

- Individual player statistics (KDA, DPM, CSPM, Vision Score)
- Top player rankings
- Performance trends
- Win rates and game counts
Output: data/processed/player_performance.csv, output/figures/top_players_kda.png
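To make the player metrics concrete, here is a rough sketch of how per-player KDA and win rate could be derived with pandas. It is not the code behind `analyze_player_performance()`. The function name `summarize_players` and the column names (`playername`, `kills`, `deaths`, `assists`, `result`) are assumptions about the CSV layout.

```python
import pandas as pd


def summarize_players(df, min_games=1):
    """Aggregate per-player totals, then derive KDA and win rate.

    Column names are assumed, not confirmed by the project source.
    """
    stats = df.groupby("playername").agg(
        games=("result", "size"),
        wins=("result", "sum"),
        kills=("kills", "sum"),
        deaths=("deaths", "sum"),
        assists=("assists", "sum"),
    )
    # Floor deaths at 1 so a player with no deaths still gets a finite KDA.
    stats["kda"] = (stats["kills"] + stats["assists"]) / stats["deaths"].clip(lower=1)
    stats["win_rate"] = stats["wins"] / stats["games"]
    # Drop small samples, then rank by KDA.
    return stats[stats["games"] >= min_games].sort_values("kda", ascending=False)


# Tiny made-up sample purely to show the call shape.
sample = pd.DataFrame(
    {
        "playername": ["A", "A", "B", "B"],
        "kills": [3, 5, 1, 0],
        "deaths": [1, 1, 0, 0],
        "assists": [4, 6, 9, 10],
        "result": [1, 0, 1, 1],
    }
)
top_players = summarize_players(sample)
```

Computing KDA from season totals rather than averaging per-game ratios keeps a single deathless game from dominating a player's figure.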
- Pick rates and ban rates
- Champion win rates
- Meta trends and popularity
- Champion-specific statistics
Output: data/processed/champion_meta.csv, output/figures/champion_pickrate.png
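Pick rate and win rate can be illustrated with a short pandas sketch. This is an assumption-laden example, not the logic in `analyze_champion_meta()`. The function name is hypothetical, the columns `gameid`, `champion`, and `result` are assumed, and ban rate is left out because it depends on how ban columns are laid out in the file.

```python
import pandas as pd


def champion_pick_and_win_rates(df):
    """Compute pick rate (share of games) and win rate per champion.

    Assumes one row per player per game with gameid, champion, result columns.
    """
    total_games = df["gameid"].nunique()
    # Rows with a missing champion are dropped by groupby automatically.
    meta = df.groupby("champion").agg(
        picks=("gameid", "nunique"),
        wins=("result", "sum"),
    )
    meta["pick_rate"] = meta["picks"] / total_games
    # A champion can be picked at most once per game, so picks equals appearances.
    meta["win_rate"] = meta["wins"] / meta["picks"]
    return meta.sort_values("pick_rate", ascending=False)


# Made-up sample: two games, three champions.
sample = pd.DataFrame(
    {
        "gameid": ["g1", "g1", "g2", "g2"],
        "champion": ["Ahri", "Azir", "Ahri", "Orianna"],
        "result": [1, 0, 0, 1],
    }
)
meta = champion_pick_and_win_rates(sample)
```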
- Role-specific statistics (Top, Jungle, Mid, ADC, Support)
- Average metrics by position
- Position comparison visualizations
Output: data/processed/position_metrics.csv, output/figures/position_comparison.png
- Team win rates and rankings
- Aggregate team statistics
- Performance comparisons
Output: data/processed/team_performance.csv
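One detail matters when counting team games: if the file holds several rows per team per game (for example one per player plus a team summary row), naive counts are inflated. The sketch below assumes summary rows are marked with `position == "team"` and that `teamname` and `result` columns exist. Those names and the function name are assumptions to verify against the actual CSV.

```python
import pandas as pd


def team_win_rates(df):
    """Win rate per team, using one row per team per game.

    Assumes team-level rows are flagged by position == "team".
    """
    team_rows = df[df["position"] == "team"]
    summary = team_rows.groupby("teamname").agg(
        games=("result", "size"),
        wins=("result", "sum"),
    )
    summary["win_rate"] = summary["wins"] / summary["games"]
    return summary.sort_values("win_rate", ascending=False)


# Made-up sample: three games between two teams, plus two player rows
# that must not be counted as extra games.
sample = pd.DataFrame(
    {
        "teamname": ["T1", "T1", "GEN", "GEN", "T1", "GEN", "T1", "GEN"],
        "position": ["team", "mid", "team", "mid", "team", "team", "team", "team"],
        "result": [1, 1, 0, 0, 0, 1, 1, 0],
    }
)
teams = team_win_rates(sample)
```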
- How game length affects performance
- Duration-based metric analysis
- Early vs. late game patterns
Output: output/figures/game_duration_impact.png
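A duration-based analysis usually starts by bucketing games by length. The following sketch uses `pandas.cut` for that. The bucket edges (25 and 35 minutes), the function name, and the columns `gamelength` (assumed to be in seconds) and `dpm` are illustrative assumptions, not values taken from the project.

```python
import pandas as pd


def metric_by_duration(df, metric="dpm"):
    """Average a metric within short, medium, and long game buckets.

    Bucket edges are arbitrary illustrative choices.
    """
    minutes = df["gamelength"] / 60  # assumed to be stored in seconds
    # Intervals are right-closed: (0, 25], (25, 35], (35, inf).
    buckets = pd.cut(
        minutes,
        bins=[0, 25, 35, float("inf")],
        labels=["short", "medium", "long"],
    )
    # observed=False keeps empty buckets in the result (as NaN).
    return df.groupby(buckets, observed=False)[metric].mean()


# Made-up sample: games of 20, 25, 30, and 40 minutes.
sample = pd.DataFrame(
    {
        "gamelength": [1200, 1500, 1800, 2400],
        "dpm": [400, 500, 600, 700],
    }
)
by_duration = metric_by_duration(sample)
```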
- Correlation matrix of performance metrics
- Relationship analysis between variables
Output: output/figures/correlation_heatmap.png
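As a small illustration of the correlation step, the sketch below ranks metrics by their Pearson correlation with the game outcome. With a 0/1 outcome column this is the point-biserial correlation. The function name and the columns `dpm`, `deaths`, and `result` are assumptions, and the sample numbers are invented.

```python
import pandas as pd


def correlations_with_outcome(df, metrics, outcome="result"):
    """Pearson correlation of each metric with the outcome, strongest first."""
    corr = df[list(metrics) + [outcome]].corr(method="pearson")
    # Take the outcome column and drop its trivial self-correlation of 1.0.
    return corr[outcome].drop(outcome).sort_values(ascending=False)


# Invented sample: higher damage and fewer deaths in the winning games.
sample = pd.DataFrame(
    {
        "dpm": [300, 400, 500, 600],
        "deaths": [5, 4, 2, 1],
        "result": [0, 0, 1, 1],
    }
)
ranked = correlations_with_outcome(sample, ["dpm", "deaths"])
```

The full matrix from `DataFrame.corr()` is what a heatmap would plot; the ranked series is a quicker way to read off which metrics track wins.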
Edit config/config.py to customize:
- Data source URLs
- Directory paths
- Visualization settings (figure size, DPI, style)
- Analysis parameters (minimum games threshold, etc.)
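For orientation, a configuration module covering these four groups might look roughly like the sketch below. Every field name and default value here is hypothetical. The actual contents of `config/config.py` may be organized differently. Only the Google Drive folder URL is taken from this README.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectConfig:
    """Hypothetical settings container; names and defaults are illustrative."""

    # Data source (folder URL as listed in this README)
    data_source_url: str = (
        "https://drive.google.com/drive/folders/1gLSw0RLjBbtaNy0dgnGQDAZOHIgCe-HH"
    )
    # Root from which all other directories are derived
    project_root: Path = Path(".")
    # Visualization settings
    figure_size: tuple = (12, 8)
    figure_dpi: int = 300
    plot_style: str = "whitegrid"
    # Analysis parameters
    min_games: int = 10

    @property
    def raw_dir(self) -> Path:
        return self.project_root / "data" / "raw"

    @property
    def processed_dir(self) -> Path:
        return self.project_root / "data" / "processed"

    @property
    def figures_dir(self) -> Path:
        return self.project_root / "output" / "figures"


config = ProjectConfig()
```

Deriving the directories from a single root means relocating the project requires changing only one value.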
All code follows a consistent documentation format:
"""
Brief description of what the function/class does.
Returns:
type: Description of what is returned.
"""

Each function includes:
- Comprehensive docstring explaining purpose
- Parameter descriptions with types
- Return value documentation
- Inline comments for complex logic
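As an example of what a function documented this way could look like, here is a hypothetical helper. The `Args:` and `Raises:` sections are an assumed extension of the template shown above, and `cs_per_minute` is not a function from this project.

```python
def cs_per_minute(creep_score: int, game_length_seconds: float) -> float:
    """
    Convert a total creep score into creep score per minute (CSPM).

    Args:
        creep_score (int): Total minions and monsters killed in the game.
        game_length_seconds (float): Game duration in seconds; must be positive.

    Returns:
        float: Creep score divided by game length in minutes.

    Raises:
        ValueError: If game_length_seconds is not positive.
    """
    if game_length_seconds <= 0:
        raise ValueError("game_length_seconds must be positive")
    # The duration arrives in seconds, so convert to minutes before dividing.
    return creep_score / (game_length_seconds / 60)
```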
- Automated Data Pipeline: Download and load data with a single function call
- Comprehensive EDA: Pre-built analyses covering all major aspects of competitive play
- High-Quality Visualizations: Publication-ready charts and graphs
- Flexible Architecture: Modular code for easy customization and extension
- Well-Documented: Clear documentation following best practices
- Jupyter Integration: Interactive notebooks for exploratory work
Some questions this project can answer:
- Who are the highest performing players in the 2025 season?
- Which champions dominate the professional meta?
- How do performance metrics vary across different positions?
- What is the correlation between damage dealt and game outcome?
- How does game duration affect player statistics?
- Which teams have the highest win rates?
- pandas: Data manipulation and analysis
- NumPy: Numerical computing
- Matplotlib: Data visualization
- Seaborn: Statistical visualizations
- Jupyter: Interactive analysis environment
- gdown: Google Drive file downloads
Data provided by Oracle's Elixir (https://oracleselixir.com/)
- Professional League of Legends match data
- Updated regularly with new games
- Comprehensive statistics for players, teams, and champions
Contributions are welcome! Areas for enhancement:
- Additional analysis modules (e.g., patch-specific analysis)
- Machine learning models for win prediction
- Advanced visualizations and dashboards
- Data enrichment from Leaguepedia API
- Real-time data pipeline integration
This project is open source and available under the MIT License.
- Oracle's Elixir for providing comprehensive esports data
- Riot Games for League of Legends
- Esports_Data_Pipeline for reference pipeline inspiration
For questions, suggestions, or collaboration opportunities, please open an issue on GitHub.
Built with ❤️ for the League of Legends esports community