
PixelRec Production Deployment - Index & Getting Started

🚀 Ready to move from notebook to real pipeline!


📋 Reading Order (Recommended)

Start with this file, then follow the links in order:

1️⃣ Quick Overview (5 min)

  • This file you're reading
  • Understand what's available

2️⃣ Deployment Plan (10 min)

  • DEPLOYMENT_PLAN.md
  • High-level overview of 3 phases
  • File organization
  • Differences from notebook

3️⃣ Run Quick Test (10 min)

  • Execute: python quickstart_deployment.py
  • This automates Phase 1 setup
  • Verifies entire pipeline works

4️⃣ Full Production Guide (30 min)

  • PRODUCTION_DEPLOYMENT_GUIDE.md
  • Detailed reference for Phases 2 and 3

5️⃣ Command Reference (Bookmark)

  • COMMAND_REFERENCE.md
  • Copy-paste commands for each phase
  • Monitoring and analysis tools
  • Integration options

🎯 What You Can Do Now

| Task | Time | File | Command |
| --- | --- | --- | --- |
| Test pipeline (quickest) | 10 min | quickstart_deployment.py | python quickstart_deployment.py |
| Manual test (step by step) | 15 min | COMMAND_REFERENCE.md | Follow Phase 1 commands |
| Full dataset (if available) | 2-4 hrs | PRODUCTION_DEPLOYMENT_GUIDE.md | Phase 2 section |
| Add PixelNet (if images exist) | 3-5 hrs | PRODUCTION_DEPLOYMENT_GUIDE.md | Phase 3 section |

📁 New Files Created For You

Documentation (Read These)

✓ DEPLOYMENT_PLAN.md (Overview of the three phases)
✓ PRODUCTION_DEPLOYMENT_GUIDE.md (MAIN REFERENCE - Detailed guide)
✓ COMMAND_REFERENCE.md (Copy-paste commands)

Code (Run These)

✓ dataset/create_sample.py (Auto-generate test data)
✓ code/IDNet/sample_mini.yaml (Pre-configured baseline)
✓ quickstart_deployment.py (Automated Phase 1)

🚀 Right Now: Start Here (Pick One)

Option A: Automatic (Recommended)

cd D:\Project\PixelRec
python quickstart_deployment.py
  • Automatically generates data
  • Verifies config/model/data
  • Runs training
  • Sets up checkpoint
  • Estimated time: 15 minutes

Option B: Manual Steps (Learning)

# 1. Generate data
python dataset/create_sample.py

# 2. Run training
cd code
python ../main.py --device 0 --config_file IDNet/sample_mini.yaml
  • Follow each step deliberately
  • Understand what's happening
  • Estimated time: 20 minutes

Option C: Full Documentation First (Thorough)

  1. Read DEPLOYMENT_PLAN.md (10 min)
  2. Read PRODUCTION_DEPLOYMENT_GUIDE.md (30 min)
  3. Run quickstart_deployment.py (15 min)
  • Total time: 55 minutes, deep understanding

✅ Expected Outcome After Phase 1

After running the quick test successfully, you will have:

✓ Working repo structure verified
✓ Data pipeline tested (CSV → PyTorch DataLoader)
✓ Model loading confirmed (SASRec instantiated)
✓ Training loop working (loss decreases, metrics improve)
✓ Validation/test evaluation functional
✓ Checkpoint saving confirmed
✓ Log files with training history
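The data-pipeline item above (CSV → PyTorch DataLoader) boils down to grouping each user's interactions in timestamp order into an item sequence. A minimal stdlib sketch of that grouping step (function and column names are illustrative, not the repo's actual API):

```python
import csv
from collections import defaultdict

def build_sequences(csv_path):
    """Group interactions per user, sorted by timestamp, into item-ID sequences."""
    by_user = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            by_user[row["user_id"]].append((int(row["timestamp"]), row["item_id"]))
    # Sort each user's interactions chronologically, then keep only the item IDs
    return {u: [item for _, item in sorted(events)] for u, events in by_user.items()}
```

The real load_data() additionally remaps string IDs to contiguous integers before batching, but the per-user chronological grouping is the core idea.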

Output example:

🎉 DEPLOYMENT SUCCESSFUL!

Best valid result: Recall@10: 0.3456
Test result: Recall@5: 0.2134, Recall@10: 0.3421, NDCG@10: 0.2198

✅ Model checkpoint: log/{timestamp}/best_model.pth
✅ Training logs: log/{timestamp}/INFO.log
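For reference, the Recall@K and NDCG@K numbers in the output above can be computed as follows. This is a hedged pure-Python sketch for the common leave-one-out setup (one held-out target item per user); the repo's Evaluator may differ in details:

```python
import math

def recall_at_k(ranked_items, target, k):
    """1.0 if the held-out target appears in the top-k ranking, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k):
    """DCG of the single relevant item, normalized by the ideal DCG (rank 1)."""
    if target in ranked_items[:k]:
        rank = ranked_items.index(target)  # 0-based position in the ranking
        return 1.0 / math.log2(rank + 2)
    return 0.0
```

The reported dataset-level metrics are simply these per-user values averaged over all test users.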

📊 Three Phases of Deployment

Phase 1: Test Pipeline (NOW - 15 min)

python quickstart_deployment.py
  • Uses sample data (10K interactions)
  • Small model (embedding_size=64, 2 layers)
  • Quick verification
  • Success: Logs show metrics improve

Phase 2: Real Dataset (2-4 hours)

python main.py --device 0,1,2,3 --config_file IDNet/pixelrec50k.yaml
  • Uses full PixelRec50K dataset (~700K interactions)
  • Production config (embedding_size=256, 4 layers)
  • Multi-GPU training recommended
  • Prerequisite: Download dataset from Google Drive

Phase 3: PixelNet with Images (3-5 hours)

python main.py --device 0 --config_file PixelNet/pixelrec50k_pixel.yaml
  • End-to-end learning with visual encoder
  • Real item images required
  • LMDB index must be generated first
  • Prerequisite: Item cover images in dataset/covers/

🔍 Key Repo Structure At A Glance

Pipeline Architecture:

main.py (launcher)
  ↓
run.py (trainer)
  ├── Config(YAML) → Load configuration
  ├── load_data() → Load CSV & build sequences  
  ├── get_model() → Instantiate model class
  ├── Trainer.fit() → Training loop
  │   ├── train_epoch() → BPR loss
  │   ├── validate() → Eval metrics
  │   └── save_checkpoint() → Best model
  └── evaluate(test_loader)
      └── Return test metrics to log

Data Flow:
  CSV file (item_id, user_id, timestamp)
    ↓
  load_data() - remap IDs, build sequences
    ↓
  build_dataloader() - batch into PyTorch tensors
    ↓
  Model forward() - return logits
    ↓
  BPR loss - optimize embeddings
    ↓
  Evaluator.metrics() - Recall@K, NDCG@K
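The BPR step in the flow above maximizes the score gap between each observed (positive) item and a sampled negative item. A minimal sketch of the loss in plain Python (the repo's actual implementation works on batched PyTorch tensors):

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Bayesian Personalized Ranking loss: -mean log sigmoid(pos - neg)."""
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        # -log(sigmoid(x)) rewritten as log(1 + exp(-x)) via log1p
        total += math.log1p(math.exp(-(pos - neg)))
    return total / len(pos_scores)
```

The loss shrinks toward zero as positive items are scored further above their sampled negatives, which is what pushes the embeddings toward a useful ranking.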

⚠️ Before You Start: Check Requirements

# Python version
python --version  # Should be >= 3.8

# PyTorch
python -c "import torch; print(torch.__version__)"

# CUDA
python -c "import torch; print(torch.cuda.is_available())"

# Pandas
python -c "import pandas; print(pandas.__version__)"

If any fail, install: pip install -r requirements.txt
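The four checks above can also be run from a single script. A stdlib-only sketch (the function name and defaults are illustrative, not part of the repo):

```python
import importlib.util
import sys

def check_environment(min_python=(3, 8), packages=("torch", "pandas")):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor}"
            f" < required {min_python[0]}.{min_python[1]}"
        )
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print("✗", problem)
```

Note this only confirms the packages are importable; it does not check CUDA availability, so keep the torch.cuda.is_available() check above.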


🎓 Learning Path

Goal: Understand repo structure while setting it up

Day 1 (30 min):
  - Read DEPLOYMENT_PLAN.md
  - Run quickstart_deployment.py
  - Confirm that basic training works

Day 2 (1 hour):
  - Read PRODUCTION_DEPLOYMENT_GUIDE.md
  - Understand config system
  - Know data format/requirements
  - Understand 3 model types

Day 3 (2-4 hours):
  - Run Phase 2 on real data
  - See convergence on PixelRec50K
  - Understand scaling/performance tuning

Day 4+ (ongoing):
  - Phase 3 PixelNet with images
  - Experiment with different models
  - Optimize hyperparameters
  - Compare architectures

💡 Pro Tips

  1. Start with Phase 1 first - Takes 15 minutes and confirms everything works
  2. Check logs frequently - tail -f log/*/INFO.log shows live progress
  3. Save checkpoints - Best models auto-saved, never lost
  4. Use multi-GPU for speed - --device 0,1,2,3 reduces time ~4x
  5. Monitor VRAM - If OOM, reduce batch_size in config
  6. Keep sample config - Use for quick testing, then copy for experiments
  7. Compare models systematically - Train each baseline once before PixelNet

🆘 Getting Help

Common Issues → Quick Fixes

# "CSV not found"
→ python dataset/create_sample.py

# "Model not found"  
→ Check file exists: code/REC/model/IDNet/sasrec.py

# "Config error"
→ Validate YAML syntax: python -c "import yaml; yaml.safe_load(open('IDNet/sample_mini.yaml'))"

# "CUDA out of memory"
→ Reduce batch_size in config (256 → 64)

# "Training not improving"
→ Normal for sample data - try real dataset in Phase 2

Full Troubleshooting

→ See PRODUCTION_DEPLOYMENT_GUIDE.md section 7


✨ What's Different From Notebook

| Aspect | Notebook | Pipeline |
| --- | --- | --- |
| Starting point | Notebook cells | Command-line entry |
| Configuration | Hardcoded vars | YAML files |
| Data | Synthetic only | Real CSV support |
| GPU | Single GPU | Multi-GPU (DDP) |
| Checkpointing | Manual | Automatic |
| Logging | Print statements | Structured logs |
| Reproducibility | Limited | Full |
| Production ready | No | Yes ✓ |

🎯 Your Next Action

Pick ONE:

# FASTEST (15 min, fully automated)
python quickstart_deployment.py

# LEARNING (20 min, manual steps)
python dataset/create_sample.py
cd code && python ../main.py --device 0 --config_file IDNet/sample_mini.yaml

# THOROUGH (55 min, full understanding)
Read DEPLOYMENT_PLAN.md → Read PRODUCTION_DEPLOYMENT_GUIDE.md → Run quickstart

Made for you: April 2026
Status: Ready to deploy
Time to first success: 15 minutes

👉 START NOW: python quickstart_deployment.py

Good luck! 🚀