RBE549: Computer Vision - Worcester Polytechnic Institute, Spring 2025
This project implements advanced computer vision techniques across two main phases:
- A probabilistic boundary detection algorithm that improves upon traditional edge detection methods
- Implementation and optimization of various convolutional neural network architectures
For detailed project specifications, please refer to the course project page.
- Python 3.8+
- PyTorch
- NumPy
- OpenCV
- scikit-learn
- matplotlib
- tqdm
To install all dependencies:
```shell
pip install -r requirements.txt
```

This phase implements a probabilistic boundary detection algorithm (PbLite) built from multiple filter banks and gradient analyses. It combines texture, brightness, and color information to produce boundary maps that are more robust than single-cue edge detectors.
- Multiple filter bank implementations (Difference of Gaussians (DoG), Gabor, Half-Disc (HD), Leung-Malik (LM))
- Multi-channel analysis (texture, brightness, color)
- Gradient computation and combination
- Probabilistic boundary detection
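To make the filter-bank stage concrete, here is a minimal sketch of an oriented derivative-of-Gaussian bank of the kind commonly used as the DoG bank in PbLite-style pipelines. Filter size, scales, and orientation count below are illustrative assumptions, not the project's exact settings:

```python
import numpy as np

def dog_filter_bank(size=31, sigmas=(1.0, 3.0), n_orient=16):
    """Oriented derivative-of-Gaussian filters at several scales and
    orientations (a sketch; parameters are illustrative)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    bank = []
    for sigma in sigmas:
        g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            # directional first derivative of the Gaussian along theta
            d = -(xx * np.cos(theta) + yy * np.sin(theta)) / sigma**2 * g
            bank.append(d - d.mean())  # zero-mean so flat regions respond 0
    return np.stack(bank)  # shape: (len(sigmas) * n_orient, size, size)
```

Convolving each image with every filter in such a bank and clustering the per-pixel responses is what produces the texton map.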
- Ensure your images are in the "BSDS500" directory
- Run the wrapper script:
```shell
python Wrapper.py
```

All outputs are saved automatically to the "Outputs" directory.
Filter bank visualizations (images omitted): DoG Filters, Gabor Filters, HD Masks, LM Filters
Feature map visualizations (images omitted): Texton Map, Brightness Map, Color Map
Boundary detection comparison (images omitted): Canny Baseline, Sobel Baseline, PbLite (Our Method)
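The comparison above hinges on how the cues are fused. In PbLite-style pipelines the per-channel gradients are typically averaged and then gated by a weighted mix of the Canny and Sobel baselines. A sketch, with illustrative equal weights:

```python
import numpy as np

def pb_lite(tg, bg, cg, canny, sobel, w1=0.5, w2=0.5):
    """Combine gradient maps with classical baselines (PbLite-style sketch).
    tg / bg / cg: texton, brightness, color gradient maps (same HxW shape);
    canny / sobel: baseline edge maps scaled to [0, 1]; weights are illustrative."""
    mean_grad = (tg + bg + cg) / 3.0            # average the three cue gradients
    pb = mean_grad * (w1 * canny + w2 * sobel)  # element-wise gating by the baselines
    # normalize to [0, 1] for visualization
    pb -= pb.min()
    if pb.max() > 0:
        pb /= pb.max()
    return pb
```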
This phase implements and compares several convolutional neural network architectures for image classification on the CIFAR-10 dataset, together with optimization techniques to improve classification performance.
- LeNet: Classic convolutional neural network architecture
- Custom CIFAR10 Model: Tailored architecture with batch normalization
- ResNet: Deep residual network with skip connections
- DenseNet: Dense convolutional network with dense connectivity pattern
- ResNeXt: Advanced architecture with grouped convolutions and cardinality
- Batch normalization for faster training and better convergence
- Skip connections in ResNet and ResNeXt for deep network training
- Dense connectivity in DenseNet for better feature reuse
- Data augmentation including random crops and horizontal flips
- Training with the Adam optimizer (adaptive per-parameter learning rates)
- Comprehensive model evaluation and visualization
```shell
python Train.py [options]
```

--CheckPointPath Path to save checkpoints (default: ../Checkpoints/)
--NumEpochs Number of training epochs (default: 25)
--DivTrain Train data division factor (default: 1)
--MiniBatchSize Training batch size (default: 128)
--LoadCheckPoint Load from checkpoint? (0/1, default: 0)
--LogsPath Training logs directory (default: LogsRes/)
- Convolutional blocks with increasing channel widths (16→32→64→128→256)
- Batch normalization after each convolution
- MaxPooling layers for spatial dimension reduction
- Fully connected layers (512→128→10)
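The repeating unit described above (convolution, batch normalization, ReLU, max-pooling) can be sketched in PyTorch as follows; the kernel size and pooling window are illustrative assumptions:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Conv -> BatchNorm -> ReLU -> MaxPool: the repeating unit of the
    custom CIFAR-10 model (a sketch; exact hyperparameters may differ)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),   # batch normalization after each convolution
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2))          # halves the spatial dimensions
```

Stacking such blocks (16→32→64→128→256) followed by the fully connected layers (512→128→10) yields the full classifier.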
- Residual blocks with skip connections
- Convolutional blocks with BatchNorm and ReLU
- Adaptive average pooling
- Dropout for regularization
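A minimal residual block in the spirit of the description above; channel counts and the projection shortcut are illustrative, and the project's exact block may differ:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Residual block: two Conv-BN layers plus a skip connection (a sketch)."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection on the skip path when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))  # skip connection
```

The identity path lets gradients flow directly through deep stacks, which is what makes very deep networks trainable.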
- Dense blocks with growth rate of 12
- Transition layers for dimension reduction
- Global average pooling
- Dense connectivity pattern
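Dense connectivity means each layer's output is concatenated onto everything computed before it; with a growth rate of 12, every layer adds 12 feature maps. A sketch of one dense layer (layer ordering is illustrative):

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One dense layer: output = input concatenated with `growth` new
    feature maps (growth rate 12, as above; a sketch)."""
    def __init__(self, in_ch, growth=12):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_ch)
        self.conv = nn.Conv2d(in_ch, growth, 3, padding=1, bias=False)

    def forward(self, x):
        new = self.conv(torch.relu(self.bn(x)))
        return torch.cat([x, new], dim=1)  # dense connectivity: reuse all earlier features
```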
- Cardinality of 8 for grouped convolutions
- Three stages with increasing channels (128→256→512)
- Bottleneck blocks for efficient computation
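The grouped-convolution bottleneck can be sketched as a 1x1 reduce, a grouped 3x3 (cardinality 8, as above), and a 1x1 expand; the channel widths here are illustrative assumptions:

```python
import torch
import torch.nn as nn

def resnext_bottleneck(in_ch=128, width=128, out_ch=256, cardinality=8):
    """ResNeXt-style bottleneck sketch: the grouped 3x3 splits `width`
    channels into `cardinality` parallel paths within one Conv2d call."""
    return nn.Sequential(
        nn.Conv2d(in_ch, width, 1, bias=False),                # 1x1 reduce
        nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        nn.Conv2d(width, width, 3, padding=1,
                  groups=cardinality, bias=False),             # grouped 3x3
        nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        nn.Conv2d(width, out_ch, 1, bias=False),               # 1x1 expand
        nn.BatchNorm2d(out_ch))
```

Setting `groups=cardinality` is how PyTorch expresses the parallel aggregated transformations; it costs far fewer parameters than a dense 3x3 at the same width.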
```shell
python Test.py [options]
```

--ModelPath Path to trained model checkpoint
--LabelsPath Path to test labels file
--SelectTestSet Test set selection flag
--ModelType Architecture selection
- Training and validation accuracy curves
- Loss progression plots
- Confusion matrix generation
- Performance metrics comparison
The implementation includes:
- Real-time accuracy tracking
- Loss monitoring
- Model parameter counting
- Confusion matrix visualization
- Cross-architecture performance comparison
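The confusion matrix step can be sketched with scikit-learn; the labels below are hypothetical, just to show the call:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical predictions on a 3-class subset (CIFAR-10 uses 10 classes)
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

cm = confusion_matrix(y_true, y_pred)
# rows = true class, columns = predicted class;
# off-diagonal entries show which classes get confused
```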
- Data normalization (mean/std: (0.4914, 0.4822, 0.4465)/(0.2023, 0.1994, 0.2010))
- Data augmentation techniques:
- Random cropping (32x32 with padding=4)
- Random horizontal flips (p=0.5)
- Weight decay (1e-4) for regularization
- Adam optimizer with learning rate 1e-3
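The per-channel normalization statistics listed above can be applied as a simple sketch (outside the usual torchvision transform pipeline):

```python
import numpy as np

# CIFAR-10 per-channel statistics from the list above
MEAN = np.array([0.4914, 0.4822, 0.4465])
STD = np.array([0.2023, 0.1994, 0.2010])

def normalize(img):
    """img: HxWx3 float array in [0, 1] -> zero-mean, unit-std per channel."""
    return (img - MEAN) / STD
```

In PyTorch, the listed optimizer settings correspond to `torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)`.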
- PyTorch for model implementation
- TensorBoard for training visualization
- Matplotlib for result plotting
- tqdm for progress tracking
- scikit-learn for metrics computation
project-root/
├── BSDS500/ # Dataset directory
├── Checkpoints/ # Model checkpoints
├── Images/ # Result visualizations
├── Logs/ # Training/testing logs
├── Outputs/ # Generated outputs
├── Phase_1_media/ # Phase 1 visualizations
├── Phase 2 media_output/ # Phase 2 results
├── TxtFiles/ # Label and configuration files
├── Train.py # Training implementation
├── Test.py # Testing implementation
└── Wrapper.py # Phase 1 implementation
If you find this work useful in your research, please consider citing:
```bibtex
@article{WPI_CV_2025,
  title={Probabilistic Boundary Detection and CNN Performance Enhancement},
  author={[Your Name]},
  journal={RBE549 Course Project},
  institution={Worcester Polytechnic Institute},
  year={2025}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.