Reference implementation of the paper *Multi-View Stereo Using Graph Cuts-Based Depth Refinement*, published in IEEE Signal Processing Letters (2022).
The paper presents a depth map-based multi-view stereo (MVS) pipeline designed to improve reconstruction quality in homogeneous / weakly textured regions, where most MVS methods struggle due to insufficient local photometric evidence.
Instead of refining depth by processing only subsets of pixels (patch-based or iterative local refinement), the proposed method performs global refinement of all pixels in an image simultaneously by converting depth refinement into a graph cut optimization problem.
The pipeline consists of:

- Coarse depth estimation at reduced resolution using a robust photo-consistency objective (ZNCC patch matching + best-κ aggregation), optimized per pixel with a metaheuristic (e.g., Particle Swarm Optimization).
- Cross-view consistency filtering and cross-view depth completion to remove inconsistent depths and fill missing values.
- Graph cuts-based depth refinement, where depth optimization is formulated as an s–t min-cut problem on a 3D grid graph (rows × cols × depth levels).
  - The graph uses offset vertices to align depth hypotheses around each pixel’s initial depth, enabling efficient global refinement without increasing graph size.
  - The energy combines a data term (photo-consistency) and a smoothness term (depth discontinuity), weighted by an image-gradient-based adaptive regularizer that preserves depth discontinuities near edges while enforcing smoothness in flat regions.
- Fusion of the refined depth maps into a dense point cloud, followed by conversion into a watertight mesh using Poisson surface reconstruction.
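As a rough illustration of the photo-consistency objective, the sketch below computes ZNCC between image patches and aggregates the best k (κ) scores across neighboring views. The function names and the similarity-to-cost mapping are illustrative assumptions, not code from this repository:

```python
import numpy as np

def zncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross-correlation between two equal-sized patches.

    Returns a score in [-1, 1]; higher means more photo-consistent.
    """
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + eps
    return float((a * b).sum() / denom)

def best_k_cost(ref_patch, neighbor_patches, k=3):
    """Best-k aggregation: average the k highest ZNCC scores over neighbor
    views, which makes the cost robust to views where the pixel is occluded.
    """
    scores = sorted((zncc(ref_patch, p) for p in neighbor_patches), reverse=True)
    top = scores[:k]
    # Map similarity in [-1, 1] to a cost (0 = perfect match).
    return 1.0 - sum(top) / len(top)
```

A per-pixel optimizer such as PSO would then search over depth hypotheses, reprojecting the reference patch into each neighbor view and minimizing this cost.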
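The image-gradient-based adaptive regularizer can likewise be sketched. A common choice, assumed here purely for illustration (the paper's exact weighting function may differ), is an exponential falloff in gradient magnitude, so smoothness is enforced strongly in flat regions and relaxed at edges:

```python
import numpy as np

def adaptive_smoothness_weights(image, lam=1.0, sigma=10.0):
    """Per-pixel smoothness weights from image gradients (illustrative form).

    Large gradients (likely depth edges) get small weights, letting the graph
    cut place depth discontinuities there; flat regions get weights near lam.
    """
    gy, gx = np.gradient(image.astype(float))  # vertical / horizontal gradients
    grad_mag = np.hypot(gx, gy)
    return lam * np.exp(-grad_mag / sigma)
```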
The method is evaluated on Middlebury, EPFL, and DTU multi-view datasets, demonstrating strong completeness and competitive accuracy, with notable improvements in low-texture surfaces, occluded regions, and challenging illumination.
Figure: Reconstruction results on the DTU Skull dataset. (Left) input images from different viewpoints, (middle) refined depth maps obtained after graph cuts optimization, (right) reconstructed untextured 3D model.
Clone the repository:

```
git clone https://github.com/nirmalsnair/mvs-graphcuts.git
cd mvs-graphcuts
```

Install dependencies:

```
pip install -r requirements.txt
```

Download a calibrated multi-view dataset such as Middlebury MVS (e.g., DinoRing or TempleRing):
Place the images and camera parameters in the dataset directory expected by the scripts.
Compute initial depth maps using ZNCC-based patch matching:
```
python dmap_zncc.py
```

Refine the depth maps using graph cuts:

```
python dmap_refinement_gcuts.py
```

The generated scene is compatible with Multi-View Environment (MVE):
https://github.com/simonfuhrmann/mve
Use the MVE tools to convert depth maps into a point cloud and run Poisson Surface Reconstruction to obtain the final mesh.
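For intuition, converting a depth map into a point cloud amounts to standard pinhole back-projection. The minimal sketch below assumes world-to-camera extrinsics (R, t) and is not the MVE implementation:

```python
import numpy as np

def depth_to_points(depth, K, R, t):
    """Back-project a depth map into world-space 3D points (pinhole model):
    X_world = R^T (d * K^{-1} [u, v, 1]^T - t).

    depth: (H, W) array with 0 marking invalid pixels; K: 3x3 intrinsics;
    R, t: world-to-camera rotation (3x3) and translation (3,).
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    # Rays through each pixel in camera coordinates, scaled by depth.
    rays = np.linalg.inv(K) @ np.stack([u.ravel(), v.ravel(), np.ones(h * w)])
    cam_pts = rays * depth.ravel()
    valid = depth.ravel() > 0
    world = R.T @ (cam_pts[:, valid] - t.reshape(3, 1))
    return world.T  # (N, 3) point cloud
```

Running this per view and concatenating the results yields the dense cloud that Poisson surface reconstruction turns into a watertight mesh.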
If you use this code, please cite:

```
@article{nair2022multi,
  title={Multi-View Stereo Using Graph Cuts-Based Depth Refinement},
  author={Nair, Nirmal S and Nair, Madhu S},
  journal={IEEE Signal Processing Letters},
  volume={29},
  pages={1903--1907},
  year={2022},
  publisher={IEEE}
}
```