Vectorizer Pro

A Python CLI tool to vectorize raster mask files into polygon shapefiles with topology-preserving simplification.

Features

  • Convert raster masks (int8/16/32 class IDs) to vector polygons
  • Pure Python topology-preserving Visvalingam-Whyatt (TPVW) simplification
  • No GEOS dependency for simplification: the implementation is self-contained pure Python
  • Supports large images (30000x30000 pixels and larger)
  • 4-connectivity polygonization
  • Output formats: Shapefile (.shp), GeoPackage (.gpkg), GeoJSON (.geojson)
  • Preserve class ID attributes
  • CRS preservation from input raster
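
To make the 4-connectivity rule concrete: two pixels belong to the same region only if they hold the same class ID and share an edge, so pixels that touch only at a corner end up in separate polygons. The following is a minimal pure-Python sketch of that rule, not the tool's actual implementation. The function name `label_regions_4conn` and the list-of-lists mask are assumptions made for illustration.

```python
from collections import deque


def label_regions_4conn(mask, nodata=None):
    """Label connected regions of equal class ID using 4-connectivity.

    Returns (labels, count), where labels[r][c] is the 1-based region index
    of each pixel and 0 marks nodata pixels.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] or mask[r][c] == nodata:
                continue
            # Start a new region and flood-fill it breadth-first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Only edge-sharing neighbours are visited; diagonals are not.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and mask[ny][nx] == mask[y][x]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Two class-1 pixels that touch only at a corner form separate regions.
mask = [
    [1, 0],
    [0, 1],
]
labels, count = label_regions_4conn(mask, nodata=0)
print(count)  # 2
```

Under 8-connectivity the two class-1 pixels above would form a single region; with 4-connectivity they become two polygons.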

Installation

pip install vectorizer-pro

Or install from source:

git clone https://github.com/CVEO/vectorizer-pro.git
cd vectorizer-pro
pip install -e .

Usage

Basic Usage

vectorizer-pro input.tif output.shp

With Options

# Specify nodata value to exclude
vectorizer-pro input.tif output.shp --nodata 0

# Remove small regions (merge regions smaller than 100 pixels)
vectorizer-pro input.tif output.shp --min-area 100

# Set simplification tolerance
vectorizer-pro input.tif output.shp --tolerance 0.1

# Output as GeoPackage
vectorizer-pro input.tif output.gpkg --format gpkg

# Simplify only internal edges (preserve boundary)
vectorizer-pro input.tif output.shp --no-simplify-boundary

Command Line Options

Option                                      Description
--nodata INT                                Nodata value to exclude from vectorization
--min-area FLOAT                            Minimum polygon area threshold. Smaller polygons will be merged into their largest adjacent neighbor
--tolerance FLOAT                           Simplification tolerance (default: half pixel size)
--format, -f                                Output format: shp, gpkg, or geojson (default: shp)
--simplify-boundary/--no-simplify-boundary  Simplify exterior boundaries (default: yes)
--detect-nodata                             Print nodata value and exit
--list-classes                              List unique class IDs and exit

Python Package Usage

from vectorizer_pro import vectorize, VectorizeResult

# Simple usage - writes to file
result = vectorize("input.tif", "output.shp", nodata=0)

# Remove small regions in Python API
result = vectorize("input.tif", "output.shp", nodata=0, min_area=100)

# Get geometries without writing
result = vectorize("input.tif", nodata=0, output_path=None)
polygons = result.polygons
class_ids = result.class_ids
crs = result.crs

Examples

Quick Start

# Check nodata value
vectorizer-pro sample/top_potsdam_2_13.tif --detect-nodata

# List class IDs
vectorizer-pro sample/top_potsdam_2_13.tif --list-classes

# Vectorize excluding class 0
vectorizer-pro sample/top_potsdam_2_13.tif output.shp --nodata 0

Advanced Usage

# Higher tolerance for more aggressive simplification (coarser polygons)
vectorizer-pro input.tif output.shp --nodata 0 --tolerance 0.5

# Remove small regions before simplification
vectorizer-pro input.tif output.shp --nodata 0 --min-area 50 --tolerance 0.1

# Preserve exact boundary shape
vectorizer-pro input.tif output.shp --nodata 0 --no-simplify-boundary

# GeoPackage output with custom tolerance
vectorizer-pro input.tif output.gpkg --format gpkg --tolerance 0.05

Requirements

  • Python >= 3.10
  • rasterio
  • shapely >= 2.1
  • click
  • fiona
  • numpy

References

Projects

Algorithms

  • GDAL Polygonize - Two-arm chain edge tracing algorithm for 4-connectivity raster vectorization

  • Visvalingam-Whyatt - Area-based vertex removal simplification: it repeatedly removes the vertex whose triangle with its two neighbors has the smallest area. On its own it does not guarantee consistent topology in a polygonal coverage

  • TPVW (Topology-Preserving Visvalingam-Whyatt) - Extension of the VW algorithm that ensures shared edges between adjacent polygons are simplified identically, preventing gaps and overlaps
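
The core Visvalingam-Whyatt step can be sketched in a few lines. This is a simple quadratic-time illustration under stated assumptions, not the tool's implementation: the names `triangle_area` and `visvalingam_whyatt` are hypothetical, and `min_area` is an area threshold whose relationship to the `--tolerance` option is not specified here. The sketch keeps both endpoints of the line fixed. One common way to keep a coverage consistent is to simplify each shared edge once, as a chain between two fixed junction points, and reuse the result for both neighboring polygons.

```python
def triangle_area(a, b, c):
    """Area of the triangle a-b-c (half the absolute cross product)."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam_whyatt(points, min_area):
    """Simplify an open polyline, always keeping its two endpoints.

    Repeatedly removes the interior vertex with the smallest triangle
    area until every remaining triangle has area >= min_area.
    """
    pts = list(points)
    while len(pts) > 2:
        # Effective area of each interior vertex with its two neighbours.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(areas)
        if smallest >= min_area:
            break
        # areas[k] belongs to pts[k + 1]; the first minimum is removed.
        del pts[areas.index(smallest) + 1]
    return pts


# A pixel "staircase", typical of edges traced from a raster mask.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
simplified = visvalingam_whyatt(staircase, min_area=0.6)
print(simplified)  # [(0, 0), (2, 1), (2, 2)]
```

Recomputing every area on each pass keeps the sketch short. A production implementation would typically use a priority queue and update only the two neighbors of a removed vertex.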

Sample Data

  • sample/top_potsdam_2_13.tif - Semantic labeling result generated by an AI model on the ISPRS Potsdam 2D Semantic Labeling Contest benchmark dataset. Used as a demonstration of vectorizing large raster masks.

  • sample/small.tif - A smaller sample for quick testing.

The original Potsdam aerial imagery and ground truth are from the ISPRS benchmark: https://www.isprs.org/

Authors

Wuhan University CVEO Team (武汉大学CVEO课题组)

Website: https://www.whu-cveo.com/

License

MIT License - see LICENSE for details.
