Commit 4c9e89b

restructure repository and improve data processing pipeline
- Restructured codebase into proper Python package (neon_tree_classification/)
- Refactored shapefile processor with robust coordinate validation and CRS handling
- Added comprehensive error handling for infinite/negative coordinates
- Implemented file existence checks to prevent redundant HSI conversions
- Fixed year filtering issues in crown data processing
- Created modular scripts for data download and processing
- Added deprecation notices for legacy files and clear migration paths
- Improved command-line argument parsing for flexible script usage
- Enhanced data integrity with coordinate range validation
- Added comprehensive test scripts for full pipeline validation

Breaking changes:
- Moved core modules from src/ to neon_tree_classification/ package structure
- Updated import statements and file paths throughout codebase
- Deprecated old script locations (see MIGRATION_GUIDE.py)

This restructure provides a more maintainable, scalable foundation for NEON tree crown classification workflows with improved data quality assurance.
1 parent 6e8be2e commit 4c9e89b

27 files changed

Lines changed: 4147 additions & 667 deletions

MIGRATION_GUIDE.py

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+#!/usr/bin/env python3
+"""
+Repository Migration Helper
+
+This script helps users navigate the reorganized NEON Tree Classification repository
+and provides guidance on which files to use.
+"""
+
+import os
+import sys
+
+def print_migration_guide():
+    """Print guidance for using the reorganized repository."""
+
+    print("🔄 NEON TREE CLASSIFICATION REPOSITORY REORGANIZED")
+    print("="*60)
+    print()
+
+    print("📁 NEW STRUCTURE:")
+    print(" neon_tree_classification/ # Main Python package")
+    print(" ├── data/shapefile_processor.py # Shapefile processing")
+    print(" ├── models/architectures.py # ML models")
+    print(" └── processing/, utils/ # Supporting modules")
+    print()
+    print(" scripts/ # Executable scripts")
+    print(" ├── download_neon_all_modalities.py")
+    print(" ├── process_tiles_to_crowns.py")
+    print(" └── test_shapefile_processor.py")
+    print()
+
+    print("🚨 DEPRECATED FILES (do not use):")
+    print(" src/curate_shp_files.py → Use: scripts/test_shapefile_processor.py")
+    print(" src/models.py → Use: neon_tree_classification.models.architectures")
+    print(" src/download_neon_all_modalities.py → Use: scripts/download_neon_all_modalities.py")
+    print(" scripts/download_neon_data.py → Use: scripts/download_neon_all_modalities.py")
+    print(" neon_tree_classification/models/hsi_models.py → Use: architectures.py")
+    print()
+
+    print("🎯 QUICK START:")
+    print(" # Process shapefiles:")
+    print(" python scripts/test_shapefile_processor.py")
+    print()
+    print(" # Process tiles and crowns:")
+    print(" python scripts/process_tiles_to_crowns.py --site BART --year 2019")
+    print()
+    print(" # Download NEON data:")
+    print(" python scripts/download_neon_all_modalities.py --mode test")
+    print()
+
+    print("📦 PYTHON PACKAGE USAGE:")
+    print(" from neon_tree_classification.data.shapefile_processor import ShapefileProcessor")
+    print(" from neon_tree_classification.models.architectures import HsiPixelClassifier")
+    print()
+
+    print("✅ TESTED AND WORKING:")
+    print(" • Shapefile processing with coordinate validation")
+    print(" • HSI tile conversion and cropping")
+    print(" • Crown-tile intersection processing")
+    print(" • BART 2019 test dataset processing")
+    print()
+
+    print("📋 NEXT STEPS:")
+    print(" 1. Test your workflow with the new script locations")
+    print(" 2. Update any custom scripts to use the new imports")
+    print(" 3. Remove or ignore the deprecated src/ files")
+    print(" 4. Use --help flag on scripts to see all options")
+
+if __name__ == "__main__":
+    print_migration_guide()

README.md

Lines changed: 138 additions & 13 deletions
@@ -1,20 +1,145 @@
-# NeonTreeClassification
+# NEON Tree Classification

-National Ecological Observatory Network (NEON) offers a variety of data products, including airborne data from different forest sites. Airborne data includes RGB orthophotos, LiDAR (CHM) airborne data, and 426 band hyperspectral data. All products are available on https://data.neonscience.org/data-products/, the following are the airborne data products used in this repository:
+A Python package for processing NEON (National Ecological Observatory Network) tree crown annotation data and building machine learning models for tree species classification.

-rgb_data_product = 'DP3.30010.001'
-hsi_withbrdf_2022 = 'DP3.30006.002'
-lidar = 'DP3.30015.001' #CHM
+National Ecological Observatory Network (NEON) offers a variety of data products, including airborne data from different forest sites. Airborne data includes RGB orthophotos, LiDAR (CHM) airborne data, and 426 band hyperspectral data. All products are available on https://data.neonscience.org/data-products/.

-# Workflow
-## 1. Download NEON data
-- Given the Northing, Easting, Year and Site, download the NEON data using the `download_neon_data.py` script. There are functions to download the RGB, HSI, and LiDAR data. The data is downloaded to a specified directory.
-### To do:
-- Merge this script in neon_utils.py
-- Look into using Google Earth Engine
+## NEON Data Products Used

-## 2. Generate crowns using deepforest
-- The `deepforest_parallel.py` script uses the deepforest package to generate tree crowns from the RGB data. The script is parallelized on SLURM using Dask. It can run on a given list of RGB tiles and save a pandas dataframe with the tree crowns.
+- **RGB Orthophotos**: `DP3.30010.001` - High-resolution orthorectified camera imagery mosaic
+- **Hyperspectral Imagery**: `DP3.30006.002` - Spectrometer orthorectified surface bidirectional reflectance - mosaic
+- **LiDAR CHM**: `DP3.30015.001` - Ecosystem structure (Canopy Height Model)
+
+## Features
+
+- **Shapefile Processing**: Handle coordinate system transformations and validation for NEON tree crown shapefiles
+- **HSI Tile Processing**: Convert and process hyperspectral imagery (HSI) tiles from H5 to GeoTIFF format
+- **Crown-Tile Intersection**: Match tree crown annotations with corresponding image tiles
+- **Data Pipeline**: End-to-end processing from raw NEON data to training-ready datasets
+- **Coordinate Validation**: Robust handling of invalid coordinates and CRS issues
+
+## Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/Ritesh313/NeonTreeClassification.git
+cd NeonTreeClassification
+
+# Install in development mode
+pip install -e .
+```
+
+## Quick Start
+
+### 1. Process Shapefiles
+```bash
+python scripts/test_shapefile_processor.py
+```
+
+### 2. Process Tiles and Crowns
+```bash
+python scripts/process_tiles_to_crowns.py
+```
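
The migration guide in this commit invokes the tile script with `--site BART --year 2019`, and the commit message mentions improved argument parsing. The following is a minimal sketch of how two such flags could be parsed with `argparse`. Only the flag names come from the commit; the `build_parser` helper, the help text, and the choice to keep the year as a string (matching `year="2019"` in the README usage example) are assumptions, not the script's actual implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the migration guide (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Match NEON crown annotations to image tiles (illustrative sketch)."
    )
    # Flag names are taken from the commit; everything else here is assumed.
    parser.add_argument("--site", required=True, help="NEON site code, e.g. BART")
    parser.add_argument("--year", required=True, help="Acquisition year, e.g. 2019")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--site", "BART", "--year", "2019"])
print(args.site, args.year)
```

Marking both flags as required makes a missing site or year fail at startup rather than partway through a long processing run.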
+
+## Package Structure
+
+```
+neon_tree_classification/
+├── data/
+│   └── shapefile_processor.py # Shapefile processing and CRS handling
+├── models/
+│   └── hsi_models.py # PyTorch models for HSI classification
+├── processing/
+│   └── __init__.py # Processing utilities
+└── utils/
+    └── __init__.py # General utilities
+
+scripts/
+├── download_neon_all_modalities.py # Data download scripts
+├── process_tiles_to_crowns.py # Main processing pipeline
+└── test_shapefile_processor.py # Test shapefile processing
+
+configs/ # Configuration files
+notebooks/ # Jupyter notebooks for analysis
+SLURM/ # SLURM job scripts
+tests/ # Unit tests
+```
+
+## Workflow
+
+### 1. Download NEON Data
+Given the Northing, Easting, Year and Site, download the NEON data using the download scripts. There are functions to download the RGB, HSI, and LiDAR data.
+
+### 2. Process Shapefiles
+Process tree crown annotation shapefiles with coordinate system correction and validation.
+
+### 3. Process Tiles and Crowns
+Run the full pipeline to match crown annotations with image tiles and create training datasets.
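
One way to picture the crown-to-tile matching step: NEON airborne mosaics are delivered as 1 km × 1 km tiles identified by the UTM easting and northing of their lower-left corner, so a crown's tile can be found by flooring its coordinates to the nearest 1000 m. The sketch below assumes each crown is reduced to a single point; the function names and sample coordinates are illustrative and do not come from the repository, and crowns that straddle a tile edge would need a real geometry intersection instead.

```python
import math

TILE_SIZE_M = 1000  # assumed tile edge length in metres


def tile_origin(easting: float, northing: float, tile_size: int = TILE_SIZE_M) -> tuple[int, int]:
    """Return the lower-left corner of the tile that contains a UTM point."""
    return (
        int(math.floor(easting / tile_size)) * tile_size,
        int(math.floor(northing / tile_size)) * tile_size,
    )


def group_crowns_by_tile(crowns):
    """Group (crown_id, easting, northing) records by the origin of their tile."""
    tiles = {}
    for crown_id, easting, northing in crowns:
        tiles.setdefault(tile_origin(easting, northing), []).append(crown_id)
    return tiles


# Made-up crown points, used only to exercise the functions.
crowns = [
    ("c1", 314250.5, 4878120.0),
    ("c2", 314900.0, 4878999.9),
    ("c3", 315010.0, 4878500.0),
]
tiles_to_crowns = group_crowns_by_tile(crowns)
print(tiles_to_crowns)
```

Grouping by tile origin first means each tile only has to be opened once, no matter how many crowns fall inside it.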
+
+## Usage Examples
+
+### Processing NEON Shapefiles
+
+```python
+from neon_tree_classification.data.shapefile_processor import ShapefileProcessor
+
+processor = ShapefileProcessor()
+
+# Consolidate shapefiles from subdirectories
+processor.consolidate_files(parent_dir, destination_dir)
+
+# Process with coordinate validation and CRS correction
+sites_df, summary = processor.process_shapefiles(destination_dir)
+```
+
+### Running the Tile Processing Pipeline
+
+```python
+from scripts.process_tiles_to_crowns import run_full_pipeline
+
+results = run_full_pipeline(
+    tiles_base_dir="/path/to/neon_tiles",
+    crown_csv_path="/path/to/clean_coordinates.csv",
+    site="BART",
+    year="2019",
+    output_base_dir="/path/to/output"
+)
+```
+
+## Key Features
+
+### Coordinate System Handling
+- Automatic UTM zone detection by NEON site
+- CRS transformation and validation
+- Invalid coordinate filtering (infinite values, out-of-range)
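
The coordinate handling listed above can be illustrated with a small, dependency-free sketch. It derives a WGS 84 / UTM EPSG code from a longitude and latitude using the standard 6-degree zone formula, and rejects coordinates that are infinite, NaN, negative, or outside a plausible UTM range. The package is described as detecting the zone per NEON site, so the longitude-based formula here is a stand-in; the function names, the range bounds, and the approximate BART location are assumptions.

```python
import math


def utm_epsg_from_lonlat(lon: float, lat: float) -> int:
    """Return the WGS 84 / UTM EPSG code for a longitude/latitude pair."""
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx covers northern-hemisphere zones, 327xx southern.
    return (32600 if lat >= 0 else 32700) + zone


def is_valid_utm(easting: float, northing: float) -> bool:
    """Reject non-finite, negative, or implausibly large UTM coordinates."""
    if not (math.isfinite(easting) and math.isfinite(northing)):
        return False
    # Assumed plausibility bounds, not values taken from the repository.
    return 100_000 <= easting <= 900_000 and 0 <= northing <= 10_000_000


# Approximate location of the BART site (assumed for illustration).
print(utm_epsg_from_lonlat(-71.29, 44.06))
print(is_valid_utm(float("inf"), 4878000.0))
```

Checking finiteness before the range comparison keeps NaN values from slipping through, since every comparison against NaN evaluates to false.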
+
+### Multi-Modal Processing
+- RGB imagery (GeoTIFF)
+- Hyperspectral imagery (H5 → GeoTIFF conversion)
+- LiDAR CHM data (GeoTIFF)
+
+### Robust Data Validation
+- Geometry validation and cleaning
+- Coordinate range checking
+- File existence verification
+- Error handling and reporting
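
The commit message describes file existence checks that prevent redundant HSI conversions. A minimal sketch of that idea follows, assuming the conversion is passed in as a callable. `convert_if_missing` and the stand-in converter are hypothetical names; the real H5 to GeoTIFF conversion is not shown in this commit excerpt.

```python
import tempfile
from pathlib import Path
from typing import Callable


def convert_if_missing(src: Path, dst: Path, convert: Callable[[Path, Path], None]) -> bool:
    """Run convert(src, dst) unless dst already exists and is non-empty.

    Returns True if the conversion ran, False if it was skipped.
    """
    if dst.exists() and dst.stat().st_size > 0:
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    convert(src, dst)
    return True


def copy_bytes(src: Path, dst: Path) -> None:
    """Stand-in for a real H5 to GeoTIFF converter (assumption for the demo)."""
    dst.write_bytes(src.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "tile.h5"
    source.write_bytes(b"placeholder")
    target = Path(tmp) / "converted" / "tile.tif"
    first_run = convert_if_missing(source, target, copy_bytes)   # output missing, converts
    second_run = convert_if_missing(source, target, copy_bytes)  # output present, skips
    print(first_run, second_run)
```

Treating an empty output file as missing guards against a previous run that crashed after creating the file but before writing any data.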
+
+## Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Add tests for new functionality
+5. Submit a pull request
+
+## Authors
+
+- Ritesh Chowdhry
+
+## Acknowledgments
+
+- National Ecological Observatory Network (NEON) for providing the data
+- University of Florida Macrosystems project

 # Citations

SLURM/crowns.sh

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+#!/bin/bash
+#SBATCH --job-name=crop_crowns
+#SBATCH --mail-type=END,FAIL
+#SBATCH --mail-user=riteshchowdhry@ufl.edu
+#SBATCH --account=azare
+#SBATCH --output=/home/riteshchowdhry/logs/macrosystems/crop_crowns_%j.out
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=10
+#SBATCH --mem=150G
+#SBATCH --time=48:00:00
+#SBATCH --partition=gpu
+#SBATCH --constraint=ai
+#SBATCH --gpus=1
+
+date; hostname
+module load conda
+conda activate dfor_311
+pwd
+
+
+srun -u python crop_crowns.py
+
+date

SLURM/dask.sh

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@
 #SBATCH --mail-type=END,FAIL
 #SBATCH --mail-user=riteshchowdhry@ufl.edu
 #SBATCH --account=azare
-#SBATCH --partition=gpu
 #SBATCH --output=/home/riteshchowdhry/logs/macrosystems/dask_deepforest/master_%j.out
 #SBATCH --ntasks=1
 #SBATCH --cpus-per-task=2
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+"""
+NEON Tree Classification Package
+
+A comprehensive package for downloading, processing, and classifying tree species
+from NEON airborne data including RGB, hyperspectral imagery, and LiDAR.
+"""
+
+__version__ = "0.1.0"
+__author__ = "Ritesh Chowdhry"
+
+# Import key classes for easy access
+from .data.shapefile_processor import ShapefileProcessor
+from .models.architectures import HsiPixelClassifier
+
+__all__ = [
+    "ShapefileProcessor",
+    "HsiPixelClassifier",
+]
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
"""Data handling and preprocessing utilities."""
2+
3+
from .shapefile_processor import ShapefileProcessor
4+
from .neon_downloader import NEONDownloader
5+
6+
__all__ = ['ShapefileProcessor', 'NEONDownloader']