Commit 0f3558a

feat: Add modular dataloaders and models for RGB, HSI, and LiDAR
- Create separate model architectures for each modality
- Add flexible dataloaders supporting any combination of modalities
- Implement training functionality for individual or combined modalities
- Update packaging with uv setup
1 parent 4c9e89b commit 0f3558a
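
The commit message mentions dataloaders that support any combination of modalities. A minimal sketch of that idea follows, assuming one file path per modality per crown. `CrownRecord`, `MultiModalCrownDataset`, and the `loader` callback are hypothetical names used for illustration; they are not taken from the repository, whose actual dataset lives in `neon_tree_classification/data/dataset.py`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence

# Modalities named in the commit message.
SUPPORTED_MODALITIES = ("rgb", "hsi", "lidar")


@dataclass
class CrownRecord:
    """One tree crown: a file path per modality plus a species label (hypothetical layout)."""

    paths: Dict[str, str]
    label: int


class MultiModalCrownDataset:
    """Returns only the requested modalities for each crown (illustrative sketch)."""

    def __init__(
        self,
        records: Sequence[CrownRecord],
        modalities: Sequence[str],
        loader: Callable[[str], Any],
    ) -> None:
        if not modalities:
            raise ValueError("at least one modality is required")
        unknown = sorted(set(modalities) - set(SUPPORTED_MODALITIES))
        if unknown:
            raise ValueError(f"unknown modalities: {unknown}")
        self.records = list(records)
        self.modalities = list(modalities)
        self.loader = loader  # e.g. a function that reads a GeoTIFF into an array

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, index: int) -> Dict[str, Any]:
        record = self.records[index]
        # Load only the modalities that were requested at construction time.
        sample = {name: self.loader(record.paths[name]) for name in self.modalities}
        sample["label"] = record.label
        return sample
```

Because the class defines `__len__` and `__getitem__`, it follows the map-style dataset protocol that `torch.utils.data.DataLoader` accepts. Swapping `modalities=["rgb"]` for `["rgb", "lidar"]` changes what each sample contains without touching the loading code.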

33 files changed

Lines changed: 4694 additions & 2058 deletions

.gitignore

Lines changed: 10 additions & 0 deletions
@@ -12,6 +12,16 @@ lightning_logs/
 results_temp_dir/
 .comet.config
 
+# Python packaging
+*.egg-info/
+build/
+dist/
+*.egg
+
+# uv
+.venv/
+uv.lock
+
 # Compiled source #
 ###################
 *.com

CLASSIFIER_SETUP.md

Whitespace-only changes.

MIGRATION_GUIDE.py

Lines changed: 0 additions & 69 deletions
This file was deleted.

README.md

Lines changed: 69 additions & 99 deletions
@@ -1,152 +1,122 @@
 # NEON Tree Classification
 
-A Python package for processing NEON (National Ecological Observatory Network) tree crown annotation data and building machine learning models for tree species classification.
-
-National Ecological Observatory Network (NEON) offers a variety of data products, including airborne data from different forest sites. Airborne data includes RGB orthophotos, LiDAR (CHM) airborne data, and 426 band hyperspectral data. All products are available on https://data.neonscience.org/data-products/.
-
-## NEON Data Products Used
-
-- **RGB Orthophotos**: `DP3.30010.001` - High-resolution orthorectified camera imagery mosaic
-- **Hyperspectral Imagery**: `DP3.30006.002` - Spectrometer orthorectified surface bidirectional reflectance - mosaic
-- **LiDAR CHM**: `DP3.30015.001` - Ecosystem structure (Canopy Height Model)
+A modular Python package for processing NEON airborne data and multi-modal tree species classification using RGB, hyperspectral, and LiDAR data.
 
 ## Features
 
-- **Shapefile Processing**: Handle coordinate system transformations and validation for NEON tree crown shapefiles
-- **HSI Tile Processing**: Convert and process hyperspectral imagery (HSI) tiles from H5 to GeoTIFF format
-- **Crown-Tile Intersection**: Match tree crown annotations with corresponding image tiles
-- **Data Pipeline**: End-to-end processing from raw NEON data to training-ready datasets
-- **Coordinate Validation**: Robust handling of invalid coordinates and CRS issues
+### Data Processing
+- **NEON data download**: Automated download of RGB, hyperspectral, and LiDAR tiles
+- **Shapefile processing**: Coordinate system transformations and validation for tree crowns
+- **Multi-modal tile processing**: Convert and process HSI (H5 → GeoTIFF), RGB, and LiDAR data
+- **Crown-tile intersection**: Match tree crown annotations with corresponding image tiles
+
+### Machine Learning
+- **Multi-modal models**: Separate architectures for RGB, hyperspectral (426 bands), and LiDAR
+- **Modular training**: PyTorch Lightning modules with CometML/TensorBoard logging
+- **Flexible data pipeline**: Clean tensor-only batches with configurable splits
+- **Modern packaging**: Uses `pyproject.toml` and `uv` for dependency management
 
 ## Installation
 
 ```bash
-# Clone the repository
 git clone https://github.com/Ritesh313/NeonTreeClassification.git
 cd NeonTreeClassification
 
-# Install in development mode
+# Install with uv (recommended)
+pip install uv
+uv sync
+
+# Or with pip
 pip install -e .
 ```
 
 ## Quick Start
 
-### 1. Process Shapefiles
+### Data Processing
 ```bash
+# Process NEON shapefiles
 python scripts/test_shapefile_processor.py
-```
 
-### 2. Process Tiles and Crowns
-```bash
+# Process tiles and match with crowns
 python scripts/process_tiles_to_crowns.py
 ```
 
-## Package Structure
-
-```
-neon_tree_classification/
-├── data/
-│   └── shapefile_processor.py # Shapefile processing and CRS handling
-├── models/
-│   └── hsi_models.py # PyTorch models for HSI classification
-├── processing/
-│   └── __init__.py # Processing utilities
-└── utils/
-    └── __init__.py # General utilities
+### Model Training
+```bash
+# Train RGB model
+python train.py --modality rgb --csv_path data/crowns.csv --data_dir data/
 
-scripts/
-├── download_neon_all_modalities.py # Data download scripts
-├── process_tiles_to_crowns.py # Main processing pipeline
-└── test_shapefile_processor.py # Test shapefile processing
+# Train HSI model with CometML logging
+python train.py --modality hsi --logger comet --project_name my-project
 
-configs/ # Configuration files
-notebooks/ # Jupyter notebooks for analysis
-SLURM/ # SLURM job scripts
-tests/ # Unit tests
+# Compare all modalities
+python compare_modalities.py --csv_path data/crowns.csv --data_dir data/
 ```
 
-## Workflow
-
-### 1. Download NEON Data
-Given the Northing, Easting, Year and Site, download the NEON data using the download scripts. There are functions to download the RGB, HSI, and LiDAR data.
-
-### 2. Process Shapefiles
-Process tree crown annotation shapefiles with coordinate system correction and validation.
-
-### 3. Process Tiles and Crowns
-Run the full pipeline to match crown annotations with image tiles and create training datasets.
-
-## Usage Examples
-
-### Processing NEON Shapefiles
-
+### Using in code
 ```python
+# Data processing
 from neon_tree_classification.data.shapefile_processor import ShapefileProcessor
 
 processor = ShapefileProcessor()
-
-# Consolidate shapefiles from subdirectories
-processor.consolidate_files(parent_dir, destination_dir)
-
-# Process with coordinate validation and CRS correction
 sites_df, summary = processor.process_shapefiles(destination_dir)
-```
 
-### Running the Tile Processing Pipeline
+# Model training
+from neon_tree_classification import NeonCrownDataModule, RGBClassifier
 
-```python
-from scripts.process_tiles_to_crowns import run_full_pipeline
-
-results = run_full_pipeline(
-    tiles_base_dir="/path/to/neon_tiles",
-    crown_csv_path="/path/to/clean_coordinates.csv",
-    site="BART",
-    year="2019",
-    output_base_dir="/path/to/output"
+datamodule = NeonCrownDataModule(
+    csv_path="data/crowns.csv",
+    base_data_dir="data/",
+    modalities=["rgb"],
+    batch_size=32
 )
+
+classifier = RGBClassifier(model_type="resnet", num_classes=10)
+
+import lightning as L
+trainer = L.Trainer(max_epochs=50)
+trainer.fit(classifier, datamodule)
 ```
 
-## Key Features
+## Architecture
 
-### Coordinate System Handling
-- Automatic UTM zone detection by NEON site
-- CRS transformation and validation
-- Invalid coordinate filtering (infinite values, out-of-range)
+```
+neon_tree_classification/
+├── data/
+│   ├── dataset.py # Multi-modal dataset
+│   ├── datamodule.py # Lightning DataModule
+│   └── shapefile_processor.py # NEON shapefile processing
+├── models/
+│   ├── rgb_models.py # RGB architectures
+│   ├── hsi_models.py # Hyperspectral architectures
+│   ├── lidar_models.py # LiDAR architectures
+│   └── lightning_modules.py # Training modules
+└── processing/ # NEON data processing utilities
+
+scripts/
+├── download_neon_all_modalities.py # Download NEON data
+├── process_tiles_to_crowns.py # Tile processing pipeline
+└── test_shapefile_processor.py # Test shapefile processing
+```
 
-### Multi-Modal Processing
-- RGB imagery (GeoTIFF)
-- Hyperspectral imagery (H5 → GeoTIFF conversion)
-- LiDAR CHM data (GeoTIFF)
+## NEON Data Products
 
-### Robust Data Validation
-- Geometry validation and cleaning
-- Coordinate range checking
-- File existence verification
-- Error handling and reporting
+- **RGB**: `DP3.30010.001` - High-resolution orthorectified imagery
+- **Hyperspectral**: `DP3.30006.002` - 426-band spectrometer reflectance
+- **LiDAR**: `DP3.30015.001` - Canopy Height Model
 
 ## Contributing
 
 1. Fork the repository
 2. Create a feature branch
-3. Make your changes
-4. Add tests for new functionality
-5. Submit a pull request
+3. Submit a pull request
 
 ## Authors
 
-- Ritesh Chowdhry
+Ritesh Chowdhry
 
 ## Acknowledgments
 
-- National Ecological Observatory Network (NEON) for providing the data
+- National Ecological Observatory Network (NEON)
 - University of Florida Macrosystems project
 
-# Citations
-
-## NEON Airborne Data Products
-NEON (National Ecological Observatory Network). High-resolution orthorectified camera imagery mosaic (DP3.30010.001), RELEASE-2025. https://doi.org/10.48443/gdgn-3r69. Dataset accessed from https://data.neonscience.org/data-products/DP3.30010.001/RELEASE-2025 on April 3, 2025.
-
-NEON (National Ecological Observatory Network). Spectrometer orthorectified surface bidirectional reflectance - mosaic (DP3.30006.002), provisional data. Dataset accessed from https://data.neonscience.org/data-products/DP3.30006.002 on April 3, 2025. Data archived at [your DOI].
-
-NEON (National Ecological Observatory Network). Ecosystem structure (DP3.30015.001), RELEASE-2025. https://doi.org/10.48443/jqqd-1n30. Dataset accessed from https://data.neonscience.org/data-products/DP3.30015.001/RELEASE-2025 on April 3, 2025.
-

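The product codes that appear in both versions of the README can be kept in one lookup table. The sketch below assumes the `https://data.neonscience.org/data-products/<code>` URL pattern seen in the removed citations. `NEON_PRODUCTS`, `DATA_PORTAL`, and `product_url` are hypothetical names for illustration, not part of the package.

```python
# NEON product codes as listed in the README's "NEON Data Products" section.
NEON_PRODUCTS = {
    "rgb": "DP3.30010.001",    # High-resolution orthorectified imagery
    "hsi": "DP3.30006.002",    # 426-band spectrometer reflectance
    "lidar": "DP3.30015.001",  # Canopy Height Model
}

# Base URL taken from the citations removed in this commit.
DATA_PORTAL = "https://data.neonscience.org/data-products"


def product_url(modality: str) -> str:
    """Return the data portal page for a modality (hypothetical helper)."""
    try:
        code = NEON_PRODUCTS[modality.lower()]
    except KeyError:
        raise ValueError(f"unknown modality: {modality!r}") from None
    return f"{DATA_PORTAL}/{code}"


print(product_url("hsi"))
# prints https://data.neonscience.org/data-products/DP3.30006.002
```

Keeping the codes in one mapping means the short modality names used on the command line (`rgb`, `hsi`) and the NEON identifiers cannot drift apart.
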