|
1 | 1 | # NEON Tree Classification |
2 | 2 |
|
3 | | -A Python package for processing NEON (National Ecological Observatory Network) tree crown annotation data and building machine learning models for tree species classification. |
4 | | - |
5 | | -National Ecological Observatory Network (NEON) offers a variety of data products, including airborne data from different forest sites. Airborne data includes RGB orthophotos, LiDAR (CHM) airborne data, and 426 band hyperspectral data. All products are available on https://data.neonscience.org/data-products/. |
6 | | - |
7 | | -## NEON Data Products Used |
8 | | - |
9 | | -- **RGB Orthophotos**: `DP3.30010.001` - High-resolution orthorectified camera imagery mosaic |
10 | | -- **Hyperspectral Imagery**: `DP3.30006.002` - Spectrometer orthorectified surface bidirectional reflectance - mosaic |
11 | | -- **LiDAR CHM**: `DP3.30015.001` - Ecosystem structure (Canopy Height Model) |
| 3 | +A modular Python package for processing NEON airborne data and multi-modal tree species classification using RGB, hyperspectral, and LiDAR data. |
12 | 4 |
|
13 | 5 | ## Features |
14 | 6 |
|
15 | | -- **Shapefile Processing**: Handle coordinate system transformations and validation for NEON tree crown shapefiles |
16 | | -- **HSI Tile Processing**: Convert and process hyperspectral imagery (HSI) tiles from H5 to GeoTIFF format |
17 | | -- **Crown-Tile Intersection**: Match tree crown annotations with corresponding image tiles |
18 | | -- **Data Pipeline**: End-to-end processing from raw NEON data to training-ready datasets |
19 | | -- **Coordinate Validation**: Robust handling of invalid coordinates and CRS issues |
| 7 | +### Data Processing |
| 8 | +- **NEON data download**: Automated download of RGB, hyperspectral, and LiDAR tiles |
| 9 | +- **Shapefile processing**: Coordinate system transformations and validation for tree crowns |
| 10 | +- **Multi-modal tile processing**: Convert and process HSI (H5 → GeoTIFF), RGB, and LiDAR data |
| 11 | +- **Crown-tile intersection**: Match tree crown annotations with corresponding image tiles |
| 12 | + |
| 13 | +### Machine Learning |
| 14 | +- **Multi-modal models**: Separate architectures for RGB, hyperspectral (426 bands), and LiDAR |
| 15 | +- **Modular training**: PyTorch Lightning modules with CometML/TensorBoard logging |
| 16 | +- **Flexible data pipeline**: Clean tensor-only batches with configurable splits |
| 17 | +- **Modern packaging**: Uses `pyproject.toml` and `uv` for dependency management |
20 | 18 |
|
21 | 19 | ## Installation |
22 | 20 |
|
23 | 21 | ```bash |
24 | | -# Clone the repository |
25 | 22 | git clone https://github.com/Ritesh313/NeonTreeClassification.git |
26 | 23 | cd NeonTreeClassification |
27 | 24 |
|
28 | | -# Install in development mode |
| 25 | +# Install with uv (recommended) |
| 26 | +pip install uv |
| 27 | +uv sync |
| 28 | + |
| 29 | +# Or with pip |
29 | 30 | pip install -e . |
30 | 31 | ``` |
31 | 32 |
|
32 | 33 | ## Quick Start |
33 | 34 |
|
34 | | -### 1. Process Shapefiles |
| 35 | +### Data Processing |
35 | 36 | ```bash |
| 37 | +# Process NEON shapefiles |
36 | 38 | python scripts/test_shapefile_processor.py |
37 | | -``` |
38 | 39 |
|
39 | | -### 2. Process Tiles and Crowns |
40 | | -```bash |
| 40 | +# Process tiles and match with crowns |
41 | 41 | python scripts/process_tiles_to_crowns.py |
42 | 42 | ``` |
43 | 43 |
|
44 | | -## Package Structure |
45 | | - |
46 | | -``` |
47 | | -neon_tree_classification/ |
48 | | -├── data/ |
49 | | -│ └── shapefile_processor.py # Shapefile processing and CRS handling |
50 | | -├── models/ |
51 | | -│ └── hsi_models.py # PyTorch models for HSI classification |
52 | | -├── processing/ |
53 | | -│ └── __init__.py # Processing utilities |
54 | | -└── utils/ |
55 | | - └── __init__.py # General utilities |
| 44 | +### Model Training |
| 45 | +```bash |
| 46 | +# Train RGB model |
| 47 | +python train.py --modality rgb --csv_path data/crowns.csv --data_dir data/ |
56 | 48 |
|
57 | | -scripts/ |
58 | | -├── download_neon_all_modalities.py # Data download scripts |
59 | | -├── process_tiles_to_crowns.py # Main processing pipeline |
60 | | -└── test_shapefile_processor.py # Test shapefile processing |
| 49 | +# Train HSI model with CometML logging |
| 50 | +python train.py --modality hsi --logger comet --project_name my-project |
61 | 51 |
|
62 | | -configs/ # Configuration files |
63 | | -notebooks/ # Jupyter notebooks for analysis |
64 | | -SLURM/ # SLURM job scripts |
65 | | -tests/ # Unit tests |
| 52 | +# Compare all modalities |
| 53 | +python compare_modalities.py --csv_path data/crowns.csv --data_dir data/ |
66 | 54 | ``` |
67 | 55 |
|
68 | | -## Workflow |
69 | | - |
70 | | -### 1. Download NEON Data |
71 | | -Given the Northing, Easting, Year and Site, download the NEON data using the download scripts. There are functions to download the RGB, HSI, and LiDAR data. |
72 | | - |
73 | | -### 2. Process Shapefiles |
74 | | -Process tree crown annotation shapefiles with coordinate system correction and validation. |
75 | | - |
76 | | -### 3. Process Tiles and Crowns |
77 | | -Run the full pipeline to match crown annotations with image tiles and create training datasets. |
78 | | - |
79 | | -## Usage Examples |
80 | | - |
81 | | -### Processing NEON Shapefiles |
82 | | - |
| 56 | +### Using in code |
83 | 57 | ```python |
| 58 | +# Data processing |
84 | 59 | from neon_tree_classification.data.shapefile_processor import ShapefileProcessor |
85 | 60 |
|
86 | 61 | processor = ShapefileProcessor() |
87 | | - |
88 | | -# Consolidate shapefiles from subdirectories |
89 | | -processor.consolidate_files(parent_dir, destination_dir) |
90 | | - |
91 | | -# Process with coordinate validation and CRS correction |
92 | 62 | sites_df, summary = processor.process_shapefiles(destination_dir) |
93 | | -``` |
94 | 63 |
|
95 | | -### Running the Tile Processing Pipeline |
| 64 | +# Model training |
| 65 | +from neon_tree_classification import NeonCrownDataModule, RGBClassifier |
96 | 66 |
|
97 | | -```python |
98 | | -from scripts.process_tiles_to_crowns import run_full_pipeline |
99 | | - |
100 | | -results = run_full_pipeline( |
101 | | - tiles_base_dir="/path/to/neon_tiles", |
102 | | - crown_csv_path="/path/to/clean_coordinates.csv", |
103 | | - site="BART", |
104 | | - year="2019", |
105 | | - output_base_dir="/path/to/output" |
| 67 | +datamodule = NeonCrownDataModule( |
| 68 | + csv_path="data/crowns.csv", |
| 69 | + base_data_dir="data/", |
| 70 | + modalities=["rgb"], |
| 71 | + batch_size=32 |
106 | 72 | ) |
| 73 | + |
| 74 | +classifier = RGBClassifier(model_type="resnet", num_classes=10) |
| 75 | + |
| 76 | +import lightning as L |
| 77 | +trainer = L.Trainer(max_epochs=50) |
| 78 | +trainer.fit(classifier, datamodule) |
107 | 79 | ``` |
108 | 80 |
|
109 | | -## Key Features |
| 81 | +## Architecture |
110 | 82 |
|
111 | | -### Coordinate System Handling |
112 | | -- Automatic UTM zone detection by NEON site |
113 | | -- CRS transformation and validation |
114 | | -- Invalid coordinate filtering (infinite values, out-of-range) |
| 83 | +``` |
| 84 | +neon_tree_classification/ |
| 85 | +├── data/ |
| 86 | +│ ├── dataset.py # Multi-modal dataset |
| 87 | +│ ├── datamodule.py # Lightning DataModule |
| 88 | +│ └── shapefile_processor.py # NEON shapefile processing |
| 89 | +├── models/ |
| 90 | +│ ├── rgb_models.py # RGB architectures |
| 91 | +│ ├── hsi_models.py # Hyperspectral architectures |
| 92 | +│ ├── lidar_models.py # LiDAR architectures |
| 93 | +│ └── lightning_modules.py # Training modules |
| 94 | +└── processing/ # NEON data processing utilities |
| 95 | +
|
| 96 | +scripts/ |
| 97 | +├── download_neon_all_modalities.py # Download NEON data |
| 98 | +├── process_tiles_to_crowns.py # Tile processing pipeline |
| 99 | +└── test_shapefile_processor.py # Test shapefile processing |
| 100 | +``` |
115 | 101 |
|
116 | | -### Multi-Modal Processing |
117 | | -- RGB imagery (GeoTIFF) |
118 | | -- Hyperspectral imagery (H5 → GeoTIFF conversion) |
119 | | -- LiDAR CHM data (GeoTIFF) |
| 102 | +## NEON Data Products |
120 | 103 |
|
121 | | -### Robust Data Validation |
122 | | -- Geometry validation and cleaning |
123 | | -- Coordinate range checking |
124 | | -- File existence verification |
125 | | -- Error handling and reporting |
| 104 | +- **RGB**: `DP3.30010.001` - High-resolution orthorectified imagery |
| 105 | +- **Hyperspectral**: `DP3.30006.002` - 426-band spectrometer reflectance |
| 106 | +- **LiDAR**: `DP3.30015.001` - Canopy Height Model |
126 | 107 |
|
127 | 108 | ## Contributing |
128 | 109 |
|
129 | 110 | 1. Fork the repository |
130 | 111 | 2. Create a feature branch |
131 | | -3. Make your changes |
132 | | -4. Add tests for new functionality |
133 | | -5. Submit a pull request |
| 112 | +3. Submit a pull request |
134 | 113 |
|
135 | 114 | ## Authors |
136 | 115 |
|
137 | | -- Ritesh Chowdhry |
| 116 | +Ritesh Chowdhry |
138 | 117 |
|
139 | 118 | ## Acknowledgments |
140 | 119 |
|
141 | | -- National Ecological Observatory Network (NEON) for providing the data |
| 120 | +- National Ecological Observatory Network (NEON) |
142 | 121 | - University of Florida Macrosystems project |
143 | 122 |
|
144 | | -# Citations |
145 | | - |
146 | | -## NEON Airborne Data Products |
147 | | -NEON (National Ecological Observatory Network). High-resolution orthorectified camera imagery mosaic (DP3.30010.001), RELEASE-2025. https://doi.org/10.48443/gdgn-3r69. Dataset accessed from https://data.neonscience.org/data-products/DP3.30010.001/RELEASE-2025 on April 3, 2025. |
148 | | - |
149 | | -NEON (National Ecological Observatory Network). Spectrometer orthorectified surface bidirectional reflectance - mosaic (DP3.30006.002), provisional data. Dataset accessed from https://data.neonscience.org/data-products/DP3.30006.002 on April 3, 2025. Data archived at [your DOI]. |
150 | | - |
151 | | -NEON (National Ecological Observatory Network). Ecosystem structure (DP3.30015.001), RELEASE-2025. https://doi.org/10.48443/jqqd-1n30. Dataset accessed from https://data.neonscience.org/data-products/DP3.30015.001/RELEASE-2025 on April 3, 2025. |
152 | | - |
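
The updated README lists crown-tile intersection as a feature, but the diff does not show how crowns are matched to tiles. Below is a minimal sketch of the idea, not the package's implementation. It assumes that tiles form a regular 1 km grid in UTM coordinates and that each tile is identified by its lower-left (easting, northing) corner. The names `TILE_SIZE_M` and `tiles_for_crown` are hypothetical and do not come from the repository.

```python
import math

# Assumption: tiles cover 1 km x 1 km and are keyed by their lower-left corner.
TILE_SIZE_M = 1000


def tiles_for_crown(xmin, ymin, xmax, ymax, tile_size=TILE_SIZE_M):
    """Return the lower-left (easting, northing) corner of every tile
    that the crown bounding box touches. Hypothetical helper."""
    if xmin > xmax or ymin > ymax:
        raise ValueError("bounding box must satisfy xmin <= xmax and ymin <= ymax")

    # Snap each edge of the box down to the tile grid.
    first_e = math.floor(xmin / tile_size) * tile_size
    last_e = math.floor(xmax / tile_size) * tile_size
    first_n = math.floor(ymin / tile_size) * tile_size
    last_n = math.floor(ymax / tile_size) * tile_size

    # Enumerate every grid cell between the snapped corners (inclusive).
    return [
        (e, n)
        for e in range(first_e, last_e + 1, tile_size)
        for n in range(first_n, last_n + 1, tile_size)
    ]


# A crown inside one tile maps to a single tile corner.
single = tiles_for_crown(314200.5, 4878300.0, 314210.5, 4878312.0)

# A crown straddling a tile boundary maps to both neighbouring tiles.
straddling = tiles_for_crown(314990.0, 4878100.0, 315005.0, 4878110.0)
```

A crown that straddles a boundary would need pixels from more than one tile, so a real pipeline has to either mosaic the neighbouring tiles or skip such crowns. Which of these the repository does is not stated in the diff.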