
Commit 0cba1e4

Update dataset to full 40k samples and improve processing scripts
- Update to complete training dataset with 40k samples
- Refresh README with current dataset statistics
- Enhance processing scripts with argument parsing and better output
- Make all scripts configurable via command line arguments
1 parent dd154df commit 0cba1e4

6 files changed

Lines changed: 42593 additions & 113 deletions

File tree

README.md

Lines changed: 29 additions & 29 deletions
@@ -1,13 +1,13 @@
 # NEON Multi-Modal Tree Species Dataset
 
-Hyperspectral, RGB and LiDAR airborne data for **96 tree species** representing **5,518 individual trees** across **30 NEON sites** in North America.
+Hyperspectral, RGB and LiDAR airborne data for **162 tree species** representing **42,453 individual trees** across **27 NEON sites** in North America.
 
 ## Dataset Overview
 
-- **5,518** individual tree crowns
-- **96** unique species
-- **30** NEON sites across North America
-- **2018-2020** (3 years of data)
+- **42,453** individual tree crowns
+- **162** unique species
+- **27** NEON sites across North America
+- **2014-2023** (10 years of data)
 - **3 modalities:** RGB, Hyperspectral (426 bands), LiDAR CHM
 
 ## Quick Start
@@ -75,35 +75,35 @@ plot_lidar(sample['lidar_path'])
 
 ## Top Species
 
-The dataset includes 96 tree species. Here are the most common:
+The dataset includes 162 tree species. Here are the most common:
 
 | Rank | Species | Count | Percentage |
 |------|---------|-------|------------|
-| 1 | Picea mariana (Mill.) Britton, Sterns & Poggenb. | 678 | 12.3% |
-| 2 | Acer rubrum L. | 360 | 6.5% |
-| 3 | Pseudotsuga menziesii (Mirb.) Franco var. menziesii | 300 | 5.4% |
-| 4 | Populus tremuloides Michx. | 271 | 4.9% |
-| 5 | Quercus rubra L. | 243 | 4.4% |
-| 6 | Pinus palustris Mill. | 233 | 4.2% |
-| 7 | Tsuga canadensis (L.) Carrière | 200 | 3.6% |
-| 8 | Pinus contorta Douglas ex Loudon var. latifolia Engelm. ex S. Watson | 189 | 3.4% |
-| 9 | Abies lasiocarpa (Hook.) Nutt. var. lasiocarpa | 172 | 3.1% |
-| 10 | Betula neoalaskana Sarg. | 162 | 2.9% |
+| 1 | Acer rubrum L. | 5,324 | 12.5% |
+| 2 | Tsuga canadensis (L.) Carrière | 3,103 | 7.3% |
+| 3 | Pseudotsuga menziesii (Mirb.) Franco var. menziesii | 2,678 | 6.3% |
+| 4 | Pinus palustris Mill. | 1,974 | 4.6% |
+| 5 | Quercus rubra L. | 1,843 | 4.3% |
+| 6 | Pinus contorta Douglas ex Loudon var. latifolia Engelm. ex S. Watson | 1,822 | 4.3% |
+| 7 | Tsuga heterophylla (Raf.) Sarg. | 1,394 | 3.3% |
+| 8 | Populus tremuloides Michx. | 1,091 | 2.6% |
+| 9 | Liriodendron tulipifera L. | 1,049 | 2.5% |
+| 10 | Quercus alba L. | 1,004 | 2.4% |
 
 ## Geographic Distribution
 
-Data collected from **30 NEON sites** across North America:
+Data collected from **27 NEON sites** across North America:
 
-**1.** DEJU: 577 samples (10.5%)
-**2.** BART: 533 samples (9.7%)
-**3.** BONA: 504 samples (9.1%)
-**4.** HARV: 490 samples (8.9%)
-**5.** MLBS: 368 samples (6.7%)
-**6.** RMNP: 329 samples (6.0%)
-**7.** DELA: 299 samples (5.4%)
-**8.** NIWO: 276 samples (5.0%)
-**9.** UNDE: 262 samples (4.7%)
-**10.** TALL: 246 samples (4.5%)
+**1.** HARV: 6,672 samples (15.7%)
+**2.** MLBS: 5,056 samples (11.9%)
+**3.** GRSM: 4,774 samples (11.2%)
+**4.** DELA: 4,240 samples (10.0%)
+**5.** RMNP: 3,602 samples (8.5%)
+**6.** WREF: 3,517 samples (8.3%)
+**7.** OSBS: 2,101 samples (4.9%)
+**8.** BART: 1,827 samples (4.3%)
+**9.** UNDE: 1,678 samples (4.0%)
+**10.** CLBJ: 1,655 samples (3.9%)
 
 ## Installation
 
@@ -184,7 +184,7 @@ datamodule = NeonCrownDataModule(
 )
 
 # Train RGB model
-classifier = RGBClassifier(num_classes=96)
+classifier = RGBClassifier(num_classes=162)
 
 import lightning as L
 trainer = L.Trainer(max_epochs=50)
@@ -237,5 +237,5 @@ Ritesh Chowdhry
 ## Acknowledgments
 
 - National Ecological Observatory Network (NEON)
-- This dataset details were generated on 2025-08-24
+- This dataset details were generated on 2025-08-26
 
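The percentage columns in the updated README tables appear to be each count divided by the 42,453 total crowns, rounded to one decimal place. The sketch below checks that reading against a few of the new rows. The names `TOTAL_CROWNS`, `percentage`, `species_counts`, and `site_counts` are hypothetical and are not taken from the repository's processing scripts.

```python
# Total number of tree crowns reported in the updated README.
TOTAL_CROWNS = 42_453

# A few counts copied from the updated "Top Species" table.
species_counts = {
    "Acer rubrum L.": 5_324,
    "Tsuga canadensis (L.) Carrière": 3_103,
    "Quercus alba L.": 1_004,
}

# A few counts copied from the updated "Geographic Distribution" list.
site_counts = {
    "HARV": 6_672,
    "DELA": 4_240,
    "CLBJ": 1_655,
}


def percentage(count: int, total: int = TOTAL_CROWNS) -> float:
    """Return count as a share of total, in percent, rounded to one decimal."""
    return round(100 * count / total, 1)


for name, count in {**species_counts, **site_counts}.items():
    print(f"{name}: {count:,} ({percentage(count):.1f}%)")
```

Under this assumption the computed values match the figures shown in the diff, for example 12.5% for Acer rubrum and 15.7% for HARV.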
