One-Line Description:
High-performance concurrent file processor in Go with live metrics, autoscaling workers, and SHA256 hashing.
Developed by: Syed Shaheer Hussain © 2026
Technologies / Language:
- Go (Golang)
- Standard Library:
context, crypto/sha256, sync, runtime, os, filepath
Tags: #GoLang #Concurrency #SystemsProgramming #WorkerPool #Autoscaler #FileProcessing #SHA256 #Metrics
FileProcessor is a production-ready Go application that scans directories, calculates SHA256 hashes of files, and processes them concurrently using a dynamic worker pool.
It includes:
- Live metrics reporting (processed files, failed files, queue length, goroutines, memory usage)
- Worker autoscaling (adds workers automatically when the queue grows)
- Graceful shutdown via Ctrl+C or system signals
- Error handling and atomic counters for concurrency safety
This project demonstrates real-world systems programming concepts in Go.
- Scans directories recursively for files
- Hashes files using SHA256
- Processes multiple files concurrently with worker goroutines
- Reports live metrics every second
- Dynamically adds workers if backlog grows
- Gracefully shuts down on Ctrl+C or termination signals
Flowchart / Process Flow:
+-----------------+
| Main Program |
+-----------------+
|
v
+--------------------+
| Walk Directory |
| Collect File Paths |
+--------------------+
|
v
+---------------------+
| Jobs Channel (Chan) |
+---------------------+
/ | \
/ | \
v v v
+-----------+  +-----------+  +-----------+
|  Worker   |  |  Worker   |  |  Worker   |
| Goroutine |  | Goroutine |  | Goroutine |
+-----------+  +-----------+  +-----------+
\ | /
\ | /
v v v
+-------------------------+
| Process Files (SHA256) |
+-------------------------+
|
v
+------------------+
| Metrics Reporter |
|   Memory Usage   |
|   Queue Length   |
|    Goroutines    |
+------------------+
|
v
+----------------+
|   Autoscaler   |
|  Add/Remove    |
|    Workers     |
+----------------+
FileProcessor/
├── main.go          # Main application
└── go.mod           # Go modules file
- ✅ Concurrency with worker pools
- ✅ Dynamic autoscaling of workers
- ✅ SHA256 hashing of files
- ✅ Live metrics reporting (processed, failed, queue, goroutines, memory)
- ✅ Graceful shutdown with Ctrl+C
- ✅ Atomic counters for safe concurrent updates
- ✅ Error logging and collection
| Function | Purpose |
|---|---|
| `main()` | Initializes workers, metrics reporter, autoscaler; walks directories |
| `worker()` | Processes jobs from the channel, computes SHA256, updates metrics |
| `processFile()` | Opens file, computes SHA256, simulates processing delay |
| `metricsReporter()` | Prints live metrics every second (processed, failed, queue, goroutines, memory) |
| `workerAutoscaler()` | Dynamically adds workers if backlog grows; tracks logical reduction |
Requirements:
- Go >= 1.25
- Windows, Linux, or macOS
Steps:
- Clone repo (or create project folder):
git clone https://github.com/SyedShaheerHussain/Concurrency-FileProcessor-GO-lang
cd FileProcessor
- Initialize Go modules (if not done):
go mod init fileprocessor
- Build or run:
go run main.go -dir=C:\Users\YourUser\Documents -workers=4
Optional build:
go build -o fileprocessor main.go
./fileprocessor -dir=C:\Users\YourUser\Documents -workers=4
- Run with `-dir` to specify the directory
- Run with `-workers` to specify the initial number of workers
- Monitor metrics printed every second
- Press Ctrl+C to gracefully stop
Example:
go run main.go -dir=C:\Windows -workers=4
go run main.go -dir=C:\Windows -workers=6
Caution
- The program recursively reads every file under the target directory — pointing it at extremely large directory trees can queue many paths and consume significant RAM.
- SHA256 hashing can be CPU-intensive for very large files.
- Autoscaler increases workers dynamically — too many workers can overwhelm CPU.
- Only files are processed, directories are skipped.
- Handles large directories concurrently
- Real-time metrics for observability
- Dynamic adjustment of workers for performance
- Graceful shutdown prevents resource leaks
- Cross-platform (Windows/Linux/macOS)
- High memory usage if queue size is huge
- Autoscaler currently cannot reduce active workers forcibly; idle workers exit naturally
- Metrics printing may slightly slow down very high-throughput processing
- No persistence of processed files metadata yet
- True worker scaling down (idle workers terminate automatically)
- Add throughput stats (files/sec)
- Prometheus metrics endpoint for external monitoring
- Terminal dashboard UI
- Retry mechanism for failed files
- Distributed processing with multiple machines
- Configurable thresholds for autoscaling
- Program starts and parses the `-dir` and `-workers` flags
- Context and graceful-shutdown signal handling are initialized
- Jobs channel with a buffer of 100 is created
- Initial workers (`worker()` goroutines) start
- Worker autoscaler starts monitoring the queue
- Metrics reporter prints live metrics every second
- Directory is walked recursively; files are sent to the jobs channel
- Workers read jobs, compute SHA256, update metrics
- Autoscaler adds workers if the backlog grows
- Ctrl+C triggers context cancellation
- Workers and metrics reporter exit gracefully
- Final summary is printed
- File indexing and backup systems
- Antivirus or file integrity scanning
- Log aggregation or crawler pipelines
- Educational tool for Go concurrency and systems programming
Syed Shaheer Hussain © 2026
Warning
- Use responsibly; scanning system directories may require admin permissions
- Designed for learning, testing, and real-world file processing scenarios
- Scan directories concurrently
- Compute SHA256 of files
- Autoscale workers
- Live metrics monitoring (including memory usage)
- Handle graceful shutdown
- Process files beyond memory/disk constraints
- Scale workers across multiple machines (currently single-machine)
- Persist results to database (requires extension)
| Feature | Status |
|---|---|
| Concurrency | ✅ |
| Worker Autoscaling | ✅ |
| Live Metrics | ✅ |
| Memory Usage Stats | ✅ |
| SHA256 hashing | ✅ |
| Graceful Shutdown | ✅ |
| Error Logging | ✅ |
- Go (Golang) 1.25+
- Highly concurrent
- Dynamic scaling
- Real-time observability
- Cross-platform
- CPU & memory usage grows with large directories
- Autoscaler downscaling is conceptual only
- Large file directories
- Systems programming practice in Go
- Learning concurrency patterns, worker pools, atomic counters, context handling
Note
- Sleep time in `processFile()` simulates CPU-bound work (50 ms by default)
- Metrics are printed every 1 second
- Autoscaler ticks every 2 seconds
- Designed worker pool with a channel for job distribution
- Added context cancellation for graceful shutdown
- Used `atomic` counters for processed/failed files
- Added metrics reporter for live stats and memory usage
- Added worker autoscaler for dynamic concurrency
- Install Go >= 1.25
- Clone repository
- Open terminal and navigate to project folder
- Run: `go run main.go -dir=<directory> -workers=<number>`
- Observe metrics and logs in real time
- Press Ctrl+C to gracefully stop
FileProcessor is a real-time, concurrent, scalable file processor written in Go. It’s suitable for systems programming, educational purposes, and real-world concurrent file processing.
It demonstrates worker pools, context cancellation, atomic counters, live metrics, autoscaling, SHA256 hashing, and graceful shutdown in a single project.
If you find this repository useful or insightful, please consider:
- ⭐ Starring the repository
- 🔁 Sharing it within your network
- 👤 Following my GitHub profile for future projects and updates
Your support helps drive continued innovation and open-source contributions.
— Syed Shaheer Hussain