LG-NPU

Description

lg-npu (Luca Goddijn Neural Processing Unit) is an experimental open hardware accelerator for neural network inference.

The project aims to design a modular neural processing unit that can be implemented on FPGA for development and validation, with a long-term goal of supporting ASIC implementation.

The architecture supports six compute kernels and is designed around a clear separation between:

- a software-controlled execution model
- modular compute backends and composites
- a shared memory and data movement infrastructure

The project is structured to support full hardware/software co-design, including:

- SystemVerilog RTL implementation
- simulation and verification infrastructure
- Python reference models for correctness
- a software runtime and driver interface
- FPGA and ASIC integration flows

The accelerator targets INT8 inference, supporting convolution, general matrix multiply (GEMM), softmax, element-wise vector operations, layer normalisation, and 2D spatial pooling.
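As an illustration of what INT8 GEMM means at the arithmetic level, the following is a minimal pure-Python sketch, not the project's actual reference model. The function names `gemm_int8` and `saturate_int8` are hypothetical, and the shift-based requantisation step is an assumption; the real scaling scheme is whatever the lg-npu specification defines.

```python
INT8_MIN, INT8_MAX = -128, 127


def saturate_int8(value):
    """Clamp an integer to the signed 8-bit range."""
    return max(INT8_MIN, min(INT8_MAX, value))


def gemm_int8(a, b, shift=0):
    """Multiply two INT8 matrices given as lists of rows.

    Products are accumulated at full precision, then arithmetic-right-shifted
    by `shift` bits and saturated back to INT8. The shift-based requantisation
    is an assumption for illustration, not the lg-npu specification.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Wide accumulator: Python ints never overflow.
            acc = sum(a[i][k] * b[k][j] for k in range(inner))
            out_row.append(saturate_int8(acc >> shift))
        out.append(out_row)
    return out


if __name__ == "__main__":
    print(gemm_int8([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Accumulating in a wider type before saturating is the usual reason INT8 datapaths carry a larger accumulator: a single dot product of INT8 values can easily exceed the 8-bit range.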

Documentation

The docs/ directory contains architecture descriptions, hardware/software contracts, and development guides for the lg-npu project.

Documentation is divided into three main categories: architecture, specifications, and bring-up guides.

Architecture (`docs/arch/`)

High-level design documents describing how the NPU is structured and how its components interact.

| File | Description |
| --- | --- |
| `overview.md` | High-level overview of the lg-npu architecture and design goals. |
| `compute_dataflow.md` | Description of the compute execution model and dataflow through backends and composites. |
| `memory_hierarchy.md` | Overview of the on-chip memory system, buffers, and data movement strategy. |
| `programming_model.md` | Conceptual model for how software interacts with the NPU (commands, execution flow, completion). |

Hardware / Software Specification (`docs/spec/`)

Precise documentation defining the interface between hardware and software. These documents form the contract that both the RTL and driver/runtime must follow.

| File | Description |
| --- | --- |
| `register_map.md` | Memory-mapped register layout exposed to the host. |
| `command_format.md` | Structure and semantics of command descriptors submitted to the NPU. |
| `tensor_layouts.md` | Supported tensor memory layouts and dimension ordering. |
| `interrupts.md` | Interrupt and completion signaling behavior. |
| `reset_boot_flow.md` | Device initialization, reset behavior, and boot sequence. |
| `perf_counters.md` | Performance monitoring counters and profiling support. |

Development & Bring-Up (`docs/bringup/`)

Guides for running simulations, testing the hardware, and bringing the system up on real hardware.

| File | Description |
| --- | --- |
| `sim_bringup.md` | Instructions for running simulations and verification environments. |
| `fpga_bringup.md` | Steps required to build and run the design on an FPGA platform. |

Relationship to the Source Tree

These documents correspond closely to the main project components:

| Directory | Purpose |
| --- | --- |
| `rtl/` | Hardware implementation of the NPU. |
| `tb/` | Verification testbenches and simulation infrastructure. |
| `model/` | Reference software models used for validation. |
| `sw/uapi/` | Public C headers (device context, commands, tensors, registers). |
| `sw/runtime/` | Runtime library: MMIO, IRQ, DMA, command builders, tensor conversion. |
| `sw/driver/` | Kernel-level driver stubs (future). |
| `sw/shared/` | Cross-platform annotation macros (`annotations.h`). |
| `fpga/` | FPGA integration and board support. |
| `asic/` | ASIC synthesis, layout, and sign-off flow. |
| `tools/` | Linting, formatting, code generation, and utility scripts. |

Quick Start

Prerequisites

- Verilator 5.x (check with `verilator --version`)
- GNU Make
- Bash (for simulation scripts)
- Python 3.9+ (for generators and reference models)
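A quick way to confirm that the tools listed above are installed is a small script such as the one below. It is a hedged sketch and not part of the repository: the function name `check_prerequisites` and the executable names `verilator`, `make`, and `bash` are assumptions about how the tools appear on `PATH`. It checks only for presence, so the Verilator 5.x requirement still needs a manual `verilator --version`.

```python
import shutil
import sys

# Executable names are assumptions; adjust them if your system differs.
REQUIRED_TOOLS = ("verilator", "make", "bash")
MIN_PYTHON = (3, 9)


def check_prerequisites(which=shutil.which, version=None):
    """Return a list of missing prerequisites (empty when all are found).

    `which` and `version` are injectable so the check can be exercised
    without depending on the host machine.
    """
    version = sys.version_info[:2] if version is None else tuple(version)
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if version < MIN_PYTHON:
        missing.append("python>=3.9")
    return missing


if __name__ == "__main__":
    missing = check_prerequisites()
    print("missing:", ", ".join(missing) if missing else "none")
```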

Build Targets

Run `make help` to list all targets. The most common ones are:

make lint              # Verilator lint (zero-warning gate)
make compile           # Verilate -> C++ model
make compile-full-tests # Build full regression binary
make sim-smoke         # Smoke regression (conv + control + perf)
make sim-full          # Full regression (all test suites)
make sw-check          # Syntax-check runtime C sources (-fsyntax-only)
make sw-build          # Build liblgnpu_rt.a and liblgnpu_rt.so
make vectors           # Generate test vectors from Python models
make format            # Auto-format SV / shell / Python
make gen               # Regenerate SV packages + C headers
make viz               # Generate architecture diagrams
make waves             # Open latest waveform in Surfer viewer
make clean             # Remove build artifacts

Running Lint

make lint

This invokes Verilator with `-Wall` on all RTL via `tools/lint/rtl.f`. The design must pass with zero warnings.
