zth1337/LuminaSearch

# Lumina Search ⚡

Lumina Search is an ultra-fast, modern full-text search engine built from scratch using the latest features of Java 22. This project demonstrates how combining SIMD instructions (Vector API), direct memory access (FFM API), and virtual threads (Project Loom) can deliver performance well beyond that of traditional engines.

## 🚀 Why is Lumina Faster?

Unlike classic search engines (e.g., Apache Lucene), Lumina minimizes object allocations and relies heavily on hardware acceleration for mathematical computations:

  1. SIMD BM25 Scoring: Relevance scoring is computed in vectorized blocks (256-bit), processing multiple documents in a single CPU instruction rather than sequentially.
  2. Zero-Copy Memory: Dictionary and index files are memory-mapped directly from disk via MemorySegment. This completely eliminates the overhead of copying data into the Java Heap and keeps the Garbage Collector idle.
  3. Hashing Over Strings: The custom analyzer converts text tokens into 64-bit Murmur3 hashes immediately during the parsing stage, reducing the search process to rapid binary operations.
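As a reference for point 1, the sketch below computes the classic BM25 term weight in plain scalar Java. It is not Lumina's actual kernel: the class and method names (`Bm25Reference`, `idf`, `score`) and the k1/b constants are illustrative, and the real engine would evaluate the same formula over lanes of documents at once with `DoubleVector` from `jdk.incubator.vector`.

```java
// Scalar reference for the BM25 weight that a SIMD kernel would compute
// in 256-bit blocks. Names and constants here are illustrative.
public final class Bm25Reference {
    static final double K1 = 1.2; // term-frequency saturation
    static final double B = 0.75; // document-length normalization

    // IDF component: log((N - df + 0.5) / (df + 0.5) + 1)
    static double idf(long totalDocs, long docFreq) {
        return Math.log((totalDocs - docFreq + 0.5) / (docFreq + 0.5) + 1.0);
    }

    // BM25 contribution of one term to one document's score.
    static double score(double idf, double tf, double docLen, double avgDocLen) {
        double norm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return idf * (tf * (K1 + 1.0)) / (tf + norm);
    }

    public static void main(String[] args) {
        double idf = idf(1_000_000, 1_200);
        System.out.printf("idf=%.4f score=%.4f%n",
                idf, score(idf, 3.0, 180.0, 200.0));
    }
}
```

A vectorized version replaces the per-document loop over `score` with one lane-parallel evaluation per 256-bit block, which is where the speedup in the list above comes from.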
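Point 2 can be illustrated with the FFM API (final in JDK 22): a file is mapped into a `MemorySegment` and read in place, with no copy into heap arrays. The file name, layout, and class name (`MappedIndexDemo`) below are illustrative stand-ins, not Lumina's real index format.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Minimal sketch of zero-copy index access: map a file as a MemorySegment,
// write one native-order long, and read it back through the same mapping.
public final class MappedIndexDemo {
    static long roundTrip() throws Exception {
        Path file = Files.createTempFile("lumina-demo", ".idx");
        try (Arena arena = Arena.ofConfined();
             FileChannel ch = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping is backed directly by the page cache: reads and
            // writes go through the OS mapping, not a heap copy.
            MemorySegment seg = ch.map(FileChannel.MapMode.READ_WRITE,
                    0, Long.BYTES, arena);
            seg.setAtIndex(ValueLayout.JAVA_LONG, 0, 42L);
            return seg.getAtIndex(ValueLayout.JAVA_LONG, 0);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

Because the segment's lifetime is tied to the confined `Arena`, the mapping is released deterministically when the try-with-resources block exits, rather than waiting on the garbage collector.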

## 📊 Performance Comparison (Lumina vs Lucene)

Benchmark based on the query "fast math memory" across a dataset of 1,000,000 documents. The table shows average response times.

*(Chart: average response times, Lumina vs Lucene, on the benchmark described above.)*

\*The Lucene result is an estimate based on a similar configuration using the built-in `LuceneBenchmark`. Lumina consistently delivers sub-millisecond latency thanks to hardware vectorization.

## ⚙️ Requirements

- **Java:** JDK 22 or newer.
- **Maven:** 3.8+.
- **JVM flags:** run with `--enable-preview` (for preview features) and `--add-modules jdk.incubator.vector` (the Vector API is still an incubator module and is not resolved by default).

## 🛠️ Build and Run

Compile the project using Maven:

```shell
mvn clean package
```
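Then launch with the JVM flags from the requirements above. The jar path and the query argument below are illustrative; the actual artifact name depends on the project's `pom.xml`:

```shell
java --enable-preview \
     --add-modules jdk.incubator.vector \
     -jar target/lumina-search.jar "fast math memory"
```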