@neuralmagic

Neural Magic

Neural Magic (acquired by Red Hat) empowers developers to optimize & deploy LLMs at scale. Our model compression & acceleration enable top performance with vLLM.

Pinned

  1. deepsparse Public archive

    Sparsity-aware deep learning inference runtime for CPUs

    Python 3.2k 191

Repositories

Showing 10 of 93 repositories
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 17 Apache-2.0 16,015 0 37 Updated Apr 20, 2026
  • axolotl Public Forked from axolotl-ai-cloud/axolotl

    Go ahead and axolotl questions

    Python 0 Apache-2.0 1,334 0 5 Updated Apr 19, 2026
  • every_eval_ever Public Forked from evaleval/every_eval_ever

    Every Eval Ever is a shared schema and crowdsourced eval database. It defines a standardized metadata format for storing AI evaluation results — from leaderboard scrapes and research papers to local evaluation runs — so that results from different frameworks can be compared, reproduced, and reused.

    Python 0 MIT 29 0 5 Updated Apr 18, 2026
  • GuardBench Public Forked from eldarkurtic/GuardBench

    A Python library for guardrail models evaluation with vLLM support.

    Python 0 EUPL-1.2 10 0 13 Updated Apr 18, 2026
  • lighteval Public Forked from huggingface/lighteval

    Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

    Python 0 MIT 455 0 1 Updated Apr 18, 2026
  • nyann-bench Public
    Go 0 Apache-2.0 0 0 4 Updated Apr 17, 2026
  • research Public

    Repository to enable research flows

    Python 3 0 0 3 Updated Apr 17, 2026
  • flash-attention Public Forked from vllm-project/flash-attention

    Fast and memory-efficient exact attention

    C++ 0 BSD-3-Clause 2,650 0 0 Updated Apr 16, 2026
  • transformers-gdm Public Forked from huggingface/transformers

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Python 0 Apache-2.0 33,558 0 0 Updated Apr 15, 2026
  • lm-evaluation-harness Public Forked from EleutherAI/lm-evaluation-harness

    A framework for few-shot evaluation of language models.

    Python 5 MIT 3,224 0 1 Updated Apr 14, 2026
