Add InternVL3-8B-Instruct contrib model by jimburtoft · Pull Request #163 · aws-neuron/neuronx-distributed-inference

jimburtoft · 2026-05-09T05:46:13Z

Summary

Add InternVL3-8B-Instruct VLM contrib (InternViT-300M vision encoder + Qwen2.5-7B text backbone)
Validated on trn2.3xlarge (TP=4, SDK 2.29): 75.1 tok/s decode, 138 ms TTFT, logit cosine similarity 0.9984 vs CPU FP32
1.85x the output throughput of an NVIDIA L40S GPU
Includes vision encoder compilation, text backbone with vision embedding injection, vLLM integration patches, accuracy tests, and GPU benchmark scripts
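
To put the decode rate and TTFT figures in context, here is a rough back-of-the-envelope sketch of single-request latency. It assumes the first token arrives at TTFT and every later token arrives at the steady decode rate; it ignores batching, image preprocessing, and variation with sequence length. The function name `estimate_latency_s` is hypothetical and not part of this PR.

```python
def estimate_latency_s(output_tokens, ttft_s=0.138, decode_tok_per_s=75.1):
    """Rough single-request latency in seconds.

    Defaults are the figures reported in this PR (138 ms TTFT, 75.1 tok/s).
    Assumption: token 1 arrives at TTFT, the rest at the steady decode rate.
    """
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    if decode_tok_per_s <= 0:
        raise ValueError("decode_tok_per_s must be positive")
    return ttft_s + (output_tokens - 1) / decode_tok_per_s


if __name__ == "__main__":
    for n in (1, 64, 256):
        print(f"{n:4d} tokens: ~{estimate_latency_s(n):.2f} s")
```

Measured end-to-end latency may differ, since the reported numbers come from one specific configuration (trn2.3xlarge, TP=4).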

Model Details

Component	Details
Vision encoder	InternViT-300M-448px-V2.5 (24 layers, traced via torch_neuronx)
Projector	Pixel shuffle + 2-layer MLP
Text backbone	Qwen2.5-7B (NxDI NeuronBaseForCausalLM)
Framework	NeuronBaseForImageToText
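
The "pixel shuffle" step in the projector can be illustrated with a minimal pure-Python sketch of a space-to-depth merge: each 2x2 block of patch features is concatenated into one token with four times as many channels, which cuts the number of vision tokens by 4x before the MLP. This is a generic illustration, not the code in this PR; the channel ordering inside a merged token may differ from the model's, and the patch size of 14 and merge factor of 2 are assumptions not stated in the PR description. Both function names are hypothetical.

```python
def pixel_shuffle_tokens(grid, factor=2):
    """Space-to-depth merge of a 2D grid of patch feature vectors.

    grid[r][c] is a list of channel values for the patch at row r, column c.
    Returns a grid that is `factor` times smaller per side, where each token
    holds the concatenated channels of one factor x factor block.
    """
    height, width = len(grid), len(grid[0])
    if height % factor or width % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    merged = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            token = []
            for dy in range(factor):
                for dx in range(factor):
                    token.extend(grid[top + dy][left + dx])
            row.append(token)
        merged.append(row)
    return merged


def tokens_per_tile(image_size=448, patch_size=14, factor=2):
    """Vision tokens per square tile after the merge (patch_size is assumed)."""
    side = image_size // patch_size
    return (side // factor) ** 2


if __name__ == "__main__":
    # 4x4 grid with one channel per patch, numbered row by row.
    demo = [[[r * 4 + c] for c in range(4)] for r in range(4)]
    print(pixel_shuffle_tokens(demo)[0][0])
    print(tokens_per_tile())
```

Under these assumptions, a 448px tile yields a 32x32 patch grid that is merged down to a 16x16 grid of tokens for the text backbone.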

Files

src/ — 3-file VLM implementation (main, text, vision)
test/ — Integration tests
vllm/ — vLLM-neuron patches for serving
Benchmark/validation scripts (accuracy, scaling, NKI, GPU comparison)

Validation

CTE (context encoding) logit comparison vs CPU FP32: cosine=0.9984, top-1 match
TKG (token generation) text generation: correct output
Multimodal (text + image): end-to-end working
Sequence lengths: 2K-32K validated
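
The logit comparison above can be sketched as follows: compute the cosine similarity between the device logits and the CPU reference for one position, and check that both pick the same top-1 token. This is a minimal illustration using plain Python lists; it is not the PR's test script, and the function names and sample values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length logit vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top1_match(a, b):
    """True if both logit vectors select the same argmax token id."""
    argmax_a = max(range(len(a)), key=a.__getitem__)
    argmax_b = max(range(len(b)), key=b.__getitem__)
    return argmax_a == argmax_b


if __name__ == "__main__":
    # Made-up values standing in for CPU FP32 and device logits.
    reference = [2.0, -1.0, 0.5, 4.0]
    device = [1.9, -1.1, 0.6, 3.9]
    print(f"cosine={cosine_similarity(reference, device):.4f}")
    print(f"top-1 match: {top1_match(reference, device)}")
```

A real check would run this over the full vocabulary dimension, and typically over several prompts, rather than a four-element toy vector.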

VLM (InternViT-300M + Qwen2.5-7B) on trn2.3xlarge TP=4. 75.1 tok/s decode, 138ms TTFT, cosine 0.9984 vs CPU. Includes vision encoder, text backbone, vLLM patches, accuracy tests, and GPU benchmark comparison (1.85x vs L40S).

Add InternVL3-8B-Instruct contrib model

65402d6

jimburtoft wants to merge 1 commit into aws-neuron:main from jimburtoft:contrib/internvl3-8b
