MechSpec-Qwen

License: Apache 2.0 · Python 3.10+ · Model: Qwen2.5-3B

A domain-adapted large language model for mechanical engineering specifications, material property retrieval, engineering calculations, and GD&T interpretation. Built by fine-tuning Qwen2.5-3B-Instruct with LoRA on ~5,000 expert-authored Q&A pairs covering real-world engineering problems.


Architecture

               +------------------------------------------+
               |            MechSpec Pipeline             |
               +------------------------------------------+
                                     |
              +----------------------+----------------------+
              |                      |                      |
     +--------v---------+   +--------v---------+   +--------v---------+
     |  Data Pipeline   |   | Training (LoRA)  |   |   Evaluation     |
     +------------------+   +------------------+   +------------------+
     | generate_qa_pairs|   | Qwen2.5-3B-Inst  |   | MechEval Bench   |
     | validate_answers |   | 4-bit NF4 quant  |   |  - Mat Retrieval |
     | format_dataset   |   | LoRA r=32        |   |  - Calculations  |
     | split            |   | SFTTrainer       |   |  - GD&T Interp   |
     | fetch_mp_api     |   | Cosine schedule  |   | Radar chart      |
     +------------------+   +------------------+   +------------------+
              |                      |                      |
              v                      v                      v
     +--------+---------+   +--------+---------+   +--------+---------+
     | 5000+ Q&A Pairs  |   | LoRA Adapters    |   | Comparison       |
     | Alpaca format    |   | Merged model     |   | Report           |
     | 85/10/5 split    |   | safetensors      |   | Per-task scores  |
     +------------------+   +------------------+   +------------------+

Training Details

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Fine-tuning method | LoRA (PEFT) |
| LoRA rank | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Target modules | q, k, v, o, gate, up, down projections |
| Quantization | 4-bit NF4 (bitsandbytes) |
| Epochs | 3 |
| Batch size | 2 (x8 gradient accumulation = effective 16) |
| Learning rate | 2e-4 (cosine schedule, 5% warmup) |
| Precision | fp16 (T4 compatible) |
| Max sequence length | 2048 tokens |
| Training hardware | 1x NVIDIA T4 (16 GB VRAM) |
| Training time | ~30 minutes |
| Trainable parameters | ~48M / 3B total (~1.6%) |

Dataset Statistics

| Category | Count | Examples |
|---|---|---|
| Material property retrieval | ~2,000 | Yield strength, UTS, modulus, density, CTE |
| Beam deflection | ~500 | Cantilever, simply supported, distributed load |
| Stress analysis | ~500 | Von Mises, Mohr's circle, axial, torsion |
| GD&T interpretation | ~500 | Position, flatness, runout, concentricity |
| Thermal calculations | ~400 | Expansion, constrained stress, conduction |
| Pressure vessel design | ~300 | Cylindrical, spherical, ASME BPVC |
| Other engineering | ~800 | Fatigue, springs, bolts, bearings, welds, buckling |
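Pairs are stored in the standard Alpaca schema (instruction / input / output fields). The record below is illustrative, not taken from the dataset:

```python
import json

# An illustrative Q&A pair in Alpaca format (not an actual dataset record)
sample = {
    "instruction": "What is the typical yield strength of 6061-T6 aluminum?",
    "input": "",
    "output": "6061-T6 aluminum has a typical yield strength of about 276 MPa "
              "(40 ksi). Actual values vary by lot and temper; verify against "
              "a certified material test report for critical applications.",
}

# Serialize one record as it would appear in a JSON dataset file
line = json.dumps(sample)
```

The empty `input` field is normal for self-contained questions; calculation problems put given values (loads, dimensions, material) there instead.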

MechEval Benchmark Results

| Model | Material Retrieval | Calculations | GD&T | Aggregate |
|---|---|---|---|---|
| Qwen2.5-3B (base) | XX% | XX% | XX% | XX% |
| MechSpec-Qwen-3B | XX% | XX% | XX% | XX% |
| Qwen2.5-7B (base) | XX% | XX% | XX% | XX% |
| GPT-4o (reference) | XX% | XX% | XX% | XX% |

Results pending full evaluation. Run make eval to generate scores.

Quickstart

Installation

pip install -e .

Generate Training Data

mechspec-data --output data/generated/qa_pairs.json --num-pairs 5000

Train

mechspec-train --config configs/training_config.yaml

Inference

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "outputs/merged"  # or HuggingFace Hub ID
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": "You are a mechanical engineering expert."},
    {"role": "user", "content": "What is the yield strength of Ti-6Al-4V?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)  # greedy decoding; temperature only applies when do_sample=True
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Interactive Mode

mechspec-generate --model-path outputs/merged --interactive

Evaluate

mechspec-eval --config configs/eval_config.yaml --model-path outputs/merged
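MechEval's calculation tasks presumably compare a numeric answer against a reference value. A minimal version of such a check is sketched below; the 2% tolerance and the function name are assumptions, not the benchmark's actual scoring logic:

```python
import math

def score_numeric(predicted: float, reference: float, rel_tol: float = 0.02) -> bool:
    """Count a calculation answer as correct if it is within rel_tol of the reference."""
    return math.isclose(predicted, reference, rel_tol=rel_tol)

# e.g. a cantilever tip deflection: model answers 2.97 mm, reference is 3.00 mm
assert score_numeric(2.97, 3.00)        # within 2% -> counted correct
assert not score_numeric(2.50, 3.00)    # off by ~17% -> counted incorrect
```

A relative tolerance avoids penalizing rounding differences while still catching wrong formulas or unit errors, which typically miss by far more than a few percent.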

Project Structure

mechspec-qwen/
+-- configs/                  Training and evaluation YAML configs
+-- src/mechspec/
|   +-- data/                 Data generation, validation, formatting
|   +-- training/             LoRA fine-tuning and model merging
|   +-- eval/                 MechEval benchmark and scoring
|   +-- inference/            Single-query and batch inference
+-- data/eval_benchmark/      Curated evaluation datasets (JSON)
+-- notebooks/                Colab training notebook
+-- tests/                    Pytest test suite
+-- docs/                     Training log and data source documentation

Full Pipeline

Run the entire pipeline from data generation to evaluation:

make pipeline

Or step by step:

make data          # Generate Q&A pairs
make validate      # Validate generated data
make format-data   # Format for training
make split         # Create train/val/test splits
make train         # Fine-tune with LoRA
make merge         # Merge adapters into base model
make eval          # Run MechEval benchmark
make report        # Generate evaluation report
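Where a closed-form solution exists, the validate step can recompute a generated answer from first principles. The sketch below shows one such check for cantilever tip deflection (delta = P*L^3 / (3*E*I)), with made-up numbers; it illustrates the idea, not the project's actual `validate_answers` implementation:

```python
import math

def cantilever_tip_deflection(P: float, L: float, E: float, I: float) -> float:
    """Tip deflection of an end-loaded cantilever: delta = P*L**3 / (3*E*I)."""
    return P * L**3 / (3 * E * I)

# Hypothetical case: steel cantilever, 100 mm square section, 1 kN end load
P, L, E = 1000.0, 1.0, 200e9                   # N, m, Pa
I = 0.1 * 0.1**3 / 12                          # m^4, rectangle: b*h^3/12
delta = cantilever_tip_deflection(P, L, E, I)  # 2.0e-4 m, i.e. 0.2 mm

# Validation-style check of a dataset answer (illustrative value)
dataset_answer_m = 2.0e-4
assert math.isclose(delta, dataset_answer_m, rel_tol=1e-6)
```

Recomputing answers this way catches generation errors in the synthetic data before they reach training.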

Development

pip install -e ".[dev]"
make lint          # Run ruff linter
make type-check    # Run mypy
make test          # Run pytest
make test-cov      # Run with coverage

Model Card

Intended Use

MechSpec-Qwen is designed for:

  • Retrieving material properties for common engineering alloys
  • Solving introductory-to-intermediate engineering calculation problems
  • Interpreting GD&T (Geometric Dimensioning and Tolerancing) specifications
  • Providing step-by-step engineering analysis with proper units

Limitations

  • Not a replacement for FEA/simulation software. The model performs closed-form analytical calculations, not finite element analysis.
  • Material property values are typical/nominal. Actual values vary by heat treatment, manufacturing process, and material lot. Always verify against certified material test reports (CMTRs) for critical applications.
  • Not validated for safety-critical decisions. Do not use model output as the sole basis for structural design or safety analysis. A licensed professional engineer (PE) must review all calculations used in practice.
  • Limited to training data coverage. The model is trained on ~20 common engineering alloys and standard textbook-level problems. It may hallucinate values for uncommon materials or advanced analysis methods.
  • English only. The model is fine-tuned on English-language engineering content.

Ethical Considerations

  • Engineering calculations require professional verification before use in real-world applications.
  • Incorrect material properties or calculations could lead to structural failures if used without review.
  • This model should be used as a productivity tool, not as a substitute for engineering education or professional practice.

Carbon Footprint

Training was performed on a single NVIDIA T4 GPU for approximately 30 minutes. Estimated energy consumption: ~0.05 kWh. Estimated CO2 emissions: ~0.02 kg CO2eq (assuming US average grid carbon intensity of 0.4 kg CO2/kWh).

License

This project is licensed under the Apache License 2.0. See LICENSE for details.

The base model (Qwen2.5-3B-Instruct) is licensed under the Qwen License.
