A PyTorch-based optimizer wrapper for continual learning via selective fine-tuning, guided by the condition number (κ) of the model's weight tensors.
Please cite the following paper if you use this code or ideas derived from it in your publications: https://arxiv.org/html/2506.16289v3
kappaTune is designed to address the challenge of catastrophic forgetting in continual learning scenarios. By analyzing the condition numbers of a neural network's weight matrices, it selects a subset of parameters to fine-tune. This approach updates only the tensors with the smallest condition numbers, owing to a synergy of three factors:
- Numerical Stability: Their inherent stability makes them less susceptible to training noise.
- Learning Potential: Their higher differential entropy output provides more capacity to learn new information (acting like a raw marble block ready to be sculpted).
- Knowledge Preservation: Their less specialized nature allows for robust adaptation without overwriting the highly specific, anisotropic weights that store foundational pre-training knowledge, as shown in the paper.
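The core idea can be sketched in a few lines: compute each weight matrix's condition number κ = σ_max/σ_min from its singular values and rank tensors in ascending order. The snippet below is an illustrative sketch using NumPy and toy matrices, not kappaTune's actual API:

```python
import numpy as np

def condition_number(w):
    """kappa(W) = sigma_max / sigma_min, from the singular values of W."""
    s = np.linalg.svd(w, compute_uv=False)  # singular values, sorted descending
    return float(s[0] / s[-1])

# Toy weight tensors (illustrative names, not from kappaTune)
weights = {
    "isotropic.weight": np.eye(8),                              # kappa = 1
    "anisotropic.weight": np.diag(np.linspace(1.0, 100.0, 8)),  # kappa = 100
}

# kappaTune's selection rule: prefer the tensors with the SMALLEST kappa
ranked = sorted(weights, key=lambda name: condition_number(weights[name]))
print(ranked)  # ['isotropic.weight', 'anisotropic.weight']
```

A small κ means the singular values are close together ("round" spectrum, low anisotropy); a large κ indicates a highly specialized, anisotropic tensor that the method leaves untouched.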
- Condition Number Guided Selection: Ranks model parameters based on their condition numbers, prioritizing those that are less anisotropic (more "round" in their singular value distribution).
- Selective Fine-Tuning: Integrates with any standard PyTorch optimizer, ensuring only the selected parameters are updated.
- Efficient Analysis: Caches condition numbers to avoid redundant computations across multiple runs or experiments.
- Flexible Filtering: Allows skipping parameters based on number of dimensions, or maximum dimension size, providing fine-grained control over which tensors are considered for analysis.
- Catastrophic Forgetting Mitigation: By selectively updating parameters, kappaTune helps preserve pre-trained knowledge, making it suitable for continual learning and domain adaptation tasks.
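Putting the pieces above together, the selection-plus-optimizer workflow might look like the following. This is a hypothetical sketch of the approach (the helper `select_params` and its parameter names are illustrative, not kappaTune's real API):

```python
import torch

def select_params(model, k=2, min_ndim=2, max_dim=4096):
    """Illustrative kappaTune-style selection: rank eligible tensors by
    condition number and keep the k smallest. Not the library's real API."""
    scored = []
    for name, p in model.named_parameters():
        # Flexible filtering: skip by number of dimensions or max dimension size
        if p.ndim < min_ndim or max(p.shape) > max_dim:
            continue
        s = torch.linalg.svdvals(p.detach().float())  # singular values, descending
        scored.append(((s[0] / s[-1]).item(), name, p))
    scored.sort(key=lambda t: t[0])  # ascending condition number
    return [(name, p) for _, name, p in scored[:k]]

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.Linear(16, 4))
selected = select_params(model, k=1)

# Only the selected tensors are handed to a standard PyTorch optimizer
opt = torch.optim.AdamW([p for _, p in selected], lr=1e-4)
```

Because the wrapper only controls which tensors reach the optimizer, it composes with any standard PyTorch optimizer (AdamW, SGD, etc.); in practice the condition numbers would also be cached so repeated runs skip the SVDs.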
While LoRA is highly effective for reducing training costs through parameter-efficient fine-tuning, it doesn’t inherently include a strategy to prevent catastrophic forgetting. In contrast, kappaTune is purpose-built for continual learning; it offers better retention of prior knowledge and also reduces computational effort as a side effect by selectively updating only a small subset of model tensors.
You can now use KappaTune's selection logic directly with the Hugging Face ecosystem. This allows you to apply LoRA adapters only to the proper modules, effectively mitigating catastrophic forgetting with a single line of code.
Instead of manually guessing which layers to target (e.g., q_proj, v_proj), let KappaTune find the best candidates for you:
```python
from transformers import AutoModelForCausalLM
from kappaTune import get_kappatune_lora_model

# Load your model
model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Apply KappaTune-LoRA (new automated selection)
model = get_kappatune_lora_model(
    model,
    num_modules_to_adapt=20,  # Targets the 20 most stable layers
    lora_rank=16,
)
```

- Python 3.8+
- `pip` package manager
You can install the required libraries using pip:

```shell
pip install torch transformers datasets numpy
```

For KappaTune-LoRA using Hugging Face PEFT, see `kappa_lora_tinyllama.py`. For the original kappaTune fine-tuning in PyTorch (without LoRA), see `complete_example_use_selective_fine_tuning.py`, which demonstrates how to use kappaTune to fine-tune a TinyLlama-1.1B model on a text classification dataset (ag_news), selectively updating parameters based on their condition numbers. Note that while ag_news is a classification dataset, the example code performs a language modeling (next-token prediction) task only to illustrate LLM adaptation.
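The selective-update pattern the example script relies on can be sketched compactly: freeze every tensor except the selected ones, then run an ordinary next-token-prediction step. The tiny model and the selected-tensor name below are illustrative, not taken from the script:

```python
import torch

# Toy "language model": embedding + output projection (illustrative only)
model = torch.nn.Sequential(torch.nn.Embedding(100, 32), torch.nn.Linear(32, 100))
selected = {"1.weight"}  # e.g., the tensor with the smallest condition number

# Freeze everything that was not selected
for name, p in model.named_parameters():
    p.requires_grad = name in selected

opt = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One next-token-prediction step on random token ids
tokens = torch.randint(0, 100, (4, 8))
logits = model(tokens[:, :-1])  # predict token t+1 from tokens up to t
loss = torch.nn.functional.cross_entropy(
    logits.reshape(-1, 100), tokens[:, 1:].reshape(-1)
)
loss.backward()
opt.step()  # only the selected tensor is updated
```

The frozen tensors receive no gradients at all, which is where the compute savings mentioned above come from.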