
Thermodynamically Consistent Deep Learning Model for Nanocomposites using Nested Learning

License: MIT · Python 3.8+ · TensorFlow 2.x

A Physics-Informed Deep Learning (PIDL) framework for modeling complex material behavior while enforcing thermodynamic consistency. This repository implements a novel approach that combines HOPE (Nested Learning) architecture with Feed-Forward Neural Networks (FFNNs) to predict the mechanical response of nanoparticle-filled epoxy composites under varying ambient conditions.

*Model architecture diagram (pinn.PNG)*

🎯 Overview

This framework addresses the challenge of modeling complex material behavior by:

  • Enforcing Physics: Incorporating thermodynamic principles directly into the neural network architecture
  • Capturing History: Using HOPE blocks (Titans Memory + CMS) to model history-dependent material behavior through internal state variables
  • Ensuring Consistency: Guaranteeing thermodynamic consistency through physics-based loss terms
  • Handling Complexity: Managing multi-scale effects from temperature, moisture, and nanoparticle content variations

Key Innovation

The model uniquely combines:

  1. HOPE (Nested Learning) Architecture → Based on the paper "Nested Learning: The Illusion of Deep Learning Architecture" (Behrouz et al.)
  2. TitansL2 Memory Module → Adaptive memory with Delta Rule updates for capturing temporal dependencies
  3. Continuum Memory System (CMS) → Multi-frequency memory consolidation for persistent knowledge storage
  4. FFNN for Free Energy → Approximates the material's thermodynamic state
  5. Automatic Differentiation → Derives stress from the free energy (σ = 2∂Ψ/∂C)
  6. Physics Constraints → Enforce non-negative dissipation and thermodynamic laws

🧠 HOPE Implementation Overview

This section provides a detailed overview of the HOPE (Nested Learning) implementation in src/hope_layer.py, which replaces traditional LSTM layers with a more expressive memory system inspired by neuroscience principles.

Architecture Hierarchy

src/hope_layer.py
├── DynamicDense           # GLU-style gated projections
├── TitansL2               # Base memory with Delta Rule (standard projections)
├── TitansL2Dynamic        # Memory with Dynamic (gated) projections
├── CMSBlock               # Simple MLP for persistent storage
├── CMSLayer               # Chunk-based memory with slower updates
├── HopeBlock              # TitansL2 + CMSBlock
├── HopeBlockDynamic       # TitansL2Dynamic + CMSBlock
└── FullHOPEBlock          # TitansL2Dynamic + CMSLayer (used in model)

1. DynamicDense Layer

GLU-style gated projection that modulates the output based on input content:

y = (x @ W_static) * SiLU(x @ W_gate)

| Component | Description |
|-----------|-------------|
| W_static  | Static projection weights |
| W_gate    | Gating weights with SiLU activation |
| Output    | Element-wise product of projection and gate |

Purpose: Provides input-dependent gating for more expressive Q/K/V projections.
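
A minimal Keras-style sketch of this gating (the class name `DynamicDenseSketch` and the bias-free projections are illustrative, not the repository's exact implementation):

```python
import tensorflow as tf

class DynamicDenseSketch(tf.keras.layers.Layer):
    """GLU-style gated projection: y = (x @ W_static) * SiLU(x @ W_gate)."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.static_proj = tf.keras.layers.Dense(units, use_bias=False)  # W_static
        self.gate_proj = tf.keras.layers.Dense(units, use_bias=False)    # W_gate

    def call(self, x):
        # Element-wise product of the static projection and the SiLU-activated gate
        return self.static_proj(x) * tf.nn.silu(self.gate_proj(x))
```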


2. TitansL2 Memory Module

The core associative memory with Delta Rule updates. Processes sequences step-by-step using tf.scan.

Memory Update Formula

M_new = M_prev - α · forget_term + β · write_term

where:
  forget_term = (M @ k) @ k^T    # Selective forgetting
  write_term  = v @ k^T          # New information

Architecture Details

| Parameter | Description | Value |
|-----------|-------------|-------|
| head_dim | Dimension per attention head | units // n_head |
| n_head | Number of parallel memory heads | 6 (configurable) |
| α (alpha) | Forgetting rate | sigmoid(α_raw) × 0.8 ∈ [0, 0.8] |
| β (beta) | Writing rate | sigmoid(β_raw) × 0.8 ∈ [0, 0.8] |
| Memory shape | Per-head memory matrix | [batch, n_head, head_dim, head_dim] |

Processing Flow

1. Project inputs → Q, K, V  (via Dense or DynamicDense)
2. L2-normalize K and Q      (gradient stability)
3. Reshape to multi-head     [B, T, n_head, head_dim]
4. tf.scan over timesteps:
   a. Read:  y = M @ q
   b. Forget: M -= α · (M @ k) @ k^T
   c. Write:  M += β · v @ k^T
5. Project output            (c_proj)

Optional Momentum

When use_momentum=True, updates are smoothed via exponential moving average:

δ = -α · forget_term + β · write_term
momentum_new = β_m · momentum_prev + (1 - β_m) · δ
M_new = M_prev + momentum_new
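
Putting the read/forget/write steps and the optional momentum together, one scan step might look like the following sketch (the function name, shapes, and einsum formulation are illustrative assumptions, not the repository's exact code):

```python
import tensorflow as tf

def delta_rule_step(M, q, k, v, alpha, beta, momentum=None, beta_m=0.9):
    """One timestep of the Delta Rule memory update (illustrative sketch).

    M:        [batch, n_head, head_dim, head_dim]  per-head memory matrix
    q, k, v:  [batch, n_head, head_dim]            current-step projections (q, k L2-normalized)
    alpha, beta: forgetting / writing rates in [0, 0.8]
    """
    # Read: y = M @ q
    y = tf.einsum('bhij,bhj->bhi', M, q)

    # Forget: remove the component of M already associated with k
    Mk = tf.einsum('bhij,bhj->bhi', M, k)             # M @ k
    forget_term = tf.einsum('bhi,bhj->bhij', Mk, k)   # (M @ k) @ k^T

    # Write: add the new key-value association
    write_term = tf.einsum('bhi,bhj->bhij', v, k)     # v @ k^T

    delta = -alpha * forget_term + beta * write_term
    if momentum is not None:
        # Optional exponential moving average of updates (use_momentum=True)
        momentum = beta_m * momentum + (1.0 - beta_m) * delta
        M_new = M + momentum
    else:
        M_new = M + delta

    return M_new, y, momentum
```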

3. TitansL2Dynamic

Same as TitansL2 but uses DynamicDense for Q/K/V projections instead of standard Dense layers. This provides input-dependent gating for more expressive memory operations.


4. CMSBlock (Simple)

Persistent knowledge storage via a standard MLP:

Sequential([
    Dense(4 * units, activation='gelu'),
    Dense(units),
    Dropout(rate),
])

Purpose: Stores static knowledge learned during training (like original Transformer FFN).


5. CMSLayer (Chunk-based)

Continuum Memory System with multi-frequency updates inspired by brain oscillations.

Key Innovation: Chunk-based Memory Updates

Instead of updating memory at every timestep (like TitansL2), CMSLayer accumulates updates and applies them at chunk boundaries:

# At each timestep:
pending_forget += (M @ k) @ k^T
pending_write  += v @ k^T

# At chunk boundaries (every chunk_size steps):
M += -α · (pending_forget / chunk_size) + β · (pending_write / chunk_size)

Architecture

Input ──┬── MLP Path ───────────────── x_static
        │      c_fc (4×units, gelu)
        │      c_proj (units)
        │
        └── Memory Path ────────────── y_mem
               c_key, c_val projections
               Chunk-based memory updates
               
Output = x_static + y_mem  (combined knowledge)

| Parameter | Description | Default |
|-----------|-------------|---------|
| chunk_size | Steps between memory updates | 16 |
| α, β | Forget/write rates | learnable |

Purpose: Lower-frequency updates consolidate information over longer timescales.
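
A sketch of how the per-timestep accumulation and the chunk-boundary flush could be written (the function name, single-head simplification, and shapes are assumptions for illustration):

```python
import tensorflow as tf

def cms_chunk_step(state, t, k, v, alpha, beta, chunk_size=16):
    """One timestep of chunk-based memory accumulation (illustrative sketch).

    state = (M, pending_forget, pending_write), each shaped [batch, dim, dim];
    k, v are the per-step key/value projections shaped [batch, dim].
    """
    M, pending_forget, pending_write = state

    # Accumulate the would-be updates at every timestep
    Mk = tf.einsum('bij,bj->bi', M, k)
    pending_forget += tf.einsum('bi,bj->bij', Mk, k)   # (M @ k) @ k^T
    pending_write  += tf.einsum('bi,bj->bij', v, k)    # v @ k^T

    # Apply the averaged update only at chunk boundaries
    at_boundary = tf.equal((t + 1) % chunk_size, 0)

    def apply_update():
        M_new = M - alpha * pending_forget / chunk_size + beta * pending_write / chunk_size
        return M_new, tf.zeros_like(pending_forget), tf.zeros_like(pending_write)

    def keep():
        return M, pending_forget, pending_write

    return tf.cond(at_boundary, apply_update, keep)
```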


6. FullHOPEBlock (Used in Model)

Combines high-frequency (TitansL2Dynamic) and low-frequency (CMSLayer) memory systems:

Input
  │
  ├──► LayerNorm ──► TitansL2Dynamic ──┐
  │                                    + (residual)
  ◄────────────────────────────────────┘
  │
  ├──► LayerNorm ──► CMSLayer ─────────┐
  │                                    + (residual)
  ◄────────────────────────────────────┘
  │
Output

Multi-Timescale Design

| Component | Update Frequency | Purpose |
|-----------|------------------|---------|
| TitansL2Dynamic | Every timestep | Fast adaptation, short-term dependencies |
| CMSLayer | Every 16 timesteps | Slow consolidation, long-term patterns |

This mirrors brain oscillation theory where different frequencies handle different cognitive functions.
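
A sketch of the pre-LayerNorm residual wiring, assuming the TitansL2Dynamic and CMSLayer classes from src/hope_layer.py (the import path and constructor arguments are assumptions):

```python
import tensorflow as tf
from hope_layer import TitansL2Dynamic, CMSLayer  # import path and constructor args assumed

class FullHOPEBlockSketch(tf.keras.layers.Layer):
    """Pre-LayerNorm residual wiring of the fast and slow memory paths (illustrative)."""

    def __init__(self, units, n_head=6, chunk_size=16, **kwargs):
        super().__init__(**kwargs)
        self.ln1 = tf.keras.layers.LayerNormalization()
        self.titans = TitansL2Dynamic(units=units, n_head=n_head)  # updates every timestep
        self.ln2 = tf.keras.layers.LayerNormalization()
        self.cms = CMSLayer(units=units, chunk_size=chunk_size)    # updates every chunk_size steps

    def call(self, x, training=False):
        x = x + self.titans(self.ln1(x), training=training)  # high-frequency path + residual
        x = x + self.cms(self.ln2(x), training=training)     # low-frequency path + residual
        return x
```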


Integration with Physics-Informed Model

In src/model.py, two FullHOPEBlock layers replace traditional LSTM:

# Input projection to match HOPE dimensions
self.input_proj = TimeDistributed(Dense(layer_size))

# Two stacked HOPE blocks
self.hope_block1 = FullHOPEBlock(units=layer_size, n_head=6, chunk_size=16)
self.hope_block2 = FullHOPEBlock(units=layer_size, n_head=6, chunk_size=16)

The HOPE blocks capture history-dependent material behavior, which is then used to predict internal state variables (z_i) for the thermodynamic model.
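
Conceptually, the forward pass through these layers might look like the following fragment; `internal_var_head` is a hypothetical name for the time-distributed dense layers described later, not necessarily the attribute used in the model code:

```python
# Illustrative sketch of the sequence path inside the model's call()
h = self.input_proj(normalized_inputs_seq)   # [batch, timesteps, layer_size]
h = self.hope_block1(h, training=training)   # fast + slow memory, residual block 1
h = self.hope_block2(h, training=training)   # residual block 2
z_i = self.internal_var_head(h)              # internal state variables z_i per timestep
```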


Why HOPE Instead of LSTM?

| Aspect | LSTM | HOPE |
|--------|------|------|
| Memory Type | Vector state | Matrix associative memory |
| Update Rule | Gated cell state | Delta Rule (gradient descent on memory) |
| Capacity | Fixed hidden size | O(head_dim²) per head |
| Multi-scale | Single timescale | High- and low-frequency systems |
| Interpretability | Opaque gates | Key-value associations |

🚀 Features

✨ Core Capabilities

  • Thermodynamic Consistency: Automatic enforcement of physical laws through custom loss functions
  • Multi-Scale Modeling: Handles effects from molecular (moisture) to macro (fiber orientation) scales
  • Environmental Sensitivity: Accounts for temperature, moisture content, and nanoparticle volume fraction
  • History Dependence: Captures path-dependent material behavior through internal variables
  • Experimental Data Driven: Trained directly on experimental stress-strain data

🔬 Scientific Features

  • Physics-Informed Architecture: Custom neural network layers that respect continuum mechanics
  • Automatic Stress Derivation: Stress computed as σ = 2∂Ψ/∂C using TensorFlow's automatic differentiation
  • Dissipation Monitoring: Real-time calculation and enforcement of non-negative energy dissipation
  • Free Energy Learning: Neural network approximation of Helmholtz free energy function

🛠️ Technical Features

  • Modular Design: Easily adaptable to different material systems
  • HOPE Integration: Advanced memory architecture for temporal modeling
  • Comprehensive Logging: Detailed training metrics and physics constraint monitoring
  • GPU Acceleration: Optimized for high-performance computing environments

📋 Requirements

System Requirements

  • Python: 3.8 or higher
  • GPU: NVIDIA GPU with CUDA support (recommended)
  • Memory: Minimum 8GB RAM, 16GB+ recommended for large datasets

Dependencies

# Core dependencies
tensorflow >= 2.8.0
numpy >= 1.21.0
scipy >= 1.7.0
matplotlib >= 3.5.0

# Optional but recommended
nvidia-cudnn-cu11  # For GPU acceleration

🔧 Installation

Option 1: Clone and Install

# Clone the repository
git clone https://github.com/BBahtiri/Deep-Learning-Constitutive-Model.git
cd Deep-Learning-Constitutive-Model

# Create virtual environment (recommended)
python -m venv physics_ai_env
source physics_ai_env/bin/activate  # On Windows: physics_ai_env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Option 2: Direct Installation

pip install tensorflow numpy scipy matplotlib
# Add nvidia-cudnn-cu11 for GPU support

📁 Project Structure

PHYSICS-AI/
├── 📄 Main_ML.py              # Main training and evaluation script
├── 🧠 DL_model.py             # Core neural network with HOPE architecture
├── 🔮 hope_layer.py           # HOPE module implementation (TitansL2, CMS, HopeBlock)
├── 🔧 misc.py                 # Data loading and preprocessing utilities
├── 📊 data_experiments_train/ # Training experimental data (.mat files)
├── 📊 data_experiments_validation/ # Validation experimental data
├── 📁 experiment_outputs_pinn/ # Generated results and outputs
├── 📁 checkpoints/            # Model checkpoints during training
├── 📄 extracted_paper_content.txt # Nested Learning paper reference
├── 📄 README.md              # This file
└── 🖼️ pinn.PNG               # Architecture diagram

🏃‍♂️ Quick Start

1. Prepare Your Data

Organize your experimental data in the following structure:

data_experiments_train/
├── epoxy_1_1_1_001.mat
├── epoxy_1_1_2_001.mat
└── ... (more .mat files)

data_experiments_validation/
├── epoxy_2_1_1_001.mat
└── ... (validation .mat files)

Expected .mat file contents:

  • expStress: Experimental stress data
  • trueStrain: True strain measurements
  • timeVec: Time vector for the experiment
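
As a quick sanity check, a single file can be inspected with scipy (the path and key names follow the list above):

```python
import numpy as np
import scipy.io

# Minimal sketch of reading one experiment file
mat = scipy.io.loadmat('data_experiments_train/epoxy_1_1_1_001.mat')
stress = np.asarray(mat['expStress']).reshape(-1)   # MPa
strain = np.asarray(mat['trueStrain']).reshape(-1)  # dimensionless
time   = np.asarray(mat['timeVec']).reshape(-1)     # seconds
print(stress.shape, strain.shape, time.shape)
```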

2. Configure Hyperparameters

Edit the hyperparameters in Main_ML.py:

# Network Architecture
layer_size = 24              # HOPE and Dense layer units (must be divisible by n_head)
layer_size_fenergy = 24      # Free energy network units
internal_variables = 6       # Number of internal state variables
n_head = 6                   # Number of attention heads in HOPE blocks

# Training Parameters
learning_rate = 0.001        # Initial learning rate
num_epochs = 2000           # Maximum training epochs
batch_size = 32             # Training batch size
timesteps = 500             # Sequence length for HOPE processing

3. Run Training

python Main_ML.py

4. Monitor Results

Training outputs are saved to structured directories:

  • ./final_predictions/ - Model predictions and internal states
  • ./stress_exact/ - Ground truth stress data
  • ./weights/ - Final trained model weights
  • ./checkpoints/ - Training checkpoints

📊 Data Format

Input Data Requirements

Your .mat files should contain:

| Variable | Description | Shape | Units |
|----------|-------------|-------|-------|
| expStress | Experimental stress | [n_timesteps, 1] | MPa |
| trueStrain | True strain | [n_timesteps, 1] | dimensionless |
| timeVec | Time vector | [n_timesteps, 1] | seconds |

Filename Convention

The code expects filenames in the format: epoxy_X_Y_Z_*.mat

  • X: Nanoparticle content indicator (1=0%, 2=5%, 3=10%)
  • Y: Moisture condition (1=dry, 2=saturated)
  • Z: Temperature condition (1=-20°C, 2=23°C, 3=50°C, 4=60°C)
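
A small helper along these lines can decode the condition codes (the function and lookup tables below are hypothetical, not part of the repository):

```python
import re

NP_CONTENT  = {1: '0%', 2: '5%', 3: '10%'}
MOISTURE    = {1: 'dry', 2: 'saturated'}
TEMPERATURE = {1: '-20°C', 2: '23°C', 3: '50°C', 4: '60°C'}

def decode_filename(name):
    """Map an epoxy_X_Y_Z_*.mat filename to its test conditions."""
    m = re.match(r'epoxy_(\d+)_(\d+)_(\d+)_.*\.mat$', name)
    if m is None:
        raise ValueError(f'Unexpected filename: {name}')
    x, y, z = (int(g) for g in m.groups())
    return NP_CONTENT[x], MOISTURE[y], TEMPERATURE[z]

print(decode_filename('epoxy_1_2_3_001.mat'))  # ('0%', 'saturated', '50°C')
```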

🧠 Model Architecture

Overall Framework

graph TD
    A[Input Sequence] --> B[Input Projection]
    B --> C[FullHOPE Block 1]
    C --> D[FullHOPE Block 2]
    D --> E[Dense Layers]
    E --> F[Internal Variables z_i]
    F --> G[Free Energy Network]
    A --> G
    G --> H[Free Energy Ψ]
    H --> I[Automatic Differentiation]
    I --> J[Stress σ = 2∂Ψ/∂C]
    F --> K[Dissipation Calculation]
    J --> L[Physics-Informed Loss]
    K --> L

Key Components

1. FullHOPE Blocks (Replacing LSTM)

  • TitansL2Dynamic: Multi-head associative memory with dynamic projections
  • CMSLayer: Continuum Memory System with chunk-based updates
  • Architecture: Pre-LayerNorm → TitansL2 → Residual → LayerNorm → CMS → Residual

2. TitansL2 Memory Update

# Delta Rule Memory Update
M_new = M_prev - α * forget_term + β * write_term
# where:
#   forget_term = (M @ k) @ k^T  (selective forgetting)
#   write_term = v @ k^T         (new information)

3. Internal Variable Prediction

  • Input: HOPE hidden states
  • Architecture: Time-distributed dense layers with swish activation
  • Output: Evolution of internal state variables (z_i)

4. Free Energy Network

  • Input: Internal variables + strain measure
  • Architecture: Dense layers with physics constraints
  • Constraints: Non-negative weights, softplus activation
  • Output: Helmholtz free energy (Ψ)
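
A minimal sketch of such a constrained sub-network, assuming Keras NonNeg kernel constraints and illustrative layer widths (the repository's exact architecture may differ):

```python
import tensorflow as tf

# Free-energy sub-network sketch: non-negative kernel weights and softplus
# activations, as listed above; the widths are illustrative.
free_energy_net = tf.keras.Sequential([
    tf.keras.layers.Dense(24, activation='softplus',
                          kernel_constraint=tf.keras.constraints.NonNeg()),
    tf.keras.layers.Dense(1, activation='softplus',
                          kernel_constraint=tf.keras.constraints.NonNeg()),
])
```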

5. Physics Enforcement

  • Stress Derivation: σ = 2∂Ψ/∂C via automatic differentiation
  • Dissipation: D = ∑ τ_i · ż_i where τ_i = -∂Ψ/∂z_i
  • Constraints: D ≥ 0 (thermodynamic consistency)
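
The following sketch shows how stress and dissipation can be obtained with tf.GradientTape; `psi_net`, `C`, `z`, and `z_dot` are placeholders rather than the repository's exact tensors:

```python
import tensorflow as tf

def stress_and_dissipation(psi_net, C, z, z_dot):
    """Illustrative physics-enforcement step (not the repository's exact code)."""
    with tf.GradientTape(persistent=True) as tape:
        tape.watch([C, z])
        psi = psi_net(tf.concat([C, z], axis=-1))       # Helmholtz free energy Psi(C, z)
    sigma = 2.0 * tape.gradient(psi, C)                 # stress: sigma = 2 dPsi/dC
    tau = -tape.gradient(psi, z)                        # thermodynamic forces tau_i = -dPsi/dz_i
    dissipation = tf.reduce_sum(tau * z_dot, axis=-1)   # D = sum_i tau_i * dz_i/dt
    penalty = tf.reduce_mean(tf.nn.relu(-dissipation))  # penalize violations of D >= 0
    del tape
    return sigma, dissipation, penalty
```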

⚙️ Advanced Usage

Custom Material Systems

To adapt for different materials:

  1. Modify data loading in misc.py:

def getData_exp(input_mat_file_path, target_sequence_length=1000):
    # Adapt for your data format
    mat_contents = scipy.io.loadmat(input_mat_file_path)
    # Modify key names as needed
    stress_raw = mat_contents['your_stress_key']
    # ... rest of implementation

  2. Adjust the HOPE architecture in DL_model.py:

# Modify number of heads (must divide layer_size evenly)
n_head = 4  # or 6, 8, etc.

# Adjust chunk size for CMS
self.hope_block1 = FullHOPEBlock(units=layer_size, n_head=n_head, chunk_size=32)

  3. Update the network architecture in DL_model.py:

# Adjust number of internal variables for your physics
internal_variables = 12  # Example: more complex material

Custom Physics Constraints

Add domain-specific physics in DL_model.py:

def call(self, normalized_inputs_seq):
    # ... existing code ...
    
    # Add your custom physics constraint
    custom_physics_penalty = your_physics_function(psi_final_full_sequence)
    self.add_loss(custom_physics_penalty * weight_factor)
    
    return norm_pred_stress_for_loss

Hyperparameter Tuning

Key hyperparameters to optimize:

# Architecture
layer_size = [24, 30, 48]           # Network capacity (must be divisible by n_head)
n_head = [4, 6, 8]                  # Number of attention heads
internal_variables = [6, 8, 12]     # Complexity of internal state
layer_size_fenergy = [20, 30, 50]   # Free energy network size

# Training
learning_rate = [1e-4, 1e-3, 5e-3]  # Learning rate schedule
batch_size = [16, 32, 64]           # Memory vs. gradient quality
timesteps = [250, 500, 1000]        # Sequence length vs. memory
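
A hypothetical sweep over these grids (the loop below only prints configurations; wire it into your own training entry point, e.g. a parameterized variant of Main_ML.py):

```python
import itertools

grid = {
    'layer_size': [24, 30, 48],
    'n_head': [4, 6],
    'internal_variables': [6, 8, 12],
    'learning_rate': [1e-4, 1e-3, 5e-3],
}
for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    if config['layer_size'] % config['n_head'] != 0:
        continue  # layer_size must be divisible by n_head
    print('would train with:', config)
```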

📈 Evaluation and Visualization

Training Metrics

The framework automatically tracks:

  • Primary Loss: Mean Absolute Error on stress prediction
  • Physics Penalties: Dissipation and free energy constraints
  • Validation Performance: Generalization metrics

Output Analysis

Post-training analysis includes:

  • Stress-Strain Curves: Compare predictions vs. experiments
  • Internal Variable Evolution: Track material state changes
  • Free Energy Landscapes: Visualize thermodynamic surfaces
  • Dissipation Monitoring: Verify physics compliance

Visualization Example

import matplotlib.pyplot as plt
import numpy as np

# Load results
stress_pred = np.loadtxt('./final_predictions/stress_pred_unnorm_0.txt')
stress_true = np.loadtxt('./stress_exact/stress_unnorm_0.txt')
strain = np.loadtxt('./strain/strain_unnorm_0.txt')

# Plot stress-strain comparison
plt.figure(figsize=(10, 6))
plt.plot(strain[1:], stress_true, 'b-', label='Experimental', linewidth=2)
plt.plot(strain[1:], stress_pred, 'r--', label='PIDL-HOPE Prediction', linewidth=2)
plt.xlabel('Strain')
plt.ylabel('Stress (MPa)')
plt.legend()
plt.grid(True)
plt.title('PIDL-HOPE Model Performance')
plt.show()

🔬 Scientific Background

Theoretical Foundation

This implementation is based on:

  1. Thermodynamically Consistent Framework:

    • Helmholtz Free Energy: Ψ(C, z_i, θ) defines the material's thermodynamic state
    • Stress Derivation: σ = 2∂Ψ/∂C (from continuum mechanics)
    • Evolution Laws: ż_i governed by thermodynamic forces τ_i = -∂Ψ/∂z_i
    • Dissipation: D = ∑τ_i·ż_i = -∑(∂Ψ/∂z_i)·ż_i ≥ 0 (second law of thermodynamics)
  2. Nested Learning Theory (Behrouz et al.):

    • Multi-frequency memory updates inspired by brain oscillations
    • Delta Rule associative memory for temporal dependencies
    • Continuum Memory System for knowledge consolidation

Key Advantages

  • Physics Consistency: Guaranteed satisfaction of thermodynamic laws
  • Interpretability: Internal variables have physical meaning
  • Generalization: Physics constraints improve extrapolation
  • Data Efficiency: Physics guidance reduces data requirements
  • Advanced Memory: HOPE architecture captures complex temporal patterns

📚 Citation

If you use this code in your research, please cite:

@article{bahtiri2024thermodynamically,
  title={A thermodynamically consistent physics-informed deep learning material model for short fiber/polymer nanocomposites},
  author={Bahtiri, Betim and Arash, Behrouz and Scheffler, Sven and Jux, Maximilian and Rolfes, Raimund},
  journal={Computer Methods in Applied Mechanics and Engineering},
  volume={427},
  pages={117038},
  year={2024},
  publisher={Elsevier},
  doi={10.1016/j.cma.2024.117038}
}

@article{behrouz2025nested,
  title={Nested Learning: The Illusion of Deep Learning Architecture},
  author={Behrouz, Ali and Razaviyayn, Meisam and Zhong, Peilin and Mirrokni, Vahab},
  journal={Neural Information Processing Systems (NeurIPS)},
  year={2025},
  url={https://arxiv.org/abs/2512.24695}
}

Areas for Contribution

  • New Material Systems: Adapt the framework for metals, ceramics, or biological materials
  • Enhanced Physics: Add new thermodynamic constraints
  • HOPE Extensions: Experiment with different memory configurations
  • Optimization: Improve computational efficiency
  • Visualization: Enhanced plotting and analysis tools

Development Setup

# Fork and clone your fork
git clone https://github.com/BBahtiri/Deep-Learning-Constitutive-Model.git
cd Deep-Learning-Constitutive-Model

# Create development environment
python -m venv dev_env
source dev_env/bin/activate

# Install in development mode
pip install -e .
pip install -r requirements-dev.txt  # Include testing dependencies

# Run tests
python -m pytest tests/

🐛 Troubleshooting

Common Issues

GPU Memory Errors

# In Main_ML.py, reduce batch size or sequence length
batch_size = 16  # Reduce from 32
timesteps = 250  # Reduce from 500

Layer Size / Head Compatibility

# layer_size must be divisible by n_head
layer_size = 24  # Works with n_head = 4, 6, 8
n_head = 6       # 24 / 6 = 4 (valid)

Convergence Issues

# Try different learning rates or architectures
learning_rate = 5e-4  # Adjust learning rate
layer_size = 30       # Try different capacity

Data Loading Errors

  • Verify .mat file structure matches expected format
  • Check filename conventions match parsing logic
  • Ensure sufficient data files in train/validation directories
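
A hypothetical pre-flight script to catch these issues before training (key names and the filename pattern follow the Data Format section above):

```python
import glob
import os
import re
import scipy.io

REQUIRED_KEYS = ('expStress', 'trueStrain', 'timeVec')
PATTERN = re.compile(r'epoxy_\d+_\d+_\d+_.*\.mat$')

for path in sorted(glob.glob('data_experiments_train/*.mat')):
    name = os.path.basename(path)
    if not PATTERN.match(name):
        print(f'[warn] unexpected filename: {name}')
    keys = scipy.io.loadmat(path).keys()
    missing = [k for k in REQUIRED_KEYS if k not in keys]
    if missing:
        print(f'[warn] {name} is missing {missing}')
```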

Getting Help


⭐ Star this repository if you find it useful!
