A Physics-Informed Deep Learning (PIDL) framework for modeling complex material behavior while enforcing thermodynamic consistency. This repository implements a novel approach that combines HOPE (Nested Learning) architecture with Feed-Forward Neural Networks (FFNNs) to predict the mechanical response of nanoparticle-filled epoxy composites under varying ambient conditions.
This framework addresses the challenge of modeling complex material behavior by:
- Enforcing Physics: Incorporating thermodynamic principles directly into the neural network architecture
- Capturing History: Using HOPE blocks (Titans Memory + CMS) to model history-dependent material behavior through internal state variables
- Ensuring Consistency: Guaranteeing thermodynamic consistency through physics-based loss terms
- Handling Complexity: Managing multi-scale effects from temperature, moisture, and nanoparticle content variations
The model uniquely combines:
- HOPE (Nested Learning) Architecture → Based on the paper "Nested Learning: The Illusion of Deep Learning Architecture" (Behrouz et al.)
- TitansL2 Memory Module → Adaptive memory with Delta Rule updates for capturing temporal dependencies
- Continuum Memory System (CMS) → Multi-frequency memory consolidation for persistent knowledge storage
- FFNN for Free Energy → Approximates the material's thermodynamic state
- Automatic Differentiation → Derives stress from free energy (∂Ψ/∂C)
- Physics Constraints → Enforce non-negative dissipation and thermodynamic laws
This section provides a detailed overview of the HOPE (Nested Learning) implementation in src/hope_layer.py, which replaces traditional LSTM layers with a more expressive memory system inspired by neuroscience principles.
```
src/hope_layer.py
├── DynamicDense       # GLU-style gated projections
├── TitansL2           # Base memory with Delta Rule (standard projections)
├── TitansL2Dynamic    # Memory with Dynamic (gated) projections
├── CMSBlock           # Simple MLP for persistent storage
├── CMSLayer           # Chunk-based memory with slower updates
├── HopeBlock          # TitansL2 + CMSBlock
├── HopeBlockDynamic   # TitansL2Dynamic + CMSBlock
└── FullHOPEBlock      # TitansL2Dynamic + CMSLayer (used in model)
```
GLU-style gated projection that modulates the output based on input content:

```
y = (x @ W_static) * SiLU(x @ W_gate)
```

| Component | Description |
|---|---|
| `W_static` | Static projection weights |
| `W_gate` | Gating weights with SiLU activation |
| Output | Element-wise product of projection and gate |
Purpose: Provides input-dependent gating for more expressive Q/K/V projections.
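For illustration, a minimal Keras sketch of this gating pattern (bias-free projections assumed; the actual layer in `src/hope_layer.py` may differ in initialization and bias handling):

```python
import tensorflow as tf

class DynamicDense(tf.keras.layers.Layer):
    """GLU-style gated projection: y = (x @ W_static) * SiLU(x @ W_gate)."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        d_in = int(input_shape[-1])
        self.w_static = self.add_weight(
            name="w_static", shape=(d_in, self.units), initializer="glorot_uniform")
        self.w_gate = self.add_weight(
            name="w_gate", shape=(d_in, self.units), initializer="glorot_uniform")

    def call(self, x):
        # Content-dependent gate modulates the static projection elementwise.
        return tf.matmul(x, self.w_static) * tf.nn.silu(tf.matmul(x, self.w_gate))
```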
The core associative memory with Delta Rule updates. Processes sequences step-by-step using tf.scan.
```
M_new = M_prev - α · forget_term + β · write_term

# where:
forget_term = (M @ k) @ k^T   # Selective forgetting
write_term  = v @ k^T         # New information
```
| Parameter | Description | Value |
|---|---|---|
| `head_dim` | Dimension per attention head | `units // n_head` |
| `n_head` | Number of parallel memory heads | 6 (configurable) |
| α (alpha) | Forgetting rate | `sigmoid(α_raw) × 0.8 ∈ [0, 0.8]` |
| β (beta) | Writing rate | `sigmoid(β_raw) × 0.8 ∈ [0, 0.8]` |
| Memory shape | Per-head memory matrix | `[batch, n_head, head_dim, head_dim]` |
1. Project inputs → Q, K, V (via Dense or DynamicDense)
2. L2-normalize K and Q (gradient stability)
3. Reshape to multi-head `[B, T, n_head, head_dim]`
4. `tf.scan` over timesteps:
   - Read: `y = M @ q`
   - Forget: `M -= α · (M @ k) @ k^T`
   - Write: `M += β · v @ k^T`
5. Project output (`c_proj`)
When use_momentum=True, updates are smoothed via exponential moving average:
```
δ = -α · forget_term + β · write_term
momentum_new = β_m · momentum_prev + (1 - β_m) · δ
M_new = M_prev + momentum_new
```
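To make the recurrence concrete, here is a single-head sketch of the scan loop including the momentum variant; the fixed `alpha`, `beta`, and `beta_m` constants stand in for the layer's learnable, sigmoid-clamped rates, and q/k are assumed already L2-normalized:

```python
import tensorflow as tf

def delta_rule_scan(q_seq, k_seq, v_seq, alpha=0.1, beta=0.5, beta_m=0.9):
    """Delta Rule memory over a sequence (single head, time-major sketch).

    q_seq, k_seq, v_seq: [T, batch, d, 1]; batch may be dynamic, d must be static.
    Returns the per-timestep reads y_t = M_t @ q_t, shape [T, batch, d, 1].
    """
    batch = tf.shape(q_seq)[1]
    d = q_seq.shape[2]
    M0 = tf.zeros([batch, d, d])    # associative memory matrix
    mom0 = tf.zeros([batch, d, d])  # momentum buffer (use_momentum=True)
    y0 = tf.zeros([batch, d, 1])    # placeholder for the read output

    def step(carry, qkv):
        M, mom, _ = carry
        q, k, v = qkv
        y = tf.matmul(M, q)                                       # read: M @ q
        forget = tf.matmul(tf.matmul(M, k), k, transpose_b=True)  # (M @ k) @ k^T
        write = tf.matmul(v, k, transpose_b=True)                 # v @ k^T
        delta = -alpha * forget + beta * write
        mom = beta_m * mom + (1.0 - beta_m) * delta               # EMA smoothing
        return (M + mom, mom, y)

    _, _, y_seq = tf.scan(step, (q_seq, k_seq, v_seq), initializer=(M0, mom0, y0))
    return y_seq
```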
Same as TitansL2 but uses DynamicDense for Q/K/V projections instead of standard Dense layers. This provides input-dependent gating for more expressive memory operations.
Persistent knowledge storage via a standard MLP:
```python
Sequential([
    Dense(4 * units, activation='gelu'),
    Dense(units),
    Dropout(rate),
])
```

Purpose: Stores static knowledge learned during training (like the original Transformer FFN).
Continuum Memory System with multi-frequency updates inspired by brain oscillations.
Instead of updating memory at every timestep (like TitansL2), CMSLayer accumulates updates and applies them at chunk boundaries:
```
# At each timestep:
pending_forget += (M @ k) @ k^T
pending_write  += v @ k^T

# At chunk boundaries (every chunk_size steps):
M += -α · (pending_forget / chunk_size) + β · (pending_write / chunk_size)
```

```
Input ──┬── MLP Path ───────────────── x_static
        │   c_fc (4×units, gelu)
        │   c_proj (units)
        │
        └── Memory Path ────────────── y_mem
            c_key, c_val projections
            Chunk-based memory updates

Output = x_static + y_mem  (combined knowledge)
```
| Parameter | Description | Default |
|---|---|---|
| `chunk_size` | Steps between memory updates | 16 |
| `α, β` | Forget/write rates | learnable |
Purpose: Lower-frequency updates consolidate information over longer timescales.
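Because the memory is frozen between chunk boundaries, the per-step accumulations collapse into two batched matrix products. A minimal sketch of one boundary update (constant `alpha`/`beta` stand in for the learnable rates):

```python
import tensorflow as tf

def cms_chunk_update(M, k_chunk, v_chunk, alpha=0.1, beta=0.5):
    """Apply one chunk-boundary update to the CMS memory.

    M: [batch, d, d]; k_chunk, v_chunk: [batch, chunk_size, d].
    Within a chunk M is constant, so
      sum_t (M @ k_t) @ k_t^T = (M @ K) @ K^T  and  sum_t v_t @ k_t^T = V @ K^T,
    with K, V the chunk's keys/values stacked column-wise.
    """
    chunk_size = tf.cast(tf.shape(k_chunk)[1], M.dtype)
    K = tf.transpose(k_chunk, [0, 2, 1])  # [batch, d, chunk_size]
    V = tf.transpose(v_chunk, [0, 2, 1])  # [batch, d, chunk_size]
    pending_forget = tf.matmul(tf.matmul(M, K), K, transpose_b=True)  # Σ (M k)kᵀ
    pending_write = tf.matmul(V, K, transpose_b=True)                 # Σ v kᵀ
    return M + (-alpha * pending_forget + beta * pending_write) / chunk_size
```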
Combines high-frequency (TitansL2Dynamic) and low-frequency (CMSLayer) memory systems:
```
Input
  │
  ├──► LayerNorm ──► TitansL2Dynamic ──┐
  │                        + (residual)
  ◄────────────────────────────────────┘
  │
  ├──► LayerNorm ──► CMSLayer ─────────┐
  │                        + (residual)
  ◄────────────────────────────────────┘
  │
Output
```
| Component | Update Frequency | Purpose |
|---|---|---|
| TitansL2Dynamic | Every timestep | Fast adaptation, short-term dependencies |
| CMSLayer | Every 16 timesteps | Slow consolidation, long-term patterns |
This mirrors brain oscillation theory where different frequencies handle different cognitive functions.
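Structurally, the block is two pre-LayerNorm residual sublayers. A sketch assuming `TitansL2Dynamic` and `CMSLayer` are importable from `hope_layer.py` with the constructor arguments shown (the signatures are assumptions, not verified):

```python
import tensorflow as tf
from hope_layer import TitansL2Dynamic, CMSLayer  # assumed import and signatures

class FullHOPEBlockSketch(tf.keras.layers.Layer):
    """Fast (per-step) + slow (per-chunk) memory in two residual sublayers."""

    def __init__(self, units, n_head=6, chunk_size=16, **kwargs):
        super().__init__(**kwargs)
        self.ln1 = tf.keras.layers.LayerNormalization()
        self.titans = TitansL2Dynamic(units=units, n_head=n_head)  # high frequency
        self.ln2 = tf.keras.layers.LayerNormalization()
        self.cms = CMSLayer(units=units, chunk_size=chunk_size)    # low frequency

    def call(self, x):
        x = x + self.titans(self.ln1(x))  # fast sublayer + residual
        x = x + self.cms(self.ln2(x))     # slow sublayer + residual
        return x
```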
In src/model.py, two FullHOPEBlock layers replace traditional LSTM:
```python
# Input projection to match HOPE dimensions
self.input_proj = TimeDistributed(Dense(layer_size))

# Two stacked HOPE blocks
self.hope_block1 = FullHOPEBlock(units=layer_size, n_head=6, chunk_size=16)
self.hope_block2 = FullHOPEBlock(units=layer_size, n_head=6, chunk_size=16)
```

The HOPE blocks capture history-dependent material behavior, which is then used to predict internal state variables (z_i) for the thermodynamic model.
| Aspect | LSTM | HOPE |
|---|---|---|
| Memory Type | Vector state | Matrix associative memory |
| Update Rule | Gated cell state | Delta Rule (gradient descent on memory) |
| Capacity | Fixed hidden size | O(head_dim²) per head |
| Multi-scale | Single timescale | High + low frequency systems |
| Interpretability | Opaque gates | Key-value associations |
- Thermodynamic Consistency: Automatic enforcement of physical laws through custom loss functions
- Multi-Scale Modeling: Handles effects from molecular (moisture) to macro (fiber orientation) scales
- Environmental Sensitivity: Accounts for temperature, moisture content, and nanoparticle volume fraction
- History Dependence: Captures path-dependent material behavior through internal variables
- Experimental Data Driven: Trained directly on experimental stress-strain data
- Physics-Informed Architecture: Custom neural network layers that respect continuum mechanics
- Automatic Stress Derivation: Stress computed as σ = 2∂Ψ/∂C using TensorFlow's automatic differentiation
- Dissipation Monitoring: Real-time calculation and enforcement of non-negative energy dissipation
- Free Energy Learning: Neural network approximation of Helmholtz free energy function
- Modular Design: Easily adaptable to different material systems
- HOPE Integration: Advanced memory architecture for temporal modeling
- Comprehensive Logging: Detailed training metrics and physics constraint monitoring
- GPU Acceleration: Optimized for high-performance computing environments
- Python: 3.8 or higher
- GPU: NVIDIA GPU with CUDA support (recommended)
- Memory: Minimum 8GB RAM, 16GB+ recommended for large datasets
```
# Core dependencies
tensorflow >= 2.8.0
numpy >= 1.21.0
scipy >= 1.7.0
matplotlib >= 3.5.0

# Optional but recommended
nvidia-cudnn-cu11  # For GPU acceleration
```

```bash
# Clone the repository
git clone https://github.com/BBahtiri/Deep-Learning-Constitutive-Model.git
cd Deep-Learning-Constitutive-Model

# Create virtual environment (recommended)
python -m venv physics_ai_env
source physics_ai_env/bin/activate  # On Windows: physics_ai_env\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Alternatively, install the dependencies directly:

```bash
pip install tensorflow numpy scipy matplotlib
# Add nvidia-cudnn-cu11 for GPU support
```

```
PHYSICS-AI/
├── 📄 Main_ML.py                    # Main training and evaluation script
├── 🧠 DL_model.py                   # Core neural network with HOPE architecture
├── 🔮 hope_layer.py                 # HOPE module implementation (TitansL2, CMS, HopeBlock)
├── 🔧 misc.py                       # Data loading and preprocessing utilities
├── 📊 data_experiments_train/       # Training experimental data (.mat files)
├── 📊 data_experiments_validation/  # Validation experimental data
├── 📁 experiment_outputs_pinn/      # Generated results and outputs
├── 📁 checkpoints/                  # Model checkpoints during training
├── 📄 extracted_paper_content.txt   # Nested Learning paper reference
├── 📄 README.md                     # This file
└── 🖼️ pinn.PNG                      # Architecture diagram
```
Organize your experimental data in the following structure:
```
data_experiments_train/
├── epoxy_1_1_1_001.mat
├── epoxy_1_1_2_001.mat
└── ... (more .mat files)

data_experiments_validation/
├── epoxy_2_1_1_001.mat
└── ... (validation .mat files)
```
Expected `.mat` file contents:
- `expStress`: Experimental stress data
- `trueStrain`: True strain measurements
- `timeVec`: Time vector for the experiment
Edit the hyperparameters in Main_ML.py:
```python
# Network Architecture
layer_size = 24          # HOPE and Dense layer units (must be divisible by n_head)
layer_size_fenergy = 24  # Free energy network units
internal_variables = 6   # Number of internal state variables
n_head = 6               # Number of attention heads in HOPE blocks

# Training Parameters
learning_rate = 0.001    # Initial learning rate
num_epochs = 2000        # Maximum training epochs
batch_size = 32          # Training batch size
timesteps = 500          # Sequence length for HOPE processing
```

Then run the training script:

```bash
python Main_ML.py
```

Training outputs are saved to structured directories:
- `./final_predictions/` - Model predictions and internal states
- `./stress_exact/` - Ground truth stress data
- `./weights/` - Final trained model weights
- `./checkpoints/` - Training checkpoints
Your .mat files should contain:
| Variable | Description | Shape | Units |
|---|---|---|---|
| `expStress` | Experimental stress | `[n_timesteps, 1]` | MPa |
| `trueStrain` | True strain | `[n_timesteps, 1]` | dimensionless |
| `timeVec` | Time vector | `[n_timesteps, 1]` | seconds |
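For smoke-testing the data pipeline without real experiments, a minimal sketch that writes one synthetic `.mat` file with these keys (the linear stress ramp and the modulus value are purely illustrative):

```python
import numpy as np
import scipy.io

n = 1000
time_vec = np.linspace(0.0, 10.0, n).reshape(-1, 1)     # seconds
true_strain = np.linspace(0.0, 0.05, n).reshape(-1, 1)  # dimensionless
exp_stress = 2500.0 * true_strain                       # MPa, illustrative linear response

# Keys match the expected .mat structure above.
scipy.io.savemat(
    "data_experiments_train/epoxy_1_1_2_001.mat",
    {"expStress": exp_stress, "trueStrain": true_strain, "timeVec": time_vec},
)
```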
The code expects filenames in the format `epoxy_X_Y_Z_*.mat` (a parsing sketch follows the list):
- X: Nanoparticle content indicator (1=0%, 2=5%, 3=10%)
- Y: Moisture condition (1=dry, 2=saturated)
- Z: Temperature condition (1=-20°C, 2=23°C, 3=50°C, 4=60°C)
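A hypothetical decoding of this scheme (the actual parsing lives in `misc.py` and may differ; `decode_conditions` is illustrative, not a repository function):

```python
import re

NP_CONTENT = {1: "0%", 2: "5%", 3: "10%"}
MOISTURE = {1: "dry", 2: "saturated"}
TEMPERATURE = {1: "-20°C", 2: "23°C", 3: "50°C", 4: "60°C"}

def decode_conditions(filename):
    """Map an epoxy_X_Y_Z_*.mat filename to its experimental conditions."""
    m = re.match(r"epoxy_(\d)_(\d)_(\d)_.*\.mat$", filename)
    if m is None:
        raise ValueError(f"Unexpected filename: {filename}")
    x, y, z = (int(g) for g in m.groups())
    return NP_CONTENT[x], MOISTURE[y], TEMPERATURE[z]

print(decode_conditions("epoxy_1_1_2_001.mat"))  # ('0%', 'dry', '23°C')
```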
```mermaid
graph TD
    A[Input Sequence] --> B[Input Projection]
    B --> C[FullHOPE Block 1]
    C --> D[FullHOPE Block 2]
    D --> E[Dense Layers]
    E --> F[Internal Variables z_i]
    F --> G[Free Energy Network]
    A --> G
    G --> H[Free Energy Ψ]
    H --> I[Automatic Differentiation]
    I --> J[Stress σ = 2∂Ψ/∂C]
    F --> K[Dissipation Calculation]
    J --> L[Physics-Informed Loss]
    K --> L
```
- TitansL2Dynamic: Multi-head associative memory with dynamic projections
- CMSLayer: Continuum Memory System with chunk-based updates
- Architecture: Pre-LayerNorm → TitansL2 → Residual → LayerNorm → CMS → Residual
```
# Delta Rule Memory Update
M_new = M_prev - α * forget_term + β * write_term

# where:
# forget_term = (M @ k) @ k^T  (selective forgetting)
# write_term  = v @ k^T        (new information)
```

- Input: HOPE hidden states
- Architecture: Time-distributed dense layers with swish activation
- Output: Evolution of internal state variables (z_i)
- Input: Internal variables + strain measure
- Architecture: Dense layers with physics constraints
- Constraints: Non-negative weights, softplus activation
- Output: Helmholtz free energy (Ψ)
- Stress Derivation: σ = 2∂Ψ/∂C via automatic differentiation
- Dissipation: D = ∑τ_i·ż_i where τ_i = -∂Ψ/∂z_i
- Constraints: D ≥ 0 (thermodynamic consistency; a sketch of this computation follows)
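A hedged sketch of how the physics core fits together in TensorFlow; `psi_net`, the shapes, and the use of a plain hinge penalty are illustrative assumptions, not the repository's exact API:

```python
import tensorflow as tf

def stress_and_dissipation(psi_net, C, z, z_dot):
    """Stress via autodiff of the free-energy network, plus a hinge
    penalty enforcing non-negative dissipation.

    psi_net:  callable (C, z) -> scalar free energy per sample (assumed)
    C:        strain measure, shape [batch, n_c]
    z, z_dot: internal variables and their rates, shape [batch, n_z]
    """
    with tf.GradientTape(persistent=True) as tape:
        tape.watch([C, z])
        psi = psi_net(C, z)                    # Helmholtz free energy Ψ(C, z)
    sigma = 2.0 * tape.gradient(psi, C)        # σ = 2 ∂Ψ/∂C
    tau = -tape.gradient(psi, z)               # thermodynamic forces τ = -∂Ψ/∂z
    del tape

    dissipation = tf.reduce_sum(tau * z_dot, axis=-1)   # D = Σ τ_i ż_i
    penalty = tf.reduce_mean(tf.nn.relu(-dissipation))  # nonzero only where D < 0
    return sigma, dissipation, penalty
```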
To adapt for different materials:
- Modify data loading in `misc.py`:

```python
def getData_exp(input_mat_file_path, target_sequence_length=1000):
    # Adapt for your data format
    mat_contents = scipy.io.loadmat(input_mat_file_path)
    # Modify key names as needed
    stress_raw = mat_contents['your_stress_key']
    # ... rest of implementation
```

- Adjust the HOPE architecture in `DL_model.py`:

```python
# Modify number of heads (must divide layer_size evenly)
n_head = 4  # or 6, 8, etc.

# Adjust chunk size for CMS
self.hope_block1 = FullHOPEBlock(units=layer_size, n_head=n_head, chunk_size=32)
```

- Update the network architecture in `DL_model.py`:

```python
# Adjust number of internal variables for your physics
internal_variables = 12  # Example: more complex material
```

- Add domain-specific physics in `DL_model.py`:

```python
def call(self, normalized_inputs_seq):
    # ... existing code ...

    # Add your custom physics constraint
    custom_physics_penalty = your_physics_function(psi_final_full_sequence)
    self.add_loss(custom_physics_penalty * weight_factor)

    return norm_pred_stress_for_loss
```

Key hyperparameters to optimize:
```python
# Architecture
layer_size = [24, 30, 48]           # Network capacity (must be divisible by n_head)
n_head = [4, 6, 8]                  # Number of attention heads
internal_variables = [6, 8, 12]     # Complexity of internal state
layer_size_fenergy = [20, 30, 50]   # Free energy network size

# Training
learning_rate = [1e-4, 1e-3, 5e-3]  # Learning rate schedule
batch_size = [16, 32, 64]           # Memory vs. gradient quality
timesteps = [250, 500, 1000]        # Sequence length vs. memory
```

The framework automatically tracks:
- Primary Loss: Mean Absolute Error on stress prediction
- Physics Penalties: Dissipation and free energy constraints
- Validation Performance: Generalization metrics
Post-training analysis includes:
- Stress-Strain Curves: Compare predictions vs. experiments
- Internal Variable Evolution: Track material state changes
- Free Energy Landscapes: Visualize thermodynamic surfaces
- Dissipation Monitoring: Verify physics compliance
```python
import matplotlib.pyplot as plt
import numpy as np

# Load results
stress_pred = np.loadtxt('./final_predictions/stress_pred_unnorm_0.txt')
stress_true = np.loadtxt('./stress_exact/stress_unnorm_0.txt')
strain = np.loadtxt('./strain/strain_unnorm_0.txt')

# Plot stress-strain comparison
plt.figure(figsize=(10, 6))
plt.plot(strain[1:], stress_true, 'b-', label='Experimental', linewidth=2)
plt.plot(strain[1:], stress_pred, 'r--', label='PIDL-HOPE Prediction', linewidth=2)
plt.xlabel('Strain')
plt.ylabel('Stress (MPa)')
plt.legend()
plt.grid(True)
plt.title('PIDL-HOPE Model Performance')
plt.show()
```

This implementation is based on:
- Thermodynamically Consistent Framework (a compact derivation follows this list):
  - Helmholtz Free Energy: Ψ(C, z_i, θ) defines the material's thermodynamic state
  - Stress Derivation: σ = 2∂Ψ/∂C (from continuum mechanics)
  - Evolution Laws: ż_i governed by thermodynamic forces τ_i = -∂Ψ/∂z_i
  - Dissipation: D = ∑τ_i·ż_i ≥ 0 (second law of thermodynamics)
- Nested Learning Theory (Behrouz et al.):
  - Multi-frequency memory updates inspired by brain oscillations
  - Delta Rule associative memory for temporal dependencies
  - Continuum Memory System for knowledge consolidation
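For reference, the standard Coleman–Noll argument behind these relations (isothermal case):

```latex
% Isothermal Clausius--Duhem inequality with internal variables z_i:
\mathcal{D} \;=\; \tfrac{1}{2}\,\mathbf{S} : \dot{\mathbf{C}} \;-\; \dot{\Psi} \;\ge\; 0,
\qquad
\dot{\Psi} \;=\; \frac{\partial \Psi}{\partial \mathbf{C}} : \dot{\mathbf{C}}
              \;+\; \sum_i \frac{\partial \Psi}{\partial z_i}\,\dot{z}_i .

% \dot{C} can be chosen arbitrarily, so its coefficient must vanish,
% leaving the stress relation and a purely dissipative remainder:
\mathbf{S} \;=\; 2\,\frac{\partial \Psi}{\partial \mathbf{C}},
\qquad
\mathcal{D} \;=\; -\sum_i \frac{\partial \Psi}{\partial z_i}\,\dot{z}_i
            \;=\; \sum_i \tau_i\,\dot{z}_i \;\ge\; 0,
\qquad \tau_i := -\frac{\partial \Psi}{\partial z_i}.
```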
- Physics Consistency: Guaranteed satisfaction of thermodynamic laws
- Interpretability: Internal variables have physical meaning
- Generalization: Physics constraints improve extrapolation
- Data Efficiency: Physics guidance reduces data requirements
- Advanced Memory: HOPE architecture captures complex temporal patterns
If you use this code in your research, please cite:
```bibtex
@article{bahtiri2024thermodynamically,
  title={A thermodynamically consistent physics-informed deep learning material model for short fiber/polymer nanocomposites},
  author={Bahtiri, Betim and Arash, Behrouz and Scheffler, Sven and Jux, Maximilian and Rolfes, Raimund},
  journal={Computer Methods in Applied Mechanics and Engineering},
  volume={427},
  pages={117038},
  year={2024},
  publisher={Elsevier},
  doi={10.1016/j.cma.2024.117038}
}
```
```bibtex
@article{behrouz2025nested,
  title={Nested Learning: The Illusion of Deep Learning Architecture},
  author={Behrouz, Ali and Razaviyayn, Meisam and Zhong, Peilin and Mirrokni, Vahab},
  journal={Neural Information Processing Systems (NeurIPS)},
  year={2025},
  url={https://arxiv.org/abs/2512.24695}
}
```

- New Material Systems: Adapt the framework for metals, ceramics, and biological materials
- Enhanced Physics: Add new thermodynamic constraints
- HOPE Extensions: Experiment with different memory configurations
- Optimization: Improve computational efficiency
- Visualization: Enhanced plotting and analysis tools
```bash
# Fork and clone your fork
git clone https://github.com/BBahtiri/Deep-Learning-Constitutive-Model.git
cd Deep-Learning-Constitutive-Model

# Create development environment
python -m venv dev_env
source dev_env/bin/activate

# Install in development mode
pip install -e .
pip install -r requirements-dev.txt  # Include testing dependencies

# Run tests
python -m pytest tests/
```

**GPU Memory Errors**
```python
# In Main_ML.py, reduce batch size or sequence length
batch_size = 16  # Reduce from 32
timesteps = 250  # Reduce from 500
```

**Layer Size / Head Compatibility**
```python
# layer_size must be divisible by n_head
layer_size = 24  # Works with n_head = 4, 6, 8
n_head = 6       # 24 / 6 = 4 (valid)
```

**Convergence Issues**
```python
# Try different learning rates or architectures
learning_rate = 5e-4  # Adjust learning rate
layer_size = 30       # Try different capacity
```

**Data Loading Errors**
- Verify `.mat` file structure matches the expected format
- Check filename conventions match the parsing logic
- Ensure sufficient data files in the train/validation directories
- 📧 Email: betimbahtiri@outlook.de
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
⭐ Star this repository if you find it useful!
