SungminLee511/VortexGPU

VortexGPU

A GPU-Accelerated Unsteady Vortex Method Solver for Multi-Body Aerospace Problems


What This Is

VortexGPU is a mesh-free, Lagrangian vortex method solver that simulates unsteady incompressible aerodynamic flows around single and multiple interacting bodies. It is built entirely from scratch — no external CFD libraries — and is GPU-native, leveraging fast N-body summation algorithms (treecode and fast multipole method) to achieve O(N log N) or O(N) computational scaling.

The solver tracks discrete vortex particles (vortex blobs) as they are shed from body surfaces and advected through the flow field. At every timestep, the velocity of each particle is computed from the collective influence of all other particles via the Biot-Savart law — an N-body problem that maps naturally onto GPU hardware.
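The direct summation described above can be sketched in a few lines. This is a pure-Python illustration of the O(N²) Biot-Savart sum for 2D point vortices, not the solver's GPU kernel; the function name is hypothetical, and self-induction is simply skipped (a point vortex induces no velocity on itself).

```python
import math

def biot_savart_direct(xs, ys, gammas):
    """Velocity induced at each particle by all other particles.

    2D point vortex of strength Gamma at the origin induces
    u = -Gamma/(2*pi) * y/r^2,  v = +Gamma/(2*pi) * x/r^2.
    """
    n = len(xs)
    us = [0.0] * n
    vs = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip self-induction
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            r2 = dx * dx + dy * dy
            coef = gammas[j] / (2.0 * math.pi * r2)
            us[i] += -coef * dy
            vs[i] += coef * dx
    return us, vs

# A counter-signed vortex pair a unit distance apart translates as a unit:
# each vortex induces speed Gamma/(2*pi*d) on the other.
u, v = biot_savart_direct([0.0, 1.0], [0.0, 0.0], [1.0, -1.0])
```

The doubly nested loop is exactly the structure that maps onto a tiled GPU kernel: each thread owns one target `i` and streams the sources `j` through shared memory.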

Primary Applications

  • Unsteady airfoil aerodynamics: Impulsive starts, pitching/plunging motions, dynamic stall onset
  • Wake interaction: Multiple bodies flying through each other's wakes (tandem, formation, biplane)
  • Vortex-dominated flows: Problems where wake structure is the primary physical feature — flow visualization, vortex rollup, Kelvin-Helmholtz instability

Why Vortex Methods Instead of Conventional CFD

Traditional grid-based CFD (finite volume, finite element, finite difference) discretizes the entire fluid domain. Vortex methods take a fundamentally different approach:

  • No mesh in the wake: Vorticity-carrying particles are tracked in a Lagrangian frame. The wake evolves naturally without requiring mesh refinement to resolve it.
  • Automatic far-field satisfaction: The Biot-Savart law inherently gives velocity that decays to zero at infinity — no artificial boundary conditions needed.
  • Natural wake capture: Wake rollup, convection, merging, and interaction emerge from the physics without special treatment.
  • Embarrassing parallelism: The core N-body velocity evaluation is perfectly suited for GPU computation.
  • Physical intuition: The simulation output is literally the vortical structures — you see the physics directly.

The fundamental trade-off: vortex methods excel at external, vortex-dominated, unsteady flows. They are not well-suited for wall-bounded turbulence, internal flows, or problems where pressure fields are the primary output.


How It Works

The Core Physics

The solver is grounded in the vorticity-velocity formulation of the incompressible Navier-Stokes equations. In this formulation, pressure is eliminated entirely, and the key insight emerges: at high Reynolds numbers, vorticity is approximately materially conserved — it moves with the fluid. So the computational task reduces to:
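In symbols, a standard statement of this formulation (2D shown, where the vortex-stretching term vanishes):

```latex
% 2D vorticity transport; the inviscid limit gives material conservation:
\frac{D\omega}{Dt}
  = \frac{\partial \omega}{\partial t} + (\mathbf{u}\cdot\nabla)\,\omega
  = \nu\,\nabla^{2}\omega
  \;\xrightarrow{\;\nu \to 0\;}\; \frac{D\omega}{Dt} = 0

% Velocity recovered from vorticity via the 2D Biot-Savart integral,
% with (a,b)^{\perp} = (-b,a):
\mathbf{u}(\mathbf{x})
  = \frac{1}{2\pi}\int
    \frac{(\mathbf{x}-\mathbf{x}')^{\perp}}{\lvert \mathbf{x}-\mathbf{x}' \rvert^{2}}\,
    \omega(\mathbf{x}')\,\mathrm{d}A'
```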

  1. Track where vorticity is (particle positions)
  2. Compute the velocity field from the vorticity (Biot-Savart summation)
  3. Advect particles using the computed velocity
  4. Shed new vorticity from body trailing edges (Kutta condition)

The Body: Panel Method

Airfoil surfaces are represented as collections of flat panels. Each panel carries a bound vortex whose strength is determined by enforcing a no-penetration boundary condition (flow cannot pass through the body). This produces a linear system that is solved at every timestep. For rigid bodies, the system matrix is constant and can be LU-factored once and reused — only the right-hand side (which depends on the evolving wake) changes.
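The factor-once/re-solve pattern for rigid bodies can be sketched as follows. The 2×2 matrix here is a stand-in, not a real influence matrix, and the Doolittle LU below omits pivoting purely for brevity; the point is that the expensive factorization happens once while only the back-substitution repeats each timestep.

```python
def lu_factor(A):
    """In-place-style Doolittle LU (no pivoting); returns combined L\\U."""
    n = len(A)
    LU = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]              # multiplier stored in L part
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

def lu_solve(LU, b):
    n = len(b)
    y = b[:]
    for i in range(n):                        # forward substitution, L y = b
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):              # back substitution, U x = y
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]                  # stand-in influence matrix
LU = lu_factor(A)                             # factor once for a rigid body
gamma1 = lu_solve(LU, [1.0, 2.0])             # timestep 1: new wake-dependent RHS
gamma2 = lu_solve(LU, [2.0, 0.5])             # timestep 2: same factors, new RHS
```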

The panel method uses the Pistolesi rule: the bound vortex is placed at the 1/4-chord point of each panel, with collocation (boundary condition enforcement) at the 3/4-chord point. This placement recovers thin-airfoil theory exactly in the limit of panel refinement.
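The 1/4 and 3/4 points for a single flat panel are simple linear interpolations between its endpoints; a minimal sketch (the function name is illustrative):

```python
def pistolesi_points(x0, y0, x1, y1):
    """Vortex and collocation points for a flat panel from (x0,y0) to (x1,y1)."""
    vortex = (x0 + 0.25 * (x1 - x0), y0 + 0.25 * (y1 - y0))  # bound vortex at 1/4 panel
    colloc = (x0 + 0.75 * (x1 - x0), y0 + 0.75 * (y1 - y0))  # no-penetration enforced at 3/4 panel
    return vortex, colloc

vtx, col = pistolesi_points(0.0, 0.0, 1.0, 0.0)  # unit flat panel along x
```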

The Wake: Vortex Shedding and Advection

At every timestep, a new vortex particle is shed from the trailing edge. Its strength is determined by Kelvin's circulation theorem: the total circulation in the flow (bound + wake) must remain constant. If the flow starts from rest, total circulation stays zero for all time.
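The shedding rule above reduces to one line of bookkeeping: the new particle carries whatever circulation keeps the total constant. A sketch (function name hypothetical), with the start-from-rest case where the conserved total is zero:

```python
def shed_strength(gamma_bound_new, gamma_wake_existing, total_initial=0.0):
    # Kelvin: total_initial = gamma_bound_new + sum(wake) + gamma_shed
    # solve for the strength of the newly shed particle
    return total_initial - gamma_bound_new - sum(gamma_wake_existing)

# Impulsive start: bound circulation jumps to 0.8, wake is empty,
# so the starting vortex carries -0.8 to keep the total at zero.
g_start = shed_strength(0.8, [])
# Next step: bound circulation relaxes to 0.5; the new shed strength
# again balances the books.
g_next = shed_strength(0.5, [g_start])
```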

The wake vortex particles are then advected by the total velocity field — freestream plus the induced velocity from all other particles (both wake and bound). This advection step is the computational bottleneck and is where GPU acceleration is critical.

Regularization

Point vortices have singular velocity fields (infinite velocity at the vortex center). This is regularized by replacing point vortices with vortex blobs — vorticity distributed over a small core of radius σ. The smoothing parameter σ is typically set to 1.5× the initial inter-particle spacing.
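One common choice is the low-order algebraic blob, which replaces the point vortex's 1/r tangential speed with r/(r² + σ²): finite at the core center, and indistinguishable from a point vortex far away. A sketch (illustrative; the solver's docs also list a Gaussian kernel):

```python
import math

def blob_tangential_speed(gamma, r, sigma):
    """Algebraic vortex-blob kernel: regular at r = 0, ~ Gamma/(2*pi*r) for r >> sigma."""
    return gamma / (2.0 * math.pi) * r / (r * r + sigma * sigma)

sigma = 1.5 * 0.01                                   # 1.5x an assumed spacing of 0.01
v_center = blob_tangential_speed(1.0, 0.0, sigma)    # finite (zero) at the core center
v_far = blob_tangential_speed(1.0, 10.0, sigma)      # approaches 1/(2*pi*r) far away
```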

Force Computation

Aerodynamic forces are computed via the impulse method: the force is proportional to the rate of change of the first moment of vorticity. This avoids the difficulty of computing pressure fields (which don't exist explicitly in the vortex method formulation) and is generally more robust than pressure integration in this context.
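In 2D the linear impulse is I = ρ Σᵢ Γᵢ (yᵢ, -xᵢ), and the force on the body is (up to the sign convention chosen for circulation and frame) the negative rate of change of I. A finite-difference sketch, with hypothetical names; particles are (x, y, Gamma) tuples:

```python
def impulse_2d(particles, rho=1.0):
    """First moment of vorticity: I = rho * sum_i Gamma_i * (y_i, -x_i)."""
    ix = rho * sum(g * y for (x, y, g) in particles)
    iy = rho * sum(-g * x for (x, y, g) in particles)
    return ix, iy

def force_2d(particles_old, particles_new, dt, rho=1.0):
    """Force on the body as -dI/dt, by finite difference between timesteps."""
    ix0, iy0 = impulse_2d(particles_old, rho)
    ix1, iy1 = impulse_2d(particles_new, rho)
    return -(ix1 - ix0) / dt, -(iy1 - iy0) / dt

# One unit-strength vortex drifting in +x over dt = 0.1:
fx, fy = force_2d([(0.0, 0.0, 1.0)], [(0.1, 0.0, 1.0)], 0.1)
```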


Project Phases

The solver is developed incrementally across four phases of increasing complexity:

Phase 1: 2D Single-Body Solver (O(N²))

A complete working unsteady vortex method solver for a single 2D airfoil. Includes NACA 4-digit airfoil geometry generation with cosine panel spacing, the full panel method linear system, wake shedding via Kelvin's theorem, and RK2 time integration. The naive O(N²) velocity evaluation runs on GPU using a tiled shared-memory kernel. Validated against the Wagner function (impulsive start), steady thin-airfoil theory, and the Theodorsen function (unsteady pitching).
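The geometry generation step above is fully standard: the NACA 00XX half-thickness polynomial sampled at cosine-spaced chord stations, which clusters points at the leading and trailing edges. A sketch for the upper surface only (function name illustrative; the classic open-trailing-edge coefficients are used):

```python
import math

def naca00xx_upper(t, n_points):
    """Upper surface of a symmetric NACA 00XX airfoil, thickness t (chord fraction)."""
    pts = []
    for k in range(n_points):
        theta = math.pi * k / (n_points - 1)
        x = 0.5 * (1.0 - math.cos(theta))        # cosine spacing: dense near x=0 and x=1
        yt = 5.0 * t * (0.2969 * math.sqrt(x) - 0.1260 * x
                        - 0.3516 * x**2 + 0.2843 * x**3 - 0.1015 * x**4)
        pts.append((x, yt))
    return pts

pts = naca00xx_upper(0.12, 51)   # NACA 0012 upper surface, 51 points
```

The maximum half-thickness lands near 6% chord at x ≈ 0.30, the textbook value for a NACA 0012.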

Phase 2: GPU Fast Summation (Treecode / FMM)

Replaces the O(N²) velocity evaluation with an O(N log N) Barnes-Hut treecode. Particles are sorted by Morton codes (Z-order curve) for spatial locality, a quadtree is built implicitly from the sorted order, and tree traversal uses a multipole acceptance criterion to decide when distant particle groups can be approximated as single sources. The treecode is validated against direct summation for accuracy, and scaling is demonstrated from N = 10³ to N = 10⁶. An optional FMM (O(N)) implementation adds local expansions and a systematic downward pass for further speedup.
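Two of the ingredients above fit in a few lines each: Morton encoding by bit interleaving, and the Barnes-Hut multipole acceptance criterion s/d < θ. This loop-based interleave is written for clarity; real GPU implementations use branch-free bit tricks instead.

```python
def morton2d(ix, iy, bits=16):
    """Interleave integer cell coordinates: x bits land in even positions, y in odd."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)
        code |= ((iy >> b) & 1) << (2 * b + 1)
    return code

def mac_accepts(cell_size, dist, theta=0.5):
    """Barnes-Hut MAC: approximate a distant cell as one source when s/d < theta."""
    return cell_size < theta * dist
```

Sorting particles by `morton2d` of their grid cell places spatial neighbors near each other in memory, which is what lets the quadtree be built implicitly from the sorted order.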

Phase 3: Multi-Body Interaction

Extends the solver to handle two or more bodies simultaneously. Each body has its own panel set and sheds its own wake, but all bodies share a single velocity field. The panel method linear system is coupled across all bodies. Applications include tandem airfoil wake interaction, formation flight, and gust encounter problems. Bodies can translate and rotate independently.

Phase 4: 3D Vortex Filaments

Extends from 2D point vortices to 3D vortex filament segments. The 3D Biot-Savart law is integrated analytically for straight filament segments. A finite wing is modeled with a lifting line (bound filament along the quarter-chord) and trailing vortex filaments shed at each spanwise station. The key physical phenomenon is tip vortex rollup — the trailing vortex sheet rolls up into concentrated tip vortex cores, driven by Kelvin-Helmholtz instability. Validated against the elliptic wing analytical induced drag result.
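The analytically integrated segment velocity mentioned above is the standard closed form (as given in Katz & Plotkin): for a segment from A to B with circulation Γ evaluated at P, with r₁ = P - A, r₂ = P - B, r₀ = B - A. A pure-Python sketch without core regularization, illustrative only:

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def filament_velocity(p, a, b, gamma):
    """Velocity induced at p by a straight vortex filament segment a -> b."""
    r1 = tuple(p[i] - a[i] for i in range(3))
    r2 = tuple(p[i] - b[i] for i in range(3))
    r0 = tuple(b[i] - a[i] for i in range(3))
    c = cross(r1, r2)
    c2 = dot(c, c)
    if c2 < 1e-18:                       # evaluation point lies on the filament axis
        return (0.0, 0.0, 0.0)
    k = gamma / (4.0 * math.pi * c2) * (dot(r0, r1) / math.sqrt(dot(r1, r1))
                                        - dot(r0, r2) / math.sqrt(dot(r2, r2)))
    return (k * c[0], k * c[1], k * c[2])

# Sanity check: a very long segment recovers the 2D line-vortex speed
# Gamma/(2*pi*h) at perpendicular distance h = 1.
v = filament_velocity((0.0, 1.0, 0.0), (-1e3, 0.0, 0.0), (1e3, 0.0, 0.0), 1.0)
```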


Technology Stack

  • GPU Kernels: CUDA C++ (direct control over shared-memory tiling, warp-level operations for treecode traversal)
  • Host Logic: C++ (performance-critical panel solver, tree construction)
  • Prototyping: Python + CuPy/Numba (acceptable for Phase 1-2 development; CUDA C++ preferred for Phases 3-4)
  • Visualization: VTK (.vtp) + ParaView (3D wake visualization, especially critical for Phase 4)
  • 2D Animation: matplotlib (quick 2D wake animations, force history plots)
  • Build System: CMake (standard for CUDA C++ projects)

No external CFD libraries are used. Every component — geometry generation, panel method, Biot-Savart kernels, treecode, force computation, I/O — is implemented from scratch.


Documentation

All detailed mathematical derivations, implementation specifications, and validation procedures are in the docs/ directory:

  • docs/theory.md: governing equations (vorticity transport), Biot-Savart law (2D component form), regularization kernels (Gaussian, algebraic), viscous diffusion (core spreading, PSE), Kutta condition physics, Kelvin's circulation theorem, non-dimensionalization table, compressible flow extension
  • docs/phase1_2d_single_body.md: panel representation (lumped vortex vs constant-strength), influence coefficient formulas, no-penetration boundary condition assembly, Kutta condition implementation (wake-shedding approach), complete timestepping algorithm, RK2 integration, force computation (impulse + unsteady Bernoulli), NACA 4-digit geometry generation with cosine spacing
  • docs/phase2_gpu_fast_summation.md: Barnes-Hut treecode (quadtree construction, multipole acceptance criterion, higher-order expansions), FMM (P2M/M2M/M2L/L2L/L2P/P2P operators, translation formulas), GPU implementation (Morton codes, implicit tree, warp-level traversal, shared-memory P2P), accuracy control methodology
  • docs/phase3_multi_body.md: coupled multi-body panel system formulation, cross-body influence coefficients, moving/rotating body boundary conditions, Body class + shared WakeField architecture
  • docs/phase4_3d_extension.md: 3D Biot-Savart law for filament segments (integrated closed-form + regularized version), lifting line + free wake vortex lattice method, tip vortex rollup physics
  • docs/validation.md: Wagner function test (impulsive flat plate), NACA 0012 steady lift, Theodorsen function test (pitching airfoil), treecode accuracy vs direct sum, tandem airfoil qualitative test, elliptic wing induced drag, performance scaling targets
  • docs/references.md: essential reading (Katz & Plotkin, Cottet & Koumoutsakos, Barba GPU vortex codes, Greengard & Rokhlin FMM, Barnes & Hut treecode, Yokota & Barba CUDA implementation guide); validation data sources

Planned Code Architecture

VortexGPU/
├── CMakeLists.txt
├── src/
│   ├── core/               # Particle/Panel/Body structs, Biot-Savart kernels (CPU + GPU)
│   ├── geometry/            # NACA airfoil generation, panel discretization
│   ├── solver/              # Panel linear system, wake management, time integration, forces
│   ├── treecode/            # Morton codes, quadtree construction, treecode traversal kernel
│   ├── multibody/           # Coupled multi-body solver, Body class with reference frame
│   ├── io/                  # VTK and CSV output writers
│   └── main.cpp             # Entry point and config parsing
├── tests/                   # Unit and validation tests
├── configs/                 # JSON simulation configs (single airfoil, tandem, etc.)
├── scripts/                 # Python post-processing (force plots, wake animation)
├── benchmarks/              # Performance measurement scripts
└── docs/                    # Full mathematical and implementation documentation

Performance Expectations

  • Phase 1 (O(N²) direct, GPU): 10,000 particles, ~1 ms per step (modern GPU)
  • Phase 1 (O(N²) direct, GPU): 100,000 particles, ~100 ms per step (modern GPU)
  • Phase 2 (treecode, GPU): 100,000 particles, ~5-10 ms per step (modern GPU)
  • Phase 2 (treecode, GPU): 1,000,000 particles, ~50-100 ms per step (modern GPU)

The treecode provides 10-20× speedup over direct summation at N = 100,000, enabling long-time simulations with dense wakes that would be impractical with O(N²) scaling.


Key Design Decisions

  • Double precision (fp64) for positions and circulations: Single precision causes visible drift in vortex trajectories over long simulations. Mixed precision (fp64 positions, fp32 far-field kernel) is acceptable since treecode approximation error exceeds fp32 rounding error.
  • RK2 minimum for time integration: Forward Euler produces unacceptable trajectory drift. RK2 (midpoint method) doubles the cost per step but is essential for accuracy.
  • Cosine panel spacing: Clustering panels near the leading and trailing edges is critical for accurate pressure distributions. Uniform spacing gives poor results.
  • Treecode before FMM: The Barnes-Hut treecode is simpler to implement on GPU (regular traversal pattern) and provides O(N log N) scaling. FMM (O(N)) is only worth the additional complexity for very large N.
  • Inviscid first, viscosity later: Start with the inviscid approximation (vorticity conservation). Core spreading for viscous diffusion can be added as an optional feature.
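The RK2 decision above is easy to demonstrate on a toy problem: a tracer orbiting a single fixed vortex should stay on a circle. In this sketch (illustrative, with a fixed unit-circulation vortex of Γ = 2π at the origin), forward Euler spirals visibly outward while the midpoint rule holds the radius:

```python
import math

def velocity(x, y):
    """Velocity induced by a fixed point vortex with Gamma = 2*pi at the origin."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def euler_step(x, y, dt):
    u, v = velocity(x, y)
    return x + dt * u, y + dt * v

def rk2_step(x, y, dt):
    u1, v1 = velocity(x, y)
    xm, ym = x + 0.5 * dt * u1, y + 0.5 * dt * v1   # half step to the midpoint
    u2, v2 = velocity(xm, ym)                        # evaluate velocity there
    return x + dt * u2, y + dt * v2                  # full step with midpoint velocity

xe = ye = None
x1, y1 = 1.0, 0.0
x2, y2 = 1.0, 0.0
for _ in range(1000):
    x1, y1 = euler_step(x1, y1, 0.01)
    x2, y2 = rk2_step(x2, y2, 0.01)
r_euler = math.hypot(x1, y1)   # drifts well above 1
r_rk2 = math.hypot(x2, y2)     # stays very close to 1
```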

License

See LICENSE for details.
