Interactive animations explaining core concepts in robotics and embodied intelligence.
Live site: arpitg1304.github.io/embodied-ai-visuals
| Animation | Category | Description |
|---|---|---|
| VLA Model Explainer | Perception | Step-by-step walkthrough of Vision-Language-Action models — from camera input to robot action output |
| Sim-to-Real Gap Explainer | Learning | Why sim-trained policies fail in the real world and how domain randomization bridges the gap |
| Reward Shaping — Sparse vs Dense | Learning | How reward function design shapes learning — sparse rewards, dense gradients, and potential-based shaping |
| Video Action Models & Latent Space | Learning | How video-conditioned policies use temporal context and latent space predictions to generate robot actions |
| Diffusion Policy | Learning | How denoising diffusion refines random noise into smooth action trajectories, handling multimodal demonstrations |
| Flow Matching | Learning | How flow matching learns straight-line velocity fields to transport noise into action distributions — a faster, simpler alternative to diffusion |
| World Models — Predict Before You Act | Learning | How robots imagine multiple futures in latent space, score each outcome, and pick the best action before moving |
| Learning Physics from Video | Learning | How watching billions of internet videos teaches robots gravity, collisions, and object permanence — no physics engine required |
| Action Chunking — Predict Trajectories, Not Steps | Learning | Why modern robot policies predict K actions at once — the secret behind smooth motion in ACT, Diffusion Policy, and π0 |
| SLAM — Mapping the Unknown | Perception | How robots simultaneously build a map and figure out where they are — the chicken-and-egg problem at the heart of navigation |
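As a taste of the sparse-vs-dense idea from the Reward Shaping animation, here is a minimal JavaScript sketch of potential-based shaping on a 1-D reach task. Everything here (`GOAL`, `phi`, the reward functions) is invented for illustration — it is not code from the animations:

```javascript
// 1-D reach task: the agent moves along integer positions toward GOAL.
const GOAL = 10;

// Sparse reward: signal only when the goal is reached — hard to learn from.
function sparseReward(pos) {
  return pos === GOAL ? 1 : 0;
}

// Potential function: closer to the goal = higher potential.
const phi = (pos) => -Math.abs(GOAL - pos);

// Potential-based shaping, r' = r + gamma * phi(s') - phi(s),
// adds a dense gradient without changing the optimal policy.
function shapedReward(pos, nextPos, gamma = 0.99) {
  return sparseReward(nextPos) + gamma * phi(nextPos) - phi(pos);
}
```

With shaping, a step toward the goal scores higher than a step away from it on every transition, so the agent gets feedback long before it first reaches the goal.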
- No dependencies — pure HTML/CSS/JS, zero build step
- Dark themed — easy on the eyes
- Embeddable — copy iframe embed code for any animation to use in your blog or slides
- Mobile friendly — responsive layout that works on any device
- Auto-deploy — push to `main` and GitHub Actions deploys to Pages
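An embed might look like the following. This is a hypothetical example — copy the actual snippet from the animation's page; the file name in `src` is a guess:

```html
<iframe
  src="https://arpitg1304.github.io/embodied-ai-visuals/animations/diffusion_policy.html"
  width="800" height="500" style="border: 0;" loading="lazy"
  title="Diffusion Policy animation"></iframe>
```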
Planned animations, roughly ordered by pedagogical flow. Contributions welcome!
- Visual Encoders Compared — CNN vs ViT vs DINOv2: how each architecture turns pixels into features, and why foundation vision models changed robotics
- Point Cloud Processing — From raw depth sensor → voxel grid → PointNet features. Show how 3D understanding feeds into grasp planning
- Spatial Action Maps — How pixel-space affordance maps let robots decide where to act directly from images
- Closed-Loop World Model Control — The real-time observe → predict → act → re-observe cycle — how continuous re-planning handles the unexpected
- MPC vs Learned Policies — Model Predictive Control re-plans every step; a learned policy runs open-loop. Animated side-by-side on the same task
- Inverse Kinematics Explained — Given a target end-effector pose, how the robot solves for joint angles — Jacobian, gradient descent, singularities
- Task and Motion Planning (TAMP) — High-level symbolic plan ("pick → place → stack") grounded into continuous motion trajectories
- Behavior Trees vs FSMs — Two paradigms for structuring robot decision-making, animated with a pick-and-place example
- Imitation Learning Pipeline — Human demo → trajectory encoding → policy distillation. Show how a few demonstrations become a generalizable skill
- Hindsight Experience Replay — Failed trajectories relabeled with achieved goals — turning failures into training signal
- Curriculum Learning for Manipulation — Progressively harder tasks: reach → touch → grasp → lift → stack
- Multi-Robot Task Allocation — How a team of robots divides tasks using auction-based or graph-based coordination
- Human-Robot Handoff — Timing, grip force negotiation, and intent prediction during object handovers
- Safe RL with Constraints — Reward maximization plus constraint satisfaction — how robots learn to be both capable and safe
- Failure Detection & Recovery — How robots monitor execution, detect anomalies, and trigger recovery behaviors in real-time
- Copy the template: `cp animations/_template.html animations/your_name.html`
- Build your animation using the CSS variable contract for consistent theming
- Register it in the `ANIMATIONS` array in `index.html`
- Push — the site updates automatically
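The registration step might look roughly like this. The field names below are assumptions for illustration — the real schema lives in `index.html`:

```javascript
// Hypothetical shape of an ANIMATIONS entry — check index.html
// for the actual field names before registering yours.
const ANIMATIONS = [
  {
    title: "Your Animation Title",
    category: "Learning", // or "Perception"
    file: "animations/your_name.html",
    description: "One-line summary shown in the gallery",
  },
];
```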
See CONTRIBUTING.md for the full guide.
Serve locally with `python3 -m http.server 8000`, then open http://localhost:8000.
