Closed
562 commits
c52dee7
Merge branch '4.0' into jonah_train_select_copy
jsuarez5341 Feb 13, 2026
fef2254
Merge pull request #485 from jonahsamost/jonah_train_select_copy
jsuarez5341 Feb 13, 2026
9796f3d
Add new kerns
jsuarez5341 Feb 13, 2026
5e7d15f
fixed train select
jonahsamost Feb 13, 2026
6ccd6e6
Merge pull request #486 from jonahsamost/jonah_bug_2_13
jsuarez5341 Feb 13, 2026
ea3ec7d
refactor
jsuarez5341 Feb 13, 2026
17c4586
Merge jonah's kernel fix
jsuarez5341 Feb 13, 2026
7736b6d
Clean up advantage and muon
jsuarez5341 Feb 13, 2026
82b0cf1
Massive bugfix in newvalue + ratio copy and new_value usage in clip. …
jsuarez5341 Feb 14, 2026
fb646d8
merge
jsuarez5341 Feb 14, 2026
4202c66
Refactor
jsuarez5341 Feb 14, 2026
81f75db
Initial nograd
jsuarez5341 Feb 14, 2026
38ac67e
Allocators
jsuarez5341 Feb 14, 2026
16851fd
Initial static native
jsuarez5341 Feb 14, 2026
4ce0bfa
pass static activ buffers
jsuarez5341 Feb 16, 2026
51e8b8a
A bunch more crappy porting
jsuarez5341 Feb 17, 2026
d7b09b0
Initial no-torch training (total mess)
jsuarez5341 Feb 18, 2026
0a278af
full deterministic training, major cleanups
jsuarez5341 Feb 18, 2026
26f2f5a
refactor + save/load
jsuarez5341 Feb 19, 2026
ddb5797
legacy
jsuarez5341 Feb 19, 2026
c9ae58b
refactor
jsuarez5341 Feb 19, 2026
28ab172
More refactor
jsuarez5341 Feb 19, 2026
2ffd40e
more refactor
jsuarez5341 Feb 19, 2026
202ff14
Initial model refactor
jsuarez5341 Feb 21, 2026
9511c04
refactor
jsuarez5341 Feb 24, 2026
b874f6c
g2048 bind
jsuarez5341 Feb 24, 2026
fcab65b
genial oath config
jsuarez5341 Feb 24, 2026
05330ee
fp32
jsuarez5341 Feb 24, 2026
096c2b9
fix single-buffer training. float32 still nondeterm for >1 buffer
jsuarez5341 Feb 24, 2026
1edc6f6
extend shmem
jsuarez5341 Feb 24, 2026
3f2749b
nmmo3, grid runnable with rng
jsuarez5341 Feb 24, 2026
bb4ac0e
Fix 2048 spawn count and add 16384 stat
kywch Feb 25, 2026
7d63e72
Initial nmmo net - crappy port, unmaintainable garbage
jsuarez5341 Feb 26, 2026
78409f8
Major bug fix on val update
jsuarez5341 Feb 26, 2026
9f5de82
constellation fixes
jsuarez5341 Feb 26, 2026
eebcfbf
Major fix on zeroing mingru state. 82% on grid with 3 layers
jsuarez5341 Feb 26, 2026
d411b3a
Grid close
jsuarez5341 Feb 27, 2026
516c46c
zero rollout buffer
jsuarez5341 Feb 27, 2026
dde5b0c
initial sweep refactor (bad)
jsuarez5341 Feb 27, 2026
a623c70
spawn subproc in sweep
jsuarez5341 Feb 27, 2026
059b784
disable verbose in sweep, divide num threads
jsuarez5341 Feb 27, 2026
c74d51f
grid stable
jsuarez5341 Feb 28, 2026
2a15af0
highway connect
jsuarez5341 Feb 28, 2026
0abbf07
minor sweep fixes
jsuarez5341 Feb 28, 2026
d435305
merge fix
jsuarez5341 Feb 28, 2026
6a4646f
grid config
jsuarez5341 Feb 28, 2026
2034c81
Merge pull request #493 from kywch/fix-2048-bug
jsuarez5341 Feb 28, 2026
0a479a5
Semi-breaking major refactor to clean up pufferl
jsuarez5341 Mar 1, 2026
8229f92
temp
jsuarez5341 Mar 2, 2026
9bbc96a
Backport major g2048 game bug fix
jsuarez5341 Mar 3, 2026
ff2e86a
2048 swept config
jsuarez5341 Mar 3, 2026
1ca4faa
g2048 config'
jsuarez5341 Mar 3, 2026
482887d
sweep/train fixes
jsuarez5341 Mar 3, 2026
e2ab3d1
dash
jsuarez5341 Mar 3, 2026
e7ae8f9
sweep fixes:
jsuarez5341 Mar 3, 2026
95e19b0
sweep fixes
jsuarez5341 Mar 3, 2026
06bfa3a
default config sweep
jsuarez5341 Mar 3, 2026
aa8bc76
fixes
jsuarez5341 Mar 3, 2026
42811e6
more bug
jsuarez5341 Mar 3, 2026
7920c78
kaiming init
jsuarez5341 Mar 3, 2026
dd8f639
delete ortho init. Speculative
jsuarez5341 Mar 3, 2026
b27de98
cleanup muon
jsuarez5341 Mar 4, 2026
fc9f98c
Initial pufferl refactor training
jsuarez5341 Mar 5, 2026
649826a
update log format
jsuarez5341 Mar 6, 2026
2fb2fb7
sweep keys
jsuarez5341 Mar 6, 2026
760c6e4
dtype fix
jsuarez5341 Mar 6, 2026
426d6e8
tsnee
jsuarez5341 Mar 6, 2026
c6e3fc6
Fix rare norm bug
jsuarez5341 Mar 6, 2026
81bdfc6
constellation fixes
jsuarez5341 Mar 7, 2026
fbacc09
prevent tooltip drawing offscreen
jsuarez5341 Mar 7, 2026
0d82a03
pong
jsuarez5341 Mar 7, 2026
30a5a21
Begin refactor constellation
jsuarez5341 Mar 7, 2026
cd0fa4c
clean ui
jsuarez5341 Mar 7, 2026
a3c3edf
UI cleanup
jsuarez5341 Mar 7, 2026
9e6e431
temp fix color scale
jsuarez5341 Mar 8, 2026
fc09814
merge precision_t kernels and prune dead code
jsuarez5341 Mar 8, 2026
069f3df
purge check macros
jsuarez5341 Mar 10, 2026
fd28ef8
delete transpose indirection
jsuarez5341 Mar 10, 2026
4fbbc36
Initial cudnn conv + nmmo encoder
jsuarez5341 Mar 10, 2026
d2bbec6
refactor kernels
jsuarez5341 Mar 12, 2026
2966410
temp determ fix
jsuarez5341 Mar 12, 2026
dd4c184
refactor muon -> simplify, keep more ops in precision_t. Changes nume…
jsuarez5341 Mar 12, 2026
8c4117e
merge grad clip into muon
jsuarez5341 Mar 12, 2026
3063ff1
refactor
jsuarez5341 Mar 13, 2026
403cec0
more refactor
jsuarez5341 Mar 13, 2026
dff4b0e
more refactors
jsuarez5341 Mar 13, 2026
5f7784d
nccl bind
jsuarez5341 Mar 13, 2026
e5a0139
Stable multigpu
jsuarez5341 Mar 13, 2026
770c270
minor refactor
jsuarez5341 Mar 13, 2026
c8be914
:qMerge branch 'static-native' of https://github.com/pufferai/pufferl…
jsuarez5341 Mar 13, 2026
432e4f1
more refactor
jsuarez5341 Mar 14, 2026
90512a8
minor
jsuarez5341 Mar 14, 2026
29d746a
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 14, 2026
f0f4df6
cursed merge fix
jsuarez5341 Mar 14, 2026
6be557c
Merge pull request #498 from PufferAI/static-native
jsuarez5341 Mar 14, 2026
695b116
remove structs file
jsuarez5341 Mar 14, 2026
28b0ad1
zero the damn rewards and terms for you
jsuarez5341 Mar 14, 2026
7901b82
Per-type tensors = minus a ton of casts. Had to get obs_dtype from sy…
jsuarez5341 Mar 14, 2026
27b45db
Small refactors
jsuarez5341 Mar 15, 2026
b940322
nmmo fixes - maybe will train?
jsuarez5341 Mar 15, 2026
38b6821
tensor
jsuarez5341 Mar 15, 2026
dd38699
initial port of python backend to match latest cuda. breakout trains.…
jsuarez5341 Mar 16, 2026
752b95e
binding files
jsuarez5341 Mar 17, 2026
7483468
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 17, 2026
e59b6a7
blah
l1onh3art88 Mar 17, 2026
2d9a7c7
nmmo3 sota (pretty sure) python only
jsuarez5341 Mar 17, 2026
29a93be
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 17, 2026
4dbddd4
init scale changes
jsuarez5341 Mar 17, 2026
f9ac3af
cuda cache
jsuarez5341 Mar 17, 2026
fe83cfa
fix threadlocal
jsuarez5341 Mar 17, 2026
a50d9ff
cursed conv
jsuarez5341 Mar 17, 2026
be6a29c
cursed cudnn
jsuarez5341 Mar 17, 2026
15667a9
stupid idiot fallback
jsuarez5341 Mar 18, 2026
3aed630
Working nmmo3! +muon scale changes, but main diff was just stupid im2…
jsuarez5341 Mar 18, 2026
d99dfc8
better sweep caching
jsuarez5341 Mar 19, 2026
550ff94
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 19, 2026
fccba07
bfloat atns
jsuarez5341 Mar 19, 2026
8adcc9e
fix determinism
jsuarez5341 Mar 19, 2026
b936162
minor cleanup
jsuarez5341 Mar 20, 2026
0336926
Merge pull request #501 from PufferAI/bfloatatns
jsuarez5341 Mar 20, 2026
91e4ce9
minor refactor
jsuarez5341 Mar 20, 2026
d9241e6
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 20, 2026
8d2752e
refactors
jsuarez5341 Mar 20, 2026
1674bea
forgot binds
jsuarez5341 Mar 20, 2026
eeddceb
Temp commit. Fixed determinism w/ rng but bloated
jsuarez5341 Mar 20, 2026
6bd83a9
Remove per-thread cublas handles. It is bloat and deterministic witho…
jsuarez5341 Mar 20, 2026
43d603f
Refactor - remove decoder bias
jsuarez5341 Mar 20, 2026
f26d98e
small refactors
jsuarez5341 Mar 21, 2026
2ff0f8d
nmmo3 float
jsuarez5341 Mar 21, 2026
275f492
quick nmmo fix
jsuarez5341 Mar 21, 2026
df4fe4c
sweep config
jsuarez5341 Mar 21, 2026
54822f0
fix axis cutoff
jsuarez5341 Mar 21, 2026
b45526f
sweep settings
jsuarez5341 Mar 21, 2026
94f9256
Fix python version (needed float32 atns)
jsuarez5341 Mar 23, 2026
bdf8ad0
Initial python refactor
jsuarez5341 Mar 23, 2026
289f1e3
Drop vecenv, emulation, gymansium, pz
jsuarez5341 Mar 23, 2026
f5dfce7
Move everything
jsuarez5341 Mar 23, 2026
0979617
Move configs
jsuarez5341 Mar 23, 2026
2d70d9b
Remove trash
jsuarez5341 Mar 23, 2026
2ce8c42
remove definitely dead tests
jsuarez5341 Mar 23, 2026
11586a2
Delete torch ext crap
jsuarez5341 Mar 23, 2026
d2471de
dead scripts
jsuarez5341 Mar 23, 2026
5200492
setup cleanup
jsuarez5341 Mar 23, 2026
4ba3ba4
cleanup torch models
jsuarez5341 Mar 23, 2026
5c811c6
small fixes
jsuarez5341 Mar 23, 2026
e72370b
cleanups
jsuarez5341 Mar 23, 2026
a3c1a90
delete more
jsuarez5341 Mar 23, 2026
a030af8
minor
jsuarez5341 Mar 23, 2026
fbd52ce
drop no build isolation
jsuarez5341 Mar 23, 2026
58b3fc9
uh forgot src
jsuarez5341 Mar 23, 2026
1292b81
toml license
jsuarez5341 Mar 23, 2026
032a61a
fix ocean
jsuarez5341 Mar 24, 2026
41469f3
pybind11?
jsuarez5341 Mar 24, 2026
0aa5bdf
khr compile fix
jsuarez5341 Mar 24, 2026
f51cb87
build fixes for ocean
jsuarez5341 Mar 24, 2026
e137d95
Update manifest
jsuarez5341 Mar 24, 2026
a320b24
fuck you setup.py!
jsuarez5341 Mar 24, 2026
88d0e20
Nice simple build script!
jsuarez5341 Mar 24, 2026
f931f3f
single build script
jsuarez5341 Mar 24, 2026
5b5c217
Some refactors, needs more work
jsuarez5341 Mar 24, 2026
5345067
Old extensions
jsuarez5341 Mar 24, 2026
eb4c17c
Jonah's safe_logit
jsuarez5341 Mar 24, 2026
d7b33a4
Initial profile update
jsuarez5341 Mar 24, 2026
713c659
profile updates
jsuarez5341 Mar 24, 2026
b4badb3
Update profiling
jsuarez5341 Mar 24, 2026
c79ed99
delete old profile
jsuarez5341 Mar 24, 2026
8ecc862
refactor
jsuarez5341 Mar 24, 2026
9c7bd6c
Move more stuff around
jsuarez5341 Mar 24, 2026
94511ba
fix eval
jsuarez5341 Mar 24, 2026
c1c31a2
refactor errors
jsuarez5341 Mar 24, 2026
f2008d2
Log frequency
jsuarez5341 Mar 24, 2026
a120550
latest
l1onh3art88 Mar 25, 2026
4d2787f
100x data load speed, severl fixes
jsuarez5341 Mar 25, 2026
8c17929
filter fig4
jsuarez5341 Mar 25, 2026
512fd3a
prune old code
jsuarez5341 Mar 25, 2026
cbc13d6
constellation build
jsuarez5341 Mar 25, 2026
eb93927
prune trash
jsuarez5341 Mar 26, 2026
bb15a59
move stuff around a bit
jsuarez5341 Mar 26, 2026
4e0c951
CPU fallback for mac scrubs
jsuarez5341 Mar 26, 2026
6bbcdf0
Fix hardcoded CUDA path and stuff that installs cudnn dependency.
daphne-cornelisse Mar 28, 2026
9e23fa3
Fix OBS_TENSOR build error.
daphne-cornelisse Mar 28, 2026
c0f378c
Required 4.0 env changes for drive.
daphne-cornelisse Mar 28, 2026
86c8faf
Safeguard to prevent segfault if binaries are not stored at the right…
daphne-cornelisse Mar 28, 2026
dd7b2cb
don't try to pickle backend
jsuarez5341 Mar 29, 2026
90d08e0
delete old nv flag
jsuarez5341 Mar 30, 2026
2b665d3
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 30, 2026
fc6800f
Trailer progress
jsuarez5341 Mar 30, 2026
cd4f19a
Decent progress
jsuarez5341 Mar 31, 2026
3d75952
Small timing fixes:
jsuarez5341 Mar 31, 2026
e74e33b
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 31, 2026
ac977eb
Data processing script with instructions.
daphne-cornelisse Mar 31, 2026
f3647f7
Trailer + constellation shader updates
jsuarez5341 Mar 31, 2026
599594b
Delete legacy bindings.h
daphne-cornelisse Mar 31, 2026
13ccdfd
Delete hardcoded logic for 8 agents. Can train at 1.9M SPS.
daphne-cornelisse Mar 31, 2026
b52dfb7
Add datapaths in .gitignore.
daphne-cornelisse Mar 31, 2026
7431f0f
Fix: Ensure save_map_binary() only has matching attributes.
daphne-cornelisse Mar 31, 2026
55f59c9
Provide map_dir.
daphne-cornelisse Mar 31, 2026
b9f3443
Typo fix.
daphne-cornelisse Mar 31, 2026
24ccfdd
move configs
jsuarez5341 Mar 31, 2026
da731f1
Merge pull request #508 from daphne-cornelisse/4.0
jsuarez5341 Mar 31, 2026
91d5128
Major bug fix on rendering; integrate initial drive
jsuarez5341 Mar 31, 2026
eb321c3
Iterate through multiple maps.
daphne-cornelisse Mar 31, 2026
2eed0f2
Use 1k maps dataset for benchmarking
daphne-cornelisse Apr 1, 2026
10e1105
moba port
jsuarez5341 Apr 1, 2026
9dd827e
Merge remote-tracking branch 'upstream/4.0' into 4.0
daphne-cornelisse Apr 1, 2026
65fced5
patch
jsuarez5341 Apr 1, 2026
ad78fc6
old drone
FinlaySanders Apr 1, 2026
5220174
new drone
FinlaySanders Apr 1, 2026
528c45c
fix: continuous action logstd indexing in ppo kernel
FinlaySanders Apr 1, 2026
c0364b1
rename drone
FinlaySanders Apr 1, 2026
dfbc4a1
Merge remote-tracking branch 'upstream/4.0' into 4.0
daphne-cornelisse Apr 1, 2026
dac58ec
Binding fix: use max_agents instead of num_agents.
daphne-cornelisse Apr 1, 2026
a824475
Drive sweep configs.
daphne-cornelisse Apr 1, 2026
8e605c5
Small changes I had to make to run sweeps.
daphne-cornelisse Apr 1, 2026
b53fe5a
Clean up drive env: Remove magic values and legacy code.
daphne-cornelisse Apr 1, 2026
254dd4b
moba race fix
jsuarez5341 Apr 1, 2026
11ff42a
faster rng
FinlaySanders Apr 1, 2026
b1966b0
terraform ported and ready to sweep
jsuarez5341 Apr 1, 2026
5d80f05
ported tower climb
jsuarez5341 Apr 1, 2026
f6f64d3
better score metric
FinlaySanders Apr 1, 2026
57f6a12
squared continuous test env
jsuarez5341 Apr 1, 2026
33fb81e
Merge pull request #511 from FinlaySanders/4.0
jsuarez5341 Apr 1, 2026
168b2a9
Merge pull request #510 from daphne-cornelisse/4.0
jsuarez5341 Apr 1, 2026
d09e1ca
Merge pull request #512 from daphne-cornelisse/4.0
jsuarez5341 Apr 1, 2026
e15e424
drive tweaks
jsuarez5341 Apr 1, 2026
d4f4ff2
full solve
FinlaySanders Apr 2, 2026
378cff7
fix nonetype
jsuarez5341 Apr 2, 2026
46c9d20
latest
l1onh3art88 Apr 2, 2026
a859c9a
Merge pull request #513 from FinlaySanders/4.0
jsuarez5341 Apr 2, 2026
ade25d2
Many small fixes
jsuarez5341 Apr 2, 2026
ab75881
refactor build
jsuarez5341 Apr 2, 2026
cdf68b0
Minor fixes
jsuarez5341 Apr 2, 2026
bbbf27f
vendor minshell
jsuarez5341 Apr 2, 2026
9f19d5e
test env fixes
jsuarez5341 Apr 2, 2026
b554919
trailer
jsuarez5341 Apr 3, 2026
4886ac8
g2048
jsuarez5341 Apr 3, 2026
5ef7fdb
default profile breakout
jsuarez5341 Apr 3, 2026
613b19d
vendo ini files
jsuarez5341 Apr 3, 2026
9b5d6d1
fix vendor
jsuarez5341 Apr 3, 2026
4a2f231
fixes
jsuarez5341 Apr 3, 2026
42b3509
logs ignore
jsuarez5341 Apr 3, 2026
5ced979
robust arch
jsuarez5341 Apr 3, 2026
0f17fa6
don't fail build
jsuarez5341 Apr 3, 2026
0c20d2e
build fix
jsuarez5341 Apr 3, 2026
d6c7525
cache
jsuarez5341 Apr 3, 2026
8e51417
cache
jsuarez5341 Apr 3, 2026
a919ca7
PufferNet fixes. pong, breakout, moba local pols. Pong fixes
jsuarez5341 Apr 4, 2026
a059fa6
env updates for 4.0
l1onh3art88 Apr 4, 2026
579f978
models
jsuarez5341 Apr 4, 2026
7039a9e
conflicts
jsuarez5341 Apr 4, 2026
10 changes: 10 additions & 0 deletions .gitignore
@@ -2,6 +2,10 @@
c_*.c
pufferlib/extensions.c
pufferlib/puffernet.c
logs/

# Build dir
build/

# hipified cuda extensions dir [HIP/ROCM]
pufferlib/extensions/hip/
@@ -18,6 +22,7 @@ cy_*.c

# C extensions
*.so
*.o

# Distribution / packaging
.Python
@@ -162,3 +167,8 @@ pufferlib/ocean/impulse_wars/*-release/
pufferlib/ocean/impulse_wars/debug-*/
pufferlib/ocean/impulse_wars/release-*/
pufferlib/ocean/impulse_wars/benchmark/

# Data
resources/drive/data/*
resources/drive/binaries/*

19 changes: 0 additions & 19 deletions MANIFEST.in

This file was deleted.

8 changes: 3 additions & 5 deletions README.md
@@ -1,16 +1,14 @@
![figure](https://pufferai.github.io/source/resource/header.png)

[![PyPI version](https://badge.fury.io/py/pufferlib.svg)](https://badge.fury.io/py/pufferlib)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pufferlib)
![Github Actions](https://github.com/PufferAI/PufferLib/actions/workflows/install.yml/badge.svg)
[![](https://dcbadge.vercel.app/api/server/spT4huaGYV?style=plastic)](https://discord.gg/spT4huaGYV)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40jsuarez5341)](https://twitter.com/jsuarez5341)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40jsuarez)](https://twitter.com/jsuarez)

PufferLib is the reinforcement learning library I wish existed during my PhD. It started as a compatibility layer to make working with complex environments a breeze. Now, it's a high-performance toolkit for research and industry with optimized parallel simulation, environments that run and train at 1M+ steps/second, and tons of quality of life improvements for practitioners. All our tools are free and open source. We also offer priority service for companies, startups, and labs!
PufferLib is a fast and sane reinforcement learning library that can train tiny, super-human models in seconds. The included learning algorithm, hyperparameter tuning, and simulation methods are the product of our own research. All our tools are free and open source. Need a high performance environment for your application? We build them professionally and offer training + extended support. Contact jsuarez🐡puffer🐡ai.

![Trailer](https://github.com/PufferAI/puffer.ai/blob/main/docs/assets/puffer_2.gif?raw=true)

All of our documentation is hosted at [puffer.ai](https://puffer.ai "PufferLib Documentation"). @jsuarez5341 on [Discord](https://discord.gg/puffer) for support -- post here before opening issues. We're always looking for new contributors, too!
All of our documentation is hosted at [puffer.ai](https://puffer.ai "PufferLib Documentation"). @jsuarez5341 on [Discord](https://discord.gg/puffer) for support. Post there before opening issues. We're always looking for new contributors!

## Star to puff up the project!

300 changes: 300 additions & 0 deletions build.sh
@@ -0,0 +1,300 @@
#!/bin/bash
set -e

# Usage:
# ./build.sh breakout # Build _C.so with breakout statically linked
# ./build.sh breakout --float # float32 precision (required for --slowly)
# ./build.sh breakout --cpu # CPU fallback, torch only
# ./build.sh breakout --debug # Debug build
# ./build.sh breakout --local # Standalone executable (debug, sanitizers)
# ./build.sh breakout --fast # Standalone executable (optimized)
# ./build.sh breakout --web # Emscripten web build
# ./build.sh breakout --profile # Kernel profiling binary
# ./build.sh all # Build all envs with default and --float

if [ -z "$1" ]; then
echo "Usage: ./build.sh ENV_NAME|all [--float] [--debug] [--local|--fast|--web|--profile|--cpu]"
exit 1
fi
ENV=$1
shift

for arg in "$@"; do
case $arg in
--float) PRECISION="-DPRECISION_FLOAT" ;;
--debug) DEBUG=1 ;;
--local) MODE=local ;;
--fast) MODE=fast ;;
--web) MODE=web ;;
--profile) MODE=profile ;;
--cpu) MODE=cpu; PRECISION="-DPRECISION_FLOAT" ;;
*) echo "Error: unknown argument '$arg'" && exit 1 ;;
esac
done

if [ "$ENV" = "all" ]; then
FAILED=""
for env_dir in ocean/*/; do
env=$(basename "$env_dir")
if bash "$0" "$env" && bash "$0" "$env" --float; then
echo "OK: $env"
else
echo "FAIL: $env"
FAILED="$FAILED\n $env"
fi
done

if [ -n "$FAILED" ]; then
echo -e "\nFailed builds:$FAILED"
fi
exit 0
fi

# Linux/mac
PLATFORM="$(uname -s)"
if [ "$PLATFORM" = "Linux" ]; then
RAYLIB_NAME='raylib-5.5_linux_amd64'
OMP_LIB=-lomp5
SANITIZE_FLAGS=(-fsanitize=address,undefined,bounds,pointer-overflow,leak -fno-omit-frame-pointer)
STANDALONE_LDFLAGS=(-lGL)
SHARED_LDFLAGS=(-Bsymbolic-functions)
else
RAYLIB_NAME='raylib-5.5_macos'
OMP_LIB=-lomp
SANITIZE_FLAGS=()
STANDALONE_LDFLAGS=(-framework Cocoa -framework IOKit -framework CoreVideo -framework OpenGL)
SHARED_LDFLAGS=(-framework Cocoa -framework OpenGL -framework IOKit -undefined dynamic_lookup)
fi

CLANG_WARN=(
-Wall
-ferror-limit=3
-Werror=incompatible-pointer-types
-Werror=return-type
-Wno-error=incompatible-pointer-types-discards-qualifiers
-Wno-incompatible-pointer-types-discards-qualifiers
-Wno-error=array-parameter
)

download() {
local name=$1 url=$2
[ -d "$name" ] && return
echo "Downloading $name..."
case "$url" in
*.zip) curl -sL "$url" -o "$name.zip" && unzip -q "$name.zip" && rm "$name.zip" ;;
*) curl -sL "$url" -o "$name.tar.gz" && tar xf "$name.tar.gz" && rm "$name.tar.gz" ;;
esac
}

RAYLIB_URL="https://github.com/raysan5/raylib/releases/download/5.5"
if [ "$MODE" = "web" ]; then
RAYLIB_NAME='raylib-5.5_webassembly'
download "$RAYLIB_NAME" "$RAYLIB_URL/$RAYLIB_NAME.zip"
else
download "$RAYLIB_NAME" "$RAYLIB_URL/$RAYLIB_NAME.tar.gz"
fi

RAYLIB_A="$RAYLIB_NAME/lib/libraylib.a"
INCLUDES=(-I./$RAYLIB_NAME/include -I./src -I./vendor)
LINK_ARCHIVES=("$RAYLIB_A")
EXTRA_SRC=""

if [ "$ENV" = "constellation" ]; then
SRC_DIR="constellation"
EXTRA_SRC="vendor/cJSON.c"
OUTPUT_NAME="seethestars"
elif [ "$ENV" = "trailer" ]; then
SRC_DIR="trailer"
OUTPUT_NAME="trailer/trailer"
elif [ "$ENV" = "impulse_wars" ]; then
SRC_DIR="ocean/$ENV"
if [ "$MODE" = "web" ]; then BOX2D_NAME='box2d-web'
elif [ "$PLATFORM" = "Linux" ]; then BOX2D_NAME='box2d-linux-amd64'
else BOX2D_NAME='box2d-macos-arm64'
fi
BOX2D_URL="https://github.com/capnspacehook/box2d/releases/latest/download"
download "$BOX2D_NAME" "$BOX2D_URL/$BOX2D_NAME.tar.gz"
INCLUDES+=(-I./$BOX2D_NAME/include -I./$BOX2D_NAME/src)
LINK_ARCHIVES+=("./$BOX2D_NAME/libbox2d.a")
elif [ -d "ocean/$ENV" ]; then
SRC_DIR="ocean/$ENV"
else
echo "Error: environment '$ENV' not found" && exit 1
fi

OUTPUT_NAME=${OUTPUT_NAME:-$ENV}

# Standalone environment build
if [ -n "$DEBUG" ] || [ "$MODE" = "local" ]; then
CLANG_OPT=(-g -O0 "${CLANG_WARN[@]}" "${SANITIZE_FLAGS[@]}")
NVCC_OPT="-O0 -g"
LINK_OPT="-g"
else
CLANG_OPT=(-O2 -DNDEBUG "${CLANG_WARN[@]}")
NVCC_OPT="-O2 --threads 0"
LINK_OPT="-O2"
fi
if [ "$MODE" = "local" ] || [ "$MODE" = "fast" ]; then
FLAGS=(
"${INCLUDES[@]}"
"$SRC_DIR/$ENV.c" $EXTRA_SRC -o "$OUTPUT_NAME"
"${LINK_ARCHIVES[@]}"
"${STANDALONE_LDFLAGS[@]}"
-lm -lpthread -fopenmp
-DPLATFORM_DESKTOP
)
echo "Compiling $ENV..."
${CC:-clang} "${CLANG_OPT[@]}" "${FLAGS[@]}"
echo "Built: ./$OUTPUT_NAME"
exit 0
elif [ "$MODE" = "web" ]; then
mkdir -p "build/web/$ENV"
echo "Compiling $ENV for web..."
emcc \
-o "build/web/$ENV/game.html" \
"$SRC_DIR/$ENV.c" $EXTRA_SRC \
-O3 -Wall \
"${LINK_ARCHIVES[@]}" \
"${INCLUDES[@]}" \
-L. -L./$RAYLIB_NAME/lib \
-sASSERTIONS=2 -gsource-map \
-sUSE_GLFW=3 -sUSE_WEBGL2=1 -sASYNCIFY -sFILESYSTEM -sFORCE_FILESYSTEM=1 \
--shell-file vendor/minshell.html \
-sINITIAL_MEMORY=512MB -sALLOW_MEMORY_GROWTH -sSTACK_SIZE=512KB \
-DNDEBUG -DPLATFORM_WEB -DGRAPHICS_API_OPENGL_ES3 \
--preload-file resources/$ENV@resources/$ENV \
--preload-file resources/shared@resources/shared
echo "Built: build/web/$ENV/game.html"
exit 0
fi

# Find cuDNN path
CUDA_HOME=${CUDA_HOME:-${CUDA_PATH:-$(dirname "$(dirname "$(which nvcc)")")}}
CUDNN_IFLAG=""
CUDNN_LFLAG=""
for dir in /usr/local/cuda/include /usr/include; do
if [ -f "$dir/cudnn.h" ]; then
CUDNN_IFLAG="-I$dir"
break
fi
done
for dir in /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu; do
if [ -f "$dir/libcudnn.so" ]; then
CUDNN_LFLAG="-L$dir"
break
fi
done
if [ -z "$CUDNN_IFLAG" ]; then
CUDNN_IFLAG=$(python -c "import nvidia.cudnn, os; print('-I' + os.path.join(nvidia.cudnn.__path__[0], 'include'))" 2>/dev/null || echo "")
fi
if [ -z "$CUDNN_LFLAG" ]; then
CUDNN_LFLAG=$(python -c "import nvidia.cudnn, os; print('-L' + os.path.join(nvidia.cudnn.__path__[0], 'lib'))" 2>/dev/null || echo "")
fi

export CCACHE_DIR="${CCACHE_DIR:-$HOME/.ccache}"
export CCACHE_BASEDIR="$(pwd)"
export CCACHE_COMPILERCHECK=content
NVCC="ccache $CUDA_HOME/bin/nvcc"
CC="${CC:-$(command -v ccache >/dev/null && echo 'ccache clang' || echo 'clang')}"
ARCH=${NVCC_ARCH:-native}

PYTHON_INCLUDE=$(python -c "import sysconfig; print(sysconfig.get_path('include'))")
PYBIND_INCLUDE=$(python -c "import pybind11; print(pybind11.get_include())")
NUMPY_INCLUDE=$(python -c "import numpy; print(numpy.get_include())")
EXT_SUFFIX=$(python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))")
OUTPUT="pufferlib/_C${EXT_SUFFIX}"

BINDING_SRC="$SRC_DIR/binding.c"
mkdir -p build
STATIC_OBJ="build/libstatic_${ENV}.o"
STATIC_LIB="build/libstatic_${ENV}.a"

if [ ! -f "$BINDING_SRC" ]; then
echo "Error: $BINDING_SRC not found"
exit 1
fi

echo "Compiling static library for $ENV..."
${CC:-clang} -c "${CLANG_OPT[@]}" \
-I. -Isrc -I$SRC_DIR \
-I./$RAYLIB_NAME/include -I$CUDA_HOME/include \
-DPLATFORM_DESKTOP \
-fno-semantic-interposition -fvisibility=hidden \
-fPIC -fopenmp \
"$BINDING_SRC" -o "$STATIC_OBJ"
ar rcs "$STATIC_LIB" "$STATIC_OBJ"

# Brittle hack: extract the obs tensor type from the binding source to build the trainer
OBS_TENSOR_T=$(awk '/^#define OBS_TENSOR_T/{print $3}' "$BINDING_SRC")
if [ -z "$OBS_TENSOR_T" ]; then
echo "Error: Could not find OBS_TENSOR_T in $BINDING_SRC"
exit 1
fi

if [ -z "$MODE" ]; then
echo "Compiling CUDA ($ARCH) training backend..."
$NVCC -c -arch=$ARCH -Xcompiler -fPIC \
-Xcompiler=-D_GLIBCXX_USE_CXX11_ABI=1 \
-Xcompiler=-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION \
-Xcompiler=-DPLATFORM_DESKTOP \
-std=c++17 \
-I. -Isrc \
-I$PYTHON_INCLUDE -I$PYBIND_INCLUDE -I$NUMPY_INCLUDE \
-I$CUDA_HOME/include $CUDNN_IFLAG -I$RAYLIB_NAME/include \
-Xcompiler=-fopenmp \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
$PRECISION $NVCC_OPT \
src/bindings.cu -o build/bindings.o

LINK_CMD=(
${CXX:-g++} -shared -fPIC -fopenmp
build/bindings.o "$STATIC_LIB" "$RAYLIB_A"
-L$CUDA_HOME/lib64 $CUDNN_LFLAG
-lcudart -lnccl -lnvidia-ml -lcublas -lcusolver -lcurand -lcudnn
$OMP_LIB $LINK_OPT
"${SHARED_LDFLAGS[@]}"
-o "$OUTPUT"
)
"${LINK_CMD[@]}"
echo "Built: $OUTPUT"

elif [ "$MODE" = "cpu" ]; then
echo "Compiling CPU training backend..."
${CXX:-g++} -c -fPIC -fopenmp \
-D_GLIBCXX_USE_CXX11_ABI=1 \
-DPLATFORM_DESKTOP \
-std=c++17 \
-I. -Isrc \
-I$PYTHON_INCLUDE -I$PYBIND_INCLUDE \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
$PRECISION $LINK_OPT \
src/bindings_cpu.cpp -o build/bindings_cpu.o
LINK_CMD=(
${CXX:-g++} -shared -fPIC -fopenmp
build/bindings_cpu.o "$STATIC_LIB" "$RAYLIB_A"
-lm -lpthread $OMP_LIB $LINK_OPT
"${SHARED_LDFLAGS[@]}"
-o "$OUTPUT"
)
"${LINK_CMD[@]}"
echo "Built: $OUTPUT"

elif [ "$MODE" = "profile" ]; then
echo "Compiling profile binary ($ARCH)..."
$NVCC $NVCC_OPT -arch=$ARCH -std=c++17 \
-I. -Isrc -I$SRC_DIR -Ivendor \
-I$CUDA_HOME/include $CUDNN_IFLAG -I$RAYLIB_NAME/include \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
-Xcompiler=-DPLATFORM_DESKTOP \
$PRECISION \
-Xcompiler=-fopenmp \
tests/profile_kernels.cu vendor/ini.c \
"$STATIC_LIB" "$RAYLIB_A" \
-lnccl -lnvidia-ml -lcublas -lcurand -lcudnn \
-lGL -lm -lpthread $OMP_LIB \
-o profile
echo "Built: ./profile"
fi
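
Note on the `OBS_TENSOR_T` step in build.sh: the awk pattern only matches a `#define OBS_TENSOR_T` that starts at column 0, and it takes the third whitespace-separated field as the type. A minimal sketch of that behavior on a throwaway file follows. The file contents and the type names `ByteTensor` and `Indented` are hypothetical, not taken from any real binding.c.

```shell
#!/bin/bash
set -e

# Hypothetical binding source; the type names are illustrative only.
tmp=$(mktemp)
printf '%s\n' \
    '#include "env.h"' \
    '#define OBS_TENSOR_T ByteTensor' \
    '  #define OBS_TENSOR_T Indented' > "$tmp"

# Same awk pattern as build.sh: anchored at line start, prints field 3.
OBS_TENSOR_T=$(awk '/^#define OBS_TENSOR_T/{print $3}' "$tmp")
rm -f "$tmp"

# The indented define does not match, so only one type is printed.
echo "$OBS_TENSOR_T"
```

An indented define is skipped, and a second column-0 define would produce two lines of output, which is presumably part of why the script labels this step brittle.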
1 change: 0 additions & 1 deletion config

This file was deleted.

7 changes: 2 additions & 5 deletions pufferlib/config/ocean/asteroids.ini → config/asteroids.ini
@@ -1,8 +1,5 @@
[base]
package = ocean
env_name = puffer_asteroids
policy_name = Policy
rnn_name = Recurrent
env_name = asteroids

[vec]
num_envs = 8
@@ -17,7 +14,7 @@ adam_beta2 = 0.9999436458974764
adam_eps = 6.915036275112011e-08
anneal_lr = true
batch_size = auto
bptt_horizon = 64
horizon = 64
checkpoint_interval = 200
clip_coef = 0.18588778503512546
ent_coef = 0.0016620361911332262
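
The visible hunks of asteroids.ini suggest a mechanical change: `package`, `policy_name`, and `rnn_name` are dropped, the `puffer_` prefix is stripped from `env_name`, and `bptt_horizon` becomes `horizon`. Assuming other configs follow the same pattern (this PR excerpt only shows one file, so that is an assumption), a rough sed sketch of the migration could look like the following. The sample input is reconstructed from the removed lines and is not a complete config.

```shell
#!/bin/bash
set -e

# Old-style config fragment reconstructed from the removed lines of the hunk.
old_ini=$(mktemp)
cat > "$old_ini" <<'EOF'
[base]
package = ocean
env_name = puffer_asteroids
policy_name = Policy
rnn_name = Recurrent

[vec]
num_envs = 8

; section header for the next key is not shown in the diff
bptt_horizon = 64
EOF

# Drop the removed keys, strip the puffer_ prefix, rename bptt_horizon.
migrated=$(sed -E \
    -e '/^(package|policy_name|rnn_name)[[:space:]]*=/d' \
    -e 's/^env_name[[:space:]]*=[[:space:]]*puffer_/env_name = /' \
    -e 's/^bptt_horizon([[:space:]]*=)/horizon\1/' \
    "$old_ini")
rm -f "$old_ini"

echo "$migrated"
```

This only covers the keys visible in the diff; any other renamed or removed keys in the collapsed parts of the file would need their own rules.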