session: disable AMReX arena preallocation to fit T4 with EB #296

Merged: jameslehoux merged 1 commit into master from claude/issue-289-mlmg-eb-migration on May 28, 2026

@jameslehoux

AMReX's default GPU arena preallocates a large chunk of VRAM upfront (roughly all available memory minus headroom). With EB enabled, the EB metadata allocations made on top of that preallocation push the T4 (15 GB) over its limit, causing an out-of-memory failure at AMReX init. Passing `amrex.the_arena_init_size=0` skips the upfront preallocation, so memory is allocated lazily as it is requested. This adds a small runtime overhead but unblocks Colab T4 users.
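As a rough illustration of the change described above, the sketch below appends the override to a list of AMReX runtime arguments. AMReX reads `name=value` runtime parameters, but the helper name `with_lazy_arena` and the example arguments are hypothetical. This PR page does not show how the session actually forwards arguments to AMReX, so treat this as one possible shape rather than the merged implementation.

```python
ARENA_KEY = "amrex.the_arena_init_size"


def with_lazy_arena(args):
    """Return a copy of `args` with arena preallocation disabled.

    Hypothetical helper: any existing setting for the same key is dropped
    so that the appended `=0` override is the only one present.
    """
    kept = [a for a in args if not a.startswith(ARENA_KEY + "=")]
    return kept + [f"{ARENA_KEY}=0"]


if __name__ == "__main__":
    # Illustrative argument list; "inputs" and "max_step=10" are placeholders.
    print(with_lazy_arena(["inputs", "max_step=10"]))
```

Replacing an existing value instead of appending a second one keeps the result unambiguous if a caller has already set an arena size.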

jameslehoux merged commit 54b1db7 into master on May 28, 2026
@github-actions

Performance Benchmark Results

| Size | Solver | Wall Time (s) | Tortuosity | Expected | Rel. Error | Iters | Status |
|------|--------|---------------|------------|----------|------------|-------|--------|
| 64³ | pcg | 0.6590 | 0.984375 | 0.984375 | 0.00e+00 | 1 | PASS |
| 64³ | flexgmres | 0.4084 | 0.984375 | 0.984375 | 0.00e+00 | N/A | PASS |
| 64³ | bicgstab | 0.3916 | 0.984375 | 0.984375 | 0.00e+00 | N/A | PASS |
| 64³ | gmres | 0.3892 | 0.984375 | 0.984375 | 0.00e+00 | N/A | PASS |
| 128³ | pcg | 7.6073 | 0.992188 | 0.992188 | 0.00e+00 | 1 | PASS |
| 128³ | flexgmres | 5.3374 | 0.992188 | 0.992188 | 0.00e+00 | N/A | PASS |
| 128³ | bicgstab | 5.2480 | 0.992188 | 0.992188 | 0.00e+00 | N/A | PASS |
| 128³ | gmres | 5.2299 | 0.992188 | 0.992188 | 0.00e+00 | N/A | PASS |

Fastest solver: gmres at 64³ (0.3892s)

Benchmark: uniform block (analytical τ = (N-1)/N)
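
The "Expected" column can be checked by hand from the analytical formula τ = (N-1)/N quoted above. The sketch below does so for N = 64 and N = 128. The function names are illustrative and are not taken from the benchmark script, and the relative-error definition is an assumed conventional one, since the page does not state how the benchmark computes it.

```python
def analytical_tau(n):
    """Analytical tortuosity for the uniform-block benchmark: (N - 1) / N."""
    return (n - 1) / n


def relative_error(measured, expected):
    """Assumed definition: |measured - expected| / |expected|."""
    return abs(measured - expected) / abs(expected)


if __name__ == "__main__":
    for n in (64, 128):
        tau = analytical_tau(n)
        # Six decimals, matching the precision shown in the results table.
        print(f"{n}^3: expected tau = {tau:.6f}")
```

For N = 64 the formula gives 63/64 = 0.984375 exactly, and for N = 128 it gives 127/128 = 0.9921875, which the table reports to six decimal places.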

@github-actions

Code Coverage Report

------------------------------------------------------------------------------
                           GCC Code Coverage Report
Directory: .
------------------------------------------------------------------------------
File                                       Lines     Exec  Cover   Missing
------------------------------------------------------------------------------
src/io/CathodeWrite.cpp                       95       83    87%   40-41,97-100,115-116,182-185
src/io/CathodeWrite.H                          1        1   100%
src/io/DatReader.cpp                         136      106    77%   28-29,32,37,94-95,101-102,109-111,137-139,143,146-150,154-157,164,166,210-211,256,259
src/io/DatReader.H                             1        1   100%
src/io/HDF5Reader.cpp                        344       84    24%   40-41,43-44,46-49,52,54-56,58-59,62,64-66,68-74,92-93,126-128,144-145,154-157,174-180,182-187,204,213-215,217,219-228,230-233,236-238,240-251,253-258,266,266,266,266,266,266,266,270,270,270,270,270,270,270,274,276,278,280,282,288,290,297,297,297,297,297,297,297,301,301,301,301,301,301,301,305,305,305,305,305,305,305-306,306,306,306,306,306,306,309,309,309,309,309,309,309-310,310,310,310,310,310,310-311,311,311,311,311,311,311,313,313,313,313,313,313,313-314,314,314,314,314,314,314-315,315,315,315,315,315,315,319,319,319,319,319,319,319,324,324,324,324,324,324,324-325,325,325,325,325,325,325-326,326,326,326,326,326,326-327,327,327,327,327,327,327,332,332,332,332,332,332,332,337,337,337,337,337,337,337-338,338,338,338,338,338,338,343,343,343,343,343,343,343,350,350,350,350,350,350,350,357-358,432-435,437-440
src/io/HDF5Reader.H                            3        3   100%
src/io/ImageLoader.cpp                        61       42    68%   25,38,48,60-62,64-70,72,77,89-90,92,94
src/io/RawReader.cpp                         267      136    50%   51-52,91-92,113-114,117-119,122-123,142-144,157-159,168-170,176-179,187-188,194-198,202-206,211-214,221-226,233-239,273,275-276,278,285-286,303,314,316,320,327,329,333-336,340,348-349,355-357,363-365,367-368,371,374,376,379-382,384-386,388,390-391,393,395-396,398,400-401,403,405-406,408,412-413,415,419-420,422,427,467,473-474,535-538,552,554-556,558,560-562,572,576-578,580,602
src/io/RawReader.H                             1        1   100%
src/io/TiffReader.cpp                        385      131    34%   60-66,68-70,72-74,76-78,80-81,83-85,87-89,91-93,95-97,99-100,102-104,107-109,112-113,115-118,120,123,125-128,144-145,149-151,153-159,161,187,211,218,227,229-232,241,243-246,249,256,289-294,307,310-318,320-321,324-328,332-336,339-343,345-349,352-358,360-364,368,370,376-378,380-394,397,399-403,405-410,414-419,421-426,429-430,433-435,569-589,591-592,595-602,604,607-623,626-628,684,687-688,691-697,699,703-714,716-717
src/io/TiffReader.H                            5        5   100%
src/props/BoundaryCondition.H                131       74    56%   63,68,70,216,224-229,233-236,238-244,247-249,252-253,255,258-261,264-265,271-272,274-279,285-287,290-296,299,303,365-366,371,373
src/props/ConnectedComponents.cpp             71       69    97%   115-116
src/props/ConnectedComponents.H                4        4   100%
src/props/DeffTensor.cpp                      62       59    95%   122,128-129
src/props/Diffusion.cpp                      510      378    74%   93-94,97-98,103-104,106-116,118,123-132,134-141,144-150,153-157,159-163,165,168-173,175-177,179,182-184,186-187,190-191,193,195-198,200,202-203,288-289,297-298,300,349,359-360,368-371,373-375,404-413,415,453,461,465-467,526-527,533,535,539,547,581,610,638,646,735-736,739-740,757-760,771-772,774,824
src/props/EffDiffFillMtx.H                   120      106    88%   58,216-217,221-225,229,231-235
src/props/EffectiveDiffusivityHypre.cpp      413      372    90%   189-191,193-197,352-355,458,610-613,615-617,619-622,631-634,641,670,682-685,687-689,691,706,724,726
src/props/EffectiveDiffusivityHypre.H          7        7   100%
src/props/FloodFill.cpp                       90       87    96%   109-110,250
src/props/HypreStructSolver.cpp              343      210    61%   87-88,121,133-134,145,303,313,315,318,350,360,362,365,371-374,376-380,382-383,385-389,392-393,395-396,398,401-402,405-406,408-411,413-417,419-420,422-426,429-430,432-433,435,438-439,442-443,445-447,449-455,457-461,464-465,467-468,470,473-474,477,479-481,483-489,491-495,498-499,501-502,504,507-508,511,513-515,517-520,522-526,529-530,532-533,535,538-539,542,545-546,559
src/props/HypreStructSolver.H                  6        6   100%
src/props/MacroGeometry.H                     17       17   100%
src/props/ParticleSizeDistribution.cpp        11       11   100%
src/props/ParticleSizeDistribution.H           6        6   100%
src/props/PercolationCheck.cpp                53       46    86%   32-33,49-51,68,73
src/props/PercolationCheck.H                   4        4   100%
src/props/PhysicsConfig.H                     90       89    98%   150
src/props/ResultsJSON.H                      225      222    98%   242,395,416
src/props/REVStudy.cpp                       151      128    84%   72,83-91,159,170-173,175,183-186,188-190
src/props/SolverConfig.H                      32       20    62%   30,32,37-44,75-76
src/props/SpecificSurfaceArea.cpp             56       55    98%   59
src/props/SpecificSurfaceArea.H                6        6   100%
src/props/ThroughThicknessProfile.cpp         38       38   100%
src/props/ThroughThicknessProfile.H            5        5   100%
src/props/Tortuosity.H                         2        2   100%
src/props/TortuosityDirect.cpp               219      191    87%   81-83,86,100-106,113-114,125,134,140,202-209,226,394,424,433
src/props/TortuosityDirect.H                   5        5   100%
src/props/TortuosityHypre.cpp                793      567    71%   149-150,155-156,240-243,246-248,311,335-337,340-341,343,371-373,376-378,408-411,620,644,648,669,686-687,689-691,694-701,708-709,711,713,716-726,730-736,738-742,746-748,750-752,755-762,769-770,772,774-784,788-796,798-801,803,813,819-822,824-826,835-838,840-842,878,881-882,902-904,907,918-921,923,960,965-968,971-973,977-980,982,984-987,989,994-996,998,1047,1056,1061,1064-1069,1085-1088,1102-1106,1111-1116,1126-1130,1135-1140,1145-1149,1152-1155,1162-1165,1176,1185,1187,1191,1193,1218,1259-1260,1346-1348,1474-1477
src/props/TortuosityHypre.H                   15       15   100%
src/props/TortuosityHypreFill.H              127       98    77%   85,203,205-212,237-239,241-245,247-248,250,252,255-256,258-262
src/props/TortuosityKernels.H                 97       53    54%   52,56-60,62-65,69-74,76-80,84-85,90,129,143,157,243,245-248,250-253,257-260,262-265
src/props/TortuosityMLMG.cpp                 149      142    95%   283-285,287-288,293,314
src/props/TortuosityMLMG.H                     1        1   100%
src/props/TortuositySolverBase.cpp           311      247    79%   70-72,74-75,94-100,118,122,124,160-163,218,221,223,409,412-414,416,424-427,429-435,440,445-447,453-454,456-458,494,498-500,503,508-511,513,544,548-550,552,554,558
src/props/TortuositySolverBase.H              13       13   100%
src/props/VolumeFraction.cpp                  25       25   100%
src/props/VolumeFraction.H                     4        4   100%
------------------------------------------------------------------------------
TOTAL                                       5511     3975    72%
------------------------------------------------------------------------------


Generated by CI — coverage data from gcovr

@codecov

codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
