Conversation
kshyatt left a comment
Apart from a minor comment, LGTM; this will be very helpful for the Enzyme PR.
Your PR no longer requires formatting changes. Thank you for your contribution!
Ope, GPU failures look related...
Needs #390 first, but should then be ready.
Codecov Report ✅ All modified and coverable lines are covered by tests.
... and 2 files with indirect coverage changes
Force-pushed from ab125b5 to e3ce090
This generally looks good; I left a few small comments and questions. But clearly, this is too much change for a detailed review. Is there a convenient way to review such a code reorganization, i.e. to separate what has merely moved to other files from what has actually changed? I could probably ask some agent, but I don't feel like doing that.
I don't think there is, but I did try not to alter anything except the organization of the tests.
In principle there is no reason to review the actual contents of the test files, since these are unchanged, which also explains some of the things you commented on.
Buildkite now succeeding!!!
-@tensor t′[1 2 3; 4 5] := t1[1; 4] * t2[2 3; 5]
+CUDA.@allowscalar begin
+    @tensor t′[1 2 3; 4 5] := t1[1; 4] * t2[2 3; 5]
+end
Why does this need an allowscalar? (just to understand what is still missing).
It's missing the changes in the BraidingTensor PR -- a BraidingTensor arises in this contraction for the Irrep[CU₁] case. Once the BraidingTensor changes are in, this can be removed.
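For context, a minimal sketch of what the `CUDA.@allowscalar` wrapper buys us (this is standard CUDA.jl behavior; the array and index here are purely illustrative, not the failing contraction):

```julia
using CUDA

a = CUDA.rand(4)   # a CuArray living in GPU memory

# In scripts and CI, CUDA.jl disallows scalar indexing by default:
# `a[1]` would throw, because reading one element forces a slow
# GPU-to-CPU round trip.
CUDA.@allowscalar begin
    x = a[1]       # explicitly permitted, but executed element-by-element
end
```

So the test passes under `@allowscalar` only because some operation in the contraction falls back to elementwise access; removing the wrapper (once the BraidingTensor changes land) restores the fast, fully on-device path.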
That is great news. The chainrules tests still take quite long and time out on Windows. The tensor-contraction AD test is repeated 5 times with random contraction patterns, so that cost can easily be cut down by lowering the number of repetitions. Also thanks to @borisdevos; your commits contain useful improvements.
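Lowering the repetition count could be as simple as the following sketch; `NREPS`, the `AD_TEST_REPS` environment variable, and the helper names are hypothetical illustrations, not the actual TensorKit test code:

```julia
# Hypothetical sketch: make the number of random contraction patterns
# configurable instead of hard-coding 5 repetitions.
# `random_contraction` and `test_ad` are illustrative names only.
const NREPS = parse(Int, get(ENV, "AD_TEST_REPS", "2"))

for _ in 1:NREPS
    pattern = random_contraction()  # draw a random contraction pattern
    test_ad(pattern)                # run the AD correctness check for it
end
```

This keeps full coverage available locally (e.g. `AD_TEST_REPS=5`) while defaulting CI to a cheaper setting.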
This is the output of a local run of chainrules/tensoroperations.jl:

---------------------------------------
Auto-diff with symmetry: Trivial
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Trivial | 1434 1434 1m30.8s
scalartype Float64 | 717 717 45.9s
scalartype ComplexF64 | 717 717 44.9s
---------------------------------------
Auto-diff with symmetry: Irrep[ℤ₂]
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Irrep[ℤ₂] | 1582 1582 1m26.8s
scalartype Float64 | 811 811 44.9s
scalartype ComplexF64 | 771 771 41.9s
---------------------------------------
Auto-diff with symmetry: Irrep[CU₁]
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Irrep[CU₁] | 1746 1746 31m46.5s
scalartype Float64 | 885 885 11m54.9s
scalartype ComplexF64 | 861 861 19m51.5s
---------------------------------------
Auto-diff with symmetry: (FermionParity ⊠ Irrep[SU₂] ⊠ Irrep[U₁])
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry (FermionParity ⊠ Irrep[SU₂] ⊠ Irrep[U₁]) | 1806 1806 202m42.4s
scalartype Float64 | 905 905 2m55.2s
scalartype ComplexF64 | 901 901 199m47.2s

Clearly the last two symmetry types are the culprit, but I would need to add more detailed timing statements to know the true origin. I checked that tensors with spaces which are related to […]. What is very strange is that for […].
It's been very hard to get the tests to pass again since I started making small changes in the spaces etc. There was also an issue with random numbers that accidentally caused a very small singular value in Float32 precision. I think everything is now working, except that one Windows test run failed due to some (unrelated?) error in compiling the CUDA package, and one of the actual Buildkite CUDA runs also failed. I don't know if this failure is deterministic and related to some actual change here; I fail to see how that could be the case, as I didn't change anything in the CUDA tests in the last few commits. If we can rule out that the failure is related to this PR, then I would be OK with having this merged, even though I will try to further improve the tests in follow-up PRs.
CUDA failure seems to have resolved itself btw.
Still not sure what is going on with the CUDA tests, but will merge for now and deal with fallout later. Thanks everyone for the help on this one!
Summary

- Test dependencies now live in test/Project.toml, separating them from the main package manifest
- A --fastmode that skips AD test groups and reduces sector/scalar-type coverage for quick iteration
- test/README.md documenting how to run tests, available groups, fast mode, and how to add new test files