Problem
Many users seem to want to run stretched-nmf on thousands of files, but a run of that size can currently take hours. The computations are genuinely heavy, but the runtime could likely be reduced substantially.
Proposed solution
Investigate whether the underlying math model is compatible with multi-threading. My instinct is that it should be, since large computation steps (like update_stretch) do not seem to depend sequentially on prior calculations. If it turns out not to be compatible, multi-threading would be limited to the grid search functionality.
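To make the idea concrete, here is a minimal sketch of the pattern, assuming the per-signal work really is independent. It is a toy, not the actual update_stretch implementation: `best_stretch`, `update_all_stretches`, and all of their parameters are hypothetical names invented for illustration, and the brute-force search over candidate stretch factors only stands in for whatever the real update computes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def best_stretch(signal, component, grid, candidates):
    # Hypothetical stand-in for one signal's share of a stretch update:
    # try each candidate factor and keep the one with the smallest residual.
    residuals = [
        np.sum((signal - np.interp(grid / a, grid, component)) ** 2)
        for a in candidates
    ]
    return candidates[int(np.argmin(residuals))]


def update_all_stretches(signals, component, grid, candidates, max_workers=4):
    # Each column of `signals` is handled independently of the others, so the
    # work can be mapped over a thread pool. pool.map returns results in the
    # same order as the input columns.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(
                lambda col: best_stretch(col, component, grid, candidates),
                signals.T,
            )
        )
```

Whether threads give a real speedup here depends on how much of the time is spent inside NumPy calls that release the GIL; if the Python-level loop dominates, a process pool may be needed instead. That is part of what the investigation would have to establish.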