In the Tucker-1 model, some first-order slices of the input Y may be redundant if they are similar to each other. We could remove them before factorization so we have a smaller input to work with, and then add them back at the end, copying information from the similar slices.
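A minimal sketch of the prune-and-restore bookkeeping, written in NumPy purely for illustration. It assumes the Tucker-1 model has the form Y[i,j,k] = sum_r A[i,r] B[r,j,k], that the slices in question are Y[i,:,:], and that "adding them back" means copying rows of the factor A. The names `prune`, `restore_rows`, and `representative` are hypothetical and not part of any existing API.

```python
import numpy as np

def prune(Y, representative):
    """Keep one slice per group of similar slices.

    representative[i] is the index of the slice that stands in for slice i
    (representative[i] == i when slice i is kept).
    Returns the smaller tensor and, for every original slice, the row of its
    stand-in inside the smaller tensor.
    """
    representative = np.asarray(representative)
    kept = np.unique(representative)                  # sorted indices of kept slices
    position = np.searchsorted(kept, representative)  # where each stand-in lives in `kept`
    return Y[kept], position                          # note: integer indexing copies the data

def restore_rows(A_small, position):
    """Expand the factor fitted on the pruned tensor by copying rows."""
    return A_small[position]

# Example: slice 2 duplicates slice 0 because the matching rows of A are equal.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))
A[2] = A[0]
B = rng.normal(size=(2, 3, 5))
Y = np.einsum("ir,rjk->ijk", A, B)

Y_small, position = prune(Y, [0, 1, 0, 3])
A_restored = restore_rows(A[[0, 1, 3]], position)  # stand-in for a factor fitted on Y_small
```

In this exact-duplicate case the restored factor reproduces Y. With merely similar slices the copied rows are only approximate, so the result may be better treated as an initial guess.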
Possible complications
- What metric should we use for similarity? Frobenius inner product? Will it have to depend on the loss function? What threshold do we set for similarity?
- Is this unique to Tucker-1 factorization? If not, we need a database of how factors interact across all types of possible AbstractDecompositions.
- Substituting factors at the end may not yield a result close enough to the full factorization. More likely, this will just be a warm start for the full factorization. We will need to benchmark to see how many slices must be pruned for this technique to be worth it.
- This involves copying the input data, which could be large even if we are pruning many slices.
- If the input Y is stored in an information-efficient way (e.g. a sparse tensor), we may also not want to copy the data over to a dense representation. So we would need to be able to prune by initializing an array of the same type as the input, which requires access to constructors the user has defined or imported, unless we commit to making dense copies.
Because of this last complication, we may only want to implement a checker function that recommends whether the user should perform any pruning of their own.
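Such a checker could look roughly like the sketch below, again in NumPy for illustration only. It assumes the slices are Y[i,:,:], uses the normalized Frobenius inner product (cosine similarity) as one possible metric, and groups slices greedily. The function name `recommend_pruning`, the `threshold` and `min_fraction` parameters, and their default values are all hypothetical and would need to be tuned by benchmarking.

```python
import numpy as np

def recommend_pruning(Y, threshold=0.99, min_fraction=0.2):
    """Report whether enough slices Y[i] are near-duplicates to justify pruning.

    Returns (recommend, representative), where representative[i] is the index
    of the earlier slice that slice i resembles, or i itself if there is none.
    """
    n = Y.shape[0]
    flat = np.asarray(Y, dtype=float).reshape(n, -1)  # assumption: a dense copy is acceptable here
    norms = np.linalg.norm(flat, axis=1)
    norms[norms == 0] = 1.0              # all-zero slices end up with similarity 0 to everything
    unit = flat / norms[:, None]
    similarity = unit @ unit.T           # normalized Frobenius inner products, n-by-n

    representative = np.arange(n)
    kept = []
    for i in range(n):
        for j in kept:                   # greedy: match against the first similar kept slice
            if similarity[i, j] >= threshold:
                representative[i] = j
                break
        else:
            kept.append(i)

    n_redundant = n - len(kept)
    return n_redundant >= min_fraction * n, representative

# Example: slice 2 is an exact copy of slice 0.
rng = np.random.default_rng(1)
Y = rng.normal(size=(4, 5, 6))
Y[2] = Y[0]
recommend, representative = recommend_pruning(Y, threshold=0.999, min_fraction=0.25)
```

Two caveats with this particular choice: cosine similarity ignores scale, so a slice that is a positive multiple of another counts as similar even though copying its factor row unchanged would be wrong, and the n-by-n similarity matrix becomes expensive when there are many slices.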