Commit e5b9950

[SLP] Avoid adding duplicate VFs into vectorizeStores()::CandidateVFs (llvm#179296)
Small compile time improvement:
```
stage1-O3: (-0.01%)
stage1-ReleaseThinLTO (-0.00%)
stage1-ReleaseLTO-g (-0.01%)
stage1-O0-g (-0.00%)
stage1-aarch64-O3 (+0.01%)
stage1-aarch64-O0-g (-0.02%)
stage2-O3 (-0.00%)
stage2-O0-g (-0.03%)
stage2-clang (+0.00%)
```
Also changes/removes a few comments for clarity.
1 parent 73417aa commit e5b9950

1 file changed

Lines changed: 6 additions & 8 deletions

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -24480,8 +24480,6 @@ bool SLPVectorizerPass::vectorizeStores(
   // Mark the vectorized stores so that we don't vectorize them
   // again.
   VectorizedStores.insert_range(Slice);
-  // Mark the vectorized stores so that we don't vectorize them
-  // again.
   AnyProfitableGraph = RepeatChanged = Changed = true;
   // If we vectorized initial block, no need to try to vectorize
   // it again.
@@ -24563,19 +24561,19 @@ bool SLPVectorizerPass::vectorizeStores(
             find_if(RangeSizes, IsNotVectorized)) +
         1));
   unsigned VF = bit_ceil(CandidateVFs.front()) * 2;
-  unsigned Limit =
-      getFloorFullVectorNumberOfElements(*TTI, StoreTy, MaxTotalNum);
-  CandidateVFs.clear();
-  if (bit_floor(Limit) == VF)
-    CandidateVFs.push_back(Limit);
   if (VF > MaxTotalNum || VF >= StoresLimit)
     break;
   for (std::pair<unsigned, unsigned> &P : RangeSizes) {
     if (P.first != 0)
       P.first = std::max(P.second, P.first);
   }
-  // Last attempt to vectorize max number of elements, if all previous
+  // Attempt again to vectorize even larger chains if all previous
   // attempts were unsuccessful because of the cost issues.
+  CandidateVFs.clear();
+  unsigned Limit =
+      getFloorFullVectorNumberOfElements(*TTI, StoreTy, MaxTotalNum);
+  if (bit_floor(Limit) == VF && Limit != VF)
+    CandidateVFs.push_back(Limit);
   CandidateVFs.push_back(VF);
 }
 }