FP comparison instructions (vmfge, vmfeq, vmflt, etc.) produce mask results that should be routed to the mask unit. Under pipeline pressure from filler instructions, the routing flag is read for the wrong instruction, so the comparison result is written to the vector register file (VRF) instead of being sent to the mask unit.
Reproducer
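vsetivli zero, 4, e32, m1, ta, mu   # vl = 4, SEW = 32, LMUL = 1, tail agnostic, mask undisturbed
vle32.v v4, (a1)                    # load 4 floats into v4
vle32.v v6, (a2)                    # load 4 floats into v6
vfadd.vv v24, v24, v26              # filler: independent of the comparison below
vmfge.vf v0, v6, fa5                # v0[i] = (v6[i] >= fa5); result belongs to the mask unit
vfadd.vv v4, v6, v6, v0.t           # masked add; inactive elements of v4 stay unchanged (mu)
vse32.v v4, (a3)                    # store v4 for checking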
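For checking the stored values by hand, the architectural result of the reproducer can be sketched in a few lines of Python. This is only an illustrative model of the RVV semantics of the sequence above (vl = 4, e32, mask undisturbed); the function name `expected_store` and the sample inputs are hypothetical and are not taken from the ara-test repository.

```python
import struct


def f32(x):
    """Round a Python float to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def expected_store(a1, a2, fa5, vl=4):
    """Values that vse32.v should write to (a3) for the reproducer.

    a1, a2: input arrays loaded into v4 and v6.
    fa5:    scalar compared against v6.
    Inputs are assumed small enough that v6 + v6 does not overflow float32.
    The filler vfadd on v24/v26 does not touch v0, v4 or v6, so it is ignored.
    """
    v4 = [f32(x) for x in a1[:vl]]            # vle32.v v4, (a1)
    v6 = [f32(x) for x in a2[:vl]]            # vle32.v v6, (a2)
    mask = [x >= f32(fa5) for x in v6]        # vmfge.vf v0, v6, fa5 (NaN compares false)
    # vfadd.vv v4, v6, v6, v0.t with mu: inactive elements keep the old v4 value
    return [f32(v6[i] + v6[i]) if mask[i] else v4[i] for i in range(vl)]


if __name__ == "__main__":
    print(expected_store([10.0, 20.0, 30.0, 40.0], [1.0, 2.0, 3.0, 4.0], 2.5))
```

With these sample inputs only the last two elements satisfy `v6[i] >= fa5`, so the model gives `[10.0, 20.0, 6.0, 8.0]`. A simulator run that returns different values for the same inputs would be consistent with the mask never reaching `v0`.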
Steps to Reproduce
git clone https://github.com/ianfield/ara-test.git
cd ara-test/
export LLVM_RISCV=/opt/homebrew/Cellar/llvm/21.1.7/bin/
./sh/checkout.sh
./sh/build.sh
./sh/run.sh --sim spike -k test_vfadd_hazard
./sh/run.sh --sim fork -k test_vfadd_hazard
./sh/run.sh --sim base -k test_vfadd_hazard
Proposed change
vmfpu.sv:211
case (op) inside
  VFDIV, VFRDIV, VFSQRT: fpu_latency = LatFDivSqrt;
  [VFREDMIN:VFREDMAX]: fpu_latency = LatFNonComp;
  [VFCVTXUF:VFCVTFF]: fpu_latency = LatFConv;
  [VFMIN:VFSGNJX]: fpu_latency = LatFNonComp;
  [VMFEQ:VMFGE]: fpu_latency = LatFNonComp;
  default: begin