Update platforms.mk for U50 FPGA #330
Open
rahul7rajdn wants to merge 1 commit into vortexgpgpu:master from
rn84@flubber1:~/USERSCRATCH/latest_vortex/build/hw/syn/xilinx/xrt(debug) $ FPGA_BIN_DIR=/net/netscratch/rn84/latest_vortex/build/hw/syn/xilinx/xrt/test1_xilinx_u50_gen3x16_xdma_5_202210_1_hw/bin TARGET=hw PLATFORM=xilinx_u280_gen3x16_xdma_1_202211_1 ../../../../ci/blackbox.sh --driver=xrt --app=demo
Running: make -C ../../../../ci/../runtime/xrt > /dev/null
Running: make -C ../../../../ci/../tests/regression/demo run-xrt
make: Entering directory '/net/netscratch/rn84/latest_vortex/build/tests/regression/demo'
SCOPE_JSON_PATH=/net/netscratch/rn84/latest_vortex/build/hw/syn/xilinx/xrt/test1_xilinx_u50_gen3x16_xdma_5_202210_1_hw/bin/scope.json XRT_INI_PATH=/net/netscratch/rn84/latest_vortex/build/runtime/xrt/xrt.ini EMCONFIG_PATH=/net/netscratch/rn84/latest_vortex/build/hw/syn/xilinx/xrt/test1_xilinx_u50_gen3x16_xdma_5_202210_1_hw/bin XRT_DEVICE_INDEX=0 XRT_XCLBIN_PATH=/net/netscratch/rn84/latest_vortex/build/hw/syn/xilinx/xrt/test1_xilinx_u50_gen3x16_xdma_5_202210_1_hw/bin/vortex_afu.xclbin LD_LIBRARY_PATH=/opt/xilinx/xrt/lib:/net/netscratch/rn84/latest_vortex/build/runtime:/opt/xilinx/xrt/lib:/opt/slurm/current/lib:/opt/slurm/current/lib: VORTEX_DRIVER=xrt ./demo -n64
open device connection
info: device name=xilinx_u50_gen3x16_xdma_base_5, memory_capacity=0x200000000 bytes, memory_banks=32.
data type: integer
number of points: 2048
buffer size: 8192 bytes
allocate device memory
allocating bank0...
reusing bank0...
reusing bank0...
dev_src0=0x10000
dev_src1=0x12000
dev_dst=0x14000
allocate host buffers
upload source buffer0
upload source buffer1
upload program
allocating bank8...
upload kernel argument
reusing bank0...
start device
wait for completion
download destination buffer
verify result
cleanup
freeing bank8...
freeing bank0...
allocating bank0...
PERF: core0: instrs=27999, cycles=150089, IPC=0.186549
PERF: core1: instrs=27999, cycles=150616, IPC=0.185897
PERF: instrs=55998, cycles=150616, IPC=0.371793
PASSED!
make: Leaving directory '/net/netscratch/rn84/latest_vortex/build/tests/regression/demo'

More info related to debugging from Saurabh:
I debugged further, and it seems the AXI response error we were seeing was caused by the HBM AXI rresp signal being 0x3 (DECERR). That means the memory mapping is probably incorrect and the interconnect cannot find any memory located at the address sent from the AFU.
So I tried the CONFIGS += -DPLATFORM_MERGED_MEMORY_INTERFACE option in platforms.mk for the U50 board, as is done for the U55 board (the two boards are similar). With it, hw_emu passes and I don't see any AXI errors in simulate.log.
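For reference, a minimal sketch of how the 2-bit AXI4 read response code decodes. The encoding itself comes from the AXI4 specification; the helper names `decode_rresp` and `is_error` are hypothetical and only serve to illustrate why 0x3 points at an address-decode problem rather than a fault inside the memory.

```python
from enum import IntEnum


class AxiResp(IntEnum):
    """AXI4 RRESP/BRESP encoding (2 bits)."""
    OKAY = 0b00    # normal access succeeded
    EXOKAY = 0b01  # exclusive access succeeded
    SLVERR = 0b10  # the addressed slave was reached but reported an error
    DECERR = 0b11  # the interconnect found no slave at that address


def decode_rresp(rresp: int) -> AxiResp:
    # Hypothetical helper: keep only the two response bits.
    return AxiResp(rresp & 0b11)


def is_error(rresp: int) -> bool:
    # Hypothetical helper: both SLVERR and DECERR count as failures.
    return decode_rresp(rresp) in (AxiResp.SLVERR, AxiResp.DECERR)


if __name__ == "__main__":
    print(decode_rresp(0x3).name)
```

A DECERR is produced by the interconnect itself, which is consistent with the theory that the address never matched a configured memory region.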
I was able to observe instructions in memory using my debugger, even on the non-functional bitstream.
I'm not exactly sure how that works.
Vortex_afu has 2 AXI interfaces at the top level: a master interface (multi-channel) that is used to access HBM, and a slave interface that the host uses to communicate with the AFU. The AXI errors we observed were generated on the HBM interface, suggesting that the banks are not properly configured for the platform. As a result, the addresses sent out by Vortex during program execution don't land in any legal memory range.
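As a rough illustration of that failure mode, here is a toy address decoder. The capacity and bank count are taken from the demo log (`memory_capacity=0x200000000`, `memory_banks=32`); everything else is an assumption made for the sketch, including equally sized contiguous banks and the `mapped_banks` set standing in for whatever the platform actually connects. It is not the real interconnect logic.

```python
MEMORY_CAPACITY = 0x200000000  # bytes, as reported in the demo log
MEMORY_BANKS = 32              # as reported in the demo log
# Assumption: banks are contiguous and equally sized (256 MiB each here).
BANK_SIZE = MEMORY_CAPACITY // MEMORY_BANKS


def decode(addr: int, mapped_banks: set) -> tuple:
    """Return (bank, response) for an address.

    `mapped_banks` is a hypothetical stand-in for the set of banks the
    platform configuration actually connects to the interconnect.
    """
    bank = addr // BANK_SIZE
    if addr < 0 or bank >= MEMORY_BANKS or bank not in mapped_banks:
        return None, "DECERR"  # no slave claims this address
    return bank, "OKAY"


if __name__ == "__main__":
    all_banks = set(range(MEMORY_BANKS))
    # Same address, two different platform configurations.
    print(decode(0x80000000, all_banks))
    print(decode(0x80000000, {0}))
```

In this model an access to a low address can succeed while an access that falls in an unconnected bank fails, which matches the symptom of only some accesses returning errors.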
I think that when the debugger reads memory (by injecting ld instructions into the Vortex pipeline), it might use only a few AXI channels, whereas normal program execution perhaps uses more channels, and the ones that aren't configured properly result in AXI errors. Another thing to note is that when using the debugger, Vortex is mostly stalled and instruction/data accesses are infrequent. I'm not sure whether that changes anything.
Another observation: I tried stepping through the program with the debugger and found that the warps get stuck (looping infinitely) in the memcpy call in the _init_tls function. I suspect that because the HBM AXI interface returns a nonzero response code, Vortex might be discarding the data or reading an incorrect value, resulting in an infinite loop that waits forever for a valid response.
I tested the bitstream built with CONFIGS += -DPLATFORM_MERGED_MEMORY_INTERFACE on the U50 and it passes (including demo, sgemm, vecadd).
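The PERF lines in the demo log can be sanity-checked with a short script. This is only a sketch: the regular expression is written against the three PERF lines shown in the log, and `parse_perf` is a hypothetical helper, not part of the Vortex tooling. It recomputes IPC as instrs / cycles and compares it with the reported value.

```python
import re

# Pattern matched against the PERF lines as they appear in the demo log.
PERF_RE = re.compile(
    r"PERF:(?: core(\d+):)? instrs=(\d+), cycles=(\d+), IPC=([\d.]+)"
)

LOG = """\
PERF: core0: instrs=27999, cycles=150089, IPC=0.186549
PERF: core1: instrs=27999, cycles=150616, IPC=0.185897
PERF: instrs=55998, cycles=150616, IPC=0.371793
"""


def parse_perf(text: str) -> list:
    """Return (core or None, instrs, cycles, reported_ipc) per PERF line."""
    rows = []
    for m in PERF_RE.finditer(text):
        core = int(m.group(1)) if m.group(1) is not None else None
        rows.append((core, int(m.group(2)), int(m.group(3)), float(m.group(4))))
    return rows


if __name__ == "__main__":
    for core, instrs, cycles, reported in parse_perf(LOG):
        label = "total" if core is None else f"core{core}"
        computed = instrs / cycles
        ok = abs(computed - reported) < 1e-6
        print(f"{label}: computed IPC={computed:.6f} matches reported: {ok}")
```

In this particular log the total line uses the sum of the per-core instruction counts and the larger of the two cycle counts.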