[Feature] Implement DMA support#293

Open
BenkangPeng wants to merge 13 commits into tancheng:master from BenkangPeng:dma-cgra

Conversation

@BenkangPeng
Collaborator

Related issue: coredac/CGRA-SoC#2

This PR introduces CgraDmaRTL, which integrates the CGRA with a DMA engine, enabling direct memory transfers between external DRAM (not implemented yet) and the CGRA's data SPM.

@BenkangPeng requested review from HobbitQia and tancheng on June 2, 2026 13:55
Comment thread mem/dma/DmaEngineRTL.py Outdated
Comment thread mem/dma/DmaEngineRTL.py
Comment thread mem/dma/DmaEngineRTL.py
Comment thread mem/dma/DmaEngineRTL.py Outdated
Comment thread mem/dma/DmaEngineRTL.py Outdated
Comment thread mem/dma/DmaEngineRTL.py Outdated
Comment thread cgra/CgraDmaRTL.py Outdated
Comment on lines +116 to +118
s.mem_rd_req_val = OutPort() # dma_read_request_valid
s.mem_rd_req_rdy = InPort() # dma_read_request_ready
s.mem_rd_req_addr = OutPort(DmaDramAddrType)
Owner

Why can't we use the RecvIfcRTL and SendIfcRTL interfaces to connect the DmaRTL?
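For context, here is a minimal plain-Python sketch of the val/rdy handshake that send/recv-style interfaces bundle together with a message. It is only a behavioral model under stated assumptions, not PyMTL3 RTL: `ValRdyChannel` and `simulate` are hypothetical names and do not reflect the actual RecvIfcRTL/SendIfcRTL implementation.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class ValRdyChannel:
    """Hypothetical channel model: val, rdy, and msg travel together."""
    val: bool = False
    rdy: bool = False
    msg: Optional[Any] = None

    def fires(self) -> bool:
        # A transfer happens only in a cycle where both sides agree.
        return self.val and self.rdy


def simulate(msgs: Sequence[Any], rdy_pattern: Sequence[bool]) -> List[Any]:
    """Push msgs through the channel, one cycle per entry of rdy_pattern."""
    ch = ValRdyChannel()
    pending = list(msgs)
    received: List[Any] = []
    for rdy in rdy_pattern:
        ch.val = bool(pending)                    # sender asserts val while it has data
        ch.msg = pending[0] if pending else None  # message is held until accepted
        ch.rdy = rdy                              # receiver may stall in any cycle
        if ch.fires():
            received.append(pending.pop(0))
    return received
```

With one interface object per direction, the three separate `mem_rd_req_*` ports in the diff would collapse into a single connection, and the address would become the message payload.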

Collaborator Author

Fixed.

Comment thread mem/data/DataMemControllerRTL.py Outdated
Comment thread cgra/CgraTemplateRTL.py Outdated
Comment on lines +225 to +230
s.data_mem.spm_dma_rval //= s.spm_dma_rval
s.data_mem.spm_dma_rrdy //= s.spm_dma_rrdy
s.data_mem.spm_dma_raddr //= s.spm_dma_raddr
s.data_mem.spm_dma_rresp_val //= s.spm_dma_rresp_val
s.data_mem.spm_dma_rresp_rdy //= s.spm_dma_rresp_rdy
s.data_mem.spm_dma_rresp_data //= s.spm_dma_rresp_data
Owner

As we discussed before, should we connect the DMA to the controller (as an intermediate interface) instead of connecting it directly to the data SPM?

So then we can leverage

s.recv_from_cpu_pkt //= s.recv_from_cpu_pkt_queue.recv
s.send_to_cpu_pkt //= s.send_to_cpu_pkt_queue.send
to decide the next location (e.g., local SPM bank, remote SPM)?
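A rough plain-Python sketch of the kind of routing decision meant here. The address map is purely an assumption for illustration: it supposes a flat global address space where each CGRA owns a contiguous range of `num_banks * bank_size` entries. `SpmLayout` and `route` are hypothetical names, not existing code in the repository.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpmLayout:
    """Assumed layout: each CGRA owns one contiguous slice of a global address space."""
    cgra_id: int    # index of this CGRA in the assumed global map
    num_banks: int  # SPM banks per CGRA
    bank_size: int  # addressable entries per bank

    @property
    def span(self) -> int:
        return self.num_banks * self.bank_size


def route(addr: int, layout: SpmLayout) -> Tuple[str, int, int]:
    """Return ("local", bank, offset) or ("remote", owner_cgra_id, offset)."""
    if addr < 0:
        raise ValueError("address must be non-negative")
    owner, offset = divmod(addr, layout.span)
    if owner == layout.cgra_id:
        # Local access: pick the bank, then the offset within that bank.
        bank, bank_offset = divmod(offset, layout.bank_size)
        return ("local", bank, bank_offset)
    # Anything else would be forwarded toward the owning CGRA.
    return ("remote", owner, offset)
```

For example, with `SpmLayout(cgra_id=1, num_banks=2, bank_size=16)`, address 40 resolves to local bank 0 and address 70 is forwarded to CGRA 2.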

@HobbitQia
Collaborator

Hi @tancheng @BenkangPeng, I summarized two directions for the DMA design below:

  • Rely on data controller

    • DMA is added as a new client of the DataMemControllerRTL: the DMA engine exchanges data directly with the DataMemControllerRTL, and the logic for multiplexing SPM ports is also implemented in that module.

      To initiate DMA, the CPU can send dma_mvin or dma_mvout to the CGRA, after which the controller activates the DMA engine by sending start signals.

    • Pros: Keeps the controller clean; provides a faster path because data does not go through the controller.

    • Cons: Additional logic is required to feed DMA results into the control memory.

    image
  • All in controller

    • All decoding logic is handled in the controller. The port-handling logic should still reside in the DataMemControllerRTL, since the data memory should have its own port-multiplexing logic.

      The packetization logic should also be implemented in the controller module.

    • Pros: Unifies control and data memory within the controller (the controller is already connected to both control and data memory).

    • Cons: Introduces complex control logic in the controller; results in a slower path.

    image
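To make the packetization step in option 2 concrete, here is a hedged plain-Python sketch of splitting one DMA command into word-sized packets. Several things are assumptions for illustration only: the 4-byte word size, byte addressing on both sides, and the direction convention (`dma_mvin` taken to mean DRAM to SPM, `dma_mvout` the reverse). `DmaCmd`, `WordPkt`, and `packetize` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

WORD_BYTES = 4  # assumed packet payload size


@dataclass(frozen=True)
class DmaCmd:
    opcode: str     # "dma_mvin" or "dma_mvout"
    dram_addr: int  # byte address in DRAM (assumed byte-addressed)
    spm_addr: int   # byte address in the SPM (assumed byte-addressed)
    nbytes: int     # total transfer length


@dataclass(frozen=True)
class WordPkt:
    src_addr: int
    dst_addr: int
    nbytes: int     # the last packet may be shorter than WORD_BYTES


def packetize(cmd: DmaCmd) -> List[WordPkt]:
    """Split one command into packets of at most WORD_BYTES each."""
    if cmd.opcode == "dma_mvin":     # assumed direction: DRAM -> SPM
        src, dst = cmd.dram_addr, cmd.spm_addr
    elif cmd.opcode == "dma_mvout":  # assumed direction: SPM -> DRAM
        src, dst = cmd.spm_addr, cmd.dram_addr
    else:
        raise ValueError(f"unknown opcode: {cmd.opcode}")
    return [
        WordPkt(src + off, dst + off, min(WORD_BYTES, cmd.nbytes - off))
        for off in range(0, cmd.nbytes, WORD_BYTES)
    ]
```

A 10-byte `dma_mvin` would yield three packets under these assumptions (4, 4, and 2 bytes). In option 2 this loop would live in the controller, while the SPM port multiplexing stays in DataMemControllerRTL.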

I prefer the second method, but I think some logic should still be written in DataMemControllerRTL. WDYT?

@tancheng
Owner

tancheng commented Jun 5, 2026

Hi @HobbitQia, option 2 looks good to me, though I am not sure what additional logic should go in DataMemController?

Comment thread cgra/CgraDmaRTL.py
Comment on lines +177 to +214
s.dma_cmd_val //= s.dma.dma_cmd_val
s.dma_cmd_rdy //= s.dma.dma_cmd_rdy
s.dma_cmd_opcode //= s.dma.dma_cmd_opcode
s.dma_cmd_dram_addr //= s.dma.dma_cmd_dram_addr
s.dma_cmd_spm_addr //= s.dma.dma_cmd_spm_addr
s.dma_cmd_bytes //= s.dma.dma_cmd_bytes
s.dma_cmd_tag //= s.dma.dma_cmd_tag

s.dma_done_val //= s.dma.dma_done_val
s.dma_done_rdy //= s.dma.dma_done_rdy
s.dma_done_tag //= s.dma.dma_done_tag

s.dram_rd_req //= s.dma.dram_rd_req
s.dram_rd_resp //= s.dma.dram_rd_resp

s.dram_wr_req_val //= s.dma.dram_wr_req_val
s.dram_wr_req_rdy //= s.dma.dram_wr_req_rdy
s.dram_wr_req_addr //= s.dma.dram_wr_req_addr
s.dram_wr_req_data //= s.dma.dram_wr_req_data
s.dram_wr_req_mask //= s.dma.dram_wr_req_mask

s.dram_wr_resp_val //= s.dma.dram_wr_resp_val
s.dram_wr_resp_rdy //= s.dma.dram_wr_resp_rdy

# DMA to SPM connections.

s.dma.spm_dma_wval //= s.cgra.spm_dma_wval
s.dma.spm_dma_wrdy //= s.cgra.spm_dma_wrdy
s.dma.spm_dma_waddr //= s.cgra.spm_dma_waddr
s.dma.spm_dma_wdata //= s.cgra.spm_dma_wdata
s.dma.spm_dma_wmask //= s.cgra.spm_dma_wmask

s.dma.spm_dma_rval //= s.cgra.spm_dma_rval
s.dma.spm_dma_rrdy //= s.cgra.spm_dma_rrdy
s.dma.spm_dma_raddr //= s.cgra.spm_dma_raddr
s.dma.spm_dma_rresp_val //= s.cgra.spm_dma_rresp_val
s.dma.spm_dma_rresp_rdy //= s.cgra.spm_dma_rresp_rdy
s.dma.spm_dma_rresp_data //= s.cgra.spm_dma_rresp_data
Owner

All these will change if we go with @HobbitQia's option 2, right?

Moreover, can we use send/recv interfaces and define a msg (https://github.com/tancheng/VectorCGRA/blob/master/lib/messages.py) that encapsulates data, addr, or whatever else is needed as a struct? Then we don't need to declare so many ports and explicitly connect each of them. This CGRA RTL shouldn't see these details; the data struct can be decomposed inside the submodule.
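As a rough illustration of the idea, the `dma_cmd_*` ports in the diff could travel as one struct that is packed into bits at the boundary and decomposed inside the submodule. The sketch below is plain Python rather than a PyMTL3 bitstruct. The field names mirror the ports above, but the bit widths, the `nbytes` spelling, and the `DmaCmdMsg` class itself are assumptions, not definitions from lib/messages.py.

```python
from dataclasses import dataclass

# (field name, bit width), most significant field first. Widths are assumed.
FIELD_WIDTHS = (
    ("opcode", 2),
    ("dram_addr", 32),
    ("spm_addr", 16),
    ("nbytes", 16),
    ("tag", 4),
)


@dataclass(frozen=True)
class DmaCmdMsg:
    """Hypothetical message bundling what is currently five separate ports."""
    opcode: int
    dram_addr: int
    spm_addr: int
    nbytes: int
    tag: int

    def pack(self) -> int:
        """Concatenate all fields into one integer."""
        bits = 0
        for name, width in FIELD_WIDTHS:
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")
            bits = (bits << width) | value
        return bits

    @classmethod
    def unpack(cls, bits: int) -> "DmaCmdMsg":
        """Inverse of pack(): peel fields off from the least significant end."""
        fields = {}
        for name, width in reversed(FIELD_WIDTHS):
            fields[name] = bits & ((1 << width) - 1)
            bits >>= width
        return cls(**fields)
```

The top-level module would then connect a single send/recv pair carrying this message, and only the DMA submodule would need to know the field layout.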

3 participants