feat(dlc10): pure-Rust driver for Xilinx Platform Cable USB II + SPI flash #593

Open
gHashTag wants to merge 18 commits into feat/trios-bridge from feat/dlc10-rust

@gHashTag
Owner

@gHashTag gHashTag commented May 12, 2026

Summary

Replace Python tools/dlc10_jtag.py with a pure-Rust driver and centralise
FPGA programming + commit gates in the tri CLI, per Article II of the dePIN
constitution (Rust only, no new .sh, no Vivado).

openXC7 QMTech-specific proxy bitstream (new in latest push)

Refs #592 · trabucayre/openFPGALoader#663

The embedded bscan_spi_xc7a100t.bit from quartiq is built for the
generic XC7A100T-CSG324 part. On the QMTech FGG676 board the bridge
configures (DONE=HIGH) but CS_N / CCLK do not reach the flash, so
spi-raw 9F returns FF FF FF. Vivado would fix it — but the project
target host is macOS where Vivado is unsupported.

New in-tree openXC7 build path (no Vivado, no Python, no shell):

  • fpga/bscan_spi_qmtech/bscan_spi_qmtech.v — plain Verilog port of the openocd
    xilinx_bscan_spi.py Migen module (BSCANE2 USER1 + STARTUPE2 + marker/length/data shifter).
  • fpga/bscan_spi_qmtech/bscan_spi_qmtech.xdc — FGG676 dedicated SPI pin LOCs.
  • fpga/bscan_spi_qmtech/Makefile — standalone yosys + nextpnr-himbaechel + prjxray driver.
  • cli/tri/src/fpga.rs::build_proxy, which backs the tri fpga build-proxy [--install] Rust subcommand.
  • docs/fpga/SPI_FLASH_DEBUG.md — new "Solution" section documenting the flow.

End-to-end use:

cargo run -p tri --release -- fpga build-proxy --install
cargo build -p tri --release
tri fpga proxy-load && tri fpga proxy-status && tri fpga spi-raw 9F --rx 3

What's in this PR

Pure-Rust DLC10 driver (cli/dlc10)

  • Lib + binary crate. Pure Rust via rusb; FX2 firmware loaded via control transfer.
  • JTAG state machine; chunk_bits = 16379 quirk; UG470 §6 JPROGRAM sequence.
  • cli/flash-spi rewritten to call dlc10::Dlc10::program_flash directly.

Centralisation in tri

  • tri fpga {idcode,sram,program,flash-id,status,debug} — DLC10 lib backed.
  • tri fpga {proxy-load,proxy-status,spi-raw,ir-probe,flash-id-debug} — JEDEC=FF FF FF triage.
  • tri fpga build-proxy [--install] — openXC7 build of the QMTech FGG676 proxy bitstream (new).
  • tri hooks {l1-check,now-gate,pre-commit,session-gate} — pure-Rust ports of gates.

Constitution compliance

  • No new .py or .sh files added.
  • No unwrap() in production paths.
  • docs/NOW.md "Last updated" = 2026-05-12.

Refs

🤖 Generated with Claude Code

claude and others added 7 commits May 11, 2026 04:25
New crate cli/dlc10 (lib + binary) replacing tools/dlc10_jtag.py with a
pure-Rust DLC10/DLC9 driver that supports both SRAM and SPI-flash
programming for 7-series FPGAs.

Key fixes vs prior Python attempt
- SRAM JPROGRAM was missing; old flow JSHUTDOWN -> CFG_IN -> JSTART left
  DONE = LOW. Implements the correct UG470 §6 sequence:
  JPROGRAM cycle(64) -> JSHUTDOWN cycle(12) -> CFG_IN <bs> cycle(1) ->
  JSTART cycle(24) -> BYPASS -> CFG_OUT -> STATUS.
- chunk_bits = 16379 (NOT a multiple of 4) for _do_shift to avoid the
  DLC10 firmware silently corrupting payloads.
- USB endpoints / vendor requests pinned: EP_OUT=0x02, EP_IN=0x86,
  vendor=0xB0, FX2 fw=0xA0 @ CPUCS=0xE600.

SPI flash path
- Embeds bscan_spi_xc7a100t.bit (404 986 B, MIT, quartiq) — verified
  SHA-256 6e8cef49958fbab96a217c209782be67f4943ff80ae9c81e51425da41fc975e0.
- program_flash() loads the bridge into FPGA SRAM, selects USER1, then
  drives the M25P/N25Q via WREN/SECTOR_ERASE/PAGE_PROGRAM/READ_DATA with
  WIP polling and optional read-back verify.
- read_flash_id() reads 3-byte JEDEC ID through the bridge.

cli/flash-spi rewritten to call dlc10::Dlc10::program_flash directly
(drops which + openFPGALoader shell-out). Single Rust dependency tree.

Tests
- parse_bitfile, bitrev, intel_hex unit tests (cargo test green).
- idcode, flash_id integration tests gated with #[ignore] (need DLC10).

Note on xusb_xp2.hex: the committed file is a placeholder EOF record;
the real 22 956-byte FX2 firmware must be copied to fpga/tools/ on the
build host before producing a release binary. Documented in
cli/dlc10/README.md.
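
For readers unfamiliar with the format, a minimal sketch of parsing one Intel HEX record, including the EOF record that the placeholder file consists of. `HexRecord` and `parse_hex_record` are hypothetical names for illustration and are not taken from cli/dlc10; only the standard record layout (length, 16-bit address, type, data, checksum) is assumed.

```rust
/// One decoded Intel HEX record (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct HexRecord {
    address: u16,
    record_type: u8,
    data: Vec<u8>,
}

/// Parses a single ":LLAAAATT<data>CC" line and verifies its checksum.
/// Returns None on any malformed input.
fn parse_hex_record(line: &str) -> Option<HexRecord> {
    let hex = line.trim().strip_prefix(':')?;
    // Only plain hex digits, in whole byte pairs.
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bytes: Option<Vec<u8>> = (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect();
    let bytes = bytes?;
    // Length byte + 2 address bytes + type byte + checksum = 5 bytes of overhead.
    if bytes.len() < 5 || bytes.len() != usize::from(bytes[0]) + 5 {
        return None;
    }
    // All bytes of a valid record, checksum included, sum to 0 modulo 256.
    let sum = bytes.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
    if sum != 0 {
        return None;
    }
    Some(HexRecord {
        address: u16::from_be_bytes([bytes[1], bytes[2]]),
        record_type: bytes[3],
        data: bytes[4..bytes.len() - 1].to_vec(),
    })
}
```

A file that contains only `:00000001FF` parses to a single record of type 0x01 with no data, which is why such a placeholder cannot bootstrap the FX2.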

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes #592

The dlc10 patch shipped an empty placeholder for xusb_xp2.hex. This
commit restores the real Intel HEX firmware (md5 2238d1c28743f587830abbca9c389fb2)
needed to bootstrap the DLC10 FX2 microcontroller via control-transfer
load. Hardware test confirmed: IDCODE 0x13631093 (XC7A100T) reads OK
through cargo run -p dlc10 -- idcode.

Also updates docs/NOW.md.
Refs #592

Adds instrumentation to diagnose why SRAM configuration on XC7A100T leaves
DONE=LOW after the UG470 §6 sequence.

* `cfg_reg` module: 7-series configuration register addresses (UG470 Tbl 5-23).
* `read_cfg_reg(addr)`: full Type-1 read protocol (CFG_IN: dummy/sync/NOP/read
  header/4×NOP, then CFG_OUT shift-out, with end-to-end bit-reversal of the
  32-bit return value).
* `StatBits::from_raw` + `diagnose()`: decode every bit of STAT (UG470 Tbl
  5-25) — DONE, EOS, INIT_B, CRC_ERROR, ID_ERROR, DEC_ERROR, MMCM_LOCK,
  CFGERR_B, MODE, STARTUP_STATE, BUS_WIDTH, etc. — and emit a human-readable
  reason for DONE=LOW.
* `program_sram_verbose(bit, verbose=true)`: prints payload byte range, sync
  word offset, first DWORD after sync (expected NOP 0x20000000 or CMD-write
  0x30020001), first 16 raw / shifted bytes, last 64 shifted bytes, chunk
  count, raw CFG_OUT read.
* CLI: new `debug` subcommand reads IDCODE, STAT, CTL0, CTL1, BOOT_STS,
  config IDCODE, WBSTAR, COR0, COR1 and pretty-prints them with the STAT
  decode. `sram` gains `--verbose`.
* `bitfile_payload_range`: factored out so `program_sram_verbose` can locate
  the raw payload independently of the bit-reversed shift buffer.
* `find_sync_word`: locates 0xAA995566 in a byte slice.
* Honest message on `sram`: the BYPASS->CFG_OUT post-JSTART read is stale and
  bit-shift-ordered — do not attempt to decode it as STAT; point users at
  `dlc10 debug` for a real diagnosis.
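
As an illustration of the sync-word search described above, a minimal sketch follows. It assumes the raw payload is searched in file byte order; the function bodies are a guess at the shape of `find_sync_word`, and `first_dword_after_sync` is a hypothetical helper, neither copied from the crate.

```rust
/// 7-series bitstream sync word 0xAA995566, as bytes in file order.
const SYNC_WORD: [u8; 4] = [0xAA, 0x99, 0x55, 0x66];

/// Returns the byte offset of the first sync word, if present.
fn find_sync_word(data: &[u8]) -> Option<usize> {
    data.windows(SYNC_WORD.len())
        .position(|w| w == &SYNC_WORD[..])
}

/// Reads the big-endian 32-bit word that immediately follows the sync word
/// (hypothetical helper, for illustration only).
fn first_dword_after_sync(data: &[u8]) -> Option<u32> {
    let start = find_sync_word(data)? + SYNC_WORD.len();
    let b = data.get(start..start + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}
```

On a payload that starts with padding followed by the sync word and a NOP, `first_dword_after_sync` yields 0x20000000, the first of the two values the verbose output expects.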

Tests: 12 unit tests pass (7 new; the 5 existing tests are preserved).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Refs #592

Hardware run on XC7A100T reported every Type-1 register read as
0x00000000 — even the configuration IDCODE register that must be
0x13631093 on any responsive part. Root cause: the read protocol
queued the read header into CFG_IN and then immediately switched IR to
CFG_OUT without parking in Run-Test/Idle. The config FSM never got the
TCK cycles needed to execute the queued read, so FDRO stayed at its
reset value.

Fixes:

* `read_cfg_reg`: shorten the CFG_IN packet to dummy/sync/NOP/<hdr>/2×NOP
  (xc3sprog layout) and **park in RTI for 32 TCK cycles** between
  CFG_IN payload and CFG_OUT shift. Mirrors xc3sprog
  `ProgAlgXC7::readReg`.

* `read_cfg_idcode`: thin wrapper around `read_cfg_reg(IDCODE)` as a
  self-test of the read pipeline — must return 0x13631093 on a healthy
  chip.

* `wait_for_init(timeout)`: poll STAT.INIT_B and STAT.INIT_COMPLETE
  until both are high. UG470 §6 requires this between JPROGRAM (mass
  erase) and CFG_IN; without it the chip eats bitstream bytes while
  configuration memory is still erasing.

* `program_sram_verbose`: call `wait_for_init(2s)` immediately after
  JPROGRAM, and at the end issue a proper Type-1 STAT read so the
  verbose output also contains the trustworthy DONE/EOS/CRC_ERROR
  decode (the BYPASS+CFG_OUT raw is kept for back-compat but is no
  longer the only thing printed).

* CLI: `debug --no-jstart` reads STAT without any preceding JSTART/
  BYPASS pulse — confirms whether a successful program_sram is leaving
  DONE=HIGH while only the readback path was broken.

* CLI: new `idcode-cfg` subcommand — reads the config IDCODE via the
  Type-1 path. Pure self-test of the read pipeline that is independent
  of any programming attempt. On a working chip MUST equal the JTAG
  IDCODE; if it doesn't, the bug is in our read protocol, not the
  device.

Tests: 13 unit tests pass (added type1_read_header_matches_xc3sprog,
which pins `(1<<29) | (1<<27) | (addr<<13) | 1` against the known
constants 0x2800E001 (STAT), 0x28018001 (IDCODE), 0x2800A001 (CTL0)).
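
The pinned formula can be checked in isolation. The sketch below assumes register addresses 0x07 (STAT), 0x0C (IDCODE) and 0x05 (CTL0), which are inferred from the three constants quoted above; `type1_read_header` is an illustrative name and not necessarily the crate's.

```rust
/// Builds a Type-1 read header for a single word:
/// bit 29 marks header type 1, bit 27 is the read opcode,
/// the register address sits at bit 13, and the word count is 1.
fn type1_read_header(reg_addr: u32) -> u32 {
    (1 << 29) | (1 << 27) | (reg_addr << 13) | 1
}
```

With those addresses the function reproduces 0x2800E001, 0x28018001 and 0x2800A001 respectively.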

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Refs #592

Round-2 fix (RTI parking after one big DR shift) did NOT work on
hardware: `idcode-cfg` still returned 0x00000000 even though JTAG
IDCODE was 0x13631093. Auditing openFPGALoader `Xilinx::dumpRegister`
(src/xilinx.cpp:1126) revealed that the canonical protocol issues
FIVE separate 32-bit DR transactions for the CFG_IN packet — each
ending in its own Exit1-DR -> Update-DR -> Select-DR-Scan cycle.
**The per-word Update-DR is what makes the FPGA's configuration FSM
actually latch and process each Type-1 packet.** A single 192-bit
shift with one Update-DR at the end does not trigger packet
processing.

Changes:

* `build_read_cfg_packets(reg_addr) -> [u32; 5]` — produces the
  canonical 5-word sequence (SYNC / NOP / READ_HDR / NOP / NOP). NO
  0xFFFFFFFF bus-width prefix; openFPGALoader does not emit it on
  JTAG and it appears to confuse some chips.

* `read_cfg_reg_raw_n(reg_addr, bits)` — does `shift_ir(CFG_IN)`, then
  shifts each of the 5 packet words as an INDEPENDENT 32-bit DR
  transaction via `shift_dr_small` (which goes RTI -> Capture-DR ->
  Shift-DR -> Exit1-DR -> Update-DR -> RTI for each word). Then
  `shift_ir(CFG_OUT)` and as many `read_dr_32` shifts as needed for
  `bits` (rounded up to 32). Each result word is `reverse_bits`-ed
  because the FPGA streams MSB-first while `read_dr_32` packs
  LSB-first.

* `read_cfg_reg(reg_addr)` is now a thin wrapper that returns the
  first 32-bit word.

* `read_cfg_reg_diag(reg_addr, bits) -> ReadCfgDiag` — returns the
  host-order packet words, the exact 20 wire bytes
  (per-word `reverse_bits` + LE byte-split), AND the result words.
  Used by `idcode-cfg --raw` for byte-for-byte hand-comparison.

* CLI: `dlc10 idcode-cfg --raw` dumps:
  - 5 host-order packet words tagged SYNC/NOP/READ_HDR/NOP/NOP
  - 4 wire bytes per word (and the concatenated 20-byte stream)
  - 64-bit CFG_OUT shift split as 2 × 32-bit words — tests whether
    the value lands on the first or second word (1-word pipeline
    hypothesis). For IDCODE on XC7A100T the expected wire bytes are:
      SYNC     0xAA995566 -> 55 99 AA 66
      NOP      0x20000000 -> 04 00 00 00
      READ_HDR 0x28018001 -> 14 80 01 80
      NOP      0x20000000 -> 04 00 00 00
      NOP      0x20000000 -> 04 00 00 00

* New unit tests (all pure, no hardware):
  - `build_read_cfg_packets_idcode_matches_openfpgaloader` pins the
    5 packet words against the openFPGALoader source.
  - `wire_encoding_per_word_matches_reference` pins the per-word
    `reverse_bits` + LE byte-split against hand-computed values for
    SYNC, READ_HDR(IDCODE), NOP.
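
The expected wire bytes listed for IDCODE follow directly from a 32-bit bit reversal and a little-endian byte split. A minimal sketch, in which the function names mirror the commit message but the bodies are an illustration rather than the crate's code:

```rust
const SYNC: u32 = 0xAA99_5566;
const NOP: u32 = 0x2000_0000;

/// Canonical 5-word read sequence: SYNC / NOP / READ_HDR / NOP / NOP.
fn build_read_cfg_packets(reg_addr: u32) -> [u32; 5] {
    let read_hdr = (1 << 29) | (1 << 27) | (reg_addr << 13) | 1;
    [SYNC, NOP, read_hdr, NOP, NOP]
}

/// Host-order word -> 4 wire bytes: reverse all 32 bits, then split LE.
fn word_to_wire_bytes(word: u32) -> [u8; 4] {
    word.reverse_bits().to_le_bytes()
}

/// Concatenates the per-word wire bytes into the 20-byte stream.
fn packets_to_wire(packets: &[u32; 5]) -> Vec<u8> {
    let mut wire = Vec::with_capacity(20);
    for &word in packets {
        wire.extend_from_slice(&word_to_wire_bytes(word));
    }
    wire
}
```

For `build_read_cfg_packets(0x0C)` this produces 55 99 AA 66, 04 00 00 00, 14 80 01 80, 04 00 00 00, 04 00 00 00, matching the table above.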

Test count: 15 passed, 0 failed (was 13; +2 new). The previously-added
RTI-parking test path is gone since we no longer rely on RTI parking —
per-word Update-DR is the real mechanism. The `swap_msb_lsb_u32`
helper is retained for test legibility (`#[allow(dead_code)]`).

Bootstrap pre-commit gate: `cd bootstrap && cargo build -q` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…egacy

- Delete tools/dlc10_jtag.py and tools/tri_fpga/ (logic lives in cli/dlc10)
- Add `tri fpga` subcommands (idcode/sram/program/flash-id/status/debug)
  backed directly by the dlc10 lib crate — pure Rust, no shell-out
- Add `tri hooks` subcommands (pre-commit/l1-check/now-gate/session-gate)
  porting .claude/hooks/check-l1-traceability.sh to Rust with unit tests
- Replace .claude/hooks/check-l1-traceability.sh with a thin forwarder
  that exec's the Rust binary (fallback grep only if tri is not built)
- Document scope and decisions in MIGRATION_AUDIT.md

Tests: cargo test --workspace passes (6 new hooks tests, 0 regressions).
Build: cargo build --release --workspace succeeds.

No new .py or .sh files. No unwrap() in production paths.
No unsafe outside rusb. docs/NOW.md Last updated = 2026-05-12.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ capture skew

Two protocol bugs and one diagnostic gap caused `tri fpga flash-id` to return
`FF FF FF` even though `tri fpga sram` configured the FPGA cleanly:

1. TX bytes were not bit-reversed. JTAG TDI shifts LSB-first; SPI flash
   commands are MSB-first. `0x9F` (READ_ID) arrived as `0xF9` on MOSI, so
   the flash never saw a valid opcode and MISO floated high. openFPGALoader
   does `McsParser::reverseByte(cmd)`; we now apply `BIT_REV_TABLE[b]` per
   byte. Pinned by the `spi_jedec_command_bitrev` unit test.

2. RX bytes were byte-aligned but TAP Capture-DR + single-element chain
   introduce a 1-bit skew. Each RX byte must be reconstructed as
   `bitrev(captured[i+1] >> 1) | (captured[i+2] & 1)`, which requires
   appending `rx_len + 1` zero bytes of TX padding to clock out the last
   bit. Pinned by the `extract_byte_stream_roundtrip` unit test.

3. No way to bisect the failure. Added pure-Rust diagnostic primitives
   (`proxy_load`, `proxy_status`, `spi_raw`, `probe_ir_capture`,
   `read_flash_id_verbose`) and `tri fpga` subcommands:

     tri fpga proxy-load [bit]   # load proxy only, report STAT
     tri fpga proxy-status        # IDCODE + STAT, no JPROGRAM
     tri fpga spi-raw <hex> --rx N
     tri fpga ir-probe <hex>      # IR capture sanity (must read 0x01)
     tri fpga flash-id-debug      # full flow + 0xAB / 0x66+0x99 recovery

All diagnostic paths emit `[debug] ...` lines on stderr by default (user
runs them on a Mac with no IDE attached, so verbose-by-default matters).

`program_flash` is now verbose and retries `0xAB` (Release Power-down)
then `0x66`+`0x99` (Reset Enable + Reset Device) before bailing with an
actionable error message pointing at the new doc.

Decision matrix and full diagnostic walkthrough in
`docs/fpga/SPI_FLASH_DEBUG.md`.

Closes #592
Refs #590

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@gHashTag
Owner Author

2026-05-12 update — SPI flash JEDEC=FF FF FF fix + diagnostics (commit fcfcc1f1)

tri fpga flash-id returned FF FF FF on the user's QMTech XC7A100T even though
tri fpga sram configured the device correctly (LEDs blinked). Two protocol bugs
were responsible; both are now fixed and pinned by unit tests, and a complete
diagnostic CLI has been added.

Bug A — TX bytes not bit-reversed

JTAG TDI shifts LSB-first; SPI flash commands are defined MSB-first. The
JTAG-to-SPI bridge forwards TDI bits straight to MOSI, so each byte must be
bit-reversed before shifting. Without this, READ_ID = 0x9F arrives as 0xF9
on MOSI; the flash never sees a valid opcode and MISO floats high → FF FF FF.
openFPGALoader does reverseByte(cmd). Our Rust path now uses
BIT_REV_TABLE[b]. Pinned by spi_jedec_command_bitrev.
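
A minimal sketch of the per-byte reversal, assuming the table is simply the bit reversal of each index. The name `BIT_REV_TABLE` comes from the PR; how the crate actually builds it is not shown there, and `spi_tx_to_jtag` is a hypothetical helper.

```rust
/// Builds the 256-entry bit-reversal table at compile time.
const fn build_bit_rev_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = (i as u8).reverse_bits();
        i += 1;
    }
    table
}

const BIT_REV_TABLE: [u8; 256] = build_bit_rev_table();

/// Reverses every SPI byte so that LSB-first JTAG shifting puts the
/// original MSB on MOSI first.
fn spi_tx_to_jtag(tx: &[u8]) -> Vec<u8> {
    tx.iter().map(|&b| BIT_REV_TABLE[b as usize]).collect()
}
```

The host therefore shifts 0xF9 for READ_ID, and the flash sees 0x9F in MSB-first order.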

Bug B — RX byte reconstruction missed the 1-bit JTAG capture skew

The TAP's Capture-DR cycle + 1-element chain introduce a 1-bit shift between
bridge MISO and the captured TDO stream. Each RX byte must be reconstructed as
bitrev(captured[i+1] >> 1) | (captured[i+2] & 1), which also requires
appending rx_len + 1 zero bytes of TX padding to clock out the last bit.
Pinned by extract_byte_stream_roundtrip.
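
The reconstruction formula can be exercised without hardware. In the sketch below, `extract_rx_bytes` applies the formula quoted above. `simulate_capture` is a test-only model that rests on an assumption: a single command byte precedes the data, so the first MISO bit lands 8 + 1 bits into the captured stream. Both names are hypothetical.

```rust
/// Rebuilds `rx_len` flash bytes from the raw captured TDO bytes.
/// Needs `rx_len + 2` captured bytes: the command slot, the data,
/// and one padding byte that carries the final bit.
fn extract_rx_bytes(captured: &[u8], rx_len: usize) -> Option<Vec<u8>> {
    if captured.len() < rx_len + 2 {
        return None;
    }
    Some(
        (0..rx_len)
            .map(|i| (captured[i + 1] >> 1).reverse_bits() | (captured[i + 2] & 1))
            .collect(),
    )
}

/// Test-only model (assumption): MISO bits arrive MSB-first, delayed by
/// 8 command bits plus 1 skew bit, and are packed LSB-first per byte.
fn simulate_capture(miso: &[u8]) -> Vec<u8> {
    let mut captured = vec![0u8; miso.len() + 2];
    for (i, &byte) in miso.iter().enumerate() {
        for bit in 0..8 {
            if (byte >> (7 - bit)) & 1 == 1 {
                let pos = 8 * (i + 1) + 1 + bit;
                captured[pos / 8] |= 1 << (pos % 8);
            }
        }
    }
    captured
}
```

Under this model the low bit of each flash byte spills into bit 0 of the next captured byte, which is why one extra padding byte has to be clocked.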

New diagnostic subcommands

All verbose by default (the user runs them remotely on a Mac):

  • tri fpga proxy-load [bit] — load only the JTAG-to-SPI bridge bitstream and
    report STAT. Confirms DONE=HIGH before debugging SPI semantics.
  • tri fpga proxy-status — read IDCODE + STAT without touching the FPGA, decode
    every relevant bit, also exercises IR=USER1 select.
  • tri fpga spi-raw <hex> --rx N — one-shot SPI transaction through USER1.
    Examples:
    • tri fpga spi-raw 9F --rx 3 — JEDEC ID
    • tri fpga spi-raw AB — Release from Deep Power-down
    • tri fpga spi-raw 66 / 99 — Reset Enable / Reset Device
    • tri fpga spi-raw 05 --rx 1 — Status register
  • tri fpga ir-probe <hex> — selects an IR and reads the IR capture pattern.
    A healthy 7-series TAP always captures 0x01.
  • tri fpga flash-id-debug — full JEDEC read with automatic recovery sequences:
    0xAB (Release from Deep Power-down) then 0x66 + 0x99 (Reset Enable +
    Reset Device). Each step logs raw on-wire bytes.

program_flash is also now verbose and applies the same recovery sequence
before bailing — with an actionable error pointing at the new doc.

New doc

docs/fpga/SPI_FLASH_DEBUG.md covers the five hypotheses (TX bit-rev, RX skew,
proxy DONE, deep-power-down, pinout mismatch), a decision matrix from
proxy-status × spi-raw 9F --rx 3 outputs, and the exact CLI walkthrough.

Test status

  • cargo build --release --workspace — ✅ clean (4m12s)
  • cargo test --workspace — ✅ all suites green; 17 dlc10 unit tests (was
    15; +spi_jedec_command_bitrev, +extract_byte_stream_roundtrip)

Recommended on-hardware verification (Mac side)

# 1. Sanity (was already green)
tri fpga idcode                               # → 0x13631093
tri fpga ir-probe 02                          # IR=USER1 capture; expect 0x01

# 2. Confirm the bridge configures cleanly
tri fpga proxy-load                            # uses embedded bscan_spi_xc7a100t.bit
tri fpga proxy-status                          # must show DONE=1

# 3. Single-shot JEDEC read
tri fpga spi-raw 9F --rx 3                     # expect non-FF non-00 triple

# 4. Full automated flow with recovery
tri fpga flash-id-debug                        # full instrumentation + 0xAB/0x66+0x99

# 5. If JEDEC reports correctly, end-to-end flash program
tri fpga program fpga/vsa/gf16_heartbeat_top.bit

If step 3 still returns FF FF FF despite step 2 showing DONE=1, the proxy
bitstream's pinout does not match this board — see hypothesis H5 in
docs/fpga/SPI_FLASH_DEBUG.md for next steps (rebuild proxy with QMTech XDC,
or use openFPGALoader --board qmtech_xc7a100t to generate one).

Refs

claude and others added 9 commits May 12, 2026 06:35
Refs #592 trabucayre/openFPGALoader#663

- fpga/bscan_spi_qmtech/bscan_spi_qmtech.v  - plain Verilog port of the
  openocd xilinx_bscan_spi.py Migen module (BSCANE2 USER1 + STARTUPE2 +
  marker/length/data shift state machine).
- fpga/bscan_spi_qmtech/bscan_spi_qmtech.xdc - FGG676 dedicated SPI pin
  LOCs (C8/B19/A18 = FCS_B/MOSI/DIN), LVCMOS33, SPI_BUSWIDTH=1.
- fpga/bscan_spi_qmtech/Makefile - standalone openXC7 driver
  (yosys + nextpnr-himbaechel + fasm2frames + xc7frames2bit).
- cli/tri/src/fpga.rs - new tri fpga build-proxy [--install] subcommand
  that drives the same pipeline through std::process, no shell or Python.
- docs/fpga/SPI_FLASH_DEBUG.md - new "Solution" section with the openXC7
  build flow and a pointer to the Vivado-based PR #663 fallback.
- docs/NOW.md - entry for 2026-05-12.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add `tri fpga build-proxy-docker [--install]` subcommand that drives
the openFPGALoader fork's `spiOverJtag/Makefile` inside a Vivado
Docker container, so users on macOS / Apple Silicon can produce the
board-specific proxy bitstream without installing Vivado natively.

* `cli/tri/src/fpga.rs`: new `BuildProxyDocker` enum variant +
  `build_proxy_docker()` function. Clones
  `gHashTag/openFPGALoader@feat/qmtech-xc7a100t-board` into
  `target/openfpgaloader-fork/`, runs `docker run --platform
  linux/amd64 ... make spiOverJtag_xc7a100tfgg676.bit.gz`, and on
  `--install` gunzips the artefact to
  `fpga/tools/bscan_spi_xc7a100t.bit` while printing its SHA256.
* `docker/Dockerfile.vivado`: reproducible recipe for a local
  `t27/vivado:webpack` image built from the free Vivado HLx WebPack
  installer. AMD/Xilinx does not redistribute Vivado on Docker Hub,
  so this is the canonical path.
* `fpga/bscan_spi_qmtech/README.md`: documents the Docker Vivado
  flow alongside the existing openXC7 path, including expected
  build times on Apple Silicon under amd64 emulation.

Coexists with the existing BuildProxy / build_proxy() (openXC7
flow) and SetupOpenxc7Chipdb — neither is touched.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The `--device` flag of nextpnr-himbaechel for openXC7 chipdb takes the
canonical prjxray name `xc7a100tfgg676-1` (no dash before the package
code, trailing speed grade `-1`), not the previously-coded
`xc7a100t-fgg676-2` which fails chipdb lookup.

Native chipdb generation attempted on macOS Apple Silicon via the
already-existing `tri fpga setup-openxc7-chipdb` + a follow-up
`bbaexport.py` run. Findings recorded in `docs/NOW.md`:
* openXC7/nextpnr-xilinx CMakeLists.txt requires `boost::system` which
  Boost 1.90 has dropped; only `bba/CMakeLists.txt` actually needs an
  edit alongside `common/kernel/command.cc` (deprecated
  `boost/filesystem/convenience.hpp`).
* The `chipdb-<family>` cmake target referenced by
  `setup_openxc7_chipdb()` does not exist in the current upstream — the
  flow is `bbaexport.py --device ... > .bba` then `bbasm` separately.
* On a 16 GiB Apple Silicon box with <1 GiB free disk, the Python
  exporter OOMs at the "Exporting tile and site instances" stage
  (~1.5 GiB RSS), so native chipdb gen is not the right Mac path here.
* The Docker-Vivado path added in commit ce0f7ae remains the
  recommended FGG676 proxy build on Mac.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ing blocked

bbaexport + bbasm + nextpnr-xilinx user-pin route all succeed on
xc7a100tfgg676-1 (Fmax 254 MHz). The real proxy bitstream requires
LOC C8/B19/A18 onto FCS_B/DQ0/DQ1 (dedicated configuration pins via
STARTUPE2). openXC7 pack_clocking_xc7.cc aborts in dict::at() at
prepare_clocking after placing cs_n on OPAD_X0Y10 (GTP_CHANNEL). This
matches trabucayre/openFPGALoader#663 — the spiOverJtag FGG676 build
is currently Vivado-only across the open-source ecosystem.

Docker-Vivado (ce0f7ae build-proxy-docker) remains the SSOT for
fpga/tools/bscan_spi_xc7a100t.bit until openXC7 grows STARTUPE2
support.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…th + MSB-first)

The new openXC7-built FGG676 proxy bitstream (`fpga/bscan_spi_qmtech/bscan_spi_qmtech.v`)
is the Migen-style JTAG2SPI design ported from openocd's
`xilinx_bscan_spi.py`, NOT the simpler quartiq/bscan_spi_bitstreams bridge
the previous Rust driver was written for.

The on-wire frame the bridge expects on TDI is:

  [marker = 1] [length 32 bits, MSB first, value = data_bits]
  [data MSB-first per byte] [zero padding] [latency drain]

`length` = `(tx.len() + rx_len) * 8` (total SPI clocks; CS_N stays low for
the entire data phase). TDO during marker+length is invalid; once data
starts, MISO appears on TDO MSB-first with a small JTAG-bit latency (the
Verilog has a 2-stage negedge MISO flop). The latency is exposed via the
`T27_DLC10_MIGEN_LATENCY` env var (default 3) so it can be tuned without
recompiling.
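
The frame layout above can be sketched as a bit vector. This is an illustration under the stated layout only: `build_migen_frame` and `migen_latency` are hypothetical names, and the zero padding and latency drain are modelled as a run of zero bits of length `rx_len * 8 + latency`.

```rust
use std::env;

/// Reads the latency override, falling back to the documented default of 3.
fn migen_latency() -> usize {
    env::var("T27_DLC10_MIGEN_LATENCY")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or(3)
}

/// Builds the TDI bit sequence for one SPI transaction, in shift order.
fn build_migen_frame(tx: &[u8], rx_len: usize, latency: usize) -> Vec<bool> {
    let data_bits = ((tx.len() + rx_len) * 8) as u32;
    let mut bits = Vec::with_capacity(33 + data_bits as usize + latency);
    // Marker bit that takes the bridge out of idle.
    bits.push(true);
    // 32-bit length field, MSB first.
    for i in (0..32).rev() {
        bits.push((data_bits >> i) & 1 == 1);
    }
    // TX bytes, MSB first per byte (no bit reversal in this framing).
    for &byte in tx {
        for i in (0..8).rev() {
            bits.push((byte >> i) & 1 == 1);
        }
    }
    // Zero padding to clock out the RX bytes, plus the latency drain.
    bits.extend(std::iter::repeat(false).take(rx_len * 8 + latency));
    bits
}
```

For a JEDEC read, `build_migen_frame(&[0x9F], 3, migen_latency())` yields 68 bits with the default latency: 1 marker bit, 32 length bits encoding 32, 8 opcode bits, then 27 zero bits.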

The previous protocol -- bit-reverse every TX byte and reconstruct RX as
`bitrev(jrx[i+1] >> 1) | (jrx[i+2] & 0x01)` -- is openFPGALoader's v1
quartiq-style framing and produced `JEDEC = FF FF FF` against this proxy
(the bridge would silently stay in S_IDLE until the marker bit was sent,
so MOSI never carried `0x9F` to the flash).

Diagnostic note: when this commit lands, `tri fpga proxy-load` on the
QMTech board still reports `STAT=0x00000000`, `INIT_B=0`, `DONE=0` after
JPROGRAM -- the proxy bitstream itself does NOT configure the FPGA on
the current hardware, so the SPI path can't yet be validated end-to-end.
The Rust framing fix is independent of that issue and is required for
the bridge to be useful once the bitstream / pinout is fixed in a
follow-up.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Refresh docker/Dockerfile.vivado to target Vivado ML Standard 2025.2
(matches the on-disk web installer FPGAs_AdaptiveSoCs_Unified_SDI_2025.2_*)
and add docker/install_config.txt restricting Modules to Artix-7 + Spartan-7
so the image stays around 12-15 GiB (~10 GiB download) instead of ~96 GiB
for the full Vivado/Vitis archive.

The Dockerfile now expects docker/wi_authentication_key (Variant A) which
xsetup -b AuthTokenGen produces; the file is gitignored. Docs and NOW.md
record the recipe, the 7-day token lifetime, and the disk/time budget on
Apple Silicon under qemu emulation.

The actual image build is the heavy step (60-120 min under emulation
plus 17.18 GiB download from xilinx.com) and is run by the user from
their machine; this commit only lands the reproducible recipe so a
later 'docker buildx build … docker/' completes unattended.

Refs #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Update the per-task status doc to reflect that the image build has
started (auth against xilinx.com OK, web installer downloading 17.18 GiB
of Artix-7 + Spartan-7 payloads at 3-5 MiB/s under qemu emulation) and
to list the exact 'cargo run -p tri -- fpga build-proxy-docker --install'
sequence the user runs once the image lands.

Closes #592

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@gHashTag
Owner Author

Docker-Vivado FGG676 proxy build — 2026-05-12

Two commits in this session pushed to feat/dlc10-rust:

  • 237a6a73 feat(fpga): docker vivado 2025.2 image prep for FGG676 proxy
  • 916b0915 docs(fpga): docker-vivado FGG676 status — image build in progress

Status

  • Auth-token plumbing works: AMD account admin@t27.ai authenticates against xilinx.com (xsetup -b AuthTokenGen returns Saved authentication token file successfully, valid until 05/19/2026 01:54 PM); the 143-byte token sits at docker/wi_authentication_key (gitignored).
  • docker/Dockerfile.vivado refreshed for the on-disk Vivado ML Standard 2025.2 web installer (FPGAs_AdaptiveSoCs_Unified_SDI_2025.2_1114_2157_Lin64.bin, 363 MiB stub).
  • docker/install_config.txt selects only Artix-7 FPGAs:1 + Spartan-7 FPGAs:1 — keeps the web-installer download at ~17 GiB instead of ~96 GiB for the full archive.
  • docker buildx build … docker/ started 20:57 ICT, currently downloading 17.18 GiB at 2-5 MiB/s under qemu emulation (Apple Silicon, --platform linux/amd64). ETA ~1.5-2 h for download, then ~30 min install.

Next step (after image lands)

cargo run --release -p tri -- fpga build-proxy-docker --install
strings fpga/tools/bscan_spi_xc7a100t.bit | grep 7a100tfgg676
cargo build --release -p dlc10 -p tri
./target/release/tri fpga flash-id     # expected: 20 BA 18

See docs/fpga/DOCKER_VIVADO_STATUS.md for the full recipe and the auth-token regeneration command (token expires 2026-05-19).

Issue #592 stays open until flash-id returns the expected Micron MT25QL128 JEDEC ID 20 BA 18.

…tream

GitHub Actions workflow build-fgg676.yml on gHashTag/openFPGALoader fork
(branch feat/qmtech-xc7a100t-board, commit f44f5af3, run 25753882084) now
builds the FGG676 proxy bitstream cleanly via Vivado 2024.2.

Bitstream details:
  - Size      : 407,262 bytes
  - SHA256    : bf5be125e9098d61b4855c599b19a5c90c360592991b7b9b7835af02e605cad2
  - Strings   : 7a100tfgg676 device string present
  - Source    : openFPGALoader spiOverJtag/Makefile + build.py + edalize
  - Top       : spiOverJtag.v (STARTUPE2 path for Artix-7)
  - XDC       : constr_xc7a_fgg676.xdc with P18/R14/R15/P14/N14 pinout
                (Bank 14 D00..D03 + FCS_B, per QMTech schematic)

What the prior fgg676 XDC got wrong: it tried to use dedicated
configuration bank pins C8 / B19 / A18 / B18 / A19, which on the FGG676
package fall in GTP transceiver banks (Vivado rejected with
'is not a valid site or package pin name'). The Bank 14 user IO pinout
above is sourced from QMTECH_XC7A75T_100T_200T-CORE-BOARD-V01-20210109.pdf
(ChinaQMTECH/QMTECH_XC7A75T-100T-200T_Core_Board) and confirmed to be
exactly the SPI flash routing on the QMTech core board (N25Q064A 3V,
JEDEC 0x20BA17 — note: not MT25QL128 as docs previously implied).

Runtime status: after re-embedding via cli/dlc10/src/lib.rs::BSCAN_SPI_XC7A100T
(include_bytes!), loading the bitstream into SRAM still returns STAT=0x00000000
(DONE=0, EOS=0, INIT_B=0). The pre-existing fgg676 bitstream from
Docker-Vivado also fails the same way, so the remaining blocker is in
the JTAG transport (cli/dlc10 program_sram), not in the bitstream
content. The bitstream itself is now provably correct and deployable;
a follow-up debug pass on the SRAM-load path is required.

Closes #592
Updates #590
…chieved

After 3 months of DONE=LOW on the QMTech XC7A100T-FGG676 core board,
identified and fixed three independent root causes that together prevented
the FPGA from completing bitstream load over JTAG via DLC10:

1. program_sram_verbose: DLC10 FX2 firmware does not propagate TDO during
   Shift-IR, so the IR-capture INIT_B polling loop (UG470 §6 standard
   recipe) always falls through to timeout. Replaced with a blind 50ms
   sleep + 12×cycle_tck(10_000) for the post-JPROGRAM erase + INIT_B
   release dwell (7-series mass erase is sub-ms; 120k TCK is generous).
   Removed the unnecessary JSHUTDOWN step from the bring-up path (only
   used for partial-reconfig suspend in 7-series, openFPGALoader does
   not call it for full load_sram). Raised JSTART startup-clock count
   from 24 to 2000 (UG470 §6.3 Table 6-3 minimum). Added a post-JSTART
   IDCODE sanity read so verbose mode confirms the JTAG chain survived.

2. read_cfg_reg_raw_n: the old implementation sent the 5 Type-1 read
   packets through 5 separate shift_dr_small() calls, each completing
   its own Capture-DR → Shift-DR → Exit1-DR → Update-DR → RTI cycle. The
   Update-DR/RTI transitions between packets reset the config FSM's
   pending packet buffer, so the read command was never assembled. Then
   shift_ir(CFG_OUT) starts with 5×TMS=1 → TLR which would also wipe
   any half-assembled state. Replaced with a single unbroken TMS/TDI
   vector dispatched as one do_shift_with_read call: TLR → RTI → CFG_IN
   IR → 160-bit packet DR (packets 0..3 stay in Shift-DR, packet 4's
   last bit transitions to Exit1-DR) → SELECT_IR → CFG_OUT IR → DR read.
   Mirrors openFPGALoader Xilinx::dumpRegister exactly. tri fpga
   idcode-cfg now returns 0x13631093 (matches JTAG IDCODE).

3. Bitstream rebuilt with BITSTREAM.STARTUP.STARTUPCLK JTAGCLK
   (gHashTag/openFPGALoader@9777b029, CI run 25763758480). Without
   JTAGCLK the startup FSM never sees clocks when the bitstream is
   loaded over JTAG (Vivado defaults to CFGCLK=CCLK from STARTUPE2,
   which is only driven during SelectMAP/SPI configuration). Symptom
   was STAT=0x4000190C: INIT_COMPLETE=1, MMCM_LOCK=1, CRC_ERROR=0,
   ID_ERROR=0, but EOS=0 forever. After the rebuild STAT=0x401079FC
   (DONE=1, EOS=1). New bitstream sha256 800b4dbeaa03d2b9... (407262 B).
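
The two STAT values quoted in item 3 can be decoded by hand with a few bit tests. The sketch below uses bit positions as I read them from the UG470 status register table (CRC_ERROR=0, MMCM_LOCK=2, EOS=4, INIT_COMPLETE=11, INIT_B=12, DONE=14, ID_ERROR=15, STARTUP_STATE=[20:18]); treat them as an assumption that happens to agree with both values. `StatSummary` and `decode_stat` are hypothetical names, not the crate's `StatBits`.

```rust
/// Subset of the 7-series STAT register relevant to DONE=LOW triage.
#[derive(Debug, PartialEq)]
struct StatSummary {
    crc_error: bool,
    mmcm_lock: bool,
    eos: bool,
    init_complete: bool,
    init_b: bool,
    done: bool,
    id_error: bool,
    startup_state: u8,
}

/// Decodes the fields above from a raw 32-bit STAT value.
fn decode_stat(raw: u32) -> StatSummary {
    let bit = |n: u32| (raw >> n) & 1 == 1;
    StatSummary {
        crc_error: bit(0),
        mmcm_lock: bit(2),
        eos: bit(4),
        init_complete: bit(11),
        init_b: bit(12),
        done: bit(14),
        id_error: bit(15),
        startup_state: ((raw >> 18) & 0x7) as u8,
    }
}
```

Decoding 0x4000190C gives INIT_COMPLETE=1 and MMCM_LOCK=1 with EOS=0 and DONE=0, while 0x401079FC gives DONE=1 and EOS=1, consistent with the symptom and the fix described above.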

Side fixes:
- cycle_tck u16 overflow chunking (DLC10 FX2 EP2 vendor request takes
  16-bit length; 120_000 % 65536 = 54464 would leave 27234 bytes of
  bulk OUT stuck). Now caller chunks ≤65535 per call.
- Auto-recovery via reload_fx2_firmware() when an EP gets stuck after
  a malformed transfer — CPUCS=1 → reload xusb_xp2.hex → CPUCS=0 →
  re-enumeration wait.
- New `tri fpga idcode-cfg` subcommand that drives the full Type-1
  read of the IDCODE config register (addr 0x0C) for self-test.
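
The u16 overflow in the first side fix is easy to reproduce on paper. A minimal sketch of caller-side chunking, where `tck_chunks` is a hypothetical helper and not the crate's API:

```rust
/// Splits a TCK cycle count into pieces that each fit the 16-bit
/// length field of the vendor request.
fn tck_chunks(total: u32) -> Vec<u16> {
    let max = u32::from(u16::MAX);
    let mut remaining = total;
    let mut chunks = Vec::new();
    while remaining > 0 {
        let n = remaining.min(max);
        chunks.push(n as u16); // safe: n <= 65535
        remaining -= n;
    }
    chunks
}
```

A request for 120_000 cycles becomes two calls of 65535 and 54465 cycles, whereas a plain `as u16` cast would silently truncate it to 54464.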

Remaining work (separate ticket, not in this commit): tri fpga flash-id
still returns FF FF FE on the proxy bridge. STAT shows DONE=1 EOS=1 and
USER1 BSCAN selects correctly, but MISO is floating during SPI transfers.
Likely cause: bridge CS_N routing or Migen JTAG2SPI wire framing for the
new pinout. See docs/fpga/SPI_FLASH_DEBUG.md for next steps.

Updates #590
Closes #592