This describes the current intended behaviour of the devicecode fabric link, as implemented on the Lua side and intended for the matching TinyGo side.
This is still a first pass. The aim is a clean and reliable CM5 ↔ MCU control-plane link over UART, with room to evolve later.
The main design choices remain:
- keep raw UART bytes out of the local in-process bus;
- keep OS and UART ownership on the Lua side inside HAL;
- make fabric the only service that knows about remote peers;
- carry a small explicit protocol over a byte stream;
- preserve useful bus semantics such as publish, retained state, unretain, and directed call/reply;
- avoid pretending the remote side is just another in-process bus connection.
1. Big picture
On the Lua side there is one fabric service. It owns one session per configured link. For the MCU link, that session currently uses a UART transport.
HAL on the Lua side opens the UART and returns a Stream capability to fabric. fabric then reads and writes protocol messages on that stream.
On the TinyGo side, the matching component should be the peer session layer over the UART. It sits above a raw byte stream and below the MCU’s internal service/runtime environment.
So the UART link is:
- Lua HAL owns UART fd / driver
- Lua fabric speaks fabric protocol over the stream
- TinyGo fabric speaks the same protocol over the raw UART
- TinyGo fabric imports/exports messages to the MCU’s internal service world
The protocol meaning should remain transport-neutral even though, for now, it is carried over UART.
2. First-pass scope
Version 1 supports:
- link bring-up and peer handshake
- heartbeat
- ordinary publish
- retained publish
- unretain
- directed call / reply, equivalent to lane-B-style RPC proxying
Version 1 does not yet support:
- distributed subscriptions
- route advertisements
- multi-hop mesh forwarding
- firmware transfer
- binary framing
- on-wire authentication
3. Wire format
For the first implementation, the wire format is deliberately simple:
- one compact JSON object per line
- UTF-8 text
- newline (\n) terminates one message
Conceptually:
{"t":"hello",...}\n
{"t":"pub",...}\n
{"t":"call",...}\n
Framing rule
Treat the UART as a byte stream. Accumulate bytes until newline, then decode that full line as one JSON message.
Encoding rule
Do not emit pretty-printed JSON. One compact JSON object per line only.
Practical note
JSON strings may contain escaped \n, but not literal frame-breaking newline bytes. A normal JSON encoder does the right thing.
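The encoding rule can be sketched in Go as follows. This is a minimal illustration, not the fabric implementation: the helper name encodeLine is hypothetical, and it assumes the standard encoding/json package is usable on the target (TinyGo support for reflection-based JSON may vary by version).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeLine marshals msg as compact JSON and appends the newline that
// terminates the frame. json.Marshal never pretty-prints, and it escapes
// newlines inside strings as \n, so the only raw newline is the terminator.
func encodeLine(msg any) ([]byte, error) {
	b, err := json.Marshal(msg)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

func main() {
	line, err := encodeLine(map[string]any{"t": "pub", "note": "two\nlines"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", line)
	// Count raw newline bytes in the frame: only the terminator remains.
	fmt.Println(bytes.Count(line, []byte{'\n'}))
}
```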
4. Protocol version
The current protocol version is 1, matching "proto":1 in the handshake examples.
It is carried in handshake messages.
A peer should reject or ignore incompatible protocol versions rather than trying to continue with mismatched assumptions.
5. Message types
All messages are JSON objects with a required string field t.
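One way to dispatch on the required t field is to decode only that field first and decode the rest per message type. The sketch below is illustrative; the names envelope, messageType and errNoType are hypothetical, and encoding/json is assumed to be available.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// envelope decodes only the discriminator; other fields are ignored here.
type envelope struct {
	T string `json:"t"`
}

var errNoType = errors.New("missing or empty t field")

// messageType returns the value of "t" for one received line. It fails if
// the line is not a JSON object, if t is not a string, or if t is absent.
func messageType(line []byte) (string, error) {
	var env envelope
	if err := json.Unmarshal(line, &env); err != nil {
		return "", err
	}
	if env.T == "" {
		return "", errNoType
	}
	return env.T, nil
}

func main() {
	for _, line := range []string{
		`{"t":"ping","ts":1712345678,"sid":"9e3b"}`,
		`{"ts":1712345678}`,
		`not json`,
	} {
		t, err := messageType([]byte(line))
		switch {
		case err != nil:
			fmt.Println("bad frame:", err)
		case t == "ping":
			fmt.Println("dispatch to ping handler")
		default:
			fmt.Println("unknown t, log and ignore:", t)
		}
	}
}
```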
5.1 hello
Example:
{"t":"hello","node":"cm5-local","peer":"mcu-1","sid":"9e3b...","proto":1,"caps":{"pub":true,"call":true}}
Fields:
t: "hello"
node: sender node id
peer: intended remote peer id
sid: sender session id
proto: protocol version
caps: high-level capability flags
Semantics
The sender is saying:
- this is who I am (node)
- this is who I think you are (peer)
- this is my current session id (sid)
- this is the protocol version I am speaking (proto)
- these are the capability families I support (caps)
Receiver behaviour
On receiving hello:
- validate the shape;
- verify peer is acceptable for this device;
- verify proto is supported;
- record remote node;
- record remote sid;
- treat a changed remote sid as a fresh peer session;
- send back hello_ack.
A fresh peer session means any pending call/reply correlation state tied to the previous peer session should be discarded.
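The receiver steps above could look roughly like this on the TinyGo side. It is a sketch under assumptions: the type and method names are hypothetical, "acceptable peer" is taken to mean "equals my own node id", and the supported protocol version is taken to be 1 as in the examples.

```go
package main

import "fmt"

const protoVersion = 1 // assumed from "proto":1 in the examples

type Hello struct {
	T     string          `json:"t"`
	Node  string          `json:"node"`
	Peer  string          `json:"peer"`
	SID   string          `json:"sid"`
	Proto int             `json:"proto"`
	Caps  map[string]bool `json:"caps"`
}

type HelloAck struct {
	T     string `json:"t"`
	Node  string `json:"node"`
	SID   string `json:"sid"`
	Proto int    `json:"proto"`
	OK    bool   `json:"ok"`
}

type Session struct {
	LocalNode, LocalSID   string
	RemoteNode, RemoteSID string
}

// OnHello validates a hello, records the remote identity and builds the
// hello_ack. fresh is true when the remote sid differs from the recorded
// one; the caller should then discard pending call/reply state.
func (s *Session) OnHello(h Hello) (ack HelloAck, fresh bool, ok bool) {
	if h.T != "hello" || h.Node == "" || h.SID == "" {
		return HelloAck{}, false, false // wrong shape
	}
	if h.Peer != s.LocalNode || h.Proto != protoVersion {
		return HelloAck{}, false, false // not addressed to us, or unsupported proto
	}
	fresh = h.SID != s.RemoteSID
	s.RemoteNode, s.RemoteSID = h.Node, h.SID
	ack = HelloAck{T: "hello_ack", Node: s.LocalNode, SID: s.LocalSID, Proto: protoVersion, OK: true}
	return ack, fresh, true
}

func main() {
	s := &Session{LocalNode: "mcu-1", LocalSID: "a12f"}
	ack, fresh, ok := s.OnHello(Hello{T: "hello", Node: "cm5-local", Peer: "mcu-1", SID: "9e3b", Proto: 1})
	fmt.Println(ack.T, fresh, ok)
}
```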
5.2 hello_ack
Example:
{"t":"hello_ack","node":"mcu-1","sid":"a12f...","proto":1,"ok":true}
Fields:
t: "hello_ack"
node: sender node id
sid: sender session id
proto: protocol version
ok: boolean, currently expected to be true
Semantics
Acknowledges handshake and provides the sender’s own current session identity.
On receiving hello_ack, the peer should:
- verify proto;
- record remote node;
- record remote sid;
- treat a changed remote sid as a fresh peer session.
5.3 ping
Example:
{"t":"ping","ts":1712345678,"sid":"9e3b..."}
Fields:
t: "ping"
ts: sender timestamp, opaque in v1
sid: sender session id
Behaviour
Reply with pong.
5.4 pong
Example:
{"t":"pong","ts":1712345678,"sid":"a12f..."}
Fields:
t: "pong"
ts: opaque timestamp
sid: sender session id
Semantics
Heartbeat only. No strict clock semantics in v1.
5.5 pub
Example:
{"t":"pub","topic":["state","mcu","health"],"payload":{"ok":true},"retain":false}
Fields:
t: "pub"
topic: array of non-empty strings
payload: arbitrary JSON value
retain: boolean
Semantics
Publish one message to the peer, which maps it through its import rules.
If retain is true, the receiver should treat this as retained state for the mapped local topic.
If retain is false, treat it as a transient publish.
Constraint
For v1, topic tokens on the wire are strings only.
5.6 unretain
Example:
{"t":"unretain","topic":["state","mcu","health"]}
Fields:
t: "unretain"
topic: array of non-empty strings
Semantics
Clear retained state for the mapped local topic.
5.7 call
Example:
{"t":"call","id":"f6a2...","topic":["rpc","hal","read_state"],"payload":{"ns":"config","key":"services"},"timeout_ms":5000}
Fields:
t: "call"
id: correlation id generated by caller
topic: concrete topic array for the remote directed call target
payload: arbitrary JSON value
timeout_ms: advisory timeout in milliseconds
Semantics
This is a directed request to the remote peer. The receiver should map topic through import-call rules, invoke the corresponding local handler, and send exactly one reply.
Important rule
call.topic must be concrete in v1. No wildcards.
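The topic constraints for pub, unretain and call can be checked with two small helpers. This is a sketch; the function names are hypothetical, and rejecting an empty topic array is an assumption, since the text only requires non-empty string tokens.

```go
package main

import "fmt"

// validTopic reports whether topic is a non-empty array of non-empty tokens.
// Rejecting the empty array is an assumption of this sketch.
func validTopic(topic []string) bool {
	if len(topic) == 0 {
		return false
	}
	for _, tok := range topic {
		if tok == "" {
			return false
		}
	}
	return true
}

// concreteTopic additionally rejects the wildcard tokens + and #, as
// required for call.topic in v1.
func concreteTopic(topic []string) bool {
	if !validTopic(topic) {
		return false
	}
	for _, tok := range topic {
		if tok == "+" || tok == "#" {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(concreteTopic([]string{"rpc", "hal", "read_state"}))
	fmt.Println(concreteTopic([]string{"rpc", "+", "read_state"}))
	fmt.Println(validTopic([]string{"state", "", "health"}))
}
```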
5.8 reply
Success example:
{"t":"reply","corr":"f6a2...","ok":true,"payload":{"found":true,"data":"..."}}
Failure example:
{"t":"reply","corr":"f6a2...","ok":false,"err":"timeout"}
Fields:
t: "reply"
corr: correlation id matching a previous call.id
ok: boolean
payload: present when ok=true
err: string when ok=false
Semantics
Completes one pending call.
Exactly one reply should be emitted per accepted call.
If the receiver cannot route or execute the call, it should still reply with ok=false.
6. Topic model
Topics on the wire are JSON arrays of strings.
Examples:
["state","mcu","health"]
["rpc","hal","dump"]
["config","device"]
Do not encode topics as slash-separated strings on the wire.
7. Topic remapping
Each side uses static configured remapping rules.
A rule is conceptually:
- local pattern ↔ remote pattern
with wildcard support:
+ means one token
# means the remaining tail
Example
Remote:
{ "state", "#" }
maps to local:
{ "peer", "mcu-1", "state", "#" }
So a remote publish:
{"t":"pub","topic":["state","net","link","wan0"], ...}
becomes locally:
{"peer","mcu-1","state","net","link","wan0"}
Recommendation for TinyGo
Mirror the same mechanism:
- export rules for what the MCU may send out
- import rules for what the MCU accepts from the CM5
- proxy-call rules for directed RPC
Keep these static in v1.
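A static remapping rule of this kind can be sketched in Go as one function that matches a topic against the source pattern and rebuilds it from the target pattern. The function name remap is hypothetical, and two details are assumptions: # is only meaningful as the last pattern token, and # may match an empty tail.

```go
package main

import "fmt"

// remap matches topic against the pattern from and, on success, rebuilds it
// using the pattern to. Each + captures one token and # captures the
// remaining tail; captures are substituted into to in order.
func remap(from, to, topic []string) ([]string, bool) {
	var singles, tail []string
	hasTail := false
	i := 0
	for _, p := range from {
		if p == "#" {
			tail, hasTail = topic[i:], true
			i = len(topic)
			break
		}
		if i >= len(topic) {
			return nil, false // topic shorter than pattern
		}
		if p == "+" {
			singles = append(singles, topic[i])
		} else if p != topic[i] {
			return nil, false // literal token mismatch
		}
		i++
	}
	if i != len(topic) {
		return nil, false // topic longer than pattern
	}
	out := make([]string, 0, len(to)+len(tail))
	for _, p := range to {
		switch p {
		case "+":
			if len(singles) == 0 {
				return nil, false // target wants more captures than source gave
			}
			out = append(out, singles[0])
			singles = singles[1:]
		case "#":
			if !hasTail {
				return nil, false
			}
			out = append(out, tail...)
		default:
			out = append(out, p)
		}
	}
	return out, true
}

func main() {
	local, ok := remap(
		[]string{"state", "#"},
		[]string{"peer", "mcu-1", "state", "#"},
		[]string{"state", "net", "link", "wan0"},
	)
	fmt.Println(local, ok)
}
```

The same function works in both directions: export rules call it with the local pattern as from, import rules with the remote pattern as from.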
8. Directed call mapping
There are two directions.
8.1 Lua local → TinyGo remote
Lua fabric binds local proxy endpoints. When called locally, Lua fabric sends:
{"t":"call","id":"...","topic":["rpc","mcu","reboot_to_bootloader"],"payload":{"reason":"update"},"timeout_ms":5000}
TinyGo fabric should:
- map that topic to a local MCU service handler;
- invoke it;
- send back reply.
8.2 TinyGo local → Lua remote
TinyGo fabric may send a call to a configured remote target, for example:
{"t":"call","id":"...","topic":["rpc","hal","read_state"],"payload":{"ns":"config","key":"services"},"timeout_ms":5000}
Lua fabric maps that to a local call target and returns a reply.
Rule for both sides
If no route matches, send:
{"t":"reply","corr":"...","ok":false,"err":"no_route"}
Do not silently drop a call.
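The "exactly one reply, never a silent drop" rule can be sketched as a single function that always returns a reply. Everything here is illustrative: the Handler signature, the route table keyed by a NUL-joined topic (a local lookup key only, never sent on the wire) and the example handlers are assumptions, not the fabric API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

type Call struct {
	T         string          `json:"t"`
	ID        string          `json:"id"`
	Topic     []string        `json:"topic"`
	Payload   json.RawMessage `json:"payload"`
	TimeoutMS int             `json:"timeout_ms"`
}

type Reply struct {
	T       string `json:"t"`
	Corr    string `json:"corr"`
	OK      bool   `json:"ok"`
	Payload any    `json:"payload,omitempty"`
	Err     string `json:"err,omitempty"`
}

// Handler is a hypothetical local service entry point.
type Handler func(payload json.RawMessage) (any, error)

func routeKey(topic []string) string { return strings.Join(topic, "\x00") }

// handleCall always produces exactly one reply for an accepted call.
func handleCall(routes map[string]Handler, c Call) Reply {
	h, ok := routes[routeKey(c.Topic)]
	if !ok {
		return Reply{T: "reply", Corr: c.ID, OK: false, Err: "no_route"}
	}
	result, err := h(c.Payload)
	if err != nil {
		return Reply{T: "reply", Corr: c.ID, OK: false, Err: err.Error()}
	}
	return Reply{T: "reply", Corr: c.ID, OK: true, Payload: result}
}

// Example handlers used only for this demonstration.
func acceptReboot(json.RawMessage) (any, error) {
	return map[string]bool{"accepted": true}, nil
}

func alwaysFails(json.RawMessage) (any, error) {
	return nil, errors.New("handler_failed")
}

func main() {
	routes := map[string]Handler{
		routeKey([]string{"rpc", "mcu", "reboot_to_bootloader"}): acceptReboot,
		routeKey([]string{"rpc", "mcu", "broken"}):               alwaysFails,
	}
	for _, name := range []string{"reboot_to_bootloader", "broken", "missing"} {
		r := handleCall(routes, Call{ID: "1234", Topic: []string{"rpc", "mcu", name}})
		out, _ := json.Marshal(r)
		fmt.Println(string(out))
	}
}
```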
9. Retained state semantics
Retained state is simple in v1.
If retain=true on a pub, the receiver should treat that as the current retained value for the mapped topic.
If an unretain arrives, the receiver should clear the retained value for the mapped topic.
Reconnect behaviour
On link-up, the exporter should replay the current retained exported state.
On the Lua side this is implemented by watching retained lifecycle with replay enabled. The TinyGo side should follow the same model conceptually:
- on session establishment, emit current retained exported state again;
- later retained updates become pub(..., retain=true);
- later retained removals become unretain.
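The replay model above might be sketched on the TinyGo side as a small exporter that remembers retained values and re-emits them when a session comes up. All names here (Exporter, SetRetained, ClearRetained, OnSessionUp) are hypothetical, and the send callback stands in for whatever writes a message to the wire.

```go
package main

import (
	"fmt"
	"strings"
)

type Pub struct {
	T       string   `json:"t"`
	Topic   []string `json:"topic"`
	Payload any      `json:"payload"`
	Retain  bool     `json:"retain"`
}

type Unretain struct {
	T     string   `json:"t"`
	Topic []string `json:"topic"`
}

// Exporter tracks retained exported state and mirrors it to the peer.
type Exporter struct {
	up       bool
	retained map[string]Pub
	send     func(msg any)
}

func NewExporter(send func(msg any)) *Exporter {
	return &Exporter{retained: map[string]Pub{}, send: send}
}

func topicKey(topic []string) string { return strings.Join(topic, "\x00") }

// SetRetained records the value and, if the session is up, sends it.
func (e *Exporter) SetRetained(topic []string, payload any) {
	p := Pub{T: "pub", Topic: topic, Payload: payload, Retain: true}
	e.retained[topicKey(topic)] = p
	if e.up {
		e.send(p)
	}
}

// ClearRetained forgets the value and, if the session is up, sends unretain.
func (e *Exporter) ClearRetained(topic []string) {
	k := topicKey(topic)
	if _, ok := e.retained[k]; !ok {
		return
	}
	delete(e.retained, k)
	if e.up {
		e.send(Unretain{T: "unretain", Topic: topic})
	}
}

// OnSessionUp replays all current retained state (in unspecified order).
func (e *Exporter) OnSessionUp() {
	e.up = true
	for _, p := range e.retained {
		e.send(p)
	}
}

func (e *Exporter) OnSessionDown() { e.up = false }

func main() {
	e := NewExporter(func(msg any) { fmt.Printf("%+v\n", msg) })
	health := []string{"state", "mcu", "health"}
	e.SetRetained(health, map[string]any{"ok": true}) // link down: stored only
	e.OnSessionUp()                                   // replayed here
	e.ClearRetained(health)                           // sent as unretain
}
```

One limitation of this sketch: a value cleared while the link is down produces no unretain after reconnect, so the receiver would need its own policy for stale retained state.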
10. Session state
The Lua side currently uses these session states:
opening
session_up
ready
down
Meaning:
opening: transport is up, handshake/local setup incomplete
session_up: peer session established, local forwarding surfaces not yet all installed
ready: peer session established and local forwarding surfaces installed
down: terminal failure for this link session
The TinyGo side does not need to mirror the exact names, but should have equivalent internal distinctions.
Suggested minimal state
At minimum:
- link status
- remote node id
- local session id
- remote session id
- last hello seen
- last heartbeat seen
- pending outgoing calls by correlation id
- import/export/proxy rule tables
11. Session replacement
A change in peer session id is significant.
If a peer sends hello or hello_ack with a different sid from the current recorded peer session, treat that as:
- a fresh peer session;
- reset pending call correlation state;
- keep the transport up, but replace peer-session state.
Late replies from the old peer session should be dropped.
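Session replacement and late-reply dropping can be sketched with a pending-call map keyed by correlation id. The names are hypothetical, the code assumes a single goroutine owns the map, and failing local waiters with a "session_reset" error is one possible way to discard pending state, not something the protocol mandates.

```go
package main

import "fmt"

type Result struct {
	OK  bool
	Err string
}

type Peer struct {
	remoteSID string
	pending   map[string]chan Result
}

func NewPeer() *Peer { return &Peer{pending: map[string]chan Result{}} }

// Begin registers an outgoing call and returns the channel its result
// will arrive on. The channel is buffered so delivery never blocks.
func (p *Peer) Begin(id string) <-chan Result {
	ch := make(chan Result, 1)
	p.pending[id] = ch
	return ch
}

// ObserveSID is called with the sid from every hello or hello_ack. When the
// sid changes it fails all pending calls and reports a fresh peer session.
func (p *Peer) ObserveSID(sid string) bool {
	if sid == p.remoteSID {
		return false
	}
	p.remoteSID = sid
	for id, ch := range p.pending {
		ch <- Result{OK: false, Err: "session_reset"}
		delete(p.pending, id)
	}
	return true
}

// Resolve completes a pending call. It returns false for unknown
// correlation ids, which covers late replies from an old session.
func (p *Peer) Resolve(corr string, r Result) bool {
	ch, ok := p.pending[corr]
	if !ok {
		return false
	}
	delete(p.pending, corr)
	ch <- r
	return true
}

func main() {
	p := NewPeer()
	p.ObserveSID("9e3b")
	ch := p.Begin("f6a2")
	p.ObserveSID("77c0") // peer restarted
	fmt.Println(<-ch)
	fmt.Println(p.Resolve("f6a2", Result{OK: true})) // late reply is dropped
}
```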
12. Error handling rules
These should be followed on both sides.
Invalid JSON line
- log it;
- discard it;
- do not immediately bring the whole session down.
Oversize line
- log it;
- discard it;
- count it as a bad frame.
Unknown t
Malformed message of known type
- log it;
- ignore it;
- if it is recognisably a call and has a usable id, best effort reply with ok=false.
Call with no route
- reply with ok=false, err="no_route".
Local handler failure
- reply with ok=false, err="<reason>".
Timeout waiting for reply
- local caller times out;
- clear pending entry;
- treat late reply as unknown and drop it.
Repeated bad frames
A few bad frames should not kill the session immediately. Repeated bad frames within a time window should.
Current Lua defaults are:
- bad frame limit: 5
- bad frame window: 30s
These are local policy defaults, not wire-level requirements.
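A bad-frame budget of this kind can be sketched as a sliding window over timestamps. The type name is hypothetical, time is a caller-supplied millisecond counter (convenient on an MCU), and tripping when the count reaches the limit, rather than when it exceeds it, is an assumption about how the Lua side counts.

```go
package main

import "fmt"

// BadFrameBudget counts bad frames inside a sliding time window.
type BadFrameBudget struct {
	Limit    int
	WindowMS int64
	stamps   []int64
}

// Record notes one bad frame at nowMS and reports whether the number of
// bad frames inside the window has reached the limit.
func (b *BadFrameBudget) Record(nowMS int64) bool {
	kept := b.stamps[:0]
	for _, t := range b.stamps {
		if nowMS-t < b.WindowMS {
			kept = append(kept, t) // still inside the window
		}
	}
	b.stamps = append(kept, nowMS)
	return len(b.stamps) >= b.Limit
}

func main() {
	b := BadFrameBudget{Limit: 5, WindowMS: 30_000}
	for i := int64(0); i < 5; i++ {
		fmt.Println(b.Record(i * 1000))
	}
}
```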
13. Timeouts and liveness
Current Lua defaults are:
- hello retry: 10s
- idle ping interval: 15s
- stale link timeout: 45s
Directed call timeout:
- use timeout_ms if present and sensible;
- otherwise use a local default, typically around 5s.
These are local policy defaults, not hard protocol guarantees.
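These defaults could be captured on the TinyGo side as constants plus a few helpers. This is a sketch only: the names are hypothetical, "sensible" is interpreted as positive and no larger than an invented upper bound of 60s, and idleness is measured from the last traffic on the link.

```go
package main

import "fmt"

const (
	helloRetryMS         = 10_000
	idlePingMS           = 15_000
	staleLinkMS          = 45_000
	defaultCallTimeoutMS = 5_000
	maxCallTimeoutMS     = 60_000 // hypothetical bound for "sensible"
)

// callTimeoutMS returns the requested timeout if it is positive and not
// above the local maximum, otherwise the local default.
func callTimeoutMS(requested int64) int64 {
	if requested > 0 && requested <= maxCallTimeoutMS {
		return requested
	}
	return defaultCallTimeoutMS
}

// shouldPing reports whether the link has been idle long enough to ping.
func shouldPing(nowMS, lastTrafficMS int64) bool {
	return nowMS-lastTrafficMS >= idlePingMS
}

// linkStale reports whether nothing has been heard for the stale timeout.
func linkStale(nowMS, lastSeenMS int64) bool {
	return nowMS-lastSeenMS >= staleLinkMS
}

func main() {
	fmt.Println(callTimeoutMS(2500), callTimeoutMS(0))
	fmt.Println(shouldPing(16_000, 0), linkStale(16_000, 0))
}
```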
14. Frame size limits
For v1, both sides should use a bounded maximum line size.
Current Lua default:
max_line_bytes = 4096
The TinyGo side should also enforce a fixed maximum line length and reject oversize input safely.
Do not allow unbounded line buffering.
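A bounded line assembler for the TinyGo UART side might look like the sketch below. The type and method names are hypothetical. It is fed one byte at a time, never buffers more than Max bytes, and discards an oversize line up to its terminating newline; whether the limit counts the newline itself is an assumption (here it does not).

```go
package main

import "fmt"

// LineAssembler accumulates bytes into newline-terminated frames with a
// hard upper bound on buffered bytes.
type LineAssembler struct {
	Max      int
	buf      []byte
	overflow bool
}

// Feed consumes one byte. ok is true when a complete in-bounds line is
// returned; oversize is true once per discarded over-long line, reported
// when its terminating newline arrives.
func (a *LineAssembler) Feed(b byte) (line []byte, ok bool, oversize bool) {
	if b == '\n' {
		if a.overflow {
			a.overflow = false
			a.buf = a.buf[:0]
			return nil, false, true
		}
		line = append([]byte(nil), a.buf...) // copy so the buffer can be reused
		a.buf = a.buf[:0]
		return line, true, false
	}
	if a.overflow {
		return nil, false, false // still skipping the oversize line
	}
	if len(a.buf) >= a.Max {
		a.overflow = true
		a.buf = a.buf[:0]
		return nil, false, false
	}
	a.buf = append(a.buf, b)
	return nil, false, false
}

// feedAll is a demonstration helper that pushes a whole string through.
func feedAll(a *LineAssembler, data string) (lines []string, oversize int) {
	for i := 0; i < len(data); i++ {
		line, ok, over := a.Feed(data[i])
		if over {
			oversize++
		}
		if ok {
			lines = append(lines, string(line))
		}
	}
	return lines, oversize
}

func main() {
	a := &LineAssembler{Max: 4096}
	lines, over := feedAll(a, "{\"t\":\"ping\"}\n{\"t\":\"pong\"}\n")
	fmt.Println(lines, over)
}
```

Each oversize report can be counted against the bad-frame budget, and each returned line handed to the JSON decoder.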
15. Payload shape
Payloads are JSON values. In practice, use JSON objects for application-facing messages.
Do not put binary blobs directly into this v1 control-plane protocol. Firmware transfer is a later subprotocol.
16. Transport support
The only transport currently implemented on the Lua side is UART.
Other transport kinds may appear later, but should not be assumed for current interop work.
17. TinyGo implementation structure
A sensible structure is:
UART transport
Responsible for:
- reading bytes until newline
- enforcing max line size
- writing one JSON line plus newline
Session layer
Responsible for:
- hello / hello_ack
- ping / pong
- peer session id tracking
- pending call map
- bad-frame budget
- dispatch by t
Router
Responsible for:
- applying import/export rules
- mapping incoming pub to local topics
- mapping incoming call to local handlers
- forwarding local exported publishes to the wire
- replaying retained exported state on session establishment
Local integration
Responsible for:
- local publish
- local retained update / clear
- local directed call handling
- local retained replay source
18. Deliberate non-feature in v1
Firmware transfer is not part of this first control protocol.
Once the control path is solid, a separate bulk-transfer subprotocol can be added with messages such as:
begin
ready
need
chunk
done
abort
That should not just be “a very large JSON message”.
19. Practical examples
19.1 CM5 announces retained config to MCU
{"t":"pub","topic":["config","device"],"payload":{"schema":"devicecode.mcu/1","rev":3,"data":{"mode":"normal"}},"retain":true}
19.2 MCU publishes retained health to CM5
{"t":"pub","topic":["state","mcu","health"],"payload":{"ok":true,"temp_c":41.2},"retain":true}
19.3 MCU clears retained health
{"t":"unretain","topic":["state","mcu","health"]}
19.4 CM5 calls remote MCU method
{"t":"call","id":"1234","topic":["rpc","mcu","reboot_to_bootloader"],"payload":{"reason":"update"},"timeout_ms":5000}
Reply:
{"t":"reply","corr":"1234","ok":true,"payload":{"accepted":true}}
20. Minimal implementation checklist
For the first working milestone, the TinyGo side should do all of this:
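- send hello
- answer hello with hello_ack
- track remote sid
- handle remote sid change
- answer ping with pong
- accept incoming pub
- accept incoming unretain
- send outgoing pub
- send outgoing unretain
- handle incoming call
- send reply
- send outgoing call
- match incoming reply to pending map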