16 changes: 15 additions & 1 deletion rfcs/Proquint.md
@@ -13,4 +13,18 @@ For example, the proquint encoding of the bytestring `[127, 0, 0, 1]` (the data

```
pro-lusab-babad
```

## Odd-byte inputs

The original proquint specification operates on 16-bit chunks and is silent on inputs whose length is not a multiple of two bytes. Multibase requires arbitrary byte sequences to be encodable, so this document specifies the following extension:

When the input has an odd number of bytes, every pair of bytes is encoded as a 5-character `CVCVC` block as usual, and the final byte is encoded as a 3-character `CVC` block representing the high 8 bits of a 16-bit value whose low 8 bits are zero. The trailing `CVC` block is joined to the preceding blocks with `-` in the usual way.

(optional) hm.. the 16-bit framing is hard to follow, and slightly imprecise: iiuc a `CVC` block carries 10 bits (4 + 2 + 4), not 8. A bullet list spelling out where each pair of bits goes is clearer and avoids the implicit 16-bit construction:

Suggested change
When the input has an odd number of bytes, every pair of bytes is encoded as a 5-character `CVCVC` block as usual, and the final byte is encoded as a 3-character `CVC` block in which:
- the first consonant carries bits 7-4 of the byte;
- the vowel carries bits 3-2 of the byte;
- the second consonant carries bits 1-0 of the byte in its high 2 bits, with its low 2 bits zero.
The trailing `CVC` block is joined to the preceding blocks with `-` in the usual way.


For example, the single byte `[0x21]` (`!`) is encoded as:

```
pro-fah
```
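
The bit layout of the trailing block can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `encode_trailing_byte` is hypothetical, and the consonant and vowel tables are assumed to be the ones from the original proquint specification, which this excerpt does not reproduce.

```python
# Assumption: standard proquint tables (4 bits per consonant, 2 bits per vowel).
CONSONANTS = "bdfghjklmnprstvz"
VOWELS = "aiou"


def encode_trailing_byte(byte: int) -> str:
    """Encode one leftover byte as a 3-character CVC block (hypothetical helper)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value in 0..255")
    first = CONSONANTS[byte >> 4]          # bits 7-4 of the byte
    vowel = VOWELS[(byte >> 2) & 0b11]     # bits 3-2 of the byte
    last = CONSONANTS[(byte & 0b11) << 2]  # bits 1-0 in the high 2 bits, low 2 bits zero
    return first + vowel + last


print(encode_trailing_byte(0x21))  # fah
```

Because the low two bits of the final consonant index are always zero, only indices 0, 4, 8, and 12 can occur there, which correspond to `b`, `h`, `m`, and `s`.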

The second consonant in the trailing `CVC` block carries the two least-significant bits of the input byte in its two most-significant bits; its two least-significant bits MUST be zero. Decoders MUST reject any trailing `CVC` block where this is not the case, so that the encoding is canonical and bijective. The four valid trailing consonants are `b`, `h`, `m`, and `s`.

(optional) The example shows `pro-fah` for a 1-byte input, but the prose only says the trailing block joins with `-` to the preceding blocks. Adding a sentence that covers the single-block case removes the ambiguity:

Suggested change
The second consonant in the trailing `CVC` block carries the two least-significant bits of the input byte in its two most-significant bits; its two least-significant bits MUST be zero. Decoders MUST reject any trailing `CVC` block where this is not the case, so that the encoding is canonical and bijective. The four valid trailing consonants are `b`, `h`, `m`, and `s`. When the input is a single byte, the trailing `CVC` block is the only block, so the output is `pro-` followed directly by the three characters, as in the example above.
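
The rejection rule for the trailing block might look like this on the decoding side. This is a sketch under stated assumptions: the names `decode_trailing_block` and `decode_single_byte` are hypothetical, and the tables are assumed to be the standard proquint ones rather than anything defined in this diff.

```python
# Assumption: standard proquint tables (4 bits per consonant, 2 bits per vowel).
CONSONANTS = "bdfghjklmnprstvz"
VOWELS = "aiou"


def decode_trailing_block(block: str) -> int:
    """Decode a trailing CVC block to one byte, rejecting non-canonical forms."""
    if len(block) != 3:
        raise ValueError("trailing block must be exactly 3 characters (CVC)")
    c1, v, c2 = block
    if c1 not in CONSONANTS or v not in VOWELS or c2 not in CONSONANTS:
        raise ValueError("trailing block is not of the form CVC")
    hi = CONSONANTS.index(c1)  # bits 7-4 of the byte
    mid = VOWELS.index(v)      # bits 3-2 of the byte
    lo = CONSONANTS.index(c2)  # bits 1-0 of the byte, shifted left by 2
    if lo & 0b11:
        # Only b, h, m, s have their low 2 bits clear.
        raise ValueError("non-canonical trailing block: low 2 bits must be zero")
    return (hi << 4) | (mid << 2) | (lo >> 2)


def decode_single_byte(text: str) -> bytes:
    """Decode the single-block case: `pro-` followed directly by one CVC block."""
    prefix = "pro-"
    if not text.startswith(prefix):
        raise ValueError("missing 'pro-' prefix")
    return bytes([decode_trailing_block(text[len(prefix):])])


print(decode_single_byte("pro-fah"))  # b'!'
```

A block such as `fad` decodes to the same upper six bits as `fah` but has non-zero padding bits, so this sketch raises `ValueError` for it instead of silently accepting a second spelling of the same byte.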

2 changes: 2 additions & 0 deletions tests/basic.csv
@@ -15,10 +15,12 @@ base32hexpadupper, "TF5IN683DC5N6I811"
base32z, "hxf1zgedpcfzg1ebb"
base36, "k2lcpzo5yikidynfl"
base36upper, "K2LCPZO5YIKIDYNFL"
base45, "RRFF.OEB$D5/DZ24"
base58flickr, "Z7Pznk19XTTzBtx"
base58btc, "z7paNL19xttacUY"
base64, "meWVzIG1hbmkgIQ"
base64pad, "MeWVzIG1hbmkgIQ=="
base64url, "ueWVzIG1hbmkgIQ"
base64urlpad, "UeWVzIG1hbmkgIQ=="
proquint, "pro-lojoj-lasob-kujod-kunon-fabod"
base256emoji, "🚀🏃✋🌈😅🌷🤤😻🌟😅👏"
2 changes: 2 additions & 0 deletions tests/leading_zero.csv
@@ -15,10 +15,12 @@ base32hexpadupper, "T01SMASP0DLGMSQ9044======"
base32z, "hybhskh3ypiosh4jyrr"
base36, "k02lcpzo5yikidynfl"
base36upper, "K02LCPZO5YIKIDYNFL"
base45, "RV206$CL44CEC2DDX0"
base58flickr, "Z17Pznk19XTTzBtx"
base58btc, "z17paNL19xttacUY"
base64, "mAHllcyBtYW5pICE"
base64pad, "MAHllcyBtYW5pICE="
base64url, "uAHllcyBtYW5pICE"
base64urlpad, "UAHllcyBtYW5pICE="
proquint, "pro-badun-kijug-fadot-kajov-kohob-fah"
base256emoji, "🚀🚀🏃✋🌈😅🌷🤤😻🌟😅👏"
2 changes: 2 additions & 0 deletions tests/two_leading_zeros.csv
@@ -15,10 +15,12 @@ base32hexpadupper, "T0007IPBJ41MM2RJ940GG===="
base32z, "hyyy813murbssn5ujryoo"
base36, "k002lcpzo5yikidynfl"
base36upper, "K002LCPZO5YIKIDYNFL"
base45, "R000RFF.OEB$D5/DZ24"
base58flickr, "Z117Pznk19XTTzBtx"
base58btc, "z117paNL19xttacUY"
base64, "mAAB5ZXMgbWFuaSAh"
base64pad, "MAAB5ZXMgbWFuaSAh"
base64url, "uAAB5ZXMgbWFuaSAh"
base64urlpad, "UAAB5ZXMgbWFuaSAh"
proquint, "pro-babab-lojoj-lasob-kujod-kunon-fabod"
base256emoji, "🚀🚀🚀🏃✋🌈😅🌷🤤😻🌟😅👏"
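
The `proquint` rows in the three CSV files can be cross-checked against their `base64pad` rows with a short script. This is a hedged sketch, not the project's test harness: `encode_proquint` is a hypothetical name, the tables are assumed to be the standard proquint ones, and the input bytes are recovered by stripping the `M` multibase prefix from each `base64pad` value and decoding the rest.

```python
import base64

# Assumption: standard proquint tables (4 bits per consonant, 2 bits per vowel).
CONSONANTS = "bdfghjklmnprstvz"
VOWELS = "aiou"


def encode_proquint(data: bytes) -> str:
    """Encode bytes as `pro-` plus CVCVC blocks, with a CVC block for an odd tail."""
    blocks = []
    for i in range(0, len(data) - 1, 2):
        word = (data[i] << 8) | data[i + 1]  # big-endian 16-bit chunk
        blocks.append(
            CONSONANTS[(word >> 12) & 0xF]
            + VOWELS[(word >> 10) & 0x3]
            + CONSONANTS[(word >> 6) & 0xF]
            + VOWELS[(word >> 4) & 0x3]
            + CONSONANTS[word & 0xF]
        )
    if len(data) % 2:
        last = data[-1]
        # Odd-byte extension: 8 data bits followed by 2 zero padding bits.
        blocks.append(
            CONSONANTS[last >> 4]
            + VOWELS[(last >> 2) & 0x3]
            + CONSONANTS[(last & 0x3) << 2]
        )
    # Empty input is not covered by the excerpt; this sketch returns just "pro-".
    return "pro-" + "-".join(blocks)


# (base64pad row, proquint row) pairs copied from the three CSV files above.
VECTORS = [
    ("MeWVzIG1hbmkgIQ==", "pro-lojoj-lasob-kujod-kunon-fabod"),
    ("MAHllcyBtYW5pICE=", "pro-badun-kijug-fadot-kajov-kohob-fah"),
    ("MAAB5ZXMgbWFuaSAh", "pro-babab-lojoj-lasob-kujod-kunon-fabod"),
]

for b64pad, expected in VECTORS:
    raw = base64.b64decode(b64pad[1:])  # drop the 'M' multibase prefix
    assert encode_proquint(raw) == expected, (raw, encode_proquint(raw))
```

Only `tests/leading_zero.csv` has an odd-length input (11 bytes), so it is the only vector here that exercises the trailing `CVC` block.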