|
| 1 | +# Pagination |
| 2 | + |
| 3 | +APIs often need to provide collections of data, most commonly in the |
| 4 | +[List][aip-132] standard method. However, collections can often be arbitrarily |
| 5 | +sized, and also often grow over time, increasing lookup time as well as the |
| 6 | +size of the responses being sent over the wire. Therefore, it is important that |
| 7 | +collections be paginated. |
| 8 | + |
| 9 | +## Guidance |
| 10 | + |
| 11 | +Operations returning collections of data **must** provide pagination _at the |
| 12 | +outset_, as it is a [backwards-incompatible change](#backwards-compatibility) |
| 13 | +to add pagination to an existing method. |
| 14 | + |
| 15 | +```typescript |
| 16 | +// The request structure for listing books. |
| 17 | +interface ListBooksRequest { |
| 18 | + // The parent, which owns this collection of books. |
| 19 | + // Format: publishers/{publisher} |
| 20 | + parent: string; |
| 21 | + |
| 22 | + // The maximum number of books to return. The service may return fewer than |
| 23 | + // this value. |
| 24 | + // If unspecified, at most 50 books will be returned. |
| 25 | + // The maximum value is 1000; values above 1000 will be coerced to 1000. |
| 26 | + maxPageSize: bigint; |
| 27 | + |
| 28 | + // A page token, received from a previous `ListBooks` call. |
| 29 | + // Provide this to retrieve the subsequent page. |
| 30 | + // |
| 31 | + // When paginating, all other parameters provided to `ListBooks` must match |
| 32 | + // the call that provided the page token. |
| 33 | + pageToken: string; |
| 34 | +} |
| 35 | + |
| 36 | +// The response structure from listing books. |
| 37 | +interface ListBooksResponse { |
| 38 | + // The books from the specified publisher. |
| 39 | + books: Book[]; |
| 40 | + |
| 41 | + // A token that can be sent as `pageToken` to retrieve the next page. |
| 42 | + // If this field is omitted, there are no subsequent pages. |
| 43 | + nextPageToken: string; |
| 44 | +} |
| 45 | +``` |
| 46 | + |
| 47 | +- Request definitions for collections **should** define an |
| 48 | + `int32 max_page_size` field, allowing users to specify the maximum number of |
| 49 | + results to return. |
| 50 | + - If the user does not specify `max_page_size` (or specifies `0`), the API |
| 51 | + chooses an appropriate default, which the API **should** document. The API |
| 52 | + **must not** return an error. |
| 53 | + - If the user specifies `max_page_size` greater than the maximum permitted by |
| 54 | + the service, the service **should** coerce down to the maximum permitted |
| 55 | + page size. |
| 56 | + - If the user specifies a negative value for `max_page_size`, the API |
| 57 | + **must** return a `400 Bad Request` error. |
| 58 | + - The service **should** return the number of results requested, unless the |
| 59 | + end of the collection is reached. |
| 60 | + - However, occasionally this is infeasible, especially within expected time |
| 61 | + limits. In these cases, the service **may** return fewer results than the |
| 62 | + number requested (including zero results), even if not at the end of the |
| 63 | + collection. |
| 64 | +- Request definitions for collections **should** define a `string page_token` |
| 65 | + field, allowing users to advance to the next page in the collection. |
| 66 | + - If the user changes the `max_page_size` in a request for subsequent pages, |
| 67 | + the service **must** honor the new page size. |
| 68 | + - The user is expected to keep all other arguments to the operation request |
| 69 | + the same; if any arguments are different, the API **should** send a |
| 70 | + `400 Bad Request` error. |
| 71 | +- The response **must not** be a streaming response. |
| 72 | +- Services **may** support using page tokens across versions of a service, but |
| 73 | + are not required to do so. |
| 74 | +- Response definitions for collections **must** define a |
| 75 | + `string next_page_token` field, providing the user with a page token that may |
| 76 | + be used to retrieve the next page. |
| 77 | + - The field containing pagination results **should** be the first field |
| 78 | + specified. It **should** be a repeated field containing a list of resources |
| 79 | + constituting a single page of results. |
| 80 | + - If the end of the collection has been reached, the `next_page_token` field |
| 81 | + **must** be empty. This is the _only_ way to communicate |
| 82 | + "end-of-collection" to users. |
| 83 | + - If the end of the collection has not been reached (or if the API cannot |
| 84 | + determine in time), the service **must** provide a `next_page_token`. |
| 85 | +- Response definitions **may** include a `string next_page_url` field |
| 86 | + containing the full URL for the next page. |
| 87 | +- Response definitions for collections **may** provide an `int32 total_size` |
| 88 | + field, providing the user with the total number of items in the list. |
| 89 | + - This total **may** be an estimate (but the API **should** explicitly |
| 90 | + document that). |
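
The `max_page_size` rules above can be sketched as a small server-side helper. This is a minimal illustration, not a prescribed implementation: the name `resolvePageSize` and the constants are hypothetical, the limits are borrowed from the `ListBooksRequest` comments, and the `RangeError` stands in for whatever the service maps to `400 Bad Request`.

```typescript
// Hypothetical limits, taken from the ListBooksRequest comments above.
const DEFAULT_PAGE_SIZE = 50;
const MAX_PAGE_SIZE = 1000;

// Returns the page size the service will actually use.
// Throws RangeError for negative input; the caller is assumed to translate
// that into a `400 Bad Request` response.
function resolvePageSize(maxPageSize?: number): number {
  // Unspecified or zero: fall back to the documented default, never an error.
  if (maxPageSize === undefined || maxPageSize === 0) {
    return DEFAULT_PAGE_SIZE;
  }
  // Negative values are rejected.
  if (maxPageSize < 0) {
    throw new RangeError("max_page_size must not be negative");
  }
  // Values above the maximum are coerced down rather than rejected.
  return Math.min(maxPageSize, MAX_PAGE_SIZE);
}
```

Coercing oversized values instead of rejecting them lets the service lower its maximum later without turning previously valid requests into errors.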
| 91 | + |
| 92 | +[rfc-8288]: https://tools.ietf.org/html/rfc8288 |
| 93 | + |
| 94 | +### Skipping results |
| 95 | + |
| 96 | +The request definition for a paginated operation **may** define an |
| 97 | +`int32 skip` field to allow the user to skip results. |
| 98 | + |
| 99 | +The `skip` value **must** refer to the number of individual resources to skip, |
| 100 | +not the number of pages. |
| 101 | + |
| 102 | +For example: |
| 103 | + |
| 104 | +- A request with no page token and a `skip` value of `30` returns a single page |
| 105 | + of results starting with the 31st result. |
| 106 | +- A request with a page token corresponding to the 51st result (because the |
| 107 | + first 50 results were returned on the first page) and a `skip` value of `30` |
| 108 | + returns a single page of results starting with the 81st result. |
| 109 | + |
| 110 | +If a `skip` value is provided that causes the cursor to move past the end of |
| 111 | +the collection of results, the response **must** be `200 OK` with an empty |
| 112 | +result set, and not provide a `next_page_token`. |
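
The `skip` arithmetic can be illustrated with an in-memory sketch. The function `pageWithSkip` and its `cursor` parameter are hypothetical: `cursor` stands for the zero-based position that the (opaque) page token resolves to on the server, and `nextCursor` stands for the position the next token would encode.

```typescript
interface SkipPage<T> {
  items: T[];
  // Position the next page token would encode, or null at end of collection.
  nextCursor: number | null;
}

// `cursor` is the index of the first unread item (0 when no page token is sent).
// `skip` counts individual resources, not pages.
function pageWithSkip<T>(
  all: readonly T[],
  cursor: number,
  skip: number,
  pageSize: number,
): SkipPage<T> {
  const start = cursor + skip;
  // Skipping past the end is not an error: empty page, no next token.
  if (start >= all.length) {
    return { items: [], nextCursor: null };
  }
  const end = Math.min(start + pageSize, all.length);
  return {
    items: all.slice(start, end),
    nextCursor: end < all.length ? end : null,
  };
}
```

With 200 results numbered from 1, `pageWithSkip(results, 0, 30, 50)` starts at the 31st result, and `pageWithSkip(results, 50, 30, 50)` starts at the 81st, matching the two examples above.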
| 113 | + |
| 114 | +### Opacity |
| 115 | + |
| 116 | +Page tokens provided by services **must** be opaque (but URL-safe) strings, and |
| 117 | +**must not** be user-parseable. This is because if users are able to |
| 118 | +deconstruct these, _they will do so_. This effectively makes the implementation |
| 119 | +details of your API's pagination become part of the API surface, and it becomes |
| 120 | +impossible to update those details without breaking users. |
| 121 | + |
| 122 | +**Warning:** Base-64 encoding an otherwise-transparent page token is **not** a |
| 123 | +sufficient obfuscation mechanism. |
| 124 | + |
| 125 | +For page tokens which do not need to be stored in a database, and which do not |
| 126 | +contain sensitive data, an API **may** obfuscate the page token by defining an |
| 127 | +internal protocol buffer message with any data needed, and sending the serialized |
| 128 | +proto, base-64 encoded. |
| 129 | + |
| 130 | +Page tokens **must** be limited to providing an indication of where to continue |
| 131 | +the pagination process only. They **must not** provide any form of |
| 132 | +authorization to the underlying resources, and authorization **must** be |
| 133 | +performed on the request as with any other request, regardless of whether a |
| 134 | +page token is present. |
| 135 | + |
| 136 | +### Expiring page tokens |
| 137 | + |
| 138 | +Many services store page tokens in a database internally. In this situation, |
| 139 | +the service **may** expire page tokens a reasonable time after they have been |
| 140 | +sent, in order not to needlessly store large amounts of data that is unlikely |
| 141 | +to be used. It is not necessary to document this behavior. |
| 142 | + |
| 143 | +**Note:** While a reasonable time may vary between services, a good rule of |
| 144 | +thumb is three days. |
| 145 | + |
| 146 | +### Consistency |
| 147 | + |
| 148 | +When discussing pagination, consistency refers to the question of what to do if |
| 149 | +the underlying collection is modified while pagination is in progress. The most |
| 150 | +common way that this occurs is for a resource to be added or deleted in a place |
| 151 | +that the pagination cursor has already passed. |
| 152 | + |
| 153 | +Services **may** choose to be strongly consistent by approximating the |
| 154 | +"repeatable read" behavior in databases, and returning exactly the records that |
| 155 | +exist at the time that pagination begins. |
| 156 | + |
| 157 | +### Backwards compatibility |
| 158 | + |
| 159 | +Adding pagination to an existing operation is a backwards-incompatible change. |
| 160 | +This may seem strange; adding fields to interface definitions is generally |
| 161 | +backwards compatible. However, this change is _behaviorally_ incompatible. |
| 162 | + |
| 163 | +Consider a user whose collection has 75 resources, and who has already written |
| 164 | +and deployed code. If the API later adds pagination fields, and sets the |
| 165 | +default to 50, then that user's code breaks; it was getting all resources, and |
| 166 | +now is only getting the first 50 (and does not know to advance pagination). |
| 167 | +Even if the API set a higher default limit, such as 100, the user's collection |
| 168 | +could grow, and _then_ the code would break. |
| 169 | + |
| 170 | +For this reason, it is important to always add pagination to operations |
| 171 | +returning collections _up front_; pagination is consistently important, and it |
| 172 | +cannot be added later without causing problems for existing users. |
| 173 | + |
| 174 | +**Warning:** This also entails that, in addition to being present, the |
| 175 | +pagination fields **must** be _actually implemented_ with a non-infinite default |
| 176 | +value. Implementing an in-memory version (which might fetch everything then |
| 177 | +paginate) is reasonable for initially-small collections. |
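
As a rough sketch of that in-memory approach, the hypothetical `listInMemory` below fetches everything and slices one page, and `listAll` shows the client loop that follows `next_page_token` until it is empty. The token is a plain offset purely for brevity, which is an assumption of this example; a real service must issue opaque tokens.

```typescript
interface ListResponse<T> {
  items: T[];
  // Empty string means there are no subsequent pages.
  nextPageToken: string;
}

// Hypothetical in-memory handler: the whole collection is already loaded,
// and each call returns a single slice of it.
// NOTE: the offset-as-token is for illustration only and is not opaque.
function listInMemory<T>(
  all: readonly T[],
  pageToken: string,
  pageSize: number = 50,
): ListResponse<T> {
  const start = pageToken === "" ? 0 : Number(pageToken);
  const end = Math.min(start + pageSize, all.length);
  return {
    items: all.slice(start, end),
    nextPageToken: end < all.length ? String(end) : "",
  };
}

// Client side: keep requesting pages until the token comes back empty.
function listAll<T>(list: (pageToken: string) => ListResponse<T>): T[] {
  const out: T[] = [];
  let token = "";
  do {
    const page = list(token);
    out.push(...page.items);
    token = page.nextPageToken;
  } while (token !== "");
  return out;
}
```

A client written this way keeps receiving all 75 resources from the example above whether the default is 50 or 100, because it never assumes that a single response is complete.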
| 178 | + |
| 179 | +## Implementation |
| 180 | + |
| 181 | +Page tokens **should** be versioned independently of the public API, so that |
| 182 | +page tokens can be used with any version of the service. |
| 183 | + |
| 184 | +The simplest form of a page token only requires an offset. However, offsets |
| 185 | +pose challenges when a distributed database is introduced, so a more robust |
| 186 | +page token needs to store the information needed to find a "logical" position |
| 187 | +in the database. The simplest way to do this is to include relevant data from |
| 188 | +the last result returned. Primarily, this means the resource ID, but also |
| 189 | +includes any other fields from the resource used to sort the results (in case |
| 190 | +the resource is changed or deleted before the next page is requested). |
| 191 | + |
| 192 | +This information is from the resource itself, and therefore might be sensitive. |
| 193 | +Sensitive data **must** be encrypted before being used in a page token. |
| 194 | +Therefore, the token also includes the date it was created, to allow for the |
| 195 | +potential need to rotate the encryption key. |
| 196 | + |
| 197 | +This yields the following interface, which **may** be base-64 encoded and used |
| 198 | +as a page token: |
| 199 | + |
| 200 | +```typescript |
| 201 | +interface PageTokenSecrets { |
| 202 | + // The ID of the most recent resource returned. |
| 203 | + lastId: string; |
| 204 | + |
| 205 | + // Any index data needed, generally 1:1 with the fields used for ordering. |
| 206 | + indexData: Buffer[]; |
| 207 | + |
| 208 | + // When this token was minted. |
| 209 | + createTime: Date; |
| 210 | +} |
| 211 | +``` |
| 212 | + |
| 213 | +**Note:** This section does not preclude alternative page token implementations |
| 214 | +provided they conform to the guidelines discussed in this document. |
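
To illustrate why storing the last result's ID and sort fields gives a "logical" position, here is a minimal in-memory sketch. The `Book` shape, the `(title, id)` ordering, and the names `Cursor` and `nextPage` are all assumptions made for the example; serialization and encryption of the cursor are deliberately left out.

```typescript
interface Book {
  id: string;
  title: string;
}

// Hypothetical cursor contents: sort key and ID of the last result returned.
interface Cursor {
  lastTitle: string;
  lastId: string;
}

function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Orders by title, then by ID so that the ordering is total.
function byTitleThenId(a: Book, b: Book): number {
  return compareStrings(a.title, b.title) || compareStrings(a.id, b.id);
}

// Returns the page that follows `cursor` (or the first page when it is null).
// Assumes pageSize >= 1.
function nextPage(
  books: readonly Book[],
  cursor: Cursor | null,
  pageSize: number,
): { items: Book[]; cursor: Cursor | null } {
  const sorted = [...books].sort(byTitleThenId);
  // Resume strictly after the recorded position, even if that row is gone.
  const remaining =
    cursor === null
      ? sorted
      : sorted.filter(
          (b) => byTitleThenId(b, { id: cursor.lastId, title: cursor.lastTitle }) > 0,
        );
  const items = remaining.slice(0, pageSize);
  const last = items[items.length - 1];
  const hasMore = remaining.length > items.length;
  return {
    items,
    cursor: hasMore && last ? { lastTitle: last.title, lastId: last.id } : null,
  };
}
```

Because the cursor records values rather than an offset, deleting the last-returned book between calls does not shift or duplicate the next page; a database-backed version would express the same comparison in its query.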