reference.md (+6 -18: 6 additions & 18 deletions)
@@ -1615,7 +1615,7 @@ client.rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )

 ```
@@ -2492,10 +2492,7 @@ If tool_choice isn't specified, then the model is free to choose whether to use
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -2793,10 +2790,7 @@ If tool_choice isn't specified, then the model is free to choose whether to use
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -2972,10 +2966,7 @@ If `NONE` is selected, when the input exceeds the maximum input token length an
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -3038,7 +3029,7 @@ client.v2.rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )

 ```
@@ -3102,10 +3093,7 @@ For optimal performance we recommend against sending more than 1,000 documents i
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.
src/cohere/v2/client.py (+10 -18: 10 additions & 18 deletions)
@@ -160,8 +160,7 @@ def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -331,8 +330,7 @@ def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -451,8 +449,7 @@ def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -650,8 +647,7 @@ def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -679,7 +675,7 @@ def rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )
 """
 _response=self._raw_client.rerank(
@@ -825,8 +821,7 @@ async def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1005,8 +1000,7 @@ async def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1133,8 +1127,7 @@ async def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1219,8 +1212,7 @@ async def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1253,7 +1245,7 @@ async def main() -> None:
 ],
 query="What is the capital of the United States?",
src/cohere/v2/raw_client.py (+8 -16: 8 additions & 16 deletions)
@@ -169,8 +169,7 @@ def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -513,8 +512,7 @@ def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -782,8 +780,7 @@ def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1000,8 +997,7 @@ def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1297,8 +1293,7 @@ async def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1641,8 +1636,7 @@ async def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1910,8 +1904,7 @@ async def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -2128,8 +2121,7 @@ async def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.
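
Review note: the reworded `priority` docstring says that lower numbers are handled first and are the least likely to be dropped under load. The ordering it describes can be illustrated with a small local model. This is only a sketch of the documented semantics, not the service's actual scheduler: `PriorityScheduler`, `capacity`, `submit`, `shed_load`, and `drain` are hypothetical names introduced here, and the tie-break by arrival order is an assumption.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Toy model of the documented `priority` semantics (hypothetical, local only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed: max requests kept while under load
        self._heap = []           # entries are (priority, arrival_index, name)
        self._seq = count()       # assumed tie-breaker: earlier arrival wins

    def submit(self, name, priority=0):
        # 0 is the default and the highest priority, as the docstring states.
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def shed_load(self):
        """Drop requests beyond capacity, largest priority number first."""
        dropped = []
        while len(self._heap) > self.capacity:
            victim = max(self._heap)  # lowest-priority, latest-arriving entry
            self._heap.remove(victim)
            dropped.append(victim[2])
        heapq.heapify(self._heap)     # restore the heap invariant after removals
        return dropped

    def drain(self):
        """Return the remaining request names in handling order."""
        order = []
        while self._heap:
            order.append(heapq.heappop(self._heap)[2])
        return order


scheduler = PriorityScheduler(capacity=2)
scheduler.submit("batch-job", priority=5)
scheduler.submit("interactive")            # default priority 0
scheduler.submit("background", priority=9)

dropped = scheduler.shed_load()
order = scheduler.drain()
print(dropped)  # ['background']
print(order)    # ['interactive', 'batch-job']
```

In this model the request with the largest `priority` value is shed first and the default-priority request is handled first, which matches the wording of the new docstring.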