Commit 0565dc5

Merge pull request #7 from benjaminjackson/fix/webset-import-search-scope
fix(websets): validate import/scope conflicts and improve CLI help
2 parents 2e3d4e7 + 0caef96 commit 0565dc5

4 files changed

Lines changed: 150 additions & 11 deletions

exe/exa-ai-webset-create

Lines changed: 25 additions & 6 deletions
@@ -80,32 +80,51 @@ def parse_args(argv)
       Create a new webset from search criteria or an import
 
       Required (choose one):
-        --search JSON           Search configuration (supports @file.json)
-        --import ID             Import or webset ID to create webset from
-                                (accepts import_* or webset_* IDs)
+        --search JSON           Search configuration as JSON (supports @file.json)
+                                Format: {"query":"...","count":10,"scope":[...]}
+                                The 'scope' field limits search to specific sources
+        --import ID             Import/webset ID to attach data to this webset
+                                (loads data but does NOT filter searches)
+                                Format: import_abc123 or webset_xyz789
 
       Options:
         --enrichments JSON      Array of enrichment configs (supports @file.json)
-        --exclude JSON          Array of exclude configs (supports @file.json)
+                                Format: [{"description":"...","format":"text"}]
+        --exclude JSON          Sources to exclude from searches (supports @file.json)
+                                Format: [{"source":"import|webset","id":"..."}]
         --external-id ID        External identifier for the webset
         --metadata JSON         Custom metadata (supports @file.json)
+                                Format: {"key":"value"}
         --wait                  Wait for webset to reach idle status
         --api-key KEY           Exa API key (or set EXA_API_KEY env var)
         --output-format FMT     Output format: json, pretty, or text (default: json)
         --help, -h              Show this help message
 
+      JSON Format Details:
+        search.scope            Array of source references to limit search
+                                Format: [{"source":"import|webset","id":"..."}]
+                                With relationship (hop search):
+                                [{"source":"webset","id":"ws_123",
+                                  "relationship":{"definition":"investors of","limit":3}}]
+
+      IMPORTANT: Cannot use the same import ID in both --import and search.scope
+                 (this will return a 400 error from the API)
+
       Examples:
         # Create webset from search
         exa-ai webset-create --search '{"query":"AI startups","count":10}'
         exa-ai webset-create --search @search.json --enrichments @enrichments.json
         exa-ai webset-create --search @search.json --wait
 
+        # Create webset with scoped search (filter to specific import)
+        exa-ai webset-create --search '{"query":"CEOs","count":10,"scope":[{"source":"import","id":"import_abc"}]}'
+
         # Create webset from import
         exa-ai webset-create --import import_abc123
         exa-ai webset-create --import import_def456 --enrichments @enrichments.json
 
-        # Create webset from existing webset
-        exa-ai webset-create --import webset_xyz789
+        # Load import AND run search (search not scoped to import)
+        exa-ai webset-create --import import_abc123 --search '{"query":"investors","count":20}'
     HELP
     exit 0
   else
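
The `search.scope` format documented in the help text above can be assembled in Ruby instead of being hand-written as a JSON string. The following is a minimal sketch that assumes only the JSON shape shown in the help text; `scope_entry` and `build_search_json` are hypothetical helpers for illustration and are not part of this commit.

```ruby
require "json"

# Hypothetical helper (not from the commit): builds one scope entry in the
# shape shown in the help text. `relationship` is optional and is only
# included for a hop search.
def scope_entry(source:, id:, relationship: nil)
  entry = { "source" => source, "id" => id }
  entry["relationship"] = relationship if relationship
  entry
end

# Hypothetical helper: returns a JSON string suitable for passing to --search.
def build_search_json
  search = {
    "query" => "investors",
    "count" => 10,
    "scope" => [
      scope_entry(
        source: "webset",
        id: "ws_123",
        relationship: { "definition" => "investors of", "limit" => 3 }
      )
    ]
  }
  JSON.generate(search)
end

puts build_search_json
```

The resulting string could be passed inline to `--search` or written to a file and referenced as `@search.json`.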

exe/exa-ai-webset-search-create

Lines changed: 18 additions & 5 deletions
@@ -104,11 +104,20 @@ def parse_args(argv)
         --entity TYPE             Entity type: person, company, article, research_paper, custom
         --entity-description TXT  Description for custom entity type (required with --entity custom)
         --criteria JSON           Search criteria array (supports @file.json)
+                                  Format: [{"description":"criterion 1"},{"description":"criterion 2"}]
         --exclude JSON            Items to exclude from results (supports @file.json)
+                                  Format: [{"source":"import|webset","id":"..."}]
         --scope JSON              Limit search to specific sources (supports @file.json)
+                                  Format: [{"source":"import|webset","id":"..."}]
+                                  Filters this search to only items from these sources
+                                  With relationship (hop search):
+                                  [{"source":"webset","id":"ws_123",
+                                    "relationship":{"definition":"investors of","limit":3}}]
         --recall                  Estimate total available results
-        --behavior TYPE           "override" or "append" (default: override)
+        --behavior TYPE           "override" (replace items) or "append" (add items)
+                                  Default: override when scope is present, append otherwise
         --metadata JSON           Custom metadata (supports @file.json)
+                                  Format: {"key":"value"}
         --api-key KEY             Exa API key (or set EXA_API_KEY env var)
         --output-format FMT       Output format: json, pretty, or text (default: json)
         --help, -h                Show this help message
@@ -121,11 +130,15 @@ def parse_args(argv)
         exa-ai webset-search-create ws_123 --query "tech CEOs" --entity person
         exa-ai webset-search-create ws_123 --query "Silicon Valley firms" --entity company
 
-        # Search with custom entity type
-        exa-ai webset-search-create ws_123 --query "Ford Mustang" \\
-          --entity custom --entity-description "vintage cars"
+        # Scoped search (filter to specific import)
+        exa-ai webset-search-create ws_123 --query "CTOs" \\
+          --scope '[{"source":"import","id":"import_abc"}]'
 
-        # Other options
+        # Hop search (find investors of companies in webset)
+        exa-ai webset-search-create ws_123 --query "investors" \\
+          --scope '[{"source":"webset","id":"ws_companies","relationship":{"definition":"investors of","limit":5}}]'
+
+        # Search with criteria and behavior
         exa-ai webset-search-create ws_123 --query "machine learning" --count 50
         exa-ai webset-search-create ws_123 --query "research" --behavior append --recall
     HELP
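
The revised `--behavior` help text states a conditional default: override when scope is present, append otherwise. The rule can be sketched as below. This is an illustration only: the diff does not show where the default is applied (it may be resolved by the API rather than the CLI), `effective_behavior` is a hypothetical name, and treating an empty scope array as "not present" is an assumption.

```ruby
# Hypothetical helper (not from the commit): resolves the behavior value
# according to the default described in the help text.
def effective_behavior(behavior: nil, scope: nil)
  # An explicitly supplied --behavior always wins.
  return behavior if behavior

  # Assumption: an empty or missing scope counts as "no scope".
  scope_present = scope.is_a?(Array) && !scope.empty?
  scope_present ? "override" : "append"
end

puts effective_behavior(scope: [{ source: "import", id: "import_abc" }])
puts effective_behavior
puts effective_behavior(behavior: "append", scope: [{ source: "import", id: "import_abc" }])
```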

lib/exa/services/websets/create_validator.rb

Lines changed: 15 additions & 0 deletions
@@ -20,6 +20,7 @@ def validate!(params)
             validate_exclude!(params[:exclude]) if params[:exclude]
             validate_external_id!(params[:externalId]) if params[:externalId]
             validate_metadata!(params[:metadata]) if params[:metadata]
+            validate_no_duplicate_ids_in_import_and_scope!(params)
           end
 
           private
@@ -184,6 +185,20 @@ def validate_string_length!(value, name, min: nil, max: nil)
             raise ArgumentError, "#{name} must be at least #{min} characters" if min && value.length < min
             raise ArgumentError, "#{name} cannot exceed #{max} characters" if max && value.length > max
           end
+
+          def validate_no_duplicate_ids_in_import_and_scope!(params)
+            return unless params[:import] && params[:search] && params[:search][:scope]
+
+            import_ids = params[:import].map { |item| item[:id] }
+            scope_ids = params[:search][:scope].map { |item| item[:id] }
+
+            duplicates = import_ids & scope_ids
+
+            return if duplicates.empty?
+
+            raise ArgumentError,
+                  "Cannot use the same import/webset ID in both :import and search[:scope]: #{duplicates.join(', ')}"
+          end
         end
       end
     end
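
The new validation reduces to an array intersection over the `:id` values of the two lists. The standalone sketch below mirrors that logic outside the `CreateValidator` class so it can be run in isolation; `duplicate_source_ids` is a hypothetical name, and the parameter shape (arrays of hashes with symbol keys) is taken from the tests in this commit.

```ruby
# Hypothetical standalone version of the check (not from the commit).
# Returns the IDs that appear in both params[:import] and
# params[:search][:scope]; returns [] when either list is absent.
def duplicate_source_ids(params)
  import = params[:import]
  scope = params.dig(:search, :scope)
  return [] unless import && scope

  # Array#& keeps only the elements common to both arrays, without repeats.
  import.map { |item| item[:id] } & scope.map { |item| item[:id] }
end

conflicting = {
  import: [{ source: "import", id: "import_abc123" }],
  search: { query: "test", count: 10, scope: [{ source: "import", id: "import_abc123" }] }
}
distinct = {
  import: [{ source: "import", id: "import_abc123" }],
  search: { query: "test", count: 10, scope: [{ source: "import", id: "import_xyz789" }] }
}

p duplicate_source_ids(conflicting)
p duplicate_source_ids(distinct)
```

Because only the `:id` values are compared, an ID is flagged even when the two entries declare different `source` values, which matches the validator in the diff.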

test/services/websets/create_validator_test.rb

Lines changed: 92 additions & 0 deletions
@@ -465,4 +465,96 @@ def test_raises_when_metadata_value_exceeds_max_length
 
     assert_match(/1000 characters/i, error.message)
   end
+
+  def test_raises_when_same_import_id_in_both_import_and_search_scope
+    params = {
+      import: [
+        { source: "import", id: "import_abc123" }
+      ],
+      search: {
+        query: "test",
+        count: 10,
+        scope: [
+          { source: "import", id: "import_abc123" }
+        ]
+      }
+    }
+
+    error = assert_raises(ArgumentError) do
+      Exa::Services::Websets::CreateValidator.validate!(params)
+    end
+
+    assert_match(/cannot use the same.*import.*in both/i, error.message)
+    assert_match(/import_abc123/i, error.message)
+  end
+
+  def test_raises_when_same_webset_id_in_both_import_and_search_scope
+    params = {
+      import: [
+        { source: "webset", id: "ws_xyz789" }
+      ],
+      search: {
+        query: "test",
+        count: 10,
+        scope: [
+          { source: "webset", id: "ws_xyz789" }
+        ]
+      }
+    }
+
+    error = assert_raises(ArgumentError) do
+      Exa::Services::Websets::CreateValidator.validate!(params)
+    end
+
+    assert_match(/cannot use the same.*in both/i, error.message)
+    assert_match(/ws_xyz789/i, error.message)
+  end
+
+  def test_allows_different_ids_in_import_and_search_scope
+    params = {
+      import: [
+        { source: "import", id: "import_abc123" }
+      ],
+      search: {
+        query: "test",
+        count: 10,
+        scope: [
+          { source: "import", id: "import_xyz789" }
+        ]
+      }
+    }
+
+    # Should not raise
+    Exa::Services::Websets::CreateValidator.validate!(params)
+  end
+
+  def test_allows_import_without_search_scope
+    params = {
+      import: [
+        { source: "import", id: "import_abc123" }
+      ],
+      search: {
+        query: "test",
+        count: 10
+      }
+    }
+
+    # Should not raise
+    Exa::Services::Websets::CreateValidator.validate!(params)
+  end
+
+  def test_allows_search_scope_without_import
+    params = {
+      search: {
+        query: "test",
+        count: 10,
+        scope: [
+          { source: "import", id: "import_abc123" }
+        ]
+      }
+    }
+
+    # Should not raise
+    Exa::Services::Websets::CreateValidator.validate!(params)
+  end
 end
