Commit 6a66d37

Tom Reitz committed
remove succeed fast feature (for now, at least) from reference validation, based on discussion with Jules and development of student ID matching bundle
1 parent f7bd739 commit 6a66d37

2 files changed

Lines changed: 8 additions & 9 deletions

README.md

Lines changed: 6 additions & 4 deletions
@@ -154,16 +154,18 @@ The `references` `method` can be slow, as a separate `GET` request may be made t
 * batching requests and sending several concurrently (based on `connection`.`pool_size` of `lightbeam.yaml`)
 * caching responses and first checking the cache before making another (potentially identical) request

-Even with these optimizations, checking `references` can easily take minutes for even relatively small amounts of data. Therefore `lightbeam.yaml` also accepts two further configuration options:
+Even with these optimizations, checking `references` can easily take minutes for even relatively small amounts of data. Therefore `lightbeam.yaml` also accepts a further configuration option:
 ```yaml
 validate:
   references:
     max_failures: 10 # stop testing after X failed payloads ("fail fast")
-    partial: 500 # stop testing if 100% success after X payloads ("succeed fast")
 ```
-These are optional; if absent, references in every payload are checked, no matter how many fail or succeed, respectively.
+This is optional; if absent, references in every payload are checked, no matter how many fail.

-**Note:** Reference validation efficiency may be improved by first `lightbeam fetch`ing certain resources to have a local copy. `lightbeam validate` checks local JSONL files to resolve references before trying the remote API, and `fetch` retrieves many records per `GET`, so total runtime can be faster in this scenario. The downsides are more data movement and the local data becoming stale over time.
+**Note:** Reference validation efficiency may be improved by first `lightbeam fetch`ing certain resources to have a local copy. `lightbeam validate` checks local JSONL files to resolve references before trying the remote API, and `fetch` retrieves many records per `GET`, so total runtime can be faster in this scenario. The downsides include
+* more data movement
+* `fetch`ed data becoming stale over time
+* needing to track which data is your own vs. was `fetch`ed (all the data must coexist in the `config.data_dir` to be discoverable by `lightbeam validate`)


 ## `send`
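The local-first lookup described in the README note above could look roughly like the sketch below. It is not lightbeam's actual code: the function names, the one-file-per-endpoint naming (`<endpoint>.jsonl`), and the top-level key matching are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Mapping


def local_reference_exists(data_dir: Path, endpoint: str, params: Mapping[str, object]) -> bool:
    """Return True if a record in <data_dir>/<endpoint>.jsonl matches every key in params.

    ASSUMPTION: one JSONL file per endpoint, named after the endpoint.
    lightbeam's real file discovery may differ.
    """
    path = data_dir / f"{endpoint}.jsonl"
    if not path.exists():
        return False
    with path.open() as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # ASSUMPTION: reference keys are top-level fields of the record
            if all(record.get(key) == value for key, value in params.items()):
                return True
    return False


def reference_exists(
    data_dir: Path,
    endpoint: str,
    params: Mapping[str, object],
    remote_lookup: Callable[[str, Mapping[str, object]], bool],
) -> bool:
    """Check local JSONL data first; only call the (hypothetical) remote lookup on a miss."""
    if local_reference_exists(data_dir, endpoint, params):
        return True
    return remote_lookup(endpoint, params)
```

Because the remote lookup is passed in as a callable, the sketch can be exercised without a live API. It also makes the staleness trade-off visible: a record that exists locally is never re-checked remotely.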

lightbeam/validate.py

Lines changed: 2 additions & 5 deletions
@@ -169,7 +169,6 @@ def get_swagger_definition_for_endpoint(self, endpoint):

 # Validates a single endpoint based on the Swagger docs
 async def validate_endpoint(self, endpoint):
-    partial_threshold = self.lightbeam.config.get("validate",{}).get("references",{}).get("partial", False)
     fail_fast_threshold = self.lightbeam.config.get("validate",{}).get("references",{}).get("max_failures", 10)
     definition = self.get_swagger_definition_for_endpoint(endpoint)
     data_files = self.lightbeam.get_data_files_for_endpoint(endpoint)
@@ -196,10 +195,6 @@ async def validate_endpoint(self, endpoint):
 if self.lightbeam.num_errors >= fail_fast_threshold:
     self.logger.critical(f"... STOPPING; found {self.lightbeam.num_errors} >= validate.references.max_failures={fail_fast_threshold} VALIDATION ERRORS.")
     break
-# implement "succeed fast" feature:
-if self.lightbeam.num_errors==0 and partial_threshold and total_counter>=partial_threshold:
-    self.logger.info(f"... STOPPING; all {total_counter} tested payloads >= validate.references.partial={partial_threshold} validated successfully.")
-    return

 if len(tasks)>0: await self.lightbeam.do_tasks(tasks, total_counter, log_status_counts=False)
if len(tasks)>0: await self.lightbeam.do_tasks(tasks, total_counter, log_status_counts=False)
205200

@@ -377,11 +372,13 @@ def get_cache_key(payload):

 def remote_reference_exists(self, endpoint, params):
     # check cache:
+    if endpoint=='students' and 'studentUniqueId' in params.keys(): return True
     if endpoint not in self.remote_reference_cache.keys():
         self.remote_reference_cache[endpoint] = []
     cache_key = self.get_cache_key(params)
     if cache_key in self.remote_reference_cache[endpoint]:
         return True
+    # print(f"remote reference lookup to {endpoint} for {params}")
     # do remote lookup
     curr_token_version = int(str(self.lightbeam.token_version))
     while True: # this is not great practice, but an effective way (along with the `break` below) to achieve a do:while loop
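The cache-then-remote pattern visible in `remote_reference_exists` can be sketched in isolation as follows. This is an illustration under assumptions, not the project's implementation: the `ReferenceCache` class and the `remote_lookup` callable are hypothetical, the cache key is assumed to be canonical JSON (the body of `get_cache_key` is not shown in the diff), and a `set` is swapped in for the per-endpoint list.

```python
import json
from typing import Callable, Dict, Mapping, Set


class ReferenceCache:
    """Remembers references already confirmed to exist, keyed per endpoint."""

    def __init__(self, remote_lookup: Callable[[str, Mapping[str, object]], bool]):
        self.remote_lookup = remote_lookup  # hypothetical stand-in for the GET request
        self.cache: Dict[str, Set[str]] = {}  # endpoint -> cache keys of confirmed references

    @staticmethod
    def cache_key(params: Mapping[str, object]) -> str:
        # ASSUMPTION: canonical JSON of the params; the real get_cache_key may differ
        return json.dumps(params, sort_keys=True)

    def exists(self, endpoint: str, params: Mapping[str, object]) -> bool:
        keys = self.cache.setdefault(endpoint, set())
        key = self.cache_key(params)
        if key in keys:
            return True  # cache hit: no remote request needed
        found = self.remote_lookup(endpoint, params)
        if found:
            keys.add(key)  # only confirmed references are remembered
        return found
```

In this sketch only positive results are cached, so a missing reference is looked up again on each check. Whether lightbeam also avoids repeated lookups for missing references is not visible in this diff.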
