Skip to content

Adjusting data downloader and subscriber to use earthaccess login methods#191

Open
jpl-btlunsfo wants to merge 2 commits into
developfrom
issue/175
Open

Adjusting data downloader and subscriber to use earthaccess login methods#191
jpl-btlunsfo wants to merge 2 commits into
developfrom
issue/175

Conversation

@jpl-btlunsfo

Copy link
Copy Markdown
  • Updated pyproject.toml to include new earthaccess package, and adjusted python minimum requirements
  • Modified podaac_data_{subscriber,downloader}.py to use the earthaccess module's login methods and token fetching.
  • Removed podaac_access module's old login and token methods, and the relevant pytest scripts.

Should close #175, unless we want to make use of more of earthaccess's search/download methods.

@jpl-btlunsfo jpl-btlunsfo requested a review from viviant100 May 14, 2026 17:19
@jpl-btlunsfo jpl-btlunsfo self-assigned this May 14, 2026
@jasonmlkang jasonmlkang self-requested a review May 14, 2026 17:58
jasonmlkang
jasonmlkang previously approved these changes May 14, 2026
@jasonmlkang jasonmlkang dismissed their stale review May 14, 2026 18:41

Will do more local testing

@jasonmlkang jasonmlkang self-requested a review May 14, 2026 18:41
# params[i] = ("token", token)
# #params['token'] = token
token = earthaccess.get_edl_token()["access_token"]
results = pa.get_search_results(params, args.verbose)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 401 retry no longer refreshes anything — params still carries the stale token (Line 193 -237). Details below.

The commented-out loop was the key piece here. Since params is a list[tuple], the value CMR ultimately sees is the ("token", <token>) entry inside that list. Rebinding the local token variable alone does not update params, so the retry path still sends the same expired token and will likely hit another 401. The same pattern exists in podaac_data_downloader.py around lines 261–274 as well.

In addition, earthaccess.get_edl_token() appears to return an existing cached token in many cases, rather than guaranteeing a freshly minted one. The previous refresh_token() logic (delete_token + create_token) forced a refresh, and we'll still want that behavior to support automatic 401 retries.

To keep the retry flow working reliably, we could:

  • re-authenticate (earthaccess.login(...))
  • retrieve the refreshed token
  • explicitly update the params list before retrying

for example:

except HTTPError as e: 

    if e.code == 401: 
        earthaccess.login(strategy="netrc") 
        token = earthaccess.get_edl_token()["access_token"] 
        params = [("token", token) if k == "token" else (k, v) for k, v in params] 
        results = pa.get_search_results(params, args.verbose) 
   else: 
        raise

Also worth noting: downstream calls such as get_cmr_collection_id() and create_citation_file() would still see the stale token after a downloader retry, since cmr_downloader currently only rebinds the local token variable. If we keep the refresh flow, it may help to propagate the updated token more consistently through the call path.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the pretty common flow for login->get_edl_token (and ->update token in params), I moved this all into a different method in podaac_access rather than duplicate the code everywhere.

downstream calls such as get_cmr_collection_id() and create_citation_file()

Both of these call get_cmr_collections, so I worked the refresh_token flow into that method instead. I'm wondering if it would be optimal to move that same 401-check wrapping pa.get_search_results (when called in both _downloader and _subscriber) down into the search method itself?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viviant100 poking for another pass over the changes, re-requested review.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpl-btlunsfo Yes, push it down. You've already done it for get_cmr_collections, so doing the same for get_search_results keeps the two CMR-query functions consistent and means callers never have to know about token-refresh semantics.

Comment thread pyproject.toml
Comment thread tests/test_token_regression.py Outdated
- moved token fetch back into podaac_access (still calling earthaccess), with a wrapper for updating param iterables
- modified get_cmr_collections to allow refreshing token on 401
- added earthaccess login tests
@jpl-btlunsfo jpl-btlunsfo requested a review from viviant100 May 19, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants