[refact](udf) remove the udf cache expiration_time property#63897
Open
zhangstar333 wants to merge 4 commits into
Open
[refact](udf) remove the udf cache expiration_time property#63897zhangstar333 wants to merge 4 commits into
zhangstar333 wants to merge 4 commits into
Conversation
Issue Number: close #xxx <!--Describe your changes.-->
## Proposed changes Issue Number: close #xxx <!--Describe your changes.-->
Contributor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Contributor
Author
|
/review |
2250be3 to
a1f5113
Compare
Contributor
There was a problem hiding this comment.
Summary: I found two correctness issues in the new static UDF classloader cache behavior. The main blocker is a first-use race where one executor can close another executor's live URLClassLoader, preserving the NoClassDefFoundError class of failure this PR is trying to eliminate. There is also a regression for static-load UDFs loaded through the system classloader, where a null classLoader is valid but now treated as a cache miss.
Critical checkpoints:
- Goal/test: The PR aims to stop time-based UDF classloader eviction from breaking static-load Java UDFs. The goal is only partially met; concurrent first use can still close a live loader, and system-classloader static UDFs are not cached effectively. I did not find tests covering these concurrency/system-loader paths.
- Scope: The change is focused, but the cache lifecycle semantics changed from synchronized ExpiringMap operations to ConcurrentHashMap replacement without atomic construction.
- Concurrency: The modified static cache is shared by concurrent Java UDF executors and BE clean-cache tasks. The cache miss/build/put path is not atomic, which creates the live-loader close race noted inline.
- Lifecycle: UdfClassCache.classLoader may intentionally be null for system-classloader UDFs; the new validity check does not preserve that lifecycle invariant.
- Configuration/compatibility: expiration_time remains accepted and serialized but is now ignored; this is a user-visible semantic change and should be documented or removed in a coordinated way.
- Parallel paths: DROP FUNCTION cleanup still exists through FE clean-cache tasks and BE JNI cleanup; static-load lookup is the affected path.
- Testing: No new tests were included for concurrent static-load first use, DROP/reload lifecycle, or empty jarPath/system-classloader static UDFs.
- Observability/performance: No additional observability is required for the core issue, but repeated rebuilding for system-classloader static UDFs is avoidable overhead.
- Data/transaction/persistence: Not applicable to data visibility or transaction persistence.
User focus: No additional user-provided review focus was specified.
Contributor
Author
|
run buildall |
Contributor
TPC-H: Total hot run time: 31603 ms |
Contributor
TPC-DS: Total hot run time: 172206 ms |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem does this PR solve?
Problem Summary:
doc apache/doris-website#3845
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)