fix(executor): switch process working dir via native chdir #475
Open
my-vegetable-has-exploded wants to merge 2 commits into
added 2 commits
May 17, 2026 03:27
Add JNA dependency to invoke native chdir syscall for switching the executor process working directory.
Align SparkEnv.driverTmpDir with workingDir to ensure distributed files and archives are extracted to the correct root directory.

Signed-off-by: wangyi <epsilonwang@didiglobal.com>
Pull request overview
This PR updates RayDP executor startup so the executor process attempts to switch its native working directory to the executor workingDir, aligning Spark distributed file/archive placement with executor-local expectations.
Changes:
- Adds JNA and a native libc chdir call during executor startup.
- Adds diagnostic logging around process CWD switching.
- Changes SparkEnv.driverTmpDir to point at workingDir instead of workingDir/_tmp.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| core/raydp-main/src/main/scala/org/apache/spark/executor/RayDPExecutor.scala | Adds best-effort native CWD switching and aligns Spark distributed file root with executor working directory. |
| core/raydp-main/pom.xml | Adds the JNA dependency required for native libc access. |
Comment on lines +201 to +202
logWarning(s"Failed to switch executor process cwd from ${beforeCwd} to ${targetDir}, " +
  s"chdir returned rc=${rc}, errno=${Native.getLastError}")
pang-wu reviewed May 23, 2026
assert(workerTmpDir.exists() && workerTmpDir.isDirectory)
SparkEnv.get.driverTmpDir = Some(workerTmpDir.getAbsolutePath)
// Keep Spark's distributed file/archive root aligned with executor workingDir.
SparkEnv.get.driverTmpDir = Some(workingDir.getAbsolutePath)
Collaborator
Would this change alone solve the problem?
pang-wu reviewed May 23, 2026
private def getProcessWorkingDir: String = {
  try {
    val procCwd = Paths.get("/proc/self/cwd")
Collaborator
Can we add some tests to verify the fix works?
Motivation
The RayDP executor process CWD remains the container default directory (e.g., / or /opt) rather than the executor's workingDir. This causes Spark distributed files (--files) and archives (--archives) to be extracted to the wrong location, making it impossible for executor code to find them at the expected paths.
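The gap between the JVM's user.dir property and the OS-level CWD can be reproduced in isolation. The following is a minimal sketch, not code from this PR: the object name CwdVsUserDir and the temporary directory are hypothetical stand-ins for the executor workingDir, and the /proc/self/cwd lookup assumes Linux.

```scala
import java.nio.file.{Files, Paths}

object CwdVsUserDir {
  // Resolve the OS-level working directory through procfs.
  // Returns None on platforms without /proc (assumption: Linux is the target).
  def processCwd(): Option[String] = {
    val link = Paths.get("/proc/self/cwd")
    if (Files.exists(link)) Some(link.toRealPath().toString) else None
  }

  def main(args: Array[String]): Unit = {
    val before = processCwd()

    // Hypothetical target directory standing in for the executor workingDir.
    val target = Files.createTempDirectory("executor-working-dir").toRealPath().toString

    // Changing the system property does not issue a chdir at the OS level.
    System.setProperty("user.dir", target)
    val after = processCwd()

    println(s"user.dir property:  ${System.getProperty("user.dir")}")
    println(s"process cwd before: $before")
    println(s"process cwd after:  $after")

    // The property changed; the process-level CWD did not.
    assert(before == after)
  }
}
```

Anything that resolves relative paths against the real process CWD, such as child processes or native code, therefore still sees the container default directory.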
Approach
- chdir call: Add JNA dependency to pom.xml, define a LibC interface to load the chdir() syscall from libc. After setUserDir() (which only changes the user.dir system property), add switchProcessWorkingDirBestEffort() to actually switch the process-level CWD.
- SparkEnv.driverTmpDir: Change driverTmpDir from workingDir/_tmp to workingDir itself, so the root directory for Spark distributed files matches the executor's workingDir, and files/archives are extracted to the correct path.
- chdir failure only logs a warning and does not interrupt executor startup; reads cwd before and after the switch for diagnostic logging.
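For readers unfamiliar with JNA, the best-effort switch described above might look roughly like the sketch below. It is an illustration under assumptions rather than the code in this PR: the names ChdirSketch, switchBestEffort and processCwd are hypothetical, logging goes to stderr instead of Spark's logWarning, and it requires the net.java.dev.jna:jna artifact (5.x, for Native.load) on the classpath.

```scala
import com.sun.jna.{Library, Native}
import java.nio.file.Paths
import scala.util.control.NonFatal

// Minimal libc binding; only chdir(2) is mapped.
trait LibC extends Library {
  def chdir(path: String): Int
}

object ChdirSketch {
  // Loaded lazily so a missing libc or JNA native stub surfaces inside the try below.
  private lazy val libc: LibC = Native.load("c", classOf[LibC])

  // Linux-only view of the real process CWD; falls back to user.dir elsewhere.
  def processCwd(): String =
    try Paths.get("/proc/self/cwd").toRealPath().toString
    catch { case NonFatal(_) => System.getProperty("user.dir") }

  // Returns true when the process CWD was switched; never throws.
  def switchBestEffort(targetDir: String): Boolean = {
    val beforeCwd = processCwd()
    try {
      val rc = libc.chdir(targetDir)
      if (rc != 0) {
        System.err.println(
          s"chdir from $beforeCwd to $targetDir failed: rc=$rc, errno=${Native.getLastError}")
        false
      } else {
        System.err.println(s"process cwd switched from $beforeCwd to ${processCwd()}")
        true
      }
    } catch {
      // UnsatisfiedLinkError is a LinkageError, which NonFatal does not match.
      case e: UnsatisfiedLinkError =>
        System.err.println(s"native chdir unavailable: ${e.getMessage}")
        false
      case NonFatal(e) =>
        System.err.println(s"chdir to $targetDir failed: ${e.getMessage}")
        false
    }
  }
}
```

Returning a Boolean instead of throwing keeps executor startup unaffected when the native call is unavailable, which matches the best-effort behavior described in the list. Note that File.getAbsolutePath still resolves against user.dir, so the property and the native CWD likely need to point at the same directory for consistent results.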