Add function to checkout the jobs repository#75
Conversation
587d8dd to
ff98e93
Compare
082f212 to
7428588
Compare
dbcc18f to
c0f9c71
Compare
| } | ||
|
|
||
| // TODO: Should clone_git_repository be async | ||
| // gitoxide doesn't currently have async implemented for http backend |
There was a problem hiding this comment.
I don't think this is viable without either async or cloning in a separate thread (which would make cancellation really tricky, though you seem to have some interrupt-related stuff wired up already); blocking an entire I/O thread for what might be a very long clone process is not workable. Should this maybe just be using libgit2 or something else instead?
There was a problem hiding this comment.
fwiw i'd prefer to stay with gitoxide to avoid more C dependences in the core of gitlab-runner. But yes it should be done in a dedicated thread if there is no async api
There was a problem hiding this comment.
I've updated it to use tokio::task::spawn_blocking which runs it on a different thread.
I used a CancellationToken and tokio::select to trigger gitoxide to interrupt, CancellationToken is already used in the gitlab-runner-rs API so hopefully is okay.
gitoxide re-exports Progress from the prodash crate to allow the caller to monitor progress. Is that something we should do?
| * * Clones to BuildDir | ||
| * * Checks out git_info.sha and fetches all in git_info.refspecs | ||
| * * Hardcoded username "gitlab-ci-token" | ||
| * * credHelperCommand? |
There was a problem hiding this comment.
this is a bit tricky to understand out-of-context, might want to add a few more details for some of these points
| } | ||
|
|
||
| let checkout_progress = progress.add_child("checkout".to_string()); | ||
| let (checkout, _outcome) = fetch.fetch_then_checkout(checkout_progress, &should_interrupt)?; |
There was a problem hiding this comment.
thought: if the goal is just to be able to read files from the repo, do we need to do an entire checkout to begin with? couldn't it clone a bare repo and then load files as-needed from there? though the apis might be harder to make for that...but it would likely also be a lot faster.
There was a problem hiding this comment.
Fwiw i wouldn't mind this being a potential future update ; I'm not sure how much faster it would be ("likely be a lot of faster" is always a dangerous statement in the land of optimisation). I guess it primarily depend on what work would be saved, which i guess is primarily unpacking the pack file.
Regardless; the time would be comparable to a normal checking in a normal gitlab runner (or faster as iirc gix is faster at this point then git).
6b04326 to
6b020a2
Compare
There was a problem hiding this comment.
Pull Request Overview
This PR implements Git repository cloning functionality for the GitLab runner by adding two main functions: clone_git_repository for general Git repository cloning and Job::clone_git_repository for job-specific cloning. The implementation leverages the gitoxide library and includes comprehensive error handling through a new GitCheckoutError enum.
Key changes:
- Adds comprehensive Git cloning functionality with support for refspecs, depth control, and SHA-specific checkout
- Implements proper error handling for Git operations with detailed error categorization
- Provides both async and sync interfaces for Git operations with cancellation support
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| gitlab-runner/src/run.rs | Fixes error message for build directory creation |
| gitlab-runner/src/lib.rs | Adds core Git cloning functions and comprehensive error handling |
| gitlab-runner/src/job.rs | Implements job-specific Git repository cloning method |
| gitlab-runner/src/client.rs | Defines Git information structures and error types |
| gitlab-runner/examples/demo-runner.rs | Adds example commands for testing Git functionality |
| gitlab-runner/Cargo.toml | Updates dependencies and adds gitoxide library |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| if let Some(depth) = depth.and_then(NonZeroU32::new) { | ||
| fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(depth)); |
There was a problem hiding this comment.
Using NonZeroU32::new will return None for depth value 0, but depth 0 should be valid for unlimited depth. Consider using a separate check or handling depth 0 explicitly.
| if let Some(depth) = depth.and_then(NonZeroU32::new) { | |
| fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(depth)); | |
| match depth { | |
| Some(0) => { | |
| // Depth 0 means unlimited depth; do not set shallow clone. | |
| } | |
| Some(d) => { | |
| if let Some(nonzero_depth) = NonZeroU32::new(d) { | |
| fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(nonzero_depth)); | |
| } | |
| } | |
| None => { | |
| // No depth specified; do not set shallow clone. | |
| } |
|
|
||
| index.write(Default::default())?; | ||
|
|
||
| Ok(repo_dir.keep()) |
There was a problem hiding this comment.
[nitpick] The function returns the temporary directory path, but the calling code is expected to move its contents manually. Consider returning the final destination path or handling the move operation within this function to improve the API design.
| clone_git_repository( | ||
| self.build_dir(), | ||
| &self.response.git_info.repo_url, | ||
| Some(&self.response.git_info.sha), |
There was a problem hiding this comment.
[nitpick] The SHA is always passed as Some(), but the generic clone_git_repository function expects Option<&str>. Consider whether this should handle cases where SHA might be empty or invalid.
| Some(&self.response.git_info.sha), | |
| if self.response.git_info.sha.is_empty() { | |
| None | |
| } else { | |
| Some(&self.response.git_info.sha) | |
| }, |
| std::fs::rename(repo_path, path) | ||
| .map_err(|e| outputln!("Failed to move repo path: {}", e.to_string()))?; |
There was a problem hiding this comment.
Using std::fs::rename to move a directory may fail across filesystem boundaries. Consider using a recursive copy operation or std::fs::create_dir_all followed by moving individual files.
* `clone_git_repository` to clone arbitrary repo * `Job::clone_git_repository` to clone the jobs repo
Also throw an error if the provided HEAD ref cannot be parsed.
Otherwise, clippy gets quite upset because the error type is so large (>200 bytes).
This is set if the runner isn't allowed to reuse an existing repository by re-running a fetch there, which is unrelated to the ability to clone itself.
It's a bit confusing that this automatically considers the corresponding CI variable, but some other settings (such as the git strategy allowing repo reuse) do not.
No real reason for them to be public, and it frees us from handling a lot of edge cases and allows for some significant (future) code cleanup.
If this is private, and only used by Job, it makes sense for it to be in the same place.
Given that the example didn't even use the cancellation token properly, it's better for cancellation to be handled internally.
The `sha` field is never going to contain a non-SHA, among other things, so this code can be wittled down quite a bit now.
Much of this can be replaced with helpers from the crate itself.
In these cases, we just return None to signal back to the caller that the clone couldn't be performed for a non-error reason.
|
Rebased this and added a bunch of extra fix-up commits on top. The new changes are split up the changes quite aggressively to try to make them easier to review (we can squash as needed in the end anyway). minimal-direct-dependencies is passing locally, I'll figure out why it's failing later on... |
This branch adds
crate::clone_git_repositoryto clone git repos generally andJob::clone_git_repositoryto clone specifically defined by the gitlab job.clone_git_repositoryis heavily based onPrepareFetch::fetch_onlySince the gitoxide
PrepareFetchandPrepareCheckoutstructs delete the target path when they are dropped I opted to checkout the repository to a temporary directory, then move it's contents to the path provided to the target directory.This adds a rather unwieldy
GitCheckoutErrorenum. gitoxidesErrortypes triggers theresult_large_errclippy warning.This requires rust version
1.73.0or later.Fixes #21