README.md (5 additions & 2 deletions)
@@ -2,8 +2,7 @@

 This repository holds a docker image which reproduces [Microsoft's CodeBERT Code-To-Text Experiment](https://github.com/microsoft/CodeXGLUE/tree/main/Code-Text/code-to-text).

-The subparts have been minimally changed (see [changes](./changes.md)), but mostly it is just wrapping the experiment in a cpu-based docker image.
-There is currently no GPU-Image.
+The subparts have been minimally changed (see [changes](./changes.md)), but mostly it is just wrapping the experiment in a docker-image.

 The initial readme can be [found here](./initial_readme.md).
@@ -14,6 +13,9 @@ The shell file runs the instructions from the initial readme and adds some more
 It worked flawlessly for me on a Mac, so I did not want to make an extra docker image for data-preprocessing.
 Depending on your distribution, you might need to install things like wget.

+**Note:** The step before is necessary! The `dataset.zip` only contains references to the dataset and is *unfolded* in `prepare.sh`.
+
+
 After that, change the docker-compose to point to your files (including filenames) and set environment variables as needed.

 You can build the docker file beforehand using
@@ -92,4 +94,5 @@ CodeBert_CodeToText_Experiment_0_1 | ./entrypoint.sh: line 14: $'\r': command n
 CodeBert_CodeToText_Experiment_0_1 | ./entrypoint.sh: line 200: syntax error: unexpected end of file
 ```
 This is due to Windows changing the line-breaks / file encodings. Thanks Windows.
+**Easy Solution**: run `dos2unix entrypoint.sh` and rebuild the container.
 It might be easier/faster to pull the image from this repository, or you have to [edit the entrypoint to be compatible with windows](https://askubuntu.com/questions/966488/how-do-i-fix-r-command-not-found-errors-running-bash-scripts-in-wsl).
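If `dos2unix` is not available, the same line-ending fix can be approximated with standard tools. The following is a minimal sketch, assuming a POSIX shell with `tr` and `mktemp`; the helper name `strip_cr` is hypothetical and not part of this repository. Note that it removes every carriage return in the file, not only those at line ends.

```shell
#!/bin/sh
# strip_cr is a hypothetical helper (an assumption, not part of this repository).
# It deletes all carriage returns (\r) from the given file.
# The result is written back with `cat` so the file keeps its permissions
# (for example the executable bit on entrypoint.sh).
strip_cr() {
    file="$1"
    tmp="$(mktemp)" || return 1
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (then rebuild the container):
#   strip_cr entrypoint.sh
```

As with `dos2unix`, the container has to be rebuilt afterwards so that the corrected file ends up in the image.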
entrypoint.sh (17 additions & 19 deletions)
@@ -4,10 +4,6 @@

 # This file invokes the original python code of the codebert text with the environment variables set in the docker container.
 # Additionally, it does a switch-case on which flags for training, validation and testing have been set
-# And it uses an anaconda environment to provide the dependencies.
-
-# Without Anacondas --no-capture-output flag the system prints from the run.py would be hidden until the anaconda process exits. This flag is optional but highly helpful.
-# Anacondas "-n" parameter specifies which conda-env is used to run the script. It must match the name provided in 'environment.yml'.

 # The use of exit without a number returns the exit code of the preceding statement - that is in this case the anaconda command.
 # The exit codes are necessary, as otherwise all cases are run (at least, all cases with flags set).
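The exit-code behaviour these comments describe can be illustrated in isolation. This is a minimal sketch, assuming a POSIX shell; `run_case` is a hypothetical stand-in for one branch of the entrypoint, and the command passed as its second argument stands in for the real `python ./run.py` call.

```shell
#!/bin/sh
# run_case is a hypothetical stand-in for one case of the entrypoint.
# The body is a subshell, so `exit` ends only this case and not the caller.
run_case() (
    if [ "$1" = true ]; then
        "$2"    # stand-in for the real command of this case
        exit    # no number given: returns the exit status of the preceding command
    fi
    echo "no flag set, falling through to the next case"
)

# Usage:
#   run_case true false; echo "status: $?"   # prints "status: 1"
#   run_case true true;  echo "status: $?"   # prints "status: 0"
#   run_case false true                      # prints the fall-through message
```

Without the `exit`, a script with several such `if` blocks in a row would continue into every later block whose flags are also set, which is the situation the comments warn about.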
@@ -21,12 +17,12 @@
 if [ "$load_existing_model" = true ]; then
 echo "Found flag to load a model under $load_model_path"

-if [ "$do_train" = true -a "$do_test" = true -a "$do_val" = true ]; then
+if [ "$DO_TRAIN" = true -a "$DO_TEST" = true -a "$DO_VALID" = true ]; then
 echo "performing full run with training, validation and test"
 python ./run.py \
 --do_train --do_test --do_eval \
 --model_type roberta --model_name_or_path $pretrained_model \