Commit 57c21c6

Revert "Pathwaysjob CRD migration (#1099)" (#1107)
This reverts commit 989ba24.
1 parent 867becd commit 57c21c6

25 files changed

Lines changed: 271 additions & 646 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -86,6 +86,7 @@ XPK also supports the following [Google Cloud Storage solutions](./docs/usage/st
 | [JobSet](https://github.com/kubernetes-sigs/jobset) | Workload creation |
 | [Docker](https://docs.docker.com/engine/install/) | Building workload container |
 | [CoreDNS](https://github.com/coredns/deployment/tree/master/kubernetes) | Cluster set up |
+| [PathwaysJob](https://github.com/google/pathways-job) | Running Pathways workloads |
 
 # Privacy notice
 

recipes/Basic_cluster_create.md

Lines changed: 5 additions & 2 deletions
@@ -45,11 +45,11 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
 [XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
 gcloud beta container clusters describe golden-cluster --location us-central1 --project golden-project --format="value(currentMasterVersion)"
 [XPK] Creating 1 node pool or pools of tpu7x-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
 [XPK] Creating 1 node pool or pools of tpu7x-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --location=us-central1 --format="value(locations)"
 [XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
@@ -170,6 +170,9 @@ spec:
 [XPK] Try 1: Updating jobset Controller Manager resources
 [XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
 kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster
+[XPK] Task: `Install PathwaysJob on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
 [XPK] Enabling Kueue on the cluster
 [XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
 kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'
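
The `SystemCharacteristics(...)` lines in this recipe change only because the revert drops the `pathways_tpu_version` field, so the printed representation loses its last entry. A minimal sketch of that effect, assuming the real class prints like a plain Python dataclass (the reduced field list and the `make_characteristics` helper are hypothetical and not taken from XPK):

```python
from dataclasses import make_dataclass


def make_characteristics(with_pathways_version: bool):
    """Build a hypothetical, reduced stand-in for SystemCharacteristics."""
    fields = [('device_type', str), ('parallel_containers', int)]
    if with_pathways_version:
        # Field that the reverted commit had added.
        fields.append(('pathways_tpu_version', str))
    return make_dataclass('SystemCharacteristics', fields)


before_revert = make_characteristics(True)('tpu7x-8', 2, 'tpu7x')
after_revert = make_characteristics(False)('tpu7x-8', 2)

# The generated repr lists every field, so removing one changes the log line.
print(f'We assume that the underlying system is: {before_revert}')
print(f'We assume that the underlying system is: {after_revert}')
```

Because the recipes record this output verbatim, any change to the field list shows up as a one-line diff in every recipe that prints the object.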

recipes/Cluster_create_RayCluster.md

Lines changed: 5 additions & 2 deletions
@@ -47,11 +47,11 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
 [XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
 gcloud beta container clusters describe golden-cluster --location us-central1 --project golden-project --format="value(currentMasterVersion)"
 [XPK] Creating 1 node pool or pools of tpu7x-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
 [XPK] Creating 1 node pool or pools of tpu7x-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --location=us-central1 --format="value(locations)"
 [XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
@@ -173,6 +173,9 @@ spec:
 [XPK] Try 1: Updating jobset Controller Manager resources
 [XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
 kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster
+[XPK] Task: `Install PathwaysJob on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
 [XPK] Enabling Kueue on the cluster
 [XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
 kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'

recipes/Cluster_create_for_multi-host_nodepool.md

Lines changed: 5 additions & 2 deletions
@@ -45,11 +45,11 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
 [XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
 gcloud beta container clusters describe golden-cluster --location us-central1 --project golden-project --format="value(currentMasterVersion)"
 [XPK] Creating 1 node pool or pools of tpu7x-16
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x2', vms_per_slice=2, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-16', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=True, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x2', vms_per_slice=2, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-16', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=True, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
 [XPK] Creating 1 node pool or pools of tpu7x-16
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x2', vms_per_slice=2, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-16', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=True, gpu_config=None, parallel_containers=2, pathways_tpu_version='tpu7x')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x2', vms_per_slice=2, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-16', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=True, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --location=us-central1 --format="value(locations)"
 [XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
@@ -172,6 +172,9 @@ spec:
 [XPK] Try 1: Updating jobset Controller Manager resources
 [XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
 kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster
+[XPK] Task: `Install PathwaysJob on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
 [XPK] Enabling Kueue on the cluster
 [XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
 kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'

recipes/Cluster_create_for_single-host_nodepool.md

Lines changed: 5 additions & 2 deletions
@@ -45,11 +45,11 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
 [XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
 gcloud beta container clusters describe golden-cluster --location us-central1 --project golden-project --format="value(currentMasterVersion)"
 [XPK] Creating 1 node pool or pools of v4-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v4-podslice', gce_machine_type='ct4p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v4-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1, pathways_tpu_version='tpuv4')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v4-podslice', gce_machine_type='ct4p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v4-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
 [XPK] Creating 1 node pool or pools of v4-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v4-podslice', gce_machine_type='ct4p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v4-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1, pathways_tpu_version='tpuv4')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v4-podslice', gce_machine_type='ct4p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v4-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --location=us-central1 --format="value(locations)"
 [XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
@@ -170,6 +170,9 @@ spec:
 [XPK] Try 1: Updating jobset Controller Manager resources
 [XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
 kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster
+[XPK] Task: `Install PathwaysJob on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
 [XPK] Enabling Kueue on the cluster
 [XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
 kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'

recipes/Cluster_create_private.md

Lines changed: 5 additions & 2 deletions
@@ -51,11 +51,11 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
 [XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
 gcloud beta container clusters describe golden-cluster-private --location us-central1 --project golden-project --format="value(currentMasterVersion)"
 [XPK] Creating 1 node pool or pools of v5p-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1, pathways_tpu_version='tpuv5')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster-private --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
 [XPK] Creating 1 node pool or pools of v5p-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1, pathways_tpu_version='tpuv5')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=<DockerPlatform.AMD: 'linux/amd64'>, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools describe 0 --cluster golden-cluster-private --project=golden-project --location=us-central1 --format="value(locations)"
 [XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
@@ -178,6 +178,9 @@ spec:
 [XPK] Try 1: Updating jobset Controller Manager resources
 [XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
 kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster-private
+[XPK] Task: `Install PathwaysJob on golden-cluster-private` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
 [XPK] Enabling Kueue on the cluster
 [XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
 kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'
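
Every command in these recipes is printed rather than executed, because the recipes are recorded in dry-run mode. A rough sketch of that pattern follows. `run_task` and its parameters are hypothetical names chosen for illustration, not XPK's actual API; only the message wording is taken from the recipe output.

```python
import subprocess


def run_task(task: str, command: str, dry_run: bool) -> int:
    """Run a shell command, or only print it when dry_run is True."""
    if dry_run:
        print(
            f'[XPK] Task: `{task}` is implemented by the following command'
            ' not running since it is a dry run.'
        )
        print(command)
        return 0  # Report success without touching the cluster.
    # Hypothetical real path: run through the shell and return the exit code.
    return subprocess.run(command, shell=True, check=False).returncode


exit_code = run_task(
    'Install PathwaysJob on golden-cluster',
    'kubectl apply --server-side -f '
    'https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml',
    dry_run=True,
)
```

Keeping the dry-run branch free of side effects is what lets the recorded output serve as a stable reference: restoring the PathwaysJob install step adds exactly three lines to each cluster-create recipe.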
