OSDOCS#17761: node replacement procedure updates#106129

Open
skopacz1 wants to merge 2 commits into openshift:main from skopacz1:OSDOCS-17761

skopacz1 commented Feb 6, 2026

Contributor

OSDOCS-17761

Version(s): 4.19+

This PR updates the existing procedure for replacing unhealthy control plane nodes

QE review:

  • QE has approved this change.

Preview: Replacing a failed bare-metal control plane node without BMC credentials

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 6, 2026
openshift-ci-robot commented Feb 6, 2026

@skopacz1: This pull request references OSDOCS-17761 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

skopacz1 changed the title from "OSDOCS-17761: node replacement procedure updates" to "OSDOCS#17761: node replacement procedure updates" on Feb 6, 2026
@openshift-ci-robot openshift-ci-robot removed the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 6, 2026
@openshift-ci-robot

@skopacz1: No Jira issue is referenced in the title of this pull request.
To reference a jira issue, add 'XYZ-NNN:' to the title of this pull request and request another refresh with /jira refresh.

@skopacz1 skopacz1 added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. branch/enterprise-4.19 branch/enterprise-4.20 branch/enterprise-4.21 labels Feb 6, 2026
@skopacz1 skopacz1 added this to the Continuous Release milestone Feb 6, 2026
@openshift-ci openshift-ci Bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 6, 2026
ocpdocs-previewbot commented Feb 6, 2026

🤖 Fri Apr 03 17:12:27 - Prow CI generated the docs preview:

https://106129--ocpdocs-pr.netlify.app/openshift-enterprise/latest/nodes/nodes/nodes-nodes-replace-control-plane.html

skopacz1 commented Feb 9, 2026

Contributor Author

/retest

+
[NOTE]
====
The name of the new node might be different than the name of the node you are replacing.

Collaborator

🤖 [error] RedHat.TermsErrors: Use 'different from' rather than 'different than'. For more information, see RedHat.TermsErrors.

+
[source,terminal]
----
$ coreos-installer iso customize rhcos-live.x86_64.iso \

We need to tell the user where to download the rhcos-live-iso.x86_64.iso file: https://console.redhat.com/openshift/install/metal/user-provisioned
The file name should be rhcos-live-iso.x86_64.iso.
The ARM ISO is downloaded from a different page: https://console.redhat.com/openshift/install/arm/user-provisioned
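
The two download pages named in this comment could be selected by architecture. The following is only a sketch: `rhcos_download_page` is a hypothetical helper, not part of the procedure, and it assumes the architecture strings `x86_64` and `aarch64`/`arm64`.

```shell
# Hypothetical helper: print the download page mentioned in the review
# comment for a given node architecture. The architecture names accepted
# here are an assumption for illustration.
rhcos_download_page() {
  case "$1" in
    x86_64)
      echo "https://console.redhat.com/openshift/install/metal/user-provisioned" ;;
    aarch64|arm64)
      echo "https://console.redhat.com/openshift/install/arm/user-provisioned" ;;
    *)
      echo "unsupported architecture: $1" >&2
      return 1 ;;
  esac
}
```

Pass the architecture of the node being replaced, which is not necessarily the architecture of the workstation running the command.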

. Create a `Machine` object for the new control plane node by creating a yaml file similar to the following:
. Create a `Machine` object for the new control plane node:

.. Create a YAML file similar to the following:

How about using a specific name, such as new-machine.yaml?

Suggested change
.. Create a YAML file similar to the following:
.. Create a YAML file named new-machine.yaml similar to the following:

Contributor Author

Sure!

+
[source,terminal]
----
$ oc apply -f <machine_object_yaml_file>

Suggested change
$ oc apply -f <machine_object_yaml_file>
$ oc apply -f new-machine.yaml

Comment on lines 102 to +120
.. Define the `NEW_NODE_NAME` variable by running the following command:
+
[source,terminal]
----
$ NEW_NODE_NAME=<new_node_name>
----
+
Replace `<new_node_name>` with the name of the new control plane node.
+
[NOTE]
====
The name of the new node might be different from the name of the node you are replacing.
You can check the name of the new node by running the following command:

[source,terminal]
----
$ oc get nodes
----
====

These two names are not the same. How about creating two variables to handle this?

Suggested change
====
Define the `NEW_NODE_NAME` variable by using the following steps:
Get the new node name by running:
$ oc get nodes
$ NEW_NODE_NAME=<new_node_name>
----
.. Define the `NEW_BAREMETALHOST_NAME` variable by using the following steps:
Get the new BareMetalHost name by running:
$ oc get -n openshift-machine-api bmh
$ NEW_BAREMETALHOST_NAME=<new_baremetalhost_name>
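
Because the later `oc patch` commands depend on both variables, a guard like the following could catch an unset variable before any command runs. This is a minimal sketch: `check_replacement_vars` is a hypothetical helper, and the variable names follow the spelling `NEW_BAREMETALHOST_NAME` used later in this review.

```shell
# Hypothetical guard: report every required variable that is unset or empty.
# Returns 0 only when both the node name and the BareMetalHost name are set.
check_replacement_vars() {
  missing=0
  for name in NEW_NODE_NAME NEW_BAREMETALHOST_NAME; do
    # Indirect lookup in POSIX sh; ":-" keeps this safe under "set -u".
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_replacement_vars` before the patch commands returns a non-zero status and names each missing variable.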

+
[source,terminal]
----
$ BMH_UID=$(oc get -n openshift-machine-api bmh $NEW_NODE_NAME -ojson | jq -r .metadata.uid)

Suggested change
$ BMH_UID=$(oc get -n openshift-machine-api bmh $NEW_BAREMETALHOST_NAME -ojson | jq -r .metadata.uid)
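
`jq -r` prints the literal string `null` when a field is missing, and prints nothing if `oc get` fails, so `BMH_UID` can end up unusable without any visible error. A check like the following could be run before the value is used. It is a sketch that assumes the UID has the UUID form shown in the sample output in this PR; `is_uuid` is a hypothetical helper.

```shell
# Hypothetical check: succeed only if the argument looks like a UUID
# (8-4-4-4-12 hexadecimal digits), as in the sample providerID values.
is_uuid() {
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}
```

For example, `is_uuid "$BMH_UID" || echo "BMH_UID looks wrong: $BMH_UID" >&2` flags both the empty and the `null` case.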

+
[source,terminal]
----
$ oc patch -n openshift-machine-api bmh $NEW_NODE_NAME --type merge --patch '{"spec":{"consumerRef":{"apiVersion":"machine.openshift.io/v1beta1","kind":"Machine","name":"'$NEW_MACHINE_NAME'","namespace":"openshift-machine-api"}}}'

Suggested change
$ oc patch -n openshift-machine-api bmh $NEW_BAREMETALHOST_NAME --type merge --patch '{"spec":{"consumerRef":{"apiVersion":"machine.openshift.io/v1beta1","kind":"Machine","name":"'$NEW_MACHINE_NAME'","namespace":"openshift-machine-api"}}}'
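
The patch above splices `$NEW_MACHINE_NAME` into the JSON by closing and reopening the single quotes, which is easy to mistype. One alternative is to build the same JSON with `printf`. This is a sketch: `consumer_ref_patch` is a hypothetical helper, and it assumes the machine name contains no characters that need JSON escaping.

```shell
# Hypothetical helper: print the consumerRef merge patch for a machine name.
# The JSON layout is copied from the oc patch command in this PR; the name is
# inserted as-is, so it must not contain quotes or backslashes.
consumer_ref_patch() {
  printf '{"spec":{"consumerRef":{"apiVersion":"machine.openshift.io/v1beta1","kind":"Machine","name":"%s","namespace":"openshift-machine-api"}}}' "$1"
}
```

The result could then be passed as `--patch "$(consumer_ref_patch "$NEW_MACHINE_NAME")"` to the same `oc patch` command.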

+
[source,terminal]
----
$ oc patch node $NEW_NODE_NAME --type merge --patch '{"spec":{"providerID":"baremetalhost:///openshift-machine-api/'$NEW_NODE_NAME'/'$BMH_UID'"}}'

Suggested change
$ oc patch node $NEW_NODE_NAME --type merge --patch '{"spec":{"providerID":"baremetalhost:///openshift-machine-api/'$NEW_BAREMETALHOST_NAME'/'$BMH_UID'"}}'
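
The `providerID` in this command is made up of the BareMetalHost name and its UID, not the node name. The following sketch shows that composition in isolation. `provider_id` is a hypothetical helper, and the `openshift-machine-api` namespace is taken from the commands in this PR.

```shell
# Hypothetical helper: compose the providerID string used in the node patch.
#   $1 = BareMetalHost name, $2 = BareMetalHost UID
provider_id() {
  printf 'baremetalhost:///openshift-machine-api/%s/%s' "$1" "$2"
}
```

With the sample values from this PR, `provider_id openshift-control-plane-1 d9f9acbc-329c-475e-8d81-03b20280a3e1` reproduces the string shown in the machine listing.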

examplecluster-control-plane-1 Running 3h11m openshift-control-plane-1 baremetalhost:///openshift-machine-api/openshift-control-plane-1/d9f9acbc-329c-475e-8d81-03b20280a3e1 externally provisioned
examplecluster-control-plane-2 Running 3h11m openshift-control-plane-2 baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135 externally provisioned
examplecluster-control-plane-2 Failed 3h11m openshift-control-plane-2 baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135 externally provisioned
examplecluster-compute-0 Running 165m openshift-compute-0 baremetalhost:///openshift-machine-api/openshift-compute-0/3d685b81-7410-4bb3-80ec-13a31858241f provisioned

How about removing these two worker nodes to keep the whole doc consistent? The node names should also be consistent in every example output.

Contributor Author

Okay, I removed references to the worker nodes from this doc.

Also, I checked to make sure the other node names were consistent throughout the doc, but I am not as familiar with how node names, machine names, etcd names, and so forth map to each other. Please let me know if I need to change any other names.

I think we can keep it simple: make the node and BMH names master-00/01/02 or control-plane-0/1/2.

Make the machine names examplecluster-master-00/01/02 or examplecluster-control-plane-0/1/2.

Although the node name and the BMH name can be the same here, they belong to different objects. After the replaced master-02 node is added back, its node name might change, and we cannot control that. So we need to run oc get nodes after the replacement steps to check the node name, and then define $NEW_NODE_NAME to link the node to the BareMetalHost object.

For the BareMetalHost object, we can use either $NEW_BAREMETALHOST_NAME or $NEW_BMH_NAME.
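
Since the node name can drift from the BareMetalHost name, one way to see which BareMetalHost a machine or node refers to is to read it back out of the `providerID`. This is a sketch that assumes the `baremetalhost:///<namespace>/<name>/<uid>` layout seen in the sample output. `bmh_name_from_provider_id` is a hypothetical helper.

```shell
# Hypothetical helper: extract the BareMetalHost name from a providerID of
# the form baremetalhost:///<namespace>/<name>/<uid>.
bmh_name_from_provider_id() {
  rest="${1#baremetalhost:///}"       # -> <namespace>/<name>/<uid>
  [ "$rest" != "$1" ] || return 1     # prefix missing: not a bare-metal ID
  rest="${rest#*/}"                   # -> <name>/<uid>
  printf '%s\n' "${rest%%/*}"         # -> <name>
}
```

Comparing this value with `$NEW_NODE_NAME` shows whether the two names match on a given cluster.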

openshift-ci Bot commented Apr 3, 2026

@skopacz1: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jeana-redhat

Contributor

The branch/enterprise-5.0 label has been added to this PR.

This is because your PR targets the main branch and is labeled for branch/enterprise-4.22. And any PR going into main must also target the latest version branch (branch/enterprise-5.0).

If the update in your PR does NOT apply to version 5.0 onward, please re-target this PR to go directly into the appropriate enterprise- version branch or branches instead of main.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2026
openshift-ci Bot commented Jun 8, 2026

PR needs rebase.

Labels

  • branch/enterprise-4.19
  • branch/enterprise-4.20
  • branch/enterprise-4.21
  • branch/enterprise-4.22
  • branch/enterprise-5.0
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
