Merge pull request #3778 from lhuard1A/rh_subscription_resilient - csa/devops/ansible-patches/openshift.git - Unnamed repository; edit this file 'description' to name the repository.

diff options

author	OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>	2017-09-19 21:01:59 -0700
committer	GitHub <noreply@github.com>	2017-09-19 21:01:59 -0700
commit	993436532384cb76ae4039dca224ad132fe43d45 (patch)
tree	3358bb4528f484579a7105912317a9651b6cbb2d /roles/openshift_repos/tasks
parent	3eb3a30712edb771e65f37739d5ae6c2471295cc (diff)
parent	91a6e0a86a783ce25c090a4946a6ada20363ba6c (diff)

Merge pull request #3778 from lhuard1A/rh_subscription_resilient

Automatic merge from submit-queue Make RH subscription more resilient to temporary failures subscription-manager can sometimes fail because of server side errors. Manually replaying the command usually works. So, let’s make openshift-ansible more resilient to temporary failures of subscription-manager by retrying the failed commands with a maximum of 3 retries. Here is an example of such sporadic errors: ``` TASK [rhel_subscribe : Retrieve the OpenShift Pool ID] ************************* ok: [lenaic-node-compute-c96e7] ok: [lenaic-master-bbe09] ok: [lenaic-node-compute-2976a] fatal: [lenaic-node-infra-47ba5]: FAILED! => {"changed": false, "cmd": ["subscription-manager", "list", "--available", "--matches=Red Hat OpenShift Container Platform, Premium*", "--pool-only"], "delta": "0:00:07.152650", "end": "2017-04-04 11:24:59.729405", "failed": true, "rc": 70, "start": "2017-04-04 11:24:52.576755", "stderr": "Unable to verify server's identity: (104, 'Connection reset by peer')", "stdout": "", "stdout_lines": [], "warnings": []} TASK [rhel_subscribe : Determine if OpenShift Pool Already Attached] *********** skipping: [lenaic-master-bbe09] skipping: [lenaic-node-compute-2976a] skipping: [lenaic-node-compute-c96e7] TASK [rhel_subscribe : fail] *************************************************** skipping: [lenaic-node-compute-2976a] skipping: [lenaic-master-bbe09] skipping: [lenaic-node-compute-c96e7] TASK [rhel_subscribe : Attach to OpenShift Pool] ******************************* fatal: [lenaic-node-compute-c96e7]: FAILED! => {"changed": true, "cmd": ["subscription-manager", "subscribe", "--pool", "8a85f9814ff0134a014ff43b44095513"], "delta": "0:00:21.421300", "end": "2017-04-04 11:25:20.655873", "failed": true, "rc": 70, "start": "2017-04-04 11:24:59.234573", "stderr": "Unable to verify server's identity: (104, 'Connection reset by peer')", "stdout": "Successfully attached a subscription for: Red Hat OpenShift Container Platform, Premium (1-2 Sockets)", "stdout_lines": ["Successfully attached a subscription for: Red Hat OpenShift Container Platform, Premium (1-2 Sockets)"], "warnings": []} changed: [lenaic-master-bbe09] changed: [lenaic-node-compute-2976a] ``` In this example, subscription-manager was failing on some nodes, but not all. Retrying on the failed nodes would have avoided to abandon those nodes.

Diffstat (limited to 'roles/openshift_repos/tasks')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: