path: root/roles/openshift_health_checker/openshift_checks
Commit message | Author | Age | Files | Lines
* openshift_checks: enable providing file outputsLuke Meyer2017-09-186-24/+117
|
|   Some refactoring of checks and the action plugin to enable writing files locally about the check operation and results, if the user wants them. This is aimed at enabling persistent and machine-readable results from recurring runs of health checks.
|
|   Now, rather than trying to build a result hash to return from running each check, checks can just register what they need to as they're going along, and the action plugin processes state when the check is done. Checks can register failures, notes about what they saw, and arbitrary files to be saved into a directory structure where the user specifies. If no directory is specified, no files are written.
|
|   At this time checks can still return a result hash, but that will likely be refactored away in the next iteration. Multiple failures can be registered without halting check execution. Throwing an exception or returning a hash with "failed" is registered as a failure.
|
|   execute_module now does a little more with the results. Results are automatically included in notes and written individually as files. "changed" results are propagated. Some json results are decoded.
|
|   A few of the checks were enhanced to use these features; all get some of the features for free.
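The register-as-you-go flow described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: `SketchCheck`, `process_check`, and the `register_*` method names are hypothetical, and the result layout is an assumption.

```python
import json
import os
import tempfile


class SketchCheck:
    """Hypothetical check that records state while running instead of building a result hash."""

    def __init__(self):
        self.failures = []  # every registered failure; registering does not stop the check
        self.notes = {}     # observations made along the way
        self.files = {}     # file name -> contents, saved only if a directory is given

    def register_failure(self, message):
        self.failures.append(message)

    def register_note(self, key, value):
        self.notes[key] = value

    def register_file(self, name, contents):
        self.files[name] = contents


def process_check(check, run, output_dir=None):
    """Run the check, then turn whatever it registered into a result dict."""
    try:
        returned = run(check) or {}
        # A returned hash with "failed" still counts as a failure.
        if returned.get("failed"):
            check.register_failure(returned.get("msg", "check returned failed"))
    except Exception as exc:
        # So does a thrown exception.
        check.register_failure(str(exc))
    if output_dir:  # no directory specified means no files written
        os.makedirs(output_dir, exist_ok=True)
        for name, contents in check.files.items():
            with open(os.path.join(output_dir, name), "w") as handle:
                handle.write(contents)
    return {"failed": bool(check.failures), "failures": check.failures, "notes": check.notes}


def demo_run(check):
    check.register_failure("disk low")
    check.register_failure("memory low")  # second failure, execution continued
    check.register_file("facts.json", json.dumps({"ok": False}))


with tempfile.TemporaryDirectory() as out_dir:
    result = process_check(SketchCheck(), demo_run, output_dir=out_dir)
    saved = sorted(os.listdir(out_dir))
```

Calling `process_check` without `output_dir` returns the same result dict and leaves the filesystem untouched.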
* docker_image_availability: fix local image searchLuke Meyer2017-09-121-5/+9
|
|   An image in the docker index may be tagged by name or by registry plus name. In order to find the image correctly locally and prevent looking for it externally, make sure all possible variations are searched.
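A small sketch of the "all possible variations" idea, assuming the variations are the bare name plus one `registry/name` form per configured registry. The function name and the example registry are hypothetical.

```python
def local_image_names(image, registries):
    """Return every name under which `image` might be tagged in the local docker index."""
    names = [image]  # tagged by name only
    for registry in registries:
        names.append("{}/{}".format(registry, image))  # tagged by registry plus name
    return names


candidates = local_image_names("openshift/origin:v3.6.0", ["registry.example.com"])
```

A local search would then treat the image as present if any of the candidate names matches.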
* docker_image_availability: probe registry connectivityLuke Meyer2017-09-121-23/+50
|
|   Probe whether the host has connectivity to the registry before trying to inspect it for images, and remember the result. Also if later inspection fails due to timeout, mark registry as unreachable. Note in failure output if any registries were unreachable.
|
|   Registry order should match what is configured into docker now as well.
|
|   Fixes bug 1480195
|   https://bugzilla.redhat.com/show_bug.cgi?id=1480195
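The probe-and-remember behaviour can be sketched as below. The class, its method names, and the injected `connect` callable are hypothetical stand-ins; the real check's connectivity test is not shown in this log.

```python
class RegistryProbe:
    """Hypothetical helper that probes each registry once and remembers the answer."""

    def __init__(self, connect):
        self.connect = connect  # assumed callable: registry name -> bool
        self.reachable = {}     # remembered probe results

    def is_reachable(self, registry):
        if registry not in self.reachable:
            self.reachable[registry] = bool(self.connect(registry))
        return self.reachable[registry]

    def mark_unreachable(self, registry):
        # Used when a later inspection times out despite a successful probe.
        self.reachable[registry] = False

    def unreachable(self):
        return sorted(name for name, ok in self.reachable.items() if not ok)


calls = []


def fake_connect(registry):
    calls.append(registry)
    return True


probe = RegistryProbe(fake_connect)
first = probe.is_reachable("registry.example.com")
second = probe.is_reachable("registry.example.com")  # answered from memory, no new probe
probe.mark_unreachable("registry.example.com")
```

The list from `unreachable()` is the kind of information the failure output could mention.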
* openshift_checks: add retries in pythonLuke Meyer2017-09-126-8/+23
|
* Merge pull request #5296 from nak3/skeopeo-command-outputOpenShift Bot2017-09-061-4/+6
|\
| |   Merged by openshift-bot
| * output skopeo image check commandKenjiro Nakayama2017-09-051-4/+6
| |
* | openshift_checks aos_version: also check installed under yumLuke Meyer2017-09-061-0/+1
|/
|   Tweaks to the logic around using yum vs dnf; now uses ansible_pkg_mgr to determine which is in effect for a host.
|
|   Also, extended the yum logic to check installed packages in addition to available packages in the aos_version module so that disconnected installs and others with weird repo configs need not disable the package_version check.
* Merge pull request #5035 from ↵Rodolfo Carvalho2017-08-311-1/+1
|\
| |   Miciah/openshift_checks-ignore-hidden-files-in-checks-directory
| |
| |   openshift_checks: ignore hidden files in checks dir
| * openshift_checks: ignore hidden files in checks dirMiciah Masters2017-08-081-1/+1
| |
| |   load_checks: Ignore hidden files when scanning the directory for checks.
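A sketch of skipping hidden files during a directory scan, assuming "hidden" means a name starting with a dot. `visible_files` is a hypothetical helper, not the body of `load_checks`.

```python
import os
import tempfile


def visible_files(directory):
    """List directory entries, skipping hidden ones (names starting with a dot)."""
    return sorted(name for name in os.listdir(directory) if not name.startswith("."))


with tempfile.TemporaryDirectory() as checks_dir:
    # One real module and one editor swap file that should be ignored.
    for name in ("disk_availability.py", ".disk_availability.py.swp"):
        open(os.path.join(checks_dir, name), "w").close()
    found = visible_files(checks_dir)
```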
* | Merge pull request #5271 from sosiouxme/20170830-disk-avail-bugRodolfo Carvalho2017-08-311-4/+1
|\ \
| | |   disk_availability: fix bug where msg is overwritten
| * | disk_availability: fix bug where msg is overwrittenLuke Meyer2017-08-301-4/+1
| | |
* | | docker_image_availability: timeout skopeo inspectLuke Meyer2017-08-281-1/+4
|/ /
| |   Set a 10 second timeout when using skopeo to inspect remote registries, so that it does not wait for a tcp timeout to fail if they are unreachable.
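The effect of bounding a slow command can be illustrated in Python with `subprocess.run` and its `timeout` argument (Python 3.5+). This is a swapped-in technique for illustration only; how the check itself applies the 10 second limit to skopeo is not shown in this log. A sleeping child process stands in for an unreachable registry.

```python
import subprocess
import sys


def run_with_timeout(command, seconds=10):
    """Run `command`; return its exit code, or None if it exceeded `seconds`."""
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=seconds,
        )
        return completed.returncode
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before the exception propagates.
        return None


# Stand-in for an inspect call that hangs on an unreachable registry.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
timed_out = run_with_timeout(slow, seconds=0.5)
finished = run_with_timeout([sys.executable, "-c", "pass"])
```

A caller can treat `None` as "registry unreachable" rather than waiting for the TCP stack to give up.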
* | etc_traffic check: factor away short_versionLuke Meyer2017-08-151-2/+2
| |
* | Merge pull request #5036 from ↵Scott Dodson2017-08-151-2/+2
|\ \
| | |   Miciah/openshift_checks-support-ovs-2.7-on-ocp-3.5-and-3.6
| | |
| | |   openshift_checks: allow OVS 2.7 on OCP 3.5 and 3.6
| * | openshift_checks: allow OVS 2.7 on OCP 3.5 and 3.6Miciah Masters2017-08-111-2/+2
| |/
| |   rpm_version: Allow package_list items to specify a list value for version. If a list value is provided for a package, pass the check if any version in that list is found.
| |
| |   ovs_version: Specify both 2.6 and 2.7 as allowed versions of OVS for OpenShift versions 3.5 and 3.6.
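The "pass if any version in the list is found" rule can be sketched like this. `version_allowed` is a hypothetical helper, and the prefix comparison (so that `2.7` accepts `2.7.0`) is an assumption about how versions are matched.

```python
def version_allowed(installed, required):
    """Return True if any installed version satisfies any required version.

    `required` may be a single version string or a list of acceptable versions.
    """
    allowed = required if isinstance(required, list) else [required]
    return any(
        found == want or found.startswith(want + ".")
        for found in installed
        for want in allowed
    )


ovs_ok = version_allowed(["2.7.0"], ["2.6", "2.7"])
ovs_too_old = version_allowed(["2.5.1"], ["2.6", "2.7"])
```

A plain string such as `"2.6"` still works, which keeps existing single-version entries valid.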
* | Merge pull request #4944 from sosiouxme/20170728-refactor-ansible-mountsScott Dodson2017-08-115-73/+44
|\ \
| | |   openshift_checks: refactor find_ansible_mount
| * | openshift_checks: refactor find_ansible_mountLuke Meyer2017-08-085-73/+44
| |/
| |   Reuse the code for finding the ansible_mounts mount for a path.
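One way to find the mount for a path is to walk up the directory tree until a mount point matches. This is a sketch under that assumption, not the refactored helper itself; `find_mount` is a hypothetical name, and the entries mimic the `mount` and `size_available` keys of the `ansible_mounts` fact.

```python
import posixpath


def find_mount(path, mounts):
    """Return the mount entry that holds `path`, walking up parent directories."""
    by_point = {entry["mount"]: entry for entry in mounts}
    current = posixpath.normpath(path)
    while current not in by_point:
        parent = posixpath.dirname(current)
        if parent == current:  # reached the root without finding a mount
            raise LookupError("no mount found for {}".format(path))
        current = parent
    return by_point[current]


mounts = [
    {"mount": "/", "size_available": 10 * 10**9},
    {"mount": "/var", "size_available": 40 * 10**9},
]
docker_mount = find_mount("/var/lib/docker", mounts)
bin_mount = find_mount("/usr/local/bin", mounts)
```

Several checks can share one such helper instead of each carrying its own lookup loop.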
* | Merge pull request #4922 from sosiouxme/20170728-improve-get-varsScott Dodson2017-08-092-7/+51
|\ \
| |/
|/|   openshift_checks: enable variable conversion
| * openshift_checks: enable variable conversionLuke Meyer2017-08-012-7/+51
| |
* | Merge pull request #4913 from sosiouxme/20170720-refactor-check-resultsRodolfo Carvalho2017-08-0813-389/+417
|\ \
| | |   openshift_checks: refactor check results
| * | openshift_checks: refactor logging checksLuke Meyer2017-08-028-371/+401
| | |
| | |   Turn failure messages into exceptions that tests can look for without depending on text meant for humans.
| | |
| | |   Turn logging_namespace property into a method.
| | |
| | |   Get rid of _exec_oc and just use logging.exec_oc.
| * | openshift_checks: add property to track 'changed'Luke Meyer2017-08-0210-31/+29
| | |
| | |   Introduced the 'changed' property for checks that can make changes to track whether they did or not. Rather than the check's own logic having to track this and include it in the result hash, just set the property and have the action plugin insert it in the result hash after running (even if there is an exception).
| | |
| | |   Cleared out a lot of crufty "changed: false" hash entries.
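The division of labour described here can be sketched as follows: the check only sets a flag, and the caller copies it into the result even when the check raises. `RestartingCheck` and `run_check` are hypothetical names used for illustration.

```python
class RestartingCheck:
    """Hypothetical check that changes something and then fails."""

    def __init__(self):
        self.changed = False

    def run(self):
        self.changed = True  # the check only flips the property
        raise RuntimeError("restart failed midway")


def run_check(check):
    """Stand-in for the action plugin: run the check and insert 'changed' afterwards."""
    try:
        result = check.run() or {}
    except Exception as exc:
        result = {"failed": True, "msg": str(exc)}
    result["changed"] = check.changed  # inserted even after an exception
    return result


result = run_check(RestartingCheck())
```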
* | | Merge pull request #4960 from ↵OpenShift Bot2017-08-071-0/+34
|\ \ \
| | | |   juanvallejo/jvallejo/verify-disk-memory-before-upgrade-no-flake
| | | |
| | | |   Merged by openshift-bot
| * | | add pre-flight checks to ugrade pathjuanvallejo2017-08-021-0/+34
| | |/
| |/|
* | | Merge pull request #4969 from sosiouxme/20170801-tolerate-ovs-beyond-36OpenShift Bot2017-08-021-54/+54
|\ \ \
| |_|/
|/| |   Merged by openshift-bot
| * | package_version check: tolerate release version 3.7Luke Meyer2017-08-021-54/+54
| |/
| |   Addresses issue https://github.com/openshift/openshift-ansible/issues/4967
| |
| |   For now, any version >= 3.6 is handled as if it were 3.6. We may want to keep that or fine-tune it later.
| |
| |   Also, the ovs_version check is not updated. This is a post-install health check (does not block install/upgrade) with an update already in progress so will be addressed there.
* / add fluentd logging driver config checkjuanvallejo2017-08-017-72/+187
|/
* Make LoggingCheck.run return the correct typeRodolfo Carvalho2017-07-271-1/+4
|
|   The run method is expected to return a dict. Even though we should not run LoggingCheck by itself, it is still possible to do it and without this change we get an unhandled exception.
* openshift_checks: refactor to internalize task_varsLuke Meyer2017-07-2519-293/+282
|
|   Move task_vars into instance variable so we don't have to pass it around everywhere. Also store tmp. Make sure both are filled in on execute_module.
|
|   In the process, is_active became an instance method, and task_vars is basically never used directly outside of test code.
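The shape of this refactoring can be sketched as a class that receives `task_vars` and `tmp` once and reads them through its own methods. The class name, the `get_var` helper, and the rule inside `is_active` are hypothetical; only the idea of storing both values on the instance comes from the commit message.

```python
class SketchCheck:
    """Hypothetical check that keeps task_vars and tmp on the instance."""

    def __init__(self, task_vars=None, tmp=None):
        self.task_vars = task_vars or {}
        self.tmp = tmp

    def get_var(self, *keys, default=None):
        """Look up a possibly nested variable without passing task_vars around."""
        value = self.task_vars
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                return default
            value = value[key]
        return value

    def is_active(self):
        # Instance method: reads the stored variables (the rule itself is illustrative).
        return "masters" in self.get_var("group_names", default=[])


check = SketchCheck({"group_names": ["masters"], "openshift": {"common": {"short": "3.6"}}})
active = check.is_active()
nested = check.get_var("openshift", "common", "short")
```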
* openshift_checks: get rid of deprecated module_executorLuke Meyer2017-07-253-9/+8
|
* openshift_checks: improve comments/namesLuke Meyer2017-07-259-30/+32
|
* Merge pull request #4682 from juanvallejo/jvallejo/verify-logging-index-timeRodolfo Carvalho2017-07-242-5/+138
|\
| |   verify sane log times in logging stack
| * verify sane log times in logging stackjuanvallejo2017-07-202-5/+138
| |
| |   This patch verifies that logs sent from logging pods can be queried on the Elasticsearch pod within a reasonable amount of time.
* | Merge pull request #4316 from ↵Rodolfo Carvalho2017-07-201-0/+47
|\ \
| | |   juanvallejo/jvallejo/add-increased-etcd-traffic-check
| | |
| | |   add check to detect increased etcd traffic
| * | add etcd increased-traffic checkjuanvallejo2017-07-191-0/+47
| | |
* | | openshift_checks/docker_storage: overlay/2 supportLuke Meyer2017-07-191-32/+145
| | |
| | |   fix bug 1469197
| | |   https://bugzilla.redhat.com/show_bug.cgi?id=1469197
| | |
| | |   When Docker is configured with the overlay or overlay2 storage driver, check that it is supported and usage is below threshold.
* | | Allow OVS 2.7 in latest OpenShift releasesRodolfo Carvalho2017-07-171-2/+2
| | |
| | |   Change the package_version check to tolerate either Open vSwitch 2.6 or 2.7.
| | |
| | |   Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1465882
| | |
| | |   This commit removes a unit test that adds no coverage and tests data instead of logic. This coupling makes every change to supported versions require the same changes to the tests.
* | | add scheduled pods checkjuanvallejo2017-07-111-2/+2
| | |
* | | Add overlay to supported Docker storage driversRodolfo Carvalho2017-07-111-1/+1
| | |
| | |   Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1467809
| | |
| | |   As a next step, we can refine under which conditions the overlay driver is supported.
* | | openshift_checks: fix execute_module paramsLuke Meyer2017-07-112-2/+2
| | |
| | |   Fix where execute_module was being passed task_vars in place of tmp param. Most modules don't seem to use either and so this doesn't fail; but under some conditions (perhaps different per version of ansible?) it tried to treat the dict as a string and came back with a python stack trace.
* | | Merge pull request #4655 from sosiouxme/20170630-atomic-etcd-bz1466622OpenShift Bot2017-06-301-1/+2
|\ \ \
| | | |   Merged by openshift-bot
| * | | docker_image_availability: fix containerized etcdLuke Meyer2017-06-301-1/+2
| | |/
| |/|
| | |   fixes bug 1466622 - docker_image_availability check on etcd host failed for 'openshift_image_tag' is undefined
* | | Merge pull request #4607 from sosiouxme/20170627-docker-storage-vgs-unitsOpenShift Bot2017-06-301-1/+1
|\ \ \
| | | |   Merged by openshift-bot
| * | | docker_storage check: make vgs return sane outputLuke Meyer2017-06-271-1/+1
| |/ /
| | |   fix bug 1464974
| | |   https://bugzilla.redhat.com/show_bug.cgi?id=1464974
| | |
| | |   Specify --units on vgs call. In my testing with lvm 2.0.2.171(2) on RHEL Atomic Host 7.4, this turned a response of "<4.07g" into "4.07g" which should resolve the issue. I haven't found what the "<" is for in the first place but I'm thinking this should at least be a safe change.
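To see why the stray "<" matters, consider a size parser that expects a number followed by a one-letter unit. This is an illustrative assumption about how such output might be consumed, not the check's actual parser; `size_in_gib` is a hypothetical name.

```python
def size_in_gib(text):
    """Convert a size such as '4.07g' to GiB; assumes a number plus a one-letter unit."""
    multipliers = {"m": 1.0 / 1024, "g": 1.0, "t": 1024.0}
    number, unit = text[:-1], text[-1].lower()
    return float(number) * multipliers[unit]


clean = size_in_gib("4.07g")

try:
    size_in_gib("<4.07g")
    prefixed_parses = True
except ValueError:
    # float("<4.07") is not a valid number, so the prefixed form is rejected.
    prefixed_parses = False
```

With the "<" gone from the command output, a parser of this kind needs no special casing.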
* | | Enable disk check on containerized installsRodolfo Carvalho2017-06-221-2/+1
| | |
| | |   According to the docs the disk requirements should be similar to non-containerized installs.
| | |
| | |   https://docs.openshift.org/latest/install_config/install/rpm_vs_containerized.html#containerized-storage-requirements
* | | Add module docstringRodolfo Carvalho2017-06-221-1/+2
| | |
* | | Add suggestion to check disk space in any pathRodolfo Carvalho2017-06-221-1/+5
| | |
* | | Require at least 1GB in /usr/bin/local and tempdirRodolfo Carvalho2017-06-221-0/+14
| | |
| | |   During install, those paths are used and require some free space.
* | | Refactor DiskAvailability for arbitrary pathsRodolfo Carvalho2017-06-221-33/+63
|/ /
| |   Prepare the check to support verifying multiple paths, not only /var.
* | Disable TLS verification in skopeo inspectRodolfo Carvalho2017-06-191-1/+1
| |
| |   Some registries are not configured with valid certificates and thus the check fails with 'http: server gave HTTP response to HTTPS client'. Since this is not fetching images, but only checking for existence, trade security for convenience.