Unused resources
================
! Cleaning of images is necessary if the number of resident images grows above 1000 (a quick check is sketched below). Everything else has not
caused problems yet and can be ignored unless it blocks other actions (e.g. clean-up of old images).
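A minimal sketch of such a check, assuming plain docker is the container runtime on the node:
# Count the distinct images resident on this node; if this grows well above 1000, schedule an image clean-up (see the pruning steps below).
docker images -q --no-trunc | sort -u | wc -l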
- Deployments. So far this has not caused problems by itself, but old versions of 'rc' may block removal of old images, and this may have a
negative impact on performance.
oc adm prune deployments --orphans --keep-complete=3 --keep-failed=1 --keep-younger-than=60m --confirm
oc adm prune builds --orphans --keep-complete=3 --keep-failed=1 --keep-younger-than=60m --confirm
* This, however, does not clean old 'rc' controllers that are still retained because of 'revisionHistoryLimit' (and possibly for other reasons
as well). The included 'prunerc.sh' script cleans such controllers; a rough sketch of the idea is shown below.
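A minimal sketch of the idea behind such a script (an assumption about its logic, not the actual 'prunerc.sh'): remove replication controllers
that have been scaled down to zero replicas.
# List all rc with 0 desired and 0 current replicas and delete them.
# WARNING: this simplistic version also removes recent revisions that may still be wanted for rollbacks;
# the bundled script presumably filters those out.
oc get rc --all-namespaces --no-headers | awk '$3 == 0 && $4 == 0 { print $1" "$2 }' |
while read ns name; do
    oc -n "$ns" delete rc "$name"
done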
- OpenShift sometimes fails to clean stopped containers. These containers may again block removal of images (and, if they accumulate, likely
impose Docker performance penalties on their own).
* Lost containers can be identified by looking into /var/log/messages for messages like
PodSandbox "aa28e9c7605cae088838bb4c9b92172083680880cd4c085d93cbc33b5b9e8910" from runtime service failed: ...
* We can find and remove the corresponding container (the short id is just the first characters of the long id); a sketch automating this
look-up is shown after this item
docker ps -a | grep aa28e9c76
docker rm <id>
* In general, any container that remains in the stopped state for a long time can be considered lost. We can remove all of them, or only the
ones related to a specific image (if we are cleaning images and something blocks deletion of an old version)
docker rm $(docker ps -a | grep Exited | grep adei | awk '{ print $1 }')
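A hedged sketch that combines the two steps above, assuming the log format shown in the example message (adjust the grep pattern if the
messages differ):
# Extract the sandbox ids reported in /var/log/messages and list any containers (running or stopped) that still carry them.
grep -o 'PodSandbox "[0-9a-f]*"' /var/log/messages | tr -d '"' | awk '{ print substr($2, 1, 12) }' | sort -u |
while read id; do
    docker ps -a | grep "$id"
done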
- If containers are cleaned manually and/or pod termination is forced, some remnants may be left in '/var/lib/origin/openshift.local.volumes/pods'
* It probably can also happen in other cases. This can be detected by looking in /var/log/messages for messages like
Orphaned pod "212074ca-1d15-11e8-9de3-525400225b53" found, but volume paths are still present on disk.
* If unknown, the location of the pod in question can be found with 'find . -name "heketi*"' or similar (the container names are listed under
this subdirectory, so they can be used in the search)...
* There could be problematic mounts which can be freed with a lazy umount
* The folders for removed pods may (and should) be removed; a sketch of the clean-up is shown after this item.
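A minimal sketch of this clean-up; the UID below is just the example value from the 'Orphaned pod' message above, and anything still mounted
should be reviewed before deleting:
# Lazily unmount whatever is still mounted under the orphaned pod directory, then remove the leftovers.
pod_uid=212074ca-1d15-11e8-9de3-525400225b53
pod_dir=/var/lib/origin/openshift.local.volumes/pods/$pod_uid
grep " $pod_dir" /proc/mounts | awk '{ print $2 }' | xargs -r -n 1 umount -l
rm -rf "$pod_dir"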
- Pruning unused images (this is required: if a large number of images accumulates, additional latencies in the communication with the docker
daemon are introduced, resulting in severe penalties to scheduling performance). The official way to clean unused images is
oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm
* This, however, will keep all images referenced by existing bc, dc, rc, and pods (see above). So it is worth cleaning OpenShift resources
before proceeding with images; a sketch for checking what still references an image is shown below. If images still do not go away, it is
also worth trying to clean orphaned containers.
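A hedged sketch (an assumed approach, not an official command) to see which pods and controllers still reference the images being pruned;
pipe the output through grep for the image name or digest in question:
# Images referenced by running pods
oc get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.spec.containers[*].image}{"\n"}{end}'
# Images referenced by rc/dc templates
oc get rc,dc --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.spec.template.spec.containers[*].image}{"\n"}{end}'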
* Some images can also be orphaned by the OpenShift infrastructure. OpenShift supports 'hard' pruning to handle such images:
https://docs.openshift.com/container-platform/3.7/admin_guide/pruning_resources.html
First check if something needs to be done:
oc -n default exec -i -t "$(oc -n default get pods -l deploymentconfig=docker-registry -o jsonpath=$'{.items[0].metadata.name}\n')" -- /usr/bin/dockerregistry -prune=check
If there are many orphans, hard pruning can be executed. This requires additional permissions for the service account running the
docker-registry
service_account=$(oc get -n default -o jsonpath=$'system:serviceaccount:{.metadata.namespace}:{.spec.template.spec.serviceAccountName}\n' dc/docker-registry)
oc adm policy add-cluster-role-to-user system:image-pruner ${service_account}
and should be done with the docker registry in read-only mode (requires a restart of the default/docker-registry containers)
oc env -n default dc/docker-registry 'REGISTRY_STORAGE_MAINTENANCE_READONLY={"enabled":true}' # wait until new pods rolled out
oc -n default exec -i -t "$(oc -n default get pods -l deploymentconfig=docker-registry -o jsonpath=$'{.items[0].metadata.name}\n')" -- /usr/bin/dockerregistry -prune=delete
oc env -n default dc/docker-registry REGISTRY_STORAGE_MAINTENANCE_READONLY-
- Cleaning old images which don't want to go away.
* Investigating the image streams and manually deleting the old versions of the images (a sketch for listing the recorded digests follows
the commands below)
oc get is adei -o yaml
oc delete image sha256:04afd4d4a0481e1510f12d6d071f1dceddef27416eb922cf524a61281257c66e
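A hedged sketch (assumed approach) for listing the image digests recorded for each tag of an image stream, so the stale ones can be picked
for 'oc delete image' without reading the full yaml:
# For every tag of the 'adei' image stream, print the tag followed by its recorded digests (most recent first).
oc get is adei -o jsonpath='{range .status.tags[*]}{.tag}{" "}{.items[*].image}{"\n"}{end}'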
* Cleaning old dangling images using docker (on all nodes). This has been tried and, as far as can be seen, caused no issues for the operation
of the cluster.
docker rmi $(docker images --filter "dangling=true" -q --no-trunc)