---
title: "Use Ansible to manage the QoS of your OpenShift workload"
date: 2019-02-06T00:00:00+02:00
opensource:
- OpenShift
- Ansible
---

As I was administering my OpenShift cluster, I found out that I had far
too many memory requests. To preserve a good quality of service on my
cluster, I had to tackle this issue.

Resource requests and limits in OpenShift (and Kubernetes in general) are
the concepts that define the quality of service of every running Pod.

Resource requests can target memory, CPU or both. When a Pod has a
resource request (memory, CPU or both), it is guaranteed to receive those
resources, and when it has a resource limit, it cannot overconsume those
resources.

Based on the requests and limits, OpenShift divides the workload into three
classes of Quality of Service: Guaranteed, Burstable and Best Effort.

When the requests are equal to the limits, the Pod has a "Guaranteed" QoS.
When the requests are less than the limits, the Pod has a "Burstable" QoS.
And when no requests and no limits are set, the Pod has a "Best Effort" QoS.
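
In code, the classification boils down to a few comparisons. Here is a
minimal Python sketch of my own, looking at a single container's requests
and limits (a simplification: Kubernetes actually classifies the Pod from
all of its containers together):

```python
def qos_class(requests, limits):
    """Approximate the QoS classification for a single container,
    given its requests and limits as dicts such as
    {"cpu": "100m", "memory": "256Mi"}."""
    if not requests and not limits:
        return "BestEffort"
    # Guaranteed requires limits on both cpu and memory, with the
    # requests equal to the limits (or defaulted from them).
    if limits and set(limits) == {"cpu", "memory"} and \
            all(requests.get(k, v) == v for k, v in limits.items()):
        return "Guaranteed"
    return "Burstable"
```

For example, `qos_class({}, {})` yields `"BestEffort"`, while a container
requesting exactly what it is limited to is `"Guaranteed"`.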

All of this is true as long as there are enough resources for every running
Pod. But as soon as a resource shortage happens, OpenShift will start to
throttle CPU or kill Pods if there is no more memory.

It does so by first killing the Pods that have the "Best Effort" QoS; if the
situation does not improve, it continues with Pods that have the "Burstable"
QoS. Since the Kubernetes Scheduler uses the requests and limits to schedule
Pods, you should not run into a situation where "Guaranteed" Pods need to be
killed (hopefully).

**So, you definitely don't want to have all your eggs (Pods) in the same
basket (class of QoS)!**

Back to the original issue, I needed to find out which Pods were part of the
Burstable or Guaranteed QoS class and lower the less critical ones to the
Best Effort class. I settled on an Ansible playbook to help me fix this.

The first step was discovering which Pods were part of the Burstable or
Guaranteed QoS class. And since most Pods are created from a `Deployment`,
`DeploymentConfig` or `StatefulSet`, I had to find out which of those objects
had a `requests` or `limits` field in it.

This first task was accomplished very easily with a first playbook:

```yaml
- name: List all DeploymentConfig having a request or limit set
  hosts: localhost
  gather_facts: no
  tasks:

  - name: Get a list of all DeploymentConfig on our OpenShift cluster
    command: oc get dc -o json --all-namespaces
    register: oc_get_dc
    changed_when: false

  - block:

    - debug:
        var: to_update

    vars:
      all_objects: "{{ (oc_get_dc.stdout|from_json)['items'] }}"
      to_update: '{{ all_objects|json_query(json_query) }}'
      json_query: >
        [? spec.template.spec.containers[].resources.requests
           || spec.template.spec.containers[].resources.limits ].{
          name: metadata.name,
          namespace: metadata.namespace,
          kind: kind
        }
```
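
If JMESPath is not your thing, the `json_query` filter above can be read as
plain Python. This sketch (my own illustration, with hypothetical helper
names) expresses the same selection and projection over the parsed `oc`
output:

```python
def wants_update(obj):
    """True when any container in the object's Pod template declares
    resource requests or limits -- the condition the JMESPath query
    expresses."""
    containers = (obj.get("spec", {})
                     .get("template", {})
                     .get("spec", {})
                     .get("containers", []))
    return any(c.get("resources", {}).get("requests")
               or c.get("resources", {}).get("limits")
               for c in containers)

def summarize(objects):
    """Project matching objects to name/namespace/kind, like the
    `.{name: ..., namespace: ..., kind: kind}` part of the query."""
    return [{"name": o["metadata"]["name"],
             "namespace": o["metadata"]["namespace"],
             "kind": o["kind"]}
            for o in objects if wants_update(o)]
```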

If you run it with `ansible-playbook /path/to/playbook.yaml`, you will get a
list of all `DeploymentConfig` objects having requests or limits set:

```raw
PLAY [List all DeploymentConfig having a request or limit set] ***************

TASK [Get a list of all DeploymentConfig on our OpenShift cluster] ***********
ok: [localhost]

TASK [debug] *****************************************************************
ok: [localhost] => {
    "to_update": [
        {
            "kind": "DeploymentConfig",
            "name": "router",
            "namespace": "default"
        },
        {
            "kind": "DeploymentConfig",
            "name": "docker-registry",
            "namespace": "default"
        },
        ...

PLAY RECAP *******************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0
```

I completed the playbook to also find the `Deployment` and `StatefulSet`
objects having requests or limits set.

```yaml
  tasks:

  [...]

  - name: Get a list of all Deployment on our OpenShift cluster
    command: oc get deploy -o json --all-namespaces
    register: oc_get_deploy
    changed_when: false

  - name: Get a list of all StatefulSet on our OpenShift cluster
    command: oc get sts -o json --all-namespaces
    register: oc_get_sts
    changed_when: false

  - block:

    [...]

    vars:
      all_objects: >-
        {{ (oc_get_dc.stdout|from_json)['items']
           + (oc_get_deploy.stdout|from_json)['items']
           + (oc_get_sts.stdout|from_json)['items'] }}
```

And last but not least, I added a call to the `oc set resources` command
to bring those objects back to the Best Effort QoS class.

```yaml
  [...]

  - block:

    [...]

    - debug:
        msg: 'Will update {{ to_update|length }} objects'

    - pause:
        prompt: 'Proceed ?'

    - name: Change the QoS class to "Best Effort"
      command: >
        oc set resources {{ obj.kind }} {{ obj.name }} -n {{ obj.namespace }}
        --requests=cpu=0,memory=0 --limits=cpu=0,memory=0
      loop: '{{ to_update }}'
      loop_control:
        loop_var: obj
```
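
Each loop iteration runs one `oc set resources` invocation. To make the
templating concrete, here is a small Python sketch (the `build_command`
helper is hypothetical, for illustration only) of the command line one
iteration produces:

```python
def build_command(obj):
    """Render the `oc set resources` invocation the playbook loop
    would run for one object (a kind/name/namespace dict)."""
    return ("oc set resources {kind} {name} -n {namespace} "
            "--requests=cpu=0,memory=0 --limits=cpu=0,memory=0"
            .format(**obj))

cmd = build_command({"kind": "DeploymentConfig",
                     "name": "router",
                     "namespace": "default"})
# cmd is the exact command the loop would execute for the "router" object
```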

Since I do not want all Pods to have the Best Effort QoS class, I added a
blacklist of critical namespaces that should not be touched.

```yaml
- name: Change the QoS class of commodity projects
  hosts: localhost
  gather_facts: no
  vars:
    namespace_blacklist:
    - default
    - openshift-sdn
    - openshift-monitoring
    - openshift-console
    - openshift-web-console

  tasks:

  [...]

  - name: Change the QoS class to "Best Effort"
    command: >
      oc set resources {{ obj.kind }} {{ obj.name }} -n {{ obj.namespace }}
      --requests=cpu=0,memory=0 --limits=cpu=0,memory=0
    loop: '{{ to_update }}'
    loop_control:
      loop_var: obj
    when: obj.namespace not in namespace_blacklist
```
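
The `when:` clause acts as a per-item filter over the loop. The same skip
logic, written as plain Python for clarity (names are illustrative):

```python
NAMESPACE_BLACKLIST = {"default", "openshift-sdn", "openshift-monitoring",
                       "openshift-console", "openshift-web-console"}

def should_touch(obj):
    """Mirror the playbook's `when:` condition: skip objects living in
    one of the protected namespaces."""
    return obj["namespace"] not in NAMESPACE_BLACKLIST

# Only objects outside the blacklist are changed.
targets = [{"name": "app", "namespace": "dev"},
           {"name": "router", "namespace": "default"}]
to_change = [o for o in targets if should_touch(o)]
```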

You can find the complete playbook [here](change-qos.yaml). Of course, it is
very rough and would need more work to be used on a daily basis, but for a
single use it is sufficient.

The complete playbook, `change-qos.yaml`:

```yaml
---
- name: Change the QoS class of commodity projects
  hosts: localhost
  gather_facts: no
  vars:
    namespace_blacklist:
    - default
    - openshift-sdn
    - openshift-monitoring
    - openshift-console
    - openshift-web-console
  tasks:

  - name: Make sure we are logged in on the CLI
    command: oc whoami
    changed_when: false

  - name: Get a list of all DeploymentConfig on our OpenShift cluster
    command: oc get dc -o json --all-namespaces
    register: oc_get_dc
    changed_when: false

  - name: Get a list of all Deployment on our OpenShift cluster
    command: oc get deploy -o json --all-namespaces
    register: oc_get_deploy
    changed_when: false

  - name: Get a list of all StatefulSet on our OpenShift cluster
    command: oc get sts -o json --all-namespaces
    register: oc_get_sts
    changed_when: false

  - block:

    - debug:
        var: to_update
        verbosity: 1

    - debug:
        msg: 'Will update {{ to_update|length }} objects'

    - pause:
        prompt: 'Proceed ?'

    - name: Change the QoS class to "Best Effort"
      command: >
        oc set resources {{ obj.kind }} {{ obj.name }} -n {{ obj.namespace }}
        --requests=cpu=0,memory=0 --limits=cpu=0,memory=0
      loop: '{{ to_update }}'
      loop_control:
        loop_var: obj
      when: obj.namespace not in namespace_blacklist

    vars:
      all_objects: >-
        {{ (oc_get_dc.stdout|from_json)['items']
           + (oc_get_deploy.stdout|from_json)['items']
           + (oc_get_sts.stdout|from_json)['items'] }}
      to_update: '{{ all_objects|json_query(json_query) }}'
      json_query: >
        [? spec.template.spec.containers[].resources.requests
           || spec.template.spec.containers[].resources.limits ].{
          name: metadata.name,
          namespace: metadata.namespace,
          kind: kind
        }
```