How we plan to manage autoscaling using the new notification alarming service of Ceilometer

By Patrick on 25/02/2014 14:07:00

In this post, I'd like to describe how we plan to use the new alarming capabilities offered in Heat and Ceilometer to be notified of stack state changes resulting from an autoscaling operation. Indeed, with Icehouse, it will be possible to specify a new type of alarm whereby you can associate a user-defined webhook with an autoscaling notification.

There are three different types of autoscaling notifications you will be able to subscribe to.

  • orchestration.autoscaling.start
  • orchestration.autoscaling.error
  • orchestration.autoscaling.end

The first two notifications are self-explanatory. The third one, orchestration.autoscaling.end, is sent by Heat when an auto-scaling-group resize has completed successfully. More specifically, it is sent when the state of the (hidden) stack associated with an autoscaling group has effectively transitioned from UPDATE_IN_PROGRESS to UPDATE_COMPLETE.

The Ceilometer blueprint which introduces the feature in Icehouse is here.

We tested it, and it seems to work fine, as shown in the terminal captures below.

The CLI looks like this:

ceilometer --debug alarm-notification-create  --name foo --enabled True --alarm-action "http://localhost:9998?action=UP" --notification-type  "orchestration.autoscaling.end" -q "capacity>0"

Then the curl equivalent:

curl -i -X POST -H 'X-Auth-Token: a-very-long-string' -H 'Content-Type: application/json' -H 'Accept: application/json' -H 'User-Agent: python-ceilometerclient' -d '{"alarm_actions": ["http://localhost:9998?action=UP"], "name": "foo", "notification_rule": {"query": [{"field": "capacity", "type": "", "value": "0", "op": "gt"}], "period": 0, "notification_type": "orchestration.autoscaling.end"}, "enabled": true, "repeat_actions": false, "type": "notification"}'

And the callback handling:

nc -l 9998

POST /?action=UP HTTP/1.1
Host: localhost:9998
Content-Length: 1650
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.2.1 CPython/2.7.3 Linux/3.2.0-48-virtual

{"current": "alarm", "alarm_id": "e7dafd2d-18a3-4c9d-a4af-efe927007ae6", "reason": "Transition to alarm from insufficient data due to notification matching the defined condition for alarm  foo.end with type orchestration.autoscaling.start and period 0", "reason_data": {"_context_request_id": "req-480768ed-c5a2-46f6-b720-8ac2542e3eb8", "event_type": "orchestration.autoscaling.start", "_context_auth_token": null, "_context_user_id": null, "payload": {"state_reason": "Stack create completed successfully", "adjustment": 1, "user_id": "admin", "stack_identity": "arn:openstack:heat::6db81240677b4326b94a595c0159baa5:stacks/AS4/5eef9488-5274-4305-bedb-91f5ed45cdd6", "stack_name": "AS4", "tenant_id": "6db81240677b4326b94a595c0159baa5", "adjustment_type": "ChangeInCapacity", "create_at": "2014-02-19T10:35:59Z", "groupname": "AS4-ASGroup-nyywzf4x5hif", "state": "CREATE_COMPLETE", "capacity": 1, "message": "Start resizing the group AS4-ASGroup-nyywzf4x5hif", "project_id": null}, "_context_username": "admin", "_context_show_deleted": false, "_context_trust_id": null, "priority": "INFO", "_context_is_admin": false, "_context_user": "admin", "publisher_id": "orchestration.ds-swann-precise-node-s3fbwjntypxv", "message_id": "738be905-1ec3-47e3-811a-ab7975426567", "_context_roles": [], "_context_auth_url": "", "timestamp": "2014-02-19 10:42:23.960329", "_unique_id": "a32ff8a1a8144532b6312ef36790acec", "_context_tenant_id": "6db81240677b4326b94a595c0159baa5", "_context_password": "password", "_context_trustor_user_id": null, "_context_aws_creds": null, "_context_tenant": "demo"}, "previous": "insufficient data"}

At first glance, this may seem like a minor feature, but it is not; hence this post. For us, it is a significant stride toward closing the implementation gap we used to have in the integrated lifecycle management operations we want to support for the clusters we deploy on our platform. To help with the explanation, I sketched a diagram that shows the deployment orchestration and configuration management automation workflow (which I will call contextualization for short) that takes place when an autoscaling condition occurs. The use case of choice is the remote rendering cluster that we already used in some cool cloud gaming demos.

Figure 1: Remote Rendering Cluster (RRVC) contextualization workflow upon autoscaling

The XLcloud Management Service (XMS) sits on top of OpenStack. It is responsible for supporting the seamless integration between resource deployment orchestration and configuration management automation. Autoscaling is just one example of a state-changing condition that may occur in the platform. There are other state-changing conditions, such as deploying a new application onto the cluster or upgrading its software, that we handle using the same contextualization mechanism. Note that a cluster, as we call it, is nothing more than a relatively complex multi-tiered Heat stack whose lifecycle management operations are handled by XMS throughout its lifespan.

In (1), the deployment of the cluster is initiated by XMS, which in turn delegates the deployment orchestration to Heat. The cluster is created by submitting a master template which itself references embedded templates we call layers. Layers can be chosen from a catalog and are used as blueprints of purpose-built instances to compose a given stack; although not shown here, a layer can also benefit from interesting capabilities such as being attached to a specific subnet. The remote rendering cluster is therefore a stack composed of layers, including in particular an auto-scaling-group layer made of GPU-accelerated rendering node instances, all created and configured with the same parameters and set of Chef recipes. There are two types of alarm resources we specify in the rendering nodes layer template.

  • The OS::Ceilometer::Alarm resource type, introduced in Havana, which makes it possible to associate an alarm with an auto-scaling-group policy
  • The OS::Ceilometer::Notification resource type, which will make it possible to associate an alarm with a notification
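For reference, the first resource type is already available today. As a rough illustration of how we could wire a threshold alarm to a scaling policy, a scale-up alarm on our GPU metric might look like the sketch below; the gpu_temperature meter name, the scale-up-policy resource name and the threshold value are illustrative, not taken from our actual templates.

```yaml
gpu-temp-alarm-up:
  Type: OS::Ceilometer::Alarm
  Properties:
    description: Scale up when the average GPU temperature gets too high
    meter_name: gpu_temperature        # illustrative meter pushed by our pollster
    statistic: avg
    period: 60
    evaluation_periods: 1
    threshold: 80
    comparison_operator: gt
    alarm_actions:
      - {'Fn::GetAtt': ['scale-up-policy', 'AlarmUrl']}
    matching_metadata: {'metadata.user_metadata.groupName': {'Ref': 'compute-nodes-layer'}}
```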

Note that OS::Ceilometer::Notification is a new resource type proposal; it doesn't exist yet. It is intended to declaratively represent an alarm that is triggered by Ceilometer when a notification matching certain criteria is received. In our particular use case, that is when Ceilometer receives the orchestration.autoscaling.end notification sent by Heat once an auto-scaling-group resize has completed successfully. The alarm specification makes it possible to distinguish between scale-up (capacity > 0) and scale-down (capacity < 0).

Here is an example of how it would be used:

     Type: OS::Ceilometer::Notification
       description: Send an alarm when Ceilometer receives a scale up notification
       notification_type: orchestration.autoscaling.end
       capacity: '0'
       comparison_operator: gt
       alarm_actions:
         - { a user-land webhook URL... }
       matching_metadata: {'metadata.user_metadata.groupName': {'Ref': 'compute-nodes-layer'}}

     Type: OS::Ceilometer::Notification
       description: Send an alarm when Ceilometer receives a scale down notification
       notification_type: orchestration.autoscaling.end
       capacity: '0'
       comparison_operator: lt
       alarm_actions:
         - { a user-land webhook URL... }
       matching_metadata: {'metadata.user_metadata.groupName': {'Ref': 'compute-nodes-layer'}}

In (2), Heat creates these two alarms through the Ceilometer Alarming Service API.

In (3), all the instances of the cluster execute their initial setup recipes. The role of the initial setup is to bring the cluster into a state in which it can be remotely managed by XMS: download all the cookbooks from their respective repositories, resolve their dependencies along the way, and install the MCollective agent, Chef Solo and the rendering engine middleware. During the setup phase, the metadata associated with the stack is exposed as ohai facts through the MCollective agent. Certain ohai facts, such as the stack id, will be used as MCollective filters to selectively reach a particular instance, a layer or the entire cluster.
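To give an idea of what fact-based targeting buys us, the sketch below mimics how a filter on ohai facts selects a subset of instances. It is only an illustration of the selection logic; the fact names and hostnames are made up, and the real filtering is of course performed by MCollective itself.

```python
def match_facts(instances, filters):
    """Return the instances whose facts satisfy every filter,
    mimicking how MCollective fact filters select RPC targets."""
    return [i for i in instances
            if all(i.get(k) == v for k, v in filters.items())]

# Made-up inventory: each dict stands for one instance's ohai facts.
nodes = [
    {"hostname": "render-1", "stack_id": "5eef9488", "layer": "rendering"},
    {"hostname": "render-2", "stack_id": "5eef9488", "layer": "rendering"},
    {"hostname": "lb-1",     "stack_id": "5eef9488", "layer": "balancer"},
]

# Reach only the rendering layer of one stack, the way a fact
# filter on stack_id and layer would.
targets = match_facts(nodes, {"stack_id": "5eef9488", "layer": "rendering"})
print([n["hostname"] for n in targets])
```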

In (4), a workload is generated against the cluster by gamers who want to play. A cloud gaming session load balancer running in the Virtual Cluster Agent takes those requests and dispatches them across the auto-scaling-group of the cluster. Gaming sessions are dispatched according to their processing requirements, which may vary quite a lot depending on the game being played and the viewing resolution.

In (5), a gmond daemon running on every GPU-accelerated instance uses a specific Ganglia GPU module to monitor the GPU(s) attached to the rendering instances via PCI passthrough. In another layer of the cluster, a Ganglia collector running gmetad gathers the GPU usage metrics and passes them to a Ganglia pollster we developed for that cluster, which in turn pushes them (after some local processing) as Ceilometer samples. You can observe that we have chosen not to use the cfn-push-stats helper within the monitored instances, relying instead on the Ganglia monitoring framework and the Ceilometer API. A direct benefit of this is that we get a nice Ganglia monitoring dashboard.
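As an illustration of that last hop, the sketch below shows roughly what a sample built by such a pollster could look like before it is POSTed to the Ceilometer v2 metering API. The gpu_temperature meter name and the values are placeholders, and the actual posting (endpoint, Keystone token) is only indicated in comments since it depends on the deployment.

```python
import json

def gpu_sample(resource_id, temperature, group_name):
    """Build one Ceilometer v2 sample for an illustrative GPU
    temperature meter; keys under user_metadata surface as the
    metadata.user_metadata.* fields that alarms can match on."""
    return {
        "counter_name": "gpu_temperature",
        "counter_type": "gauge",
        "counter_unit": "C",
        "counter_volume": temperature,
        "resource_id": resource_id,
        "resource_metadata": {"user_metadata": {"groupName": group_name}},
    }

body = json.dumps([gpu_sample("render-1", 78.5, "AS4-ASGroup-nyywzf4x5hif")])
# The pollster would then POST this list to the metering endpoint,
# something along the lines of:
#   POST <ceilometer-endpoint>/v2/meters/gpu_temperature
#   X-Auth-Token: <keystone token>
print(body)
```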

In (6), the Alarm Service of Ceilometer detects that a resource usage alarm condition caused by the current workload is met. We found, for example, that an increase in GPU temperature is very representative of the ongoing GPU load. As a result, Ceilometer calls the webhook in the Auto Scaling Service of Heat that was defined in the OS::Ceilometer::Alarm resource, which in turn initiates a scale-up operation.

In (7), Heat spawns one or several new instances in the auto-scaling-group of the cluster.

In (8), the new instance(s) execute the initial setup as above. Once the setup is complete, the auto-scaling-group enters the UPDATE_COMPLETE state, which makes the Auto Scaling Service of Heat send an orchestration.autoscaling.end notification.

In (9), the Alarm Service of Ceilometer detects that an autoscaling alarm condition is met. Ceilometer calls the webhook in XMS that was defined in the "autoscaling-alarm-up" resource of the template.

In (10), XMS makes an MCollective RPC call directing the instances of the cluster (except those that are not concerned by the contextualization) to execute the recipes associated with an autoscaling event, which we refer to in the template as the 'configure' recipes.

In (11), the load balancer can now dispatch new incoming gaming sessions to the newly provisioned instance(s).

Note that the same workflow would roughly apply for a scale-down notification.

Do not hesitate to leave a note if you have a comment or suggestion to make.

This wiki is licensed under a Creative Commons 2.0 license