Provisioning on existing cloud platforms

Foreword: Last month I contributed to a book chapter on cloud provisioning. The following paragraphs were prepared for that chapter.

Prominent cloud platforms, including Amazon Elastic Compute Cloud (EC2), the open source platform Eucalyptus, Microsoft Windows Azure, Google App Engine and GoGrid Cloud Hosting, offer a variety of services to their users for monitoring, managing and provisioning resources. Amazon EC2 offers three services: Elastic Load Balancer, Auto Scaling and CloudWatch. Eucalyptus uses a hierarchical controller structure. Windows Azure implements a service called Azure Fabric Controller. GoGrid Cloud Hosting provides a service named F5 Load Balancer. Google App Engine runs on the same scalable technology that Google applications are based on.

All five cloud platforms support load balancing, provisioning and auto scaling to some extent, but the way they achieve these goals varies. Some platforms manage them automatically; some are only halfway there; some handle everything behind the scenes; and some leave all the work to be done manually.

Amazon Elastic Compute Cloud

Elastic Load Balancer, Auto Scaling and CloudWatch are the three web services that Amazon EC2 uses for resource provisioning. Elastic Load Balancer automatically distributes incoming connections across multiple Amazon EC2 instances. It monitors the health of the instances and reroutes traffic from unhealthy instances to healthy ones, within a single availability zone or across multiple zones. Auto Scaling increases or decreases the number of Amazon EC2 instances as demand spikes or subsides, according to pre-defined conditions. The third service, CloudWatch, monitors cloud resources such as Amazon EC2 instances and Elastic Load Balancer in real time. It provides developers with detailed metrics on the instances' resource utilization, operational performance, network traffic and so on.

Furthermore, Amazon CloudWatch metrics are the foundation of the Elastic Load Balancer and Auto Scaling services, because the traffic distribution decisions made by Elastic Load Balancer and the scaling decisions made by Auto Scaling are based on the collected metrics data. Although the three services are published separately, all of their functions are exposed through APIs described in WSDL, which makes SOA-based integration of the services simple and feasible.
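To make the idea of scaling on pre-defined conditions concrete, here is a minimal sketch of a threshold policy driven by collected CPU metrics. It is not the Amazon API: the function name, the 30%/70% thresholds and the instance limits are all assumptions chosen for illustration.

```python
from statistics import mean


def desired_instance_count(cpu_samples, current, low=30.0, high=70.0,
                           min_instances=1, max_instances=10):
    """Return the instance count a simple threshold policy would request.

    cpu_samples: recent CPU utilization percentages (hypothetical metric feed).
    current:     number of instances running now.
    All thresholds and limits are illustrative assumptions.
    """
    average = mean(cpu_samples)
    if average > high:
        target = current + 1      # demand spike: add one instance
    elif average < low:
        target = current - 1      # demand subsided: remove one instance
    else:
        target = current          # within the comfortable band: do nothing
    # Never leave the configured range.
    return max(min_instances, min(max_instances, target))
```

Called with a list of recent samples and the current instance count, the function returns the count to move towards. A real policy would usually add a cool-down period so that the group does not oscillate between two sizes.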

Eucalyptus

Eucalyptus is an open source cloud platform whose frontend is compatible with the Amazon EC2 API but whose backend follows a completely different design. The system is composed of three controllers that manage the virtualization environment in a centralized, hierarchical structure. From bottom to top they are the Node Controller, the Cluster Controller and the Cloud Controller, which are respectively in charge of managing physical resources for virtual machines, coordinating the Node Controllers in the same availability zone, and processing connections from external clients and administrators.

Of the three controllers, the Cluster Controller is the key component for provisioning and load balancing. Each Cluster Controller sits on the head node of a cluster to form an availability zone, linking the outer public network and the inner private network. By monitoring the state information of instances in its pool of Node Controllers, it directs each incoming connection to the first Node Controller that has enough free resources to process it. So far, however, Eucalyptus still lacks some functionality, such as auto scaling, live migration and a built-in scheduler.
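The "first Node Controller with enough free resources" rule is a first-fit selection. The sketch below illustrates it under simple assumptions: the `NodeState` fields and the `first_fit` function are hypothetical names, not part of the Eucalyptus code base, and resources are reduced to cores and memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class NodeState:
    """State a Cluster Controller might track per Node Controller (assumed fields)."""
    name: str
    free_cores: int
    free_memory_mb: int


def first_fit(nodes: Sequence[NodeState], cores: int,
              memory_mb: int) -> Optional[NodeState]:
    """Return the first node, in list order, that can satisfy the request."""
    for node in nodes:
        if node.free_cores >= cores and node.free_memory_mb >= memory_mb:
            return node
    return None  # no node in the cluster has enough free resources
```

First fit is cheap and predictable, but it tends to fill the nodes at the front of the list first instead of spreading load evenly.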

Microsoft Windows Azure

Azure Fabric Controller, which sits inside the Windows Azure Fabric, is a highly redundant, fault-tolerant service designed for monitoring, maintaining and provisioning the machines that host the applications developers create and store in Windows Azure. Fundamentally, the Windows Azure Fabric has a weave-like structure composed of nodes (servers and load balancers) and edges (power, Ethernet and serial communications).

The Fabric Controller supervises the nodes in the Windows Azure Fabric in different ways depending on the node type. If the node is a hardware load balancer, then whatever its brand or model, the Fabric Controller drives it through a custom driver that implements an Azure-supported driver model for compatibility. If the node is a server, a built-in service named Azure Fabric Controller Agent runs in the background, tracking the current state and the goal state of the server and communicating with the Azure Fabric Controller. Using the states reported by the server, the Fabric Controller can move it from its current state to the goal state. If the Agent reports a fault state, the Fabric Controller can reboot the server or reprovision its running applications onto other servers.
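The current-state/goal-state mechanism can be pictured as a small reconciliation step. The following is only a sketch: the state names, the action strings and the "reboot first, then reprovision" order are assumptions made for illustration, not documented Fabric Controller behavior.

```python
from enum import Enum


class State(Enum):
    OFF = "off"
    BOOTING = "booting"
    RUNNING = "running"
    FAULTED = "faulted"


def next_action(current: State, goal: State, reboot_attempts: int,
                max_reboots: int = 1) -> str:
    """Pick the action that moves a server from its current state to its goal.

    reboot_attempts counts reboots already tried for the present fault;
    max_reboots is a hypothetical limit before giving up on the server.
    """
    if current is State.FAULTED:
        # Assumed policy: try a reboot first, then move the applications away.
        return "reboot" if reboot_attempts < max_reboots else "reprovision"
    if current is goal:
        return "none"    # already where it should be
    if current is State.BOOTING:
        return "wait"    # a transition is in progress
    if goal is State.RUNNING:
        return "start"
    return "stop"
```

The controller would run this decision each time the Agent reports a state, so a server converges on its goal state step by step rather than through one long scripted procedure.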

Besides managing the servers and load balancers on nodes, the Fabric Controller is also in charge of resource provisioning, which it supports through a declarative service model. Every application comes with a declarative service specification, and the Fabric Controller searches the Azure Fabric for resources that meet the stated requirements for CPU, bandwidth, operating system, redundancy and so on. However, most of these promised configuration options are not available yet; only some fixed templates are allowed in the current CTP version.
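A declarative model means the application states what it needs and the controller works out where to put it. The sketch below shows that matching step in the simplest possible form. `ServiceSpec`, `Server` and `place` are hypothetical names, and the real Azure service model format is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ServiceSpec:
    """What an application declares it needs (assumed fields)."""
    cpu_cores: int
    bandwidth_mbps: int
    operating_system: str
    replicas: int  # number of servers wanted, for redundancy


@dataclass(frozen=True)
class Server:
    name: str
    cpu_cores: int
    bandwidth_mbps: int
    operating_system: str


def place(spec: ServiceSpec, servers: Sequence[Server]) -> List[Server]:
    """Return spec.replicas matching servers, or an empty list if too few match."""
    matches = [
        s for s in servers
        if s.cpu_cores >= spec.cpu_cores
        and s.bandwidth_mbps >= spec.bandwidth_mbps
        and s.operating_system == spec.operating_system
    ]
    if len(matches) < spec.replicas:
        return []  # the specification cannot be satisfied as declared
    return matches[:spec.replicas]
```

Returning nothing when too few servers match keeps the placement all-or-nothing, so an application is never started with less redundancy than it declared.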

GoGrid Cloud Hosting

GoGrid Cloud Hosting offers developers up to three F5 Load Balancers per account to distribute Internet traffic across servers, provided that the IPs and specific ports of those servers are attached to the load balancers. The load balancer can route connections with one of two algorithms. The Round Robin algorithm hands incoming traffic to the servers one after another, taking turns so that connections are distributed equally. The Least Connect algorithm routes each incoming connection to the server that currently holds the fewest connection sessions. If the load balancer detects that a server has crashed, all subsequent connections bypass the crashed server and are redirected to the remaining servers.
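The two routing algorithms, together with the bypass of crashed servers, can be sketched in a few lines. This is a generic illustration, not the F5 implementation: the `RoundRobin` class, the `least_connect` function and the `healthy` set are assumed names.

```python
class RoundRobin:
    """Hand out servers in turn, skipping any that are not healthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # index of the server whose turn comes next

    def pick(self, healthy):
        # Try each server at most once, starting from the current turn.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in healthy:
                return server
        return None  # every server is down


def least_connect(open_sessions, healthy):
    """Return the healthy server with the fewest open sessions.

    open_sessions maps server name to its current session count.
    """
    candidates = [s for s in open_sessions if s in healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: open_sessions[s])
```

Round Robin suits servers of equal capacity handling short, similar requests, while Least Connect copes better when session lengths vary widely.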

So far, GoGrid Cloud Hosting only gives developers a programmatic way to implement auto scaling through the GoGrid API; unlike Amazon with its Auto Scaling service, GoGrid provides no ready-made solution. Developers have to write their own code to collect usage data and, based on that data, start or shut down servers or migrate to a larger server.

Google App Engine

Unlike the other cloud platforms, Google App Engine offers developers a scalable platform on which applications run, rather than a highly customizable virtual machine. Access to the underlying operating system is therefore restricted in App Engine, and all tuning of load balancing, resource provisioning and auto scaling is managed by the system behind the scenes.
