Auto-scale and Auto-heal your stateful Apache/Tomcat service on OpenStack

Architecting and building a web service in the cloud age is quite simple.
Options range from web site generators such as Wix, to Paas providers such as GAE, all the way to the traditional LAMP setup hosted on IaaS that gives you the maximum control and customization power.
Quoting Spider-Man, “with great power comes great responsibility”… In our case, if you choose LAMP or one of its variants on IaaS, you have the responsibility of ensuring a proper service level, which in many cases requires a high availability configuration to minimize downtime, as well as the ability to scale the service as user traffic increases.
Such a service level requirement typically translates to putting your front-end web servers behind a load balancer and allowing the application to scale out to multiple web servers.
If your web service is stateful, you typically also need to distribute the session context management and, in some cases, instruct the load balancer to enforce a sticky-session load balancing algorithm.

Cloudify, a DevOps automation tool that is roughly the OpenStack equivalent of Amazon OpsWorks, lets you take on all this “great responsibility” with significantly less effort. It also helps you abstract your architecture from the actual IaaS you choose to work with, keeping the flexibility to change vendors in the future or to create a service that utilizes more than one IaaS vendor.

In this post, I will show you how to easily deploy a web service based on Tomcat web servers, XAP distributed session management, and an Apache load balancer fronting the Tomcat servers.
[Diagram: Tomcat web service deployment]

The Apache load balancer lets us expose a single VIP that the internet knows about, and hide an arbitrary number of web servers behind it.
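For illustration, an Apache load-balancer setup with sticky sessions might look like the following sketch. This is a hedged example, not what the apacheLB recipe actually generates; module paths, member addresses, and route names are placeholders.

```apache
# Hypothetical vhost fragment for load balancing two Tomcat instances.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

<Proxy balancer://tomcats>
    # Each Tomcat that registers with the load balancer becomes a member.
    BalancerMember http://10.0.0.11:8080 route=tomcat1
    BalancerMember http://10.0.0.12:8080 route=tomcat2
    # Pin each session to the member that created it (sticky sessions).
    ProxySet stickysession=JSESSIONID|jsessionid
</Proxy>

ProxyPass / balancer://tomcats/
ProxyPassReverse / balancer://tomcats/
```

With the distributed XAP session store in place, stickiness becomes an optimization rather than a requirement: if a member dies, another member can pick up the same session.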

How do I actually use it?

1. Download Cloudify from www.cloudifysource.org
2. Download the HttpSession sample recipe from https://github.com/yoramw/cloudify-recipes/tree/master/apps/HttpSession and its services from https://github.com/yoramw/cloudify-recipes/tree/master/services (both apps and services folders should be in the same folder hierarchy as it appears in the recipes folder)
3. Start the Cloudify CLI (bin/cloudify.sh, or cloudify.bat on Windows)
4. Run > bootstrap-cloud
5. Start the WebUI from the URL that the CLI printed once the bootstrapping was done.
6. Install your recipe app > install-application -timeout 30 /apps/HttpSession
7. Wait for Cloudify to deploy the services (it should take 5–20 minutes, depending on the cloud provider's speed).
8. You can bring up additional Tomcat instances, or shut some down, using > set-instances tomcat <# of desired instances>
9. You may run some load on your new web service and see how it behaves using: > invoke apacheLB load 35000 100 (35,000 requests by 100 concurrent requesters)

As you can see, the deployment is very simple. Cloudify configures everything for you and connects the services together.
You can use the same recipe to deploy your testing/staging environment as well as the production environment.
Changing the deployment to a different provider just means bootstrapping a different cloud and installing the application recipe there exactly the same way.

Behind the scenes:

Using XAP as the distributed session store requires Apache Shiro.
The tomcat recipe takes care of connecting Shiro to XAP.
If you want to dive into the details and adjust configurations, I recommend reading the Gigaspaces paper on global Http session sharing at http://wiki.gigaspaces.com/wiki/display/SBP/Global+Http+Session+Sharing.

In order to enable Shiro in your own application, use the HttpSession example as a starting point. Place the shiro.ini from the HttpSession example in your app's WEB-INF, add the Shiro filter to the app's web.xml, and add the jars to the lib folder as shown in the example recipe.
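As a hedged sketch (copy the exact filter class and names from the HttpSession example, not from here), the web.xml additions look roughly like this:

```xml
<!-- Shiro servlet filter wiring; shiro.ini sits next to web.xml in WEB-INF -->
<filter>
    <filter-name>ShiroFilter</filter-name>
    <filter-class>org.apache.shiro.web.servlet.IniShiroFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>ShiroFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```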

The ApacheLB recipe brings up an Apache instance that serves as the load balancer. It configures the service to either respect sticky sessions or not, by setting the “useStickysession” property in the apacheLB properties file to true or false.

When a Tomcat service instance completes its installation, it registers itself with the ApacheLB service, which automatically adds it to its pool of ready web servers.
When a Tomcat service instance is brought down in an orderly fashion, its first step is to remove itself from the ApacheLB pool of ready servers.
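In Cloudify recipe terms, this handshake can be sketched roughly as follows. The custom command names (addNode/removeNode) and the service lookup are my assumptions for illustration, not necessarily the recipe's exact identifiers.

```groovy
import java.util.concurrent.TimeUnit

// Sketch of a Tomcat lifecycle that joins/leaves the ApacheLB pool.
lifecycle {

    postStart {
        // After Tomcat is up, register this instance with the load balancer.
        def apacheLB = context.waitForService("apacheLB", 60, TimeUnit.SECONDS)
        apacheLB.invoke("addNode", "http://${context.privateAddress}:8080" as String)
    }

    preStop {
        // On orderly shutdown, leave the pool before stopping Tomcat.
        def apacheLB = context.waitForService("apacheLB", 60, TimeUnit.SECONDS)
        apacheLB.invoke("removeNode", "http://${context.privateAddress}:8080" as String)
    }
}
```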

The ApacheLB recipe also lets you generate load to test your setup, utilizing the Apache ab command-line utility.

 

To sum things up, Cloudify takes the hassle, time, and effort out of deploying your highly available and scalable web app, while letting you hold on to the flexibility of designing and building your application the LAMP way…


On-boarding stateful highly available applications to the cloud

Deploying a highly available, stateful web application to the cloud using Cloudify.

Deploying web applications to the cloud is a growing trend.
Cloudify is one of the popular tools that lets you do it with ease, turning deployment into a seamlessly repeatable procedure that abstracts the actual cloud infrastructure away from the deployment. Cloudify does not stop there: it continues to monitor the deployment and takes action in case of failures or changes in load requirements.

When dealing with stateful web applications, the deployment becomes a bit more challenging: you need to properly configure the load balancer for stickiness, as well as turn the session into a highly available, distributed store that can be accessed from all the web containers.
These additions make sure that, in most cases, the user's interaction with the web tier remains on a single web container that has the session in memory, and that the same session can continue even if that web container fails and the user is routed to another one.
GigaSpaces' other product, XAP, has done this for customers for many years. Now we bring this pattern into an easily deployable Cloudify recipe.

The Cloudify recipe includes the following services:
1. ApacheLB as the loadbalancer
2. Tomcat instances as the web tier
3. XAP (for the distributed session store)
a. Manager
b. PU
c. Web-UI

The ApacheLB recipe installs Apache, adds the required modules for load balancing, and provides a custom command for adding back-end nodes.

The Tomcat recipe installs and configures Tomcat, deploys a web application, and configures Tomcat to utilize XAP for distributed sessions using the Apache Shiro filter. Upon a successful start of Tomcat, the ApacheLB custom command that adds this Tomcat instance to the load balancer is triggered.

Finally, the XAP installation deploys the XAP Data Grid product to provide a distributed, fault-tolerant session store.

The combination of the load balancer, multiple Tomcat instances, and a redundant XAP data grid ensures your service will be highly available and will maintain user sessions for seamless interaction, even in the case of a partial failure – which statistically will happen at some point in the future…


Cloudify and IBM InfoSphere BigInsights

Following Nati’s blog post about big data in the cloud, this post focuses on Cloudify’s integration with IBM InfoSphere BigInsights, diving into the integration specifics and how to get your feet wet with running the Cloudify BigInsights recipe hands-on.

The IBM InfoSphere BigInsights product at its core uses the Hadoop framework, with IBM improvements and additions focused on tailoring it for enterprise customers by adding administrative, workflow, provisioning, and security features, along with best-in-class analytical capabilities from IBM Research.

Cloudify’s value for BigInsights-based applications:

As Nati explained in his post, applications typically consist of a set of services with interdependencies and relationships. BigInsights itself is a set of services, and a typical application will utilize some of its services plus additional home-grown or commercial services. Cloudify provides the application owner the following benefits:

  1. Consistent Management
    1. Deployment automation
    2. Automation of post-deployment operations
    3. SLA-based monitoring and auto-scaling
  2. Cloud Enablement and Portability

Let’s dive into the actual integration and see how these line items map to the Cloudify BigInsights recipe:

Deployment automation:

When building a Cloudify recipe we have to decide between using the existing installer vs. manually installing each component on each node and tying it all together. We decided to utilize the provided installer to capitalize on the existing BigInsights tool and be as closely aligned with how IBM intended the tool to be used. The sequence of events to get to a working BigInsights service is as follows:

  1. Analyze the service and application recipe to decide on the initial cluster topology.
  2. Provision new servers or allocate existing servers (from a cloud or existing hardware in the enterprise) to satisfy the topology requirements.
  3. Prepare the cluster nodes for the BigInsights installer (fulfilling the install prerequisites and requirements, such as consistent hostname naming, passwordless SSH or passwords, software packages…)
  4. Build a silent install XML file based on the actual cluster nodes and the topology.
  5. Run the installer and verify everything is working when it is done.

This takes care of bringing up the BigInsights cluster and letting us hook it up to the rest of the services.

Automation of post-deployment operations:

Post deployment operations in Cloudify are handled by Cloudify’s built-in service management capabilities, such as enabling dynamic adjustment of the number of instances each service will have.  In addition to the generic built-in capabilities, which in the BigInsights case can be used, for example, to change the number of data nodes in the cluster, Cloudify recipes define “Custom Commands” that handle specific post-deployment operations.

In the BigInsights recipe we have custom commands that handle Hadoop operations such as adding and removing Hadoop services (Flume, HBase regions, Zookeeper…) to/from existing nodes, re-balancing the cluster, running DfsAdmin commands as well as DFS commands, all from the Cloudify console.

SLA-based monitoring and auto-scaling:

In addition to the option I mentioned earlier to manually set the number of nodes in the cluster during run-time, Cloudify monitors the application’s services and lets us define, in the recipe, SLA-driven policies that can dynamically change the cluster size and the balance between the different services based on the monitoring metrics.

The BigInsights recipe monitors the Hadoop service using JMX MBeans that Hadoop exposes. The metrics we monitor can easily be changed by editing the list below from the master-service.groovy recipe:
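The metric list itself was embedded in the original post as a gist and is not reproduced here. Purely as a hypothetical sketch (names and MBean paths are illustrative, not the recipe's actual entries), such a display-name-to-MBean mapping might look like:

```groovy
// Hypothetical sketch only – the real list lives in master-service.groovy.
// Maps a display name to the Hadoop NameNode MBean and attribute to sample.
def metrics = [
    "Blocks Total"       : ["hadoop:service=NameNode,name=FSNamesystemState", "BlocksTotal"],
    "Capacity Remaining" : ["hadoop:service=NameNode,name=FSNamesystemState", "CapacityRemainingGB"],
    "Files Total"        : ["hadoop:service=NameNode,name=FSNamesystemState", "FilesTotal"],
    "Live Data Nodes"    : ["hadoop:service=NameNode,name=FSNamesystemState", "numLiveDataNodes"]
]
```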

These metrics are then tied to visual widgets that will be shown in the Cloudify Web-UI interface and can be referenced in the SLA definition.

For this version of the recipe, we decided to skip automatic scaling rules and let the user control scaling via custom commands. In Hadoop, automatic scaling – and specifically re-balancing the cluster after it – has to take into account future workloads that are planned to run on the cluster, since re-balancing can be a lengthy process that actually decreases performance until it is done.

Cloud Enablement and Portability:

Cloudify handles cloud enablement and portability using Cloud Drivers, which abstract the cloud- or bare-metal-specific provisioning and management details from the recipe. There are built-in drivers for popular clouds such as OpenStack, EC2, Rackspace and more, as well as a BYON driver to handle your bare-metal servers.

The cloud driver lets you define hardware templates that will be available to your recipe, as well as your cloud credentials.

For the BigInsights recipe, we define two templates that we will later reference from the recipe. Here is the template definition for the OpenStack cloud driver:
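The template definition itself was embedded in the original post as a gist. A hedged sketch of two such OpenStack cloud driver template entries follows; all image IDs, hardware IDs, and memory sizes below are placeholders, not the recipe's real values.

```groovy
// Placeholder MASTER/DATA template entries for the OpenStack cloud driver.
templates ([
    MASTER : template{
        imageId "1358"          // placeholder image ID
        machineMemoryMB 4096
        hardwareId "103"        // placeholder flavor ID
    },
    DATA : template{
        imageId "1358"
        machineMemoryMB 2048
        hardwareId "102"
    }
])
```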

Finally, let’s dive into a hands-on on-boarding of BigInsights in the cloud:

The recipe is located at BigInsights App folder & BigInsights Service folder.

Download the recipe and do the following :

  1. The recipe expects two server templates: MASTER & DATA. You will need to edit the cloud driver you will use (under Cloudify home/tools/cli/plugins/esc/…) and add the two templates (shown above) to the existing SMALL_LINUX template.

Deployment automation:

  1. Copy the BigInsights recipe to the recipes folder. Verify you have a BigInsights folder under the services and the apps folders under the Cloudify home/recipes root folder.
  2. Open the Cloudify console and bootstrap your favorite cloud (which has the two templates defined in #1)
  3. Install the default BigInsights application by running the following line (assuming the current directory is Cloudify home/bin): “install-application -timeout 45 ../recipes/apps/hadoop-biginsights”

Automation of post-deployment operations:

  1. To add additional data nodes manually, just increase the number of dataOnDemand service instances by running the following command:
    set-instances dataOnDemand X (where X is a number higher than the current number of instances and bounded by the max instance count defined in the recipe – the default max is 3)
  2. To rebalance the HDFS cluster after we added data nodes you can run the following command:
    invoke master rebalance
  3. To add an HBase region to one of the existing data nodes run the following custom command:
    invoke master addNode x.x.x.x hbase (where x.x.x.x is the IP of the data node instance)
  4. You can also trigger dfs and dfsAdmin commands from the Cloudify console, for example:
    invoke master dfs -ls

SLA-based monitoring and auto-scaling:

  1. Open the Cloudify Web-UI and select the BigInsights application. You will see the deployment progress, and you can start the IBM BigInsights management UI directly from the services section of the master service.
  2. From the same Cloudify Web-UI, make sure the master service in the BigInsights application is selected. Click on the Metrics tab in the middle of the page. You will see the Hadoop metrics shown in the GUI widgets, as we defined in the master-service.groovy recipe.
    https://gist.github.com/3507945

Here is a short video that captures the bootstrapping and deployment of BigInsights using Cloudify:


HPCS is going into public beta

HP Cloud Services is an HP cloud initiative that is just about to launch as a public beta after several months in private beta. This is a new cloud offering based on the open source OpenStack project, which has gained significant traction and interest in the last couple of years.

GigaSpaces recently announced its partnership with HP in this area through Cloudify, to enable easy development, deployment and lifecycle management of your application on OpenStack.

Cloudify basically injects itself into the IaaS instead of relying on preexisting support in the cloud images. Its approach for describing applications and services as recipes brings PaaS and DevOps much closer together.

In addition, it enables you to design your multi-tier app without any constraints or dictates from the platform, and then, by using Cloudify recipes, you can deploy, monitor and manage your app on any cloud.

Over the course of the last few months we have been testing different application use cases and configurations managed by Cloudify on HP Cloud Services.

In this post, I want to share with you a deployment of one of Cloudify’s bundled examples: a typical web application – a travel application running on Tomcat and Cassandra.

With Cloudify, you define recipes that describe your application and its underlying services, and then Cloudify takes care of deploying it to any cloud environment.

Initially, during development, you may want to run a local test.

This first video walks you through setting up a local cloud environment and deploying the travel app in two simple commands.

After testing it locally, the second video walks you through deploying the same exact application, without any changes to its code or to its Cloudify recipe, on the new HP Cloud Services.

Similar to the local cloud, it takes only two simple commands to provision the compute resources, grab all of the necessary code and binaries, and execute and monitor the application and its underlying services.

The new HPCS offering is a great addition to the cloud landscape.

Personally, I think that choosing OpenStack as the underlying technology to power HPCS is a wise choice by HP. It will be most beneficial for enterprises keeping their options open: going private in-house, private hosted by HP, public, or mixing and matching between the different options.

Moreover, having HP as a big supporter is huge for OpenStack and I am sure HP will contribute to OpenStack’s maturity and its independence.


OpenStack public cloud by HP & RackSpace

The OpenStack community is an open source community focused on providing a cloud computing platform. It started in 2010 as a joint project of NASA and Rackspace, based on work done at NASA, and quickly gained traction, with over 150 additional companies joining the community.

OpenStack is used internally in companies experimenting and using private clouds as well as by IaaS providers who use it for their public cloud platform.

As part of the joint work I have been doing with GigaSpaces, I had the chance to experiment with two of the larger companies providing OpenStack-driven IaaS – HP & Rackspace.

From prior experience with Amazon EC2, one of the first things that is quite obvious when you start experimenting with the HP & Rackspace platforms is that the UI is quite minimalistic. In addition, these platforms aren’t as feature-rich as Amazon AWS, in some cases requiring the user to be more knowledgeable in system administration to complete similar tasks (such as creating an SSH authentication key or configuring the server firewall).

In part, this reflects a different approach and different expectations of the users. However, part of it should be attributed to the maturity of the platform and the depth of integration these providers were able to achieve with this new platform in a short amount of time. I am sure that in this area we will see quite a few new features in the months to come.

Having dealt with the OpenStack RESTful API of both providers, you may expect to utilize the same code for both by just changing the credentials and endpoint configuration attributes.

Well, reality is a bit different… The APIs are very close to each other, but they are not identical. Most of the client access code can stay the same, but some differences are there…

Maybe the most significant difference is in the Keystone authentication. HP uses the element apiAccessKeyCredentials, which contains the accessKey and secretKey (they also have a user/password option). Rackspace has a special element, RAX-KSKEY:apiKeyCredentials, which contains username and apiKey (this is an extension Rackspace added to the API – not sure why…).

HP authentication request:

curl -X POST -H "Content-Type: application/json" \
     https://region-a.geo-1.identity.hpcloudsvc.com:35357/v2.0/tokens \
     -d '{"auth":{"apiAccessKeyCredentials":{"accessKey":"xxxxxxxxxxxxx","secretKey":"xxxxxxxxxxxxxxxxxx"}}}'
 

RackSpace authentication request:

curl -X POST -H "Content-Type: application/json" \
     https://auth.api.rackspacecloud.com/v2.0/tokens \
     -d '{"auth":{"RAX-KSKEY:apiKeyCredentials":{"username":"xxxxxxxxxxxx","apiKey":"xxxxxxxxxxxxxxxxxxxxxx"}}}'
 

There are a few more nuances, such as imageRef/flavorRef (HP) vs. imageId/flavorId (Rackspace) when creating a server instance.

HP create server request:

curl -i \
     -H "X-Auth-Token: 1847d195e424f979a65214ae8bd6e955d87cfad4" -H "Content-Type: application/json" \
     https://az-1.region-a.geo-1.compute.hpcloudsvc.com/v1.1/######/servers \
     -X POST -d '{"server": {"name" : "Test","imageRef":"417", "flavorRef": "100"}}'
 

RackSpace create server request:

curl -i \
     -H "X-Auth-Token: 1847d195e424f979a65214ae8bd6e955d87cfad4" -H "Content-Type: application/json" \
     https://servers.api.rackspacecloud.com/v1.0/######/servers \
     -X POST -d '{"server" : {"name" : "new-server-test","imageId" : 1,"flavorId" : 1}}'
 

Another aspect that needs to be taken into account is different defaults and behavior choices. One example is the public IP for a new server. In HP, the instance is created without a public IP, and you can associate a public IP with the server after it is created. In Rackspace, the server is created with a public IP.

HP’s approach is quite logical and lets you choose which servers to open up to the internet, but it makes your life a little more difficult when you want to create a server that will be accessed publicly.
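For example, a hedged sketch of that extra step against HP's OpenStack compute API (the token, tenant segment, server ID, and address below are placeholders, and the exact endpoint depends on your account):

```shell
# Allocate a floating IP from the default pool.
curl -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
     https://az-1.region-a.geo-1.compute.hpcloudsvc.com/v1.1/######/os-floating-ips \
     -d '{}'

# Associate the returned address with an existing server.
curl -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
     https://az-1.region-a.geo-1.compute.hpcloudsvc.com/v1.1/######/servers/<server-id>/action \
     -d '{"addFloatingIp": {"address": "x.x.x.x"}}'
```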

Utilizing a tool such as Cloudify or Chef saves you from having to be aware of these things, which for most users are nuisances. In Cloudify, you have the concept of a Cloud Driver, which abstracts the vendor-specific cloud API from the rest of Cloudify. You get most of the common clouds supported out of the box, and you can write a new Cloud Driver, or extend an existing one, to support additional clouds.

BTW, Cloud Drivers are an excellent example of code that, once written, can be shared with the community.

HP and Rackspace are offering an alternative to Amazon’s dominance in the public cloud space, with an approach that embraces the OpenStack open source platform for public as well as private clouds. It offers enterprises a path to modern cloud infrastructure while still allowing them to move between vendors, as well as between public, private, and hybrid cloud approaches.

Amazon, the clear market leader, is aware of this and its first defensive move is to partner with Eucalyptus to provide a similar public/private approach, but still with a vendor/API lock-in.

An interesting analogy for Amazon EC2 vs. OpenStack is Linux vs. Windows 10-15 years ago. Windows was more feature-rich and much easier to use; Linux, on the other hand, was open and enjoyed a rapidly growing community of supporters.


WebSphere on demand using Cloudify

WebSphere is known as an enterprise-grade application server. However, it is less common in public clouds and is not often chosen by young companies building web services from the ground up.

One of the reasons that WebSphere is less common in such environments is that it is considered expensive as well as harder to install, setup and deploy in such environments.

IBM addressed the cost aspect with a community edition that is free to use (http://www-01.ibm.com/software/webservers/appserv/community/).

In this post, I will outline how Cloudify can help you automate the installation, setup, and deployment of WebSphere in a cloud environment, turning it into an on-demand platform module you can utilize in your application stack without the concerns I mentioned above.

Cloudify does all this magic using a recipe approach. Let’s go over how we can build a recipe for WebSphere as the application server, running a DB-driven application that will utilize MongoDB as the DB:

The first step is to define a service. In the service definition we declare some metadata that describes the service (name, role…) as well as which lifecycle events we would like to implement. In addition, we can provide a closure for detecting that the service is up and ready: 

service {
    name "WebSphere7"
    icon "websphere_logo.png"
    type "APP_SERVER"
    numInstances 1
    lifecycle {
        install "websphere_install.groovy"
        start "websphere_start.groovy"
        stop "websphere_stop.groovy"
        postStop "websphere_uninstall.groovy"
        startDetectionTimeoutSecs 720
        startDetection {
            ServiceUtils.isPortOccupied(8081)
        }
    }
}

Each lifecycle event is associated with a script that Cloudify executes at the right time.

In addition, a recipe will typically include a properties file that allows easy access to configuration changes.

For example, in websphere_install.groovy we use the properties file to get the URI for downloading WebSphere. The script downloads, extracts, and sets permissions for WebSphere. Then it uses a .jacl script, which it generates on the fly from a template and the relevant configuration context, to run the actual WebSphere installation and configuration.
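A minimal sketch of the download-and-extract portion of such an install script follows. The property names (downloadUrl, installDir) and file names are assumptions for illustration; the real recipe also generates and runs the .jacl script, which is omitted here.

```groovy
// Sketch: read the properties file, then download, extract and chmod WebSphere.
def config = new ConfigSlurper().parse(new File("websphere-service.properties").toURI().toURL())

def ant = new AntBuilder()
ant.sequential {
    mkdir(dir: config.installDir)
    // Fetch the WebSphere archive from the URI defined in the properties file.
    get(src: config.downloadUrl, dest: "${config.installDir}/websphere.zip", skipexisting: true)
    unzip(src: "${config.installDir}/websphere.zip", dest: config.installDir)
    // Make the bundled shell scripts executable.
    chmod(dir: config.installDir, perm: "ugo+x", includes: "**/*.sh")
}
```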

A similar approach is used in websphere_start.groovy, with the addition of dynamically wiring the MongoDB connection into WebSphere. This is done using the ServiceContext object, which lets us access the entire deployment context from the recipe.

The stop and uninstall events are similar, just simpler, because the context is less important there.

Once the recipe is ready, deployment, testing and releasing to the public is all easily done in the same manner using the cloudify console.

For development testing, you can start a local cloud environment on your box and run WebSphere there (assuming your box has ample RAM…) by executing two commands:

>bootstrap-localcloud 
>install-application ~/recipes/websphere

Running it in your staging environment is the same deal; just change the bootstrapping to the EC2 staging environment:

> bootstrap-cloud ec2-staging
>install-application ~/recipes/websphere

Finally, launching to production:

> bootstrap-cloud openstack
>install-application ~/recipes/websphere

Nice! We did have to put some work into building the recipe, but once we have it, it works the same for local, staging, and production environments, allowing us to focus on the application instead of deployment & configuration.


GigaSpaces Cloudify & OpenStack based HP Cloud Services

I started doing some joint work with GigaSpaces this week and I wanted to share one of the results of this week – integrating GigaSpaces with the new HP Cloud environment.

HP Cloud Service is a new initiative currently in a private beta. This is HP’s new public cloud offering that is based on the open source OpenStack project which I think is very cool.

GigaSpaces Cloudify allows easy development, deployment, and lifecycle management of your application.

Cloudify basically injects itself into the IaaS instead of relying on preexisting support in the cloud images. Its approach of describing applications and services as recipes brings PaaS and DevOps close together.

In addition, with Cloudify you design your multi-tier app without any constraints or dictates from the platform, and then Cloudify uses recipes to deploy, monitor, and manage it on any cloud.

In this example, we will use the travel application running on Tomcat and Cassandra.

With Cloudify, you define recipes that describe your application and its underlying services, and then Cloudify takes care of deploying it to any cloud environment.

Initially, during development, you may want to run a local test. The first video walks you through setting up a local cloud environment and deploying the travel app in two simple commands.

After we tested it locally, the second video walks you through deploying the same exact application, without any changes to its code or its Cloudify recipes, on the new HP cloud.

Similar to the local cloud, it takes only two simple commands to provision the compute resources, grab all the necessary code and binaries, and execute and monitor the application and its underlying services.
