Private Data Cloud: 'Do It Yourself' with Eucalyptus | Big Data - BIG DATA

Sunday, 5 May 2019

Private Data Cloud: 'Do It Yourself' with Eucalyptus

Why are Enterprises implementing Private Clouds when the Public Cloud deployment model is gaining in popularity day by day? Guy Rosen summarizes Public Cloud growth within the user base of the Amazon Elastic Compute Cloud (EC2): since its debut in 2006, 8.4 million EC2 instances have been launched. Impressive as these statistics are, many enterprises still consider the Public Cloud a no-go area. Reasons include data security and SLA concerns, data compliance and governance regulations, and the complexity of migrating legacy applications. This is where Private Clouds step in.

Private Clouds provide many of the benefits of the Public Cloud, namely elastic scalability, faster time-to-market and reduced OpEx, all within the Enterprise's own perimeter and in compliance with its governance. Leading commercial Private Cloud products include VMware, Univa UD and Unisys. Open source solutions include Globus Nimbus, Enomaly Elastic Computing Platform, RESERVOIR and Eucalyptus.


Yesterday, I attended the Webinar “Convergence of Physical, Virtual and Cloud”, during which Dr. Rich Wolski, Chief Technology Officer of Eucalyptus Systems, described Eucalyptus as Private Cloud data storage. This interested me and I set about learning more.

Technology overview
Eucalyptus enables the creation of Private Clouds that interface with the Amazon Web Services API, which its developers view as the de facto standard. The Enterprise edition, first released in September 2009, is fully compatible with Amazon EC2 and S3, whereas the open-source version supports almost all functions of EC2 and a limited set of S3. Enterprises can create hybrid Clouds in which data and virtual machine images are seamlessly accessible from both Eucalyptus clouds and Amazon's Elastic Compute Cloud and Simple Storage Service.
Architecture


[Figure: Eucalyptus architecture diagram]

Eucalyptus is built from 5 distinct SOA components, as shown in the above diagram. Virtualized computation is provisioned by the Cluster Controller, which schedules VM execution on Node Controllers. Data is stored using the Storage Controller, which implements block-accessed network storage. To share these blocks between VM instances, the virtualized storage provided by Walrus is used. The Cloud Controller exposes and manages the underlying virtualized computation and storage, i.e. VM management, access control policies, accounting and monitoring.
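To make the Cluster Controller / Node Controller split more concrete, here is a minimal sketch in Python. It is not Eucalyptus code: the class names simply mirror the component names above, and the placement rule (pick the node with the most free cores) is an assumption chosen for illustration, since the source does not say which scheduling policy Eucalyptus uses.

```python
from dataclasses import dataclass, field


@dataclass
class NodeController:
    """Hypothetical stand-in for one host that runs VM instances."""
    name: str
    total_cores: int
    running: list = field(default_factory=list)  # (vm_id, cores) pairs

    def free_cores(self) -> int:
        return self.total_cores - sum(cores for _, cores in self.running)


class ClusterController:
    """Hypothetical scheduler: places each VM on the node with the most free cores."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def schedule(self, vm_id: str, cores: int) -> str:
        # Only nodes with enough spare capacity are candidates.
        candidates = [n for n in self.nodes if n.free_cores() >= cores]
        if not candidates:
            raise RuntimeError(f"no node has {cores} free cores for {vm_id}")
        # Assumed policy: least-loaded node first (ties go to the earlier node).
        target = max(candidates, key=lambda n: n.free_cores())
        target.running.append((vm_id, cores))
        return target.name


cluster = ClusterController([NodeController("node-a", 4), NodeController("node-b", 8)])
placements = [cluster.schedule(f"vm-{i}", 2) for i in range(4)]
print(placements)
```

The point of the sketch is only the division of labour: nodes report capacity and run instances, while the cluster-level component decides where each instance goes.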

Walrus persistent data storage
The essence of any Cloud-based solution is how to provision the data to be processed. This is the key to unlocking the potential benefits of processing large data in the Cloud. Eucalyptus uses Walrus to persist data into simple bucket storage, essentially a key-value store. It provides rudimentary methods to operate on the data, including put, get and delete, and enables the setting up of access policies. Importantly, the Walrus interface is compatible with Amazon S3 and supports the Amazon Machine Image (AMI).
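As a rough picture of what "simple bucket storage" means, the sketch below models put, get and delete plus a very basic read policy in memory. It is a toy model and not the Walrus or S3 API: the class `BucketStore`, its method names and the per-bucket reader set are all assumptions made for illustration.

```python
class BucketStore:
    """Toy in-memory model of bucket storage; not the Walrus or S3 API."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {key: bytes}
        self._readers = {}  # bucket name -> set of users allowed to read

    def create_bucket(self, bucket, owner):
        self._buckets[bucket] = {}
        self._readers[bucket] = {owner}

    def grant_read(self, bucket, user):
        # Simplistic access policy: a per-bucket set of permitted readers.
        self._readers[bucket].add(user)

    def put(self, bucket, key, data: bytes):
        self._buckets[bucket][key] = data

    def get(self, bucket, key, user):
        if user not in self._readers[bucket]:
            raise PermissionError(f"{user} may not read {bucket}")
        return self._buckets[bucket][key]

    def delete(self, bucket, key):
        del self._buckets[bucket][key]


store = BucketStore()
store.create_bucket("test-results", owner="alice")
store.put("test-results", "run-001", b"torque=41.7")
store.grant_read("test-results", "bob")
print(store.get("test-results", "run-001", user="bob"))
```

Note that the only way to find an object is by its exact key; there is no way to ask a question about the contents of the stored values.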
Walrus is good for simple storage but it does not address the underlying needs of large data computation in the Cloud. Take the scenario of a manufacturer's production line quality control. Large volumes of test data must be processed according to business logic that defines relationships between the data. These relationships are highly complex and would be non-trivial to model in application code on top of key-value based storage.

What is required is the introduction of a data provisioning layer (i.e. sharded databases, data caches etc.) to enable complex querying of the data, with Walrus providing persistence as a service.
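A small sketch of the idea, using the quality-control scenario: objects that live in a key-value store are loaded into a queryable layer, where a relationship between them can be expressed as a query instead of hand-written application code. Everything here is assumed for illustration: the object keys, the JSON layout and the choice of an in-memory SQLite database as the "provisioning layer". Sharding and caching are not shown.

```python
import json
import sqlite3

# Hypothetical objects as they might sit in a bucket: key -> JSON bytes.
objects = {
    "units/u1": b'{"unit": "u1", "line": "A"}',
    "units/u2": b'{"unit": "u2", "line": "B"}',
    "tests/t1": b'{"unit": "u1", "check": "torque", "passed": true}',
    "tests/t2": b'{"unit": "u1", "check": "weld", "passed": false}',
    "tests/t3": b'{"unit": "u2", "check": "weld", "passed": false}',
    "tests/t4": b'{"unit": "u2", "check": "torque", "passed": false}',
}

# The provisioning layer: an in-memory relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE units (unit TEXT PRIMARY KEY, line TEXT)")
db.execute("CREATE TABLE tests (unit TEXT, check_name TEXT, passed INTEGER)")

# Load each stored object into the table that matches its key prefix.
for key, raw in objects.items():
    record = json.loads(raw)
    if key.startswith("units/"):
        db.execute("INSERT INTO units VALUES (?, ?)",
                   (record["unit"], record["line"]))
    elif key.startswith("tests/"):
        db.execute("INSERT INTO tests VALUES (?, ?, ?)",
                   (record["unit"], record["check"], int(record["passed"])))

# A relationship across objects (failed tests per production line),
# expressed as one join instead of nested lookups by key.
failures_per_line = db.execute(
    """
    SELECT u.line, COUNT(*) AS failures
    FROM tests AS t JOIN units AS u ON u.unit = t.unit
    WHERE t.passed = 0
    GROUP BY u.line
    ORDER BY u.line
    """
).fetchall()
print(failures_per_line)
```

With only put and get available, the same answer would require fetching every test object, fetching the unit each one refers to, and aggregating by hand in the application.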

Work in progress
Scalability - By definition a Cloud must be scalable. Rich Wolski reports that Eucalyptus can theoretically scale up to 5,000 nodes. It remains unclear whether these are physical or virtualized nodes. If physical, this is enough for ~90% of Private Cloud data center needs; if virtualized, it may prove to be a blocker for enterprises that have greater needs.
Host OS support - Eucalyptus is packaged for many different Linux distributions (e.g. Ubuntu, Debian, Fedora, CentOS), but currently does not support Windows.
Hypervisor support - Citrix Xen and KVM are fully supported. Support for VMware is only available in Eucalyptus Enterprise Edition.
Enterprise features - The open-source version lacks enterprise features, such as fail-over support for some of the key components, and contains rather basic built-in management tools.
Eucalyptus in action
Currently the largest Cloud infrastructure based on Eucalyptus is NASA's NEBULA. It aims to provide highly scalable storage in the hundreds of thousands of terabytes. It is interesting to note that NEBULA forms the backbone for NASA's plethora of websites, that is, it mostly delivers static data with little intensive data processing.

Eucalyptus forms the base of Ubuntu Enterprise Cloud, which enables organizations to build their own clouds that match the interface of Amazon EC2. As Canonical, the commercial sponsor of Ubuntu, says, "their main goal was to create a product compatible with Amazon's Elastic Compute Cloud (EC2), Elastic Block Storage (EBS) and Simple Storage (S3) API".
Eli Lilly, the 10th largest pharmaceutical company in the world, is using Eucalyptus and Amazon cloud computing services to support its scientists with on-demand processing power and storage. New servers are now provisioned in 3 minutes, compared to seven and a half weeks previously. This leads to faster time-to-market for key products.

Final thoughts
Eucalyptus is a technology worth investigating for companies that want to run private clouds that comply with their governance. The $5.5 million that was raised in April and the subsequent commercial release give a clear indication of the direction of the project. The Eucalyptus Cloud Computing Platform does not provide all Cloud features, but we have to remember that it is not designed as a replacement for AWS or any other Public Cloud service.
